Introduction to ML in Elm

We will be using Elm v0.18. If there are minor language revisions released throughout the quarter, we will decide whether or not to upgrade. You should get Elm up and running as soon as possible to make sure that you have a working development environment.

Let’s jump in with some examples at the REPL (read-eval-print loop).

% elm-repl
---- elm repl 0.18.0 -----------------------------------------------------------
 :help for help, :exit to exit, more at <https://github.com/elm-lang/elm-repl>
--------------------------------------------------------------------------------
>

Basic Values

> True
True : Bool

> False
False : Bool

> 'a'
'a' : Char

> "abc"
"abc" : String

> 3.0
3 : Float

Numeric literals without a decimal point are described by the type variable number, which describes both Ints and Floats.

> 3
3 : number

One way to read the last line above is “for every type number such that number = Int or number = Float, 3 has type number.” In other words, “3 has type Int and Float” and depending on how the expression is used, the Elm type checker will choose to instantiate the type variable number with one of these types.

> truncate
<function: truncate> : Float -> Int

> truncate 3
3 : Int

> truncate 3.0
3 : Int

If you are familiar with Haskell, think of number as a type class that is “baked in” to the language. Elm does not have general support for type classes, but it does have a few special purpose type classes like number.

Tuples

Tuples package two or more expressions into a single expression. The type of a tuple records the number of components and each of their types.

> (True, False)
(True,False) : ( Bool, Bool )

> (1, 2, 3.0)
(1,2,3) : ( number, number1, Float )

Notice the suffix on the type of the second number. That’s because the expressions 1 and 2 both have type number (i.e. Int or Float) but they may be different kinds of numbers. So, suffixes are used to create different variables so that each numeric type can be specified independently. If you’re familiar with Haskell, the type of this triple would be something like (Num a, Num b) => (a, b, Float). This can be read as saying “for any types a and b that are numbers, the tuple has type (a, b, Float).”

Lone expressions prefer to remain alone:

> ("Leave me alone!")
"Leave me alone!" : String

> (((((("Leave me alone!"))))))
"Leave me alone!" : String

Functions

As in most functional languages, all functions take exactly one argument and return exactly one value.

> exclaim = \s -> s ++ "!"
<function> : String -> String

> exclaim s = s ++ "!"
<function> : String -> String

> exclaim "Hi"
"Hi!" : String

Multiple arguments in uncurried style:

> plus = \(x,y) -> x + y
<function> : ( number, number ) -> number

> plus (x,y) = x + y
<function> : ( number, number ) -> number

> plus xy = Tuple.first xy + Tuple.second xy
<function> : ( number, number ) -> number

Notice the lack of suffixes in the types above. That’s because the addition operator takes two numeric arguments of the same type:

> (+)
<function> : number -> number -> number

Infix operators can be used as functions:

> (+) 3 4
7 : number

> (+) ((+) 3 4) 5
12 : number

(Note to Haskellers: Recent versions of Elm disallow the use of backticks to turn named functions into infix operators, as well as a couple of other syntactic features originally derived from Haskell.)

Multiple arguments in curried style:

> plus x y = x + y
<function> : number -> number -> number

> plus x = \y -> x + y
<function> : number -> number -> number

> plus = \x -> \y -> x + y
<function> : number -> number -> number

> plus = \x y -> x + y
<function> : number -> number -> number

Partial application of curried functions:

> plus7 = plus 7
<function> : number -> number

> plus7 1
8 : number

> plus7 11
18 : number

(Note to Haskellers: Elm does not support sections.)
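
Without sections, the usual workarounds are to partially apply the prefix form of an operator, to write a small lambda, or to use flip from Basics. A minimal sketch, assuming Elm 0.18 (the module and function names are made up for illustration):

```elm
module Sections exposing (..)

-- Partially applying the prefix form fixes the LEFT operand: 7 + x.
plus7 : number -> number
plus7 =
    (+) 7

-- To fix the RIGHT operand of a non-commutative operator, use a lambda...
minus1 : number -> number
minus1 =
    \x -> x - 1

-- ...or flip, which swaps the first two arguments: flip (/) 2 x == x / 2.
halve : Float -> Float
halve =
    flip (/) 2
```

For commutative operators like (+) the distinction between left and right operands does not matter, but for (-) and (/) it does, which is why the last two definitions avoid plain partial application.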

What if we wanted to restrict our plus function to Ints rather than arbitrary numbers? We need some way to “cast” a number to an Int. Although the Basics library does not provide such a toInt function, we can define something to help ourselves:

> toInt n = n // 1
<function> : Int -> Int

This doesn’t quite have the type number -> Int we sought… but on second thought, we don’t really need our casting function to have that type. Why not?

> plusInt x y = (toInt x) + y
<function> : Int -> Int -> Int

> plusInt x y = toInt (x + y)
<function> : Int -> Int -> Int

Type Annotations

Elm, like most ML dialects, automatically infers most types. Nevertheless, it is often good practice to explicitly declare type annotations for “top-level” definitions (we will see “local” definitions shortly).

In an Elm source file (e.g. IntroML.elm), a top-level definition can be preceded by a type annotation. The type checker will check whether the implementation actually satisfies the type you’ve declared.

plus : number -> number -> number
plus x y = x + y

plusInt : Int -> Int -> Int
plusInt x y = x + y

Notice that by using an explicit annotation for plusInt, we avoid the need to use the roundabout toInt function from before. In fact, we can refactor the definition as follows:

plusInt : Int -> Int -> Int
plusInt = plus

This version really emphasizes the fact that our implementation of plusInt is more general than the API (i.e. type) exposed to clients of the function. Designing software is full of decisions like this one.

There’s nothing stopping us from writing programs where the expressions we write do not satisfy the type signatures we write:

plus : number -> number -> Bool
plus x y = x + y

When we do, Elm reports helpful error messages explaining the inconsistencies:

-- TYPE MISMATCH ----------------------------------------------- ././IntroML.elm

The definition of `plus` does not match its type annotation.

5| plus : number -> number -> Bool
6|>plus x y = x + y

The type annotation for `plus` says it always returns:

    Bool

But the returned value (shown above) is a:

    number

Hint: Your type annotation uses type variable `number` which means any type of
value can flow through. Your code is saying it CANNOT be anything though! Maybe
change your type annotation to be more specific? Maybe the code has a problem?

More at:
<https://github.com/elm-lang/elm-compiler/blob/0.18.0/hints/type-annotations.md>

Importing Modules

Now that we’ve started putting definitions in source files, how do we import them from the REPL and from other files? Notice that the file IntroML.elm defines a module of the same name, which can be imported in several ways.

The following import will require all imported definitions to be qualified for use.

> import IntroML

> IntroML.plusInt 2 3
5 : Int

> plusInt 2 3
-- NAMING ERROR ---------------------------------------------- repl-temp-000.elm

Cannot find variable `plusInt`

4|   plusInt 2 3
     ^^^^^^^
Maybe you want one of the following?

    IntroML.plusInt

Another option is to specify which definitions to import for use without qualification. All other definitions from IntroML will still be accessible with qualification.

> import IntroML exposing (plusInt)

> plusInt 2 3
5 : Int

> IntroML.plus 2.0 3.0
5 : Float

> IntroML.exclaim "Cool"
"Cool!" : String

You can also import all definitions for use without qualification.

> import IntroML exposing (..)

> (plusInt 2 3, exclaim "Cool")
(5,"Cool!") : ( Int, String )

Finally, you can also define an abbreviation for the imported module.

> import IntroML as M

> M.plusInt 2 3
5 : Int

Whew, that was a lot of choices! This kind of flexibility will come in handy, because it can be hard to remember where functions are defined when importing many modules. Furthermore, many modules will define functions with popular names, such as map and foldr, so qualified access will be needed.

You may have noticed that we have been using some library functions without any imports. That’s because Basics, as well as a few other very common libraries such as Maybe, are opened by default.

Hot-Swapping

If you change the following definition in IntroML.elm to append additional exclamation points…

exclaim s = s ++ "!!!"

… you will immediately have access to the new version without having to first import the module again.

> M.exclaim "Whoa"
"Whoa!!!" : String

This kind of hot-swapping can be useful once we get to writing and running more interesting programs.

Conditionals

Conditional expressions must return the same type of value on both branches.

> if 1 == 1 then "yes" else "no"
"yes" : String

> if False then 1.0 else 1
1 : Float

(Note to Racketeers: Even if you know for sure that returning different types of expressions on different branches will jibe with the rest of your program, Elm will not let you do it. You have to use union types, discussed below. Restrictions like this may sometimes annoy the programmer. But in return, they enable the type system to provide static error detection that becomes really useful, especially as programs get large.)
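
As a preview of the union-type workaround, here is a minimal sketch, assuming Elm 0.18. The type IntOrString and the function describe are hypothetical names invented for this example. Both branches now return the same type, even though they carry different kinds of data:

```elm
module Branches exposing (..)

-- A union type that wraps either an Int or a String.
type IntOrString
    = AnInt Int
    | AString String

-- Both branches have type IntOrString, so the conditional typechecks.
describe : Bool -> IntOrString
describe b =
    if b then
        AnInt 1
    else
        AString "one"
```

The caller has to pattern match on the result to find out which kind of value came back, and that is exactly what lets the type checker keep track of both possibilities.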

Polymorphic Types

Type variables are identifiers that start with a lower case letter and are often a single character.

> choose b x y = if b then x else y
<function> : Bool -> a -> a -> a

As with the number type discussed above, this function type should be read as having an implicit “forall” at the beginning that “defines” the scope of the type variable: “for all types a, choose has type Bool -> a -> a -> a.”

When calling a polymorphic function such as choose, Elm (like other ML dialects) will automatically instantiate the type variables with type arguments appropriately based on the value arguments.

> choose True True False      -- a instantiated to Bool
> choose True "a" "b"         -- a instantiated to String
> choose True 1.0 2.0         -- a instantiated to Float
> choose True 1 2             -- a instantiated to number

These function calls can be thought of as taking type arguments (one for each universally quantified type variable of the function) that are automatically inferred by the type checker. If the syntax of Elm were to allow explicit type instantiations, the above expressions might look something like:

choose [Bool] True True False
choose [String] True "a" "b"
choose [Float] True 1.0 2.0
choose [number] True 1 2

Imagine that polymorphic types in Elm required an explicit forall quantifier. The result of instantiating a polymorphic type with a type argument T is obtained by substituting bound occurrences of the type variable with T.

choose : forall a. Bool -> a      -> a      -> a

choose [Bool]    : Bool -> Bool   -> Bool   -> Bool
choose [String]  : Bool -> String -> String -> String
choose [Float]   : Bool -> Float  -> Float  -> Float
choose [number]  : Bool -> number -> number -> number

Just as the particular choices of program variables do not matter, neither do the particular choices of type variables. So polymorphic types are equivalent up to renaming. For example, choose can be annotated with polymorphic types that use a different variable name than a.

choose : Bool -> b -> b -> b 

choose : Bool -> c -> c -> c 

choose : Bool -> thing -> thing -> thing

What happens if choose is annotated as follows?

choose : Bool -> number -> number -> number

The choose function typechecks with this annotation, but this type is more restrictive than the earlier ones. Remember that number, as discussed earlier, can only be instantiated with the types Int and Float. This special handling of the particular variable number — as opposed to other identifiers — is the way that Elm shoehorns a limited form of type classes into the language. It’s a pretty interesting design choice!

While we are on the subject, there is another special purpose type variable called comparable that is used to describe types that are, well, comparable using an ordering relation. See Basics for more info.

> (<)
<function> : comparable -> comparable -> Bool

> 1 < 2
True : Bool

> 1 < 2.0
True : Bool

> "a" < "ab"
True : Bool

> (2, 1) < (1, 2)
False : Bool

> (1 // 1) < 2.0
-- TYPE MISMATCH --------------------------------------------- repl-temp-000.elm
...

> True < False
-- TYPE MISMATCH --------------------------------------------- repl-temp-000.elm
...

Hint: Only ints, floats, chars, strings, lists, and tuples are comparable.
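
You can use comparable in your own annotations, too. A minimal sketch, assuming Elm 0.18 (largest is a made-up name; it behaves like max from Basics):

```elm
module Largest exposing (..)

-- Works for any comparable type, but both arguments must have the SAME type.
largest : comparable -> comparable -> comparable
largest a b =
    if a < b then
        b
    else
        a

-- largest 1 2 == 2
-- largest "a" "ab" == "ab"
-- largest (2, 1) (1, 2) == (2, 1)
```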

Infix Operators

There are a bunch of really useful infix operators in Basics, so take a look around. Make sure to visit (<|), (|>), (<<), and (>>), which can be used to write elegant chains of function applications.

NOTE: Added on Mar 30:

Depending on your prior experience and tastes, you may prefer to write the expression

\x -> h (g (f x))

in a flavor that emphasizes composition, such as

(\x -> x |> f |> g |> h)

(f >> g >> h)

(\x -> h <| g <| f <| x)

(h << g << f)

(\x -> (g >> h) <| f <| x)

(\x -> x |> f |> (h << g))

All of these definitions are equivalent, so choose a style that you like best and that fits well within the code around it. (But you had better not choose versions like the last two, because “pipelining” in both directions won’t help anyone, including yourself, understand your code.)
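
To make the equivalence concrete, here is a minimal sketch, assuming Elm 0.18. The particular f, g, and h below are placeholders chosen for illustration, as are the module and function names:

```elm
module Pipelines exposing (..)

-- Placeholder functions, chosen only for illustration.
f : Int -> Int
f x =
    x + 1

g : Int -> Int
g x =
    x * 2

h : Int -> String
h x =
    toString x

nested : Int -> String
nested x =
    h (g (f x))

piped : Int -> String
piped x =
    x |> f |> g |> h

composed : Int -> String
composed =
    f >> g >> h

-- nested 3, piped 3, and composed 3 all evaluate to "8"
```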

Lists

Without further ado, lists.

> 1::2::3::4::[]
[1,2,3,4] : List number

> [1,2,3,4]
[1,2,3,4] : List number

For those keeping score, the list syntax above is part OCaml ((::) for cons rather than (:)) and part Haskell (, to separate elements rather than ;).

Strings are not lists of Chars like they are in Haskell:

> ['a','b','c']
['a','b','c'] : List Char

> "abc"
"abc" : String

> ['a','b','c'] == "abc"
-- TYPE MISMATCH --------------------------------------------- repl-temp-000.elm
...

Pattern matching to destruct lists; the \ character is used to enter a multi-line expression in the REPL:

> len xs = case xs of \
|   x::xs -> 1 + len xs \
|   []    -> 0
<function> : List a -> number

> len [1,2,3]
3 : number

> len []
0 : number

(Note to Racketeers: The first branch of the case expression above essentially combines the functionality of checking whether pair? xs is #t and, if so, calling car xs and cdr xs.)

Non-exhaustive patterns result in a (compile-time) type error:

> head xs = case xs of x::_ -> x

-- MISSING PATTERNS ------------------------------------------ repl-temp-000.elm

This `case` does not have branches for all possibilities.

5| head xs = case xs of x::_ -> x
             ^^^^^^^^^^^^^^^^^^^^
You need to account for the following values:

    []

Add a branch to cover this pattern!

If you really must write a partial function:

> unsafe_head xs = case xs of \
|   x::_ -> x \
|   []   -> Debug.crash "unsafe_head: empty list"
<function> : List a -> a

> unsafe_head [1]
1 : number

> unsafe_head []
... Error: Ran into a `Debug.crash` ...

Using Debug.crash as a “placeholder” during development is extremely useful, so that you can typecheck, run, and test your programs before you have finished handling all cases. (Check out the type of Debug.crash.)

Elm also statically rejects programs with a redundant pattern, which will never match at run-time because previous patterns subsume it:

> len xs = case xs of \
|   _::xs -> 1 + len xs \
|   []    -> 0 \
|   []    -> 9999

-- REDUNDANT PATTERN ----------------------------------------- repl-temp-000.elm

The following pattern is redundant. Remove it.

8|   []    -> 9999
      ^
Any value with this shape will be handled by a previous pattern.

Higher-Order Functions

The classics:

> List.filter
<function> : (a -> Bool) -> List a -> List a

> List.filter (\x -> rem x 2 == 0) (List.range 1 10)
[2,4,6,8,10] : List Int

> List.map
<function> : (a -> b) -> List a -> List b

> List.map (\x -> x ^ 2) (List.range 1 10)
[1,4,9,16,25,36,49,64,81,100] : List Int

> List.foldr
<function> : (a -> b -> b) -> b -> List a -> b

> List.foldl
<function> : (a -> b -> b) -> b -> List a -> b

A quick refresher on how folding from the right and left differ:

List.foldr f init [e1, e2, e3]
  === f e1 (f e2 (f e3 init))
  === init |> f e3 |> f e2 |> f e1

List.foldl f init [e1, e2, e3]
  === f e3 (f e2 (f e1 init))
  === init |> f e1 |> f e2 |> f e3

Thus:

> List.foldr (\x acc -> x :: acc) [] (List.range 1 10)
[1,2,3,4,5,6,7,8,9,10] : List Int

> List.foldl (\x acc -> x :: acc) [] (List.range 1 10)
[10,9,8,7,6,5,4,3,2,1] : List Int
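
The difference also shows up with any non-commutative operator, such as string concatenation. A minimal sketch, assuming Elm 0.18 (module and value names are made up):

```elm
module Folds exposing (..)

-- foldr: "a" ++ ("b" ++ ("c" ++ ""))
fromRight : String
fromRight =
    List.foldr (++) "" [ "a", "b", "c" ]

-- foldl: "c" ++ ("b" ++ ("a" ++ ""))
fromLeft : String
fromLeft =
    List.foldl (++) "" [ "a", "b", "c" ]

-- fromRight == "abc"
-- fromLeft == "cba"
```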

For any (well-typed) function expression e, the function (\x -> e x) is said to be eta-equivalent to e. The verbose version (\x -> e x) is said to be eta-expanded, whereas e by itself is eta-contracted (or eta-reduced).

The following emphasizes that the lambda used in the last call to List.foldl above is eta-expanded:

> (::)
<function> : a -> List a -> List a

> List.foldl (\x acc -> (::) x acc) [] (List.range 1 10)
[10,9,8,7,6,5,4,3,2,1] : List Int

The eta-reduced version is nicer:

> List.foldl (::) [] (List.range 1 10)
[10,9,8,7,6,5,4,3,2,1] : List Int

Datatypes and Pattern Matching

List is a built-in inductive, algebraic datatype. You can define your own datatypes (or union types or “disjoint sums” or “sums-of-products”). Each type constructor is defined with one or more data constructors, each of which is defined to “hold” zero or more values.

> type Diet = Herb | Carn | Omni | Other String

> Carn
Carn : Repl.Diet

> Omni
Omni : Repl.Diet

> Other "Lactose Intolerant"
Other ("Lactose Intolerant") : Repl.Diet

Non-nullary data constructors are themselves functions:

> Other
<function> : String -> Repl.Diet

Use datatypes to simulate “heterogeneous” lists of values:

> diets = [Herb, Herb, Omni, Other "Vegan", Carn]
[Herb,Herb,Omni,Other "Vegan",Carn] : List Repl.Diet

Pattern matching is the (only) way to “use,” or “destruct,” constructed values. Patterns that describe values of a datatype t are either:

variables,
the wildcard pattern (written _), or
data constructors of t applied to an appropriate number of patterns for that data constructor.

For example:

> maybeHuman d = case d of \
|   Carn -> False \
|   _    -> True
<function> : Repl.Diet -> Bool

> List.map maybeHuman diets
[True,True,True,True,False] : List Bool

As before, be careful with non-exhaustive and redundant patterns.

The fact that Elm reports compile-time errors for redundant patterns helps prevent the following bug that pops up pretty frequently when learning functional programming:

> carn = Carn
Carn : Repl.Diet

> isCarn d = case d of \
|   carn -> True \
|   _    -> False

-- REDUNDANT PATTERN ----------------------------------------- repl-temp-000.elm
...

A variable pattern matches anything, even if a variable of the same name is already in scope and bound to a particular value. The wildcard pattern also matches anything; it is useful when the matched value does not need to be referred to in the subsequent branch expression.
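
The fix is to match against the data constructor itself rather than a variable. A minimal sketch, assuming Elm 0.18 and the Diet type from above, repeated here so the file stands alone (the module name is made up):

```elm
module DietCheck exposing (..)

type Diet
    = Herb
    | Carn
    | Omni
    | Other String

-- Carn (capitalized) is a constructor pattern, so only Carn values match it.
isCarn : Diet -> Bool
isCarn d =
    case d of
        Carn ->
            True

        _ ->
            False
```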

Patterns can be nested. For example, the function …

firstTwo xs =
  case xs of
    x::ys -> case ys of
               y::_ -> (x, y)
               []   -> Debug.crash "firstTwo"
    []    -> Debug.crash "firstTwo"

… can be written more clearly as follows:

firstTwo xs =
  case xs of
    x::y::_ -> (x, y)
    _       -> Debug.crash "firstTwo"

Test your understanding: what’s the type of firstTwo?

Type Aliases

Defining an alias or synonym for an existing type:

type alias IntPair = (Int, Int)
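
An alias is interchangeable with the type it abbreviates. A minimal sketch, assuming Elm 0.18 (swap, sumPair, and total are hypothetical names for illustration):

```elm
module Pairs exposing (..)

type alias IntPair =
    ( Int, Int )

-- Annotated with the alias...
swap : IntPair -> IntPair
swap ( a, b ) =
    ( b, a )

-- ...and with the expanded type. The two are the same type.
sumPair : ( Int, Int ) -> Int
sumPair ( a, b ) =
    a + b

-- An IntPair can be passed where ( Int, Int ) is expected; total == 3.
total : Int
total =
    sumPair (swap ( 1, 2 ))
```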

Types for Errors

Our unsafe_head function above fails with a run-time error when its argument is empty. Another way to deal with error cases is to track them explicitly, by introducing data values that are used explicitly to represent the error, or the lack of a meaningful answer.

For example, the type

> type MaybeInt = YesInt Int | NoInt

describes two kinds of values: ones labeled YesInt that do come bundled with an Int, and ones labeled NoInt that do not come bundled with any other data. In other words, the latter can be used to encode when there is no meaningful Int to return:

> head xs = case xs of \
|   x::_ -> YesInt x \
|   []   -> NoInt
<function> : List Int -> Repl.MaybeInt

> head (List.range 1 4)
YesInt 1 : Repl.MaybeInt

> head []
NoInt : Repl.MaybeInt

Ah, much better than a run-time error!

This MaybeInt type is defined to work only with Ints, but the same pattern — the presence or absence of a meaningful result — will emerge with all different types of values.

Polymorphic datatypes to the rescue:

> type MaybeData a = YesData a | NoData

As when calling polymorphic functions, type variables for type constructors like MaybeData get instantiated to particular type arguments in order to match the kinds of values it is being used with.

Polymorphic datatypes and polymorphic functions make a formidable duo:

> head xs = case xs of \
|   x::_ -> YesData x \
|   []   -> NoData
<function> : List a -> Repl.MaybeData a

> head ['a','b','c']
YesData 'a' : Repl.MaybeData Char

> head ["a","b","c"]
YesData "a" : Repl.MaybeData String

> head (List.range 1 4)
YesData 1 : Repl.MaybeData Int

> head []
NoData : Repl.MaybeData a

“For every type a, NoData has type MaybeData a.” Cool, NoData is a polymorphic constant and its type may be instantiated, or specialized, depending on how it is used.

The MaybeData pattern is so common that there’s a library called Maybe that provides the following type, which is like ours but with different names:

type Maybe a = Just a | Nothing

There’s also a related library and type called Result that generalizes the Maybe pattern. Check them out, and also see IntroML.elm for a couple of simple examples.
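
Here is a minimal sketch of what using the library types might look like, assuming Elm 0.18. The names safeHead, headOrZero, and safeDiv are made up for this example and are not necessarily what IntroML.elm contains:

```elm
module SafeHead exposing (..)

-- Our head function, rewritten with the library's Maybe type.
safeHead : List a -> Maybe a
safeHead xs =
    case xs of
        x :: _ ->
            Just x

        [] ->
            Nothing

-- Maybe.withDefault supplies a fallback for the Nothing case.
headOrZero : List Int -> Int
headOrZero xs =
    Maybe.withDefault 0 (safeHead xs)

-- Result is like Maybe, except the failure case carries a value too.
safeDiv : Int -> Int -> Result String Int
safeDiv n d =
    if d == 0 then
        Err "division by zero"
    else
        Ok (n // d)

-- headOrZero [] == 0
-- headOrZero [ 5, 6 ] == 5
-- safeDiv 7 2 == Ok 3
-- safeDiv 7 0 == Err "division by zero"
```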

Let-Expressions

So far we have worked only with top-level definitions. Elm’s let-expressions allow the definition of variables that are “local” to the enclosing scope. As with other language features, whitespace matters so make sure equations are aligned.

plus3 a =
  let b = a + 1 in
  let c = b + 1 in
  let d = c + 1 in
    d

No need to write so many lets and ins:

plus3 a =
  let
    b = a + 1
    c = b + 1
    d = c + 1
  in
    d

Too many local variables can sometimes obscure meaning (just as too few variables can). In this case, the “pipelined” definition

plus3 a = a |> plus 1 |> plus 1 |> plus 1

and, better yet, the definition by function composition

plus3 = plus 1 << plus 1 << plus 1

are, arguably, more readable.

List Concatenation

There’s a “primitive typeclass” (in addition to number and comparable, discussed above) called appendable, which describes types including lists and strings:

> (++)
<function> : appendable -> appendable -> appendable

> "hello" ++ " world"
"hello world" : String

> List.range 1 10 ++ List.range 11 20
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] : List Int

Records

Records are like tuples, where the components (i.e. fields) are denoted by name rather than position and where the order of components is irrelevant. Record patterns bind the values of components by name, and they can omit fields that are not needed.

> type alias Point = { x : Int, y : Int }

> let {x,y} = {y=2, x=1} in x + y
3 : number

> let {x} = {y=2, x=1} in x
1 : number

Read more about records. Records can be polymorphic and even extensible:

type alias PolymorphicPoint number = { x : number, y : number }

type alias PointLike a number = { a | x : number, y : number }
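
An extensible record type lets a function accept any record that has at least the named fields. A minimal sketch, assuming Elm 0.18 (sumXY and the module name are made up for illustration):

```elm
module Points exposing (..)

type alias PointLike a number =
    { a | x : number, y : number }

-- Accepts any record with numeric x and y fields, plus whatever else.
sumXY : PointLike a number -> number
sumXY p =
    p.x + p.y

-- sumXY { x = 1, y = 2 } == 3
-- sumXY { x = 1.5, y = 2, z = 10 } == 3.5
```

The type variable a stands for “the rest of the record,” so the extra z field in the second call is simply ignored.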

Datatypes, record types, and type aliases are orthogonal:

> type alias T = {x:String}
> type S1 = S1 {x:String}
> type S2 = S2 T
> type U = U1 T | U2 {x:Int} | U3 (Int, String) | U4

Reading

Required

Syntax Reference
Libraries: Basics, Maybe, List

Additional

If you would like to see the syntax and features of two other ML dialects, Standard ML and OCaml, take a look at this and this.

PFP, Spring 2018