NOTE: Updated 5/11 and 5/16.
A common data structure that incorporates laziness is a lazy list (a.k.a. stream). Having worked through laziness in Elm in detail using the previous examples, our discussion of streams here will be brief, mainly focusing on picking the right representation.
NotSoLazyList.elm
One possibility for representing LazyLists is the following type.
type LazyList a
    = Nil
    | Cons (Lazy a) (LazyList a)
This datatype describes lists that are not very lazy, however. We can define a function range : Int -> Int -> LazyList Int and demonstrate how a LazyList of n elements immediately builds n Cons cells.
> range 1 10
Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>) Nil)))))))))
: NotSoLazyList.LazyList Int
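The definition of range for this representation is not shown above. A minimal sketch of what it might look like follows. It assumes a hand-rolled Lazy type in place of the Lazy library so that the module is self-contained, and length is a hypothetical helper added only to count cells.

```elm
module NotSoLazyList exposing (..)

-- Assumption: a hand-rolled thunk type stands in for the Lazy library
-- (it delays evaluation but does not memoize).


type Lazy a
    = Lazy (() -> a)


lazy : (() -> a) -> Lazy a
lazy =
    Lazy


type LazyList a
    = Nil
    | Cons (Lazy a) (LazyList a)


-- Each element is wrapped in a thunk, but the recursive call runs right away,
-- so every Cons cell is built as soon as range is called.
range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        Nil

    else
        Cons (lazy (\_ -> i)) (range (i + 1) j)


-- Hypothetical helper: counts the cells without forcing any element.
length : LazyList a -> Int
length xs =
    case xs of
        Nil ->
            0

        Cons _ rest ->
            1 + length rest
```

With these definitions, length (range 1 10) evaluates to 10 even though none of the element thunks has been forced, which is the sense in which this list is "not so lazy".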
PrettyLazyList.elm
Another option is the following.
type LazyList a
    = Nil
    | Cons a (Lazy (LazyList a))
This is pretty good, but notice that a non-Nil list must have its first value evaluated. Consider what the representation of a range of Ints looks like.
> range 1 10
Cons 1 (Lazy <function>) : PrettyLazyList.LazyList Int
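Again the range function itself is not shown. One plausible definition for this representation is sketched below, with a hand-rolled Lazy type assumed in place of the Lazy library. The helpers head and tail are hypothetical and only there to inspect the result.

```elm
module PrettyLazyList exposing (..)

-- Assumption: hand-rolled thunks instead of the Lazy library (no memoization).


type Lazy a
    = Lazy (() -> a)


lazy : (() -> a) -> Lazy a
lazy =
    Lazy


force : Lazy a -> a
force (Lazy f) =
    f ()


type LazyList a
    = Nil
    | Cons a (Lazy (LazyList a))


-- The first element is computed eagerly; only the rest of the list is delayed.
range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        Nil

    else
        Cons i (lazy (\_ -> range (i + 1) j))


-- Hypothetical helper: the first element is available without forcing anything.
head : LazyList a -> Maybe a
head xs =
    case xs of
        Nil ->
            Nothing

        Cons x _ ->
            Just x


-- Hypothetical helper: forcing the tail builds exactly one more cell.
tail : LazyList a -> LazyList a
tail xs =
    case xs of
        Nil ->
            Nil

        Cons _ rest ->
            force rest
```

Here head (range 1 10) is Just 1 with no forcing at all, and head (tail (range 1 10)) is Just 2 after a single force.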
LazyList.elm
What we really want is for all elements in the list, including the first, to be delayed until needed. We can achieve this as follows.
type alias LazyList a = Lazy (LazyListCell a)
type LazyListCell a
    = Nil
    | Cons a (LazyList a)
Thought Exercise: Why didn’t we use a similar strategy in defining the lazy Nats before?
range
The range function is incremental. Notice the trivial suspension lazy (\_ -> Nil).
range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        lazy (\_ -> Nil)
    else
        lazy (\_ -> Cons i (range (i+1) j))
The comparison i > j isn’t expensive, so we decided to evaluate it right away rather than delaying it by putting it inside the LazyList.
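For comparison, here is a sketch of the alternative just described, in which the comparison is delayed as well. The name rangeDelayed is hypothetical, the hand-rolled Lazy type is an assumption standing in for the Lazy library, and toList is included only so the result can be inspected.

```elm
module RangeDelayed exposing (..)

-- Assumption: hand-rolled thunks instead of the Lazy library (no memoization).


type Lazy a
    = Lazy (() -> a)


lazy : (() -> a) -> Lazy a
lazy =
    Lazy


force : Lazy a -> a
force (Lazy f) =
    f ()


type alias LazyList a =
    Lazy (LazyListCell a)


type LazyListCell a
    = Nil
    | Cons a (LazyList a)


-- Hypothetical variant of range: the test i > j sits inside the thunk,
-- so it runs only when the cell is forced.
rangeDelayed : Int -> Int -> LazyList Int
rangeDelayed i j =
    lazy
        (\_ ->
            if i > j then
                Nil

            else
                Cons i (rangeDelayed (i + 1) j)
        )


-- Forces every cell and collects the elements in order.
toList : LazyList a -> List a
toList xs =
    let
        go acc ys =
            case force ys of
                Nil ->
                    List.reverse acc

                Cons y rest ->
                    go (y :: acc) rest
    in
    go [] xs
```

Both versions produce the same elements; the only difference is when the comparison happens. Since the comparison is cheap, the eager version saves nothing observable but avoids wrapping a conditional in every thunk.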
We can also define a “debug” version to emphasize when list items get forced to evaluate:
range_ : Int -> Int -> LazyList Int
range_ i j =
    if i > j then lazy (\_ -> Nil)
    else lazy <| \_ ->
        let _ = Debug.log "force" i in
        Cons i (range_ (i+1) j)
toList
Converting a stream to a List is monolithic:
toList : LazyList a -> List a
toList xs =
    let
        foo acc xs =
            case force xs of
                Nil -> acc
                Cons x xs_ -> foo (x::acc) xs_
    in
    List.reverse <| foo [] xs
Now we can force the incremental range function to do its work:
> range_ 1 5 |> toList
force: 1
force: 2
force: 3
force: 4
force: 5
[1,2,3,4,5]
: List Int
infinite
We can also describe infinite streams.
infinite : Int -> LazyList Int
infinite i = lazy (\_ -> Cons i (infinite (i+1)))
Let’s define a debug version again:
infinite_ : Int -> LazyList Int
infinite_ i = lazy <| \_ ->
    let _ = Debug.log "force" i in
    Cons i (infinite_ (i+1))
Not surprisingly, we don’t have enough memory to represent all positive integers:
> infinite_ 1 |> toList
FATAL ERROR: JS Allocation failed - process out of memory
take
The take function is incremental.
take : Int -> LazyList a -> LazyList a
take k xs =
    case (k, force xs) of
        (0, _) -> lazy (\_ -> Nil)
        (_, Nil) -> lazy (\_ -> Nil)
        (_, Cons x xs) -> lazy (\_ -> Cons x (take (k-1) xs))
Incremental function in action:
> infinite 1
Lazy <function> : Lazy.Lazy (LazyList.LazyListCell Int)
> infinite 1 |> take 10
Lazy <function> : Lazy.Lazy (LazyList.LazyListCell Int)
> infinite 1 |> take 10 |> toList
[1,2,3,4,5,6,7,8,9,10] : List Int
But there is still some unnecessary work; take forces the input list even if no elements are taken:
> infinite_ 1 |> take 0 |> toList
force: 1
[]
: List Int
A slightly lazier version of take:
take k xs =
    if k <= 0 then lazy (\_ -> Nil)
    else
        case force xs of
            Nil -> lazy (\_ -> Nil)
            Cons x xs_ -> lazy (\_ -> Cons x (take (k-1) xs_))
This no longer forces the list when zero elements are taken…
> infinite_ 1 |> take 0 |> toList
[] : List Int
> infinite_ 1 |> take 5 |> toList
force: 1
force: 2
force: 3
force: 4
force: 5
[1,2,3,4,5]
: List Int
… but it still forces the first cell of the list before any element is actually needed:
> infinite_ 1 |> take 5
force: 1
Lazy <function>
: LazyList.LazyList Int
Lazier:
take k xs =
    if k <= 0 then lazy (\_ -> Nil)
    else
        lazy <| \_ ->
            case force xs of
                Nil -> Nil
                Cons x xs_ -> Cons x (take (k-1) xs_)
That’s better:
> infinite_ 1 |> take 5
Lazy <function> : LazyList.LazyList Int
drop
The drop function is also incremental.
drop : Int -> LazyList a -> LazyList a
drop k xs =
    if k <= 0 then xs
    else
        lazy <| \_ ->
            case force xs of
                Nil -> Nil
                Cons _ xs_ -> force (drop (k-1) xs_)
For example:
> infinite 1 |> drop 10 |> take 10 |> toList
[11,12,13,14,15,16,17,18,19,20] : List Int
append
Combining two streams using append is incremental.
append : LazyList a -> LazyList a -> LazyList a
append xs ys =
    lazy <| \_ ->
        case force xs of
            Nil -> force ys
            Cons x xs_ -> Cons x (append xs_ ys)
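One way to see that append is incremental is to give it an infinite second argument and take only a prefix of the result. The sketch below gathers the definitions from this section into one self-contained module. It assumes hand-rolled thunks in place of the Lazy library, and example and example2 are hypothetical names.

```elm
module AppendExample exposing (..)

-- Assumption: hand-rolled thunks instead of the Lazy library (no memoization).


type Lazy a
    = Lazy (() -> a)


lazy : (() -> a) -> Lazy a
lazy =
    Lazy


force : Lazy a -> a
force (Lazy f) =
    f ()


type alias LazyList a =
    Lazy (LazyListCell a)


type LazyListCell a
    = Nil
    | Cons a (LazyList a)


range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        lazy (\_ -> Nil)

    else
        lazy (\_ -> Cons i (range (i + 1) j))


infinite : Int -> LazyList Int
infinite i =
    lazy (\_ -> Cons i (infinite (i + 1)))


-- The laziest version of take from this section.
take : Int -> LazyList a -> LazyList a
take k xs =
    if k <= 0 then
        lazy (\_ -> Nil)

    else
        lazy
            (\_ ->
                case force xs of
                    Nil ->
                        Nil

                    Cons x rest ->
                        Cons x (take (k - 1) rest)
            )


-- ys is forced only after xs has been exhausted.
append : LazyList a -> LazyList a -> LazyList a
append xs ys =
    lazy
        (\_ ->
            case force xs of
                Nil ->
                    force ys

                Cons x rest ->
                    Cons x (append rest ys)
        )


toList : LazyList a -> List a
toList xs =
    let
        go acc ys =
            case force ys of
                Nil ->
                    List.reverse acc

                Cons y rest ->
                    go (y :: acc) rest
    in
    go [] xs


-- Hypothetical examples: each terminates although one argument is infinite.
example : List Int
example =
    append (range 1 3) (infinite 10) |> take 5 |> toList


example2 : List Int
example2 =
    append (infinite 1) (range 1 3) |> take 2 |> toList
```

Here example evaluates to [1,2,3,10,11] and example2 to [1,2]. Neither would terminate if append had to traverse both of its arguments before returning.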
reverse
Reversing a stream delays forcing the input list…
reverse : LazyList a -> LazyList a
reverse xs =
    lazy <| \_ ->
        case force xs of
            Nil -> Nil
            Cons x xs_ -> force (append (reverse xs_) (singleton x))
nil = lazy (\_ -> Nil)
singleton x = lazy (\_ -> Cons x nil)
… but once it is forced, the recursion is monolithic:
> reverse (range_ 1 5) |> toList
force: 1
force: 2
force: 3
force: 4
force: 5
[5,4,3,2,1]
: List Int
The nested forcing also overflows the stack on a longer list (eq is defined later in this section):
> eq (range 1 1) (range 1 10000)
False : Bool
> eq (range 1 1) (reverse (range 1 10000))
RangeError: Maximum call stack size exceeded
So, we should make it tail-recursive: (NOTE 5/16: Updated the Cons case below.)
reverse : LazyList a -> LazyList a
reverse xs =
    let
        foo acc xs =
            case force xs of
                Nil -> acc
                Cons x xs_ -> lazy (\_ -> force (foo (lazy (\_ -> Cons x acc)) xs_))
                -- Cons x xs_ -> foo (lazy (\_ -> Cons x acc)) xs_
    in
    lazy (\_ -> force (foo nil xs))
Notice that lazy (\_ -> Cons x acc) above is another example of a trivial thunk. The values x and acc have already been evaluated, so building the Cons value does not force any additional computations.
Hmm, even though this version does not make the recursive call to the helper function foo right away, it still busts the stack…
> eq (range 1 1) (reverse (range 1 10000))
RangeError: Maximum call stack size exceeded
What if we write a tail-recursive function that does not attempt to delay any of the (non-trivial) computation?
reverse2 : LazyList a -> LazyList a
reverse2 xs =
    let
        foo acc xs =
            case force xs of
                Nil -> acc
                Cons x xs_ -> foo (lazy (\_ -> Cons x acc)) xs_
    in
    foo nil xs
This works okay here…
> eq (range 1 1) (reverse2 (range 1 10000))
False : Bool
… but there are new issues:
> range 1 5 |> reverse2 |> toList
FATAL ERROR: JS Allocation failed - process out of memory
> range 1 5 |> reverse2 |> take 2 |> toList
[<internal structure>,<internal structure>] : List Int
Out of memory for such a small list? And “internal structure” values? If we swap out the use of Lazy with hand-rolled thunks instead…
-- import Lazy exposing (Lazy, lazy, force)
type Lazy a = Lazy (() -> a)
force (Lazy f) = f ()
lazy = Lazy
… we get the same last two behaviors above. So, the issue does not seem to stem from the Lazy library.
I’m not sure… let’s live with the version above that busts the stack.
eq
Our final monolithic example function checks for equality, forcing only as many elements as needed when the lists are not equal.
eq : LazyList a -> LazyList a -> Bool
eq xs ys =
    case (force xs, force ys) of
        (Nil, Nil) -> True
        (Cons x xs_, Cons y ys_) -> x == y && eq xs_ ys_
        _ -> False
This can break out early, but it busts the stack when the two lists share a long common prefix:
> eq (range 0 1000) (range 0 1000)
True : Bool
> eq (range 0 1000) (range 0 10000)
False : Bool
> eq (range 0 10000) (range 0 10000)
RangeError: Maximum call stack size exceeded
Even though (&&) has short-circuiting semantics, the recursive call in x == y && eq xs_ ys_ is not in a form that the compiler recognizes for tail call elimination. So let’s use a conditional instead:
...
(Cons x xs_, Cons y ys_) -> if x /= y then False else eq xs_ ys_
...
That’s better:
> eq (range 0 10000) (range 0 10000)
True : Bool
> eq (range 1 10) (range 1 10000000)
False : Bool
> eq (range 1 10) (range 1 1000000000000000)
False : Bool
> eq (range 1 10) (infinite 1)
False : Bool
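For reference, the conditional version of eq written out in full might look like the sketch below. It assumes hand-rolled thunks in place of the Lazy library so that the module stands alone, and it relies on the recursive call sitting directly in a branch of the if, which is the position the compiler turns into a loop.

```elm
module EqExample exposing (..)

-- Assumption: hand-rolled thunks instead of the Lazy library (no memoization).


type Lazy a
    = Lazy (() -> a)


lazy : (() -> a) -> Lazy a
lazy =
    Lazy


force : Lazy a -> a
force (Lazy f) =
    f ()


type alias LazyList a =
    Lazy (LazyListCell a)


type LazyListCell a
    = Nil
    | Cons a (LazyList a)


range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        lazy (\_ -> Nil)

    else
        lazy (\_ -> Cons i (range (i + 1) j))


-- Stops at the first mismatch; the recursive call is in tail position.
eq : LazyList a -> LazyList a -> Bool
eq xs ys =
    case ( force xs, force ys ) of
        ( Nil, Nil ) ->
            True

        ( Cons x xs_, Cons y ys_ ) ->
            if x /= y then
                False

            else
                eq xs_ ys_

        _ ->
            False
```

Because each step forces exactly one cell from each list, comparing a short list against a very long or infinite one stops as soon as the short list runs out.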