NOTE: Updated 5/11 and 5/16.
A common data structure that incorporates laziness is a lazy list (a.k.a. stream). Having worked through laziness in Elm in detail using the previous examples, our discussion of streams here will be brief, mainly focusing on picking the right representation.
NotSoLazyList.elm
One possibility for representing LazyLists is the following type.
type LazyList a
    = Nil
    | Cons (Lazy a) (LazyList a)
This datatype describes lists that are not very lazy, however. We can define a function range : Int -> Int -> LazyList Int and demonstrate how a LazyList of n elements immediately builds n Cons cells.
> range 1 10
Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>)
(Cons (Lazy <function>) Nil)))))))))
: NotSoLazyList.LazyList Int
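The definition of range for this representation is not shown above. A minimal sketch, assuming Elm 0.18 and the Lazy library's lazy function, might look like the following; the module header and the body of range are illustrative guesses rather than the original file.

```elm
module NotSoLazyList exposing (..)

import Lazy exposing (Lazy, lazy)


type LazyList a
    = Nil
    | Cons (Lazy a) (LazyList a)


-- Hypothetical definition: the recursive call is an ordinary argument to
-- Cons, so the whole spine is built eagerly. Only each element is wrapped
-- in a (trivial) suspension.
range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        Nil
    else
        Cons (lazy (\_ -> i)) (range (i + 1) j)
```

Because the tail of each Cons is a plain LazyList, nothing delays the recursion, which is consistent with the ten Cons cells printed for range 1 10.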
PrettyLazyList.elm
Another option is the following.
type LazyList a
    = Nil
    | Cons a (Lazy (LazyList a))
This is pretty good, but notice that a non-Nil list must have its first value evaluated. Consider what the representation of a range of Ints looks like.
> range 1 10
Cons 1 (Lazy <function>) : PrettyLazyList.LazyList Int
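Again the definition of range is not shown. A sketch for this second representation, under the same assumptions (Elm 0.18, the Lazy library, an illustrative module header), could be:

```elm
module PrettyLazyList exposing (..)

import Lazy exposing (Lazy, lazy)


type LazyList a
    = Nil
    | Cons a (Lazy (LazyList a))


-- Hypothetical definition: the first element is computed right away,
-- while the rest of the list is suspended inside the tail.
range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        Nil
    else
        Cons i (lazy (\_ -> range (i + 1) j))
```

Here the recursive call sits inside a suspension, so only one Cons cell exists until the tail is forced.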
LazyList.elm
What we really want is for all elements in the list, including the first, to be delayed until needed. We can achieve this as follows.
type alias LazyList a = Lazy (LazyListCell a)

type LazyListCell a
    = Nil
    | Cons a (LazyList a)
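To make this representation concrete, here is a small sketch of a two-element list built by hand, together with a helper that forces only the outermost suspension. The names oneTwo and firstOr and the module header are hypothetical and serve only as illustration; the sketch assumes Elm 0.18 and the Lazy library.

```elm
module LazyListExample exposing (..)

import Lazy exposing (Lazy, lazy, force)


type alias LazyList a =
    Lazy (LazyListCell a)


type LazyListCell a
    = Nil
    | Cons a (LazyList a)


-- Hypothetical example value: the list [1, 2], with every cell suspended,
-- including the first one and the final Nil.
oneTwo : LazyList Int
oneTwo =
    lazy (\_ -> Cons 1 (lazy (\_ -> Cons 2 (lazy (\_ -> Nil)))))


-- Hypothetical helper: forcing the outer suspension exposes the first
-- cell; the tail stays unevaluated.
firstOr : a -> LazyList a -> a
firstOr fallback xs =
    case force xs of
        Nil ->
            fallback

        Cons x _ ->
            x
```

With these definitions, firstOr 0 oneTwo evaluates to 1 and forces only the first of the three suspensions.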
Thought Exercise: Why didn’t we use a similar strategy in defining the lazy Nats before?
range
The range function is incremental. Notice the trivial suspension lazy (\_ -> Nil).
range : Int -> Int -> LazyList Int
range i j =
    if i > j
        then lazy (\_ -> Nil)
        else lazy (\_ -> Cons i (range (i+1) j))
The comparison i > j isn’t expensive, so we decided to evaluate it right away rather than delaying it by putting it inside the LazyList.
We can also define a “debug” version to emphasize when list items get forced to evaluate:
range_ : Int -> Int -> LazyList Int
range_ i j =
    if i > j then lazy (\_ -> Nil)
    else lazy <| \_ ->
        let _ = Debug.log "force" i in
        Cons i (range_ (i+1) j)
toList
Converting a stream to a List is monolithic:
toList : LazyList a -> List a
toList xs =
    let foo acc xs = case force xs of
            Nil -> acc
            Cons x xs_ -> foo (x::acc) xs_
    in
    List.reverse <| foo [] xs
Now we can force the incremental range function to do its work:
> range_ 1 5 |> toList
force: 1
force: 2
force: 3
force: 4
force: 5
[1,2,3,4,5]
: List Int
infinite
We can also describe infinite streams.
infinite : Int -> LazyList Int
infinite i = lazy (\_ -> Cons i (infinite (i+1)))
Let’s define a debug version again:
infinite_ : Int -> LazyList Int
infinite_ i = lazy <| \_ ->
    let _ = Debug.log "force" i in
    Cons i (infinite_ (i+1))
Not surprisingly, we don’t have enough memory to represent all positive integers:
> infinite_ 1 |> toList
FATAL ERROR: JS Allocation failed - process out of memory
take
The take function is incremental.
take : Int -> LazyList a -> LazyList a
take k xs =
    case (k, force xs) of
        (0, _) -> lazy (\_ -> Nil)
        (_, Nil) -> lazy (\_ -> Nil)
        (_, Cons x xs) -> lazy (\_ -> Cons x (take (k-1) xs))
Incremental function in action:
> infinite 1
Lazy <function> : Lazy.Lazy (LazyList.LazyListCell Int)
> infinite 1 |> take 10
Lazy <function> : Lazy.Lazy (LazyList.LazyListCell Int)
> infinite 1 |> take 10 |> toList
[1,2,3,4,5,6,7,8,9,10] : List Int
But there is still some unnecessary work; take forces the input list even if no elements are taken:
> infinite_ 1 |> take 0 |> toList
force: 1
[]
: List Int
A slightly lazier version of take:
take k xs =
    if k <= 0 then lazy (\_ -> Nil)
    else
        case force xs of
            Nil -> lazy (\_ -> Nil)
            Cons x xs_ -> lazy (\_ -> Cons x (take (k-1) xs_))
This no longer forces the list when zero elements are taken…
> infinite_ 1 |> take 0 |> toList
[] : List Int
> infinite_ 1 |> take 5 |> toList
force: 1
force: 2
force: 3
force: 4
force: 5
[1,2,3,4,5]
: List Int
… but it still forces the first cell of the list before any element is actually needed:
> infinite_ 1 |> take 5
force: 1
Lazy <function>
: LazyList.LazyList Int
Lazier:
take k xs =
    if k <= 0 then lazy (\_ -> Nil)
    else
        lazy <| \_ ->
            case force xs of
                Nil -> Nil
                Cons x xs_ -> Cons x (take (k-1) xs_)
That’s better:
> infinite_ 1 |> take 5
Lazy <function> : LazyList.LazyList Int
drop
The drop function is also incremental.
drop : Int -> LazyList a -> LazyList a
drop k xs =
    if k <= 0 then xs
    else
        lazy <| \_ ->
            case force xs of
                Nil -> Nil
                Cons _ xs_ -> force (drop (k-1) xs_)
For example:
> infinite 1 |> drop 10 |> take 10 |> toList
[11,12,13,14,15,16,17,18,19,20] : List Int
append
Combining two streams using append is incremental.
append : LazyList a -> LazyList a -> LazyList a
append xs ys =
    lazy <| \_ ->
        case force xs of
            Nil -> force ys
            Cons x xs_ -> Cons x (append xs_ ys)
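For example, the following self-contained sketch appends two finite ranges and converts the result to a List. It assumes Elm 0.18 and the Lazy library; the module header, the value example, and the helper name go are hypothetical, and toList is restated with non-shadowing variable names.

```elm
module AppendExample exposing (..)

import Lazy exposing (Lazy, lazy, force)


type alias LazyList a =
    Lazy (LazyListCell a)


type LazyListCell a
    = Nil
    | Cons a (LazyList a)


range : Int -> Int -> LazyList Int
range i j =
    if i > j then
        lazy (\_ -> Nil)
    else
        lazy (\_ -> Cons i (range (i + 1) j))


-- Incremental: each forced cell of the result forces one cell of xs,
-- and ys is forced only once xs is exhausted.
append : LazyList a -> LazyList a -> LazyList a
append xs ys =
    lazy <| \_ ->
        case force xs of
            Nil ->
                force ys

            Cons x xs_ ->
                Cons x (append xs_ ys)


-- Monolithic: walks the whole stream, accumulating in reverse.
toList : LazyList a -> List a
toList xs =
    let
        go acc ys =
            case force ys of
                Nil ->
                    acc

                Cons y rest ->
                    go (y :: acc) rest
    in
        List.reverse (go [] xs)


-- Hypothetical usage value.
example : List Int
example =
    append (range 1 3) (range 4 6) |> toList
```

Here example evaluates to [1,2,3,4,5,6].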
reverse
Reversing a stream delays forcing the input list…
reverse : LazyList a -> LazyList a
reverse xs =
    lazy <| \_ ->
        case force xs of
            Nil -> Nil
            Cons x xs_ -> force (append (reverse xs_) (singleton x))

nil = lazy (\_ -> Nil)

singleton x = lazy (\_ -> Cons x nil)
… but once it is forced, the recursion is monolithic:
> reverse (range_ 1 5) |> toList
force: 1
force: 2
force: 3
force: 4
force: 5
[5,4,3,2,1]
: List Int
> eq (range 1 1) (range 1 10000)
False : Bool
> eq (range 1 1) (reverse (range 1 10000))
RangeError: Maximum call stack size exceeded
So, we should make it tail-recursive: (NOTE 5/16: Updated the Cons case below.)
reverse : LazyList a -> LazyList a
reverse xs =
    let foo acc xs =
            case force xs of
                Nil -> acc
                Cons x xs_ -> lazy (\_ -> force (foo (lazy (\_ -> Cons x acc)) xs_))
                -- Cons x xs_ -> foo (lazy (\_ -> Cons x acc)) xs_
    in
    lazy (\_ -> force (foo nil xs))
Notice that lazy (\_ -> Cons x acc) above is another example of a trivial thunk. The values x and acc have already been evaluated, so building the Cons value does not force any additional computations.
Hmm, even though this version does not make the recursive call to the helper function foo right away, it still busts the stack…
> eq (range 1 1) (reverse (range 1 10000))
RangeError: Maximum call stack size exceeded
What if we write a tail-recursive function that does not attempt to delay any of the (non-trivial) computation?
reverse2 : LazyList a -> LazyList a
reverse2 xs =
    let foo acc xs =
            case force xs of
                Nil -> acc
                Cons x xs_ -> foo (lazy (\_ -> Cons x acc)) xs_
    in
    foo nil xs
This works okay here…
> eq (range 1 1) (reverse2 (range 1 10000))
False : Bool
… but there are new issues:
> range 1 5 |> reverse2 |> toList
FATAL ERROR: JS Allocation failed - process out of memory
> range 1 5 |> reverse2 |> take 2 |> toList
[<internal structure>,<internal structure>] : List Int
Out of memory for such a small list? And “internal structure” values? If we swap out the use of Lazy with hand-rolled thunks instead…
-- import Lazy exposing (Lazy, lazy, force)
type Lazy a = Lazy (() -> a)
force (Lazy f) = f ()
lazy = Lazy
… we get the same last two behaviors above. So, the issue does not seem to stem from the Lazy library.
I’m not sure… let’s live with the version above that busts the stack.
eq
Our final monolithic example function checks for equality, forcing only as many elements as needed when the lists are not equal.
eq : LazyList a -> LazyList a -> Bool
eq xs ys =
    case (force xs, force ys) of
        (Nil, Nil) -> True
        (Cons x xs_, Cons y ys_) -> x == y && eq xs_ ys_
        _ -> False
This can break out early, but the recursion busts the stack when many elements must be compared:
> eq (range 0 1000) (range 0 1000)
True : Bool
> eq (range 0 1000) (range 0 10000)
False : Bool
> eq (range 0 10000) (range 0 10000)
RangeError: Maximum call stack size exceeded
Even though (&&) has short-circuiting semantics, this syntactic expression eludes the compiler’s support for tail call elimination. So let’s use a conditional instead:
...
(Cons x xs_, Cons y ys_) -> if x /= y then False else eq xs_ ys_
...
That’s better:
> eq (range 0 10000) (range 0 10000)
True : Bool
> eq (range 1 10) (range 1 10000000)
False : Bool
> eq (range 1 10) (range 1 1000000000000000)
False : Bool
> eq (range 1 10) (infinite 1)
False : Bool
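For reference, here is the conditional placed back into the full definition of eq. This is a sketch assuming Elm 0.18 and the Lazy library; the module header is illustrative, and only the Cons case differs from the first version.

```elm
module EqExample exposing (..)

import Lazy exposing (Lazy, force)


type alias LazyList a =
    Lazy (LazyListCell a)


type LazyListCell a
    = Nil
    | Cons a (LazyList a)


-- Forces both lists one cell at a time and stops at the first mismatch.
eq : LazyList a -> LazyList a -> Bool
eq xs ys =
    case ( force xs, force ys ) of
        ( Nil, Nil ) ->
            True

        ( Cons x xs_, Cons y ys_ ) ->
            -- The recursive call is the entire else branch, rather than
            -- an operand of (&&), so it sits in tail position.
            if x /= y then
                False
            else
                eq xs_ ys_

        _ ->
            False
```

Comparing against an infinite stream terminates as long as the other list is finite, because the catch-all case returns False as soon as one side reaches Nil.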