Recall the FastQueue.elm
implementation (referred to in the textbook as the “batched queue” implementation) that has O(1) amortized cost for each of the operations, including dequeue
, which has an O(n) worst-case cost. The amortized analysis, however, assumes that no Queue
object is used as the input for more than one Queue
operation. We will now see queue implementations that employ laziness in order to achieve O(1) amortized bounds despite the possibility that each queue is used persistently.
Consider the following example from the textbook, where q_0
is a queue whose front
and back
lists have equal length m
.
q_0 = ... -- |front| = |back| = m
q_1 = dequeue q_0
q_2 = dequeue q_1
...
q_m = dequeue q_m_minus_1
q_m_plus_1 = dequeue q_m
Using the FastQueue
implementation, the operation dequeue q_m_minus_1
triggers a call to List.reverse
, which runs in O(m
) time; all other calls to dequeue
run in O(1) time. In the absence of persistence, the amortized analysis is able to reason that this expensive operation happens infrequently compared to the cheap ones and that any sequence of n FastQueue
operations takes O(n) time overall.
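For reference, the core of that batched-queue implementation looks roughly like the following sketch (names mirror the lazy versions below; the actual FastQueue.elm may differ in details).

type Queue a = Q { front : List a, back : List a }

mkQ f b = Q { front = f, back = b }

-- Reestablish the invariant: front may be empty only if back is empty.
checkFront f b =
  case f of
    [] -> mkQ (List.reverse b) []
    _  -> mkQ f b

enqueue x (Q {front, back}) = checkFront front (x :: back)

dequeue (Q {front, back}) =
  case front of
    []        -> Nothing
    _ :: rest -> Just (checkFront rest back)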
LazyBatchedQueue.elm
As a first attempt, we might port the FastQueue
implementation so that it uses LazyList
s (a.k.a. streams) and memoizes the computation that involves the expensive call to List.reverse
.
We start by defining front
and back
to be LazyList
s rather than List
s.
type Queue a = Q { front : LazyList a, back : LazyList a }
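These definitions assume the Lazy and LazyList machinery from the earlier notes on laziness, roughly the following types and signatures, together with a small mkQ helper used below (a sketch; the earlier notes are the authoritative versions).

-- Assumed from earlier notes (sketch): Lazy suspensions memoize their result
-- once forced.
--   lazy  : (() -> a) -> Lazy a
--   force : Lazy a -> a

type alias LazyList a = Lazy (LazyListCell a)
type LazyListCell a = Nil | Cons a (LazyList a)

-- Small helper assumed by the code below.
mkQ f b = Q { front = f, back = b }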
The empty
queue contains two empty LazyList
s.
empty = mkQ nil nil
nil = lazy (\_ -> Nil)
Just like in FastQueue
, we will maintain the invariant that front
is empty only if back
is. As a result, we check whether a Queue
is empty by forcing front
to evaluate to a LazyListCell
and then checking whether it is Nil
or not.
isEmpty (Q {front, back}) =
case force front of
Nil -> True
_ -> False
Note: Why is it not a good idea to define isEmpty = (==) empty
?
The implementations of enqueue
, dequeue
, and peek
are similar to before, except that LazyListCell
s are created and pattern matched rather than typical List
cells.
enqueue x (Q {front, back}) =
checkFront front (lazy (\_ -> Cons x back))
dequeue (Q {front, back}) = case force front of
Nil -> Nothing
Cons _ f_ -> Just (checkFront f_ back)
peek (Q {front, back}) = case force front of
Nil -> Nothing
Cons x _ -> Just x
Recall that the checkFront
function enforces and reestablishes, if necessary, the invariant that front
is empty only if back
is. A first option is the following.
checkFront f b = case force f of
Nil -> mkQ (reverse b) nil
_ -> mkQ f b
However, this version forces the (monolithic) reverse
function to process the entire LazyList
right away; there are no suspensions. As a result, the reverse is performed anew every time the expensive operation dequeue q_m_minus_1
is evaluated.
Another option is to suspend the reverse
computation as follows.
checkFront f b = case force f of
Nil -> mkQ (lazy (\_ -> force (reverse b))) nil
_ -> mkQ f b
This delays the reverse
until the resulting LazyList
is actually needed (i.e. dequeue q_m
) and memoizes the result in case the expensive operation is evaluated again. A downside is that peek
no longer runs in worst-case O(1) time.
And in any case, if the operation dequeue q_m_minus_1
is evaluated again, a completely different reverse
suspension is created, so laziness and memoization cannot help amortize the cost. So this naive translation of FastQueue
does not address the challenge of using arbitrary versions of a Queue
persistently.
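To see the issue concretely, suppose the same queue value is dequeued twice (a hypothetical snippet, with q_m_minus_1 as in the sequence above):

qa = dequeue q_m_minus_1   -- builds one suspension around reverse
qb = dequeue q_m_minus_1   -- builds a second, unrelated suspension
-- The memoized result of one suspension cannot be reused by the other,
-- so the O(m) reverse may be paid twice.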
BankersQueue.elm
A more clever approach is based on the idea of having dequeue q_0
create a suspension involving the reverse
that is not forced until the dequeue q_m
operation. Separating the creation of the suspension from its evaluation leaves O(m) intervening operations to pay for the expensive O(m) cost of the reverse
.
To realize this strategy, we do not wait until the front
list is about to become empty before reversing the back
list. Instead, the back
list is reverse
d as soon as it becomes longer than the front
and is then append
ed to the front
in order to maintain the correct order of elements in the Queue
. The key is that the LazyList
defined by
front +++ reverse back -- where (+++) = append
does not immediately perform the monolithic call to reverse
because append
is an incremental function. Only after sufficient calls to dequeue
exhaust the front
list is back
reversed. Let’s go through the implementation of this strategy, and then discuss how it fares with respect to the problematic sequence of operations above.
The representation maintains the explicit Int
eger sizes of the front
and back
streams.
type Queue a = Q Int (LazyList a) Int (LazyList a)
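The operations below rely on the LazyList reverse and the incremental append (+++) from the LazyList module. For reference, a plausible sketch of the incremental append (the actual module definition may differ):

-- Incremental append (sketch): each force performs O(1) work and suspends
-- the rest; the second stream is untouched until the first is exhausted.
(+++) xs ys =
  lazy (\_ ->
    case force xs of
      Nil -> force ys
      Cons x rest -> Cons x (rest +++ ys))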
Describing the empty
queue is straightforward.
nil = lazy (\_ -> Nil)
empty = Q 0 nil 0 nil
isEmpty (Q i _ _ _) = i == 0
When the size of the front
stream is greater than 0
, peek
calls the LazyList
head
operation, which force
s the evaluation of the LazyListCell
and returns the first element of the resulting Cons
value.
peek (Q i front j back) =
if i == 0
then Nothing
else Just (head front)
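Here head, and the tail used further below, are the LazyList accessors from the earlier notes; roughly the following sketch, where each forces exactly one cell of the stream.

head xs =
  case force xs of
    Cons x _ -> x
    Nil -> Debug.crash "head: empty LazyList"

tail xs =
  case force xs of
    Cons _ rest -> rest
    Nil -> Debug.crash "tail: empty LazyList"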
If the back
stream is strictly smaller than the front
, then enqueue
(lazily) adds the new element x
to the back
. Otherwise, the new element is first added to the back, which is then reversed and (lazily) appended to the front, updating the size counts appropriately.
enqueue x (Q i front j back) =
if j < i
then Q i front (j+1) (lazy (\_ -> Cons x back))
else Q (i+j+1) (front +++ reverse (lazy (\_ -> Cons x back))) 0 nil
Similarly, dequeue
checks whether the operation results in the back
stream being longer than the new front
, in which case the back
is reverse
d and appended to the new front
. Recall that the LazyList
tail
operation force
s its argument and returns the second component (the rest of the stream) of the resulting Cons
value.
dequeue (Q i front j back) =
if i == 0 then Nothing
else if i == j then Just (Q (i+j-1) (tail front +++ reverse back) 0 nil)
else Just (Q (i-1) (tail front) j back)
These two operations can be refactored to use a common check
function that enforces the invariant that the back
is never longer than the front
.
enqueue x (Q i front j back) =
check i front (j+1) (lazy (\_ -> Cons x back))
dequeue (Q i front j back) =
if i == 0
then Nothing
else Just (check (i-1) (tail front) j back)
check i front j back =
if j > i
then Q (i+j) (front +++ reverse back) 0 nil
else Q i front j back
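As a small, hypothetical example of persistent use, both of the following dequeue the same intermediate queue; any suspension forced through q2a is memoized and therefore shared with q2b:

q1  = enqueue 3 (enqueue 2 (enqueue 1 empty))
q2a = dequeue q1
q2b = dequeue q1   -- reuses q1 persistently; forcing its suspensions via q2a also benefits q2b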
To see how this approach fares well even with persistent data structures, consider the sequence of m dequeue
operations from before. The first one, dequeue q_0
, creates a suspension involving reverse
that is forced by the dequeue q_m
operation. No other operation in the sequence creates a suspension. Therefore, to force another expensive call to reverse
requires another call to dequeue q_0
followed by m-1 calls to dequeue
. So, the O(m) cost of the reverse
can be amortized over the sequence of O(m) operations that must precede it.
Sections 6.1, 6.2, and 6.3 of the textbook show how to formalize this argument by adapting the banker’s method to account for lazy evaluation.
PhysicistsQueue.elm
The textbook describes another way to implement a strategy similar to the one employed by BankersQueue
. Recall that in the BankersQueue the (incremental) append
function waits until front
becomes empty before applying the (monolithic) reverse
function to back
.
Because the back
list is only ever processed by monolithic functions, there is no need for it to be represented using a LazyList
. Thus, one change to the representation is to use an ordinary List
for back
.
type Queue a = Q Int (LazyList a) Int (List a)
Using this representation, the key operation from before becomes
front ++++ List.reverse back
assuming that
(++++) : LazyList a -> List a -> LazyList a
is an incremental function. (Exercise — Implement (++++)
.)
Because this (++++)
function, like append
, is incremental, the resulting list is not entirely evaluated right away. As it turns out, the amortized analysis can be made to work even if this concatenation is performed eagerly. Thus, a second change to the representation is to store front
as a Lazy List
(an ordinary List
that is suspended) rather than a LazyList
(a stream).
type Queue a = Q Int (Lazy (List a)) Int (List a)
Using this representation, the key operation from before becomes:
force front ++ List.reverse back
A consequence of this representation is that peek
and dequeue
must force
the entire suspended front
list. To reduce the costs of these operations, the final change to the representation is to keep an additional (evaluated) List
called pre
that is a prefix of (the suspended List
) front
to facilitate fast access to the front of front
.
type Queue a = Q (List a) Int (Lazy (List a)) Int (List a)
Like in the BankersQueue
, the size of the back
is never allowed to become larger than the front
. In addition, pre
is allowed to be empty only if front
is empty. The check
and checkPre
functions enforce these invariants.
check pre i front j back =
if j <= i then checkPre pre i front j back
else
let front_ = lazy (\_ -> force front ++ List.reverse back) in
checkPre pre (i+j) front_ 0 []
checkPre pre i front j back =
case pre of
[] -> Q (force front) i front j back
_ -> Q pre i front j back
Emptiness:
empty = Q [] 0 (lazy (\_ -> [])) 0 []
isEmpty (Q _ i _ _ _) = i == 0
The enqueue
operation adds to the back
and peek
pulls from pre
, the evaluated prefix of the suspended front
.
enqueue x (Q pre i front j back) = check pre i front (j+1) (x::back)
peek (Q pre _ _ _ _) = List.head pre
Notice that dequeue
uses tail
to update both pre
and front
.
dequeue (Q pre i front j back) =
if i == 0 then Nothing
else
let pre_ = tail pre in
let front_ = lazy (\_ -> tail (force front)) in
Just (check pre_ (i-1) front_ j back)
tail = fromJust << List.tail
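where fromJust is assumed to be a small helper defined elsewhere in these notes, roughly:

fromJust mx =
  case mx of
    Just x -> x
    Nothing -> Debug.crash "fromJust: Nothing"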
Section 6.4 of the textbook shows how to adapt the physicist’s method to account for lazy evaluation and use it to argue that this implementation, like the BankersQueue
, has O(1) amortized costs even in the face of persistent access.