Let’s revisit the tree type as defined before.
type Tree a = Empty | Node a (Tree a) (Tree a)
We will define height as follows:
height t =
case t of
Empty -> 0
Node _ t1 t2 -> 1 + max (height t1) (height t2)
For example:
height Empty = 0
height (Node 0 Empty Empty) = 1
height (Node 0 (Node 1 Empty Empty) Empty) = 2
With this terminology, the height of a Tree
t
is the number of Node
s along the longest path from the root (i.e. t
) to an Empty
tree. Equivalently, the height is the number of edges along the longest path from the root to an Empty
tree.
Alternatively, we can define height to be the number of edges along the longest path from the root to a “leaf” node (in our case, a Node
with two Empty
children). Perhaps this definition is more common and intuitive. In a functional language, however, both definitions are, arguably, reasonable since empty nodes are represented explicitly (i.e. Empty
) instead of implicitly as a null
pointer to a tree. (In a language like Java, a value of a reference type T
is either null
or a reference to a value of type T
— akin to Maybe (Ref T)
).
In any case, let’s stick with the first definition above. (We will use it again in Homework 3). Furthermore, let’s refer to the depth of a Tree
t
(with respect to a root Tree
) as the number of edges from the root to t
. With this terminology, depth in a Tree
of height h
ranges from 0
to h
.
MoreTrees.elm
contains the code below.
Let’s write a function that determines whether a Tree
is full.
isFull : Tree a -> Bool
There are many ways to implement this. Let’s start with the following version:
isFull t =
let h = height t in
let foo depth tree =
if depth == h then True
else {- depth < h -}
case tree of
Empty -> False
Node _ t1 t2 ->
let b1 = foo (depth+1) t1 in
let b2 = foo (depth+1) t2 in
b1 && b2
in
h == 0 || foo 0 t
Here, we first compute the height h
and then use a helper function foo
to recursively walk the tree, keeping track of the depth
of the tree
under consideration. Empty
is allowed only at the bottom-most level; the then
branch returns True
without discriminating tree
, because it must be Empty
based on how height
has computed h
. At all other levels (handled by the else
branch), the tree
is required to be a Node
and, if so, its children are recursively traversed.
Let’s improve the definition of isFull
above in two ways. First, we will eliminate the second recursive call when the first returns False
:
...
let b1 = foo (depth+1) t1 in
let b2 = foo (depth+1) t2 in
b1 && b2
...
Second, we will avoid the need to call height
first before making a “second pass” over the Tree
to check for fullness:
...
let h = height t in
...
We might attempt the first improvement as follows:
(foo (depth+1) t1) && (foo (depth+1) t2)
Elm is an eager (a.k.a. strict) language, however, so all arguments to a function are fully evaluated before evaluating the function body. So, won’t both recursive calls be evaluated before the (&&)
function is called?
To answer this question, we could look for guidance in the language documentation and then, to answer any lingering questions, we could either dive into the language implementation or we could devise some hypotheses and make queries to test them. Let’s do the latter.
To observe when an expression really does get evaluated, let’s define the following function:
> false () = let _ = Debug.log "thunk evaluated " () in False
<function> : () -> Bool
Evaluating the false
function will print a message and return the Bool
ean False
:
> false ()
thunk evaluated : ()
False
: Bool
We have “delayed” the evaluation of the expression by wrapping it in a function that takes the dummy argument ()
. Functions like are called “thunks” and turn out to be extremely useful for programming in (and compiling) functional languages. We will put thunks to good use later in the course.
Now let’s write some tests to exercise the behavior of (&&)
. As a first sanity check, let’s make sure that evaluate false
twice forces it to evaluate twice (akin to our two, in-sequence recursive calls to isFull
above):
> let b1 = false () in \
| let b2 = false () in \
| b1 && b2
thunk evaluated : ()
thunk evaluated : ()
False
: Bool
That’s not surprising. We might expect the same effects on…
> false () && false ()
thunk evaluated : ()
False
: Bool
… but this time false
is forced to evaluated only once! What’s going on? Elm is not lazy, but if we wanted to run a sanity test anyway just to make sure:
> let _ = false () in 0
thunk evaluated : ()
0
: number
Okay, how about this?
> and = (&&)
<function> : Bool -> Bool -> Bool
> and (false ()) (false ())
thunk evaluated : ()
thunk evaluated : ()
False
: Bool
Interesting! When (&&)
masquerades as and
, both arguments are evaluated.
What’s going on is that when (&&)
is called syntactically, the expression e1 && e2
gets parsed and translated (essentially) to the expression if e1 then e2 else False
which does capture the essence of short-circuiting. But when (&&)
is “hidden” inside another expression, it is no longer parsed and treated specially; instead, it is an ordinary function call, where the arguments are fully evaluated before evaluating the function itself. (Note: (||)
is treated similarly).
This is not a surprising compromise actually. The short-circuiting semantics is often what one wants when using (&&)
. But to provide the same when it is computed from other expressions would require solving a very hard problem known as higher-order control flow analysis, which aims to determine (rather, approximate) the set of function definitions that may “flow into” a particular call-site expression. Such analysis is an important part of compiler optimizations and other static analysis but would be overkill for this particular use. Instead, if one wants the short-circuiting semantics, it seems reasonable to have to write (&&)
or if
explicitly.
Let’s write one more test to continue to develop our understanding:
> (&&) (false ()) (false ())
thunk evaluated : ()
thunk evaluated : ()
False
: Bool
So it turns out that using (&&)
as a prefix function (instead of an infix operator) does not result in short-circuiting. I would have expected this to be treated the same as the infix case, but it’s no big deal in any case.
For what it’s worth, OCaml treats prefix calls to (&&)
the same way it treats (||)
. If you’re curious and don’t have ocaml
installed, you can try the following out at TryOCaml and watch the Web Console for the logging messages.
# let returnFalse () = print "thunk evaluated\n"; false;;
val returnFalse : unit -> bool = <fun>
# returnFalse () ;;
- : bool = false
# (returnFalse ()) && (returnFalse ()) ;;
- : bool = false
# (&&) (returnFalse ()) (returnFalse ()) ;;
- : bool = false
Okay, so we can short-circuit the second recursive call by simplying using &&
after all:
(foo (depth+1) t1) && (foo (depth+1) t2)
NOTE: The following approach is much better than the original, clunky one described in class (and moved to the bottom of these notes).
We should be able to determine whether a Tree
is full without having to first compute its height and then making a second pass comparing the depth of Empty
nodes against the height. Instead, we need to do is recursively check that every Node
has two children of the same height.
We will write the recursive function
maybeFullTreeHeight : Tree a -> Maybe Int
to return
Nothing
if the input tree is not full; andJust h
if both subtrees are full and have height h-1
.The integer height is needed internally to help determine fullness, but can then be discarded if we only need the boolean answer:
isFullTree : Tree a -> Bool
isFullTree t =
case maybeFullTreeHeight t of
Just _ -> True
Nothing -> False
We start with Empty
trees, which are trivially full:
maybeFullTreeHeight t =
case t of
Empty -> Just 0
For inner nodes, both children must be full (maybeFullTreeHeight
must not evaluate to Nothing
) and of the same height:
Node _ left right ->
case maybeFullTreeHeight left of
Nothing -> Nothing
Just h1 ->
case maybeFullTreeHeight right of
Nothing -> Nothing
Just h2 ->
if h1 == h2
then Just (1 + h1)
else Nothing
Excellent, we have traversed the Tree
in a single pass and checked for fullness.
In the recursive case above, the “error plumbing” code is untidy. One attempted remedy is the following:
case (maybeFullTreeHeight left, maybeFullTreeHeight right) of
(Just h1, Just h2) -> if h1 == h2
then Just (1 + h1)
else Nothing
_ -> Nothing
Perhaps this version is more readable, but it no longer shortcuts the second recursive call when its result is not needed.
Instead, we can factor out a general pattern of “chaining” together operations on Maybe
values:
maybeAndThen : (a -> Maybe b) -> Maybe a -> Maybe b
maybeAndThen f mx =
case mx of
Nothing -> Nothing
Just x -> f x
This abstraction allows us to chain multiple Maybe
computations together using what looks like straight-line code:
Node _ left right ->
maybeFullTreeHeight left |> maybeAndThen (\h1 ->
maybeFullTreeHeight right |> maybeAndThen (\h2 ->
if h1 == h2 then Just (1 + h1) else Nothing
))
A couple more helper functions…
maybeReturn : a -> Maybe a
maybeReturn x = Just x
maybeGuard : Bool -> Maybe ()
maybeGuard b =
if b
then Just ()
else Nothing
… allow us to refactor once more in style:
Node _ left right ->
maybeFullTreeHeight left |> maybeAndThen (\h1 ->
maybeFullTreeHeight right |> maybeAndThen (\h2 ->
maybeGuard (h1 == h2) |> maybeAndThen (\() ->
maybeReturn (1 + h1)
)))
Much better (especially if we chose some shorter names)!
This “and then” pattern for chaining together expressions is rather common and comes with some library support:
Maybe.andThen : (a -> Maybe b) -> Maybe a -> Maybe b
Result.andThen : (a -> Result x b) -> Result x a -> Result x b
Task.andThen : (a -> Task x b) -> Task x a -> Task x b
Json.Decoder.andThen : (a -> Decoder b) -> Decoder a -> Decoder b
...
Remember how I said I wouldn’t say “monad” in this class? Well, if you have not seen them before, you may have just developed a bit of intuition for why… “chains” are useful in programming and why functional programmers think they’re such a big deal.
NOTE: This is the original, much clunkier “three-state” solution.
We should be able to determine whether a Tree
is full without having to first compute its height. What we need to do is check that the depth of each Empty
subtree is the same. This is a good exercise in functional programming: we need to maintain some information across multiple recursive calls, but of course we do not have access to any mutable variables to help.
There are three “states” we need to track as we traverse a tree:
Empty
subtree; andEmpty
subtrees seen so far have the same depth;If we are in the third state after processing the entire tree, then the tree is full.
We can use the type
Maybe (Maybe Int)
to represent these three states, respectively, as Nothing
, Just Nothing
, and Just maxDepth
. (Besides not having to define a custom datatype with three data constructors just for the purpose of this function, another benefit of this encoding is that we sometimes get to type “Just Nothing
”.)
We will define a function called traverse
that will do the heavy lifting, transforming the input state acc
into an output state based on the given tree
at depth depth
:
traverse : Maybe (Maybe Int) -> Int -> Tree a -> Maybe (Maybe Int)
traverse acc depth tree =
...
Once finished, we will define isFull
by calling traverse
and checking the final state:
isFullOnePass : Tree a -> Bool
isFullOnePass t =
case traverse (Just Nothing) 0 t of
Just (Just _) -> True
_ -> False
Okay, let’s start implementing the different cases for traverse
:
traverse acc depth tree =
case (acc, tree) of
Let’s first handle the case where the input acc
is already the “error” state, Nothing
, which we just propagate.
(Nothing, _) -> Nothing
Next are two cases for reaching an Empty
tree. If have not reached a leaf Node
yet (acc
== Nothing
), then we record the current depth
that all paths from the root to Empty
need to have. If we have seen an Empty
subtree already, then we compare the maxDepth
of previous paths to the current depth
.
(Just Nothing, Empty) -> Just (Just depth)
(Just (Just maxDepth), Empty) ->
if depth == maxDepth then acc else Nothing
Lastly is the case for Node
, where we recursively process the children and, if they both are full subtrees, check that their depths n
and m
are equal.
(acc, Node _ t1 t2) ->
let recurse = traverse acc (depth+1) in
case recurse t1 of
Nothing -> Nothing
Just Nothing -> Nothing
Just (Just n) ->
case recurse t2 of
Nothing -> Nothing
Just Nothing -> Nothing
Just (Just m) -> if n == m then Just (Just n) else Nothing
Excellent, we have traversed the Tree
in a single pass and checked for fullness. Now we can factor out a general pattern of “chaining” together operations on Maybe
values.
maybeAndThen
would help get rid of one level of the plumbing code above. To help with the second, we can essentially repeat the process:
maybeMaybeAndThen : (a -> Maybe (Maybe b)) -> Maybe (Maybe a) -> Maybe (Maybe b)
maybeMaybeAndThen f mmx =
case mmx of
Nothing -> Nothing
Just mx -> mx |> maybeAndThen f
All this abstraction then allows us to chain multiple Maybe
computations together using what looks like straight-line code:
(acc, Node _ t1 t2) ->
let recurse = traverse acc (depth+1) in
recurse t1 |> maybeMaybeAndThen (\n ->
recurse t2 |> maybeMaybeAndThen (\m ->
if n == m then Just (Just n) else Nothing
))
A couple more helper functions…
maybeMaybeGuard : Bool -> Maybe (Maybe ())
maybeMaybeGuard b =
if b
then Just (Just ())
else Nothing
maybeMaybeReturn : a -> Maybe (Maybe a)
maybeMaybeReturn x = Just (Just x)
… allow us to refactor once more in style:
(acc, Node _ t1 t2) ->
let recurse = traverse acc (depth+1) in
recurse t1 |> maybeMaybeAndThen (\n ->
recurse t2 |> maybeMaybeAndThen (\m ->
maybeMaybeGuard (n == m) |> maybeMaybeAndThen (\_ ->
maybeMaybeReturn n
)))