Learning monads in Clojure: a warning
Posted on 25 October 2011
I was inspired to learn about monads by Chris Ford recently; his description of encapsulating impurity safely within a pure language had me intrigued immediately. I decided that I wanted to learn about monads in Clojure, a language I am currently diving into.
However, I found learning about monads in Clojure full of fake difficulty (or accidental complexity, if you will). Here I document the issues I found. And the key issue I came across was this:
Learning monads requires reasoning about types
You probably know where I’m going with this. Clojure is dynamically typed. Haskell, the spiritual home of monads, is statically typed. For me, the key to understanding monads was reasoning about types — in particular, drawing a clear distinction between the ordinary type and the type of a monadic expression.
Drawing this distinction helped me reason about the behaviour of the monadic functions. Learning that m-bind must return a monadic expression, not a simple value, taught me a key fact about monads; but before I learned it, I wrote far too many m-bind expressions that did not return monadic expressions.
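To make the mistake concrete, here is a minimal sketch using a hand-rolled sequence monad. The names seq-result and seq-bind are my own for illustration and are not the library's definitions; the assumption is simply that a monadic value is a sequence, and that the function given to bind must return one.

```clojure
;; Hand-rolled sequence monad, for illustration only (hypothetical names).
(defn seq-result
  "Lift an ordinary value into the monadic type: a one-element list."
  [v]
  (list v))

(defn seq-bind
  "Feed each underlying value in mv to f; f must return a sequence."
  [mv f]
  (mapcat f mv))

;; Correct: the function returns a monadic value (a sequence).
(def ok (seq-bind [1 2 3] (fn [x] (seq-result (inc x)))))

;; Incorrect: inc returns a plain number, not a sequence. Nothing stops
;; us from writing this; the mistake only surfaces as a runtime exception
;; when the lazy result is realised.
(def bad-error
  (try
    (doall (seq-bind [1 2 3] inc))
    (catch Exception e (.getMessage e))))
```

The second call is accepted without complaint and fails only at runtime, with a message about sequences rather than about the function having the wrong return type.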
It’s quite possible to reason about types in a dynamically typed language, but it’s made much harder. If your reasoning is faulty, the program will try to carry on regardless, and in Clojure’s case, give an incredibly cryptic error message. This is not an environment that makes learning easy. If I had been learning in Haskell, my failure to understand the distinction between monadic expression and ordinary value would have immediately been set right by the type checker.
But it’s worse than just making learning hard: Clojure’s dynamic typing has led to a pervasive failure of type reasoning.
A key example of this is that Clojure’s implementation of the maybe monad, maybe-m, breaks the monad laws! It does this because it does not properly distinguish between the monadic expression and the underlying type. The law in question is the first monad law, expressed here as a Midje test:
    ;;; given a monad which defines m-bind and m-result,
    ;;; f, an arbitrary function, and
    ;;; val, an arbitrary value
    (fact "The first monad law"
      (m-bind (m-result val) f) => (f val))
The failure of maybe-m to adhere to this law is demonstrated thus:
    ;;; failing midje test
    (fact "maybe-m should adhere to the first monad law"
      (with-monad maybe-m
        (m-bind (m-result nil) not)) => (not nil))
The reason that this law is violated is that the maybe-m monadic expression type is no different from the underlying value type. It is therefore possible to find a value such that (m-result val) is nil, the maybe monad’s value for failure.
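The violation can be reproduced without the library. The sketch below uses stand-in functions, nil-result and nil-bind (hypothetical names), written to mirror the behaviour described above; I am assuming the result function returns its argument unchanged and the bind function short-circuits on nil, rather than quoting the library source.

```clojure
;; Hand-rolled stand-ins (hypothetical names), written to mirror the
;; behaviour described above rather than copied from the library.
(defn nil-result
  "The monadic value is just the underlying value, unchanged."
  [v]
  v)

(defn nil-bind
  "Treat nil as failure and short-circuit; otherwise apply f."
  [mv f]
  (if (nil? mv)
    nil
    (f mv)))

;; Left-hand side of the first law: nil is taken for failure,
;; so `not` is never called.
(def lhs (nil-bind (nil-result nil) not))

;; Right-hand side of the first law: apply the function directly.
(def rhs (not nil))
```

Here lhs is nil and rhs is true, so the two sides of the law disagree for this value.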
The Haskell Maybe monad is not so sloppy:
    > let myNot x = Just (x == Nothing)
    > (return Nothing :: Maybe (Maybe Char)) >>= myNot
    Just True
    > myNot (Nothing :: Maybe (Maybe Char))
    Just True
This is because in Haskell, there is no value foo such that Nothing == return foo; in Clojure, there is such a value: (= nil (m-result nil)).
The repercussions of maybe-m’s violation of the first monad law are relatively minor: when using maybe-m, the value nil has been appropriated and given a new meaning, so if you had any other meaning for it, you’re stuffed.
For example, suppose you wanted to implement a distributed hash table retrieval, where failure could be caused by a network outage. You want behaviour similar to (get {:a 1} :b), where if the value is not in the table you return nil. If you use maybe-m to perform this calculation, you cannot tell the difference between failing to communicate with the DHT and successfully determining that the DHT does not contain anything under the key :b; both will result in the value nil. Worse, if you want to use this value later in the computation, maybe-m will assume a value missing in the DHT to be a failure, and cut your computation short, even if that’s not what you wanted.
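For contrast, here is a sketch of a maybe representation that keeps the monadic type apart from the underlying type, so that nil remains available as an ordinary value. Everything in it is hypothetical: the tagged representation ([:just v] or :nothing), the function names, and dht-get, which merely simulates a lookup with a flag standing in for the state of the network.

```clojure
;; Hypothetical tagged representation: a monadic value is either
;; [:just v] (success, where v may itself be nil) or :nothing (failure).
(defn just-result
  "Wrap an underlying value, including nil, as a success."
  [v]
  [:just v])

(defn tagged-bind
  "Short-circuit on :nothing; otherwise unwrap the value and apply f."
  [mv f]
  (if (= mv :nothing)
    :nothing
    (f (second mv))))

;; Stand-in for a DHT lookup; `reachable?` simulates the network state.
(defn dht-get
  [table reachable? k]
  (if reachable?
    (just-result (get table k))
    :nothing))

(def missing-key (dht-get {:a 1} true :b))   ; lookup worked, no entry
(def outage      (dht-get {:a 1} false :b))  ; lookup itself failed

;; A later step in the computation, which is allowed to see nil.
(defn describe
  [v]
  (just-result (if (nil? v) "no entry" (str "found " v))))

(def after-missing (tagged-bind missing-key describe))
(def after-outage  (tagged-bind outage describe))
```

With this representation the missing key yields [:just nil] and the later step still runs, while the outage yields :nothing and stops the computation. The first monad law also holds for nil, since (tagged-bind (just-result nil) describe) equals (describe nil).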
Summary
If you want to learn monads, do it in Haskell.
If you must do it in Clojure, the key is to understand and distinguish the various types in play. The monadic type is distinct from the underlying type. m-result takes an underlying value and gives you an equivalent value in the monadic type. m-bind takes a monadic value and a function from an underlying value to a monadic value, and returns a monadic value.