Monads Demystified
With this entry, I am going where many bloggers have gone before. Writing a monad tutorial that makes sense to you (as the writer) but nobody else seems to be a rite of passage. Silly monad tutorial are so overdone that there’s even a running joke about how Monads Are Like Burritos.
So despite the fact that Monads are not a good choice as topic for your first Haskell blog entry, I’m going to be a little audacious here and make Monads the topic of my first Haskell blog entry. My previous “X Demystified” entries (React Demystified, LL and LR Parsing Demystified) have been well-received, which is allowing me to believe that I am good at explaining things. I have to apologize to Haskell people, as I’ll describe some things by way of analogy with C++ and use some C++-like notation, since it’s how I think. Did I mention I’ve written about 10 lines of Haskell in my life total? Which is fewer than the number of monad tutorials and articles I’ve studied, which is a lot. Anyway, here goes.
Monads are a Design Pattern
Our first task is to understand what a monad even is. Is a monad a value? Is it a type? Is it a language feature? Is it an I/O library? Most explanations either throw a mathematical construction at you or start showing you examples of monads. But neither of these is helps you classify what a monad even is.
The answer is actually pretty simple. Monads are a design pattern. In fact, it would probably be more clear to talk about monads in terms of “the Monad pattern” in the same way you’d talk about design patterns like Factory or Visitor. Any type you create that follows the monadic pattern and obeys a few key rules around that pattern is a monad.
It’s really important to understand that the pattern is what makes a monad a monad. This pattern can be used for a very wide range of things. It can be used to model side-effects and I/O in Haskell, but many monads have nothing to do with side-effects or I/O. This is one of the things that makes monads confusing to learn about. Whenever you are tempted to take something you learned about a monad and generalize it, you can often find a counterexample of a monad for which that thing is not true. There is an entire web page dedicated to listing a bunch of incorrect generalizations about monads. That page explains at length what monads are not. I am going to try and explain what monads are.
The best analogy I can think of is operator overloading in C++. People overload C++ operators to do everything from matrix multiplication to string concatenation to constructing a parser. Saying “monads are about side effects” would be like saying “operator overloading is about matrix multiplication,” which isn’t true in general: lots of overloaded operators have nothing to do with matrix multiplication.
Monads, like overloaded operators, also have some identities which you
should follow. With overloaded operators you should make sure that x
= x + y
has the same effect as x += y
, regardless of what the +
operator actually does. The monad pattern has a similar set of rules
that need to be followed for things to make sense and compose
correctly.
The Monad Abstraction: bind (“and then”)
Let’s continue the analogy with overloading +=
in C++. In C++, x
+= y
means “add y to x”, whatever “add” means in your context
(numerical addition, matrix addition, string concatenation, etc). So
what is the basic semantic of a monad?
The basic abstraction of a monad is “bind”, written x >>= f
. Bind
means, loosely, “and then f”. Basically a monad lets you keep adding
functions to a chain of operations. But the monad gets to decide what
“and then f” means for that monad. For this reason, monads are often
called “programmable semicolons.” As a monad author, you get to put
your own custom logic between each step of the sequence.
(Note for the pedantic: some people would interrupt me here and argue that Monads are not about ordering/sequencing. I would respectfully disagree. Just because a monad can be implemented to be commutative doesn’t take away from the fundamentally sequence-oriented nature of the abstraction. Later functions in the sequence have access to more value bindings than earlier functions. So from a pure data dependency perspective, a sequence is established in the abstraction, even if the functions do not literally need to be evaluated in sequence.)
How the Monad pattern works
We know now that monad is a pattern, and that the primary abstraction is bind (“and then”). Now we will look at how the pattern actually works.
The pattern is a bit tricky to wrap your head around, but I find it easiest to look at visually:
Starting from the top, we initially have a “monadic value.” A monadic value is an instance of the monad’s type, so it basically gets to contain whatever state the monad author wanted to put in it. I didn’t mention how you get the first monadic value, because that can vary based on the monad and how you’re using it. Right now I’m focusing on characteristics that apply to all monads.
The monad’s user calls bind()
and passes two arguments: the monadic
value and a function that will return a monadic value. Note that
bind()
is a function that was defined by the monad’s author and is
specific to that monad. Indeed, bind()
is the most interesting part
of the monad! This is where the monad’s author gets to decide what
“and then” means for this monad. In particular, note that bind()
gets to decide how to call the passed-in function. It can call it
once, many times, or not at all! All of these options are valid and
used by at least one real monad implementation.
bind()
needs to return a monadic value. And notice that the
passed-in function also returns a monadic value! So one very common
option is for bind()
to evaluate the passed-in function and then use
its return value as the return value for bind()
. If bind()
doesn’t evaluate the function at all, it will need to make something
up for the returned monadic value. If it evaluates the function many
times, it may merge their returned values into the output monadic
value. There are many options for how to implement bind()
.
There’s one other important thing I haven’t mentioned yet. The
passed-in function takes an argument. The bind()
function gets to
decide where this argument comes from and how to obtain it. This is
again why a monad is called a “programmable semicolon.” The monad
gets to make a bunch of really interesting decisions at every stage
about how to evaluate each function of the sequence, and where its
input data comes from. And it gets to carry along whatever data it
wants to inside the monadic value. The monad can use this to
propagate whatever extra state is desired.
The type of each function’s argument is a type parameter of the previous monadic value. If we annotate the previous diagram with type information, it will look more like this:
The most important monad: IO
Having established a lot of general information about monads, it
seems about the right time to dive into a specific monad. And of all
monads in Haskell, IO
is the most important, since it is used to
model all side effects. You come into contact with IO
immediately
when you write your first Haskell program, because Haskell’s main
function is defined as returning the type IO ()
.
When we want to understand a new monad, there are two key questions we want to ask ourselves:
- what does a monadic value represent in this monad?
- what semantic does
bind()
(“and then”) implement in this monad?
For the IO
monad, a monadic value IO a
(or IO<a>
in C++ notation
– a
is a type parameter) represents a side-effectful action
yielding a value of type a
, but which has not been performed yet.
And bind()
implements the semantic “add this function to the action
chain. If/when the function is evaluated, it will know that the
operation has been performed and will receive the result (if any) of
the operation. The function gets to return an IO
value deciding
what action should be performed next.”
So in Haskell, functions which look like they perform I/O, like
putStrLn
, don’t actually perform I/O but rather return a value
representing the not-yet-performed operation.
The I/O operation is not actually performed unless Haskell determines
that the monadic IO
value is required for something (remember that
Haskell is lazy). The easiest way to do this is to return it from
your main
function, since Haskell will by definition perform any I/O
returned from main
. So the main
function’s type IO ()
means “a
possibly side-effectful action that doesn’t return/yield anything.”
And indeed your main
function should perform some kind of
side-effect, otherwise the program is effectively a no-op and should
be optimized to nothing! Returning a useful IO ()
action from
main
is how you make your program do something useful.
That is why this program prints “Hello, world!” (since the IO
action
is returned from main
):
But this program does nothing (because we don’t return the IO
action
from main
):
The next example also does nothing, because even though we bind the
IO
value to an action, the monadic value returned from bind()
operation is never required, because we don’t return it from main
:
What does it look like to have a two stage operation, a read followed
by a write? Let’s use getLine
which has this type:
We’ll need to use bind()
to sequence two actions together, and to
get the value of the first operation for use in the second. Our
two-stage pipeline looks like this:
This example clearly illustrates using bind()
(ie. >>=
) to
sequence two IO
actions together! This is the fundamental idea of
the IO
monad. Monadic values represent unperformed I/O operations.
And bind()
will add a function to the chain, such that it will get
the result of the previous I/O operation and select the next operation
to be performed. But no operation in the chain will be performed
unless it is ultimately part of the I/O action returned by main
.
Chaining multiple operations / “do” notation
Let’s take our previous example one step farther and have three IO
operations in a row, where the third one wants to use the values from
both of the first two:
This example fails to compile, because bind()
only gives a function
direct access to the result of the directly preceding I/O operation.
However, this limitation is easy to work around by nesting the
functions we’re binding:
This is a little bit mind-bending, but it works because the function
on the right of a bind()
call has the responsibility to return a
monadic value, which bind()
itself does. So with this pattern,
we’re using the pattern in a right-associative way instead of
left-associative. In other words, if you’ll allow me to briefly
regress into Python-like function notation, instead of saying:
We’re now saying:
The major advantage of the second form is that later pipeline stages
have access to all previous bindings (ie. both x
and y
), not
just the directly preceding one.
As a result, the second form is, functionally speaking, pretty much strictly superior to the first. However it has a drawback from a convenience perspective, which is that each stage creates an extra level of function nesting. If you indent this properly, functions with many stages will start flowing off the right edge of your screen. Because this pattern is so popular, Haskell includes a convenience notation for this function nesting:
Just like our manually written-out version, both the x
and y
bindings are available in later stages of the sequence.
To properly represent this new variation, we should update our previous diagram. The diagram gets a little more crazy with all the nesting, but the key point to notice is that since the second function is nested inside the first, it has access to the first function’s argument / variable binding.
The Maybe monad and “return”
Let’s look at a different monad to diversify our understanding a bit.
Maybe
is probably the simplest monad that is actually useful, but is
also different from IO
in some important ways. Maybe doesn’t have
anything to do with side-effects: it is completely pure.
I said before that the two important questions to ask when
encountering a new monad are: (1) what does a monadic value represent,
and (2) what semantic does bind()
implement?
For Maybe
the answers to these questions are extremely simple. A
monadic value Maybe a
represents either a value of type a
or
Nothing
, the lack of any value:
And bind()
implements the semantic: “if there is a value, evaluate
the function (with that value), otherwise don’t evaluate the function
at all.”
So when you chain a bunch of functions together with the Maybe
monad, it will evaluate all of them in sequence until one of them
returns Nothing
. If any function in the sequence returns Nothing
then the subsequent functions won’t be evaluated at all and the chain
will return Nothing
.
But how do you get the whole thing started? With the IO
monad, we
were obtaining IO
values from I/O library functions like getLine
.
In the case of Maybe though, we want to provide our own value to start
the process. For this we could construct such a value by using the
Just
constructor directly:
But as a general rule, monads are required to provide a standard
function called return
that does just this. The job of return
(in
any monad) is to construct a monadic value out of a plain value. The
implementation details of a monadic value are not public, but the
return
function is a public interface for building a trivial monadic
value that simply wraps the given value.
It’s a little weird that the function is called return
when you can
use it to start a monadic pipeline. But it makes more sense when you
realize that the function is also useful for returning a monadic value
from a function, for example:
The monadic
laws (which I won’t get into in this article) guarantee that
return
doesn’t do anything too fancy, but just wraps the value and
yields it unchanged to the next function in the sequence.
Conclusion
There are many more monads than these two to explore, but this article
has gotten quite long already. I hope you have found this article
useful in your understanding of monads. If you can take one thing
away from this article, remember the two questions to ask yourself
when encountering a new monad: (1) what does a monadic value
represent, and (2) what semantic does bind()
implement?