So despite the fact that Monads are not a good choice as topic for your first Haskell blog entry, I'm going to be a little audacious here and make Monads the topic of my first Haskell blog entry. My previous "X Demystified" entries (React Demystified, LL and LR Parsing Demystified) have been well-received, which is allowing me to believe that I am good at explaining things. I have to apologize to Haskell people, as I'll describe some things by way of analogy with C++ and use some C++-like notation, since it's how I think. Did I mention I've written about 10 lines of Haskell in my life total? Which is fewer than the number of monad tutorials and articles I've studied, which is a lot. Anyway, here goes.
Monads are a Design Pattern
Our first task is to understand what a monad even is. Is a monad a value? Is it a type? Is it a language feature? Is it an I/O library? Most explanations either throw a mathematical construction at you or start showing you examples of monads. But neither of these is helps you classify what a monad even is.
The answer is actually pretty simple. Monads are a design pattern. In fact, it would probably be more clear to talk about monads in terms of "the Monad pattern" in the same way you'd talk about design patterns like Factory or Visitor. Any type you create that follows the monadic pattern and obeys a few key rules around that pattern is a monad.
It's really important to understand that the pattern is what makes a monad a monad. This pattern can be used for a very wide range of things. It can be used to model side-effects and I/O in Haskell, but many monads have nothing to do with side-effects or I/O. This is one of the things that makes monads confusing to learn about. Whenever you are tempted to take something you learned about a monad and generalize it, you can often find a counterexample of a monad for which that thing is not true. There is an entire web page dedicated to listing a bunch of incorrect generalizations about monads. That page explains at length what monads are not. I am going to try and explain what monads are.
The best analogy I can think of is operator overloading in C++. People overload C++ operators to do everything from matrix multiplication to string concatenation to constructing a parser. Saying "monads are about side effects" would be like saying "operator overloading is about matrix multiplication," which isn't true in general: lots of overloaded operators have nothing to do with matrix multiplication.
Monads, like overloaded operators, also have some identities which you should follow. With overloaded operators you should make sure that
x = x + yhas the same effect as
x += y, regardless of what the
+operator actually does. The monad pattern has a similar set of rules that need to be followed for things to make sense and compose correctly.
The Monad Abstraction: bind ("and then")
Let's continue the analogy with overloading
+=in C++. In C++,
x += ymeans "add y to x", whatever "add" means in your context (numerical addition, matrix addition, string concatenation, etc). So what is the basic semantic of a monad?
The basic abstraction of a monad is "bind", written
x >>= f. Bind means, loosely, "and then f". Basically a monad lets you keep adding functions to a chain of operations. But the monad gets to decide what "and then f" means for that monad. For this reason, monads are often called "programmable semicolons." As a monad author, you get to put your own custom logic between each step of the sequence.
(Note for the pedantic: some people would interrupt me here and argue that Monads are not about ordering/sequencing. I would respectfully disagree. Just because a monad can be implemented to be commutative doesn't take away from the fundamentally sequence-oriented nature of the abstraction. Later functions in the sequence have access to more value bindings than earlier functions. So from a pure data dependency perspective, a sequence is established in the abstraction, even if the functions do not literally need to be evaluated in sequence.)
How the Monad pattern works
We know now that monad is a pattern, and that the primary abstraction is bind ("and then"). Now we will look at how the pattern actually works.
The pattern is a bit tricky to wrap your head around, but I find it easiest to look at visually:
Starting from the top, we initially have a "monadic value." A monadic value is an instance of the monad's type, so it basically gets to contain whatever state the monad author wanted to put in it. I didn't mention how you get the first monadic value, because that can vary based on the monad and how you're using it. Right now I'm focusing on characteristics that apply to all monads.
The monad's user calls
bind()and passes two arguments: the monadic value and a function that will return a monadic value. Note that
bind()is a function that was defined by the monad's author and is specific to that monad. Indeed,
bind()is the most interesting part of the monad! This is where the monad's author gets to decide what "and then" means for this monad. In particular, note that
bind()gets to decide how to call the passed-in function. It can call it once, many times, or not at all! All of these options are valid and used by at least one real monad implementation.
bind()needs to return a monadic value. And notice that the passed-in function also returns a monadic value! So one very common option is for
bind()to evaluate the passed-in function and then use its return value as the return value for
bind()doesn't evaluate the function at all, it will need to make something up for the returned monadic value. If it evaluates the function many times, it may merge their returned values into the output monadic value. There are many options for how to implement
There's one other important thing I haven't mentioned yet. The passed-in function takes an argument. The
bind()function gets to decide where this argument comes from and how to obtain it. This is again why a monad is called a "programmable semicolon." The monad gets to make a bunch of really interesting decisions at every stage about how to evaluate each function of the sequence, and where its input data comes from. And it gets to carry along whatever data it wants to inside the monadic value. The monad can use this to propagate whatever extra state is desired.
The type of each function's argument is a type parameter of the previous monadic value. If we annotate the previous diagram with type information, it will look more like this:
The most important monad: IO
Having established a lot of general information about monads, it seems about the right time to dive into a specific monad. And of all monads in Haskell,
IOis the most important, since it is used to model all side effects. You come into contact with
IOimmediately when you write your first Haskell program, because Haskell's
mainfunction is defined as returning the type
When we want to understand a new monad, there are two key questions we want to ask ourselves:
- what does a monadic value represent in this monad?
- what semantic does
bind()("and then") implement in this monad?
IOmonad, a monadic value
IO<a>in C++ notation --
ais a type parameter) represents a side-effectful action yielding a value of type
a, but which has not been performed yet. And
bind()implements the semantic "add this function to the action chain. If/when the function is evaluated, it will know that the operation has been performed and will receive the result (if any) of the operation. The function gets to return an
IOvalue deciding what action should be performed next."
So in Haskell, functions which look like they perform I/O, like
putStrLn, don't actually perform I/O but rather return a value representing the not-yet-performed operation.
-- putStrLn takes a string parameter and returns an unperformed I/O -- action representing printing of that string. The I/O action -- yields nothing (since we are printing, not getting input) or "()". putStrLn :: String -> IO ()The I/O operation is not actually performed unless Haskell determines that the monadic
IOvalue is required for something (remember that Haskell is lazy). The easiest way to do this is to return it from your
mainfunction, since Haskell will by definition perform any I/O returned from
main. So the
IO ()means "a possibly side-effectful action that doesn't return/yield anything." And indeed your
mainfunction should perform some kind of side-effect, otherwise the program is effectively a no-op and should be optimized to nothing! Returning a useful
IO ()action from
mainis how you make your program do something useful.
That is why this program prints "Hello, world!" (since the
IOaction is returned from
main = putStrLn "Hello, world!"But this program does nothing (because we don't return the
main = let action = putStrLn "Hello, world!" in return ()The next example also does nothing, because even though we bind the
IOvalue to an action, the monadic value returned from
bind()operation is never required, because we don't return it from
main = let action = putStrLn "Hello, world!" >>= \x -> return () in return ()What does it look like to have a two stage operation, a read followed by a write? Let's use
getLinewhich has this type:
-- getLine takes no parameters and returns an I/O action yielding a string. getLine :: IO StringWe'll need to use
bind()to sequence two actions together, and to get the value of the first operation for use in the second. Our two-stage pipeline looks like this:
main = getLine >>= \x -> putStrLn ("You typed: " ++ x)This example clearly illustrates using
>>=) to sequence two
IOactions together! This is the fundamental idea of the
IOmonad. Monadic values represent unperformed I/O operations. And
bind()will add a function to the chain, such that it will get the result of the previous I/O operation and select the next operation to be performed. But no operation in the chain will be performed unless it is ultimately part of the I/O action returned by
Chaining multiple operations / "do" notation
Let's take our previous example one step farther and have three
IOoperations in a row, where the third one wants to use the values from both of the first two:
-- Doesn't work -- "x" isn't in scope in the last function. main = getLine >>= (\x -> getLine) >>= (\y -> putStrLn ("You typed: " ++ x ++ " and then " ++ y))This example fails to compile, because
bind()only gives a function direct access to the result of the directly preceding I/O operation. However, this limitation is easy to work around by nesting the functions we're binding:
main = getLine >>= \x -> getLine >>= \y -> putStrLn ("You typed: " ++ x ++ " and then " ++ y)This is a little bit mind-bending, but it works because the function on the right of a
bind()call has the responsibility to return a monadic value, which
bind()itself does. So with this pattern, we're using the pattern in a right-associative way instead of left-associative. In other words, if you'll allow me to briefly regress into Python-like function notation, instead of saying:
bind(bind(getLine, lambda x: getLine), lambda y: putStrLn)We're now saying:
bind(getLine, lambda x: bind(getLine, lambda y: putStrLn))The major advantage of the second form is that later pipeline stages have access to all previous bindings (ie. both
y), not just the directly preceding one.
As a result, the second form is, functionally speaking, pretty much strictly superior to the first. However it has a drawback from a convenience perspective, which is that each stage creates an extra level of function nesting. If you indent this properly, functions with many stages will start flowing off the right edge of your screen. Because this pattern is so popular, Haskell includes a convenience notation for this function nesting:
main = do x <- getLine y <- getLine putStrLn ("You typed: " ++ x ++ " and then " ++ y)Just like our manually written-out version, both the
ybindings are available in later stages of the sequence.
To properly represent this new variation, we should update our previous diagram. The diagram gets a little more crazy with all the nesting, but the key point to notice is that since the second function is nested inside the first, it has access to the first function's argument / variable binding.
The Maybe monad and "return"
Let's look at a different monad to diversify our understanding a bit.
Maybeis probably the simplest monad that is actually useful, but is also different from
IOin some important ways. Maybe doesn't have anything to do with side-effects: it is completely pure.
I said before that the two important questions to ask when encountering a new monad are: (1) what does a monadic value represent, and (2) what semantic does
Maybethe answers to these questions are extremely simple. A monadic value
Maybe arepresents either a value of type
Nothing, the lack of any value:
data Maybe t = Just t | NothingAnd
bind()implements the semantic: "if there is a value, evaluate the function (with that value), otherwise don't evaluate the function at all."
So when you chain a bunch of functions together with the
Maybemonad, it will evaluate all of them in sequence until one of them returns
Nothing. If any function in the sequence returns
Nothingthen the subsequent functions won't be evaluated at all and the chain will return
But how do you get the whole thing started? With the
IOmonad, we were obtaining
IOvalues from I/O library functions like
getLine. In the case of Maybe though, we want to provide our own value to start the process. For this we could construct such a value by using the
getMonadicValue x = Just xBut as a general rule, monads are required to provide a standard function called
returnthat does just this. The job of
return(in any monad) is to construct a monadic value out of a plain value. The implementation details of a monadic value are not public, but the
returnfunction is a public interface for building a trivial monadic value that simply wraps the given value.
It's a little weird that the function is called
returnwhen you can use it to start a monadic pipeline. But it makes more sense when you realize that the function is also useful for returning a monadic value from a function, for example:
add :: Maybe Int -> Maybe Int -> Maybe Int add mx my = do x <- mx y <- my return (x + y)The monadic laws (which I won't get into in this article) guarantee that
returndoesn't do anything too fancy, but just wraps the value and yields it unchanged to the next function in the sequence.
There are many more monads than these two to explore, but this article has gotten quite long already. I hope you have found this article useful in your understanding of monads. If you can take one thing away from this article, remember the two questions to ask yourself when encountering a new monad: (1) what does a monadic value represent, and (2) what semantic does