My Attempt To Explain A Monad

#absurd-abstractions

2023-11-01

I encountered the concept of a Monad for the first time when I started looking into Haskell.
When I learn a language I usually don't go for the tutorials straight away, but instead try to read real source code to test how far my intuition gets me. I noticed the type signatures above the functions relatively quickly, and I saw that the main function had a type of "IO ()".
Drawing from my past experience I assumed that IO with the "()" was like a class with a template argument, as we would see in C++ templates. So to clarify, I looked at an introductory course about Haskell syntax on YouTube. And there it was, in the 4th or 5th video: I finally saw the use of IO (), and the reference to a Monad. Let the confusion begin. I expected a simple explanation like "IO is a class", which is a familiar concept. So I thought maybe a monad is a specific kind of class. But very quickly I noticed that it wasn't that simple. So I did what probably everyone would do: I googled "What is a monad?". And what ended up happening is the content of this article.

First you encounter funny remarks like "a Monad is just a Monoid in the Category of Endofunctors, what's the problem?", trivialising how brutally cryptic the definition really is. Or that "once you understand Monads, you become unable to explain them". This article is also my attempt to prove that last claim wrong. Challenge accepted. I think the problem with explaining what a Monad is, is context. Instead of asking "What is a Monad?" we should rather ask "a Monad with respect to what?". And so I did.

What is a Monad with respect to Category Theory?

Short Answer: A monoid in the category of Endofunctors.

First of all, there is this very cryptic statement about Monoids and Endofunctors. And what in the hell is a Category? So I ended up in articles about Category Theory. This was the first time in a long time that I was absolutely lost. I am not exaggerating when I say that I threw my head against these concepts again and again, over and over. For one week, every day, I read the same few articles from top to bottom. And one day it finally clicked! I understood!

It is a very abstract concept. And it is supposed to be that abstract, because that is exactly what allows it to represent almost anything, like the types used in a program (more on this later). A category is a collection of objects that are described purely in terms of their relations to other objects within the category. These relationships can be understood as transformations and are called arrows or morphisms. Objects can be turned into other objects by following the arrows, like paths in a graph. Example time: there are three objects, A, B and C. There is an arrow A->B and an arrow B->C. By following the arrow from A to B, we can then jump from B to C, effectively creating the composed arrow A->C. Composing arrows like this ends up producing an endless number of connections.
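The composition above can be sketched directly in Haskell, using functions as the arrows of a category whose objects are types (the names here are made up for illustration):

```haskell
-- Two arrows: A -> B and B -> C, here Int -> String and String -> Int.
aToB :: Int -> String
aToB = show

bToC :: String -> Int
bToC = length

-- Following A -> B and then B -> C gives the composed arrow A -> C.
aToC :: Int -> Int
aToC = bToC . aToB

main :: IO ()
main = print (aToC 12345)  -- show gives "12345", length gives 5
```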

A monoid is a succinct way to express an object within a category which can be combined with itself (has an arrow pointed at itself) and returns another instance of itself, with an accumulative effect. Imagine that in addition to the connections already shown there was a connection A->A, in the form of some binary operation which takes two things of A and makes another A. On top of that, the operation has an empty element, called the identity, which when combined with something doesn't change it. Simple example: numbers under the operation + form a monoid. 1 + 1 -> 2, 1 + 0 -> 1. So if you understand addition, you understand monoids. There are of course a ton of laws to prove the "monoidness" of something, but honestly, who cares right now? I just want to get across the essence.

A functor in category theory, on the other hand, is a mapping from one category to another category, like a translator, doing all the effort to ensure that the original relationships don't get lost in translation. An endofunctor is simply a functor that maps the objects and arrows of a category into that same category, preserving their relationships. In the case of programming, that category is the category of all types. Now, categories being so abstract, there is of course also a category whose objects are these mappings themselves: the Category of Endofunctors.
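"Preserving the relationships" has a concrete meaning in Haskell: mapping a composed function gives the same result as composing the mapped functions. A minimal check of that law for the list functor:

```haskell
-- Two small arrows on Int.
f :: Int -> Int
f = (* 2)

g :: Int -> Int
g = (+ 1)

-- The functor law: fmap (f . g) == fmap f . fmap g
preserved :: Bool
preserved = fmap (f . g) [1, 2, 3] == (fmap f . fmap g) [1, 2, 3]

main :: IO ()
main = print preserved  -- both sides are [4, 6, 8], so this prints True
```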

So now we have the ingredients for a Monad. In the Category of Endofunctors, whose objects are the functors that map or wrap types to other types, a Monad is like the monoid of a regular category, only instead of combining plain objects it combines mapped or wrapped objects. It can therefore merge with itself, or accumulate, under a binary operation, and it has an identity: some operation that, when applied, does not change the object. In addition, it supports lifting a normal type inside the context of the functor.

Hence a Monoid in the Category of Endofunctors!

What is a Monad with respect to Haskell?

Short Answer: A type class that supports chaining operations inside of the context of another type by merging the contexts.

Now my question was: "but why?" To me the definition still looked like something extremely specific. Nothing as abstract or semantically versatile as a function or a class. So I looked for applications.

To understand its application, I first needed to get rid of the misconception about classes that I had carried over to type classes in Haskell. Instead of building a relationship on inheritance between a parent and a child class, like in object-oriented languages, a type class behaves more like an interface, determining what kind of functions you can apply to a type. A type class also does not contain any state, like an instance of a class in object-oriented programming does. Instead it gives a value a property or context. Some types, however, only make sense when they contain other types. Let's make an example: the type List, which in Haskell can simply be expressed by the symbols "[ ]". A list is only useful if it contains elements. These elements themselves are not a list, yet for operations that are typical for a list, like getting its length, we mostly don't care what type those elements are. So Haskell has type variables.

listOfSomething :: List a 
-- or
listOfSomething :: [a]

'a' can be anything, so you can do the same operations on a list whether it contains numbers, characters, booleans, or other nested types. However, some operations on lists also require the type to be comparable, to sort a list for example, resulting in this:

sortableList :: Ord a => [a] 

Which in an abstract sense says: hey, you can have whatever type you like, but it needs to be an orderable type. Boom, a type class. It is simply a property of a type. Can the type be ordered? Then it belongs to that type class. Of course you need to implement the operations that support this property as proof of your bold claim, but then you are good to go. So now we know what a type class is, but what is a Monad? We will get there. For now I will just say: it is also a type class. Back to our list. Let's say we want to combine multiple lists:

(<>) :: [a] -> [a] -> [a]

list1 <> list2 <> list3

This works. But let's say we want to build the list up from an empty list, or we want to add an empty list if a certain condition is not met, so as not to change the result.

mempty :: [a]
mempty = []

mempty <> list1 <> list2 <> (if condition then list3 else mempty)

Ok, this is weird. But bear with me. How does Haskell know that mempty is an empty list? Because the list type is an instance of the type class Monoid. This class specifies a function to combine two values of the same type, and what a null value of that type looks like. Think of it like numbers in the context of addition. You can say 3 + 4, or 2 + 1 + 2, but you can also say 1 + 0, where 0 is a value that does not change the other value. This is mempty. A null value. Now the cool thing is, any type that is implemented as a Monoid can be used like addition: it can be incrementally summed up and it can be combined with a null value. Ok, next.
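The addition analogy is not just an analogy: the standard library ships a `Sum` wrapper in Data.Monoid that turns numbers under + into a Monoid, with `<>` meaning + and `mempty` meaning 0. A minimal sketch:

```haskell
import Data.Monoid (Sum (..))

-- Sum 3 <> Sum 4 <> mempty is 3 + 4 + 0 in Monoid clothing.
total :: Sum Int
total = Sum 3 <> Sum 4 <> mempty

main :: IO ()
main = print (getSum total)  -- prints 7
```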

Let's say we want to access the values inside the list without caring about the list. Since the list's behavior is always the same, having to iterate through it every time is tiresome. So we can instead say: "hey, how about I pass a function to the list, which is applied to every entry, and then a new list is returned with each entry modified?" Then we need a mapping function, or more precisely, fmap.

fmap :: (a -> b) -> [a] -> [b]
fmap dosomethingwitha list

Not only is this super useful, especially for more special types, it also comes as a feature of the type class "Functor": wow!

Now imagine there was a way to take a value which is minding its own business inside a context, apply a function to it in terms of that context, and then merge the result back with that context. And we don't care about the wrapping type (List): it can take any function that accepts the unwrapped type (a) and turns it into a value inside the wrapping type.

a -> [b]
-- turning into the function signature
[a] -> (a -> [b]) -> [b]

Let's say we now have a function repeat10, which takes 'a' and repeats it 10 times.
First 'a' was just itself; through this new function it became a list. Now we apply this function again: instead of applying it just to one 'a', it is applied to every 'a' in the list, that is, to 10 'a's.

repeat10 :: a -> [a]
repeat10 x = replicate 10 x

repeat10 a >>= repeat10

Now the result will be 10 lists of 10 'a's, right? That... sucks. But wait, no! It has already been merged (or flattened) into one list of 100 'a's. Cool! I can concentrate on just the operation that I cared about, while the iterating, merging, wrapping and passing the values around was all done for me automatically? What kind of magic is this? Man, it would be so cool if I could do this with other types as well. This effect sounds so powerful and versatile. What? I can? How?!
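A minimal runnable check of that claim, with repeat10 defined via the standard `replicate`:

```haskell
repeat10 :: a -> [a]
repeat10 = replicate 10

-- Each of the 10 'a's produces its own list of 10, and >>= flattens
-- them into a single list of 100 rather than a list of lists.
flattened :: String
flattened = repeat10 'a' >>= repeat10

main :: IO ()
main = print (length flattened)  -- prints 100
```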
Drumroll... Monad! Turns out that this automatic merging, wrapping and linking of functions is exactly what the Monad type class implements. Here is its definition from Hackage, specialized to lists (the real signatures use a general type variable 'm' in place of '[ ]'):

(>>=) :: [a] -> (a -> [b]) -> [b] -- given a list of 'a' apply a function that takes 'a' and returns a list of 'b' and merge it into one big list of 'b'

(>>) :: [a] -> [b] -> [b] -- This one is relevant for when you only care about the context and not the value. More on this further down in the article.

return :: a -> [a] -- Take a "naked" value and wrap it in a cozy type context

What is a Monad with respect to Programming?

Short Answer: A way to express effects as values and make those effects composable

In programming, most of us are probably familiar with a style of coding where we list instructions one after another to describe an algorithm. To add complexity we build loops, branch out with ifs, assign variables and change the values of those variables. When we want to ask the computer how it is feeling, we do so with a function that leaves the domain of our little instructions to interface with the guts of the computer: an IO operation.

Haskell, and functional programming languages in general, work slightly differently. Instead of listing one instruction after another, you evaluate operations, or functions, to values. In functional programming you can't assign variables that are then changed over the course of your program. Instead you take a value, do some operation on it, and return a modified value. To create more complexity, you simply compose multiple function evaluations into a bigger evaluation, turning your entire program into one big value being evaluated. At any point you can reconstruct the state before a modification and the state after it by applying the function. This is called purity. There is nothing that can change a variable outside of the scope of the function, and there is also nothing that can interact with the outside world. At least not that easily.

Now, "why would you want that?" I thought, like probably most of you. Yet instead of being turned off, I was intrigued. I had to get to the bottom of this. It turns out that Haskell has an essential design principle that supports the composition of single-value functions. Everything is tailored to this goal. The function signatures, for example:

someFunction :: Bool -> Int -> (a -> Int) -> Int

But wait, this function takes 3 arguments. Have I just lied to you? Well, no. See the little arrows between the type declarations? Those are similar to the arrows I talked about in the paragraph about Category Theory, representing a transformation from one type to another. This is called currying: splitting up a function of multiple arguments into a chain of one-argument functions, each returning a new function that takes the next single argument. This is done for you implicitly. It allows you to partially apply some of the arguments to a function and get back a function that takes the remaining arguments. The end goal of this is function composition, where you apply the result of one function to the next function as its argument. If done correctly, you can write something called point-free notation.
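Currying and partial application can be shown in a couple of lines (the names here are made up for illustration):

```haskell
-- Thanks to currying, Int -> Int -> Int really means Int -> (Int -> Int):
-- supplying the first argument returns a new one-argument function.
add :: Int -> Int -> Int
add x y = x + y

-- Partial application: fix the first argument, get a function back.
addFive :: Int -> Int
addFive = add 5

main :: IO ()
main = print (addFive 3)  -- prints 8
```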

normalFunction :: Bool -> Int
normalFunction a = if a then 1 else 0

pointFreeWithoutPointFree :: Bool -> Int
pointFreeWithoutPointFree a = stringToInt (boolToString a)

pointFree :: Bool -> Int
pointFree = stringToInt . boolToString

In the first two function definitions we have the normal declaration, where we repeat the function name, pass it an argument variable and reference that variable in the definition to do some explicit operation. Notice, however, that in the 'pointFree' example we don't name any argument, yet we get the same effect. The little "." is the operator for function composition, effectively creating a bigger function out of the two simple ingredient functions, and the variable being passed around is never seen. This is what is meant by point-free notation, even though there clearly is a point in the syntax. Confusing, but true. And as long as we respect the types at each end of the arrows, we can repeat this indefinitely, provided the resulting chain of functions takes one 'Bool' as input and produces one 'Int' as output.
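A runnable version of the sketch above, with the hypothetical helpers filled in with made-up bodies so the composition can actually be evaluated (note that the right-hand function in a "." chain runs first):

```haskell
-- Hypothetical helper: not from any library, just for illustration.
boolToString :: Bool -> String
boolToString b = if b then "yes" else "no"

-- Hypothetical helper: here simply the length of the string.
stringToInt :: String -> Int
stringToInt = length

-- boolToString runs first, then stringToInt, in point-free style.
pointFree :: Bool -> Int
pointFree = stringToInt . boolToString

main :: IO ()
main = print (pointFree True)  -- length "yes" is 3
```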

Another feature of Haskell is lazy evaluation. This again sounds very useless. When I first encountered this term, I was thinking: "Why are you lazy? You should be doing the work for me, otherwise what good are you to me?" But what lazy refers to is how the program executes after compilation. It postpones most operations to the moment their results are actually needed. By releasing control of when to perform lazy operations to the runtime, it gains freedom over the low-level algorithms it can use to optimize your requested higher-level operation. And most of us don't really care how they work anyway; we just call them from the standard library and are happy that it is taken care of. This also applies to side effects: things that interact with other systems outside the scope of your functions, or functions that produce some effect together with the value they return, like logging. And this effect is passed around like a value, from one end of the program to the part that is actually responsible for turning it into action! Wait, do I hear an effect, or context, together with a value? Could that be... a Monad again? Yes.
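Laziness is easiest to see with an infinite list: defining one is fine, because only the part we actually demand ever gets computed. A minimal sketch:

```haskell
-- Conceptually infinite: all even numbers. Nothing is computed yet.
evens :: [Int]
evens = filter even [1 ..]

main :: IO ()
main = print (take 5 evens)  -- prints [2,4,6,8,10]; the rest is never evaluated
```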

We have seen how a Monad behaves as a type class in Haskell. Let's now see how this is actually useful. Since Haskell is lazy and postpones most operations, most IO operations reside on the surface of the program's architecture, while the core logic of the program stays pure. It is like two worlds, the pure world and the impure world. Once you trespass into the impure world, you can't turn pure anymore. If you have sinned once, there is no salvation for you. Mind you, I am agnostic. The functions in which I am allowed to be impure have an agreement: "if we are going to be impure, let's all be impure and support each other, so maybe nobody will notice". Let's stick to my example of the list (you could, however, replace List with any other monadic type that applies some property to its values or gives them a context).

alonglistoflistfunctions :: [b]
alonglistoflistfunctions =
    action1 >>=
        (\x1 -> action2 >>=            -- with anonymous functions
            (\x2 -> mk_action3 x1 x2))

Now this is the chaining of operations in the context of lists. Every function that is part of this chain is of type

a -> [b]

meaning the wrapped value is extracted, the function is applied to it, putting the result inside the monadic context, and then that result is merged with the surrounding monadic context. In the case of the list, instead of applying the function to the list as a whole, the function is applied to every entry in the list, producing a list for every call, and those lists are merged together.
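That per-element behavior is easy to observe directly:

```haskell
-- Each element produces its own small list, and >>= merges them all
-- into one flat list.
perElement :: [Int]
perElement = [1, 2, 3] >>= \x -> [x, x * 10]

main :: IO ()
main = print perElement  -- prints [1,10,2,20,3,30]
```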

And as explained in the paragraph about type classes, the operator ">>=" does all the magic for us: taking out the value, passing it to the next function, merging the contexts, and so on.
Well, it turns out that if we are nice and stick to the agreement of keeping everything inside the context of that type, we can do this:

alonglistoflistfunctions :: [b]
alonglistoflistfunctions = 
	do { x1 <- action1
	   ; x2 <- action2
	   ; mk_action3 x1 x2 }
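A concrete, runnable instance of the same shape in the list monad, with made-up actions (here the "context" means: try every combination):

```haskell
-- Two list "actions" and a final step combining their results.
pairs :: [(Int, Char)]
pairs = do
  x <- [1, 2]      -- x takes each value in turn
  y <- "ab"        -- for each x, y takes each character
  return (x, y)

main :: IO ()
main = print pairs  -- prints [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
```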

This is the do notation, and it allows us to reason in terms of the context provided by our Monad, in this case List. Everything in there is messy and "listy", but we don't care. The hard list stuff is abstracted away and handled for us (it is a monadic context). We just care about the values inside the lists and about the sequence of functions processing them.

Now let's finally get to real side effects: IO operations. Let's take a simple console output. One thing we need to understand is that, like everything in Haskell, side effects are also values, passed around from one function to another until they reach the part of the program responsible for executing them, whenever it feels like it: lazy bum! So if you just have side effects and don't really care about the returned values, we have this fellow from the definition of a Monad:

(>>) :: IO a -> IO b -> IO b -- This one is relevant for when you only care about the context and not the value.

printSomething :: IO ()
printSomething = putStr "Hello" >>
                 putStr " " >>
                 putStr "world!" >>
                 putStr "\n"

This looks familiar to any imperative programmer. A list of statements. But it isn't. Instead it is the composition of multiple "IO ()" values into one big package for the runtime to execute when it feels like it. We ignore the result of each computation; we just care about the side effect, or the context, associated with it. And if we write it in our alternative syntax, it turns into this:

printSomething :: IO ()
printSomething =
	do { putStr "Hello"
	   ; putStr " "
	   ; putStr "world!"
	   ; putStr "\n" }

Essentially mimicking the syntax of an imperative language, without abandoning the elegant world of composition. This chain is infinitely extendable, as long as we promise to only use functions of that type. This turns programming into the definition of little computational spaces, organizing your program by the common language its parts speak. Like a person who has different personas in different contexts: at work, at home, or at university. Maybe he attends a tennis club and has a group of mutual friends he plays board games with. He is the same person, but in different contexts he focuses on different activities and thinks and speaks about different things. I think this is very beautiful! And let me let you in on a little secret.

A type is not just limited to wrapping a value: a function can also be a wrapping type, with the other types it wraps sitting inside it. And since in functional programming you can pass functions to functions, or return functions from functions as values, I'll let your imagination run wild with the crazy things you can do with monoids, functors and monads! ;)
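To hint at what that secret buys you: functions themselves form a monad in their argument type (often called the reader pattern), where every computation in the chain reads the same shared input. A small sketch using only the Prelude:

```haskell
-- In the function monad, f >>= k means \r -> k (f r) r: both sides
-- see the same input r. Here: \r -> (r + 1) * r.
combined :: Int -> Int
combined = (+ 1) >>= (*)

main :: IO ()
main = print (combined 5)  -- (5 + 1) * 5 = 30
```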

Take Away

When I was at university, my professor for Media Theory once said that being able to distinguish something is to uncover its essence. You can describe something in as much detail as possible, but often it is enough to distinguish it from other related concepts. I think the difficulty of explaining a Monad is that it is so abstract, and the definition so often gets mixed up with the examples, that the result is a very blurred description which is hard to distinguish at all. So when trying to understand something very abstract or complicated, first try to put it into context and distinguish it from related concepts.