How to Make a Thing in Haskell, Part 6: Maybe

As mentioned previously, I needed and wrote a function to find every suffix of a string that starts with a particular character. Next, I need another function to find the length of the first match of a query in a string. Here’s an initial implementation:

matchLength :: Eq a => [a] -> [a] -> Integer
matchLength [] _ = 0
matchLength query@(q:qs) (c:cs) = 1 + matchLength rq cs
    where rq = if q == c then qs else query

It recursively eats through the choice, incrementing an accumulator and consuming the query if the first character of the query and choice match. When the query is consumed, there’s a 0-length match and the accumulator can be evaluated1.

It works for a couple simple tests, but the compiler warns:

In an equation for ‘matchLength’: Patterns not matched: (_ : _) []

That is to say, this function doesn’t account for when the choice is empty, but the query isn’t. This corresponds to when the accumulator consumes the choice without matching the entire query. 0 isn’t a reasonable return value here – that already has a meaning in this context. We want to represent something like “this function has no meaning given these arguments”.

Representing Failure

In other languages, you might throw an exception. If you want to represent failure but continue computation you might return a meaningless value like nil, None, null, or -1 as a sigil of sorts. Of course, doing so (at least with null) is perpetuating the billion-dollar mistake. You’d think that such a principled language as Haskell would have a way around this2.

And of course it does! The built-in construct for representing “this function is undefined for those inputs” is a option type called Maybe3. It has two possible values: a parameterized value called Just and a failure-representing value called Nothing. So, for my purposes, matchLength can return Nothing when it’s meaningless to ask for the length of a match, and return Just x when the result is x:

matchLength :: Eq a => [a] -> [a] -> Maybe Integer
matchLength [] _ = Just 0
matchLength query@(q:qs) (c:cs) = (+1) <$> matchLength rq cs
    where rq = if q == c then qs else query
matchLength _ [] = Nothing

Here, I’ve used the <$> operator from Control.Applicative. I won’t talk about it in too much detail, because I don’t fully understand it in the abstract yet4. (Also because I promised myself to not write another damn monad tutorial.) But, in this function, <$> takes a function f and a Maybe. If the Maybe is Nothing, it returns Nothing, and if it’s Just x, it returns Just (f x). So, in our case, if the inner call fails and returns Nothing, we get (+1) <$> Nothing -> Nothing, and when it bottoms out to Just 0, we get (+1) <$> Just 0 -> Just 1.

This gives us some very nice semantics for this function. Nothing means “this choice does not match the given query”, while Just x means “there’s a match! Also the match has length x”.

One nice effect of this type-enforced representation of failed computation is that the compiler warns you when you use the function but don’t account for failure. It’s easy to show you’ve failed when you write the function in ghci:

λ let matchLengthAsInteger (Just x) = x

<interactive>:4:5: Warning:
    Pattern match(es) are non-exhaustive
    In an equation for ‘matchLengthAsInt’:
        Patterns not matched: Nothing

Instead of having to chase mysterious null pointer exceptions at runtime, you get yelled at at compile-time for not properly implementing against this function’s failure interface.

Above the Fold

I spent some time trying to figure out how to reimplement this with a fold operation – a reduction operation that accumulates over the elements of the list. In my experience on exercism, I’ve gotten the impression that Haskell programmers try to avoid explicit recursion, even for tail-recursive functions like this, so I try to use folds to make my code more idiomatic. However, I wasn’t able to figure out how to represent both failure and success in a fold. While a fold might be more efficient in some cases, I like the fidelity of this representation of matchLength, and I’ll be sticking with it for now. If anyone else has some insight that could help me rewrite this as a fold, let me know!

  1. Note that it finds the length of the first full match, including any non-matching characters before the match. Thus, it only works for our purposes if called with a query and choice starting with the same character.

  2. Though, I was unpleasantly surprised that some parts of the Haskell standard library are partial functions – for example, head and minimum both error out given the empty list as an argument (link via Stephen Diehl’s amazing post on what he wishes he’d known when learning Haskell).

  3. Rust has an option type too. This kind of typesafe way to represent failure states is one of the reasons I think Rust is so cool and exciting.

  4. It’s just an infix implementation of fmap over applicative functors; what’s the problem?