The problem with closed Error types

I was always wondering how to deal with Errors in Rust in a better way. Rust seems to encourage one-module-one-Error style, and you have to compose Errors before writing any code.

Let's face it: this is good, but not very good. What can a programmer tell from the following signature?

fn transfer(alice: Account, bob: Account) -> Result<(), BankError>;

So... This function could return an error, but we don't know what could be returned. Let's take a look at the definition of BankError.

enum BankError {
    AccountNotFound,
    DatacenterOffline,
    ClerkNotOnDuty,
    ServerOnFire,
    ...
}

So, which of them could be returned? Without looking at (read, digging into) the source code, one cannot tell with confidence.

Also, for quick'n'dirty jobs, Rustaceans are forced to take a hard way to compose an Error type, or try hard to make use of an existing one... Why can't we just compose freely what we want?

Data types à la carte

All of the problems boil down to the problem of the extensibily of the Error type. This turns out to be a long standing problem, called the Expression Problem.

How can you add variants to a data type, without recompiling other parts of the program?

Functional Pearls gave an awesome answer, presented in the paper "Data types à la carte". Let's see how it works.

data Expr f = In (f (Expr f))

data Val e = Val Int
data Add e = Add e e

Wait. What does that mean? I don't know yet... Expr looks like a random fixed point, while Val and Add are supposed to be the variants of Expr.

Val and Add turn out (unsurprisingly) to be Functors.

instance Functor Val where
  fmap _ (Val x) = Val x

instance Functor Add where
  fmap f (Add e1 e2) = Add (f e1) (f e2)

So the type parameter e reflects the substructure. To compose them, we need more facility:

infixr 6 :+:
data (f :+: g) e = Inl (f e) | Inr (g e)

instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap f (Inl x) = Inl $ fmap f x
  fmap f (Inr y) = Inr $ fmap f y

Since Expr is a nested structure, we need some way to "walk" it.

foldExpr :: Functor f => (f a -> a) -> Expr f -> a
foldExpr f (In t) = f (fmap (foldExpr f) t)

f is an F-algebra. f a -> a specifies one step of recursion. To interpret Val and Add,

instance Eval Val where
  evalAlgebra (Val x) = x

instance Eval Add where
  evalAlgebra (Add x y) = x + y

instance (Eval f, Eval g) => Eval (f :+: g) where
  evalAlgebra (Inl x) = evalAlgebra x
  evalAlgebra (Inr y) = evalAlgebra y
  
eval :: Eval f => Expr f -> Int
eval expr = foldExpr evalAlgebra expr

We can't yet make everything go. In the definition of Expr, the structure looks like f (Expr f), so the inner type and outer type must match. We need some way to promote it. Take inspiration from the From trait of Rust.

class (Functor sub, Functor sup) => sub :<: sup where
  inj :: sub a -> sup a

instance Functor f => f :<: f where
  inj = id
  
instance (Functor f, Functor g) => f :<: (f :+: g) where
  inj = Inl
  
instance (Functor f, Functor g, Functor h, f :<: g) => f :<: (h :+: g) where
  inj = Inr . inj
  
inject :: (g :<: f) => g (Expr f) -> Expr f
inject = In . inj

Now everything should work, but using such types is quite painful, let's make some smart constructors.

val :: (Val :<: f) => Int -> Expr f
val x = inject (Val x)

infixl 6 ⊕
(⊕) :: (Add :<: f) => Expr f -> Expr f -> Expr f
x ⊕ y = inject (Add x y)

Let's try:

eval (val 1 ⊕ val 2)

Error types à la carte

As long as you have followed along, you should have noticed that, we can compose the variants pretty easily:

Expr Val
Expr Add
Expr (Val :+: Add)

Isn't this what we wanted? Let's apply this idea to error types!

foldError :: Functor f => (f a -> a) -> Error f -> a
foldError f (In t) = f (fmap (foldError f) t)

injectError :: (g :<: f) => Error g -> Error f
injectError = foldError inject

class Functor f => Report f where
  reportAlgebra :: Report g => f (Error g) -> String
instance (Report f, Report g) => Report (f :+: g) where
  reportAlgebra (Inl x) = reportAlgebra x
  reportAlgebra (Inr y) = reportAlgebra y

report :: Report f => Error f -> String
report (In t) = reportAlgebra t

data Success f = Success deriving (Show, Functor)
data Result a f = Result a deriving (Show, Functor)
data BadCode f = BadCode Int deriving (Show, Functor)
data UserError f = UserError String deriving (Show, Functor)

instance Report Success where
  reportAlgebra Success = "Success"
instance Report BadCode where
  reportAlgebra (BadCode x) = "Bad code (" ++ (show x) ++ ")"
instance Report UserError where
  reportAlgebra (UserError msg) = "User error: " ++ msg
instance Show a => Report (Result a) where
  reportAlgebra (Result x) = "Result: " ++ show x

success :: (Success :<: f) => Error f
success = inject Success

badCode :: (BadCode :<: f) => Int -> Error f
badCode code = inject (BadCode code)

userError :: (UserError :<: f) => String -> Error f
userError msg = inject (UserError msg)

result :: (Result a :<: f) => a -> Error f
result r = inject (Result r)

liftEither :: (g :<: f) => Either (Error g) a -> Either (Error f) a
liftEither (Left x) = Left (injectError x)
liftEither (Right y) = Right y

mapLeft :: (a -> c) -> Either a b -> Either c b
mapLeft f (Left a) = Left (f a)
mapLeft f (Right a) = Right a

from :: (g :<: f) => (a -> Either (Error g) b) -> a -> Either (Error f) b
from f x = liftEither (f x)

checkInt :: Int -> Either (Error BadCode) Int
checkInt x = if x < 0 then Left (badCode x) else Right x

checkString :: String -> Either (Error UserError) Int
checkString s = if length s > 10 then Left (userError "string too long") else Right (length s)

job :: Int -> String -> Either (Error (BadCode :+: UserError)) Int
job x y = do
  x <- from checkInt x
  y <- from checkString y
  return (x+y)

There is still a small annoyance, that one has to convert from a lower error type to a higher one, but overall this looks pretty neat.

In retrospect

Expr is a fixed point operator, and Expr Val is the fixed point of Val. The type of a value like Val 9 is actually Val e where e is free. By wrapping it with Expr, the e now disappears, and the type of the whole expression is now closed. foldXX is actually catamorphism: The fixed point operator gives us an initial F-algebra $(\mu F, In_F)$, and for any F-algebra $(X,\phi)$, there exists a catamorphism from $\mu F$ to $X$. The definition of foldExpr seemed like magic out of air, but it's so obvious, looking at the following commutative diagram.

$$\require{amscd} \begin{CD} F (\mu F) @>{In_F}>> \mu F\ @V{F (\text{cata}~\phi)}VV @VV{\text{cata}~\phi}V\ FX @>>{\phi}> X \end{CD}$$

Catamorphism is the generalized fold (that's why we call it foldXXX here), which is used to destroy the structure: what we have is a $\mu F$ (like, Expr (Val :+: Add)), and we want a final result (Int), then by finding an algebra (Val :+: Add) Int -> Int, we can get for free a catamorphism Expr (Val :+: Add) -> Int. By writing a few typeclasses, we can compose types easily, and they gradually build up the algebra we want.

How did the authors find this solution? I should say I don't know, but my guess is that at first they separated the cases out and found a way to compose them, then their knowledge of recursion schemes (F-algebra) helped.

In the original "Data types à la carte" paper, free Monads can also be composed. Read it if you are interested.

Unfortunately, error handling has more to do with software engineering than stupid type checking. I might not use this approach in the future. Just do what others do!