In a previous blog post, we discussed using an OCaml feature called polymorphic variants for error handling. Here, let’s discuss some more advanced forms of this error-handling approach: returning multiple errors, performing error recovery, as well as handling warnings.
In many programs (for example, in a compiler or an IDE), we want to report all detected errors, instead of quitting on the first error. We also often want to collect warnings that are not critical for the execution of the program.
To demonstrate the problem and the solutions, let’s use an artificial example: a type checker for a small language with types t
and terms e
, which is technically a subset of OCaml:
t → bool
| int
e → true
| false
| 0
| 1
| 2
|
…
| (
e :
t)
| if
e then
e else
e
It’s not a very useful language, but it is enough for us to discuss the interesting error handling cases. We will represent this language with the following OCaml types:
module Type = struct
type t = Bool | Int
end
module Term = struct
type t =
of bool
| Bool of int
| Int of t * Type.t
| Annotation of {
| If
conditional: t;
consequence: t;
alternative: t;
}end
open Term
As a recap, let’s first write the type checker while using exceptions for error handling. We start by defining failwithf
, which raises an exception like failwith
, but allows printf-like formatting:
let failwithf f = Printf.ksprintf failwith f
And now, the type checker itself, in the form of a function called infer
, which infers the type or fails with an exception:
let rec infer = function
| Bool _ -> Type.Bool ❶
| Int _ -> Type.Int
| Annotation (term, annotated_t) -> ❷let term_t = infer term in
if term_t <> annotated_t then
"Expected %s, but got %s"
failwithf
(Type.to_string annotated_t)
(Type.to_string term_t)else
annotated_t
| If {conditional; consequence; alternative} -> ❸let conditional_t = infer conditional in
let consequence_t = infer consequence in
let alternative_t = infer alternative in
if conditional_t <> Type.Bool then
"If condition must be boolean"
failwithf else if consequence_t <> alternative_t then
"If branches must match: %s vs. %s"
failwithf
(Type.to_string consequence_t)
(Type.to_string alternative_t)else
consequence_t
Writing type checkers, I find it helpful to follow a convention of adding an _t
suffix to distinguish between terms and their types, like term
and term_t
above. Some highlights:
if
conditional, we
Now, let’s rewrite the type checker using the result type with polymorphic variants for errors (like we did in the earlier blog post):
let return x = Ok x
let error x = Error x
let (let*) = Result.bind
let rec infer = function
| Bool _ -> return Type.Bool
| Int _ -> return Type.Int
| Annotation (term, annotated_t) ->let* term_t = infer term in
if term_t <> annotated_t then
error (`Expected_x_got_y (annotated_t, term_t))else
return annotated_t
| If {conditional; consequence; alternative} ->let* conditional_t = infer conditional in
let* consequence_t = infer consequence in
let* alternative_t = infer alternative in
if conditional_t <> Type.Bool then
error `If_conditional_must_be_booleanelse if consequence_t <> alternative_t then
error (`If_branches_must_match (consequence_t,
alternative_t))else
return consequence_t
It reads very similarly. We have used Result.bind
for our let*
bindings instead of the regular let
bindings. We also used return
and error
as aliases for Ok
and Error
result constructors.
Instead of raising an exception, the infer
function returns a type (Type.t, _) result
where _
is the inferred polymorphic variant type. Still, our type checker stops at the first error. Let’s change that.
As the first step, let’s change our error
alias to allow for a list of errors to be returned:
let error x = Error [x]
This changes the return type of infer
to (Type.t, _ list) result
.
Now, to actually return multiple errors we need to control which bindings are short-circuiting as before and which bindings can be used to detect and report errors together. For example, we can’t report that the conditional must be boolean before we infer the type of the conditional, but we can report errors happening in both conditional branches (the consequence and the alternative) as they can be checked separately. Fortunately, OCaml has a neat feature to help us with that.
Similarly to how consequent let
bindings are dependent on each other, and
bindings allow to bind independent values. And OCaml allows custom and*
bindings similarly to custom let*
bindings.
The
and*
bindings are syntactic sugar for monoidal products. For more info, see the relevant section of the OCaml Manual.
We can specify that when two bindings are combined using and*
the error lists are appended, using the @
operator below:
let (and*) left right = match left, right with
| Ok left, Ok right -> Ok (left, right)
| Error left, Error right -> Error (left @ right) | Error e, _ | _, Error e -> Error e
Appending lists is not efficient, so a different data structure would be a better fit here, such as a catenation list. But that’s a topic for a different blog post.
Now, for cases where errors can be detected independently we can use the and*
bindings. We need some care to apply them in the right places to catch as many independent errors as possible. As a result, our type checker now looks like this:
let return x = Ok x
let error x = Error [x]
let (let*) = Result.bind
let (and*) left right = match left, right with
| Ok left, Ok right -> Ok (left, right)
| Error left, Error right -> Error (left @ right)
| Error e, _ | _, Error e -> Error e
let rec infer = function
| Bool _ -> return Type.Bool
| Int _ -> return Type.Int
| Annotation (term, annotated_t) ->let* term_t = infer term in
if term_t <> annotated_t then
error (`Expected_x_got_y (annotated_t, term_t))else
return annotated_t
| If {conditional; consequence; alternative} ->let* () =
let* conditional_t = infer conditional in
if conditional_t <> Type.Bool then
error `If_conditional_must_be_booleanelse
return ()and* result_t =
let* consequence_t = infer consequence
and* alternative_t = infer alternative in
if consequence_t <> alternative_t then
error (`If_branches_must_match (consequence_t,
alternative_t))else
return consequence_tin
return result_t
The error handling is a bit more involved than before, but it can capture many possible errors. For example, type checking the following program…
if 1 then 2 else true
…will detect both the non-boolean conditional and the branch mismatch errors:
assert (infer (If {
1;
conditional=Int 2;
consequence=Int true;
alternative=Bool
}) = Error [
`If_conditional_must_be_boolean;
`If_branches_must_match (Type.Int, Type.Bool); ]);
While checking the following snippet…
if (1: bool) then (2: bool) else (true: int)
…our type-checker will capture all three annotation errors (unlike what the OCaml compiler does, for example):
assert (infer (If {
1, Type.Bool);
conditional=Annotation (Int 2, Type.Bool);
consequence=Annotation (Int true, Type.Int);
alternative=Annotation (Bool
}) = Error [
`Expected_x_got_y (Type.Bool, Type.Int);
`Expected_x_got_y (Type.Bool, Type.Int);
`Expected_x_got_y (Type.Int, Type.Bool); ]);
We should include the source code locations with the errors in practice to point to the right place in the source code.
Even though we find and report multiple errors, we have not considered the possibility of error recovery. For example, in case of an annotation we could detect and report a type mismatch, but then recover by assuming that the annotation is correct to continue type checking (and potentially uncover more errors). To achieve that, we need to change our approach slightly.
The result
type can express either a success result or an error (or a list of errors, in our case). To express a result obtained after recovering from an error, we need to change the result
type from a sum type (variant) to a product type (record). For a lack of a better name, let’s call such type an outcome
:
type ('ok, 'error) outcome = {
option;
result: 'ok list;
errors: 'error }
Let’s reimplement return
, error
, (let*)
, and (and*)
for this type:
let return x = {result=Some x; errors=[]}
let error x = {result=None; errors=[x]}
let (let*) body callback = match body with
None; errors} as e -> e
| {result=Some ok; errors=previous_errors} ->
| {result=let {result; errors} = callback ok in
{result; errors=previous_errors @ errors}
let (and*) left right =
let result = match left.result, right.result with
Some left, Some right -> Some (left, right)
| None
| _ -> in
{result; errors=left.errors @ right.errors}
The implementation is very similar as before, including (let*)
, which has some similarities to how Result.bind
is implemented.
With these combinators, our existing type checker works the same way without change. The only difference is in the shape of the return type. So far, this gives us nothing. That is, until we add a new combinator function:
let recoverable_error x = {result=Some (); errors=[x]}
It is a constructor that holds a non-empty result and an error. This allows us to capture the error and recover from it by returning a result.
Now we can change how we handle annotations:
let rec infer = function
| Bool _ -> return Type.Bool
| Int _ -> return Type.Int
| Annotation (term, annotated_t) ->let* term_t = infer term in
let* () =
if term_t <> annotated_t then
recoverable_error (
`Expected_x_got_y (annotated_t, term_t))else
return ()in
return annotated_t
| If {conditional; consequence; alternative} -> …
Here, if the annotated and the inferred types do not match, we report a recoverable error but continue by returning the annotated type.
This affects our previous example:
if (1: bool) then (2: bool) else (true: int)
This will capture an additional error: the fact that the two branches are of a different type.
assert (infer (If {
1, Type.Bool);
conditional=Annotation (Int 2, Type.Bool);
consequence=Annotation (Int true, Type.Int);
alternative=Annotation (Bool
}) = {None;
result=
errors=[
`Expected_x_got_y (Type.Bool, Type.Int);
`Expected_x_got_y (Type.Bool, Type.Int);
`Expected_x_got_y (Type.Int, Type.Bool);
`If_branches_must_match (Type.Bool, Type.Int);
]; });
So, thanks to error recovery, we can capture more errors.
What we have implemented can be described as a composition of an option monad and a writer monad. Similar results can be achieved using a monad transformer library.
Since OCaml is not a purely functional language, we can just print warning messages to the stderr
channel, as they happen, without anything special. For that, we can use eprintf
from the Printf
module, or we can use a logging library. We can also append warnings to a mutable collection to deal with them later.
To handle warnings, a logical extension of our purely-functional approach would be to add a warnings
record field to the outcome type:
type ('ok, 'error, 'warning) t = {
option;
result: 'ok list;
errors: 'error list;
warnings: 'warning }
And a combinator that introduces warnings:
let warn w = {result=Some (); errors=[]; warnings=[w]}
This way, we can, for example, issue a warning in case an if
conditional is a constant:
let* () =
let* conditional_t = infer conditional in
if conditional_t <> Type.Bool then
error `If_conditional_must_be_booleanelse
match conditional with
| Bool value ->
warn (`Conditional_always value)
| _ -> return ()
Now, let’s go and write some friendly programs that help presenting a comprehensive view of errors and warnings to the user, instead of bailing out at first sight of a problem. ☰
Subscribe to receive an occasional email from me about compilers, functional programming, or my book Compiling to Assembly from Scratch.
Unsubscribe at any time.
@misc{Keleshev:2021-1,
title="Advanced Error Handling in OCaml",
author="Vladimir Keleshev",
year=2021,
howpublished=
"\url{https://keleshev.com/advanced-error-handling-in-ocaml}",
}