Concise Error Absolution in Rust

An approach for dealing with errors early while keeping code elegant

Introduction

Error handling is a critical aspect of software development, often determining the robustness and reliability of our applications. In the Rust ecosystem, where safety and correctness are paramount, choosing the right error handling strategy becomes even more crucial. This blog post delves into the nuances of error handling in Rust, focusing on a specific use case: error handling in a custom window manager (penrose).

Why do we care about error handling?

Error handling is one of the topics I hear about the most in my day-to-day work as a software engineer. The facet of the topic we'll consider here is whether to pass errors up to the calling function or to handle them directly. While there are merits to both, in my particular case (writing a custom window manager using penrose) it seems prudent to handle the error immediately. We almost always want to keep the window manager running at all costs. Linus Torvalds has written, with some "passion", on how any behavior caused by an application should never result in a bug in the kernel. I want to take a similar approach with any part of my environment whose failure might cause other applications to crash, including the window manager.

Given that mission, what we probably don't want to do is use ? or unwrap. Certainly using unwrap, which panics on an error, would have us quickly wanting to use another window manager, or may even have us longing to use Windows 9x. Using ? to pass the error up the call chain delays responsibility as we discussed, but we are ultimately responsible for our window manager. In most cases, we might as well handle the error at the call site to make the code more straightforward. Simple code has many benefits, and one of them is that it is more likely to behave as expected.

The approach I settled on is to try to supply reasonable defaults in case there is either a missing value or an error value, and then to (usually) log the error, if it is something we aren't expecting to happen at all or with much frequency.

[Figure: flowchart illustrating the error handling process in a window manager. The chart flows from top to bottom: Start, Attempt Operation, then a branch into Success and Error paths. The Error path leads to Log Error and then Use Default Value; both paths ultimately converge on Continue.]

Execution flow with a recoverable error
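The flow in the diagram can be sketched in a few lines. This is only an illustration of the idea, not code from my penrose configuration: the function name value_or_default is made up for this example, and it writes to stderr where a real window manager would use its own logger.

```rust
use std::fmt::Debug;

/// Returns the value on success. On failure, reports the error and
/// falls back to the supplied default so execution can continue.
/// (Hypothetical helper, named for illustration only.)
fn value_or_default<T, E: Debug>(result: Result<T, E>, default: T, context: &str) -> T {
    match result {
        // Success path: pass the value straight through.
        Ok(value) => value,
        // Error path: log first, then use the default value.
        Err(err) => {
            eprintln!("{context}: {err:?}");
            default
        }
    }
}
```

Calling value_or_default("wide".parse::<u32>(), 80, "parsing width") would log the parse error and carry on with 80.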

A sequence of fallible operations

Let's take the example where we want to customize a penrose workspace bar. Each workspace has a tag displayed in the workspace bar, typically representing the workspace number (e.g. 1, 2, 3, ...). In penrose, tags usually are the same as the workspace index (starting at index 1). But, they are stored as strings and could be something else. In my particular penrose configuration, I haven't thought of a good reason to try to break this convention, so I'm going to assume they are always integers, but I'm going to do so in a safe way.

  let tag_num = tag_string.parse::<usize>()
    .log_err(&format!("couldn't parse int from {tag_string}"));

The first line of this snippet is mostly self-explanatory, but it is worth noting that parse returns a Result:

 pub fn parse<F: FromStr>(&self) -> Result<F, F::Err> {
     FromStr::from_str(self)
 }

So, log_err must be dealing with this somehow, unless tag_num is just a Result<usize, F::Err> (it isn't); log_err is a custom method I defined in my penrose project:

pub trait LogPenroseError<T, E>
where
    Self: Sized,
{
    fn log_err(self, fstr: &str) -> Option<T>;
}

impl<T, E: Debug> LogPenroseError<T, E> for Result<T, E> {
    fn log_err(self, fstr: &str) -> Option<T> {
        match self {
            Ok(val) => Some(val),
            Err(err) => {
                let msg = &format!("{}: {:?}", fstr, err);
                log_penrose(msg).unwrap_or_else(|er| {
                    eprintln!("Couldn't log error {}\nDue to error {:?}", msg, er)
                });
                None
            }
        }
    }
}

Here we see that log_err can operate on any Result<T, E>, as long as E implements the Debug trait (if I ever come across an E I need to deal with that doesn't, I can define another similar function, maybe log_err_nodbg, and use a slightly less informative error message.)
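For what it's worth, here is a rough sketch of what such a log_err_nodbg could look like. I haven't needed it yet, so treat everything here as hypothetical: the trait name LogPenroseErrorNoDebug, the Opaque error type, and the use of eprintln! in place of log_penrose are all assumptions made for this example.

```rust
/// Hypothetical companion to LogPenroseError for error types
/// that do not implement Debug.
pub trait LogPenroseErrorNoDebug<T> {
    fn log_err_nodbg(self, fstr: &str) -> Option<T>;
}

impl<T, E> LogPenroseErrorNoDebug<T> for Result<T, E> {
    fn log_err_nodbg(self, fstr: &str) -> Option<T> {
        match self {
            Ok(val) => Some(val),
            Err(_) => {
                // We cannot print the error itself, so the message
                // only carries the caller-supplied context.
                eprintln!("{fstr}: error value has no Debug output");
                None
            }
        }
    }
}

/// An error type without a Debug implementation, for demonstration.
pub struct Opaque;
```

A separate trait is used because a second blanket impl of the same trait for Result<T, E> without the Debug bound would overlap with the first one.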

We also note here that log_err is returning an Option<T> and not just T, so we haven't truly absolved the error. We will get to why that is the case soon. First, let's take a look at another possible error case we need to consider immediately after the call to parse:

let ws_ix = tag_num.checked_sub(1)
  .log_err(&format!("In ui_tag: couldn't subtract 1 from {tag_num}"));

In this case, we're checking to make sure our assumption that tags are 1-indexed is true (or at least that we can convert to a 0-indexed value operating under this assumption). This could likely be checked by inspecting the penrose codebase or documentation, but even with that, things could change or anomalies in tag usage might occur - better to be safe than sorry.
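As a small illustration of why checked_sub is the safe choice here: plain subtraction on a usize of 0 would panic in a debug build, while checked_sub reports the problem as None. The wrapper function below is hypothetical and exists only to make the example testable.

```rust
/// Converts a 1-indexed workspace tag number to a 0-indexed position.
/// Returns None for 0 rather than underflowing.
/// (Hypothetical wrapper around usize::checked_sub.)
fn to_zero_indexed(tag_num: usize) -> Option<usize> {
    tag_num.checked_sub(1)
}
```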

In this case, checked_sub directly returns an Option<Self>, so we use a different implementation of log_err:

impl<T> LogPenroseError<T, ()> for Option<T> {
    fn log_err(self, fstr: &str) -> Self {
        match self {
            Some(_) => self,
            None => {
                let msg = &format!("{}: None when Some expected", fstr);
                log_penrose(msg).unwrap_or_else(|er| {
                    eprintln!("Couldn't log error {}\nDue to error {:?}", msg, er)
                });
                self
            }
        }
    }
}

This is similar to before: if the value is present we pass it through, otherwise we log an error. Also, note that in both cases I've elected to call eprintln! with an auxiliary error message in case the logger fails. This could be overkill, but again, why not have it? It would probably make more sense, though, to build this fallback into log_penrose itself.
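One possible shape for a logger with the fallback built in is sketched below. This is an assumption-heavy illustration: log_to stands in for the real log_penrose (whose signature isn't shown in this post), and it takes a generic writer only so the example is self-contained. The name log_with_fallback is also made up.

```rust
use std::io::{self, Write};

// Hypothetical primary logger: writes one line to the given sink.
fn log_to<W: Write>(sink: &mut W, msg: &str) -> io::Result<()> {
    writeln!(sink, "{msg}")
}

/// Logs msg to the sink. If that fails, reports both the message and
/// the logging failure on stderr. Returns true if the primary logger
/// succeeded.
fn log_with_fallback<W: Write>(sink: &mut W, msg: &str) -> bool {
    match log_to(sink, msg) {
        Ok(()) => true,
        Err(er) => {
            eprintln!("Couldn't log error {msg}\nDue to error {er:?}");
            false
        }
    }
}
```

With something like this, both log_err implementations could call a single function and drop their own unwrap_or_else blocks.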

Finally, we need to retrieve the apps running on the specified workspace:

self.ws_apps.get(ws_ix)

This also returns an Option, but we will want to deal with both the Some and the None cases. If it is None, we'll just return the tag string. If it is Some(app_info), we'll use the app_info to create a fun unicode tag that corresponds to the app (e.g. a 🦊 for Firefox).

We now have a sequence of fallible operations that we log in the case of failure. How do we compose these together and actually absolve the error?

We can use Option's and_then to chain together a sequence of operations that each yields another Option, and then match on the result. It would look like this:

 match tag_string
     .parse::<usize>()
     .log_err(&format!("couldn't parse int from {tag_string}"))
     .and_then(|tag_num| {
         tag_num
             .checked_sub(1)
             .log_err(&format!("In ui_tag: couldn't subtract 1 from {tag_num}"))
             .and_then(|ws_ix| self.ws_apps.get(ws_ix))
     }) {
     Some(app_info) => app_info.iconic_tag(tag_string),
     None => tag_string,
 };

If there's an error or any missing value, we absolve the situation by returning the original tag_string, which is what we would display without the new custom behavior.
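To try the chain outside of penrose, the pieces can be assembled into a self-contained sketch. Several parts are stand-ins and should be read as assumptions: log_penrose is simplified to an infallible eprintln!, and AppInfo, Bar, and the body of iconic_tag are hypothetical placeholders for the types in my configuration.

```rust
use std::fmt::Debug;

// Stand-in logger (assumption): the real log_penrose returns a Result.
fn log_penrose(msg: &str) {
    eprintln!("[penrose] {msg}");
}

trait LogPenroseError<T, E> {
    fn log_err(self, fstr: &str) -> Option<T>;
}

impl<T, E: Debug> LogPenroseError<T, E> for Result<T, E> {
    fn log_err(self, fstr: &str) -> Option<T> {
        match self {
            Ok(val) => Some(val),
            Err(err) => {
                log_penrose(&format!("{fstr}: {err:?}"));
                None
            }
        }
    }
}

impl<T> LogPenroseError<T, ()> for Option<T> {
    fn log_err(self, fstr: &str) -> Option<T> {
        if self.is_none() {
            log_penrose(&format!("{fstr}: None when Some expected"));
        }
        self
    }
}

// Hypothetical app record; appends an icon to the tag.
struct AppInfo {
    icon: char,
}

impl AppInfo {
    fn iconic_tag(&self, tag: String) -> String {
        format!("{}{}", tag, self.icon)
    }
}

// Hypothetical holder for the per-workspace app list.
struct Bar {
    ws_apps: Vec<AppInfo>,
}

impl Bar {
    fn ui_tag(&self, tag_string: String) -> String {
        match tag_string
            .parse::<usize>()
            .log_err(&format!("couldn't parse int from {tag_string}"))
            .and_then(|tag_num| {
                tag_num
                    .checked_sub(1)
                    .log_err(&format!("In ui_tag: couldn't subtract 1 from {tag_num}"))
                    .and_then(|ws_ix| self.ws_apps.get(ws_ix))
            }) {
            Some(app_info) => app_info.iconic_tag(tag_string),
            // Any failure above falls back to the plain tag.
            None => tag_string,
        }
    }
}
```

With a Bar holding a single AppInfo, the tag "1" gets an icon, while "2", "0", and "web" all come back unchanged (the last two after logging a message).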

Alternatively, but equivalently, we can use map_or instead of the match, though in this particular case, I'd argue it doesn't buy us much in terms of readability:

tag_string
   .parse::<usize>()
   .log_err(&format!("couldn't parse int from {tag_string}"))
   .and_then(|tag_num| {
      tag_num
          .checked_sub(1)
          .log_err(&format!("In ui_tag: couldn't subtract 1 from {tag_num}"))
          .and_then(|ws_ix| self.ws_apps.get(ws_ix))
   })
   .map_or(tag_string.clone(), |app_info| {
      app_info.iconic_tag(tag_string)
   })

Monads and do_notation

Unlike many folks, I enjoy the appearance of most of the Rust code I write, but I have to admit this could be improved. For one, the way cargo fmt places the and_thens is a bit disordered: they end up at different indentation levels even though conceptually they should be at the same level. Then there's all the extra punctuation. I had actually written the later version first (the monadic-style code; we'll get to that), and rewriting it with and_then while getting all the parentheses and braces in the right place was difficult.

A sequence of and_thens done this way, with an optional map at the end, can be modeled as operations taking place in a monad. The first sentence of the Wikipedia article on monads suffices for our purposes of defining what a monad is:

In functional programming, a monad is a structure that combines program fragments (functions) and wraps their return values in a type with additional computation.

There are countless monad tutorials available, and I won't be suggesting any in particular here - knowing more about them is interesting but not necessary for our purposes.

Some languages like Scala (for comprehensions) and Haskell (do notation) provide syntactic sugar for sequencing monads, since as we saw, chaining and_thens may not be pleasant to many. While Rust itself does not provide syntactic sugar, its comprehensive macro system has made it possible to have syntax that is nearly as good as if it were built into the language. In particular, we'll have a look at the do_notation crate.

As in other functional programming languages, we can use do_notation with structures other than Option (see the crate docs for details).

After adding use do_notation::m; to our imports, we can write:

match m! {
  tag_num <- tag_string.parse::<usize>()
    .log_err(&format!("couldn't parse int from {tag_string}"));
  ws_ix <- tag_num.checked_sub(1)
    .log_err(&format!("In ui_tag: couldn't subtract 1 from {tag_num}"));
  self.ws_apps.get(ws_ix)
} {
  Some(app_info) => app_info.iconic_tag(tag_string),
  None => tag_string,
}

This is equivalent to the previous code, but at least to my eyes, it is much simpler to read.
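To see why the two forms are equivalent, it can help to look at a toy macro that rewrites each "name <- expr;" step into an and_then call. The opt_do! macro below is a simplified sketch written for this post and is not the do_notation crate's implementation; the function icon_for_tag and its parameters are likewise made up, and logging is left out to keep the focus on the expansion.

```rust
// Toy do-notation for Option (hypothetical, for illustration only).
macro_rules! opt_do {
    // Bind step: `name <- expr; rest` becomes `expr.and_then(|name| rest)`.
    ($name:ident <- $e:expr ; $($rest:tt)+) => {
        $e.and_then(move |$name| opt_do!($($rest)+))
    };
    // Final step: a plain expression that yields the resulting Option.
    ($e:expr) => {
        $e
    };
}

/// Looks up the icon for a 1-indexed tag, falling back to the tag itself.
fn icon_for_tag<'a>(tag: &'a str, icons: &[&'a str]) -> &'a str {
    let icon = opt_do! {
        tag_num <- tag.parse::<usize>().ok();
        ws_ix <- tag_num.checked_sub(1);
        icons.get(ws_ix).copied()
    };
    // Any None along the way ends up here.
    icon.unwrap_or(tag)
}
```

The macro expands the three steps into the same nested and_then calls we wrote by hand earlier, which is why the nesting disappears from the source but not from the behavior.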

Show me the code

My dotpenrose repository has the relevant code; here are some permalinks to the versions used here:

log_err associated traits and implementations
workspace bar icon example that is discussed here

Closing thoughts

My understanding and preferences for error handling will likely continue to change; error handling is one of many aspects of programming where there are multiple dimensions to the issue, and choosing the right solution may involve shades of grey and experience with both error handling and the domain. Using traits, custom macros (`m!` from the do_notation crate), and the concepts of monads and error absolution, we can customize our error handling and logging while keeping code readable. I'm sure some improvements could be made to the approach taken here, and if I find any, I'll try to update this post. If you find any, please let me know, and I'll share that here as well.

Acknowledgements

I'd also like to thank Claude.ai for suggesting (and drafting) the execution diagram for recoverable errors 🤖.