Cats-Effect IO: take it seriously before you use it

I have seen, many times, misuse of Cats Effect IO cause P1 incidents in production. Here, I’ll briefly discuss two of the most common mistakes.

Missing IO.blocking

If we read the Cats Effect documentation, it clearly warns us about blocking I/O and asks us to wrap it in IO.blocking instead of IO.apply. However, we often ignore this advice or don’t take it seriously, mostly because we’re not aware that the action we’re wrapping in IO is actually blocking.

If we make a mistake here, we will see weird behavior in production. For example, suddenly, when we have heavy I/O operations like file uploads, other functionality in the system starts timing out. This happens simply because we are performing a blocking action on the compute thread pool, which has only a limited number of threads. When there are no threads available to serve requests, they must wait — and most of the time, the requests hit their timeout before a thread becomes free.

To stay safe, we should think twice before wrapping everything in an IO. A simple rule is: if it involves Input/Output, or if it involves a network call, assume it is blocking unless proven otherwise. This includes database calls over JDBC, API calls, file operations, InputStream/OutputStream, etc. These should be wrapped in IO.blocking to ensure they run on a separate, cached, unbounded thread pool.
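As a minimal sketch of the rule (the helper and path are hypothetical, just for illustration), a blocking file read wrapped properly looks like this:

```scala
import cats.effect.IO
import scala.io.Source

// Source.fromFile performs blocking file I/O, so we run it via
// IO.blocking on the dedicated (cached, unbounded) blocking pool
// instead of the bounded compute pool.
def readFile(path: String): IO[String] =
  IO.blocking {
    val src = Source.fromFile(path)
    try src.mkString
    finally src.close()
  }
```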

Why is this important? Take JDBC as an example. JDBC drivers are blocking — they don’t use modern async techniques. When you run a query, the request goes to the database, and the driver has no choice but to block the calling thread until the result comes back. Newer OS facilities (like epoll or io_uring) enable non-blocking I/O, but as long as JDBC doesn’t use them, you must wrap database calls in IO.blocking.
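A sketch of what that looks like in practice — the URL, credentials, and query below are placeholders, not a real database:

```scala
import java.sql.DriverManager
import cats.effect.{IO, Resource}

// Every JDBC call (connect, execute, read) blocks its thread,
// so each one is wrapped in IO.blocking. Resource guarantees the
// connection is closed even if the query fails.
def userCount(url: String, user: String, pass: String): IO[Int] =
  Resource
    .fromAutoCloseable(IO.blocking(DriverManager.getConnection(url, user, pass)))
    .use { conn =>
      IO.blocking {
        val rs = conn.createStatement().executeQuery("SELECT COUNT(*) FROM users")
        rs.next()
        rs.getInt(1)
      }
    }
```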

If you use libraries like doobie (a wrapper over JDBC), it already provides a separate blocking thread pool for database actions. In contrast, libraries like skunk (not based on JDBC) support async database interactions out of the box, but only for Postgres.

The same principle applies to external API calls. Libraries like sttp support asynchronous, non-blocking HTTP requests.

Dangling IOs

import cats.effect.{ExitCode, IO, IOApp}

object MainIO extends IOApp {
  override def run(args: List[String]): IO[ExitCode] = {
    IO(println("hello"))
    IO(println(" world")).as(ExitCode.Success)
  }
}

This doesn’t print hello!

This is a common mistake we make when we first start using IO. It’s especially easy to fall into when refactoring legacy code to make it cats-effect IO enabled.

In this case, the first IO doesn’t get executed at all, because it isn’t part of the final IO that runs in the IO run-loop.

To catch mistakes like this, you can enable the Scala compiler option -Wnonunit-statement. If you’re not familiar with compiler options, just use the sbt-tpolecat plugin — it automatically configures a good set of scalac options for you.
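In sbt this is a one-liner. (The plugin coordinates in the comment are from memory; check the sbt-tpolecat README for the current version.)

```scala
// build.sbt: enable the warning directly
scalacOptions += "-Wnonunit-statement"

// or, in project/plugins.sbt, let sbt-tpolecat configure warnings for you:
// addSbtPlugin("org.typelevel" % "sbt-tpolecat" % "<latest version>")
```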

Cats-effect IO is simply a data model that represents our computations, which are only executed at the “end of the world” — when we actually run the program.

Ideally, our code boils down to a single IO that runs in the main method via cats-effect’s IOApp. In legacy systems, we might have multiple IOs run using a Dispatcher. In both cases, we must ensure there are no dangling IOs — every IO must be part of the final structure.

This “chaining” means that each IO should be combined into the final data model of computations. For example:

Two IOs can be connected sequentially using flatMap, or combined concurrently (for example with parTupled). If they are not combined into the final IO, they will never be executed by the run-loop.

So, for example, the code above could be rewritten like this:

import cats.effect.{ExitCode, IO, IOApp}

object MainIO extends IOApp {
  override def run(args: List[String]): IO[ExitCode] =
    IO(print("hello")) *> IO(print(" world")).as(ExitCode.Success)
}

or run them concurrently as

import cats.effect.{ExitCode, IO, IOApp}
import cats.syntax.all._

object MainIO extends IOApp {
  override def run(args: List[String]): IO[ExitCode] =
    (IO(print("hello")), IO(print(" world"))).parTupled.as(ExitCode.Success)
}

The same applies to something like IO(IO(...)): if we don’t flatten it, the inner IO will never be executed. So the code below prints nothing.

import cats.effect.{ExitCode, IO, IOApp}

object MainIO extends IOApp {
  override def run(args: List[String]): IO[ExitCode] =
    IO(IO(println("hello"))).as(ExitCode.Success)
}
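The fix is to flatten the nested IO (or build it with flatMap in the first place) so the inner effect joins the run-loop’s chain. A minimal corrected version:

```scala
import cats.effect.{ExitCode, IO, IOApp}

object MainIO extends IOApp {
  override def run(args: List[String]): IO[ExitCode] =
    // flatten turns IO[IO[Unit]] into IO[Unit], so the inner println runs
    IO(IO(println("hello"))).flatten.as(ExitCode.Success)
}
```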

Summary

Take library and tool documentation seriously.