SQLite Persistence Canary

A canary talking to a toucan while sitting on a chain.
Image by Annie Ruygt

As the world boldly moves towards running SQLite in production, there’s a bit of a problem that looms in today’s container-based production environments: persistence.

Ever since cloud servers arrived on the scene, we’ve been told over and over again not to depend on writing important data to disk because disks fail. Docker took this a step further by making it common practice to rebuild the entire file system from scratch on every deploy. There’s lots to like about this approach, and generally we’re much better off for it, but it does create a huge problem for SQLite in production: how do you properly educate developers that their rebuild-the-world-from-scratch deployments could destroy their production data?

A warning message isn’t enough

In 2021 Rails introduced a way of displaying a warning message in the logs:

You are running SQLite in production, this is generally not recommended. You can disable this warning by setting config.active_record.sqlite3_production_warning=false

Cool, but let’s be real: it’s really easy to miss log file messages, especially if you’re new to web applications and deploying them to production. It’s highly likely this log message would only be seen after data is lost in production.

This was actually a very reasonable approach in 2021 before the industry seriously considered running SQLite workloads in production, but as Bob Dylan famously said, “the times they are a-changin’”.

Test persistence between the first and second deploys

What if instead we could test the persistence of the application within the environment?

Here’s how this might work:

First deployment

  1. When Rails boots, there’s no SQLite database on the file system so it creates one on disk and refuses to boot with an error message: “Rails detected SQLite in production. Re-deploy your application to test if your file system will save your production data between deploys”
  2. A file would also be written to ./tmp/persistence-test.lock that would prevent Rails from booting if a SQLite database and this file are present. This deals with environments that try to boot applications multiple times when they fail to boot.

The key friction point is that developers are forced to deploy their application a second time, and that second deploy is where the file system can actually be tested for whether it persists data between deploys.
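Here’s a rough sketch of what that first boot could look like in plain Ruby. Nothing like this exists in Rails today: the `PersistenceTestRequired` error, the `first_boot!` method, and its `db_path` and `lock_path` arguments are all made up for illustration, and an empty file stands in for the real SQLite database.

```ruby
require "fileutils"

# Hypothetical error class; Rails does not define this.
class PersistenceTestRequired < StandardError; end

# Hypothetical first-boot step. `db_path` and `lock_path` are illustrative
# arguments, not real Rails configuration.
def first_boot!(db_path:, lock_path:)
  # A database already exists, so this is not a first boot.
  return if File.exist?(db_path)

  FileUtils.mkdir_p(File.dirname(db_path))
  FileUtils.mkdir_p(File.dirname(lock_path))
  FileUtils.touch(db_path)   # stand-in for creating the SQLite database
  FileUtils.touch(lock_path) # marks persistence as "not yet tested"

  raise PersistenceTestRequired,
        "Rails detected SQLite in production. Re-deploy your application " \
        "to test if your file system will save your production data between deploys"
end
```

Because the lock file is written before the error is raised, a platform that retries the failed boot would find both files on disk rather than starting the whole dance over.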

Second deployment

Assuming persistence is properly set up on the production environment, Rails would perform the following checks:

  1. When Rails boots, it checks for the existence of the SQLite database in production.
  2. Since the SQLite database exists in a persistent volume, Rails then checks to see if the ./tmp/persistence-test.lock file is present. Assuming the ./tmp directory is wiped out between deploys (it should be), Rails will boot as you’d expect and be reasonably confident that the SQLite database is being stored on a persistent disk.
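Put together, the boot-time decision comes down to which of the two files survived the deploy. The sketch below is one way that check might look, assuming the files are simply probed with `File.exist?`; the `persistence_state` method and its return symbols are hypothetical names, not a Rails API.

```ruby
# Hypothetical boot-time check; the paths are illustrative arguments.
# Returns :first_deploy, :blocked, or :boot.
def persistence_state(db_path:, lock_path:)
  # No database: either a true first deploy, or the previous database sat
  # on an ephemeral disk and did not survive the deploy.
  return :first_deploy unless File.exist?(db_path)

  # Database and lock file together: this looks like the same file system
  # that created them, so persistence is still untested.
  return :blocked if File.exist?(lock_path)

  # The database survived while ./tmp was wiped: reasonably confident the
  # database lives on a persistent volume.
  :boot
end
```

Note that on a fully ephemeral file system the second deploy lands back in `:first_deploy`, so the app keeps refusing to boot until a persistent volume is actually in place.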

This approach introduces the “right” friction points for developers who want to deploy their SQLite applications to production, minimizes surprises, and teaches them along the way about an abstraction they need to consider in their production environment.

Additional considerations

There’s a lot to think about with this approach:

Edge cases should be fail-safe

It’s possible that the ./tmp file is not erased between deploys. In that case, Rails could include in the error message an instruction that the developer can delete that file if they’re confident their database is being written to a persistent disk.

This is actually a good trade-off because it errs on the side of safety and runs developers through all the things they should think about with respect to writing data to a file on disk.
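As a sketch, the error for this case might read something like the following. The wording and the `stale_lock_message` helper are invented for illustration and are not real Rails output.

```ruby
# Hypothetical message builder; the wording is illustrative only.
def stale_lock_message(db_path:, lock_path:)
  <<~MSG
    Rails found both #{db_path} and #{lock_path}, so it cannot tell
    whether your database survives a deploy.
    If you are confident #{db_path} is written to a persistent disk,
    delete #{lock_path} and boot again.
  MSG
end
```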

Is crashing the first deploy a good idea?

It does seem a bit crazy to intentionally crash the first production deployment, but it’s even crazier to write production data to a location that could potentially be erased on the next deploy.

The other thing to keep in mind about a first deployment is that it doesn’t have users yet, so there’s little to worry about in terms of downtime.

It’s scary having a mechanism that could intentionally crash Rails if I don’t want it to

There is a legitimate concern about the remote possibility that a ./tmp/persistence-test.lock file somehow makes it into a production environment.

In this case, setting config.active_record.sqlite3_production_warning=false could completely disable the persistence test and restore confidence that Rails would never enter this state when booting.
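One way to picture that escape hatch is a guard that runs before any of the file checks. In this sketch, `run_persistence_test?` is a hypothetical helper, and its `warning_enabled` argument stands in for the value of config.active_record.sqlite3_production_warning; reusing that setting for this purpose is the proposal here, not existing Rails behavior.

```ruby
# Hypothetical guard. `warning_enabled` stands in for
# config.active_record.sqlite3_production_warning.
def run_persistence_test?(env:, adapter:, warning_enabled:)
  # Only production SQLite apps are tested, and only if the developer
  # has not opted out.
  env == "production" && adapter == "sqlite3" && warning_enabled
end

run_persistence_test?(env: "production", adapter: "sqlite3", warning_enabled: false)
# => false, so the lock file is never consulted
```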

Wrap-up

The prospect of running SQLite & Rails in production is exciting! It stands to greatly simplify the infrastructure needed to deploy small-to-medium size production applications by eliminating the need for running Postgres and Redis services, especially when used with libraries like Litestack, but extra thought and care must be put into giving developers the information they need to ensure they don’t lose production data.

This method applies not just to Rails but also to other frameworks that face the same risk of losing production data written to an ephemeral disk.