Run Ordinary Rails Apps Globally

We’re Fly.io. We take container images and run them on our hardware around the world. It’s pretty neat, and you should check it out; with an already-working Docker container, you can be up and running on Fly.io in well under 10 minutes.

If you've used your own Rails application from another continent, you may have gotten the feeling that physics has beaten your performance tuning efforts. Page loads feel a bit sluggish, even with all the right database indexes and fancy CDN-backed assets.

We've said it before: when it comes to responsiveness, 100ms is the magic number. Below 100ms, things feel instantaneous.

Now, simple regional asset caching — the CDN pitch — can bring apps closer to 100ms response times. But what if you could easily deploy your application globally? Not just images, but application logic. And what if you could do it without changes to your code?

This type of global deployment sounds like a major infrastructure project — unrealistic to undertake in the short term, and long-term reserved for giant companies with serious technical faangs. It shouldn't be that way. All apps should run close to end users. And with the right plumbing, we can distribute and scale globally from day one.

Fly.io has been doing a lot of cool stuff with Elixir and Phoenix. Elixir, built on Erlang's distributed-by-design BEAM runtime, begs to be clustered and geographically distributed, and it really sings on Fly.io. But people do real work in Rails, too. That's why I wrote the fly-ruby gem. It's a tiny library that makes it trivial to deploy Rails apps on a global platform like Fly.io. No new framework or functional language learning required.

This post is going to talk you through how fly-ruby works. Before we dig into the details of the gem itself, it's worth a minute to talk about how Fly.io works and what it means to optimize an application for it.

What Fly.io Does

For our 100ms performance goal, Fly.io has two major features that Rails can take advantage of.

Region-local database replicas: Deploying a global Postgres cluster of read replicas on Fly.io is easy. And read replicas are, by themselves, a good first step to improving Rails performance.

Rails instances read from their corresponding regional replica, so your "find local recipes" app serves information about pan-fried rice noodles in Hong Kong from a Postgres replica in Hong Kong, and Italian beef sandwiches from a replica in Chicago.

Replayable HTTP requests: Read replicas work great for retrieving data locally. But we sometimes need to write to the database, and replicas don't handle writes.

Somewhere in the world (let's say Paris) we'll have a main Postgres — that means our write requests need to make their way to Paris, over Fly.io's private network.

For example, an HTTP request may arrive in Hong Kong that needs to write to the main database in Paris. We can tell Fly.io's proxy - via the Fly-Replay response header - to move the entire HTTP request to Paris's Rails instance, and it will just work.

The magical Fly.io Ruby Gem

So let's do some testing and figure out how to get below that 100ms threshold from Paris, Chicago, Sydney and Santiago, Chile.

We've tried most performance testing tools and our current favorite is k6, a modern, open source web performance testing tool. It's unique in its approach: you write tests in JavaScript, interpreted and executed in a Go runtime. It has exquisite documentation - especially for those unfamiliar with web performance testing. Their hosted option supports distributed tests, but we can also run tests from a global Fly.io app!

First, we should see how a vanilla, single-region deployment fares. We just need a Postgres database in Paris, and a Rails app deployed in the same region. It would be weird to write a whole article without mentioning food, so here's a little recipe search app that's good for testing. For bonus points, it shows different recipes to people in different cities.

Here's how it performs:

Single region deployment: Time to First Byte
cdg 72.7ms
ord 168.7ms
syd 286.2ms
scl 442.9ms

With the current, single region app config, every request is bounced to Paris. Great for people in Paris, not great for people in Santiago with a hankerin' for Pastel de Choclo.

If we deploy our app to more regions, we get a nasty surprise:

Multiregion deployment with single database: Time to First Byte
cdg 76ms
ord 258ms
syd 528.7ms
scl 1341ms

Performance got worse!? This isn't a very good sales pitch. There's a simple explanation, though, and we're halfway to faster Chilean recipe suggestions.

The Rails instance in Sydney still needs to query the database — often multiple times — and each of those database queries bounces around the world to Paris (over an encrypted private network). Adding latency between the app server and database multiplies internet latency. One round trip from Sydney to Paris might take 400ms. Ten in a row feels like an hour.
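
To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes queries run one after another and each pays a full round trip. The 400ms figure comes from the paragraph above; the query count, the 2ms local round trip, and the 20ms render time are made-up values for illustration, not measurements.

```ruby
# Rough model: sequential queries each pay one full round trip to the database.
# All inputs are illustrative assumptions, not measurements.
def estimated_response_ms(round_trip_ms:, queries:, render_ms: 20)
  (round_trip_ms * queries) + render_ms
end

far   = estimated_response_ms(round_trip_ms: 400, queries: 10) # Sydney app, Paris database
local = estimated_response_ms(round_trip_ms: 2, queries: 10)   # app and database in one region

puts "far: #{far}ms, local: #{local}ms"
```

The point of the model is that the query count multiplies the distance. Moving the app closer to users only helps if the database comes along.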

Now, here's the sales pitch. The fly-ruby gem will switch to regional replicas for database reads and magically route write requests to the primary database.

If we add the fly-ruby gem and set the PRIMARY_REGION environment variable, here's what happens:

Multiregion deployment with regional database replicas: Time to First Byte
cdg 75.1ms
ord 50.4ms
syd 45.8ms
scl 84.4ms

One tiny configuration change, a latency reduction of 80 to 94% everywhere outside Paris, and our Rails app suddenly responds in under 100ms. No architecture work required.
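
If you want to check the arithmetic, this short sketch computes the per-region reduction from the two multiregion tables above. The numbers are copied from those tables; nothing else is assumed.

```ruby
# Time to First Byte (ms), copied from the two multiregion tables above.
single_db = { "cdg" => 76.0, "ord" => 258.0, "syd" => 528.7, "scl" => 1341.0 }
replicas  = { "cdg" => 75.1, "ord" => 50.4, "syd" => 45.8, "scl" => 84.4 }

# Percentage drop in latency per region, rounded to one decimal place.
reductions = single_db.to_h do |region, before|
  after = replicas.fetch(region)
  [region, ((before - after) / before * 100).round(1)]
end

reductions.each { |region, pct| puts "#{region}: #{pct}% lower" }
```

Paris barely moves because it already sat next to the primary database. The far regions are where the gain shows up.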

It's not actually magic

This 300-line gem doesn't really do much. Postgres and the Fly.io global proxy do all the heavy lifting. It's a set of Rack middleware that does the last little bit of work for you. And it's usable in any Rack-compatible application, for people who like their Ruby without restrictive rails.

The magic here lives in the Fly-Replay header. I pass a state: an arbitrary value written to the Fly-Replay-Src header, appended to the final replayed request to the primary application instance.

This state assists the middleware in handling the replay under different conditions, as we'll see below.

def self.replay_in_primary_region!(state:)
  res = Rack::Response.new(
    "",
    409,
    {"Fly-Replay" => "region=#{Fly.configuration.primary_region};state=#{state}"}
  )
  res.finish
end

I exploit the web perf rule of thumb that most requests are reads, and most reads use HTTP GET requests. I can safely reconnect Rails to the region-local database replica. The gem builds the replica URI using Fly.io's DNS service discovery and the FLY_REGION environment variable.

database_uri = URI.parse(ENV['DATABASE_URL'])
database_uri.host = "#{ENV['FLY_REGION']}.#{database_uri.hostname}"
database_uri.port = 5433
database_uri.to_s

As a result, HTTP GET requests are passed directly down to the Rails application. And, in the normal case, they return after a speedy round trip to the database replica.

But GET requests occasionally perform writes. It's dirty, but true.

Fortunately, Postgres won't allow writes to a read replica. When a database write slips through, the Ruby Postgres library throws an exception. The gem inserts another middleware, at the bottom of the stack, to catch the PG::ReadOnlySqlTransaction exception. This halts the response and asks Fly.io to replay the original request in the primary region.

def call(env)
  @app.call(env)
rescue PG::ReadOnlySqlTransaction, ActiveRecord::StatementInvalid => e
  # ActiveRecord wraps driver errors, so also check the wrapped cause.
  if e.is_a?(PG::ReadOnlySqlTransaction) || e.cause.is_a?(PG::ReadOnlySqlTransaction)
    RegionalDatabase.replay_in_primary_region!(state: "captured_write")
  else
    raise e
  end
end
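
To poke at this behavior without Postgres or Rails installed, here's a self-contained sketch of the same idea. It is not the gem's code: ReadOnlyWriteError is a stand-in for PG::ReadOnlySqlTransaction, the class name and constructor are hypothetical, and the middleware is a plain Ruby object that follows the Rack calling convention.

```ruby
# Stand-in for PG::ReadOnlySqlTransaction so the sketch runs without the pg gem.
class ReadOnlyWriteError < StandardError; end

# Rack-style middleware: any object with #call(env) that returns
# [status, headers, body]. No rack gem is required for this sketch.
class CapturedWriteReplay
  def initialize(app, primary_region:)
    @app = app
    @primary_region = primary_region
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    # Re-raise anything that is not a read-only write, directly or as a cause.
    raise unless [e, e.cause].any? { |err| err.is_a?(ReadOnlyWriteError) }

    [409, { "Fly-Replay" => "region=#{@primary_region};state=captured_write" }, []]
  end
end

# A fake app that writes during a GET, as some real apps do.
writing_app = ->(_env) { raise ReadOnlyWriteError, "cannot write in a read-only transaction" }
status, headers, _body = CapturedWriteReplay.new(writing_app, primary_region: "cdg").call({})
puts status, headers["Fly-Replay"]
```

The sketch rescues StandardError and filters, where the gem's snippet names the exception classes up front. Either way, unrelated errors pass through untouched.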

It could stop here. But there are a bunch of requests for which we don't have to do this dance. It's safe to assume that non-idempotent HTTP requests intend to write to the database. This includes, by default, POST, PUT, PATCH and DELETE requests.

So, from high in the middleware stack, the gem halts and replays probably-write requests in the primary region, which prevents unnecessary application requests in the secondary region.
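
A minimal sketch of that early check might look like this. The class name, constructor arguments, and the state value are hypothetical, and I'm assuming the check only fires outside the primary region (otherwise the primary would replay to itself forever).

```ruby
# Rack-style middleware sketch: replay likely-write requests before the
# application runs. Names and the constructor signature are hypothetical.
class ReplayWrites
  WRITE_METHODS = %w[POST PUT PATCH DELETE].freeze

  def initialize(app, current_region:, primary_region:)
    @app = app
    @current_region = current_region
    @primary_region = primary_region
  end

  def call(env)
    if replay?(env)
      return [409, { "Fly-Replay" => "region=#{@primary_region};state=http_method" }, []]
    end

    @app.call(env)
  end

  private

  # Only secondary regions replay, and only for methods that usually write.
  def replay?(env)
    @current_region != @primary_region && WRITE_METHODS.include?(env["REQUEST_METHOD"])
  end
end

app = ->(_env) { [200, {}, ["ok"]] }
santiago = ReplayWrites.new(app, current_region: "scl", primary_region: "cdg")
puts santiago.call({ "REQUEST_METHOD" => "POST" }).first
puts santiago.call({ "REQUEST_METHOD" => "GET" }).first
```

Because the replay happens before the app is called, the secondary region never loads a controller or opens a database connection for these requests.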

One catch with this setup is: physics. Imagine we're handling a large replicated write, say, an HTTP POST of a large recipe entry in Santiago, Chile. A request to read that entry back from Santiago can race the replication of the write from Paris, and lose. You see this pattern, "create-and-redirect-to-show", somewhat regularly in Rails apps, and if you break it, you get a poor user experience.

To prevent this, replayed requests set a cookie containing a configurable time threshold. Until that threshold passes, further requests from the same browser are sent to the primary region. This is a simple but valuable trade-off: a temporary performance penalty in exchange for consistency. Remember, we assume that most uses of the application won't write at all; the worst case isn't terrible, and the common case is very fast. It's usually the right trade.
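
Here's one way the cookie check could work, as a sketch. The cookie name, the 5-second window, and the helper names are all hypothetical, and the cookie is parsed by hand so the example has no dependencies. The gem's actual implementation may differ.

```ruby
# Hypothetical cookie name and window; the gem's actual values may differ.
THRESHOLD_COOKIE = "replay-threshold"
THRESHOLD_SECONDS = 5

# Value to store in the cookie on a replayed (write) response:
# the epoch second until which reads should stay on the primary.
def threshold_cookie_value(now: Time.now)
  (now.to_i + THRESHOLD_SECONDS).to_s
end

# Pulls one cookie out of a raw Cookie header such as "a=1; b=2".
def cookie_value(cookie_header, name)
  cookie_header.to_s.split(/;\s*/).each do |pair|
    key, value = pair.split("=", 2)
    return value if key == name
  end
  nil
end

# True when the browser wrote recently enough that a local replica
# might not have the data yet.
def within_replay_threshold?(env, now: Time.now)
  deadline = cookie_value(env["HTTP_COOKIE"], THRESHOLD_COOKIE)
  !deadline.nil? && deadline.to_i > now.to_i
end

now = Time.at(1_700_000_000)
env = { "HTTP_COOKIE" => "session=abc; #{THRESHOLD_COOKIE}=#{threshold_cookie_value(now: now)}" }
puts within_replay_threshold?(env, now: now + 2)  # still inside the window
puts within_replay_threshold?(env, now: now + 10) # window has passed
```

A middleware would call within_replay_threshold? on each incoming request and respond with Fly-Replay when it returns true.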

Curiously, this approach mirrors the Rails default implementation of read/write splitting between primary and replica databases.

I ❤️ Rack

Apart from Fly.io's magic, the Rack standard made this gem a cinch to implement. Rack is one of the major successes of the Ruby and Rails development environment. It's underappreciated, so here's some love.

Most web apps share a lot of common behavior in marshaling, unmarshaling, validating, and routing requests. These are the basic features that a web framework provides, and why frameworks are so popular. It used to be difficult (and on some platforms it still is) to change those behaviors: you had to change your application, or, worse, the framework itself to accomplish it.

Python's WSGI was probably the first standard aimed at solving this problem. Rack came shortly after, inspired by WSGI. Both provide a simple, elegant interface for inserting common behavior between web servers and applications. This also happens to be a great way to simplify framework-specific behavior.

Try typing rails middleware in a Rails app production environment:

use ActionDispatch::HostAuthorization
use ActionDispatch::SSL
use Rack::Sendfile
use ActionDispatch::Static
use ActionDispatch::Executor
use ActiveSupport::Cache::Strategy::LocalCache::Middleware
use Rack::Runtime
use Rack::MethodOverride
use ActionDispatch::RequestId
use ActionDispatch::RemoteIp
use Rails::Rack::Logger
use ActionDispatch::ShowExceptions
use ActionDispatch::DebugExceptions
use ActionDispatch::ActionableExceptions
use ActionDispatch::Callbacks
use ActionDispatch::Cookies
use ActionDispatch::Session::CookieStore
use ActionDispatch::Flash
use ActionDispatch::ContentSecurityPolicy::Middleware
use ActionDispatch::PermissionsPolicy::Middleware
use Rack::Head
use Rack::ConditionalGet
use Rack::ETag
use Rack::TempfileReaper
run Cookherenow::Application.routes

Exception handling, caching, session management, cookie encryption, static file delivery - all implemented as Rack middleware. Building apps this way provides a clear path for a request to reach an application, and more importantly, a standard way to insert middlewares at a specific location. The framework is now programmable.
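
If you haven't written one before, this is roughly all a middleware is. The sketch below is plain Ruby that follows the Rack calling convention (call(env) returns status, headers, body), so it runs without the rack gem. The class name and the header it adds are made up for illustration.

```ruby
# A complete Rack-style middleware: wrap the next app, call it, adjust the response.
class RegionHeader
  def initialize(app, region:)
    @app = app
    @region = region
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers.merge("x-served-by-region" => @region), body]
  end
end

# The innermost "application" is just another object that responds to #call.
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["hello"]] }
stack = RegionHeader.new(app, region: "syd")

status, headers, body = stack.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/" })
puts status, headers.inspect, body.join
```

In a Rails app, config.middleware.use, insert_before, and insert_after place a class like this at a chosen position in the list above.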

The fly-ruby gem implements two Rack "middlewares". It's idiomatic, and easy to shoplift (from, say, Sentry's exception handling library).

What about background jobs?

Background jobs are a core piece of infrastructure for most Rails apps. Naturally, they'll need to write to the database. Restricting worker processes to the primary region is the simplest way to handle such jobs in a multi-region scenario.

But if we're using a database - like Postgres or Redis - to store the jobs, queuing up the job itself will be slow from secondary regions. If we enqueue lots of jobs in GET requests, this performance loss could offset our gains.

Furthermore, some apps - like Discourse - run smaller background jobs in the web process itself. Both scenarios need to write to the primary database without relying on HTTP trickery.

For example, we might add code to fly-ruby like this.

Fly.on_primary do
  Recipes.transform
end

The Rails support for read/write splits takes a similar path to force a specific database connection.

Where this breaks down

Some complex Rails applications, like Discourse, make this kind of setup difficult.

Some apps write on every request. Think about things like lazy authentication session token refresh, or touching a user's last_seen attribute. These generate unexpected writes, and, worse, waste cycles on regional app servers.

Moving work like this to a background job is a fine solution to this problem. It also happens to be a best practice for keeping applications performant and resilient. So if you can do this, you should.

Background jobs in Rails apps without infrastructure support for jobs might seem like a pain. But they don't have to be. You could use ActiveJob's in-memory queue adapter for jobs you would not mind losing on restart.

Complex interactions with other data stores may slow requests down. By default, Discourse stores statistics and logs in Redis, and reads and writes to it on every request. This can be tricky to deal with in a global deployment. Solutions like read/write splitting may be useful here, but they're not "just install this gem"-simple to implement.

Relying on catching read-only exceptions could lead to inconsistent data. For example, a visit counter being incremented in Redis before the Postgres exception is raised would be bumped twice: once in the secondary region request, and again in the replayed primary region request. Most apps aren't going to care about this, but you want to be aware of it.
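
The double count is easy to reproduce in a toy simulation. Here a plain Hash stands in for Redis and ReadOnlyWriteError stands in for the Postgres read-only exception. The handler and its arguments are hypothetical.

```ruby
# Stand-in for the Postgres read-only exception.
class ReadOnlyWriteError < StandardError; end

# A request handler that bumps a counter first and writes to Postgres second.
def handle_visit(counters, region:, primary_region:)
  counters[:visits] += 1                                # side effect outside Postgres
  raise ReadOnlyWriteError if region != primary_region  # the replica rejects the write
  :saved
end

counters = { visits: 0 } # a Hash standing in for Redis

begin
  handle_visit(counters, region: "scl", primary_region: "cdg")
rescue ReadOnlyWriteError
  # The middleware would answer with Fly-Replay here, and the proxy
  # would run the same request again in the primary region.
  handle_visit(counters, region: "cdg", primary_region: "cdg")
end

puts counters[:visits] # one visit, counted twice
```

Moving the side effect after the database write, or into a background job, avoids the duplicate.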

Large multipart file uploads might be doubly slow if they're replayed after the browser upload completes.

What's next?

Region-local Redis caches would be dope. For Rails apps, this could mean that the common approach of fragment or Russian-doll caching could get a boost at the global level without much work.

And more adapters! Adapters for Node.js/Express, Phoenix, Django, and friends. They're totally doable and you should get in touch if you like these kinds of projects.