Your Rails app works on localhost. Getting it to stay working under real traffic, real deploys, and real background jobs is a different problem entirely.
You’ve got a Rails app that passes tests, handles your local seed data, and renders correctly in development. Now you need to put it in front of users. That means picking a host, configuring a Ruby runtime, compiling assets, wiring up a job queue, attaching persistent storage, and making sure the whole thing doesn’t fall over when two deploys happen in the same hour. Rails hosting isn’t one decision. It’s a stack of decisions that interact with each other in ways that only become obvious when something breaks at 2am.
Getting this right matters because the failure modes are specific and expensive. A misconfigured Puma thread count causes request queuing under moderate load. Missing asset precompilation breaks your deploy silently. A background job worker that shares a database connection pool with your web process causes timeouts under load. None of these are hypothetical. They’re the standard first-month production incidents for teams that treat hosting as an afterthought. When you understand what a Rails hosting environment actually needs, you can make deliberate choices instead of cargo-culting a setup from a five-year-old tutorial.
This page covers what a production Rails hosting environment requires, how the server components fit together, where the common failure modes live, and how Rails hosting compares to Django hosting and Laravel hosting at the infrastructure level.
## Key takeaways
- Rails hosting requires coordinating a Ruby runtime, an application server (typically Puma), a web server or proxy (typically Nginx), a process manager, a job queue, and persistent storage. These are not optional layers.
- Concurrency in Rails is handled at the application server level through Puma’s thread and worker configuration. Getting this wrong causes either resource exhaustion or underutilization, depending on which direction you misconfigure it.
- Asset compilation must happen at deploy time, not at runtime. If your hosting setup doesn't run `assets:precompile` before traffic hits the new release, users will see broken stylesheets and missing JavaScript.
- A production Rails hosting setup is working correctly when deploys are zero-downtime, background jobs process without database connection contention, and the application server thread count matches your database connection pool size.
## What is Rails Hosting?
Rails hosting means running a Ruby on Rails application in a production environment that handles real HTTP traffic, manages application state, processes background jobs, and survives deploys without dropping requests. That’s a broader surface area than most frameworks advertise.
At minimum, a production Rails hosting environment needs:
- A compatible Ruby runtime. Rails versions have specific Ruby version requirements. Running the wrong Ruby version causes boot failures or subtle compatibility bugs. Your host needs to support the exact Ruby version your app targets, and you need a way to pin and upgrade it without rebuilding your entire server.
- An application server. Rails ships with Puma as the default. Puma handles concurrent requests using threads within worker processes. You configure thread count and worker count based on your available memory and CPU.
- A reverse proxy or web server. Nginx or a platform-level load balancer sits in front of Puma to handle slow clients, serve static assets, terminate TLS, and buffer requests. Puma alone is not designed to face the public internet directly.
- Asset compilation. Rails uses Sprockets or Propshaft to compile CSS, JavaScript, and images into fingerprinted static files. This must run during the deploy process. The compiled output needs to be served efficiently, either from the filesystem or a CDN.
- A database with a connection pool. ActiveRecord manages a pool of database connections per process. The pool size must be tuned to match your Puma thread count. Mismatches cause connection exhaustion or idle connections wasting database resources.
- A background job system. Sidekiq, Solid Queue, or GoodJob handle async work. Each requires its own process, its own connection pool configuration, and its own process management.
- Persistent storage. File uploads, cached assets, and any data written to disk need a storage layer that survives deploys and scales across multiple instances.
- Process management. Something needs to start, monitor, and restart your Puma and worker processes. Systemd, Foreman, or a container orchestrator handles this.
That’s the baseline. Everything else is configuration and tuning on top of it.
## How Does Rails Hosting Work?
The request path through a Rails hosting stack is worth understanding precisely, because each layer introduces failure modes.
An HTTP request arrives at your load balancer or reverse proxy. Nginx (or an equivalent) terminates TLS, checks if the request is for a static asset, and either serves the file directly from disk or proxies the request upstream to Puma. Puma receives the request on a listening socket, assigns it to an available thread within a worker process, and runs the Rails request cycle: routing, middleware, controller, view, response. The response travels back through Nginx to the client.
```
Client
  |
  v
Load Balancer / Nginx (TLS termination, static assets, request buffering)
  |
  v
Puma (multi-threaded application server, N workers x M threads)
  |
  v
Rails (routing, middleware stack, ActiveRecord, views)
  |
  v
PostgreSQL / MySQL (connection pool managed by ActiveRecord)
```
The concurrency math matters here. If you run Puma with 2 workers and 5 threads each, you can handle 10 simultaneous requests. Your database connection pool should be set to at least 5 per worker (10 total) to avoid thread starvation waiting for a connection. Your background job workers add to this connection count. If you’re running Sidekiq with 10 concurrency on the same host, your database needs to support at least 20 connections from that single application server.
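The standard way to keep the pool in step with Puma is to derive both from the same environment variable, which is what Rails' generated configuration does. A minimal `config/database.yml` sketch:

```yaml
# config/database.yml (production section)
# The pool follows RAILS_MAX_THREADS, the same variable Puma reads,
# so each worker process has one connection available per thread.
production:
  adapter: postgresql
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
```

Because the pool is per-process, 2 Puma workers each get their own pool of 5, matching the 10-connection total in the example above.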
## Asset Compilation and Deployment Flow
Asset compilation is where Rails hosting setups break silently. The symptom is a successful deploy that results in 404s for CSS and JavaScript files.
Rails fingerprints compiled assets with a content hash. The HTML your app renders references /assets/application-abc123.css. If that file doesn’t exist on disk (or in your CDN), the browser gets a 404. This happens when:
- The deploy process skips `RAILS_ENV=production bundle exec rails assets:precompile`
- The compiled assets are written to a directory that isn't mounted or shared across instances
- A rolling deploy serves HTML from the new release but static assets from the old release's compiled output
The fix is to run asset precompilation as a deploy step, before traffic switches to the new release, and to ensure compiled assets are either baked into the container image or written to a shared volume that all instances can read.
```sh
# Typical deploy sequence for a containerized Rails app
bundle install
RAILS_ENV=production bundle exec rails assets:precompile
RAILS_ENV=production bundle exec rails db:migrate
# Then swap traffic to the new release
```
Zero-downtime deploys add another layer. You need the old release to keep serving traffic while the new release boots. That means your web server or load balancer needs health check support, and your new Puma processes need to pass health checks before the old ones are terminated. Most container platforms handle this with readiness probes. On bare VMs, you configure this in Nginx upstream blocks with max_fails and fail_timeout settings.
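On a bare VM, the Nginx side of this looks roughly like the following sketch. The socket path and thresholds are illustrative, and TLS directives are omitted for brevity:

```nginx
# Sketch only — socket path and failure thresholds are illustrative.
# max_fails/fail_timeout tell Nginx when to stop sending traffic to a
# backend that is failing health checks or refusing connections.
upstream puma {
  server unix:/var/run/puma.sock max_fails=3 fail_timeout=10s;
}

server {
  listen 80;
  location / {
    proxy_pass http://puma;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}
```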
## Background Jobs and Persistent Storage
Background jobs are a first-class concern in Rails hosting, not an add-on. Most Rails applications use Active Job with a backend like Sidekiq (Redis-backed), Solid Queue (database-backed), or GoodJob (database-backed). Each approach has different infrastructure requirements.
Sidekiq requires a Redis instance. It runs as a separate process from Puma. It needs its own database connection pool configuration. Under load, Sidekiq’s concurrency setting directly affects how many database connections it holds open. A common mistake is setting Sidekiq concurrency to 25 without accounting for those 25 additional database connections on top of your Puma pool.
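To make the connection budget concrete, here is a quick sketch using the illustrative numbers from earlier (2 Puma workers x 5 threads, plus a Sidekiq process at concurrency 10):

```ruby
# Rough database connection budget for a single application host.
# All numbers are illustrative — substitute your own configuration.
puma_workers        = 2
puma_threads        = 5   # per worker; each thread may hold a connection
sidekiq_concurrency = 10  # Sidekiq holds roughly one connection per job thread

web_connections = puma_workers * puma_threads
job_connections = sidekiq_concurrency
total           = web_connections + job_connections

puts total  # => 20 connections this host can demand from the database
```

Run the same arithmetic against your database's `max_connections` before raising Sidekiq concurrency; the 25-concurrency mistake above adds 25 connections to this total, not zero.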
Solid Queue and GoodJob use your existing database as the job store, which simplifies infrastructure but increases database load. For low-to-medium job volumes, this tradeoff is usually worth it. For high-throughput job processing, Redis-backed queues handle the load more efficiently.
Persistent storage is the other stateful concern. If your app accepts file uploads, you need a storage backend that works across multiple application instances. Writing uploads to the local filesystem works on a single server but breaks immediately when you scale to two instances or when a deploy replaces the container. Active Storage with an S3-compatible backend (or a similar object store) is the standard solution. Your hosting environment needs to provide either managed object storage or a persistent volume that survives deploys.
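A minimal Active Storage configuration for an S3 backend looks like the following sketch; the bucket name and region are placeholders, and the credentials keys assume you store AWS secrets in Rails encrypted credentials:

```yaml
# config/storage.yml — bucket, region, and credential keys are placeholders
amazon:
  service: S3
  access_key_id: <%= Rails.application.credentials.dig(:aws, :access_key_id) %>
  secret_access_key: <%= Rails.application.credentials.dig(:aws, :secret_access_key) %>
  region: us-east-1
  bucket: your-app-uploads
```

With `config.active_storage.service = :amazon` set in `config/environments/production.rb`, uploads go to the object store instead of the local filesystem, so they survive deploys and are visible to every instance.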
## Rails Hosting vs. Django Hosting vs. Laravel Hosting
Rails, Django, and Laravel solve similar problems at the framework level, but the hosting requirements differ at the runtime and server configuration layer.
| Concern | Rails Hosting | Django Hosting | Laravel Hosting |
|---|---|---|---|
| Runtime | Ruby (version-pinned via `.ruby-version` or rbenv) | Python (version-pinned via pyenv or virtualenv) | PHP (version-pinned via PHP-FPM configuration) |
| Application server | Puma (multi-threaded, multi-process) | Gunicorn or uWSGI (multi-process, optional async with ASGI) | PHP-FPM (process pool, not a separate app server) |
| Web server role | Nginx proxies to Puma socket | Nginx proxies to Gunicorn socket | Nginx passes requests directly to PHP-FPM via FastCGI |
| Dependency management | Bundler (`Gemfile.lock`) | pip with `requirements.txt` or Poetry | Composer (`composer.lock`) |
| Asset compilation | Sprockets or Propshaft, runs at deploy time | Whitenoise or Django's `collectstatic` | Vite or Mix, runs at deploy time |
| Background jobs | Sidekiq, Solid Queue, GoodJob | Celery (Redis or RabbitMQ backend) | Laravel Queues (database, Redis, or SQS backend) |
| Database migrations | `rails db:migrate` | `python manage.py migrate` | `php artisan migrate` |
The structural difference worth noting: PHP-FPM handles concurrency through a process pool managed by the PHP-FPM daemon itself, so Laravel hosting doesn’t require a separate application server process in the same way Rails and Django do. Nginx talks to PHP-FPM directly via FastCGI. Rails and Django both require a separate application server process (Puma, Gunicorn) that Nginx proxies to over a Unix socket or TCP port.
Django hosting with ASGI (using Daphne or Uvicorn) adds async request handling, which changes the concurrency model significantly. Rails has ActionCable for WebSocket support, but the core request handling remains synchronous and thread-based. These differences matter when you’re evaluating hosting setups for applications with long-lived connections or high concurrency requirements.
The dependency management difference is operationally significant. Bundler’s lockfile is strict and reproducible. pip’s ecosystem has historically been less consistent, though Poetry and pip-tools have improved this. Composer is reliable and widely understood in the PHP ecosystem. All three require that your hosting environment runs dependency installation as part of the build or deploy process, not at runtime.
## When to Use a Dedicated Rails Hosting Setup
Not every Rails application needs the full production stack from day one, but certain conditions make a properly configured hosting environment non-negotiable.
- You’re handling more than one concurrent user. The moment you have two users hitting your app at the same time, Puma’s thread and worker configuration starts to matter. A single-threaded setup will queue requests and produce slow response times under any real load.
- Your app runs background jobs. If you’re sending email, processing uploads, syncing external APIs, or doing anything async, you need a separate worker process with its own connection pool. Running jobs in-process with your web server is a footgun that causes request timeouts and job failures under load.
- You’re deploying more than once a week. Frequent deploys without zero-downtime configuration means users see errors during every release. Once your deploy cadence picks up, you need health checks, readiness probes, and a release command that runs migrations before traffic switches.
- Your app writes files or accepts uploads. Local filesystem storage breaks the moment you run more than one instance or replace a container. If your app touches the filesystem for anything user-facing, you need a persistent volume or object storage before you scale.
- You’re running multiple environments. Staging and production environments that share infrastructure configuration need reproducible Ruby version pinning, consistent Bundler lockfiles, and environment-specific secrets management. Ad-hoc setups accumulate drift that causes production-only bugs.
## Common Challenges and Trade-offs
### Concurrency configuration is easy to get wrong in both directions
Setting Puma thread count too low wastes available CPU and memory. Setting it too high exhausts your database connection pool. The right number depends on your workload (I/O-bound vs. CPU-bound), your database’s max_connections, and how many other processes (Sidekiq, cron jobs) are competing for connections. There’s no universal default. You have to measure.
```ruby
# config/puma.rb
workers ENV.fetch("WEB_CONCURRENCY") { 2 }
threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }
threads threads_count, threads_count
preload_app!

on_worker_boot do
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```
The `preload_app!` directive loads your Rails application before forking workers, which reduces memory usage through copy-on-write. The `on_worker_boot` block re-establishes database connections after forking, which is required because database connections cannot be shared across fork boundaries.
### Asset compilation failures are silent
A deploy that skips assets:precompile or writes compiled assets to the wrong directory will succeed from the CI perspective and break in production. The error shows up as 404s in the browser, not as a failed deploy. Build asset compilation into your container image or deploy pipeline and verify the output path before switching traffic.
### Database migrations during deploys require careful ordering
Running db:migrate after traffic switches to the new release means your new code may run against the old schema for a window of time. Running it before traffic switches means your old code runs against the new schema. Neither is safe for all migration types. Additive migrations (adding columns, adding tables) are generally safe to run before traffic switches. Destructive migrations (dropping columns, renaming columns) require more careful sequencing, often involving multiple deploys.
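For example, an additive change like this hypothetical migration is generally safe to run before traffic switches, because code from the old release simply ignores the new column:

```ruby
# Hypothetical additive migration — table and column names are illustrative.
# Old code never references archived_at, so running this ahead of the
# traffic switch doesn't break the still-serving release.
class AddArchivedAtToPosts < ActiveRecord::Migration[7.1]
  def change
    add_column :posts, :archived_at, :datetime
  end
end
```

A column rename, by contrast, typically takes two deploys: one that adds the new column and writes to both, and a later one that drops the old column once no running code reads it.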
### Stateful processes complicate horizontal scaling
Sidekiq workers, Action Cable connections, and any in-memory state don’t automatically distribute across multiple application instances. Scaling from one instance to two requires that your job queue backend (Redis or your database) is accessible from all instances, that your WebSocket connections are routed consistently, and that any in-memory caching is either replaced with a shared cache (Redis, Memcached) or accepted as instance-local.
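Replacing the instance-local cache with a shared Redis cache is a small change in the production config. A sketch, assuming a `REDIS_URL` environment variable points at a Redis instance reachable from all instances:

```ruby
# config/environments/production.rb (fragment)
# Requires the redis gem; REDIS_URL is an assumption about your environment.
config.cache_store = :redis_cache_store, { url: ENV["REDIS_URL"] }
```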
## Rails Hosting on Fly.io
Fly.io runs Rails applications as hardware-isolated VMs (Fly Machines) that boot fast enough to handle HTTP requests and scale to zero when idle. The deployment model maps cleanly onto the Rails hosting stack described above: your app runs in a container image that includes your Ruby runtime, compiled assets, and application code, with Puma as the application server.
Fly handles TLS termination, global load balancing, and request routing at the platform level, so you don’t need to configure Nginx separately. Persistent volumes attach to your Machines for local storage needs. For background jobs, you run a separate Machine with your Sidekiq or Solid Queue process. Database connections go to Fly’s managed Postgres or an external database, with connection pooling handled by PgBouncer if needed.
The fly launch command detects Rails applications and generates a working Dockerfile and fly.toml configuration. Asset precompilation runs during the Docker build. Database migrations run as a release command before traffic switches to the new deployment.
```toml
# fly.toml (relevant sections)
[deploy]
  release_command = "bundle exec rails db:migrate"

[http_service]
  internal_port = 3000
  force_https = true

  [[http_service.checks]]
    path = "/up"
    interval = "10s"
    timeout = "2s"
```
The /up health check endpoint (added in Rails 7.1) lets Fly’s load balancer verify that a new Machine is ready before sending it traffic, which gives you zero-downtime deploys without additional configuration.
## Frequently Asked Questions
### What does a Rails hosting environment need to support?
A Rails hosting environment needs to support Ruby version compatibility, asset compilation, background jobs, persistent storage, request routing, and concurrency to run Ruby on Rails applications in production.
### How does Rails hosting differ from Django hosting or Laravel hosting?
Rails hosting, Django hosting, and Laravel hosting all require framework-specific runtime support, application packaging, and server configuration, though each framework has its own language runtime and dependency management approach.
### What server components are involved in hosting a Rails application?
Running a Rails application in production involves an application server, a web server, and process management working together to handle incoming requests and serve responses.
### What are the main concerns when evaluating a Rails hosting setup for production?
Performance, security, scaling, and deployment automation are the primary concerns when evaluating how a Rails application is served in a production environment.
### How does concurrency factor into Rails hosting?
Rails hosting setups handle concurrency through the configuration of the application server and web server, which together determine how many simultaneous requests the application can process.