Autoscaling

Autostop/Autostart: The "Only Pay for What You Use" Button

A user calls your app, your Machines wake up. Like ... really, really fast. No traffic? Machines go back to sleep. No CPU or RAM charges while stopped. It's easy to configure this behavior to cover exactly what your project needs.

Wake up in milliseconds when requests arrive
Stop automatically during idle periods
Zero CPU/RAM charges when stopped

Set minimum Machines to keep warm
Perfect for sporadic traffic patterns
Enabled by default for new apps

Metrics-Based Autoscaling: Because Queue Depth > Request Count

Sometimes "number of HTTP requests" isn't the right metric. Got background workers chomping through a job queue? Temporal workflows piling up? Scale based on what actually matters to your app: queue depth, pending work, custom metrics from Prometheus, whatever keeps you up at night.

Scale based on queue depth, pending jobs, or custom metrics

Pull metrics from Prometheus or Temporal

Write scaling rules with expressions and arithmetic

Create or destroy Machines dynamically

Scale multiple apps with common naming patterns
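Metrics-based scaling usually runs as its own small autoscaler app. Here is a rough sketch of what that app's fly.toml environment might contain, assuming the fly-autoscaler project's FAS_* settings. The app name pattern, org slug, metric name, query, and the divide-by-10 rule are all hypothetical placeholders. Check the variable names against the autoscale-by-metric docs before relying on them.

```toml
# Sketch of an autoscaler app's environment. All values are placeholders.
[env]
  # Hypothetical target apps; the wildcard matches a common naming pattern.
  FAS_APP_NAME = "my-worker-*"
  # Prometheus endpoint to query; "my-org" is an assumed org slug.
  FAS_PROMETHEUS_ADDRESS = "https://api.fly.io/prometheus/my-org"
  # Name the scaling expression below uses to refer to the query result.
  FAS_PROMETHEUS_METRIC_NAME = "queue_depth"
  # Hypothetical PromQL query for total pending jobs.
  FAS_PROMETHEUS_QUERY = "sum(queue_depth)"
  # Scaling rule: one Machine per 10 queued jobs, capped at 50.
  FAS_CREATED_MACHINE_COUNT = "min(50, queue_depth / 10)"
```

With this rule, a queue of 120 jobs would ask for 12 Machines, and an empty queue would scale the app down to zero.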

What Even is This Magic?

Fly Proxy sits at the edge and watches traffic. When a request shows up for a sleeping Machine, the proxy wakes it up faster than you can say "cold start problem." Machines boot in milliseconds, handle the request, and go back to sleep when things quiet down. You configure the behavior, we handle the orchestration.

Fly Proxy detects incoming traffic and wakes Machines instantly

Configure stop/start behavior in your fly.toml

Set minimum Machines running to avoid cold starts

You only pay for CPU and RAM while Machines are actually running
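As a rough illustration of the settings listed above, a service section in fly.toml might look like this. The port and the minimum of one Machine are placeholder values, not recommendations. The configuration reference has the full set of options.

```toml
# Sketch of autostop/autostart settings in fly.toml.
# internal_port and min_machines_running are illustrative values.
[http_service]
  internal_port = 8080          # port your app listens on (assumed)
  auto_stop_machines = "stop"   # stop idle Machines; "off" disables autostop
  auto_start_machines = true    # let Fly Proxy start Machines when requests arrive
  min_machines_running = 1      # keep one Machine warm to avoid cold starts
```

Setting min_machines_running to 0 lets every Machine stop when traffic goes quiet, trading a short wake-up on the first request for lower cost.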

Read the autoscaling docs
