Scaling and Fly

Fly Scaling Architecture

Scaling Dimensions

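Fly applications scale along several dimensions: within a region, as concurrent connections rise; across regions, as instances are placed in different datacenters; and in the size of the virtual machine that each instance runs in.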

Scaling And Configuration

Fly scales within regions by creating more instances of the application as needed. That need is defined by the number of concurrent connections that an application has. The thresholds are defined in the fly.toml file under services.concurrency.

By default, when an application sees 20+ connections, a new instance of the application is started and new connections go to that instance. By adjusting the soft and hard limits of concurrency in the configuration file, you can set how many connections will trigger the creation of a new instance.
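As a rough mental model of the soft limit, the sketch below estimates how many instances a given number of concurrent connections would call for. The function name instances_needed, the assumption that connections spread evenly across instances, and the clamping to a minimum and maximum count are all illustrative assumptions, not Fly's actual scheduling logic; only the default threshold of 20 comes from the text above.

```python
import math


def instances_needed(connections, soft_limit=20, min_count=1, max_count=10):
    """Estimate how many instances keep each one at or under soft_limit.

    Hypothetical model: assumes connections are spread evenly over instances.
    """
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    wanted = math.ceil(connections / soft_limit)
    # Never go below the minimum or above the maximum instance count.
    return max(min_count, min(wanted, max_count))


if __name__ == "__main__":
    for load in (5, 20, 21, 95, 500):
        print(load, "connections ->", instances_needed(load), "instance(s)")
```

Raising soft_limit in this model means fewer, busier instances; lowering it means more instances with fewer connections each.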

Where the new instance appears depends on regional availability and on Fly's auto-balancing mechanisms, which decide where it is created.

Regional Scaling

Scaling across regions means deploying your application to different datacenters around the world. The region pool is the list of regions where the application is allowed to deploy. When an application is created, the first region in the pool is the one determined to be nearest to the user creating the application. You can confirm this by running flyctl regions list.

flyctl regions list
Region Pool:
Backup Region:

The create command, in this case, was issued in the UK, so London (LHR, London Heathrow) is the closest region. There is one exception to this rule: when Turboku applications are created, they default to the region nearest to the location of their source Heroku application (iad for US applications and ams for European applications).

Backup Regions

Notice also the list of Backup Regions. If, for any reason, the application can't be deployed in LHR, Fly will attempt to bring it up in either AMS (Amsterdam) or FRA (Frankfurt). Users won't notice this, as they will be directed to a running instance automatically. Backup Regions are selected based on the Region Pool and the geographical closeness of other regions.

Scaling Modes

Regional scaling is based on a pool of regions where the application can run. Using the selected scaling mode, the system creates at least the minimum number of application instances across those regions and can then create further instances up to the maximum count. The minimum and maximum counts are global parameters for the scaling. There are two scaling modes, Standard and Balanced.

  • Standard: Instances of the application, up to the minimum count, are evenly distributed among the regions in the pool. They are not relocated in response to traffic. New instances are added where there is demand, up to the maximum count.

  • Balanced: Instances of the application are, at first, evenly distributed among the regions in the pool up to the minimum count. Where traffic is high in a particular region, new instances will be created there and then, when the maximum count of instances has been used, instances will be moved from other regions to that region. This movement of instances is designed to balance supply of compute power with demand for it.
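The difference between the two modes can be sketched in a few lines. This is a toy model under stated assumptions, not Fly's placement algorithm: the function names distribute_evenly and add_for_demand are hypothetical, the minimum count is spread round-robin, and in balanced mode the instance to move is simply taken from the other region that currently holds the most instances (the real system would look at traffic).

```python
def distribute_evenly(pool, count):
    """Spread `count` instances over the regions in `pool`, round-robin."""
    placement = {region: 0 for region in pool}
    for i in range(count):
        placement[pool[i % len(pool)]] += 1
    return placement


def add_for_demand(placement, busy_region, max_count, mode):
    """Return a new placement after demand rises in `busy_region`.

    `busy_region` must be one of the regions already in `placement`.
    """
    result = dict(placement)
    if sum(result.values()) < max_count:
        # Both modes: add a new instance where the demand is.
        result[busy_region] += 1
    elif mode == "balanced":
        # At the maximum: move an instance from another region (assumed rule:
        # the donor is the other region with the most instances).
        donors = [r for r in result if r != busy_region and result[r] > 0]
        if donors:
            donor = max(donors, key=lambda r: result[r])
            result[donor] -= 1
            result[busy_region] += 1
    # Standard mode at the maximum: nothing is relocated.
    return result


if __name__ == "__main__":
    start = distribute_evenly(["ams", "nrt", "sjc"], 3)
    grown = add_for_demand(start, "ams", max_count=4, mode="standard")
    print("standard at max:", add_for_demand(grown, "ams", 4, "standard"))
    print("balanced at max:", add_for_demand(grown, "ams", 4, "balanced"))
```

In this model the total never exceeds the maximum count; balanced mode only changes where the existing instances sit.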

To determine what the current settings of an application are, run flyctl scale show:

flyctl scale show
     Scale Mode: Standard
      Min Count: 1
      Max Count: 10
        VM Size: micro-2x

This scaling plan uses Standard mode, with an even distribution of instances, a minimum of 1 instance, and up to 10 instances that can be created on demand.

Modifying The Region Pool

To control which regions an application can be deployed to, the flyctl regions command has two more sub-commands: add and remove. Each takes a space-separated list of regions and adds them to, or removes them from, the region pool. The add command also raises the scaling plan's minimum count of instances to the number of regions in the pool, to save you having to adjust it. Note that it only adjusts the value upwards, so if you remove regions, you will have to reset the minimum count manually.
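The one-way adjustment of the minimum count is easy to miss, so here is a small sketch of the behaviour as described above. The function names add_regions and remove_regions are hypothetical stand-ins for the two sub-commands, and the model tracks only the pool and the minimum count.

```python
def add_regions(pool, min_count, new_regions):
    """Add regions to the pool; raise min_count to the pool size if needed."""
    updated = list(pool)
    for region in new_regions:
        if region not in updated:
            updated.append(region)
    # The minimum count is only ever adjusted upwards.
    return updated, max(min_count, len(updated))


def remove_regions(pool, min_count, gone):
    """Remove regions from the pool; min_count is left untouched."""
    updated = [region for region in pool if region not in gone]
    return updated, min_count


if __name__ == "__main__":
    pool, min_count = add_regions(["lhr"], 1, ["ams", "fra"])
    print(pool, min_count)
    pool, min_count = remove_regions(pool, min_count, ["ams", "fra"])
    print(pool, min_count)  # the minimum still has to be reset by hand
```

After the removal in this example, the pool is back to one region but the minimum count is still 3, which is the case where a manual reset is needed.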

Modifying The Scaling Plan

As mentioned above, the scaling mode controls how the regions in the pool are used for allocating instances. To set the mode use:

flyctl scale standard


flyctl scale balanced

Both of these commands set the scaling mode and can take extra settings that tune the mode, specifically the minimum count (min) and maximum count (max) of instances. For example, to set balanced mode with a minimum of 5 instances, you would give this command:

flyctl scale balanced min=5

Want to set a maximum of 10 too? Then do this:

flyctl scale balanced min=5 max=10

If you just want to set the min or max for the currently selected mode, use the set sub-command:

flyctl scale set min=5 max=10

Viewing The Application's Scaled Status

To view where the instances of a Fly application are currently running, use flyctl status:

flyctl status
  Name     = hellofly
  Owner    = dj
  Version  = 299
  Status   = running
  Hostname =

Deployment Status
  ID          = 59b60abf-ba4f-fb2f-9f78-35a249e2bef5
  Version     = v299
  Status      = successful
  Description = Deployment completed successfully
  Allocations = 3 desired, 3 placed, 3 healthy, 0 unhealthy

  8a9358d1   299       ams      run       running   1 passing       15m36s ago
  7c08ce47   299       nrt      run       running   1 passing       15m36s ago
  1b17a5e6   299       sjc      run       running   1 passing       15m36s ago

If a region is listed with (b) following it, the instance is running in a backup region.
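If you want a quick per-region tally from this listing, the rows can be summarised with a few lines of Python. This is a sketch that assumes the third whitespace-separated field of each row is the region, as in the sample output above; the function name instances_per_region is hypothetical, and the exact way a backup region's (b) marker appears in that field is not modelled.

```python
from collections import Counter


def instances_per_region(status_rows):
    """Count instance rows per region (assumed to be the third field)."""
    counts = Counter()
    for row in status_rows.splitlines():
        fields = row.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    rows = """
      8a9358d1   299       ams      run       running   1 passing       15m36s ago
      7c08ce47   299       nrt      run       running   1 passing       15m36s ago
      1b17a5e6   299       sjc      run       running   1 passing       15m36s ago
    """
    for region, count in sorted(instances_per_region(rows).items()):
        print(region, count)
```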

Scaling Virtual Machines

Each application instance on Fly runs in its own virtual machine. The number of cores and memory available in the virtual machine can be set for all application instances using the flyctl scale vm command.

Viewing The Current VM Size

Using flyctl scale vm on its own will display the details of the application's current VM sizing.

flyctl scale vm
           Size: micro-1x
      CPU Cores: 0.12
         Memory: 128 MB
  Price (Month): $2.670000
 Price (Second): $0.000001

It shows the size name (micro-1x), the number of CPU cores, the memory (in MB or GB), the estimated price per month (if an instance were kept running for a month) and the price per second (if an instance were only brought up on demand).

Viewing Available VM Sizes

The flyctl platform vm-sizes command will display the various sizes with cores and memory and current pricing:

flyctl platform vm-sizes
NAME       CPU CORES   MEMORY   PRICE (SECOND)   PRICE (MONTH)
micro-1x   0.12        128 MB   $0.000001        $2.670000
micro-2x   0.25        512 MB   $0.000003        $8.000000
cpu1mem1   1           1 GB     $0.000013        $35.000000
cpu2mem2   2           2 GB     $0.000027        $70.000000
cpu4mem4   4           4 GB     $0.000053        $140.000000
cpu8mem8   8           8 GB     $0.000107        $280.000000

Note: This pricing is correct at the time of writing (March 2020). Run flyctl platform vm-sizes to get the most current pricing.
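To see what a scaling plan might cost, the monthly prices can be multiplied by the plan's minimum and maximum instance counts. The sketch below hard-codes the March 2020 prices listed above purely for illustration; the function name monthly_cost_range is hypothetical, and the calculation assumes every counted instance runs for the whole month, so treat the result as a rough bracket rather than a bill.

```python
# Monthly prices in USD, copied from the March 2020 listing above.
MONTHLY_PRICE_USD = {
    "micro-1x": 2.67,
    "micro-2x": 8.00,
    "cpu1mem1": 35.00,
    "cpu2mem2": 70.00,
    "cpu4mem4": 140.00,
    "cpu8mem8": 280.00,
}


def monthly_cost_range(size, min_count, max_count):
    """Return (lowest, highest) monthly cost for a plan, in USD."""
    if min_count > max_count:
        raise ValueError("min_count must not exceed max_count")
    price = MONTHLY_PRICE_USD[size]
    return round(price * min_count, 2), round(price * max_count, 2)


if __name__ == "__main__":
    low, high = monthly_cost_range("micro-2x", min_count=1, max_count=10)
    print(f"micro-2x, 1 to 10 instances: ${low:.2f} to ${high:.2f} per month")
```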

The CPU Cores column shows how many vCPU cores will be allocated to the virtual machine. A value below 1 is the proportion of a shared core that the VM will have available. A value of 1 or more is the number of cores (from a pool of hyper-threaded cores) that will be available to the VM.

Setting VM Size For An App

Setting the size of the VM is handled by adding the required size name to flyctl scale vm. For example, to move our application up one size, from micro-1x to micro-2x, we would run:

flyctl scale vm micro-2x
Scaled VM size to micro-2x
      CPU Cores: 0.25
         Memory: 512 MB
  Price (Month): $8.000000
 Price (Second): $0.000003

Flyctl responds with the sizes and pricing for a single new instance. All existing instances will be restarted at this new size.