We first showed Fly.io VMs to developers in early 2020. They were most interested in running CPU-intensive apps doing image processing, machine learning predictions, and even video transcoding (despite what were, until recently, offensive bandwidth prices). So when we launched, most of the available VMs were thicc like oatmeal, but weak on the RAM.
The requests changed when we got in front of more developers. We knew we'd have to solve databases "at some point", we just didn't expect devs to ask on day one. Databases – and apps that look like databases when you squint – need a higher RAM to CPU ratio. Small databases hoover up RAM, but largely leave the CPU alone.
Our new VMs come in two flavors:

- shared-cpu, with up to 2GB of RAM for lightweight apps
- dedicated-cpu, with up to 64GB of RAM for not-quite-big-data-but-it-should-be-fast

Here's the price breakdown (and if you're a future reader, you might see even more VM types):
| CPUs | CPU Type | RAM | Per second | Per Month (Approx) |
|------|----------|-----|------------|--------------------|
Cloud VM pricing, an abridged guide
We like working on Fly.io, and want to continue hacking away. Sustaining the company means getting to 70% margins on VMs – a number that comes up all the time.
The dirty secret of Virtual Machine pricing is ... the machines are virtual. The specs we promise are only loosely constrained by the underlying hardware, and there's nothing forcing us to tell customers the host hardware specs. We could even sell the same RAM several times over. Yay margins!
Fortunately, we don't want to make money on VMs. We do, however, want happy customers and a minimum of operational headaches. Oversubscribing host hardware is a great way to piss off customers and keep us hoppin' while we're on call. A few unexpected OOM errors will ruin everyone's day.
To the slide rule!
If you do the math, you'll estimate that our cost to run a dedicated-cpu-1x VM is a little under $10 per month. That's a ridiculous simplification, but good for hasty math.
The actual cost is a function of hardware + colocation + power. And we commit to the hardware in yearly increments, while we bill you in seconds. For additional fun, we have to buy servers before we get customers. So there's dead time before we're even covering the expense, much less making margins.
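Here's that hasty math as a sketch. The dollar figures below are made-up placeholders (our real hardware and colo numbers aren't in this post); only the host core count comes from the footnote at the bottom.

```python
# Back-of-the-envelope cost model for a dedicated-cpu-1x VM.
# All dollar figures are hypothetical, not real Fly.io costs.

HOST_HW_MONTHLY = 140.0     # hardware spend, amortized per month (assumed)
COLO_POWER_MONTHLY = 90.0   # rack space + power (assumed)
CORES_PER_HOST = 24         # a 24-core EPYC host, per the footnote

# Simplest possible carve-up: one dedicated-cpu-1x per physical core.
cost_per_vm = (HOST_HW_MONTHLY + COLO_POWER_MONTHLY) / CORES_PER_HOST
print(f"${cost_per_vm:.2f}/month per dedicated-cpu-1x")  # → $9.58/month
```

Plausible host numbers land a little under $10 per VM per month, which is where the estimate above comes from. The simplification is everything this leaves out: dead time before hosts fill up, yearly hardware commitments billed against per-second revenue, and so on.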
Reading that back, it actually sounds pretty terrible. But we're lucky to have levers that make it work.
- We fund new hardware with revenue from large customers. Large customers are an immediate margin win, and we can "borrow from margins" to deploy more hardware for the people we really love (the developers with a $50 budget and a promising application).
- Shared CPU VMs. Oversubscribing hardware is a terrible thing to do unless you set the right expectations. Shared CPU instances let us fill in the gaps with a lower priced product that's decoupled from margins. We can load large hosts up with shared VMs, cover our costs pretty quickly, and avoid nasty surprises.
Shared CPU VMs
Oversubscribing CPUs is less fraught than oversubscribing RAM, since there's enough oomph on any given server to move things around and avoid contention. We've run micro VMs for the last 6 months to get a feel for how to price these, and the results have been favorable.
We can pool a bunch of CPUs together, and share each with an average of 12 VMs. This works great for bursty apps: the big pool of CPUs lets us mix and match busy and idle VMs to ~~produce vast quantities of heat~~ maximize CPU utilization.
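A toy model of why pooling helps, treating each VM as independently busy with some small probability. The 12:1 ratio is from this post; the 5% average busy fraction is an assumption for illustration, not a measured number.

```python
from math import comb

def p_contention(cores, vms_per_core=12, p_busy=0.05):
    """Probability that instantaneous demand exceeds pooled capacity,
    modeling each VM as independently busy with probability p_busy."""
    n = cores * vms_per_core
    # Sum P(k busy VMs) for every k greater than the number of cores.
    return sum(comb(n, k) * p_busy**k * (1 - p_busy)**(n - k)
               for k in range(cores + 1, n + 1))

# Same 12:1 oversubscription, bigger pools: contention gets rarer.
for cores in (1, 4, 16):
    print(f"{cores:2d}-core pool: P(contention) = {p_contention(cores):.4f}")
```

The point: at identical oversubscription ratios, a 16-core pool sees demand exceed capacity far less often than a single core does, because busy and idle VMs average out across the pool.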
This works 100% of the time, until it doesn't. CPU contention is a given. We have safeguards in place to ensure that VMs trying to eat a whole CPU have to give it up to your polite VM (cpu.shares, if you're a cgroups nerd). And when that doesn't solve the contention, we can move VMs to different hardware. Sometimes in different regions!
The best part? These VMs cost you $1.94 per month to run full time. That's cheaper than most dollar menus, these days [^1].
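For the curious, here's what that monthly figure works out to under per-second billing. The 30-day month is an assumption for the arithmetic, not necessarily the exact convention we bill against.

```python
# Translate "$1.94/month, billed per second" into a per-second rate,
# assuming a 30-day month.
MONTHLY_PRICE = 1.94
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

per_second = MONTHLY_PRICE / SECONDS_PER_MONTH
print(f"${per_second:.10f} per second")

# A bursty app that only runs 2 hours a day pays a twelfth of the full price.
two_hours_daily = per_second * 2 * 3600 * 30
print(f"${two_hours_daily:.2f}/month for 2 hours/day")
```

Per-second billing is what makes the shared tier interesting for bursty apps: you pay for the seconds you're actually up, not the month.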
But running databases on ephemeral VMs is silly
Database apps also need persistent storage. VMs that boot up, then go away and take all their storage with them are not a good place to run DBs. We don't sell persistent storage. But, you know, we definitely should.
[^1]: We mostly run AMD EPYC CPUs with 24-32 cores and 256GB of RAM, sometimes equivalent Intel CPUs (some people enjoy researching PassMark scores), and a few machines with only 128GB of RAM.