Fly GPUs

Fly.io has GPUs! If you have workloads that would benefit from GPU acceleration, Fly GPU Machines may be for you.

Illustration by Annie Ruygt of Frankie the hot air balloon admiring a fast-pacing motor bike

What can I use Fly GPUs for?

Four GPU models are available: NVIDIA A10, L40S, A100 40G PCIe, and A100 80G SXM.

A100 units are all about the tensor cores, and are positioned for inference, model training, and intensive high-precision computation tasks like scientific simulations. As their names suggest, they have 40GB and 80GB of GPU memory. (A100 datasheet)

L40S cards are all-rounders; they’ve got tensor cores, RT cores, and NVENC/NVDEC, and have 48GB of GPU RAM. Choose the L40S to accelerate graphics or video workloads, as well as for inference. (L40S datasheet)

A10 cards are all-rounders too, but with less GPU RAM (24GB). They’ve got tensor cores, shader cores, and NVENC/NVDEC, and can run Llama 3 8B at float16 without breaking the bank. Choose the A10 when your model has no more than about 8 billion parameters. It works well for smaller large language models, Stable Diffusion, and similar workloads. (A10 datasheet)
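To see why an 8-billion-parameter model at float16 fits on the smallest card, here is a rough back-of-the-envelope sketch. It counts weight memory only, at 2 bytes per parameter for float16. The function names and the 20% `headroom` factor for activations and KV cache are illustrative assumptions, not Fly.io guidance, and the 24GB figure for the A10 comes from NVIDIA's datasheet rather than from this page.

```python
# Rough sizing sketch: which GPU presets can hold a model's weights?
# GPU memory in GB. The a10 value (24) is taken from NVIDIA's datasheet;
# the others are the figures quoted in the text above.
GPU_MEMORY_GB = {
    "a10": 24,
    "l40s": 48,
    "a100-40gb": 40,
    "a100-80gb": 80,
}


def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    bytes_per_param is 2 for float16/bfloat16, 4 for float32.
    """
    return n_params * bytes_per_param / 1e9


def gpus_that_fit(n_params: float, bytes_per_param: int = 2,
                  headroom: float = 1.2) -> list[str]:
    """Return the presets whose memory covers weights plus headroom.

    headroom is a hypothetical safety factor for activations and
    KV cache; real usage depends on batch size and context length.
    """
    needed = weight_memory_gb(n_params, bytes_per_param) * headroom
    return [name for name, mem in GPU_MEMORY_GB.items() if mem >= needed]


if __name__ == "__main__":
    print(weight_memory_gb(8e9))   # weights of an 8B model at float16
    print(gpus_that_fit(8e9))      # every preset has room
    print(gpus_that_fit(13e9))     # a 13B model no longer fits the a10
```

Treat the result as a lower bound: long contexts, large batches, or fine-tuning need considerably more memory than the weights alone.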

Right now each Fly GPU Machine uses a single full GPU. A single GPU is well suited to rendering, encoding/decoding, inference, and a smidgen of fine-tuning. Training large models from scratch requires much, much beefier resources.

Go to the GPU Quickstart to get off the ground fast, or read more practicalities in Getting started with Fly GPUs.

Regions with GPUs

Currently GPUs are available in the following regions:

  • a10: ord
  • l40s: ord
  • a100-40gb: ord
  • a100-80gb: ams, iad, mia, sjc, syd
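If you script deployments, the availability list above can be kept as a small lookup table and checked before you pick a region. This is a minimal sketch: the data mirrors the list as published here and will go stale as regions change, and the names `GPU_REGIONS`, `regions_for`, and `gpus_in` are hypothetical helpers, not part of any Fly.io tooling.

```python
# GPU preset -> regions, copied from the list above (may be out of date).
GPU_REGIONS = {
    "a10": ["ord"],
    "l40s": ["ord"],
    "a100-40gb": ["ord"],
    "a100-80gb": ["ams", "iad", "mia", "sjc", "syd"],
}


def regions_for(gpu: str) -> list[str]:
    """Regions where the given GPU preset is listed; empty if unknown."""
    return GPU_REGIONS.get(gpu, [])


def gpus_in(region: str) -> list[str]:
    """GPU presets listed for the given region code."""
    return [gpu for gpu, regions in GPU_REGIONS.items() if region in regions]


if __name__ == "__main__":
    print(regions_for("a100-80gb"))
    print(gpus_in("ord"))
```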

Examples

Here’s some more inspiration for your GPU Machines project: