Fly.io runs apps close to users. We transmogrify Docker containers into Firecracker micro-VMs that run on our hardware around the world, and connect all of them to a global Anycast network that picks up requests from around the world and routes them to the nearest VM. The easiest way to learn more is to sign up. It’ll take you just a minute or two to get up and running.
The platform that makes all this stuff work, the engine of our system, is written in two different systems languages: Rust and Go. Go code powers our orchestration; it’s what converts Docker images and provisions VMs. Rust code drives fly-proxy, our Anycast network.
We’re looking for people who want to work on fly-proxy.
fly-proxy is an interesting piece of code. A request for some Fly.io app running in Dallas and Sydney arrives at our edge in, say, Toronto. We need to get it to the closest VM (yes, in this case, Dallas). So fly-proxy needs a picture of where all the VMs are running, so it can make quick decisions about where to bounce the request.
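To make the Toronto example concrete, here is a minimal sketch of "pick the closest region". It is not how fly-proxy actually decides; it assumes plain great-circle distance as a stand-in for whatever signal a real proxy would use (round-trip time, load, health), and the `Region` type, the `nearest` function, and the rounded city coordinates are all made up for illustration.

```rust
/// A region where an app has VMs. Coordinates are rough city centers,
/// included only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Region {
    pub name: &'static str,
    pub lat_deg: f64,
    pub lon_deg: f64,
}

pub const TORONTO: Region = Region { name: "Toronto", lat_deg: 43.65, lon_deg: -79.38 };
pub const DALLAS: Region = Region { name: "Dallas", lat_deg: 32.78, lon_deg: -96.80 };
pub const SYDNEY: Region = Region { name: "Sydney", lat_deg: -33.87, lon_deg: 151.21 };

/// Great-circle distance in kilometers, using the haversine formula.
pub fn distance_km(a: Region, b: Region) -> f64 {
    const EARTH_RADIUS_KM: f64 = 6371.0;
    let (lat1, lat2) = (a.lat_deg.to_radians(), b.lat_deg.to_radians());
    let dlat = lat2 - lat1;
    let dlon = (b.lon_deg - a.lon_deg).to_radians();
    let h = (dlat / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * (dlon / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_KM * h.sqrt().asin()
}

/// Pick the candidate region closest to the edge that received the request.
/// Returns None when the app has no running regions.
pub fn nearest(edge: Region, candidates: &[Region]) -> Option<&Region> {
    candidates
        .iter()
        .min_by(|x, y| distance_km(edge, **x).total_cmp(&distance_km(edge, **y)))
}
```

With Toronto as the edge and `[SYDNEY, DALLAS]` as the candidates, `nearest` returns Dallas. The hard part in a real proxy is not this lookup but keeping the candidate list fresh.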
We don’t simply forward those requests, though. fly-proxy runs on both our (external-facing) edge hosts and our (internal) workers, where the VMs are. It builds on-the-fly, multiplexed HTTP2 transports (running over our internal WireGuard mesh) to move requests over. The “backhaul” configuration running on the worker demultiplexes and routes the request over a local virtual interface to the VM.
It gets more interesting in every direction you look. For instance: we don’t just route simple HTTP requests; we also do raw TCP (no HTTP2 for that forwarding path). And WebSockets. All these requests get balanced (most of our users run a bunch of instances, not just one). And in the HTTP case, we automatically configure HTTPS, and get certificates issued with the LetsEncrypt ALPN challenge.
Zoom in on the raw request routing and there’s still more stuff going on. fly-proxy is built on Tokio, Hyper, and Tower. A single fly-proxy is managing connectivity for lots and lots of Fly.io apps, and isolates the concurrency budget for each of those apps, so a busy app can’t starve the other apps. We’re tracking metrics for each of those apps and making them accessible to users.
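One way to picture per-app concurrency isolation is a separate in-flight counter for each app, each with its own cap. The sketch below is an assumption-laden toy using only the standard library: `AppBudgets`, its methods, and the fixed per-app limit are hypothetical names, and a real Tokio/Tower proxy would more likely use async-aware primitives than a blocking `Mutex`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Tracks in-flight requests per app, with the same cap applied to each app
/// independently (a hypothetical policy, chosen to keep the sketch small).
pub struct AppBudgets {
    per_app_limit: usize,
    counts: Mutex<HashMap<String, usize>>,
}

impl AppBudgets {
    pub fn new(per_app_limit: usize) -> Self {
        Self { per_app_limit, counts: Mutex::new(HashMap::new()) }
    }

    /// Reserve one slot for `app`. Returns false if that app is already at
    /// its limit; other apps' slots are never touched.
    pub fn try_acquire(&self, app: &str) -> bool {
        let mut counts = self.counts.lock().unwrap();
        let count = counts.entry(app.to_string()).or_insert(0);
        if *count >= self.per_app_limit {
            return false;
        }
        *count += 1;
        true
    }

    /// Give back one slot for `app` once its request has finished.
    pub fn release(&self, app: &str) {
        let mut counts = self.counts.lock().unwrap();
        if let Some(count) = counts.get_mut(app) {
            *count = count.saturating_sub(1);
        }
    }

    /// Current number of in-flight requests for `app`.
    pub fn in_flight(&self, app: &str) -> usize {
        self.counts.lock().unwrap().get(app).copied().unwrap_or(0)
    }
}
```

With a limit of 2, a third `try_acquire("busy-app")` is refused while `try_acquire("quiet-app")` still succeeds, which is the "a busy app can’t starve the other apps" property in miniature. The per-app counters double as a starting point for per-app metrics.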
fly-proxy also has some fun distsys problems. That global picture of where the VMs are is updating all the time. So too are the load stats for all those VMs, which impact how we balance requests. Requests can fail and get retried automatically elsewhere; in fact, that’s the core of how we do distributed Postgres.
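The "balance on load, retry elsewhere on failure" idea can be sketched in a few lines. This is an illustration under stated assumptions, not fly-proxy's algorithm: `Instance`, `send_with_retry`, the single integer `load` figure, and the strict least-loaded-first ordering are all hypothetical, and the `send` closure stands in for actually forwarding a request.

```rust
/// A running instance of an app, with a load figure where lower means less busy.
#[derive(Debug, Clone, PartialEq)]
pub struct Instance {
    pub id: &'static str,
    pub load: u32,
}

/// Try instances from least to most loaded until `send` succeeds.
/// Returns the id of the instance that handled the request, or the last
/// error seen if every attempt failed.
pub fn send_with_retry<F>(instances: &[Instance], mut send: F) -> Result<&'static str, String>
where
    F: FnMut(&Instance) -> Result<(), String>,
{
    // Order candidates by load without mutating the caller's slice.
    let mut ordered: Vec<&Instance> = instances.iter().collect();
    ordered.sort_by_key(|i| i.load);

    let mut last_err = String::from("no instances available");
    for instance in ordered {
        match send(instance) {
            Ok(()) => return Ok(instance.id),
            // Remember the failure and fall through to the next candidate.
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

Given instances with loads 5, 1, and 3 where the least-loaded one is down, the request goes to the load-1 instance first, fails, and lands on the load-3 instance. In the real system the load figures and the instance list are themselves moving targets, which is where the distsys fun comes in.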
All this is before we get to the eBPF code that runs alongside the proxy to make low-level networking things work.
It’s gnarly and technical and always-on and regularly updated but also needs to have high uptime and if this sounds fun, congradu-dolences! This might be the gig for you.
Things To Know About Us
- We’re a small team, almost entirely technical, and everyone wears a lot of hats. You’ll have the opportunity to get your hands dirty in a lot of different things here. But first and foremost, this is The Rust Job at Fly.io.
- We're at a stage where working on our engineering team will feel more like working on a big open source project than on a buttoned-down engineering team. Good things and bad things about that. You want to be comfortable working without a roadmap or an MRD, and with finding useful stuff to build.
- We’re remote, with team members in Colorado, Quebec, Chicago, London, Virginia, Rwanda, Spain, Brazil, and Utah.
- We’re an unusually public team, with an online community (at community.fly.io) that we try to be chatty with. If we’re doing things right, this role will likely increase your public profile.
- We’re a team, not a family, but we have families and want to be the kind of place where work doesn’t get in the way of that.
- We’re a real company – we hope that goes without saying – and this is a real, according-to-Hoyle full-time job with health care for US employees, flexible vacation time, hardware/phone allowances, the standard stuff. The comp range for this role is $160k-$200k, plus equity.
How We Hire People
We are weird about hiring. We’re skeptical of resumes and we don’t trust interviews (we’re happy to talk, though). We respect career experience but we aren’t hypnotized by it, and we’re thrilled at the prospect of discovering new talent.
The premise of our hiring process is that we’re going to show you the kind of work we’re doing and then see if you enjoy actually doing it; “work-sample challenges”. Unlike a lot of places that assign “take-home problems”, our challenges are the backbone of our whole process; they’re not pre-screeners for an interview gauntlet.
For this role, we’re asking people to write us a small proxy that does just a couple of interesting things (we’ll tell you more). We’re looking for people who are super-comfortable with Rust and network programming in general, but we’re happy to bring people up to speed with the domain-specific stuff in Fly.io.
If you’re interested, mail <email@example.com>. You can tell us a bit about yourself, if you like. Either way, we’ll ask you to tell us what your least favorite Rust crate is (it can be a good crate, just one you didn’t have fun working with). We’re happy to answer questions: send them along!
There are a lot of cool directions to take fly-proxy in. It's a big deal to us. We're psyched to talk to you about it.