IPv6 WireGuard Peering

Author: Thomas Ptacek (@tqbf)

Fly.io transforms containers into swarms of fast-booting VMs and runs them close to users. Now you can connect those swarms privately to other networks with WireGuard.

They say that when you’re starting a product company, it’s a better plan to chase down something a bunch of people will really love a lot than it is to try to build something that everyone will just like a little bit. So when Fly.io launched, it had a pretty simple use case: taking conventional web applications – applications built for platforms outside of Fly.io – and speeding them up, by converting them into Firecracker MicroVMs that we can run close to users on a network of servers around the world.

This works great, and you should try it; for instance, if you’ve got an application running on Heroku, we have it down to just a button click on a web page. Or, our Fly.io speed-run can get a Dockerized application deployed everywhere from Chile to Singapore in just a couple commands. It’s easier than figuring out how to use rsync; so easy, it’s boring.

But, predictably, people have wanted to launch other stuff on Fly, besides making existing applications go fast. And we want to help them do that. We like stuff! We like talking to people about stuff! And we’ve gotten pretty good at getting stuff working on Fly. But stuff hasn’t always been as boring as we’d like.

We’re striking blows for the forces of boredom, by making it straightforward to get just about anything running on Fly.io. One way we’re doing that is by making it easy to peer networks with Fly, using WireGuard and IPv6.

Here’s the TL;DR:

If you deploy several different apps to Fly.io, they can find and talk to each other, privately, without any configuration.
If you’re running things on GCP or AWS, you can peer them to Fly.io with a simple flyctl command, using WireGuard.

IPv6 Private Networking at Fly

Apps on Fly.io belong to accounts, which are associated with organizations. You don’t have to grok this; you just have an “organization”, trust us. Every Fly.io organization has its own private IPv6 network; we call them “6PNs”, or, at least I do; I’m trying to make it a thing.

Every app in your organization is connected to the same 6PN. Every app instance has a “6PN address”. Bind services to that address and they’re available to other apps in your private network; bind a service only to it, and it’s only reachable privately.

You don’t need to understand the rest of this section, but in case you’re interested:

We carve up IPv6 addresses and embed information in them. We start with the IPv6 ULA prefix fdaa::/16 (the ULA space in IPv6 is analogous to the 10-net space in IPv4, except there’s a lot more of it).

Then, for every instance of every app we run, we collect a bit of information: a “network ID” associated with the app’s organization, an identifier for the hardware the instance is running on, and an identifier for the instance itself, and come up with this gem of an IPv6 address:

fdaa	16 bits	ULA prefix
network	32 bits	organization address
host	32 bits	machine identifier
instance	32 bits	app instance ID
	16 bits	free space

Technically, what we end up delegating to each instance is a /112, which is the IPv6 equivalent of an IPv4 Class B address; you can address 65,000 (and change) different things inside of an instance if you wanted to. I haven’t come up with any kind of use for this, but, why not?

Meanwhile, an organization has effectively a /48, or “mind-bogglingly huge”, 6PN prefix.
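To make the layout concrete, here is a minimal sketch of packing and unpacking an address with the field widths from the table above. The function names and the sample field values are made up for illustration; this is not Fly.io’s actual code, just the bit arithmetic the table implies.

```python
import ipaddress

ULA_PREFIX = 0xFDAA  # the 16-bit prefix from the table above


def make_6pn_address(network: int, host: int, instance: int, local: int = 0) -> ipaddress.IPv6Address:
    """Pack the fields into 128 bits: 16 + 32 + 32 + 32 + 16."""
    fields = (("network", network, 32), ("host", host, 32),
              ("instance", instance, 32), ("local", local, 16))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    packed = (ULA_PREFIX << 112) | (network << 80) | (host << 48) | (instance << 16) | local
    return ipaddress.IPv6Address(packed)


def split_6pn_address(addr: ipaddress.IPv6Address) -> dict:
    """Recover the fields from an address built by make_6pn_address."""
    n = int(addr)
    return {
        "prefix": n >> 112,
        "network": (n >> 80) & 0xFFFFFFFF,
        "host": (n >> 48) & 0xFFFFFFFF,
        "instance": (n >> 16) & 0xFFFFFFFF,
        "local": n & 0xFFFF,
    }


def instance_prefix(addr: ipaddress.IPv6Address) -> ipaddress.IPv6Network:
    """The /112 delegated to one instance (the last 16 bits are free)."""
    return ipaddress.IPv6Network((addr, 112), strict=False)


def org_prefix(addr: ipaddress.IPv6Address) -> ipaddress.IPv6Network:
    """The /48 shared by everything in one organization (prefix + network ID)."""
    return ipaddress.IPv6Network((addr, 48), strict=False)


# Hypothetical IDs, chosen only so the result is easy to read.
addr = make_6pn_address(network=0xABC, host=0x1, instance=0xBEEF)
print(addr, instance_prefix(addr), org_prefix(addr))
```

The /112 leaves 16 free bits, which is where the “65,000 (and change)” figure comes from: `instance_prefix(addr).num_addresses` is 65,536.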

The core design idea of this system is pretty simple: we control IPv6 address assignments and routing inside our network. To lock an instance into a 6PN network, all we really need is a trivial BPF program that enforces the “don’t cross the streams” rule: you can’t send packets between different 6PN prefixes. We’re already BPF’ing all our interfaces to make UDP work, so this is an easy change.
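The real enforcement is a BPF program, which isn’t shown here. The rule it enforces is small enough to sketch in a few lines, though. The following is an illustration of the logic only, with hypothetical function names and addresses, and it assumes the rule only applies when both ends are inside fdaa::/16.

```python
import ipaddress

SIXPN_SPACE = ipaddress.IPv6Network("fdaa::/16")


def org_prefix(addr: str) -> ipaddress.IPv6Network:
    """The /48 (ULA prefix + 32-bit network ID) an address belongs to."""
    return ipaddress.IPv6Network((addr, 48), strict=False)


def allow_packet(src: str, dst: str) -> bool:
    """Return True if a packet may pass under the "don't cross the streams" rule."""
    s = ipaddress.IPv6Address(src)
    d = ipaddress.IPv6Address(dst)
    if s not in SIXPN_SPACE or d not in SIXPN_SPACE:
        # Assumption: traffic that is not 6PN-to-6PN is out of scope for this sketch.
        return True
    return org_prefix(src) == org_prefix(dst)


# Same organization, different hosts: allowed.
print(allow_packet("fdaa:0:abc:1::2", "fdaa:0:abc:2::3"))
# Different organizations: dropped.
print(allow_packet("fdaa:0:abc:1::2", "fdaa:0:def:2::3"))
```

Because the network ID sits directly after the ULA prefix, the check is a comparison of the first 48 bits of the two addresses, which is why a trivial filter is enough.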

6PN DNS

Having all these IPv6 addresses doesn’t help much if your apps can’t find each other, so we run an internal DNS service for our 6PN networks. So in reality, you never think about 6PN at all. You just need the names of your apps.

Our DNS service is a small Tokio Rust program backed by SQLite databases that our service discovery system builds on all our hosts. It accepts packets only from 6PN addresses, and, because of its network position, can trust source addresses, which it uses to determine the answers to questions.

Also, it forwards external DNS queries, so it can stand in as the sole nameserver in resolv.conf. That was another 20 lines of code.

I wish the server was interesting enough to talk about more, but it’s not; it’s practically the “hello world” of the NLnet Labs “domain” crate. If you don’t use Fly.io, my message to you in this section is mostly “go forth and build ye a Rust DNS server, for lo, it is pretty easy to do”. Also, deploy it on Fly.io, it’s great.

If you do use Fly.io, first, thanks and congratulations. Also you might be interested in our naming scheme. Assume your app is fearsome-bagel-32, and has a sibling app serf-43. Then:


fearsome-bagel-32.internal	AAAA	addresses of all fearsome-bagel-32 instances
serf-43.internal	AAAA	addresses of all serf-43 instances
handsome-badget-92.internal	AAAA	trick question! that app isn’t in your organization, so: NXDOMAIN
regions.fearsome-bagel-32.internal	TXT	comma-separated list of all regions fearsome-bagel-32 runs in
nrt.fearsome-bagel-32.internal	AAAA	addresses of all fearsome-bagel-32 instances in Japan
_apps.internal	TXT	comma-separated list of all apps in your organization
_peer.internal	TXT	we’ll get to that in a second.

For instances of apps running in Fly, DNS is available on fdaa::3 (the one exception to the “no crossing the streams” 6PN access rule).
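The naming scheme, and the trick of using the trusted source address to pick the answer, can be sketched with a toy in-memory resolver. This is not the Rust server described above. The registry contents, the addresses, the “ord” region code, and the `resolve` function are all invented for illustration, `None` stands in for NXDOMAIN, and `_peer.internal` is left out.

```python
import ipaddress

# Hypothetical data: org network ID -> app name -> list of (region, address).
REGISTRY = {
    0xABC: {
        "fearsome-bagel-32": [("nrt", "fdaa:0:abc:0:1:0:1:0"),
                              ("ord", "fdaa:0:abc:0:2:0:2:0")],
        "serf-43": [("ord", "fdaa:0:abc:0:2:0:3:0")],
    },
    0xDEF: {
        "handsome-badget-92": [("nrt", "fdaa:0:def:0:1:0:4:0")],
    },
}


def network_id(source: str) -> int:
    """Extract the 32-bit network ID that follows the 16-bit ULA prefix."""
    return (int(ipaddress.IPv6Address(source)) >> 80) & 0xFFFFFFFF


def resolve(source: str, qname: str, qtype: str):
    """Answer a query using only the apps visible to the querier's organization."""
    apps = REGISTRY.get(network_id(source), {})
    labels = qname.rstrip(".").split(".")
    if labels[-1] != "internal":
        return None
    labels = labels[:-1]
    if labels == ["_apps"] and qtype == "TXT":
        return ",".join(sorted(apps))
    if len(labels) == 1 and labels[0] in apps and qtype == "AAAA":
        return [addr for _, addr in apps[labels[0]]]
    if len(labels) == 2 and labels[1] in apps:
        instances = apps[labels[1]]
        if labels[0] == "regions" and qtype == "TXT":
            return ",".join(sorted({region for region, _ in instances}))
        if qtype == "AAAA":
            # Treat the first label as a region code such as "nrt".
            return [addr for region, addr in instances if region == labels[0]] or None
    return None  # stands in for NXDOMAIN


me = "fdaa:0:abc:0:1:0:1:0"
print(resolve(me, "fearsome-bagel-32.internal", "AAAA"))
print(resolve(me, "handsome-badget-92.internal", "AAAA"))
```

The second lookup comes back empty even though handsome-badget-92 exists in the registry, because the querier’s source address puts it in a different organization.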

Internal WireGuard at Fly

The basic architecture of Fly.io is that we have hardware colocated in datacenters around the world. We direct global traffic to the edge of our network with BGP-driven Anycast, and we route it to nearby worker servers over Jason Donenfeld’s WireGuard.

WireGuard is amazing. It will likely replace all other VPN protocols. But it’s so lightweight and performant that I think it’s going to change the role VPNs have. It’s just as easy to set up a WireGuard connection as it is an SSH account. And you pay practically no performance penalty for using it. So you end up using VPNs for new things.

What makes WireGuard so interesting?

It’s based on Trevor Perrin’s Noise protocol framework, and inherits Noise’s modern, best-practices cryptography, much of which also powers Signal: an authenticated Curve25519 handshake, ChaCha20+Poly1305 for encryption, and Blake2s for hashing.
It does no negotiation; there are no cryptographic parameters to select.
Its reference implementation is just 5000 lines of kernel code; you can read it in an hour or so.
Unlike other VPN protocols, WireGuard was designed with implementation security in mind, so that, for example, it’s straightforward to implement without requiring on-demand dynamic memory allocation.
It’s extremely fast.

We run an internal WireGuard mesh, about which we’ll write more in the future. All you need to know here is that connectivity between hosts in our network happens entirely over WireGuard.

I come from old-breed ISP stock, from a time when Cisco AGS+’s roamed the land, and what I was taught very early on is that the best routing protocol is static routing, if you can get away with it. With the information embedded in our addresses, we can route 6PN statically.

But there’s a catch. A central part of WireGuard’s design is the notion of “cryptokey routing”. WireGuard peers are identified by a Curve25519 key (a short Base64 string), and each peering connection is tagged with a set of “Allowed IPs”. When the OS wants to send traffic over a WireGuard link, it routes packets to the WireGuard device, and WireGuard checks its table of peers to see which “allows” that destination. For that to work, “Allowed IPs” can’t overlap between links.
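Here is a toy model of that peer table, to show why overlap is a problem. It takes the rule above at face value and simply refuses overlapping “Allowed IPs”; the kernel implementation’s lookup is more sophisticated than this. The class name, the key strings (real keys are Base64), and the prefixes are all hypothetical.

```python
import ipaddress


class CryptokeyTable:
    """Maps destination addresses to the peer whose Allowed IPs cover them."""

    def __init__(self):
        self._routes = []  # list of (IPv6Network, peer public key)

    def add_peer(self, public_key: str, allowed_ips: list) -> None:
        for cidr in allowed_ips:
            net = ipaddress.IPv6Network(cidr)
            for existing, owner in self._routes:
                if owner != public_key and net.overlaps(existing):
                    raise ValueError(f"{net} overlaps {existing} (peer {owner})")
            self._routes.append((net, public_key))

    def peer_for(self, destination: str):
        """Return the peer key allowed to receive this destination, or None."""
        addr = ipaddress.IPv6Address(destination)
        for net, key in self._routes:
            if addr in net:
                return key
        return None


table = CryptokeyTable()
table.add_peer("peer-A-key", ["fdaa:0:1::/48"])
table.add_peer("peer-B-key", ["fdaa:0:2::/48"])
print(table.peer_for("fdaa:0:2::5"))  # routed to peer B
print(table.peer_for("fdaa:0:3::1"))  # no peer allows it
```

In this model each peer has to own a contiguous prefix. A 6PN prefix that names the organization first and the host second can’t be expressed that way, since every host would need the same leading bits.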

That’s a problem for our 6PN design, because a 6PN prefix obviously has to run across a bunch of hosts, and there’s no way to wildcard a chunk out of the middle of an address.

The solution is straightforward, though: we just use BPF to temporarily swap the “host” and “network” chunks of the address before and after routing through WireGuard. Our WireGuard mesh sees IPv6 addresses that look like fdaa:host:host::/48 but the rest of our system sees fdaa:net:net::/48. This turns out to be an extremely simple transform, since swapping bytes in an IPv6 header doesn’t alter checksums.
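The checksum claim is easy to check by hand. TCP and UDP checksums cover the addresses through a pseudo-header, but the checksum is a ones’ complement sum of 16-bit words, and that sum doesn’t care what order the words arrive in. Both chunks are 32 bits wide and 16-bit aligned, so swapping them just reorders words. Here is a sketch, assuming the field layout from earlier in the post; the function names and the sample address are made up, and the real transform is done in BPF.

```python
import ipaddress


def swap_host_and_network(addr: str) -> ipaddress.IPv6Address:
    """Exchange bytes 2..5 (network) with bytes 6..9 (host)."""
    raw = ipaddress.IPv6Address(addr).packed
    swapped = raw[0:2] + raw[6:10] + raw[2:6] + raw[10:16]
    return ipaddress.IPv6Address(swapped)


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum, the core of the Internet checksum."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


original = ipaddress.IPv6Address("fdaa:0:abc:0:1:0:beef:0")
on_the_mesh = swap_host_and_network(str(original))

print(original, "->", on_the_mesh)
print(ones_complement_sum(original.packed) == ones_complement_sum(on_the_mesh.packed))
```

Applying the swap a second time restores the original address, which is what lets the same transform run on both sides of the WireGuard hop.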

WireGuard Peering

Here’s a thing you might want to do with an app running on Fly: connect it to a database managed by AWS RDS.

Here’s a way to do that: boot up a WireGuard gateway in AWS (here, with a few dozen lines of Terraform, but use whatever you like; if Fly.io stands for anything, it’s “not having to know Terraform”) that peers into your 6PN network and exposes a Postgres proxy like PgBouncer. It’s a pretty boring configuration, which is the kind we like.

This works today at Fly.io because of WireGuard Peering. We will generate WireGuard configurations for you that will work in APAC, North America, and Europe. To do that, just run flyctl wireguard create. We’ll spit out a config that will drop into Linux, macOS, or Windows WireGuard.

WireGuard peers get /120 delegations (the equivalent of an IPv4 class C), and an organization-specific DNS endpoint baked into the config. When you add a WireGuard peer, we update DNS across the fleet, so your peer is available by its name; if we called this peer rds-us-east-1, our apps could reach it at rds-us-east-1._peer.internal. We can get a list of peers by looking up the TXT at _peer.internal.

A nice thing about this design is that it doesn’t require you to expose any management services on the AWS side; your AWS WireGuard gateway connects out to us, and the default security rules for your VPC should keep everything hermetically sealed inside (you obviously want to verify this part of your configuration; we’re just saying, we’re not asking you to open up any ports).

Of course, you can also use WireGuard 6PN peering to manage your app instances directly; for example, the config we generate drags-and-drops into macOS WireGuard.

Service Discovery In Fly Private Networks

I think you can get pretty far designing applications with the DNS we expose right now, but we’ve deliberately kept it boring, because we assume different people will want different things.

But nothing stops you from making service discovery exciting! I have, for instance, a multi-perspective DNS resolver example I’ll publish shortly that uses HashiCorp Serf to auto-discover new nodes. The important thing to know about 6PN networking is that it’s direct between nodes; we don’t proxy it or meddle with it in any way. Anything you want to run, including your own full Consul cluster, should just work.

I’m pretty happy with how this design is turning out and optimistic that it achieves “boring” for connecting arbitrary services to Fly applications. So you can use Fly not only to make existing applications run faster, but also as a core component of new applications. I’m kind of in love with the ergonomics of our dev UX (I can say that because, as a late arrival to the Fly team, I had no hand in designing it), and anything that lets me use flyctl for more stuff is a win in my book.

Do you want to know more? Or have an idea? We’ve got a community forum just for you.
