<h1>JIT WireGuard</h1>
<p>The Fly Blog: news, tips, and tricks from the team at Fly. Published 2024-03-12, updated 2024-03-14. <a href='https://fly.io/blog/jit-wireguard-peers/' title=''>https://fly.io/blog/jit-wireguard-peers/</a></p>
<div class="lead"><p>We’re Fly.io and we transmute containers into VMs, running them on our hardware around the world with the power of Firecracker alchemy. We do a lot of stuff with WireGuard, which has become a part of our customer API. This is a quick story about some tricks we played to make WireGuard faster and more scalable for the hundreds of thousands of people who now use it here.</p>
</div>
<p>One of many odd decisions we’ve made at Fly.io is how we use WireGuard. It’s not just that we use it in many places where other shops would use HTTPS and REST APIs. We’ve gone a step beyond that: every time you run <code>flyctl</code>, our lovable, sprawling CLI, it conjures a TCP/IP stack out of thin air, with its own IPv6 address, and speaks directly to Fly Machines running on our networks.</p>
<p>There are plusses and minuses to this approach, which we talked about <a href='https://fly.io/blog/our-user-mode-wireguard-year/' title=''>in a blog post a couple years back</a>. Some things, like remote-operated Docker builders, get easier to express (a Fly Machine, as far as <code>flyctl</code> is concerned, might as well be on the same LAN). But everything generally gets trickier to keep running reliably.</p>
<p>It was a decision. We own it.</p>
<p>Anyways, we’ve made some improvements recently, and I’d like to talk about them.</p>
<h2 id='where-we-left-off' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#where-we-left-off' aria-label='Anchor'></a>Where we left off</h2>
<p>Until a few weeks ago, our gateways ran on a pretty simple system.</p>
<ol>
<li>We operate dozens of “gateway” servers around the world, whose sole purpose is to accept incoming WireGuard connections and connect them to the appropriate private networks.
</li><li>Any time you run <code>flyctl</code> and it needs to talk to a Fly Machine (to build a container, pop an SSH console, copy files, or proxy to a service you’re running), it spawns or connects to a background agent process.
</li><li>The first time it runs, the agent generates a new WireGuard peer configuration from our GraphQL API. WireGuard peer configurations are very simple: just a public key and an address to connect to.
</li><li>Our API in turn takes that peer configuration and sends it to the appropriate gateway (say, <code>ord</code>, if you’re near Chicago) via an RPC we send over the NATS messaging system.
</li><li>On the gateway, a service called <code>wggwd</code> accepts that configuration, saves it to a SQLite database, and adds it to the kernel using WireGuard’s Golang libraries. <code>wggwd</code> acknowledges the installation of the peer to the API.
</li><li>The API replies to your GraphQL request, with the configuration.
</li><li>Your <code>flyctl</code> connects to the WireGuard peer, which works, because you receiving the configuration means it’s installed on the gateway.
</li></ol>
<p>I copy-pasted those last two bullet points from <a href='https://fly.io/blog/our-user-mode-wireguard-year/' title=''>that two-year-old post</a>, because when it works, it does <em>just work</em> reasonably well. (We ultimately did end up defaulting everybody to WireGuard-over-WebSockets, though.)</p>
<p>But if it always worked, we wouldn’t be here, would we?</p>
<p>We ran into two annoying problems:</p>
<p>One: NATS is fast, but doesn’t guarantee delivery. Back in 2022, Fly.io was pretty big on NATS internally. We’ve moved away from it. For instance, our <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/' title=''>internal <code>flyd</code> API</a> used to be driven by NATS; today, it’s HTTP. Our NATS cluster was losing too many messages to host a reliable API on it. Scaling back our use of NATS made WireGuard gateways better, but still not great.</p>
<p>Two: When <code>flyctl</code> exits, the WireGuard peer it created sticks around on the gateway. Nothing cleans up old peers. After all, you’re likely going to come back tomorrow and deploy a new version of your app, or <code>fly ssh console</code> into it to debug something. Why remove a peer just to re-add it the next day? </p>
<p>Unfortunately, the vast majority of peers are created by <code>flyctl</code> in CI jobs, which don’t have persistent storage and can’t reconnect to the same peer the next run; they generate new peers every time, no matter what.</p>
<p>So, we ended up with a not-reliable-enough provisioning system, and gateways with hundreds of thousands of peers that will never be used again. The high stale peer count made kernel WireGuard operations very slow, especially loading all the peers back into the kernel after a gateway server reboot, and it also caused some kernel panics.</p>
<p>There had to be</p>
<h2 id='a-better-way' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-better-way' aria-label='Anchor'></a>A better way.</h2>
<p>Storing bajillions of WireGuard peers is no big challenge for any serious n-tier RDBMS. This isn’t “big data”. The problem we have at Fly.io is that our gateways don’t have serious n-tier RDBMSs. They’re small. Scrappy. They live off the land.</p>
<p>Seriously, though: you could store every WireGuard peer everybody has ever used at Fly.io in a single SQLite database, easily. What you can’t do is store them all in the Linux kernel.</p>
<p>So, at some point, as you push more and more peer configurations to a gateway, you have to start making decisions about which peers you’ll enable in the kernel, and which you won’t.</p>
<p>Wouldn’t it be nice if we just didn’t have this problem? What if, instead of pushing configs to gateways, we had the gateways pull them from our API on demand?</p>
<p>If you did that, peers would only have to be added to the kernel when the client wanted to connect. You could yeet them out of the kernel any time you wanted; the next time the client connected, they’d just get pulled again, and everything would work fine.</p>
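<p>Here’s a minimal sketch of that pull model in Python. It is an illustration, not the real gateway code: <code>PeerTable</code>, <code>fetch_peer</code>, and <code>evict_idle</code> are names made up for the example, and a plain dict stands in for the kernel’s peer list.</p>

```python
import time
from typing import Callable, Dict, Optional


class PeerTable:
    """Toy stand-in for the kernel's peer list: peers are pulled on first use."""

    def __init__(self, fetch_peer: Callable[[str], Optional[dict]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_peer = fetch_peer  # hypothetical call to a config API
        self._clock = clock            # injectable so the example is testable
        self._installed: Dict[str, dict] = {}
        self._last_seen: Dict[str, float] = {}

    def on_connect(self, public_key: str) -> Optional[dict]:
        """Called when a client tries to connect; installs the peer if needed."""
        peer = self._installed.get(public_key)
        if peer is None:
            peer = self._fetch_peer(public_key)
            if peer is None:
                return None            # unknown key: nothing to install
            self._installed[public_key] = peer
        self._last_seen[public_key] = self._clock()
        return peer

    def evict_idle(self, max_idle_seconds: float) -> int:
        """Drop peers not seen recently; they get pulled again on reconnect."""
        now = self._clock()
        stale = [key for key, seen in self._last_seen.items()
                 if now - seen > max_idle_seconds]
        for key in stale:
            del self._installed[key]
            del self._last_seen[key]
        return len(stale)
```

<p>The point of the shape is that eviction is always safe: a dropped peer costs one extra fetch the next time its owner shows up, and nothing else.</p>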
<p>The problem you quickly run into when building this design is that Linux kernel WireGuard doesn’t have a feature for installing peers on demand. However:</p>
<h2 id='it-is-possible-to-jit-wireguard-peers' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#it-is-possible-to-jit-wireguard-peers' aria-label='Anchor'></a>It is possible to JIT WireGuard peers</h2>
<p>The Linux kernel’s <a href='https://github.com/WireGuard/wgctrl-go' title=''>interface for configuring WireGuard</a> is <a href='https://docs.kernel.org/userspace-api/netlink/intro.html' title=''>Netlink</a> (which is basically a way to create a userland socket to talk to a kernel service). Here’s a <a href='https://github.com/WireGuard/wg-dynamic/blob/master/netlink.h' title=''>summary of it as a C API</a>. Note that there’s no API call to subscribe for “incoming connection attempt” events.</p>
<p>That’s OK! We can just make our own events. WireGuard connection requests are packets, and they’re easily identifiable, so we can efficiently snatch them with a BPF filter and a <a href='https://github.com/google/gopacket' title=''>packet socket</a>.</p>
<div class="callout"><p>Most of the time, it’s even easier for us to get the raw WireGuard packets, because our users now default to WebSockets WireGuard (which is just an unauthenticated WebSockets connect that shuttles framed UDP packets to and from an interface on the gateway), so that people who have trouble talking end-to-end in UDP can bring connections up.</p>
</div>
<p>We own the daemon code for that, and can just hook the packet receive function to snarf WireGuard packets.</p>
<p>It’s not obvious, but WireGuard doesn’t have notions of “client” or “server”. It’s a pure point-to-point protocol; peers connect to each other when they have traffic to send. The first peer to connect is called the <strong class='font-[550] text-navy-950'>initiator</strong>, and the peer it connects to is the <strong class='font-[550] text-navy-950'>responder</strong>.</p>
<div class="right-sidenote"><p><a href="https://www.wireguard.com/papers/wireguard.pdf" title=""><em>The WireGuard paper</em></a> <em>is a good read.</em></p>
</div>
<p>For Fly.io, <code>flyctl</code> is typically our initiator, sending a single UDP packet to the gateway, which is the responder. According <a href='https://www.wireguard.com/papers/wireguard.pdf' title=''>to the WireGuard paper</a>, this first packet is a <code>handshake initiation</code>. It gets better: the packet type is recorded in a single plaintext byte. So this simple BPF filter catches all the incoming connections: <code>udp and dst port 51820 and udp[8] = 1</code>.</p>
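<p>The same check is easy to express in userland once you have the UDP payload in hand. The sketch below is Python, not the gateway’s code; the function names are made up. Byte <code>udp[8]</code> in the filter is byte 0 of the payload, because the UDP header is 8 bytes long. The length and reserved-byte checks are extras taken from the message layout in the WireGuard paper, beyond what the one-byte filter tests.</p>

```python
import struct

MESSAGE_HANDSHAKE_INITIATION = 1
HANDSHAKE_INITIATION_SIZE = 148  # bytes, per the WireGuard paper's message layout


def is_handshake_initiation(payload: bytes) -> bool:
    """True if a UDP payload looks like a WireGuard handshake initiation."""
    if len(payload) != HANDSHAKE_INITIATION_SIZE:
        return False
    # First byte is the message type; the next three are reserved zeros.
    return (payload[0] == MESSAGE_HANDSHAKE_INITIATION
            and payload[1:4] == b"\x00\x00\x00")


def sender_index(payload: bytes) -> int:
    """The initiator's 32-bit session index, little-endian, right after the type."""
    (index,) = struct.unpack_from("<I", payload, 4)
    return index
```

<p>Everything past those first 8 bytes (the ephemeral key, the encrypted static key, the timestamp, and the MACs) is where the Noise work starts.</p>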
<p>In most other protocols, we’d be done at this point; we’d just scrape the username or whatnot out of the packet, go fetch the matching configuration, and install it in the kernel. With WireGuard, not so fast. WireGuard is based on Trevor Perrin’s <a href='http://www.noiseprotocol.org/' title=''>Noise Protocol Framework</a>, and Noise goes way out of its way to <a href='http://www.noiseprotocol.org/noise.html#identity-hiding' title=''>hide identities</a> during handshakes. To identify incoming requests, we’ll need to run enough Noise cryptography to decrypt the identity.</p>
<p>The code to do this is fussy, but it’s relatively short (about 200 lines). Helpfully, the kernel Netlink interface will give a privileged process the private key for an interface, so the secrets we need to unwrap WireGuard are easy to get. Then it’s just a matter of running the first bit of the Noise handshake. If you’re that kind of nerdy, <a href='https://gist.github.com/tqbf/9f2c2852e976e6566f962d9bca83062b' title=''>here’s the code.</a></p>
<p>At this point, we have the event feed we wanted: the public keys of every user trying to make a WireGuard connection to our gateways. We keep a rate-limited cache in SQLite, and when we see new peers, we’ll make an internal HTTP API request to fetch the matching peer information and install it. This fits nicely into the little daemon that already runs on our gateways to manage WireGuard, and allows us to ruthlessly and recklessly remove stale peers with a <code>cron</code> job.</p>
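<p>The rate limiting matters because clients retry handshakes, and each retry would otherwise trigger another API request. Here’s a rough Python sketch of the idea. <code>FetchLimiter</code> and <code>should_fetch</code> are hypothetical names, and an in-memory dict stands in for the SQLite cache.</p>

```python
import time
from typing import Callable, Dict


class FetchLimiter:
    """Allow at most one API lookup per public key within `min_interval` seconds."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._last_fetch: Dict[bytes, float] = {}  # in-memory stand-in for SQLite

    def should_fetch(self, public_key: bytes) -> bool:
        now = self._clock()
        last = self._last_fetch.get(public_key)
        if last is not None and now - last < self._min_interval:
            return False  # looked this key up recently; ignore the retry
        self._last_fetch[public_key] = now
        return True
```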
<p>But wait! There’s more! We bounced this plan off Jason Donenfeld, and he tipped us off on a sneaky feature of the Linux WireGuard Netlink interface.</p>
<div class="right-sidenote"><p>Jason is the hardest working person in show business.</p>
</div>
<p>Our API fetch for new peers is generally not going to be fast enough to respond to the first handshake initiation message a new client sends us. That’s OK; WireGuard is pretty fast about retrying. But we can do better.</p>
<p>When we get an incoming initiation message, we have the 4-tuple address of the desired connection, including the ephemeral source port <code>flyctl</code> is using. We can install the peer as if we’re the initiator, and <code>flyctl</code> is the responder. The Linux kernel will initiate a WireGuard connection back to <code>flyctl</code>. This works; the protocol doesn’t care a whole lot who’s the server and who’s the client. We get new connections established about as fast as they can possibly be installed.</p>
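<p>To make “install the peer as if we’re the initiator” concrete: the key ingredient is the peer’s endpoint, which we already know from the source address of the packet we captured. The Python sketch below builds the equivalent <code>wg set</code> command line. That is a stand-in for illustration, since the gateway daemon configures the kernel through Netlink rather than shelling out, and the function name and example values are made up.</p>

```python
import ipaddress
from typing import List, Tuple


def wg_set_args(interface: str, public_key_b64: str,
                source: Tuple[str, int], allowed_ip: str) -> List[str]:
    """Build a `wg set` command that installs a peer with a known endpoint."""
    host, port = source
    addr = ipaddress.ip_address(host)
    # wg(8) expects IPv6 endpoints in brackets: [addr]:port
    endpoint = f"[{addr}]:{port}" if addr.version == 6 else f"{addr}:{port}"
    return ["wg", "set", interface,
            "peer", public_key_b64,
            "endpoint", endpoint,
            "allowed-ips", allowed_ip]
```

<p>With an endpoint set, the kernel has somewhere to send its own handshake initiation, instead of waiting for the client to retry.</p>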
<figure class="post-cta">
<figcaption>
<h1>Launch an app in minutes</h1>
<p>Speedrun an app onto Fly.io and get your own JIT WireGuard peer ✨</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="/docs/speedrun/">
Speedrun <span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-turtle.webp" srcset="/static/images/cta-turtle@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='look-at-this-graph' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#look-at-this-graph' aria-label='Anchor'></a>Look at this graph</h2>
<p>We’ve been running this in production for a few weeks and we’re feeling pretty happy about it. We went from thousands, or hundreds of thousands, of stale WireGuard peers on a gateway to what rounds to none. Gateways now hold a lot less state, are faster at setting up peers, and can be rebooted without having to wait for many unused peers to be loaded back into the kernel.</p>
<p>I’ll leave you with this happy Grafana chart from the day of the switchover.</p>
<p><img alt="a Grafana chart of 'kernel_stale_wg_peer_count' vs. time. For the first few hours, all traces are flat. Most are at values between 0 and 50,000 and the top-most is just under 550,000. Towards the end of the graph, each line in turn jumps sharply down to the bottom, and at the end of the chart all datapoints are indistinguishable from 0." src="/images/wireguard-peers-graph.webp" /></p>
<p><strong class='font-[550] text-navy-950'>Editor’s note:</strong> Despite our tearful protests, Lillian has decided to move on from Fly.io to explore new pursuits. We wish her much success and happiness! ✨</p>
<h1>Fly Kubernetes does more now</h1>
<p>Published 2024-03-07, updated 2024-03-08. <a href='https://fly.io/blog/fks-beta-live/' title=''>https://fly.io/blog/fks-beta-live/</a></p>
<div class="lead"><p>Eons ago, we <a href="https://fly.io/blog/fks/" title="">announced</a> we were working on <a href="https://fly.io/docs/kubernetes/" title="">Fly Kubernetes</a>. It drummed up enough excitement to prove we were heading in the right direction. So, we got hard at work to get from barebones “early access” to a beta release. We’ll be onboarding customers to the closed beta over the next few weeks. Email us at <a href="mailto:sales@fly.io">sales@fly.io</a> and we’ll hook you up.</p>
</div>
<p>Fly Kubernetes is the “blessed path”™️ to using Kubernetes backed by Fly.io infrastructure. Or, in simpler terms, it is our managed Kubernetes service. We take care of the complexity of operating the Kubernetes control plane, leaving you with the unfettered joy of deploying your Kubernetes workloads. If you love Fly.io and K8s, this product is for you.</p>
<h2 id='what-even-is-a-kubernete' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-even-is-a-kubernete' aria-label='Anchor'></a>What even is a Kubernete?</h2>
<p>So how did this all come to be—and what even is a Kubernete?</p>
<div class="right-sidenote"><p>You can see more fun details in <a href="https://fly.io/blog/fks/" title="">Introducing Fly Kubernetes</a>.</p>
</div>
<p>If you wade through all the YAML and <a href='https://landscape.cncf.io/' title=''>CNCF projects</a>, what’s left is an API for declaring workloads and how they should be accessed.</p>
<p>But that’s not what people usually talk / groan about. It’s everything else that comes along with adopting Kubernetes: a container runtime (CRI), networking between workloads (CNI), which leads to DNS (CoreDNS). Then you layer on Prometheus for metrics and whatever the logging daemon du jour is at the time. Now you get to debate which Ingress (strike that, <em>Gateway</em>) API to deploy, and if the next thing has anything to do with a Service Mess, then, as they like to say where I live, “bless your heart”.</p>
<p>Finally, there’s capacity planning. You’ve got to pick and choose where, how and what the <a href='https://kubernetes.io/docs/concepts/architecture/nodes/' title=''>Nodes</a> will look like in order to configure and run the workloads.</p>
<p>When we began thinking about what a Fly Kubernetes Service could look like, we started from first principles, as we do with most everything here. The best way we can describe it is the <a href='https://www.youtube.com/watch?v=Ddk9ci6geSs' title=''>scene from Iron Man 2 when Tony Stark discovers a new element</a>. As he’s looking at the knowledge left behind by those that came before, he starts to imagine something entirely different and more capable than could have been accomplished previously. That’s what happened to JP, but with K3s and Virtual Kubelet.</p>
<h2 id='ok-then-wtf-whats-the-fks' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#ok-then-wtf-whats-the-fks' aria-label='Anchor'></a>OK then, WTF (what’s the FKS)?</h2>
<p>We looked at what people need to get started—the API—and then started peeling away all the noise, filling in the gaps to connect things together to provide the power. Here’s how this looks currently:</p>
<ul>
<li>Containerd/CRI → <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/' title=''>flyd</a> + Firecracker + <a href='https://fly.io/blog/docker-without-docker/' title=''>our init</a>: our system transmogrifies Docker containers into Firecracker microVMs
</li><li>Networking/CNI → Our <a href='https://fly.io/blog/ipv6-wireguard-peering/' title=''>internal WireGuard mesh</a> connects your pods together
</li><li>Pods → Fly Machines VMs
</li><li>Secrets → Secrets, only not the base64’d kind
</li><li>Services → The Fly Proxy
</li><li>CoreDNS → CoreDNS (to be replaced with our custom internal DNS)
</li><li>Persistent Volumes → Fly Volumes (coming soon)
</li></ul>
<p>Now…not everything is a one-to-one comparison, and we explicitly did not set out to support any and every configuration. We aren’t dealing with resources like Network Policy and init containers, though we’re also not completely ignoring them. By mapping many of the core primitives of Kubernetes to a Fly.io resource, we’re able to focus on continuing to build the primitives that make our cloud better for workloads of all shapes and sizes.</p>
<p>A key thing to notice above is that there’s no “Node”.</p>
<p><a href='https://virtual-kubelet.io/' title=''>Virtual Kubelet</a> plays a central role in FKS. It’s magic, really. A Virtual Kubelet acts as if it’s a standard Kubelet running on a Node, eager to run your workloads. However, there’s no Node backing it. It instead behaves like an API, receiving requests from Kubernetes and transforming them into requests to deploy on a cloud compute service. In our case, that’s Fly Machines.</p>
<p>So what we have is Kubernetes calling out to our <a href='https://virtual-kubelet.io/docs/providers/' title=''>Virtual Kubelet provider</a>, a small Golang program we run alongside K3s, to create and run your pod. It creates <a href='https://fly.io/blog/docker-without-docker/' title=''>your pod as a Fly Machine</a>, via the <a href='/docs/machines/api/' title=''>Fly Machines API</a>, deploying it to any underlying host within that region. This shifts the burden of managing hardware capacity from you to us. We think that’s a cool trick—thanks, Virtual Kubelet magic!</p>
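<p>To give a feel for the translation step, here’s a deliberately tiny Python sketch. The pod fields it reads (<code>metadata.name</code>, <code>metadata.labels</code>, <code>spec.containers[].image</code>) are standard Kubernetes. The output dict is a simplified, hypothetical request shape for illustration, not the actual Fly Machines API schema or our provider’s code, and the function name is made up.</p>

```python
from typing import Any, Dict


def pod_to_machine_request(pod: Dict[str, Any], region: str) -> Dict[str, Any]:
    """Translate a (very) simplified pod manifest into a machine-create request."""
    containers = pod["spec"]["containers"]
    if len(containers) != 1:
        # Assumption for this sketch: one container per pod, one Machine per pod.
        raise ValueError("this sketch only handles single-container pods")
    container = containers[0]
    return {
        # Hypothetical request fields, chosen for readability.
        "name": pod["metadata"]["name"],
        "region": region,
        "config": {
            "image": container["image"],
            "metadata": dict(pod["metadata"].get("labels", {})),
        },
    }
```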
<h2 id='speedrun' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#speedrun' aria-label='Anchor'></a>Speedrun</h2>
<p>You can deploy your workloads (including GPU workloads) across any of our available regions using the Kubernetes API.</p>
<p>You create a cluster with <code>flyctl</code>:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-dc2sxmhc"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-dc2sxmhc">fly ext create k8s --name hello --org personal --region iad
</code></pre>
</div>
</div>
<p>When a cluster is created, it has the standard <code>default</code> namespace. You can inspect it:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-2ancs8zn"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-2ancs8zn">kubectl get ns default --show-labels
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-tkuwanc6"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight output whitespace-pre'><code id="code-tkuwanc6">NAME      STATUS   AGE   LABELS
default   Active   20d   fly.io/app=fks-default-7zyjm3ovpdxmd0ep,kubernetes.io/metadata.name=default
</code></pre>
</div>
</div>
<p>The <code>fly.io/app</code> label shows the name of the Fly App that corresponds to your cluster.</p>
<p>It would seem appropriate to deploy the <a href='https://github.com/kubernetes-up-and-running/kuard' title=''>Kubernetes Up And Running demo</a> here, but since your pods are connected over an <a href='https://fly.io/blog/ipv6-wireguard-peering/' title=''>IPv6 WireGuard mesh</a>, we’re going to use a <a href='https://github.com/jipperinbham/kuard' title=''>fork</a> with support for <a href='https://github.com/kubernetes-up-and-running/kuard/issues/46' title=''>IPv6 DNS</a>.</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-tj3014b7"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-tj3014b7">kubectl run \
  --image=ghcr.io/jipperinbham/kuard-amd64:blue \
  --labels="app=kuard-fks" \
  kuard
</code></pre>
</div>
</div>
<p>And you can see its Machine representation via:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-jnlehl0f"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-jnlehl0f">fly machine list --app fks-default-7zyjm3ovpdxmd0ep
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<div class='highlight relative group'>
<pre class='highlight output whitespace-pre'><code id="code-69p43771">ID NAME STATE REGION IMAGE IP ADDRESS VOLUME CREATED LAST UPDATED APP PLATFORM PROCESS GROUP SIZE
1852291c46ded8 kuard started iad jipperinbham/kuard-amd64:blue fdaa:0:48c8:a7b:228:4b6d:6e20:2 2024-03-05T18:54:41Z 2024-03-05T18:54:44Z shared-cpu-1x:256MB
</code></pre>
</div>
</div>
<p></div></p>
<p>This is important! Your pod is a Fly Machine! We don’t yet support every kubectl feature, but Fly.io tooling will “just work” where the kubectl way is missing. For example, we don’t have <code>kubectl port-forward</code> or <code>kubectl exec</code> yet, but you can use flyctl to forward ports and get a shell into a pod.</p>
<p>Expose it to your internal network using the standard ClusterIP Service:</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-lelg6xxg">kubectl expose pod kuard \
--name=kuard \
--port=8080 \
--target-port=8080 \
--selector='app=kuard-fks'
</code></pre>
</div>
</div>
<p>ClusterIP Services work natively, and Fly.io internal DNS supports them. Within the cluster, CoreDNS works too.</p>
<p>To access this Service locally via <a href='https://fly.io/docs/networking/private-networking/#flycast-private-load-balancing' title=''>flycast</a>, first get connected to your org’s <a href='https://fly.io/docs/networking/private-networking/' title=''>6PN private WireGuard network</a>. Then get kubectl to describe the <code>kuard</code> Service:</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-fjz460mg">kubectl describe svc kuard
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<div class='highlight relative group'>
<pre class='highlight output'><code id="code-qscb7fth">Name: kuard
Namespace: default
Labels: app=kuard-fks
Annotations: fly.io/clusterip-allocator: configured
service.fly.io/sync-version: 11507529969321451315
Selector: app=kuard-fks
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv6
IP: fdaa:0:48c8:0:1::1a
IPs: fdaa:0:48c8:0:1::1a
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
Endpoints: [fdaa:0:48c8:a7b:228:4b6d:6e20:2]:8080
Session Affinity: None
Events: <none>
</code></pre>
</div>
</div>
<p>You can pull out the Service’s IP address from the above output, and get at the KUARD UI using that: in this case, <code>http://[fdaa:0:48c8:0:1::1a]:8080</code>. </p>
<p>Using internal DNS: <code>http://<service_name>.svc.<app_name>.flycast:8080</code>. Or, in our example: <code>http://kuard.svc.fks-default-7zyjm3ovpdxmd0ep.flycast:8080</code>.</p>
<p>And finally CoreDNS: <code><service_name>.<namespace>.svc.cluster.local</code> resolves to the <code>fdaa</code> IP and is routable within the cluster.</p>
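<p>If you end up scripting against these, the three address forms above are easy to generate. Here’s a minimal sketch; the function names are ours, made up for illustration, and the <code>default</code> namespace is just what the <code>describe</code> output above shows for this example.</p>

```python
def ipv6_url(address: str, port: int) -> str:
    """IPv6 literals have to be wrapped in square brackets inside a URL."""
    return f"http://[{address}]:{port}"


def flycast_url(service_name: str, app_name: str, port: int) -> str:
    """Fly.io internal DNS form, reachable once you're on the 6PN network."""
    return f"http://{service_name}.svc.{app_name}.flycast:{port}"


def cluster_dns_name(service_name: str, namespace: str = "default") -> str:
    """CoreDNS form, resolvable only from inside the cluster."""
    return f"{service_name}.{namespace}.svc.cluster.local"


if __name__ == "__main__":
    print(ipv6_url("fdaa:0:48c8:0:1::1a", 8080))
    print(flycast_url("kuard", "fks-default-7zyjm3ovpdxmd0ep", 8080))
    print(cluster_dns_name("kuard"))
```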
<figure class="post-cta">
<figcaption>
<h1>Get in on the FKS beta</h1>
<p>Email us at sales@fly.io</p>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-kitty.webp" srcset="/static/images/cta-kitty@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='pricing' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#pricing' aria-label='Anchor'></a>Pricing</h2>
<p>The Fly Kubernetes Service is free during the beta. Fly Machines and Fly Volumes you create with it will cost the <a href='https://fly.io/docs/about/pricing/' title=''>same as for your other Fly.io projects</a>. It’ll be <a href='https://fly.io/docs/about/pricing/#fly-kubernetes' title=''>$75/mo per cluster</a> after that, plus the cost of the other resources you create.</p>
<h2 id='today-and-the-future' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#today-and-the-future' aria-label='Anchor'></a>Today and the future</h2>
<p>Today, Fly Kubernetes supports only a portion of the Kubernetes API. You can deploy pods using Deployments/ReplicaSets. Pods are able to communicate via Services using the standard K8s DNS format. Ephemeral and persistent volumes are supported.</p>
<p>The most notable absences are: multi-container pods, StatefulSets, network policies, horizontal pod autoscaling and emptyDir volumes. We’re working on supporting autoscaling and emptyDir volumes in the coming weeks and multi-container pods in the coming months.</p>
<p>If you’ve made it this far and are eagerly awaiting your chance to tell us and the rest of the internet “this isn’t Kubernetes!”, well, we agree! It’s not something we take lightly. We’re still building, and conformance tests may be in the future for FKS. We’ve made a deliberate decision to only care about fast-launching VMs as the one and only way to run workloads on our cloud. And we also know enough of our customers would like to use the Kubernetes API to create a fast-launching VM in the form of a Pod, and that’s where this story begins.</p>
Globally Distributed Object Storage with Tigris
https://fly.io/blog/tigris-public-beta/
2024-02-15T00:00:00+00:00
2024-02-26T14:41:54+00:00
<div class="lead"><p>We’re Fly.io and we transmute containers into VMs, running them on our hardware around the world with the power of Firecracker alchemy. That’s pretty cool, but we want to talk about something someone else built, that <a href="https://fly.io/docs/reference/tigris/" title="">you can use today</a> to build applications.</p>
</div>
<p>There are three hard things in computer science:</p>
<ol>
<li>Cache invalidation
</li><li>Naming things
</li><li><a href='https://aws.amazon.com/s3/' title=''>Doing a better job than Amazon of storing files</a>
</li></ol>
<p>Of all the annoying software problems that have no business being annoying, handling a file upload in a full-stack application stands apart, a universal if tractable malady, the plantar fasciitis of programming.</p>
<p>Now, the actual act of clients placing files on servers is straightforward. Your framework <a href='https://hexdocs.pm/phoenix/file_uploads.html' title=''>has</a> <a href='https://edgeguides.rubyonrails.org/active_storage_overview.html' title=''>a</a> <a href='https://docs.djangoproject.com/en/5.0/topics/http/file-uploads/' title=''>feature</a> <a href='https://expressjs.com/en/resources/middleware/multer.html' title=''>that</a> <a href='https://github.com/yesodweb/yesod-cookbook/blob/master/cookbook/Cookbook-file-upload-saving-files-to-server.md' title=''>does</a> <a href='https://laravel.com/docs/10.x/filesystem' title=''>it</a>. What’s hard is making sure that uploads stick around to be downloaded later.</p>
<aside class="right-sidenote"><p>(yes, yes, we know, <a href="https://youtu.be/b2F-DItXtZs?t=102" title="">sharding /dev/null</a> is faster)</p>
</aside>
<p>Enter object storage, a pattern you may know by its colloquial name “S3”. Object storage occupies a funny place in software architecture, somewhere between a database and a filesystem. It’s like <a href='https://man7.org/linux/man-pages/man3/malloc.3.html' title=''><code>malloc</code></a><code>()</code>, but for cloud storage instead of program memory.</p>
<p><a href='https://www.kleenex.com/en-us/' title=''>S3</a>—err, object storage — is so important that it was the second AWS service ever introduced (EC2 was not the first!). Everybody wants it. We know, because they keep asking us for it.</p>
<p>So why didn’t we build it?</p>
<p>Because we couldn’t figure out a way to improve on S3. And we still haven’t! But someone else did, at least for the kinds of applications we see on Fly.io.</p>
<h2 id='but-first-some-back-story' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#but-first-some-back-story' aria-label='Anchor'></a>But First, Some Back Story</h2>
<p>S3 checks all the boxes. It’s trivial to use. It’s efficient and cost-effective. It has redundancies that would make a DoD contractor blush. It integrates with archival services like Glacier. And every framework supports it. At some point, the IETF should just take a deep sigh and write an S3 API RFC, XML signatures and all.</p>
<p>There’s at least one catch, though.</p>
<p>Back in, like, ‘07 people ran all their apps from a single city. S3 was designed to work for those kinds of apps. The data, the bytes on the disks (or whatever weird hyperputer AWS stores S3 bytes on), live in one place. A specific place. In a specific data center. As powerful and inspiring as The Architects are, they are mortals, and must obey the laws of physics.</p>
<p>This observation feels banal, until you realize how apps have changed in the last decade. Apps and their users don’t live in one specific place. They live all over the world. When users are close to the S3 data center, things are amazing! But things get less amazing the further away you get from the data center, and even less amazing the smaller and more frequent your reads and writes are.</p>
<p>(Thought experiment: you have to pick one place in the world to route all your file storage. Where is it? Is it <a href='https://www.tripadvisor.com/Restaurant_Review-g30246-d1956555-Reviews-Ford_s_Fish_Shack_Ashburn-Ashburn_Loudoun_County_Virginia.html' title=''>Loudoun County, Virginia</a>?)</p>
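<p>To put a rough number on the physics, here’s a back-of-the-envelope sketch. Everything in it is our assumption rather than a measurement: the coordinates are approximate, light in fiber is taken to cover about 200,000 km per second, and real routes are longer than a great circle. So treat the result as a floor, not a benchmark.</p>

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: light in optical fiber covers roughly 200,000 km per second.
FIBER_KM_PER_SECOND = 200_000.0

# Approximate (latitude, longitude) pairs in degrees, for illustration only.
SYDNEY = (-33.87, 151.21)
ASHBURN = (39.04, -77.49)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(distance_km):
    """Lower bound on one round trip: there and back at fiber speed."""
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1000


if __name__ == "__main__":
    d = great_circle_km(*SYDNEY, *ASHBURN)
    print(f"{d:.0f} km, at least {min_round_trip_ms(d):.0f} ms per round trip")
```

<p>Under those assumptions, a reader in Sydney pays well over 100 ms for every round trip to Virginia before any server does any work, and small, chatty reads and writes pay it over and over.</p>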
<p>So, for many modern apps, you end up having to <a href='https://stackoverflow.com/questions/32426249/aws-s3-bucket-with-multiple-regions' title=''>write things into different regions</a>, so that people close to the data get it from a region-specific bucket. Doing that pulls in CDN-caching things that complicate your application and put barriers between you and your data. Before you know it, you’re wearing custom orthotics on your, uh, developer feet. (<em>I am done with this metaphor now, I promise.</em>)</p>
<aside class="right-sidenote"><p>(well, okay, Backblaze B2 because somehow my bucket fits into their free tier, but you get the idea)</p>
</aside>
<p>Personally, I know this happens. Because I had to build one! I run a <a href='https://xeiaso.net/blog/xedn/' title=''>CDN backend</a> that’s a caching proxy for S3 on six continents across the world. All so that I can deliver images and video efficiently for the readers of my blog.</p>
<aside class="right-sidenote"><p>(shut up, it’s a sandwich)</p>
</aside>
<p>What if data was really global? For some applications, it might not matter much. But for others, it matters a lot. When a sandwich lover in Australia snaps a picture of a <a href='https://en.wikipedia.org/wiki/Hamdog' title=''>hamdog</a>, the people most likely to want to see that photo are also in Australia. Routing those uploads and downloads through one building in Ashburn is no way to build a sandwich reviewing empire.</p>
<p>Localizing all the data sounds like a hard problem. What if you didn’t need to change anything on your end to accomplish it?</p>
<h2 id='show-me-a-hero' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#show-me-a-hero' aria-label='Anchor'></a>Show Me A Hero</h2>
<p>Building a miniature CDN infrastructure just to handle file uploads seems like the kind of thing that could take a week or so of tinkering. The Fly.io unified theory of cloud development is that solutions are completely viable for full-stack developers only when they take less than 2 hours to get working.</p>
<p>AWS agrees, which is why they have a SKU for it, <a href='https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudfront_distribution' title=''>called CloudFront</a>, which will, at some variably metered expense, optimize the read side of a single-write-region bucket: they’ll set up <a href='https://fly.io/blog/the-5-hour-content-delivery-network/' title=''>a simple caching CDN</a> for you. You can probably get S3 and CloudFront working within 2 hours, especially if you’ve set it up before.</p>
<p>Our friends at Tigris have this problem down to single-digit minutes, and what they came up with is a lot cooler than a cache CDN.</p>
<p>Here’s how it works. Tigris runs redundant FoundationDB clusters in our regions to track objects. They use Fly.io’s NVMe volumes as a first level of cached raw byte store, and a queuing system modelled on <a href='https://www.foundationdb.org/files/QuiCK.pdf' title=''>Apple’s QuiCK paper</a> to distribute object data to multiple replicas, to regions where the data is in demand, and to 3rd party object stores… like S3.</p>
<p>If your objects are less than about 128 kilobytes, Tigris makes them instantly global. By default! Things are just snappy, all over the world, automatically, because they’ve done all the work.</p>
<p>But it gets better, because Tigris is also much more flexible than a simple caching CDN. It’s globally distributed from the jump, with inter-region routing baked into its distribution layer. Tigris isn’t a CDN, but rather a toolset that you can use to build arbitrary CDNs, with consistency guarantees, instant purge and relay regions.</p>
<p>There’s a lot going on in this architecture, and it’d be fun to dig into it more. But for now, you don’t have to understand any of it. Because Tigris ties all this stuff together with an S3-compatible object storage API. If your framework can talk to S3, it can use Tigris.</p>
<h2 id='fly-storage' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#fly-storage' aria-label='Anchor'></a><code>fly storage</code></h2>
<p>To get started with this, run the <code>fly storage create</code> command:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-fbcw6suf">$ fly storage create
Choose a name, use the default, or leave blank to generate one: xe-foo-images
Your Tigris project (xe-foo-images) is ready. See details and next steps with: https://fly.io/docs/reference/tigris/
Setting the following secrets on xe-foo:
AWS_REGION
BUCKET_NAME
AWS_ENDPOINT_URL_S3
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
Secrets are staged for the first deployment
</code></pre>
</div>
</div>
<p>All you have to do is fill in a bucket name. Hit enter. All of the configuration for the AWS S3 library will be injected into your application for you. And you don’t even need to change the libraries that you’re using. <a href='https://www.tigrisdata.com/docs/sdks/s3/' title=''>The Tigris examples</a> all use the AWS libraries to put objects into Tigris and delete them, using the same calls that you use for S3.</p>
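<p>As a rough illustration of what “injected” means in practice, here’s a sketch that gathers the five secrets listed above into keyword arguments for an S3 client. The helper name <code>tigris_settings</code> is ours, not part of any SDK. Recent AWS SDKs may well pick these variables up on their own, so treat the explicit wiring as belt and suspenders.</p>

```python
import os

# The secrets that `fly storage create` reported setting on the app.
REQUIRED = (
    "AWS_REGION",
    "BUCKET_NAME",
    "AWS_ENDPOINT_URL_S3",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def tigris_settings(env=os.environ):
    """Return (bucket_name, client_kwargs) built from the injected secrets.

    `tigris_settings` is a hypothetical helper for illustration only.
    """
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing secrets: " + ", ".join(missing))
    client_kwargs = {
        "endpoint_url": env["AWS_ENDPOINT_URL_S3"],
        "region_name": env["AWS_REGION"],
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
    return env["BUCKET_NAME"], client_kwargs


# Usage, assuming boto3 is installed in your app:
#   import boto3
#   bucket, kwargs = tigris_settings()
#   s3 = boto3.client("s3", **kwargs)
#   s3.put_object(Bucket=bucket, Key="hamdog.jpg", Body=b"...")
```

<p>Failing loudly when a secret is missing is a deliberate choice here: a half-configured client tends to wander off and talk to the wrong endpoint.</p>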
<p>I know how this looks for a lot of you. It looks like we’re partnering with Tigris because we’re chicken, and we didn’t want to build something like this. Well, guess what: you’re right!</p>
<p>Compute and networking: those are things we love and understand. Object storage? <a href='https://fly.io/blog/the-5-hour-content-delivery-network/' title=''>We already gave away the game on how we’d design a CDN for our own content</a>, and it wasn’t nearly as slick as Tigris.</p>
<p>Object storage is important. It needs to be good. We did not want to half-ass it. So we partnered with Tigris, so that they can put their full resources into making object storage as ✨magical✨ as Fly.io is.</p>
<p>This also mirrors a lot of the Unix philosophy of Days Gone Past: you have individual parts that each do one thing very well, chained together to create a composite result. I mean, come on, would you seriously want to buy your servers the same place you buy your shoes?</p>
<h2 id='one-bill-to-rule-them-all' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#one-bill-to-rule-them-all' aria-label='Anchor'></a>One bill to rule them all</h2>
<p>Well, okay, the main reason why you would want to do that is because having everything under one bill makes it really easy for your accounting people. So we’ve wrapped your compute, your block storage, your databases, your networking, and your object storage under one bill. You don’t have to create separate accounts with Supabase or Upstash or PlanetScale or Tigris. Everything gets charged to your Fly.io bill and you pay one bill per month.</p>
<aside class="right-sidenote"><p>This was actually going to be posted on Valentine’s Day, but we had to wait for the chocolate to go on sale.</p>
</aside>
<p>This is our Valentine’s Day gift to you all. Object storage that just works. Stay tuned because we have a couple exciting features that build on top of the integration of Fly.io and Tigris that allow really unique things, such as truly global static website hosting and turning your bucket into a CDN in 5 minutes at most.</p>
<p>Here’s to many more happy developer days to come.</p>
GPUs on Fly.io are available to everyone!
https://fly.io/blog/gpu-ga/
2024-02-12T00:00:00+00:00
2024-02-20T19:03:55+00:00
<div class="lead"><p>Fly.io makes it easy to spin up compute around the world, now including powerful GPUs. Unlock the power of large language models, text transcription, and image generation with our datacenter-grade muscle!</p>
</div>
<p>GPUs are now available to everyone!</p>
<p>We know you’ve been excited to use GPUs on Fly.io, and we’re happy to announce that they’re available for everyone. If you want, you can spin up GPU instances with any of the following cards:</p>
<ul>
<li>Ampere A100 (40GB) <code>a100-40gb</code>
</li><li>Ampere A100 (80GB) <code>a100-80gb</code>
</li><li>Lovelace L40s (48GB) <code>l40s</code>
</li></ul>
<p>To use a GPU instance today, change the <code>vm.size</code> for one of your apps or processes to any of the above GPU kinds. Here’s how you can spin up an <a href='https://ollama.ai' title=''>Ollama</a> server in seconds:</p>
<div class="highlight-wrapper group relative toml">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ihr66zeg"><span class="py">app</span> <span class="p">=</span> <span class="s">"your-app-name"</span>
<span class="py">region</span> <span class="p">=</span> <span class="s">"ord"</span>
<span class="py">vm.size</span> <span class="p">=</span> <span class="s">"l40s"</span>
<span class="nn">[http_service]</span>
<span class="py">internal_port</span> <span class="p">=</span> <span class="mi">11434</span>
<span class="py">force_https</span> <span class="p">=</span> <span class="kc">false</span>
<span class="py">auto_stop_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">auto_start_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">min_machines_running</span> <span class="p">=</span> <span class="mi">0</span>
<span class="py">processes</span> <span class="p">=</span> <span class="nn">["app"]</span>
<span class="nn">[build]</span>
<span class="py">image</span> <span class="p">=</span> <span class="s">"ollama/ollama"</span>
<span class="nn">[mounts]</span>
<span class="py">source</span> <span class="p">=</span> <span class="s">"models"</span>
<span class="py">destination</span> <span class="p">=</span> <span class="s">"/root/.ollama"</span>
<span class="py">initial_size</span> <span class="p">=</span> <span class="s">"100gb"</span>
</code></pre>
</div>
</div>
<p>Deploy this and bam, large language model inferencing from anywhere. If you want a private setup, see the article <a href='https://fly.io/blog/scaling-llm-ollama/' title=''>Scaling Large Language Models to zero with Ollama</a> for more information. You never know when you have a sandwich emergency and don’t know what you can make with what you have on hand.</p>
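<p>Once it’s deployed, talking to it is one HTTP POST. Here’s a minimal sketch that builds a request for Ollama’s <code>/api/generate</code> endpoint. The app URL and the model name in the usage comment are placeholders, and the model has to be pulled onto the Machine before it will answer.</p>

```python
import json
import urllib.request


def build_generate_request(base_url, model, prompt):
    """Build (but don't send) a POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage sketch. The URL and model name below are hypothetical placeholders:
#   req = build_generate_request(
#       "https://your-app-name.fly.dev", "llama2",
#       "What sandwich can I make with bread, ham, and regret?",
#   )
#   with urllib.request.urlopen(req, timeout=120) as resp:
#       print(json.load(resp)["response"])
```

<p>The long timeout in the usage comment is there because, with <code>min_machines_running = 0</code>, the first request may have to wait for a Machine to start and for the model to load.</p>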
<p>We are working on getting some lower-cost A10 GPUs in the next few weeks. We’ll update you when they’re ready.</p>
<p>If you want to explore the possibilities of GPUs on Fly.io, here are a few articles that may give you ideas:</p>
<ul>
<li><a href='https://fly.io/blog/not-midjourney-bot/' title=''>Deploy Your Own (Not) MidJourney Bot On Fly GPUs</a>
</li><li><a href='https://fly.io/blog/scaling-llm-ollama/' title=''>Scaling Large Language Models to zero with Ollama</a>
</li><li><a href='https://fly.io/blog/transcribing-on-fly-gpu-machines/' title=''>Transcribing on Fly GPU Machines</a>
</li></ul>
<p>Depending on factors such as your organization’s age and payment history, you may need to go through additional verification steps.</p>
<p>If you’ve been experimenting with Fly.io GPUs and have made something cool, let us know on the <a href='https://community.fly.io/' title=''>Community Forums</a> or by mentioning us <a href='https://hachyderm.io/@flydotio' title=''>on Mastodon</a>! We’ll boost the cool ones.</p>
Event Driven Machines
https://fly.io/blog/event-driven-machines/
2024-02-05T00:00:00+00:00
2024-02-05T16:14:27+00:00
<div class="lead"><p>We’re Fly.io and we transmute containers into VMs, running them on our hardware around the world. We have fast-booting VMs, so why not <a href="https://fly.io/docs/speedrun/" title="">take advantage of them</a>?</p>
</div>
<p>Serverless is great because it has good ergonomics - when an event is received, a “not-server” boots quickly, code is run, and then everything is torn down. We’re billed only on usage.</p>
<p>It turns out that Fly.io shares many of <a href='https://fly.io/blog/the-serverless-server/' title=''>the same ergonomics</a> as serverless. Can we do a serverless on Fly.io? 🦆 Well, if it’s quacking like a duck, let’s call it a mallard.</p>
<p>Here’s a useful pattern for triggering our own not-servers with Fly Machines.</p>
<h2 id='triggering-machines' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#triggering-machines' aria-label='Anchor'></a>Triggering Machines</h2>
<p>I want to make Machines do some work based on my own events. Fly.io can already <a href='https://fly.io/docs/apps/autostart-stop/' title=''>stop Machines when idle</a> based on HTTP, so let’s concentrate on non-HTTP events.</p>
<p>The process of running evented Machines involves:</p>
<ol>
<li>Listening for events
</li><li>Spinning up Fly Machines to run our code (with the events as context)
</li><li>Having event-aware code to run
</li></ol>
<p>To do this, I made a project and named it <a href='https://github.com/fly-apps/lambdo' title=''><strong class='font-[550] text-navy-950'>Lambdo</strong></a> because reasons.
You can consider this project “reference architecture” in the same way you call a toddler’s scribbling “art”.</p>
<p>The goal is to run some of our code on a fresh not-server when an event is received. We want this done efficiently - a Machine should only exist long enough to process an event or 3.</p>
<p>Lambdo does just that - it receives some events, and spins up Fly Machines with those events placed <em>inside</em> the VMs. Once the code finishes, the Machine is destroyed.</p>
<div class='group relative min-w-0 bg-white shadow-md shadow-navy-500/10 rounded-xl mb-7 ring-1 ring-navy-300/40'><div class='min-w-0 overflow-x-auto rounded-xl'><table class='table-stripe table-stretch table-pad text-sm whitespace-nowrap m-0' id='table-l68v0zta'><thead class='text-navy-950 text-left'><tr>
<th style="text-align: center"><img alt="the files are inside the computer" src="/images/files-are-inside-the-computer-cover.webp" /></th>
</tr>
</thead><tbody><tr>
<td style="text-align: center">The files are <em>in</em> the computer!</td>
</tr>
</tbody></table></div></div><h2 id='listening-for-events' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#listening-for-events' aria-label='Anchor'></a>Listening for Events</h2>
<p>For our purposes, an event is just a JSON object. <code>{"any": "object", "will": "do"}</code>.</p>
<p>We want to turn events into compute, so we need some sort of event system. I decided to use a queue.</p>
<h3 id='the-queue' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-queue' aria-label='Anchor'></a>The Queue</h3>
<p>The first thing I needed was a place to send events! I chose to use SQS, which let me continue to pretend servers don’t exist.</p>
<p>It’s no surprise then that the first part of this project is <a href='https://github.com/fly-apps/lambdo/blob/main/internal/sqs/get_events.go' title=''>code that polls SQS</a>.</p>
<p>When the polling returns some non-zero number of events, it collects the SQS messages’ JSON strings (and some metadata), resulting in an array of objects (a list of events).</p>
<p>Then we send these events to some Machines.</p>
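<p>As a rough illustration of that collection step (Lambdo itself is written in Go, so this JavaScript sketch is not its code), the function below takes messages shaped like an SQS receive response and returns a list of events. The name <code>collectEvents</code> and the output fields <code>id</code>, <code>event</code>, and <code>meta</code> are hypothetical; <code>MessageId</code>, <code>Body</code>, and <code>MessageAttributes</code> are the standard SQS message fields.</p>

```javascript
// Hypothetical sketch: turn raw SQS messages into a list of events.
function collectEvents(messages) {
  const events = [];
  for (const msg of messages ?? []) {
    let event;
    try {
      event = JSON.parse(msg.Body); // the event itself is the JSON body
    } catch {
      continue; // this sketch silently skips bodies that are not valid JSON
    }
    // Flatten the String attributes into a plain key/value object.
    const meta = {};
    for (const [name, attr] of Object.entries(msg.MessageAttributes ?? {})) {
      meta[name] = attr.StringValue;
    }
    events.push({ id: msg.MessageId, event, meta });
  }
  return events;
}

const sample = [
  {
    MessageId: "1",
    Body: '{"foo": "bar"}',
    MessageAttributes: {
      image: { DataType: "String", StringValue: "fideloper/lambdo-php-sample:latest" },
    },
  },
  { MessageId: "2", Body: "not json" },
];
console.log(collectEvents(sample));
```

<p>Skipping malformed bodies is a choice made for the sketch; a real poller might instead leave them on the queue or route them to a dead-letter queue.</p>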
<h2 id='spinning-up-machines' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#spinning-up-machines' aria-label='Anchor'></a>Spinning Up Machines</h2>
<p>Fly Machines are fast-booting micro-VMs, controlled by an <a href='https://fly.io/docs/machines/working-with-machines/' title=''>API</a>.</p>
<p>A feature of that API is the ability to <a href='https://community.fly.io/t/machine-files/14453' title=''>create files</a> on a new Machine. This is how we’ll get our events into the Machine.</p>
<p>When Lambdo creates a Machine, it places a file at <code>/tmp/events.json</code>. Our code just needs to read that file and parse the JSON.</p>
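<p>To make the file-injection step concrete, here is a hedged sketch of a request body for creating a Machine. The field names (<code>config.image</code>, <code>config.auto_destroy</code>, and <code>config.files</code> with <code>guest_path</code> and a base64-encoded <code>raw_value</code>) are assumptions based on the Machines API docs and should be verified against the current reference; <code>buildMachineRequest</code> is a hypothetical helper, not part of Lambdo.</p>

```javascript
// Hypothetical helper: build a Machines API "create" body that drops
// the events into the new Machine at /tmp/events.json.
function buildMachineRequest(image, events) {
  const payload = JSON.stringify(events);
  return {
    config: {
      image,
      auto_destroy: true, // assumption: remove the Machine once it exits
      files: [
        {
          guest_path: "/tmp/events.json",
          // assumption: file contents are passed base64-encoded
          raw_value: Buffer.from(payload, "utf8").toString("base64"),
        },
      ],
    },
  };
}

const body = buildMachineRequest("fideloper/lambdo-php-sample:latest", [{ foo: "bar" }]);
console.log(JSON.stringify(body, null, 2));
```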
<h3 id='running-our-code' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#running-our-code' aria-label='Anchor'></a>Running Our Code</h3>
<p>Part of the ergonomics of Serverless is (usually) being limited to running just a function. Fly.io doesn’t really care what you run, which is to our advantage. We can choose to write discrete functions per event, or we can bring our whole <a href='https://signalvnoise.com/svn3/the-majestic-monolith/' title=''>Majestic Monolith</a> to bear.</p>
<p>How do we package up our code? The real answer is “however you want!”, but here are two ideas.</p>
<p><strong class='font-[550] text-navy-950'>Use Your Existing Code Base</strong></p>
<p>You can just use your existing code base. This is especially easy if you’re already deploying apps to Fly.io.</p>
<p>All we’d need to do is add some additional code - a command perhaps (<code>rake</code>, <code>artisan</code>, whatever) - that sucks in that JSON, iterates over the events, and does some stuff.</p>
<div class="highlight-wrapper group relative php">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-w9siqm3d"><span class="nv">$events</span> <span class="o">=</span> <span class="nb">json_decode</span><span class="p">(</span><span class="nb">file_get_contents</span><span class="p">(</span><span class="s2">"/tmp/events.json"</span><span class="p">));</span>
<span class="k">foreach</span> <span class="p">(</span><span class="nv">$events</span> <span class="k">as</span> <span class="nv">$event</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// do a thing</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
<p>When we create an event, we’ll tell Lambdo how to run your code - more on that later.</p>
<p><strong class='font-[550] text-navy-950'>Use Lambdo’s Base Images</strong></p>
<p>This project also provides some “runtimes” (base images). This is a bit more “traditional serverless”, where you provide a function to run.</p>
<p>Lambdo contains <a href='https://github.com/fly-apps/lambdo/tree/main/runtimes' title=''>two runtimes</a> right now - Node and PHP. There could be more, of course, but you know…lazy.</p>
<p>The Node runtime <a href='https://github.com/fly-apps/lambdo/blob/main/runtimes/js/src/index.js' title=''>contains some code</a> that will read the JSON payload file (again, just an array of JSON events), and call a user-supplied JS function once per event.</p>
<p>An <a href='https://github.com/fly-apps/lambdo/tree/main/runtimes/js/sample-project' title=''>example is here</a> - our code just needs to export a function that does stuff to the given event:</p>
<div class="highlight-wrapper group relative javascript">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3w12n3eu"><span class="c1">// File /app/index.js</span>
<span class="nx">exports</span><span class="p">.</span><span class="nx">handler</span> <span class="o">=</span> <span class="k">async</span> <span class="kd">function</span><span class="p">(</span><span class="nx">event</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">Let's process an event! The event:</span><span class="dl">"</span><span class="p">,</span> <span class="nx">event</span><span class="p">)</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
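<p>The runtime’s side of that contract can be sketched roughly as below. This is not the code from the Lambdo repository, just an illustration under the assumption that the payload is a JSON array on disk and the handler is an async function; <code>runHandler</code> is a hypothetical name.</p>

```javascript
const fs = require("node:fs");

// Hypothetical sketch: read the payload file and call the handler once per event.
async function runHandler(payloadPath, handler) {
  const events = JSON.parse(fs.readFileSync(payloadPath, "utf8"));
  if (!Array.isArray(events)) {
    throw new Error("expected the payload to be a JSON array of events");
  }
  const results = [];
  for (const event of events) {
    // Events are processed one at a time, in order.
    results.push(await handler(event));
  }
  return results;
}

// Usage, assuming the payload path from earlier and a handler like the one above:
// runHandler("/tmp/events.json", require("/app/index.js").handler);
```

<p>Processing events sequentially keeps the sketch simple; whether a real runtime should run handlers concurrently depends on the workload.</p>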
<p>The <a href='https://github.com/fly-apps/lambdo/tree/main/runtimes/php' title=''>PHP runtime</a> follows the same idea; a user-supplied handler looks like this:</p>
<div class="highlight-wrapper group relative php">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3hz5ehfn"><span class="c1">// File /app/index.php</span>
<span class="k">return</span> <span class="k">function</span><span class="p">(</span><span class="kt">array</span> <span class="nv">$event</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Do something with $event</span>
<span class="p">};</span>
</code></pre>
</div>
</div>
<p>Explore the <a href='https://github.com/fly-apps/lambdo/tree/main/runtimes' title=''>runtimes</a> directory of the project to see how that’s put together.</p>
<h2 id='sending-an-event' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#sending-an-event' aria-label='Anchor'></a>Sending an Event</h2>
<p>Since our events are sent via an SQS queue, it would be helpful to see an example SQS message. Remember how I mentioned the SQS message has some metadata?</p>
<p>Here’s an example, with said metadata:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-601r0d0o">aws sqs send-message <span class="se">\</span>
<span class="nt">--queue-url</span><span class="o">=</span>https://sqs.&lt;region&gt;.amazonaws.com/&lt;account&gt;/&lt;queue&gt; <span class="se">\</span>
<span class="nt">--message-body</span><span class="o">=</span><span class="s1">'{"foo": "bar"}'</span> <span class="se">\</span>
<span class="nt">--message-attributes</span><span class="o">=</span><span class="s1">'{
"size":{"DataType":"String","StringValue":"performance-2x"},
"image":{"DataType":"String","StringValue":"fideloper/lambdo-php-sample:latest"}
}'</span>
</code></pre>
</div>
</div>
<p>The Body field of the SQS message is assumed to be a JSON string (it’s the event itself, and its contents are arbitrary - whatever makes sense for you).</p>
<p>The message attributes contain the metadata - up to 3 important details:</p>
<ol>
<li><code>image</code>: The image to run (it might be a Docker Hub image, or something you pushed to registry.fly.io). This is <strong class='font-[550] text-navy-950'>required</strong>.
</li><li><code>size</code>: The CPU size and type to use† - defaults to <code>performance-2x</code>
</li><li><code>command</code>: The command to run, which is the Docker <code>CMD</code> equivalent - defaults to the <code>CMD</code> set in the <code>Dockerfile</code> used to create the Machine image.††
</li></ol>
<p>†You can get valid values for the <code>size</code> option by running <code>fly platform vm-sizes</code>.</p>
<p>††This takes array form, e.g. <code>["php", "artisan", "foo"]</code>; you may need to escape the double quotes if you’re sending messages to SQS from a terminal.</p>
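<p>Hand-escaping those quotes is easy to get wrong, so one option is to build the attributes programmatically. The sketch below assumes, as the footnote suggests, that <code>command</code> is sent as a JSON array encoded in a String attribute; <code>buildMessageAttributes</code> is a hypothetical helper and not part of Lambdo.</p>

```javascript
// Hypothetical helper: build SQS MessageAttributes for a Lambdo event.
function buildMessageAttributes({ image, size, command }) {
  if (!image) {
    throw new Error("image is required");
  }
  const str = (value) => ({ DataType: "String", StringValue: value });
  const attrs = { image: str(image) };
  if (size) {
    attrs.size = str(size); // when omitted, the documented default is performance-2x
  }
  if (command) {
    // assumption: the command array travels as a JSON string
    attrs.command = str(JSON.stringify(command));
  }
  return attrs;
}

const attrs = buildMessageAttributes({
  image: "fideloper/lambdo-php-sample:latest",
  command: ["php", "artisan", "foo"],
});
// JSON.stringify takes care of escaping the nested double quotes.
console.log(JSON.stringify(attrs));
```

<p>The printed JSON can be passed as the <code>--message-attributes</code> value, wrapped in single quotes as in the example above.</p>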
<h2 id='we-did-a-lambda' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#we-did-a-lambda' aria-label='Anchor'></a>We did a Lambda?</h2>
<p>Fly.io isn’t serverless, but it has all these primitives that add up to serverless. You have events, Fly.io has fast-booting VMs. They just make sense together!</p>
<p>What we did here is use <a href='https://github.com/fly-apps/lambdo' title=''><strong class='font-[550] text-navy-950'>Lambdo</strong> to respond to events by spinning up a Machine</a>. Our code can process those events any way we want.</p>
<p>What I like about this approach is how flexible it can be. We can choose the base image to use and the server type (even using GPU-enabled Machines) <em>per event</em>.
Since we have full control over the Machine VMs responding to the events, we can do whatever we want inside of them. Pretty neat!</p>
/blog/event-driven-machines/assets/lambdo-thumb.webp
Delegating tasks to Fly Machines
https://fly.io/blog/delegate-tasks-to-fly-machines/
2024-02-01T00:00:00+00:00
2024-02-26T14:41:54+00:00
<div class="lead"><p>We’re Fly.io. We run apps for our users on hardware we host around the world. Leveraging Fly.io Machines and Fly.io’s private network can make delegating expensive tasks a breeze. It’s easy to <a href="/docs/speedrun/" title="">get started</a>!</p>
</div>
<p>There are many ways to delegate work in web applications, from using background workers to serverless architecture. In this article, we explore a new machine pattern that takes advantage of Fly Machines and distinct process groups to make quick work of resource-intensive tasks.</p>
<h2 id='the-problem' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-problem' aria-label='Anchor'></a>The Problem</h2>
<p>Let’s say you’re building a web application that has a few tasks that demand a hefty amount of memory or CPU juice. Resizing images, for example, can require a shocking amount of memory, but you might not need that much memory <em>all</em> of the time, for handling most of your web requests. Why pay for all that horsepower when you don’t need it most of the time?</p>
<p>What if there’s a different way to delegate these resource-intensive tasks?</p>
<h2 id='the-solution' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-solution' aria-label='Anchor'></a>The Solution</h2>
<p>What if you could simply delegate these types of tasks to a more powerful machine <em>only</em> when necessary? Let’s build an example of this method in a sample app. We’ll be using Next.js today, but this pattern is framework (and language) agnostic.</p>
<p>Here’s how it will work:</p>
<ul>
<li>A request hits an endpoint that does some resource-intensive tasks
</li><li>The request is passed on to a copy of your app that’s running on a beefier machine
</li><li>The beefy machine performs the intensive work and then hands the result back to the user via the “weaker” machine.
</li></ul>
<p><img alt="(A 3 panel comic of two characters, one small and one big and strong, both with computer screens for heads. Panel 1: Little guy hands the big guy a jar of pickles. Panel 2: Big guy opens the pickle jar. Panel 3: Big guy hands back the opened jar to the little guy, who is pleased; Illustration by Annie Sexton)" src="/images/./3-panel-comic-delegate-tasks-to-fly-machines.webp" /></p>
<p>To demonstrate this task-delegation pattern, we’re going to start with a single-page application that looks like this:</p>
<p><img alt="(Screenshot of the demo app; it's a single-page app with the header and description &quot;Open Pickle Jar: You've got a jar of pickles (a zip file of some high-def pickle photos) that you would like to open (resize and display below)&quot;. Under the description there are two inputs, one for width and one for height, and a button that says &quot;Open pickle jar&quot;)" src="/images/./pickle-jar-screenshot.webp" /></p>
<p>Our “Open Pickle Jar” app is quite simple: you provide the width and height and it goes off and resizes some high-resolution photos to those dimensions (exciting!).</p>
<p>If you’d like to follow along, you can clone the <code>start-here</code> branch of this repository: <a href='https://github.com/fly-apps/open-pickle-jar' title=''>https://github.com/fly-apps/open-pickle-jar</a>. The final changes are visible on the <code>main</code> branch. This app uses S3 for image storage, so you’ll need to create a bucket called <code>open-pickle-jar</code> and provide <code>AWS_REGION</code>, <code>AWS_ACCESS_KEY_ID</code>, and <code>AWS_SECRET_ACCESS_KEY</code> as environment variables.</p>
<p>This task is really just a stand-in for any HTTP request that kicks off a resource-intensive task. Get the request from the user, delegate it to a more powerful machine, and then return the result to the user. It’s what happens when you can’t open a pickle jar, and you ask for someone to help.</p>
<p>Before we start, let’s define some terms and what they mean on Fly.io:</p>
<ul>
<li><strong class='font-[550] text-navy-950'>Machines:</strong> Extremely fast-booting VMs. They can exist in different regions and even run different processes.
</li><li><strong class='font-[550] text-navy-950'>App:</strong> An abstraction for a group of Machines running your code on Fly.io, along with the configuration, provisioned resources, and data we need to keep track of to run and route to your Machines.
</li><li><strong class='font-[550] text-navy-950'>Process group:</strong> A collection of Machines running a specific process. Many apps only run a single process (typically a public-facing HTTP server), but you can define any number of them.
</li><li><strong class='font-[550] text-navy-950'>fly.toml:</strong> A configuration file for deploying apps on Fly.io where you can set things like Machine specs, process groups, regions, and more.
</li></ul>
<hr>
<p><strong class='font-[550] text-navy-950'>Setup Overview</strong></p>
<p>Here’s what we’ll need for our application:</p>
<ol>
<li>A <strong class='font-[550] text-navy-950'>route</strong> that performs our resource-intensive task
</li><li>A <strong class='font-[550] text-navy-950'>wrapper function</strong> that either:
<ol>
<li>Runs our resource-intensive task OR
</li><li>Forwards the request to our more powerful Machine
</li></ol>
</li><li><strong class='font-[550] text-navy-950'>Two process groups</strong> running the <em>same process</em> but with differing Machine specs:
<ol>
<li>One for accepting HTTP traffic and handling most requests (let’s call it <code>web</code>)
</li><li>One internal-only group for doing the heavy lifting (let’s call it <code>worker</code>)
</li></ol>
</li></ol>
<p>In short, this is what our architecture will look like: a standard web and worker duo.</p>
<p><img alt="(A simple graphic illustrating two servers; a small box containing &quot;npm run start&quot; and a larger box containing the same thing. The small is labeled &quot;web&quot; and the larger box is labeled &quot;worker&quot;.)" src="/images/./web-worker.webp" /></p>
<h3 id='creating-our-route' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#creating-our-route' aria-label='Anchor'></a>Creating our route</h3>
<p>Next.js has two distinct routing patterns: Pages and App router. We’ll use the App router in our example since it’s the preferred method moving forward.</p>
<p>Under your <code>/app</code> directory, create a new folder called <code>/open-pickle-jar</code> containing a <code>route.ts</code>.</p>
<p>(We’re using TypeScript here, but feel free to use normal JavaScript if you prefer!)</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-uhfi6x7v">...
/app
/open-pickle-jar
route.ts
...
</code></pre>
</div>
</div>
<p>Inside <code>route.ts</code> we’ll flesh out our endpoint:</p>
<div class="highlight-wrapper group relative typescript">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-jm1pzg4x"><span class="c1">// /app/open-pickle-jar/route.ts</span>
<span class="k">import</span> <span class="nx">delegateToWorker</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">@/utils/delegateToWorker</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">NextRequest</span><span class="p">,</span> <span class="nx">NextResponse</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">next/server</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">openPickleJar</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">../openPickleJar</span><span class="dl">"</span><span class="p">;</span>
<span class="k">export</span> <span class="k">async</span> <span class="kd">function</span> <span class="nx">POST</span><span class="p">(</span><span class="nx">request</span><span class="p">:</span> <span class="nx">NextRequest</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="p">{</span> <span class="nx">width</span><span class="p">,</span> <span class="nx">height</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">request</span><span class="p">.</span><span class="nx">json</span><span class="p">();</span>
<span class="kd">const</span> <span class="nx">path</span> <span class="o">=</span> <span class="nx">request</span><span class="p">.</span><span class="nx">nextUrl</span><span class="p">.</span><span class="nx">pathname</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">body</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">delegateToWorker</span><span class="p">(</span><span class="nx">path</span><span class="p">,</span> <span class="nx">openPickleJar</span><span class="p">,</span> <span class="p">{</span> <span class="nx">width</span><span class="p">,</span> <span class="nx">height</span> <span class="p">});</span>
<span class="k">return</span> <span class="nx">NextResponse</span><span class="p">.</span><span class="nx">json</span><span class="p">(</span><span class="nx">body</span><span class="p">);</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
<p>The function <code>openPickleJar</code> that we’re importing contains our resource-intensive task, which in this case is extracting images from a <code>.zip</code> file, resizing them all to the new dimensions, and returning the new image URLs.</p>
<p>The <code>POST</code> function is how one defines routes for specific HTTP methods in Next.js, and ours calls a function <code>delegateToWorker</code> that accepts the path of the current endpoint (<code>/open-pickle-jar</code>), our resource-intensive function, and the same request parameters. This function doesn’t yet exist, so let’s build that next!</p>
<h3 id='creating-our-wrapper-function' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#creating-our-wrapper-function' aria-label='Anchor'></a>Creating our wrapper function</h3>
<p>Now that we’ve set up our endpoint, let’s flesh out the wrapper function that delegates our request to a more powerful machine.</p>
<p>We haven’t defined our process groups just yet, but if you recall, the plan is to have two:</p>
<ol>
<li><code>web</code> - Our standard web server
</li><li><code>worker</code> - For opening pickle jars (i.e. doing resource-intensive work). It’s essentially a duplicate of <code>web</code>, but running on beefier Machines.
</li></ol>
<p>Here’s what we want this wrapper function to do:</p>
<ul>
<li>If the current machine is a <code>worker</code>, proceed to execute the resource-intensive task
</li><li>If the current machine is NOT a <code>worker</code>, make a new request to the identical endpoint on a <code>worker</code> Machine
</li></ul>
<p>Inside your <code>/utils</code> directory, create a file called <code>delegateToWorker.ts</code> with the following content:</p>
<div class="highlight-wrapper group relative typescript">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-klx9diiw"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-klx9diiw"><span class="c1">// /utils/delegateToWorker.ts</span>
<span class="k">export</span> <span class="k">default</span> <span class="k">async</span> <span class="kd">function</span> <span class="nx">delegateToWorker</span><span class="p">(</span><span class="nx">path</span><span class="p">:</span> <span class="kr">string</span><span class="p">,</span> <span class="nx">func</span><span class="p">:</span> <span class="p">(...</span><span class="nx">args</span><span class="p">:</span> <span class="kr">any</span><span class="p">[])</span> <span class="o">=></span> <span class="nb">Promise</span><span class="o"><</span><span class="kr">any</span><span class="o">></span><span class="p">,</span> <span class="nx">args</span><span class="p">:</span> <span class="nx">object</span><span class="p">):</span> <span class="nb">Promise</span><span class="o"><</span><span class="kr">any</span><span class="o">></span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">FLY_PROCESS_GROUP</span> <span class="o">===</span> <span class="dl">'</span><span class="s1">worker</span><span class="dl">'</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">running on the worker...</span><span class="dl">'</span><span class="p">);</span>
<span class="k">return</span> <span class="nx">func</span><span class="p">({...</span><span class="nx">args</span><span class="p">});</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">sending new request to worker...</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">workerHost</span> <span class="o">=</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">NODE_ENV</span> <span class="o">===</span> <span class="dl">'</span><span class="s1">development</span><span class="dl">'</span> <span class="p">?</span> <span class="dl">'</span><span class="s1">localhost:3001</span><span class="dl">'</span> <span class="p">:</span> <span class="s2">`worker.process.</span><span class="p">${</span><span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">FLY_APP_NAME</span><span class="p">}</span><span class="s2">.internal:3000`</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">response</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="s2">`http://</span><span class="p">${</span><span class="nx">workerHost</span><span class="p">}${</span><span class="nx">path</span><span class="p">}</span><span class="s2">`</span><span class="p">,</span> <span class="p">{</span>
<span class="na">method</span><span class="p">:</span> <span class="dl">'</span><span class="s1">POST</span><span class="dl">'</span><span class="p">,</span>
<span class="na">headers</span><span class="p">:</span> <span class="p">{</span>
<span class="dl">'</span><span class="s1">Content-Type</span><span class="dl">'</span><span class="p">:</span> <span class="dl">'</span><span class="s1">application/json</span><span class="dl">'</span>
<span class="p">},</span>
<span class="na">body</span><span class="p">:</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">({...</span><span class="nx">args</span> <span class="p">})</span>
<span class="p">});</span>
<span class="k">return</span> <span class="nx">response</span><span class="p">.</span><span class="nx">json</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
<p>In our <code>else</code> section, you’ll notice that while developing locally (aka, when <code>NODE_ENV</code> is <code>development</code>) we define the hostname of our <code>worker</code> process to be <code>localhost:3001</code>. Typically Next.js apps run on port <code>3000</code>, so while testing our app locally, we can have two instances of our process running in different terminal shells:</p>
<ul>
<li><code>npm run dev</code> - This will run on <code>localhost:3000</code> and will act as our local <code>web</code> process
</li><li><code>FLY_PROCESS_GROUP=worker npm run dev</code> - This will run on <code>localhost:3001</code> and will act as our <code>worker</code> process (Next.js should auto-increment the port if the original <code>3000</code> is already in use)
</li></ul>
<p>Also, if you’re wondering about <code>FLY_PROCESS_GROUP</code> and <code>FLY_APP_NAME</code>, these are <a href='https://fly.io/docs/reference/runtime-environment/' title=''>Fly.io-specific runtime environment variables</a> available on all apps.</p>
<h3 id='accessing-our-worker-machines-internal' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#accessing-our-worker-machines-internal' aria-label='Anchor'></a>Accessing our <code>worker</code> Machines (<code>.internal</code>)</h3>
<p>Now, when this code is running in production (aka <code>NODE_ENV</code> is NOT <code>development</code>) you’ll see that we’re using a unique hostname to access our <code>worker</code> Machine.</p>
<p>Apps belonging to the same organization on Fly.io are provided a number of <a href='https://fly.io/docs/networking/private-networking/#fly-io-internal-addresses' title=''>internal addresses</a>. These <code>.internal</code> addresses let you point to different Apps and Machines in your private network. For example:</p>
<ul>
<li><code><region>.<app name>.internal</code> – To reach app instances in a particular region, like <code>gru.my-cool-app.internal</code>
</li><li><code><app instance ID>.<app name>.internal</code> - To reach a <em>specific</em> app instance.
</li><li><code><process group>.process.<app name>.internal</code> - To target app instances belonging to a specific process group. <strong class='font-[550] text-navy-950'>This is what we’re using in our app.</strong>
</li></ul>
<p>Since our <code>worker</code> process group is running the same process as our <code>web</code> process (in our case, <code>npm run start</code>), we’ll also need to make sure we use the same internal port (<code>3000</code>).</p>
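<p>To make the three address shapes above concrete, here is a minimal sketch of a helper that builds them. The name <code>internalHost</code> and the <code>InternalTarget</code> type are hypothetical and not part of the demo app; the default port of <code>3000</code> is assumed to match the internal port used in this post.</p>

```typescript
// Hypothetical helper (not part of the demo app): builds a Fly.io
// .internal host string for each of the three address shapes listed above.
type InternalTarget =
  | { kind: "region"; region: string }
  | { kind: "instance"; instanceId: string }
  | { kind: "processGroup"; group: string };

export function internalHost(
  appName: string,
  target: InternalTarget,
  port: number = 3000 // assumed: the internal port our app listens on
): string {
  switch (target.kind) {
    case "region":
      // <region>.<app name>.internal
      return `${target.region}.${appName}.internal:${port}`;
    case "instance":
      // <app instance ID>.<app name>.internal
      return `${target.instanceId}.${appName}.internal:${port}`;
    case "processGroup":
      // <process group>.process.<app name>.internal
      return `${target.group}.process.${appName}.internal:${port}`;
  }
}
```

<p>Calling <code>internalHost("open-pickle-jar", { kind: "processGroup", group: "worker" })</code> yields the same kind of host string that <code>delegateToWorker</code> assembles inline for production.</p>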
<h3 id='defining-our-process-groups-and-machines' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#defining-our-process-groups-and-machines' aria-label='Anchor'></a>Defining our process groups and Machines</h3>
<p>The last thing to do will be to define our two process groups and their respective Machine specs. We’ll do this by editing our <code>fly.toml</code> configuration.</p>
<p>If you don’t have this file, go ahead and create a blank one and use the content below, but replace the name in <code>app = "open-pickle-jar"</code> with your app’s name, and set your preferred <code>primary_region</code>. If you don’t know what region you’d like to deploy to, <a href='https://fly.io/docs/reference/regions/' title=''>here’s the list of them</a>.</p>
<p><strong class='font-[550] text-navy-950'>Before you deploy:</strong> Note that deploying this example app will spin up <strong class='font-[550] text-navy-950'>billable</strong> machines. Please feel free to alter the Machine (<code>[[vm]]</code>) specs listed here to ones that suit your budget or app’s needs.</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-ot6fx8nh"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ot6fx8nh">app = "open-pickle-jar"
primary_region = "sea"
[build]
[processes]
web = "npm run start"
worker = "npm run start"
[http_service]
internal_port = 3000
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
processes = ["web"]
[[vm]]
cpu_kind = "shared"
cpus = 1
memory_mb = 1024
processes = ["web"]
[[vm]]
size = "performance-4x"
processes = ["worker"]
</code></pre>
</div>
</div>
<p>And that’s it! With our <code>fly.toml</code> finished, we’re ready to deploy our app!</p>
<p><img src="https://slabstatic.com/prod/uploads/p1b436gf/posts/images/tH4GaGLVaDkh3RhIwCpiDRX3.png" /></p>
<h2 id='discussion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#discussion' aria-label='Anchor'></a>Discussion</h2>
<p>Today we built a Machine pattern on top of Fly.io. This pattern allows us to have a lighter request server that can delegate certain tasks to a stronger server, meaning that we can have one Machine do all the heavy lifting that could block everything else while the other handles all the simple tasks for users. That said, this is a fairly naïve implementation, and we can make it much better:</p>
<h3 id='using-a-queue-for-better-resiliency' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#using-a-queue-for-better-resiliency' aria-label='Anchor'></a>Using a queue for better resiliency</h3>
<p>In its current state, our code isn’t very resilient to failed requests. For this reason, you may want to consider keeping track of jobs in a queue with Redis (similar to Sidekiq in Ruby-land). When you have work you want to do, put it in the queue. Your queue worker would have to write the result somewhere (e.g., in Redis) that the application could fetch when it’s ready.</p>
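<p>As a rough illustration of the queue idea, here is a minimal in-memory sketch. It is not how the demo app works: a real setup would keep jobs and results in Redis (or similar) so they survive restarts and can be shared between Machines. Every name here (<code>JobQueue</code>, <code>enqueue</code>, <code>processNext</code>) is hypothetical.</p>

```typescript
// Minimal in-memory job queue sketch. Assumption: a production version would
// store jobs and results in Redis instead of process memory.
type JobStatus = "queued" | "done" | "failed";

interface Job<A, R> {
  id: number;
  args: A;
  status: JobStatus;
  attempts: number;
  result?: R;
  error?: string; // message from the most recent failed attempt, if any
}

export class JobQueue<A, R> {
  private jobs = new Map<number, Job<A, R>>();
  private pending: number[] = [];
  private nextId = 1;
  private handler: (args: A) => Promise<R>;
  private maxAttempts: number;

  constructor(handler: (args: A) => Promise<R>, maxAttempts: number = 3) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  // Web side: record the work and hand back an ID the caller can poll with.
  enqueue(args: A): number {
    const id = this.nextId++;
    this.jobs.set(id, { id, args, status: "queued", attempts: 0 });
    this.pending.push(id);
    return id;
  }

  // Worker side: run the next job. A failed job is put back on the queue
  // until it has used up maxAttempts, then it is marked as failed.
  async processNext(): Promise<boolean> {
    const id = this.pending.shift();
    if (id === undefined) return false; // nothing to do
    const job = this.jobs.get(id)!;
    job.attempts += 1;
    try {
      job.result = await this.handler(job.args);
      job.status = "done";
    } catch (err) {
      job.error = err instanceof Error ? err.message : String(err);
      if (job.attempts < this.maxAttempts) {
        this.pending.push(id);
      } else {
        job.status = "failed";
      }
    }
    return true;
  }

  // Web side: look up a job to see whether its result is ready.
  get(id: number): Job<A, R> | undefined {
    return this.jobs.get(id);
  }
}
```

<p>With something like this, the <code>web</code> process would call <code>enqueue</code> and return the job ID right away, while the <code>worker</code> calls <code>processNext</code> in a loop and the client polls for the result.</p>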
<h3 id='starting-stopping-worker-machines' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#starting-stopping-worker-machines' aria-label='Anchor'></a>Starting/stopping worker Machines</h3>
<p>The benefit of this pattern is that you can limit how many “beefy” Machines you need to have available at any given time. Our demo app doesn’t dictate how many <code>worker</code> Machines to have at any given time, but by adding timeouts you could elect to start and stop them as needed.</p>
<p>Now, you may think that constantly starting and stopping Machines might incur higher response times, but note that we are NOT talking about creating/destroying Machines. Starting and stopping Machines only takes as long as it takes to start your web server (i.e. <code>npm run start</code>). The best part is that <strong class='font-[550] text-navy-950'>Fly.io does not charge for the CPU and RAM usage of stopped Machines.</strong> <a href='https://community.fly.io/t/we-are-going-to-start-collecting-charges-for-stopped-machines-rootfs-starting-april-25th/17825' title=''>We will charge for storage of their root filesystems on disk, starting April 25th, 2024</a>. Stopped Machines will still be much cheaper than running ones.</p>
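<p>One way to approach the timeout idea is to have each <code>worker</code> track how long it has been idle. The sketch below only covers that bookkeeping; <code>IdleTracker</code> is a hypothetical name, and what you do once <code>shouldStop()</code> returns <code>true</code> (for example, shutting the server down so the Machine can stop) is left as an assumption about your setup.</p>

```typescript
// Hypothetical idle tracker: reports when a worker has had no work for a
// given window. The clock is injectable so the logic is easy to test.
export class IdleTracker {
  private lastActivity: number;
  private inFlight = 0;
  private idleMs: number;
  private now: () => number;

  constructor(idleMs: number, now: () => number = Date.now) {
    this.idleMs = idleMs;
    this.now = now;
    this.lastActivity = now();
  }

  // Wrap each unit of work so running jobs always count as activity.
  async track<T>(work: () => Promise<T>): Promise<T> {
    this.inFlight += 1;
    try {
      return await work();
    } finally {
      this.inFlight -= 1;
      this.lastActivity = this.now();
    }
  }

  // True only when nothing is running and the idle window has fully elapsed.
  shouldStop(): boolean {
    return this.inFlight === 0 && this.now() - this.lastActivity >= this.idleMs;
  }
}
```

<p>A worker could wrap its call to the resource-intensive function in <code>track()</code> and check <code>shouldStop()</code> on an interval.</p>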
<h3 id='what-about-serverless-functions' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-about-serverless-functions' aria-label='Anchor'></a>What about serverless functions?</h3>
<p>This “delegate to a beefy machine” pattern is similar to serverless functions with platforms like AWS Lambda. The main difference is that serverless functions usually require you to segment your application into a bunch of small pieces, whereas the method discussed today just uses the app framework that you deploy to production. Each pattern has its own benefits and downsides.</p>
<h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>The pattern outlined here is one more tool in your arsenal for scaling applications. By utilizing Fly.io’s private network and <code>.internal</code> domains, it’s quick and easy to pass work between different processes that run our app. If you’d like to learn about more methods for scaling tasks in your applications, check out <a href='https://fly.io/blog/rethinking-serverless-with-flame/' title=''>Rethinking Serverless with FLAME</a> by Chris McCord and <a href='https://fly.io/blog/print-on-demand/' title=''>Print on Demand</a> by Sam Ruby.</p>
<figure class="post-cta">
<figcaption>
<h1>Get more done on Fly.io</h1>
<p>Fly.io has fast booting machines at the ready for your dynamic workloads. It’s easy to get started. You can be off and running in minutes.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/speedrun/">
Deploy something today! <span class='opacity:50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-rabbit.webp" srcset="/static/images/cta-rabbit@2x.webp 2x" alt="">
</div>
</figure>
/blog/delegate-tasks-to-fly-machines/assets/delegate-tasks-to-fly-machines-thumb.webpMacaroons Escalated Quicklyhttps://fly.io/blog/macaroons-escalated-quickly/2024-01-31T00:00:00+00:002024-02-05T15:30:06+00:00<div class="lead"><p>We’re Fly.io and we transmute containers into VMs, running them on our hardware around the world. We built a new security token system, and can I tell you the good news about our lord and savior the Macaroon?</p>
</div><h2 id='1' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#1' aria-label='Anchor'></a>1</h2>
<p>Let’s implement an API token together. It’s a design called “Macaroons”, but don’t get hung up on that yet.</p>
<p>First some <button toggle="#includes">throat-clearing</button>. Then:</p>
<div id="includes" toggle-content="" aria-label="show very boring code"><div class="highlight-wrapper group relative python">
<button type="button" class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none" data-wrap-target="#code-s8lzl2g5">
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959"></path><path d="M11.081 6.466L9.533 8.037l1.548 1.571"></path></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button type="button" class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none" data-copy-target="sibling">
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z"></path><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617"></path></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class="highlight relative group">
<pre class="highlight "><code id="code-s8lzl2g5"><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">hmac</span> <span class="k">as</span> <span class="n">hm</span>
<span class="kn">from</span> <span class="nn">base64</span> <span class="kn">import</span> <span class="n">b64encode</span><span class="p">,</span> <span class="n">b64decode</span>
<span class="kn">from</span> <span class="nn">hashlib</span> <span class="kn">import</span> <span class="n">sha256</span>
<span class="k">def</span> <span class="nf">hmac</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="p">):</span> <span class="k">return</span> <span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="n">sha256</span><span class="p">).</span><span class="n">digest</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">enc</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="k">return</span> <span class="n">b64encode</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">dec</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="k">return</span> <span class="n">b64decode</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
</code></pre>
</div>
</div></div><div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-dsebu6g2"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-dsebu6g2"><span class="k">def</span> <span class="nf">blank_token</span><span class="p">(</span><span class="n">uid</span><span class="p">,</span> <span class="n">key</span><span class="p">):</span>
<span class="n">nonce</span> <span class="o">=</span> <span class="n">enc</span><span class="p">(</span><span class="s">":"</span><span class="p">.</span><span class="n">join</span><span class="p">([</span><span class="nb">str</span><span class="p">(</span><span class="n">uid</span><span class="p">),</span> <span class="n">os</span><span class="p">.</span><span class="n">urandom</span><span class="p">(</span><span class="mi">16</span><span class="p">)]))</span>
<span class="k">return</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">([</span><span class="n">nonce</span><span class="p">,</span> <span class="n">enc</span><span class="p">(</span><span class="n">hmac</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">))])</span>
</code></pre>
</div>
</div><div class="right-sidenote"><p>Bearer tokens: like cookies, blobs you attach to a request (usually in an HTTP header).</p>
</div>
<p>We’re going to build a minimally-stateful bearer token, a blob signed with HMAC. Nothing fancy so far. <a href='https://api.rubyonrails.org/classes/ActiveSupport/MessageVerifier.html' title=''>Rails has done this</a> for a decade and a half.</p>
<p>There’s a <a href='http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/' title=''>fashion in API security for stateless tokens</a>, which encode all the data you’d need to check any request accompanied by that token – without a database lookup. Stateless tokens have some nice properties, and some less-nice. Our tokens won’t be stateless: they carry a user ID, with which we’ll look up the HMAC key to verify it. But they’ll stake out a sort of middle ground.</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ljonyhji"><span class="k">def</span> <span class="nf">attenuate</span><span class="p">(</span><span class="n">macStr</span><span class="p">,</span> <span class="n">cav</span><span class="p">):</span>
    <span class="n">mac</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">macStr</span><span class="p">)</span>
    <span class="n">cavStr</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">cav</span><span class="p">)</span>
    <span class="n">oldTail</span> <span class="o">=</span> <span class="n">dec</span><span class="p">(</span><span class="n">mac</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
    <span class="n">newTail</span> <span class="o">=</span> <span class="n">enc</span><span class="p">(</span><span class="n">hmac</span><span class="p">(</span><span class="n">oldTail</span><span class="p">,</span> <span class="n">cavStr</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">mac</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="p">[</span><span class="n">cavStr</span><span class="p">,</span> <span class="n">newTail</span><span class="p">])</span>
<span class="n">m0</span> <span class="o">=</span> <span class="n">blank_token</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">keys</span><span class="p">[</span><span class="mi">10</span><span class="p">])</span>
<span class="n">m1</span> <span class="o">=</span> <span class="n">attenuate</span><span class="p">(</span><span class="n">m0</span><span class="p">,</span> <span class="p">{</span><span class="s">'path'</span><span class="p">:</span> <span class="s">'/images'</span><span class="p">})</span>
<span class="n">m2</span> <span class="o">=</span> <span class="n">attenuate</span><span class="p">(</span><span class="n">m1</span><span class="p">,</span> <span class="p">{</span><span class="s">'op'</span><span class="p">:</span> <span class="s">'read'</span><span class="p">})</span>
</code></pre>
</div>
</div>
<p>Let’s add some stuff.</p>
<p>The meat of our tokens will be a series of claims we call “caveats”. We call them that because each claim restricts further what the token authorizes. After <code>{'path': '/images'}</code>, this token only allows operations that happen underneath the <code>/images</code> directory. Then, after <code>{'op': 'read'}</code>, it allows only reads, not writes.</p>
<p>(I guess we’re building a file sharing system. Whatever.)</p>
<p>Some important things about this design. First: by implication from the fact that caveats further restrict tokens, a token with no caveats restricts nothing. It’s a god-mode token. Don’t honor it.</p>
<div class="right-sidenote"><p>In other words: the ordering of caveats doesn’t matter.</p>
</div>
<p>Second: the rule for checking caveats is very simple: every single caveat must pass, evaluating <code>True</code> against the request that carries it, in isolation and without reference to any other caveat. If any caveat evaluates <code>False</code>, the request fails. In that way, we ensure that adding caveats to a token can only ever weaken it.</p>
<p>With that in mind, take a closer look at this code:</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-hi17cput"><span class="n">oldTail</span> <span class="o">=</span> <span class="n">dec</span><span class="p">(</span><span class="n">mac</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
<span class="n">newTail</span> <span class="o">=</span> <span class="n">enc</span><span class="p">(</span><span class="n">hmac</span><span class="p">(</span><span class="n">oldTail</span><span class="p">,</span> <span class="n">cavStr</span><span class="p">))</span>
</code></pre>
</div>
</div>
<p>Every caveat is HMAC-signed independently, which is weird. Weirder still, the key for that HMAC is the output of the last HMAC. The caveats chain together, and the HMAC of the last caveat becomes the “tail” of the token.</p>
<p>Creating a new blank token for a particular user requires a key that the server (and probably only the server) knows. But adding a caveat doesn’t! Anybody can add a caveat. In our design, you, the user, can edit your own API token.</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-wem8i4dg"><span class="k">def</span> <span class="nf">verify</span><span class="p">(</span><span class="n">macStr</span><span class="p">,</span> <span class="n">keys</span><span class="p">):</span>
    <span class="n">mac</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">macStr</span><span class="p">)</span>
    <span class="n">nonce</span> <span class="o">=</span> <span class="n">dec</span><span class="p">(</span><span class="n">mac</span><span class="p">[</span><span class="mi">0</span><span class="p">]).</span><span class="n">split</span><span class="p">(</span><span class="s">":"</span><span class="p">)</span>
    <span class="n">key</span> <span class="o">=</span> <span class="n">keys</span><span class="p">[</span><span class="nb">int</span><span class="p">(</span><span class="n">nonce</span><span class="p">[</span><span class="mi">0</span><span class="p">])]</span>
    <span class="n">tail</span> <span class="o">=</span> <span class="s">""</span>
    <span class="k">for</span> <span class="n">cav</span> <span class="ow">in</span> <span class="n">mac</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">]:</span>
        <span class="n">tail</span> <span class="o">=</span> <span class="n">hmac</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">cav</span><span class="p">)</span>
        <span class="n">key</span> <span class="o">=</span> <span class="n">tail</span>
    <span class="k">return</span> <span class="n">hm</span><span class="p">.</span><span class="n">compare_digest</span><span class="p">(</span><span class="n">tail</span><span class="p">,</span> <span class="n">dec</span><span class="p">(</span><span class="n">mac</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]))</span>
<span class="n">verify</span><span class="p">(</span><span class="n">m2</span><span class="p">,</span> <span class="n">keys</span><span class="p">)</span> <span class="c1"># => True
</span></code></pre>
</div>
</div>
<p>For completeness, and to make a point, there’s the verification code. Look up the original secret key from the user ID, and then it’s chained HMAC all the way down. The point I’m making is that Macaroons are very simple.</p>
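<p>To see the chain end to end, here is a self-contained Python 3 sketch of the same idea. The helper names <code>enc</code>, <code>dec</code>, <code>hmac</code>, and <code>blank_token</code> mirror the snippets above, but their bodies are assumptions: HMAC-SHA256, URL-safe base64, and a <code>uid:random</code> nonce are illustrative guesses, not necessarily what the original helpers do.</p>

```python
import base64
import hashlib
import hmac as hm
import json
import os


def enc(raw):
    """Bytes -> URL-safe base64 text, so binary values fit in JSON."""
    return base64.urlsafe_b64encode(raw).decode("ascii")


def dec(text):
    """Inverse of enc()."""
    return base64.urlsafe_b64decode(text.encode("ascii"))


def hmac(key, msg):
    """HMAC-SHA256 over a str message (the hash choice is an assumption)."""
    return hm.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def blank_token(user_id, key):
    """Hypothetical issuer: [nonce, tail], where the nonce is 'uid:random'."""
    nonce = enc(("%d:%s" % (user_id, os.urandom(8).hex())).encode("ascii"))
    return json.dumps([nonce, enc(hmac(key, nonce))])


def attenuate(mac_str, cav):
    """Append a caveat; needs only the current tail, not the root key."""
    mac = json.loads(mac_str)
    cav_str = json.dumps(cav)
    new_tail = enc(hmac(dec(mac[-1]), cav_str))
    return json.dumps(mac[:-1] + [cav_str, new_tail])


def verify(mac_str, keys):
    """Replay the HMAC chain from the user's root key and compare tails."""
    mac = json.loads(mac_str)
    user_id = int(dec(mac[0]).decode("ascii").split(":")[0])
    key = keys[user_id]
    for item in mac[:-1]:
        key = hmac(key, item)
    return hm.compare_digest(key, dec(mac[-1]))


keys = {10: os.urandom(32)}
m0 = blank_token(10, keys[10])
m1 = attenuate(m0, {"path": "/images"})
m2 = attenuate(m1, {"op": "read"})

# Try to strip the last caveat while keeping the old tail.
parts = json.loads(m2)
stripped = json.dumps(parts[:-2] + [parts[-1]])

print(verify(m2, keys))        # True
print(verify(stripped, keys))  # False
```

<p>The stripped token fails because its tail was keyed by a caveat that is no longer in the list. Adding a caveat only needs the current tail; removing one would need the previous tail, which the holder never sees.</p>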
<h2 id='2' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#2' aria-label='Anchor'></a>2</h2>
<p>Back in 2014, Google published <a href='https://storage.googleapis.com/pub-tools-public-publication-data/pdf/41892.pdf' title=''>a paper at NDSS</a> introducing “Macaroons”, a new kind of cookie. Since then, they’ve become a sort of hipster shibboleth. But they’re more talked about than implemented, which is a nice way to say that practically nobody uses them.</p>
<p>Until now! I dragged Fly.io into implementing them. Suckers!</p>
<p>We had a problem: our API tokens were much too powerful. We needed to scope them down and let them express roles, and I scoped up that project to replace OAuth2 tokens altogether. We now have what I think is one of the more expansive Macaroon implementations on the Internet.</p>
<p>I dragged us into using Macaroons because I wanted us to use a hipster token format. Google designed Macaroons for a bigger reason: they hoped to replace browser cookies with something much more powerful.</p>
<p>The problem with simple bearer tokens, like browser cookies or JWTs, is that they’re prone to being stolen and replayed by attackers.</p>
<div class="right-sidenote"><p>game-over: pentest jargon for “very bad”</p>
</div>
<p>Worse, a stolen token is usually a game-over condition. In most schemes, a bearer token is an all-access pass for the associated user. For some applications this isn’t that big a deal, but then, <a href='https://neilmadden.blog/2020/09/09/macaroon-access-tokens-for-oauth-part-2-transactional-auth/' title=''>think about banking</a>. A banking app token that authorizes arbitrary transactions is a recipe for having a small heart attack on every HTTP request.</p>
<div class="right-sidenote"><p>(Perfectly minimized API tokens: a software security holy grail)</p>
</div>
<p>Macaroons are user-editable tokens that enable JIT-generated least-privilege tokens. With minimal ceremony and no additional API requests, a banking app Macaroon lets you authorize a request with a caveat like, I don’t know, <code>{'maxAmount': '$5'}</code>. I mean, something way better than that, probably lots of caveats, not just one, but you get the idea: a token so minimized you feel safe sending it with your request. Ideally, a token that only authorizes that single, intended request.</p>
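<p>As a rough illustration of that just-in-time minimization, the sketch below has a client shrink a long-lived token right before sending one request. Everything here is hypothetical: the caveat names (<code>op</code>, <code>payee</code>, <code>maxAmountCents</code>, <code>notAfter</code>), the <code>minimize_for_request</code> helper, and the stand-in token are invented for the example, and HMAC-SHA256 is an assumed hash choice.</p>

```python
import base64
import hashlib
import hmac as hm
import json
import time


def attenuate(mac_str, cav):
    """Chain one more caveat onto a token shaped like [nonce, cav..., tail]."""
    mac = json.loads(mac_str)
    cav_str = json.dumps(cav, sort_keys=True)
    old_tail = base64.urlsafe_b64decode(mac[-1])
    new_tail = hm.new(old_tail, cav_str.encode("utf-8"), hashlib.sha256).digest()
    return json.dumps(
        mac[:-1] + [cav_str, base64.urlsafe_b64encode(new_tail).decode("ascii")]
    )


def minimize_for_request(mac_str, payee, max_cents, ttl_seconds, now=None):
    """Return a copy of the token that is only good for one small transfer."""
    now = int(time.time()) if now is None else now
    for cav in (
        {"op": "transfer"},
        {"payee": payee},
        {"maxAmountCents": max_cents},
        {"notAfter": now + ttl_seconds},
    ):
        mac_str = attenuate(mac_str, cav)
    return mac_str


# A stand-in for the long-lived token the app already holds.
root_tail = hm.new(b"server-side secret", b"nonce", hashlib.sha256).digest()
long_lived = json.dumps(
    ["nonce", base64.urlsafe_b64encode(root_tail).decode("ascii")]
)

# No round trip to the server: the client derives the narrow token locally.
wire_token = minimize_for_request(long_lived, "acct-42", 500, 30, now=1700000000)
print(json.loads(wire_token)[1:-1])
```

<p>Only <code>wire_token</code> goes over the wire. If it leaks, the thief gets a token good for one small transfer to one payee for half a minute, assuming the server actually enforces each of those caveats.</p>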
<h2 id='3' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#3' aria-label='Anchor'></a>3</h2>
<p>That’s not why we like Macaroons. We already assume our tokens aren’t being stolen.</p>
<p>In most systems, the developers come up with a permissions system, and you’re stuck with it. We run a public cloud platform, and people want a lot of different things from our permissions. The dream is, we (the low-level platform developers on the team) design a single permission system, one time, and go about our jobs never thinking about this problem again.</p>
<p>Instead of thinking of all of our “roles” in advance, we just model our platform with caveats:</p>
<ol>
<li>Users belong to <code>Organizations</code>.
</li><li><code>Organizations</code> own <code>Apps</code>.
</li><li><code>Apps</code> contain <code>Machines</code> and <code>Volumes</code>.
</li><li>To any of these things, you can <code>Read</code>, <code>Write</code>, <code>Create</code>, <code>Delete</code>, and/or <code>Control</code> <aside class="right-sidenote">control being change of state, like “start” and “stop”</aside>.
</li><li>Some administrivia, like expiration (<code>ValidityWindow</code>), locking tokens to specific Fly Machines (<code>FromMachineSource</code>), and escape hatches like <code>Mutation</code> (for our GraphQL API).
</li></ol>
<div class="right-sidenote"><p>(this is a vibes-based notation, don’t think too hard about it)</p>
</div>
<p>Simplistic. But it expresses admin tokens:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-1id1gktu">Organization 4721, mask=*
</code></pre>
</div>
</div>
<p>And it expresses normal user tokens:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-4kcpqcjx">Organization 4721, mask=read,write,control
(App 123, mask=control), (App 345, mask=read, write, control)
</code></pre>
</div>
</div>
<p>And also an auditor-only token for that user:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-k01yr13i">Organization 4721, mask=read,write,control
(App 123, mask=control), (App 345, mask=read, write, control)
Organization 4721, mask=read
</code></pre>
</div>
</div><div class="right-sidenote"><p>(our deploy tokens are more complicated than this)</p>
</div>
<p>Or a deployment-only token, for a CI/CD system:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-wkj567p0">Organization 4721, mask=write,control
(App 123, mask=*)
</code></pre>
</div>
</div>
<p>Those are just the roles we came up with. Users can invent others. The important thing is that they don’t have to bother me about them.</p>
<h2 id='4' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#4' aria-label='Anchor'></a>4</h2>
<p>Astute readers will have noticed by now that we haven’t shown any code that actually evaluates a caveat. That’s because it’s boring, and I’m too lazy to write it out. Got an <code>Organization</code> token for <code>image-hosting</code> that allows <code>Reads</code>? Ok; check and make sure the incoming request is for an asset of <code>image-hosting</code>, and that it’s a <code>Read</code>. Whatever code you came up with, it’d be fine.</p>
<p>These straightforward restrictions are called “first party caveats”. The first party is us, the platform. We’ve got all the information we need to check them.</p>
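<p>For anyone who would rather see it than imagine it, here is one way a first-party check could look. This is a sketch, not Fly.io’s actual code: the request shape (<code>org</code>, <code>app</code>, <code>action</code>) and the helpers <code>mask</code>, <code>org_caveat</code>, <code>apps_caveat</code>, and <code>authorize</code> are all invented here. It models the user and auditor tokens from the previous section and applies the rule that every caveat must pass on its own.</p>

```python
ALL_ACTIONS = frozenset({"read", "write", "create", "delete", "control"})


def mask(spec):
    """Expand 'read,write' or '*' into a set of action names."""
    if spec == "*":
        return ALL_ACTIONS
    return frozenset(part.strip() for part in spec.split(","))


def org_caveat(org_id, spec):
    """Pass only requests inside org_id whose action is in the mask."""
    allowed = mask(spec)
    return lambda req: req["org"] == org_id and req["action"] in allowed


def apps_caveat(app_specs):
    """Pass only requests that hit a listed app with a listed action."""
    allowed = {app_id: mask(spec) for app_id, spec in app_specs.items()}
    return lambda req: req["action"] in allowed.get(req.get("app"), frozenset())


def authorize(caveats, req):
    """A caveat-free token is god-mode, so refuse it; otherwise all must pass."""
    return bool(caveats) and all(check(req) for check in caveats)


user = [
    org_caveat(4721, "read,write,control"),
    apps_caveat({123: "control", 345: "read, write, control"}),
]
# The auditor token is the user token plus one more restriction.
auditor = user + [org_caveat(4721, "read")]

restart_123 = {"org": 4721, "app": 123, "action": "control"}
read_345 = {"org": 4721, "app": 345, "action": "read"}
write_345 = {"org": 4721, "app": 345, "action": "write"}

print(authorize(user, restart_123))     # True
print(authorize(user, write_345))       # True
print(authorize(auditor, read_345))     # True
print(authorize(auditor, write_345))    # False
print(authorize(auditor, restart_123))  # False
print(authorize([], read_345))          # False
```

<p>Each check looks only at the request, never at its sibling caveats, so appending <code>Organization 4721, mask=read</code> can take permissions away but can’t hand any back.</p>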
<p>Let’s kit out our token format some more.</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-xzlmx9lw"><span class="k">def</span> <span class="nf">third_party_caveat</span><span class="p">(</span><span class="n">ka</span><span class="p">,</span> <span class="n">tail</span><span class="p">,</span> <span class="n">msg</span><span class="p">,</span> <span class="n">url</span><span class="p">):</span>
    <span class="n">crk</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">urandom</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span>
    <span class="n">ticket</span> <span class="o">=</span> <span class="n">enc</span><span class="p">(</span><span class="n">encrypt</span><span class="p">(</span><span class="n">ka</span><span class="p">,</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">({</span>
        <span class="s">'crk'</span><span class="p">:</span> <span class="n">enc</span><span class="p">(</span><span class="n">crk</span><span class="p">),</span>
        <span class="s">'msg'</span><span class="p">:</span> <span class="n">msg</span>
    <span class="p">})))</span>
    <span class="n">challenge</span> <span class="o">=</span> <span class="n">enc</span><span class="p">(</span><span class="n">encrypt</span><span class="p">(</span><span class="n">tail</span><span class="p">,</span> <span class="n">crk</span><span class="p">))</span>
    <span class="k">return</span> <span class="p">{</span> <span class="s">'url'</span><span class="p">:</span> <span class="n">url</span><span class="p">,</span> <span class="s">'ticket'</span><span class="p">:</span> <span class="n">ticket</span><span class="p">,</span> <span class="s">'challenge'</span> <span class="p">:</span> <span class="n">challenge</span> <span class="p">}</span>
<span class="n">key</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">(</span><span class="s">"YELLOW SUBMARINE"</span><span class="p">)</span>
<span class="n">url</span> <span class="o">=</span> <span class="s">"https://canary.service"</span>
<span class="n">c3</span> <span class="o">=</span> <span class="n">third_party_caveat</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">tail</span><span class="p">,</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">({</span><span class="s">'user'</span><span class="p">:</span> <span class="s">'bobson.dugnutt'</span><span class="p">}),</span> <span class="n">url</span><span class="p">)</span>
<span class="n">m3</span> <span class="o">=</span> <span class="n">attenuate</span><span class="p">(</span><span class="n">m2</span><span class="p">,</span> <span class="n">c3</span><span class="p">)</span>
</code></pre>
</div>
</div>
<p>Up till now, we’ve gotten by with nothing but HMAC, which is one of the great charms of the design. Now we need to encrypt. There’s no authenticated encryption in the Python standard library, but that won’t stop us. <button toggle="#hmac-ctr">Ready to make some candy? Hand me that brake fluid!</button></p>
<div id="hmac-ctr" toggle-content="" aria-label="show very silly code"><div class="highlight-wrapper group relative python">
<div class="highlight relative group">
<pre class="highlight "><code id="code-qcukp3s6"><span class="c1"># do i really need to say that i'm not serious about this?
</span>
<span class="k">def</span> <span class="nf">hmactr</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">n</span><span class="p">):</span>
    <span class="n">ks</span> <span class="o">=</span> <span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">k</span><span class="o">+</span><span class="n">n</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">counter</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">maxint</span><span class="p">):</span>
        <span class="n">ks</span><span class="p">.</span><span class="n">update</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">counter</span><span class="p">))</span>
        <span class="n">kbs</span> <span class="o">=</span> <span class="n">ks</span><span class="p">.</span><span class="n">digest</span><span class="p">()</span>
        <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="mi">16</span><span class="p">):</span> <span class="k">yield</span> <span class="n">kbs</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
<span class="k">def</span> <span class="nf">encrypt</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">buf</span><span class="p">):</span>
    <span class="n">ak</span> <span class="o">=</span> <span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="s">'auth'</span><span class="p">).</span><span class="n">digest</span><span class="p">()</span>
    <span class="n">nonce</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">urandom</span><span class="p">(</span><span class="mi">16</span><span class="p">)</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">hmactr</span><span class="p">(</span><span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="s">'enc'</span><span class="p">).</span><span class="n">digest</span><span class="p">(),</span> <span class="n">nonce</span><span class="p">)</span>
    <span class="n">ctxt</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">buf</span><span class="p">)):</span>
        <span class="n">ctxt</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">^=</span> <span class="nb">ord</span><span class="p">(</span><span class="n">cipher</span><span class="p">.</span><span class="nb">next</span><span class="p">())</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">nonce</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">ctxt</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">res</span> <span class="o">+</span> <span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">ak</span><span class="p">,</span> <span class="n">res</span><span class="p">).</span><span class="n">digest</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">decrypt</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">buf</span><span class="p">):</span>
    <span class="n">ak</span> <span class="o">=</span> <span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="s">'auth'</span><span class="p">).</span><span class="n">digest</span><span class="p">()</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">hm</span><span class="p">.</span><span class="n">compare_digest</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="o">-</span><span class="mi">16</span><span class="p">:],</span> <span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">ak</span><span class="p">,</span> <span class="n">buf</span><span class="p">[:</span><span class="o">-</span><span class="mi">16</span><span class="p">]).</span><span class="n">digest</span><span class="p">()):</span>
        <span class="k">return</span> <span class="bp">False</span>
    <span class="n">nonce</span> <span class="o">=</span> <span class="n">buf</span><span class="p">[:</span><span class="mi">16</span><span class="p">]</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">hmactr</span><span class="p">(</span><span class="n">hm</span><span class="p">.</span><span class="n">new</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="s">'enc'</span><span class="p">).</span><span class="n">digest</span><span class="p">(),</span> <span class="n">nonce</span><span class="p">)</span>
    <span class="n">ptxt</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="mi">16</span><span class="p">:</span><span class="o">-</span><span class="mi">16</span><span class="p">])</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="mi">16</span><span class="p">:</span><span class="o">-</span><span class="mi">16</span><span class="p">])):</span>
        <span class="n">ptxt</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">^=</span> <span class="nb">ord</span><span class="p">(</span><span class="n">cipher</span><span class="p">.</span><span class="nb">next</span><span class="p">())</span>
    <span class="k">return</span> <span class="nb">str</span><span class="p">(</span><span class="n">ptxt</span><span class="p">)</span>
</code></pre>
</div>
</div></div>
<p>With “third-party” caveats comes a cast of characters. We’re still the first party. You’ll play the second party. The third party is any other system in the world that you trust: an SSO system, an audit log, a revocation checker, whatever.</p>
<p>Here’s the trick of the third-party caveat: our platform doesn’t know what your caveat means, and it doesn’t have to. Instead, when you see a third-party caveat in your token, you tear a ticket off it and exchange it for a “discharge Macaroon” with that third party. You submit both Macaroons together to us.</p>
<p>Let’s attenuate our token with a third-party caveat hooking it up to a “canary” service that generates a notice approximately any time the token is used.</p>
<p><img src="/images/third-party.png?1/2&wrap-left" /></p>
<p>To build that canary caveat, you first make a <code>ticket</code> that users of the token will hand to your canary, and then a <code>challenge</code> that Fly.io will use to verify discharges your checker spits out. The ticket and the challenge are both encrypted. The ticket is encrypted under <code>KA</code>, so your service can read it. The challenge is encrypted under the previous Macaroon tail, so only Fly.io can read it. Both hide yet another key, the random HMAC key <code>CRK</code> (“caveat root key”).</p>
<p>In addition to <code>CRK</code>, the ticket contains a message, which says whatever you want it to; Fly.io doesn’t care. Typically, the message describes some kind of additional checking you want your service to perform before spitting out a discharge token.</p>
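<p>For the curious, here is roughly what building that caveat could look like. This is a Python 3 sketch, not our production code: <code>seal</code> and <code>unseal</code> are hypothetical stand-ins for the toy <code>encrypt</code> and <code>decrypt</code> above (same HMAC-in-counter-mode joke, switched to SHA-256), base64 is an assumed encoding, and <code>third_party_caveat</code> is a made-up name. The field names <code>ticket</code>, <code>challenge</code>, <code>crk</code>, and <code>msg</code> follow the snippets in this post.</p>

```python
import hashlib
import hmac
import json
import os
from base64 import b64decode, b64encode
from typing import Optional


def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # HMAC in counter mode: a toy stream cipher, still not a serious design.
    out = b""
    counter = 0
    while len(out) < length:
        out += _mac(key, nonce + counter.to_bytes(8, "big"))
        counter += 1
    return out[:length]


def seal(key: bytes, ptxt: bytes) -> bytes:
    """Encrypt-then-MAC. Output layout: nonce || ciphertext || tag."""
    nonce = os.urandom(16)
    stream = _keystream(_mac(key, b"enc"), nonce, len(ptxt))
    body = nonce + bytes(a ^ b for a, b in zip(ptxt, stream))
    return body + _mac(_mac(key, b"auth"), body)


def unseal(key: bytes, buf: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the tag does not check out."""
    if len(buf) < 48:  # 16-byte nonce plus 32-byte tag
        return None
    body, tag = buf[:-32], buf[-32:]
    if not hmac.compare_digest(tag, _mac(_mac(key, b"auth"), body)):
        return None
    nonce, ctxt = body[:16], body[16:]
    stream = _keystream(_mac(key, b"enc"), nonce, len(ctxt))
    return bytes(a ^ b for a, b in zip(ctxt, stream))


def third_party_caveat(ka: bytes, tail: bytes, msg: str) -> dict:
    """Build a caveat: a ticket for the third party, a challenge for the issuer."""
    crk = os.urandom(32)  # caveat root key, fresh for every caveat
    ticket_body = json.dumps({"crk": b64encode(crk).decode(), "msg": msg})
    return {
        # Readable by whoever holds KA (your canary service).
        "ticket": b64encode(seal(ka, ticket_body.encode())).decode(),
        # Readable by whoever can recompute the previous tail (the issuer).
        "challenge": b64encode(seal(tail, crk)).decode(),
    }
```

<p>Both halves carry the same <code>CRK</code>, which is the whole point: the third party signs a discharge with it, and the issuer recovers it from the challenge to check that signature, without the two ever talking to each other.</p>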
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-ouzywxtv"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ouzywxtv"><span class="k">def</span> <span class="nf">discharge</span><span class="p">(</span><span class="n">ka</span><span class="p">,</span> <span class="n">ticket</span><span class="p">):</span>
    <span class="n">ptxt</span> <span class="o">=</span> <span class="n">decrypt</span><span class="p">(</span><span class="n">ka</span><span class="p">,</span> <span class="n">dec</span><span class="p">(</span><span class="n">ticket</span><span class="p">))</span>
    <span class="k">if</span> <span class="n">ptxt</span> <span class="o">==</span> <span class="bp">False</span><span class="p">:</span> <span class="k">return</span> <span class="bp">False</span>
    <span class="n">tbody</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">ptxt</span><span class="p">)</span>
    <span class="c1"># not shown: do something with tbody['msg']
</span>    <span class="k">return</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">([</span><span class="n">ticket</span><span class="p">,</span> <span class="n">enc</span><span class="p">(</span><span class="n">hmac</span><span class="p">(</span><span class="n">dec</span><span class="p">(</span><span class="n">tbody</span><span class="p">[</span><span class="s">'crk'</span><span class="p">]),</span> <span class="n">ticket</span><span class="p">))])</span>
</code></pre>
</div>
</div>
<p>To authorize a request with a token that includes a third-party caveat for the canary service, you need to get your hands on a corresponding discharge Macaroon. Normally, you do that by <code>POST</code>ing the ticket from the caveat to the service.</p>
<p>Discharging is simple. The service, which holds <code>KA</code>, uses it to decrypt the ticket. It checks the message and makes some decisions. Finally, it mints a new Macaroon, using <code>CRK</code>, recovered from the ticket, as the root key. The ticket itself is the nonce.</p>
<p>If it wants, the third-party service can slap on a bunch of first-party caveats of its own. When we verify the Macaroon, we’ll copy those caveats out and enforce them. Attenuating a third-party discharge Macaroon works just like attenuating a normal Macaroon.</p>
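<p>A discharge with extra caveats might be sketched like this, in Python 3. The JSON-list layout (ticket first, tag last) mirrors the toy snippets in this post; the function names and the caveat strings are hypothetical, and base64 is an assumed encoding for the tag.</p>

```python
import hashlib
import hmac
import json
import os
from base64 import b64decode, b64encode


def mac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def mint_discharge(crk: bytes, ticket: str, caveats=()) -> str:
    """Mint a discharge: the ticket is the nonce, CRK is the root key."""
    tag = mac(crk, ticket)
    for cav in caveats:
        tag = mac(tag, cav)  # each caveat re-keys the chain
    return json.dumps([ticket, *caveats, b64encode(tag).decode()])


def attenuate(discharge: str, caveat: str) -> str:
    """Anyone holding the discharge can add a caveat. No key required."""
    *body, tail = json.loads(discharge)
    new_tail = mac(b64decode(tail), caveat)
    return json.dumps([*body, caveat, b64encode(new_tail).decode()])


def verify_discharge(crk: bytes, discharge: str):
    """Return the caveats to enforce, or None if the tag is bad."""
    ticket, *caveats, tail = json.loads(discharge)
    tag = mac(crk, ticket)
    for cav in caveats:
        tag = mac(tag, cav)
    if not hmac.compare_digest(tag, b64decode(tail)):
        return None
    return caveats
```

<p>Adding a caveat only needs the current tail, but removing one needs a key the bearer never sees, so a discharge can only ever get narrower as it gets passed around.</p>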
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-nim5h0b4"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-nim5h0b4"><span class="k">def</span> <span class="nf">verify_third_party</span><span class="p">(</span><span class="n">tag</span><span class="p">,</span> <span class="n">cav</span><span class="p">,</span> <span class="n">discharges</span><span class="o">=</span><span class="p">[]):</span>
    <span class="n">crk</span> <span class="o">=</span> <span class="n">decrypt</span><span class="p">(</span><span class="n">tag</span><span class="p">,</span> <span class="n">dec</span><span class="p">(</span><span class="n">cav</span><span class="p">[</span><span class="s">'challenge'</span><span class="p">]))</span>
    <span class="k">if</span> <span class="n">crk</span> <span class="o">==</span> <span class="bp">False</span><span class="p">:</span> <span class="k">return</span> <span class="bp">False</span>
    <span class="n">discharge</span> <span class="o">=</span> <span class="bp">None</span>
    <span class="k">for</span> <span class="n">dcs</span> <span class="ow">in</span> <span class="n">discharges</span><span class="p">:</span>
        <span class="k">if</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">dcs</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">cav</span><span class="p">[</span><span class="s">'ticket'</span><span class="p">]:</span>
            <span class="n">discharge</span> <span class="o">=</span> <span class="n">dcs</span>
            <span class="k">break</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">discharge</span><span class="p">:</span> <span class="k">return</span> <span class="bp">False</span>
    <span class="n">mac</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">discharge</span><span class="p">)</span>
    <span class="n">key</span> <span class="o">=</span> <span class="n">crk</span>
    <span class="c1"># boring old stuff ---------------------
</span>    <span class="n">tag</span> <span class="o">=</span> <span class="s">""</span>
    <span class="k">for</span> <span class="n">cav</span> <span class="ow">in</span> <span class="n">mac</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">]:</span>
        <span class="n">tag</span> <span class="o">=</span> <span class="n">hmac</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">cav</span><span class="p">)</span>
        <span class="n">key</span> <span class="o">=</span> <span class="n">tag</span>
    <span class="k">return</span> <span class="n">hm</span><span class="p">.</span><span class="n">compare_digest</span><span class="p">(</span><span class="n">tag</span><span class="p">,</span> <span class="n">dec</span><span class="p">(</span><span class="n">mac</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]))</span>
</code></pre>
</div>
</div>
<p>To verify tokens that have third-party caveats, start with the root Macaroon, walking the caveats like usual. At each third-party caveat, match the <code>ticket</code> from the caveat with the <code>nonce</code> on the discharge Macaroon. The root Macaroon’s tag at that point in the walk (the tail just before the caveat) decrypts the <code>challenge</code> in the caveat, recovering <code>CRK</code>, which cryptographically verifies the discharge.</p>
<p>(The Macaroons paper uses different terms: “caveat identifier” or <code>cId</code> for “ticket”, and “verification-key identifier” or <code>vId</code> for “challenge”. These names are self-evidently bad and our contribution to the state of the art is to replace them.)</p>
<p>There are two big applications for third-party caveats in Popular Macaroon Thought. First, they facilitate microservice-izing your auth logic, because you can stitch arbitrary policies together out of third-party caveats. Second, they seem like <a href='https://github.com/go-macaroon-bakery/macaroon-bakery' title=''>fertile ground for an ecosystem of interoperable Macaroon services</a>: Okta and Google could stand up SSO dischargers, for instance, or someone could do a really good revocation service.</p>
<p>Neither of these light us up. We’re allergic to microservices. As for public protocols, well, it’s good to want things. So we almost didn’t even implement third-party caveats.</p>
<h2 id='5' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#5' aria-label='Anchor'></a>5</h2>
<p>I’m glad we did though, because they’ve been pretty great.</p>
<p>The first problem third-party caveats solved for us was hazmat tokens. To the extent possible, we want Macaroon tokens to be safe to transmit between users. Our Macaroons express permissions, but not authentication, so it’s almost safe to email them.</p>
<p>The way it works is, our Macaroons all have a third-party caveat pointing to a “login service”, either identifying the proper bearer as a particular Fly.io user or as a member of some <code>Organization</code>. To allow a request with your token, you first need to collect the discharge from the login service, which requires authentication.</p>
<p>The login discharge is very sensitive, but there isn’t much reason to pass it around. The original permissions token is where all the interesting stuff is, and it’s not scary. So that’s nice.</p>
<p><img src="/images/fly-sso.png?1/3&wrap-left" /></p>
<p>Ben then came up with <a href="https://community.fly.io/t/organization-required-sso/17560">third-party caveats that require Google or GitHub SSO logins</a>. If your token has one of those caveats, when you run <code>flyctl deploy</code>, a browser will pop up to log you into your SSO IdP (if you haven’t done so recently already).</p>
<p>We’ve put a <a href='https://fly.io/blog/tokenized-tokens/#tokenizer-the-fabled-4th-way' title=''>bunch of work into getting the guts of our SSO system working</a>, and most of that work has been invisible to customers. But Macaroon-ized SSO has a subtle benefit: you can configure <a href='http://Fly.io' title=''>Fly.io</a> to automatically add SSO requirements to specific <code>Organizations</code> (so, for instance, a dev environment might not need SSO at all, and prod might need two).</p>
<p>SSO requirements in most applications are a brittle pain in the ass. Ours are flexible and straightforward, and that happened almost by accident. Macaroons, baby!</p>
<p>Here’s a fun thing you can do with a Macaroon system: stand up a Slack bot, and give it an HTTP <code>POST</code> handler that accepts third-party tickets. Then:</p>
<p><img src="/images/bot-ok.png?1/2&center&border" /></p>
<p>So, the bot is cute, but any platform could do that. What’s cool is the way our platform <em>doesn’t</em> work with Slack; in fact, nothing on our platform knows anything about Slack, and Slack doesn’t know anything about us. We didn’t reach out to a Slack endpoint. Everything was purely cryptographic.</p>
<p>That bot could, if I sank some time into it, enforce arbitrary rules: it could selectively add caveats for the requests it authorizes, based on lookups of the users requesting them, at specific times of day, with specific logging. Theoretically, it could add third-party caveats of its own.</p>
<p>The win for us with third-party caveats is that they create a plugin system for our security tokens. That’s an unusual place to see a plugin interface! But Macaroons are easy to understand and keep in your head, so we’re pretty confident about the security issues.</p>
<h2 id='6' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#6' aria-label='Anchor'></a>6</h2>
<p>Obviously, we didn’t write our Macaroon code in Python, or with HMAC-SHA256-CTR.</p>
<p>We landed on a primary implementation in Go (Ben subsequently wrote an Elixir implementation). Our hash is SHA256, our cipher is Chapoly. We encode in MsgPack.</p>
<div class="callout"><p>We didn’t use the pre-existing public implementation because <a href="https://securitycryptographywhatever.com/2021/08/12/what-do-we-do-about-jwt-with-jonathan-rudenberg/" title="">we were warned not to</a>. The Macaroon idea is simple, and it exists mostly as an academic paper, not a standard. The community that formed around building open source “standard” Macaroons decided to use untyped opaque blobs to represent caveats. We need things to be as rigidly unambiguous as they can be.</p>
</div>
<p><img src="/images/verifier-service.png?2/3&center" /></p>
<p>The big strength of Macaroons as a cryptographic design — that it’s based almost entirely on HMAC — makes it a challenge to deploy. If you can verify a Macaroon, you can generate one. We have thousands of servers. They can’t all be allowed to generate tokens.</p>
<p>What we did instead:</p>
<ul>
<li>We split token checking into “verification” of token HMAC tags and “clearing” of token caveats.
</li><li>Verification occurs only on a physically isolated token-verification service; to verify a token’s tag, you HTTP <code>POST</code> the token to the verifier.
</li><li>Clearing of token caveats can happen anywhere. Token caveat clearing is domain-specific and subject to change; token verification is simple cryptography and changes rarely.
</li><li>A token verification is cacheable. The client library for the token verifier does that, which speeds things up by exploiting the locality of token submissions.
</li><li>The verification service is backed by a <a href='https://fly.io/docs/litefs/' title=''>LiteFS-distributed SQLite database</a>, so verification is fast globally — a major step forward from our legacy OAuth2 tokens, which are only fast in Ashburn, VA.
</li></ul>
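<p>The split between verification and clearing can be sketched in a few lines of Python 3. Everything here is illustrative: <code>Verifier</code> stands in for the isolated service (called in-process rather than over HTTP), <code>CachingClient</code> for the client library, and the <code>field=value</code> caveat format is invented for the example.</p>

```python
import hashlib
import hmac
import json
from base64 import b64decode, b64encode


def _chain(root_key: bytes, nonce: str, caveats) -> bytes:
    tag = hmac.new(root_key, nonce.encode(), hashlib.sha256).digest()
    for cav in caveats:
        tag = hmac.new(tag, cav.encode(), hashlib.sha256).digest()
    return tag


class Verifier:
    """Stand-in for the isolated service: the only holder of the root key."""

    def __init__(self, root_key: bytes):
        self._root_key = root_key
        self.calls = 0  # counts verification requests, for the demo

    def mint(self, nonce: str, caveats) -> str:
        tag = _chain(self._root_key, nonce, caveats)
        return json.dumps([nonce, *caveats, b64encode(tag).decode()])

    def verify(self, token: str):
        """Pure cryptography: return the caveats if the tag is good, else None."""
        self.calls += 1
        nonce, *caveats, tail = json.loads(token)
        good = hmac.compare_digest(
            _chain(self._root_key, nonce, caveats), b64decode(tail)
        )
        return caveats if good else None


class CachingClient:
    """Client-side cache: the same token submitted twice costs one round trip."""

    def __init__(self, verifier: Verifier):
        self._verifier = verifier
        self._cache = {}

    def verified_caveats(self, token: str):
        key = hashlib.sha256(token.encode()).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._verifier.verify(token)
        return self._cache[key]


def clear(caveats, request: dict) -> bool:
    """Domain-specific caveat clearing. Runs anywhere and needs no keys."""
    for cav in caveats:
        field, _, want = cav.partition("=")
        if request.get(field.strip()) != want.strip():
            return False
    return True
```

<p>The caller never touches a key: it asks <code>CachingClient</code> for the verified caveats, then runs <code>clear</code> against the request at hand. Only the first half has to live on the locked-down box.</p>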
<p><img src="/images/service-token.png?2/3&center" /></p>
<p>Now buckle up, because I’m about to try to get you to care about service tokens.</p>
<p>We operate “worker servers” all over the world to host apps for our customers. To do that, those workers need access to customer secrets, like the key to decrypt a customer volume. To retrieve those secrets, the workers have to talk to secrets management servers.</p>
<p>We manage a lot of workers. We trust them. But we don’t trust them that much, if you get my drift. You don’t want to just leave it up to the servers to decide which secrets they can access. The blast radius of a problem with a single worker should be no greater than the apps that are supposed to run there.</p>
<p>The gold standard for approving access to customer information is, naturally, explicit customer authorization. We almost have that with Macaroons! The first time an app runs on a worker, <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/' title=''>the orchestrator code</a> has a token, and it can pass that along to the secret stores.</p>
<p>The problem is, you need that token more than once; not just when the user does a deploy, but potentially any time you restart the app or migrate it to a new worker. And you can’t just store and replay user Macaroons. They have expirations.</p>
<div class="right-sidenote"><p>This is like dropping privilege with things like pledge(2), but in a distributed system.</p>
</div>
<p>So our token verification service exposes an API that transforms a user token into a “service token”, which is just the token with the authentication caveat and expiration “stripped off”.</p>
<p>What’s cool is: components that receive service tokens can attenuate them. For instance, we could lock a token to a particular worker, or even a particular Fly Machine. Then we can expose the whole <a href='https://fly.io/docs/machines/working-with-machines/' title=''>Fly Machines API</a> to customer VMs while keeping access traceable to specific customer tokens. Stealing the token from a Fly Machine doesn’t help you since it’s locked to that Fly Machine by a caveat attackers can’t strip.</p>
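<p>Here’s a hedged Python 3 sketch of that flow. It assumes that “stripping” means the verification service, which holds the root key, re-derives the HMAC chain without the unwanted caveats while keeping the nonce; the <code>auth:</code>, <code>expires:</code>, and <code>machine:</code> caveat strings and all function names are hypothetical.</p>

```python
import hashlib
import hmac
import json
from base64 import b64decode, b64encode


def _chain(root_key: bytes, nonce: str, caveats) -> bytes:
    tag = hmac.new(root_key, nonce.encode(), hashlib.sha256).digest()
    for cav in caveats:
        tag = hmac.new(tag, cav.encode(), hashlib.sha256).digest()
    return tag


def mint(root_key: bytes, nonce: str, caveats) -> str:
    tag = _chain(root_key, nonce, caveats)
    return json.dumps([nonce, *caveats, b64encode(tag).decode()])


def verify(root_key: bytes, token: str):
    """Return the token's caveats, or None if the tag is bad."""
    nonce, *caveats, tail = json.loads(token)
    if not hmac.compare_digest(_chain(root_key, nonce, caveats), b64decode(tail)):
        return None
    return caveats


def to_service_token(root_key: bytes, user_token: str,
                     strip_prefixes=("auth:", "expires:")):
    """Runs only where the root key lives. Same nonce, fewer caveats."""
    caveats = verify(root_key, user_token)
    if caveats is None:
        return None
    nonce = json.loads(user_token)[0]
    kept = [c for c in caveats if not c.startswith(strip_prefixes)]
    return mint(root_key, nonce, kept)


def attenuate(token: str, caveat: str) -> str:
    """Runs anywhere: extend the chain from the current tail, no key needed."""
    *body, tail = json.loads(token)
    new_tail = hmac.new(b64decode(tail), caveat.encode(), hashlib.sha256).digest()
    return json.dumps([*body, caveat, b64encode(new_tail).decode()])


def allowed(root_key: bytes, token: str, machine_id: str) -> bool:
    """Check only the machine lock; other caveats are ignored in this sketch."""
    caveats = verify(root_key, token)
    if caveats is None:
        return False
    return all(
        cav == f"machine:{machine_id}"
        for cav in caveats
        if cav.startswith("machine:")
    )
```

<p>A worker that receives a service token would call <code>attenuate(token, "machine:...")</code> before handing it to a Fly Machine. Dropping the lock afterwards would mean recomputing the tail, which takes the root key.</p>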
<h2 id='7' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#7' aria-label='Anchor'></a>7</h2>
<p>If a customer loses their tokens to an attacker, we can’t just blow that off and let the attacker keep compromising the account!</p>
<div class="right-sidenote"><p>This cancels every token derived through attenuation by that nonce.</p>
</div>
<p>Every Macaroon we issue is identified by a unique nonce, and we can revoke tokens by that nonce. This is just a basic function of the token verification service we just described.</p>
<p>We host token caches all over our fleet. Token revocation invalidates the caches. Anything with a cache checks frequently whether to invalidate. Revocation is rare, so just keeping a revocation list and invalidating caches wholesale seems fine.</p>
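<p>The “revocation list plus wholesale invalidation” idea fits in a small Python 3 sketch. The class names, the version counter used to notice new revocations, and the JSON token layout (nonce first) are all assumptions made for illustration.</p>

```python
import json


class RevocationList:
    """A set of revoked nonces plus a version that bumps on every change."""

    def __init__(self):
        self._revoked = set()
        self.version = 0

    def revoke(self, nonce: str) -> None:
        if nonce not in self._revoked:
            self._revoked.add(nonce)
            self.version += 1

    def is_revoked(self, nonce: str) -> bool:
        return nonce in self._revoked


class TokenCache:
    """Caches verification results; drops everything when the list changes."""

    def __init__(self, revocations: RevocationList, verify):
        self._revocations = revocations
        self._verify = verify  # callable: token -> caveats, or None
        self._seen_version = revocations.version
        self._entries = {}
        self.misses = 0  # how many times we had to call the verifier

    def check(self, token: str):
        # Revocation is rare, so a wholesale flush is cheap enough.
        if self._revocations.version != self._seen_version:
            self._entries.clear()
            self._seen_version = self._revocations.version
        if token not in self._entries:
            nonce = json.loads(token)[0]
            if self._revocations.is_revoked(nonce):
                self._entries[token] = None
            else:
                self.misses += 1
                self._entries[token] = self._verify(token)
        return self._entries[token]
```

<p>Because attenuated tokens keep their parent’s nonce, one <code>revoke</code> call takes out the original token and everything derived from it.</p>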
<h2 id='8' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#8' aria-label='Anchor'></a>8</h2>
<p>I get it, it’s tough to get me to shut up about Macaroons.</p>
<p>A couple years ago, I <a href='https://fly.io/blog/api-tokens-a-tedious-survey/' title=''>wrote a long survey of API token designs</a>, from JWTs (never!) to Biscuits. I had a <a href='https://fly.io/blog/api-tokens-a-tedious-survey/#macaroons' title=''>bunch to say about Macaroons</a>, not all of it positive, and said we’d be plowing forward with them at Fly.io.</p>
<p>My plan had been to follow up soon after with a deep dive on Macaroons as we planned them for Fly.io. I’m glad I didn’t do that, not just because it would’ve been embarrassing to announce a feature that took us over 2 years to launch, but also because the process of working on this with Ben Toews changed a lot of my thinking about them.</p>
<p>I think if you asked Ben, he’d say he had mixed feelings about how much complexity we wrangled to get this launched. On the other hand, we got a lot of things out of them without trying very hard:</p>
<ul>
<li>Security tokens you can (almost) email to your users and partners without putting your account at risk.
</li><li>A flexible permission system, encoded directly into the tokens, that users can drive without talking to our servers.
</li><li>A plugin system that users can (when we clean up the tooling) use themselves, to add things like Passkeys or two-person-approval rules or audit logging, without us getting in the middle.
</li><li>An SSO system that can stack different IdPs, mandate SSO login, and do that on a per-<code>Organization</code> basis.
</li><li><a href='https://www.latacora.com/blog/2018/06/12/a-childs-garden/' title=''>Inter-service authorization</a> that is traceable back to customer actions, so our servers can’t just make up which apps they’re allowed to look at.
</li><li>An elegant way of exposing our own APIs to customer Fly Machines with ambient authentication, but without the <a href="https://github.com/SummitRoute/imdsv2_wall_of_shame/blob/main/README.md">AWS IMDSv1 credential theft problem</a>.
</li></ul>
<p>There are downsides and warts! I’m mostly not telling you about them! Pure restrictive caveats are an awkward way to express some roles. And, blinded by my hunger to get Macaroons deployed, I spat in the face of science and used internal database IDs as our public caveat format, an act for which JP will never forgive me.</p>
<p>If I’ve piqued your interest, <a href='https://github.com/superfly/macaroon' title=''>the code for this stuff is public</a>, along with some more <a href='https://github.com/superfly/macaroon/blob/main/macaroon-thought.md' title=''>detailed technical documentation</a>.</p>
/blog/macaroons-escalated-quickly/assets/evil-cookies-thumb.webp
How Yoko Li makes towns, tamagoes, and tools for local AI
https://fly.io/blog/how-i-fly-yoko-li/
2024-01-08T00:00:00+00:00
2024-01-16T20:07:49+00:00
<p>Hello all, and welcome to another episode of How I Fly, a series where I interview developers about what they do with technology, what they find exciting, and the unexpected things they’ve learned along the way. This time I’m talking with <a href='https://twitter.com/stuffyokodraws' title=''>Yoko Li</a>, an investment partner at A16Z who’s also an open-source AI developer. She works on some of the most exciting AI projects in the world. I’m excited to share them with you today, with fun stories about the lessons she’s learned along the way.</p>
<h2 id='cool-experiments' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#cool-experiments' aria-label='Anchor'></a>Cool Experiments</h2>
<p>One of Yoko’s most thought-provoking experiments is <a href='https://www.convex.dev/ai-town' title=''>AI Town</a>, a virtual town populated by AI agents that talk with each other. It takes advantage of the randomness of AI responses to create emergent behavior. When you open it, it looks like this:</p>
<p><img alt="A picture of the AI Town homepage, a UI showing a top-down 2D RPG view with a visible river and a tent. The UI shows a conversation with the characters Alice and Stella." src="/images/image1.webp" /></p>
<p>You can see the AI agents talking with each other and watch how the relationships between them form and change over time. It’s also a lot of fun to watch.</p>
<p>One of Yoko’s other experiments is <a href='https://ai-tamago.fly.dev/' title=''>AI Tamago</a>, a <a href='https://en.wikipedia.org/wiki/Tamagotchi' title=''>Tamagotchi</a> virtual pet implemented with a large language model instead of the state machine that we’re all used to. AI Tamago uses an unmodified version of LLaMA 2 7B to take in game state and user inputs, then it generates what happens next. Every time you interact with your pet, it feeds data to LLaMA 2 and then uses Ollama’s JSON mode to generate unexpected output.</p>
<p><img alt="A picture of the homepage of AI Tamago, showing a virtual pet with buttons to feed the pet, play with the pet, clean the pet, discipline the pet, check pet status, and deliver medical care to the pet." src="/images/image4.webp" /></p>
<p>It’s all the fun of the classic Tamagotchi toys from the ’90s (including the ability to randomly discipline your virtual pet) without any of the coin cell batteries or having to carry around the little egg-shaped puck.</p>
<p>But that’s just something you can watch, not something that’s as easy to play with on your own machine. Yoko has also worked on the <a href='https://github.com/ykhli/local-ai-stack' title=''>Local AI Starter Kit</a> that lets you go from zero to AI in minutes. It’s a collection of chains of models that let you ingest a bunch of documents, store them in a database, and then use those documents as context for a language model to generate responses. It’s everything you need to implement a “chat with a knowledge base” feature.</p>
<h3 id='the-dark-of-ai-experiments' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-dark-of-ai-experiments' aria-label='Anchor'></a>The dark of AI experiments</h3>
<p>The Local AI Starter Kit is significant because normally to do this, you need to set up billing and API keys for at least four different API providers, and then you need to write a bunch of (hopefully robust) code to tie it all together. With the Local AI Starter Kit, you can do this on your own hardware, with your own data, and your own models privately. It’s a huge step forward for democratizing access to this technology.</p>
<p>Document search is one of my favorite use cases for AI, and it’s one of the most immediately useful ones. It’s also one of the most fiddly and annoying to get right. To help illustrate this, I’ve made a diagram of the steps involved with setting up document search by hand:</p>
<p><img alt="A diagram showing the process of ingesting a pile of markdown documents into a vector database. The documents are broken into a collection of sections, then each section is passed through an embedding model and the resulting vectors are stored in a vector database." src="/images/image3.webp" /></p>
<p>You start with your Markdown documents. Most Markdown documents are easily broken up into sections where each section will focus on a single aspect of the larger topic of the document. You can take advantage of this best practice by letting people search for each section individually, which is typically a lot more useful than just searching the entire document.</p>
<div class="right-sidenote"><p>Okay, okay, fine. Language encircles concepts instead of defining them directly. The point still stands that we’re operating at a level “below” words and sentences; I don’t want to bog this down in a bunch of linear algebra that neither of us understands well enough to explain in a single paragraph like I am here. The main point is that it lets you “fuzzy match” relevant documents in a way that exact word search queries never could on their own.</p>
</div>
<p>Essentially, the vector embeddings that you generate from an embedding model are a mathematical representation of the “concepts” that the embedding model uses that are adjacent to the text of your documents. When you use the same model to generate embeddings for your documents and user queries, this lets you find documents that are similar to the query even when they don’t use precisely the same words. This is called “fuzzy searching” and it is one of the most difficult problems in computer science (right next to naming things).</p>
<p>When a user comes to search the database, you do the same thing as ingestion:</p>
<p><img alt="A diagram showing the full flow for doing document search Q&A with a vector database. The user submits a question to an API endpoint, the question is broken into embedding vectors and used to search for similar vectors in the database. The relevant document fragments are fed into the prompt for a large language model to generate a response that is grounded in the facts from the documents that were ingested. The response is streamed to the user one token at a time." src="/images/image2.webp" /></p>
<p>The user query comes into your API endpoint. You use the same embedding model from earlier (omitted from the diagram for brevity) to turn that query into a vector. Then you query the same vector database to find documents that are similar to the query. Then you have a list of documents with metadata like the URL to the documentation page or section fragment in that page. From here you have two options. You can either use the documents to return a list of results to the user, or you can do the more fun thing: using those documents as context for a large language model to generate a response grounded in the relevant facts in those documents.</p>
<div class="right-sidenote"><p>I think it’s also how OpenAI’s custom GPTs work, but they haven’t released technical details about how they work so this is outright speculation on my part.</p>
</div>
<p>This basic pattern is called Retrieval-augmented Generation (RAG), and it’s how Bing’s copilot chatbot works. The Local AI Starter Kit makes setting this pipeline up <em>effortless</em> and <em>fast</em>. It’s a huge step forward for making this groundbreaking technology accessible to everyone.</p>
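<p>To make the ingest-then-retrieve flow above a little more concrete, here is a minimal sketch in Python. It is not how the Local AI Starter Kit is implemented. The function names are made up for illustration, the “vector database” is a plain list, and <code>toy_embed</code> is a bag-of-words stand-in for a real embedding model, so it only matches exact words and does none of the fuzzy matching a real model gives you.</p>

```python
import math
import re
from collections import Counter


def split_markdown_sections(text):
    """Split a Markdown document into (heading, body) pairs, one per heading."""
    sections = []
    heading, lines = "(intro)", []

    def flush():
        body = "\n".join(lines).strip()
        if body:
            sections.append((heading, body))

    for line in text.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)", line)
        if match:
            flush()
            heading, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return sections


def toy_embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(markdown_text):
    """Ingest: one (heading, body, vector) entry per section."""
    return [
        (heading, body, toy_embed(heading + " " + body))
        for heading, body in split_markdown_sections(markdown_text)
    ]


def search(index, query, top_k=2):
    """Embed the query the same way as the documents and rank by similarity."""
    query_vector = toy_embed(query)
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(query_vector, entry[2]),
        reverse=True,
    )
    return [(heading, body) for heading, body, _ in ranked[:top_k]]


def build_prompt(question, hits):
    """Put the retrieved sections in front of the question for the LLM."""
    context = "\n\n".join(f"## {heading}\n{body}" for heading, body in hits)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"


DOCS = """# Deploying
Run the deploy command to ship your app to production.

# Scaling
Add more machines when traffic grows.
"""

index = build_index(DOCS)
hits = search(index, "how do I deploy my app", top_k=1)
prompt = build_prompt("how do I deploy my app", hits)
print(prompt)
```

<p>Swapping <code>toy_embed</code> for calls to a real embedding model and the list for a vector database gives you the shape of the pipeline in the diagrams; the resulting prompt is what you would hand to the language model.</p>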
<h2 id='the-struggles' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-struggles' aria-label='Anchor'></a>The struggles</h2>
<blockquote>
<p>When I was trying to get the AI models in AI Town to output JSON, I tried a bunch of different things. I got some good results by telling the model to “only reply in JSON, no prose”, but we ended up using a model tuned for outputting code. I think I inspired <a href='https://ollama.ai' title=''>Ollama</a> to add their JSON output feature.</p>
</blockquote>
<p>One of the main benefits of large language models is that they are essentially stochastic models of the entire Internet. They have a bunch of patterns formed that can let you create surprisingly different outputs from similar inputs. This is also one of the main drawbacks of large language models: they are essentially stochastic models of the entire Internet. They have a bunch of patterns formed that can let you create surprisingly different outputs from similar inputs. The outputs of these models are usually correct-ish enough (more correct if you ground the responses in document fact like you do with a Retrieval-augmented Generation system), but they are not always aligned with our observable reality.</p>
<p>A lot of the time you will get outputs that don’t make any logical or factual sense. These are called “hallucinations” and they are one of the main drawbacks of large language models. If a hallucination pops up at the worst time, you’ve accidentally told someone how to poison themselves with chocolate chip cookies. This is, as the kids say, “bad”.</p>
<p>The inherent randomness of the output of a large language model means that it can be difficult to get an exactly parsable format. Most of the time, you’d be able to coax the model to get usable JSON output, but without a schema it can sometimes generate wildly different JSON responses. Only sometimes. This isn’t deterministic and Yoko has found that this is one of the most frustrating parts of working with large language models.</p>
<div class="right-sidenote"><p>This works by making any offending ungrammatical tokens weighted to negative infinity. It’s amazingly hacky but the hilarious part is that it works.</p>
</div>
<p>However, there are workarounds. <a href='https://github.com/ggerganov/llama.cpp' title=''>llama.cpp</a> offers a way to use a grammar file to strictly guide the output of a large language model by using a context-free grammar. This lets you get something more deterministic, but it’s still not perfect. It’s a lot better than nothing, though.</p>
<p>One of the fun things that can happen with this is that you can have the model fail to generate anything but an endless stream of newlines in JSON mode. This is hilarious and usually requires some special detection logic to handle and restart the query. There’s work being done to let you use JSON schema to guide the generation of large language model outputs, but it’s not currently ready for the masses.</p>
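<p>The “special detection logic” can be as plain as a validate-and-retry loop. Here is a rough sketch of that idea, not what AI Town or AI Tamago actually do. <code>generate</code> is a hypothetical zero-argument callable wrapping whatever model client you use, and the sketch assumes that client caps output length, so a runaway stream of newlines eventually comes back as a whitespace-only string.</p>

```python
import json


def generate_json(generate, max_attempts=3):
    """Call `generate()` until it returns text that parses as a JSON object.

    `generate` is a hypothetical callable that returns raw model output
    as a string; swap in your own client call.
    """
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        if not raw.strip():
            # The "endless stream of newlines" failure: nothing but whitespace.
            last_error = ValueError("model returned only whitespace")
            continue
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as error:
            # Prose, truncated JSON, and similar junk all land here.
            last_error = error
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = ValueError(f"expected a JSON object, got {type(parsed).__name__}")
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error


# Canned outputs standing in for a model: newlines, then prose, then real JSON.
canned = iter(["\n\n\n\n", "Sure! Here is your pet:", '{"mood": "hungry", "hunger": 7}'])
state = generate_json(lambda: next(canned))
print(state)
```

<p>This only checks that the output is a JSON object. Checking that it has the fields your game loop expects is a separate step, and it is the part a schema would eventually handle for you.</p>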
<div class="right-sidenote"><p>If it’s dumb and it works, is it really dumb?</p>
</div>
<p>However, one of the easiest ways to hack around this is by using a model that generates code instead of text. This is how Yoko got the AI Town and AI Tamago models to output JSON that was mostly valid. It’s a hack, but it works. This was made a lot easier for AI Town when one of the tools they use (<a href='https://ollama.ai' title=''>Ollama</a>) added support for JSON output from the model. This is a lot better than the code generation model hack, but research continues.</p>
<h2 id='the-simple-joy-of-unexpected-outputs' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-simple-joy-of-unexpected-outputs' aria-label='Anchor'></a>The simple joy of unexpected outputs</h2>
<blockquote>
<p>When I was making AI Town, I was inspired by <a href='https://en.wikipedia.org/wiki/The_Lifecycle_of_Software_Objects' title=''>The Lifecycle of Software Objects</a> by Ted Chiang. It’s about a former zookeeper that trained AI agents to be pets, kinda like how we use Reinforcement Learning from Human Feedback to train AI models like ChatGPT.</p>
</blockquote>
<p>However, at the same time, there are cases where hallucinations are not only useful, but they are what make the implementation of a system possible. If large language models are essentially massive banks of the word frequencies of a huge part of culture, then the emergent output can create unexpected things that happen frequently. This lets emergent behavior form, which can be the backbone of games and is the key thing that makes AI Town work as well as it does.</p>
<p>AI Tamago is also completely driven off of the results of large language model hallucinations. They are the core of what drives user inputs, the game loop, and the surprising reactions you get when disciplining your pet. The status screen takes in the game state and lets you know what your pet is feeling in a way that the segment displays of the Tamagotchi toys could never do.</p>
<p>These enable you to build workflows that are <em>augmented</em> by the inherent randomness of the hallucinations instead of seeing them as drawbacks. This means you need to choose outputs that can have the hallucinations shine instead of being ugly warts you need to continuously shave away. Instead of using them for doing pathfinding, have them drive the AI of your characters or write the A* pathfinding algorithm so you don’t have to write it again for the billionth time.</p>
<p>I’m not saying that large language models can replace the output of a human, but they are more like a language server for human languages as well as programming languages. They are best used when you are generating the boilerplate you don’t want to do yourself, or when you are throwing science at the wall to see what sticks.</p>
<h2 id='in-conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#in-conclusion' aria-label='Anchor'></a>In conclusion</h2>
<p>Yoko is showing people how to use AI today, on local machines, with models of your choice, that allow you to experiment, hack and learn.</p>
<p>I can’t wait to see what’s next!</p>
<p>If you want to follow what Yoko does, here’s a few links to add to your feeds:</p>
<ul>
<li>Yoko’s <a href='https://twitter.com/stuffyokodraws' title=''>Twitter</a> (or X, or whatever we’re supposed to call it now)
</li><li>Yoko’s <a href='https://github.com/ykhli' title=''>GitHub</a>
</li><li>Yoko’s <a href='https://yoko.dev/' title=''>Website</a>
</li></ul>
<p>(insert standard conclusion diatribe here)</p>
/blog/how-i-fly-yoko-li/assets/chat-bird-cover-thumb.webpDeploy Your Own (Not) Midjourney Bot on Fly GPUshttps://fly.io/blog/not-midjourney-bot/2024-01-04T00:00:00+00:002024-02-20T19:03:55+00:00<div class="lead"><p>Fly.io has Enterprise-grade GPUs and servers all over the globe (or <em>disk</em>, depending on which side of the flat Earth debate you fall on) making it a great place to deploy your next disruptive AI app.</p>
</div>
<p>Some people daydream about normal things, like coffee machines or raising that Series A round (those are normal things to dream about, right?). I daydream about commanding a fleet of chonky <a href='https://resources.nvidia.com/en-us-l40s/l40s-datasheet-28413/' title=''>NVIDIA Lovelace L40Ss</a>. Also, totally normal. Well, fortunately for me and anyone else wanting to explore the world of generative AI — Fly.io has GPUs now!</p>
<p>Sure, this technology will probably end up with the AI <a href='https://marketoonist.com/2023/03/ai-written-ai-read.html' title=''>talking to itself</a> while we go about our lives — but it seems like it’s here to stay, so we should at least have some fun with it. In this post we’ll put these GPUs to task and you’ll learn how to build your very own AI image-generating Discord bot, kinda like Midjourney. Available 24/7 and ready to serve up all the pictures of cats eating burritos your heart desires. And because I’d never tell you to draw the rest of the owl, I’ll link to working code that you can deploy today.</p>
<h2 id='latent-diffusion-models-have-entered-the-chat' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#latent-diffusion-models-have-entered-the-chat' aria-label='Anchor'></a>Latent Diffusion Models Have Entered the Chat</h2>
<p>In the realm of AI image generation, two names have become prominent: Midjourney and Stable Diffusion. Both are image generating software that allow you to synthesize an image from a textual prompt. One is a closed source paid service, while the other is open source and can run locally. Midjourney gained popularity because it allowed the less technically-inclined among us to explore this technology through its ease of use. Stable Diffusion democratized access to the technology, but it can be quite tricky to get good results out of it.</p>
<p>Enter <a href='https://github.com/lllyasviel/Fooocus' title=''>Fooocus</a> (pronounced <em>focus</em>), an open source project that combines the best of both worlds and offers a user-friendly interface to Stable Diffusion. It’s hands down the easiest way to get started with Stable Diffusion. Sure there are more popular tools like Stable Diffusion web UI and ComfyUI, but Fooocus adds some magic to reduce the need to manually tweak a bunch of settings. The most significant feature is probably GPT-2-based “<a href='https://github.com/lllyasviel/Fooocus/discussions/117#raw' title=''>prompt expansion</a>” to dynamically enhance prompts.</p>
<p>The point of Fooocus is to <em>focus</em> on your prompt. The more you put into it, the more you get out. That said, a very simple prompt like “forest elf” can return high-quality images without the need to trawl the web for prompt ideas or fiddle with knobs and levers (although they’re there if you want them).</p>
<p>So, what can this thing <em>do</em>? Well, this…</p>
<p><img alt="A black and white sketch of a hot-air balloon over a mountain range generated using Fooocus with &quot;Pencil Sketch Drawing&quot; style and quality = True" src="/images/./balloon-sketch.webp" /></p>
<p>Here’s the full command I’ve used to generate this image: <code>/imagine prompt: sketch of hot-air balloon over a mountain range style1: Pencil Sketch Drawing quality: true ar: 1664×576</code></p>
<h2 id='what-were-building' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-were-building' aria-label='Anchor'></a>What We’re Building</h2>
<p>We’ll deploy two applications. The code to run the bot itself will run on normal VM hardware, and the API server doing all the hard work synthesizing alpacas out of thin air will run on GPU hardware.</p>
<p><img alt="An architecture diagram explaining how the two apps will communicate and return the requested image to an end user." src="/images/./arch-diagram.png?center&2/3" /></p>
<p>Fooocus is served up as a web UI by default, but with a little elbow grease we can interact with it as a REST API. Fortunately, with more than 25k stars on GitHub at the time of writing, the project has a lively open-source community, so we don’t need to do much work here — it’s already been done for us. <a href='https://github.com/konieshadow/Fooocus-API' title=''>Fooocus-API</a> is a project that shoves FastAPI in front of a Fooocus runtime. We’ll use this for the API server app.</p>
<p>The Python-based bot connects to the <a href='https://discord.com/developers/docs/topics/gateway' title=''>Discord Gateway API</a> using the <a href='https://github.com/Pycord-Development/pycord' title=''>Pycord</a> library. When it starts up, it maintains an open pipe for data to flow back and forth via WebSockets. The bot app also includes a client that knows how to talk to the API server using Flycast and request the image it needs via HTTP.</p>
<p>When we request an image from Discord using the <code>/imagine</code> slash command, we immediately respond using Pycord’s <code>defer()</code> function to let Discord know that the request has been received and the bot is working on it — it’ll take a few seconds to process your prompt, fabricate an image, upload it to Discord and let you share it with your friends. This is a blocking operation, so it won’t perform well if you have hundreds of people on your Discord Server using the command. For that, you’ll want to jiggle some wires to make the code non-blocking. But for now, this gives us a nice UX for the bot.</p>
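<p>If you do end up jiggling those wires, one common approach is to push the blocking call onto a worker thread so the event loop can keep serving other commands. The sketch below is not taken from the bot’s code. <code>request_image_blocking</code> and <code>handle_imagine</code> are hypothetical stand-ins, and it assumes the blocking part is an ordinary synchronous function (Python 3.9+ for <code>asyncio.to_thread</code>).</p>

```python
import asyncio
import time


def request_image_blocking(prompt):
    """Hypothetical stand-in for a synchronous HTTP call to the image API."""
    time.sleep(0.2)  # pretend the GPU is busy
    return f"image for {prompt!r}"


async def handle_imagine(prompt):
    # asyncio.to_thread runs the blocking call in a worker thread, so the
    # event loop stays free to handle other commands in the meantime.
    return await asyncio.to_thread(request_image_blocking, prompt)


async def main():
    prompts = ["alpaca", "burrito cat", "forest elf"]
    # All three requests are in flight at once instead of queuing up.
    return await asyncio.gather(*(handle_imagine(p) for p in prompts))


results = asyncio.run(main())
print(results)
```

<p>Threads are the low-effort option here. Switching the API client to an async HTTP library would get you the same effect without the thread pool.</p>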
<p>When the API server returns the image, it gets saved to disk. We’ll use the fantastic <a href='https://github.com/sqids/sqids-python' title=''>Sqids</a> library to generate collision-free file names:</p>
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-pz8ixo4r"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-pz8ixo4r"><span class="n">unique_id</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">sqids</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span>
<span class="p">[</span><span class="n">ctx</span><span class="p">.</span><span class="n">author</span><span class="p">.</span><span class="nb">id</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">time</span><span class="p">.</span><span class="n">time</span><span class="p">())]</span>
<span class="p">)</span>
<span class="n">result_filename</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"result_</span><span class="si">{</span><span class="n">unique_id</span><span class="si">}</span><span class="s">.png"</span>
</code></pre>
</div>
</div>
<p>We’ll also use <code>asyncio</code> to check if the image is ready every second, and when it is, we send it off to Discord to complete the request:</p>
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-c9q1t75u"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-c9q1t75u"><span class="k">while</span> <span class="ow">not</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">exists</span><span class="p">(</span><span class="n">result_filename</span><span class="p">):</span>
    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">result_filename</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="k">await</span> <span class="n">ctx</span><span class="p">.</span><span class="n">respond</span><span class="p">(</span>
        <span class="nb">file</span><span class="o">=</span><span class="n">discord</span><span class="p">.</span><span class="n">File</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">result_filename</span><span class="p">)</span>
    <span class="p">)</span>
</code></pre>
</div>
</div>
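<p>The loop above waits forever if the image never shows up. If that worries you, a small variation is to put a deadline on the polling. This is a sketch and not part of the template repository; <code>wait_for_file</code> and its parameters are names invented here.</p>

```python
import asyncio
import os
import tempfile


async def wait_for_file(path, timeout=60.0, interval=1.0):
    """Poll until `path` exists; raise asyncio.TimeoutError after `timeout` seconds."""

    async def poll():
        while not os.path.exists(path):
            await asyncio.sleep(interval)

    await asyncio.wait_for(poll(), timeout=timeout)
    return path


async def demo():
    # Poll for a file that never appears, with a short deadline.
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "result_missing.png")
        try:
            await wait_for_file(missing, timeout=0.3, interval=0.1)
        except asyncio.TimeoutError:
            return "timed out"
    return "found"


outcome = asyncio.run(demo())
print(outcome)
```

<p>In the bot, the <code>except</code> branch would be the place to tell the user that the request failed instead of leaving the deferred response hanging.</p>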
<p>Neither of these two apps will be exposed to the Internet, yet they’ll still be able to communicate with each other. One of the undersold stories about Fly.io is the ease with which two applications can communicate over the private network. We assign special IPv6 private network (6pn) addresses within the same organizational space and applications can effortlessly discover and connect to one another without any additional configuration.</p>
<p>But what about load balancing and this “scale-to-zero” thing? We don’t <em>just</em> want our two apps to talk to each other, we want the Fly Proxy to start our Machine when a request comes in, and stop it when idle. For that, we’ll need <a href='https://fly.io/docs/reference/private-networking/#flycast-private-load balancing' title=''>Flycast</a>, our private load balancing feature.</p>
<p>When you assign a Flycast IP to your app, you can route requests using a special <code>.flycast</code> domain. Those requests are routed through the Fly Proxy instead of directly to instances in your app. Meaning you get all the load balancing, rate limiting and other proxy goodness that you’re accustomed to. The Proxy runs a process which can automatically downscale Machines every few minutes. It’ll also start them right back up when a request comes in — this means we can take advantage of scale-to-zero, saving us a bunch of money!</p>
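<p>From the bot’s side, talking to the server over Flycast is just an HTTP request to a <code>.flycast</code> hostname. Here is a hedged sketch of what preparing such a request could look like. The app name is hypothetical, and the endpoint path and JSON field names are assumptions about Fooocus-API that you should check against its documentation; the port also depends on the services defined in your <code>fly.toml</code>. The request is built but not sent, since the address only resolves inside your private network.</p>

```python
import json
import urllib.request


def flycast_url(app_name, path, port=80):
    """Build a URL for an app's private .flycast address."""
    host = f"{app_name}.flycast"
    if port != 80:
        host = f"{host}:{port}"
    return f"http://{host}/{path.lstrip('/')}"


def build_image_request(app_name, prompt, styles=()):
    """Prepare (but don't send) a JSON POST to the API server.

    The path and field names below are assumptions, not confirmed API.
    """
    body = json.dumps({"prompt": prompt, "style_selections": list(styles)}).encode()
    return urllib.request.Request(
        flycast_url(app_name, "/v1/generation/text-to-image"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "my-fooocus-api" is a placeholder app name.
request = build_image_request(
    "my-fooocus-api",
    "an alpaca sleeping in a grassy field",
    styles=["Pencil Sketch Drawing"],
)
print(request.get_method(), request.full_url)
# Inside the private network you would send it with:
#     urllib.request.urlopen(request, timeout=120)
```

<p>Because the request goes through the Fly Proxy, the first call after an idle period is also what wakes the GPU Machine back up, so a generous timeout is a good idea.</p>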
<h2 id='the-imagine-command' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-imagine-command' aria-label='Anchor'></a>The <code>/imagine</code> Command</h2>
<p>The slash command is the heart of your bot, enabling you to generate images based on your prompt, right from within Discord. When you type <code>/imagine</code> into the Discord chat, you’ll see some command options pop up.</p>
<p>You’ll need to input your base prompt (e.g. “an alpaca sleeping in a grassy field”) and optionally pick some styles (“Pencil Sketch Drawing”, “Futuristic Retro Cyberpunk”, “MRE Dark Cyberpunk” etc). With Fooocus, combining multiple styles — “style-chaining” — can help you achieve amazing results. Set the aspect ratio or provide negative prompts if needed, too.</p>
<p>After you execute the command, the bot will request the image from the API, then send it as a response in the chat. Let’s see it in action!</p>
<p><img alt="A GIF demo run-through showcasing the ability of the bot to generate images from Discord" src="/images/./demo.gif?card&center" /></p>
<h2 id='deployment-speedrun' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#deployment-speedrun' aria-label='Anchor'></a>Deployment Speedrun</h2>
<p><strong class='font-[550] text-navy-950'>First, we’ll deploy the API server.</strong> For convenience (and to speed things up), we’ll use a pre-built image when we deploy. With dependencies like <code>torch</code> and <code>torchvision</code> bundled in, it’s a hefty image weighing in just shy of 12GB. With a normal Fly Machine this would not only be a bad idea, but not even possible due to an 8GB limit for the VM’s rootfs. Fortunately the wizards behind Fly GPUs have accounted for our need to run huge models and their dependencies, and awarded us 50GB of rootfs.</p>
<div class="right-sidenote"><p>Fly GPUs use <a href="https://github.com/cloud-hypervisor/cloud-hypervisor" title="">Cloud Hypervisor</a> and not <a href="https://github.com/firecracker-microvm/firecracker" title="">Firecracker</a> (like a regular Fly Machine) for virtualization. But even with a 12GB image, this doesn’t stop the Machine from booting in seconds when a new request comes in through the Proxy.</p>
</div>
<p>To start, clone the template <a href='https://github.com/fly-apps/not-midjourney-bot' title=''>repository</a>. You’ll need this for both the bot and server apps. Then deploy the server with the Fly CLI:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-3wqx27w2"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3wqx27w2">fly deploy <span class="se">\</span>
<span class="nt">--image</span> ghcr.io/fly-apps/not-midjourney-bot:server <span class="se">\</span>
<span class="nt">--config</span> ./server/fly.toml <span class="se">\</span>
<span class="nt">--no-public-ips</span>
</code></pre>
</div>
</div>
<p>This command tells Fly.io to deploy your application based on the configuration specified in the <code>fly.toml</code>, while the <code>--no-public-ips</code> flag secures your app by not exposing it to the public Internet.</p>
<p>Remember Flycast? To use it, we’ll allocate a private IPv6:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-p1n4o7bo"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-p1n4o7bo">fly ips allocate-v6 <span class="nt">--private</span>
</code></pre>
</div>
</div>
<p>Now, let’s take a look at our <a href='https://github.com/fly-apps/not-midjourney-bot/blob/134bb634f97bf81040e489650f2334b48d976c10/server/fly.toml' title=''><code>fly.toml</code></a> config:</p>
<div class="highlight-wrapper group relative toml">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-jay4rzen"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-jay4rzen"><span class="py">app</span> <span class="p">=</span> <span class="s">"alpaca-image-gen"</span>
<span class="py">primary_region</span> <span class="p">=</span> <span class="s">"ord"</span>
<span class="nn">[[vm]]</span>
<span class="py">size</span> <span class="p">=</span> <span class="s">"performance-8x"</span>
<span class="py">memory</span> <span class="p">=</span> <span class="s">"16gb"</span>
<span class="py">gpu_kind</span> <span class="p">=</span> <span class="s">"l40s"</span>
<span class="nn">[[services]]</span>
<span class="py">internal_port</span> <span class="p">=</span> <span class="mi">8888</span>
<span class="py">protocol</span> <span class="p">=</span> <span class="s">"tcp"</span>
<span class="py">auto_stop_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">auto_start_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">min_machines_running</span> <span class="p">=</span> <span class="mi">0</span>
<span class="nn">[[services.ports]]</span>
<span class="py">handlers</span> <span class="p">=</span> <span class="nn">["http"]</span>
<span class="py">port</span> <span class="p">=</span> <span class="mi">80</span>
<span class="py">force_https</span> <span class="p">=</span> <span class="kc">false</span>
<span class="nn">[mounts]</span>
<span class="py">source</span> <span class="p">=</span> <span class="s">"repositories"</span>
<span class="py">destination</span> <span class="p">=</span> <span class="s">"/app/repositories"</span>
<span class="py">initial_size</span> <span class="p">=</span> <span class="s">"20gb"</span>
</code></pre>
</div>
</div>
<p>There are a few key things to note here:</p>
<ol>
<li>Currently, the NVIDIA L40Ss we’re using when we specify <code>gpu_kind</code> are only available in <code>ORD</code>, so that’s what we’ve set the <code>primary_region</code> to. We’re rolling out more GPUs to more regions in a hurry — but for now we’ll host the bot in Chicago.
</li><li>Out of the box, 8GB of system RAM is suggested. In my testing this wasn’t close to enough: the Machine would frequently run out of memory and crash. I got things working better by using 16GB of RAM.
</li><li>The FastAPI server binds to port 8888; we need to set this as our <code>internal_port</code>, or the Fly Proxy won’t know where to send requests.
</li><li>We want our Machine to <a href='https://fly.io/docs/apps/autostart-stop/' title=''>automatically stop and start</a>.
</li><li>Flycast doesn’t do HTTPS, so we won’t force it here. Don’t worry, it’s still encrypted over the wire!
</li><li>A volume is automatically created on the first deploy. On first boot, the app clones the Fooocus repo and downloads the Stable Diffusion model checkpoints onto that volume. This takes a couple of minutes, but the next time the Machine starts, it’ll have everything it needs to serve a request within seconds.
</li></ol>
<div class="callout"><p>The <a href="https://github.com/fly-apps/not-midjourney-bot/blob/84e72d1e7048627b7c845fe3d44d45b278e451d5/README.md" title=""><strong class="font-[550] text-navy-950">README</strong></a> for this project has detailed instructions about setting up your Discord bot and adding it to a server. After setting up the permissions and privileged intents, you’ll get an OAuth2 URL. Use this URL to invite your bot to your Discord server and confirm the permissions. Once that’s done, grab your Discord API token; you’ll need it for the next step.</p>
</div>
<p><strong class='font-[550] text-navy-950'>With the API server up and running, it’s time to deploy the Discord bot.</strong> This app will run on a normal Fly Machine, no GPU required. First, set the <code>DISCORD_TOKEN</code> and <code>FOOOCUS_API_URL</code> (the Flycast endpoint for the API server) secrets, using the Fly CLI. Then deploy:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-3habjmu5"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3habjmu5">fly deploy <span class="se">\</span>
<span class="nt">--image</span> ghcr.io/fly-apps/not-midjourney-bot:bot <span class="se">\</span>
<span class="nt">--config</span> ./bot/fly.toml <span class="se">\</span>
<span class="nt">--no-public-ips</span>
</code></pre>
</div>
</div>
<p>Notice that the bot app doesn’t need to be publicly visible on the Internet either. Under the hood, the WebSocket connection to Discord’s Gateway API allows the bot to communicate freely without the need to define any services in our <code>fly.toml</code>. This also means that the Fly Proxy will not downscale the app like it does the GPU Machine — the bot will always appear “online”.</p>
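<p>The two secrets mentioned before the deploy command could be set along these lines. This is a minimal sketch, not the project’s documented procedure: the <code>flycast_url</code> helper is hypothetical, the token is a placeholder, and it assumes the bot accepts a bare <code>http://&lt;app&gt;.flycast</code> base URL for the server app named in the <code>fly.toml</code> shown earlier. The command is printed rather than executed so that nothing changes by accident.</p>

```shell
#!/bin/sh
# Hypothetical helper: build the Flycast URL for a given app name.
flycast_url() {
  printf 'http://%s.flycast\n' "$1"
}

SERVER_APP="alpaca-image-gen"            # app name from the server fly.toml
API_URL="$(flycast_url "$SERVER_APP")"

# Dry run: print the command instead of running it.
# Swap in a real token and drop the leading `echo` to apply it.
echo fly secrets set \
  --config ./bot/fly.toml \
  "DISCORD_TOKEN=your-discord-token" \
  "FOOOCUS_API_URL=${API_URL}"
```

<p>Secrets have to exist before the bot starts, so in practice this runs ahead of the <code>fly deploy</code> for the bot app.</p>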
<figure class="post-cta">
<figcaption>
<h1>Not interested in GPUs?</h1>
<p>You can still deploy apps on Fly.io today and be up and running in a matter of minutes.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/speedrun/">
Deploy an app now<span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-turtle.webp" srcset="/static/images/cta-turtle@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='how-do-i-know-this-thing-is-using-gpu-for-reals' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-do-i-know-this-thing-is-using-gpu-for-reals' aria-label='Anchor'></a>How Do I Know This Thing Is Using GPU for Reals?</h2>
<p>That’s easy! NVIDIA provides us with a neat little command-line utility called <code>nvidia-smi</code> which we can use to monitor and get information about NVIDIA GPU devices.</p>
<p>Let’s SSH to the running Machine for the API server app and run an <code>nvidia-smi</code> query in one go. It’s a little clunky, but you’ll get the point:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-8jn1gsg9"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-8jn1gsg9">fly ssh console <span class="se">\</span>
<span class="nt">-C</span> <span class="s2">"nvidia-smi --query-gpu=gpu_name,utilization.gpu,utilization.memory,temperature.gpu,power.draw --format=csv,noheader --loop"</span>
</code></pre>
</div>
</div><div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-q1glrj62"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-q1glrj62">Connecting to fdaa:2:f664:a7b:210:d8b2:8fd8:2... complete
NVIDIA L40S, 0 %, 0 %, 46, 88.63 W
NVIDIA L40S, 0 %, 0 %, 46, 88.61 W
NVIDIA L40S, 36 %, 4 %, 51, 103.41 W
NVIDIA L40S, 65 %, 25 %, 57, 280.90 W
NVIDIA L40S, 0 %, 0 %, 49, 91.13 W
NVIDIA L40S, 0 %, 0 %, 48, 89.76 W
</code></pre>
</div>
</div>
<p>What we’ve done is run the command in a loop while the bot is actually doing work synthesizing an image, so we get to see the GPU ramp up and consume more wattage and VRAM. The card is barely breaking a sweat!</p>
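<p>If you’d rather not eyeball the loop, the CSV lines can be summarized with a little <code>awk</code>. The sketch below is an illustration only: <code>peak_gpu</code> is a made-up helper name, and it assumes the same five-column query used above.</p>

```shell
#!/bin/sh
# Summarize `nvidia-smi ... --format=csv,noheader` lines read from stdin.
# Columns: name, GPU utilization, memory utilization, temperature, power draw.
peak_gpu() {
  awk -F', ' '
    NF == 5 {
      util = $2 + 0       # "65 %" is read as 65
      watts = $5 + 0      # "280.90 W" is read as 280.9
      if (util > max_util) max_util = util
      if (watts > max_watts) max_watts = watts
    }
    END { printf "peak util: %d %%, peak power: %.2f W\n", max_util, max_watts }
  '
}

# Sample lines copied from the output above.
printf '%s\n' \
  'NVIDIA L40S, 0 %, 0 %, 46, 88.63 W' \
  'NVIDIA L40S, 36 %, 4 %, 51, 103.41 W' \
  'NVIDIA L40S, 65 %, 25 %, 57, 280.90 W' \
  'NVIDIA L40S, 0 %, 0 %, 49, 91.13 W' | peak_gpu
```

<p>Lines that don’t have five fields, such as the “Connecting to” banner, are skipped.</p>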
<h2 id='how-much-will-these-alpaca-pics-cost-me' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-much-will-these-alpaca-pics-cost-me' aria-label='Anchor'></a>How Much Will These Alpaca Pics Cost Me?</h2>
<p>Let’s talk about the cost-effectiveness of this setup. On Fly.io, an L40S GPU <a href='https://fly.io/docs/about/pricing/#gpus-and-fly-machines' title=''>costs</a> $2.50/hr. Tack on a few cents per hour for the VM resources and storage for our models and you’re looking at about $3.20/hr to run the GPU Machine. It’s <em>on-demand</em>, too: if you’re not using the compute, you’re not paying for it! Keep in mind that some of these checkpoint models can be several gigabytes, and if you create a volume, you will be charged for it even when you have no Machines running. It’s also worth noting that the non-GPU bot app falls into our <a href='https://fly.io/docs/about/pricing/#free-allowances' title=''>free allowance</a>.</p>
<div class="right-sidenote"><p>Rates are on-demand, with no minimum usage requirements. Discounted rates for reserved GPU Machines and dedicated hosts are also available if you email <a href="mailto:sales@fly.io" title="">sales@fly.io</a></p>
</div>
<p>In comparison, Midjourney offers several subscription tiers with the cheapest plan costing $10/mo and providing 3.3 hours of “fast” GPU time (roughly equivalent to an enterprise-grade Fly GPU). This works out to about $3/hr give or take a few cents.</p>
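<p>The arithmetic behind that comparison is easy to reproduce. Here is a small sketch using only the figures quoted in this section; the helper names are invented for illustration, and the 30-minute session is just an example duration.</p>

```shell
#!/bin/sh
# Dollars and hours in, dollars per hour out.
hourly_rate() {
  awk -v cost="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", cost / hours }'
}

# Hourly rate and minutes of use in, dollars out.
usage_cost() {
  awk -v rate="$1" -v minutes="$2" 'BEGIN { printf "%.2f\n", rate * minutes / 60 }'
}

hourly_rate 10 3.3     # Midjourney: $10/mo for 3.3 fast hours -> 3.03
usage_cost 3.20 30     # Fly GPU Machine at ~$3.20/hr for 30 minutes -> 1.60
```

<p>Because the GPU Machine stops when idle, the second number is the one that matters for occasional use. Volume storage is billed separately, as noted above.</p>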
<h2 id='where-can-i-take-this' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#where-can-i-take-this' aria-label='Anchor'></a>Where Can I Take This?</h2>
<p>There is a lot you can do to build out the bot’s functionality. You control the source code for the bot, meaning that you can make it do <em>whatever you want</em>. You might decide to mimic Midjourney’s <code>/blend</code> command to splice your own images into prompts (AKA img2img diffusion). You can do this by adding more commands to your <a href='https://guide.pycord.dev/popular-topics/cogs' title=''>Cog</a>, Pycord’s way of grouping similar commands. You might decide to add a button to roll the image if you don’t like it, or even specify the number of images to return. The possibilities are endless and your cloud bill’s the limit!</p>
<p>The full code for the bot and server (with detailed instructions on how to deploy it on Fly.io) can be found <a href='https://github.com/fly-apps/not-midjourney-bot' title=''><strong class='font-[550] text-navy-950'>here</strong></a>.</p>
/blog/not-midjourney-bot/assets/purple-balloon-taking-off-thumb.webpFly With Alpinehttps://fly.io/blog/fly-with-alpine/2023-12-21T00:00:00+00:002024-02-05T16:04:11+00:00<div class="lead"><p>Reduce image sizes and improve startup times by switching your base image to Alpine Linux.</p>
</div>
<p>Before proceeding, a caution. This is an engineering trade-off. Test carefully before deploying to production.</p>
<p>By the end of this blog post you should have the information you need to make an informed decision.</p>
<h2 id='introduction' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#introduction' aria-label='Anchor'></a>Introduction</h2>
<p><a href='https://www.alpinelinux.org/about/' title=''>Alpine Linux</a> is a Linux distribution that advertises itself as Small. Simple. Secure.</p>
<p>It is indisputably smaller than the alternatives – when measured by image size. More on that in a bit. Some claim that this results in less memory usage and better performance. Others dispute these claims. For these, it is best that you test the results for yourself with your application.</p>
<p>Simple is harder to measure. Some of the larger differences, like <a href='https://github.com/OpenRC/openrc#readme' title=''>OpenRC</a> vs <a href='https://systemd.io/' title=''>systemd</a>, are less relevant in container environments. Others, like <a href='https://busybox.net/' title=''>BusyBox</a>, are implementation details. Essentially what you get is a Linux distribution with a number of standard packages (e.g., bash) not installed by default, but these can be easily added if needed.</p>
<p>Secure is definitely an important attribute. The alternatives make comparable claims in this area. Do your own research and come to your own conclusions.</p>
<p>Not mentioned is the downside: Alpine Linux has a smaller ecosystem than the alternatives, particularly when compared to Debian.</p>
<h2 id='baseline' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#baseline' aria-label='Anchor'></a>Baseline</h2>
<p>Let’s start with a baseline consisting of the Dockerfiles produced by <code>fly launch</code> for some of the most popular
frameworks:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-84di2x73"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-84di2x73">FROM fideloper/fly-laravel:${PHP_VERSION}
FROM hexpm/elixir:1.12.3-erlang-24.1.4-debian-bullseye-20210902-slim
FROM node:${NODE_VERSION}-slim
FROM oven/bun:${BUN_VERSION}-slim
FROM python:${PYTHON_VERSION}-slim-bullseye
FROM ruby:$RUBY_VERSION-slim
</code></pre>
</div>
</div>
<p>What may not be obvious to the naked eye from these results is that the base image for these is one of the following:</p>
<ul>
<li>Debian Bookworm (the current “stable” distribution)
</li><li>Debian Bullseye (the previous “stable” distribution)
</li><li>Ubuntu Focal Fossa (the previous LTS release of Ubuntu)
</li></ul>
<p>Once you factor in that Ubuntu is based on Debian, the conclusion is that Debian is effectively the default distribution for Fly.io. Rest assured that this isn’t the result of a devious conspiracy by Fly.io, but rather a reflection of the default choices made independently by the developers of a number of frameworks and runtimes. Beyond this, all Fly.io is doing is choosing the “slim” version of the default distribution for each framework as the base.</p>
<p>What’s likely going on here is a virtuous circle: people choose Debian because of the ecosystem, and the ecosystem grows because people choose Debian.</p>
<p>Now let’s compare base image sizes:</p>
<table class="ml-8 mb-8">
<thead>
<tr>
<th class="px-8">
<th class="px-8 underline">Alpine
<th class="px-8 underline">Debian slim
</tr>
</thead>
<tbody>
<tr>
<th class="text-left">Bun 1.0.18
<td class="text-center">43.10M
<td class="text-center">63.84M
</tr>
<tr>
<th class="text-left">Node 21.4.0
<td class="text-center">46.83M
<td class="text-center">70.08M
</tr>
<tr>
<th class="text-left">Python 3.12.1
<td class="text-center">17.59M
<td class="text-center">45.36M
</tr>
<tr>
<th class="text-left">Ruby 3.2
<td class="text-center">40.14M
<td class="text-center">74.36M
</tr>
</tbody>
</table>
<p>And these numbers are just for the base images. I’ve measured a minimal Rails/PostgreSQL/esbuild application at 304MB on Alpine and 428MB on Debian Slim, a minimal Bun application at 110MB on Alpine and 173MB on Debian Slim, and a minimal Node application at 142MB on Alpine and 207MB on Debian Slim.</p>
<p>In each case, corresponding Alpine images are consistently smaller than their Debian slim equivalent.</p>
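<p>To put the table in relative terms, the savings can be worked out as a percentage. A minimal sketch, using the base image sizes from the table above; <code>savings_pct</code> is a hypothetical helper name.</p>

```shell
#!/bin/sh
# Percentage by which the Alpine image is smaller than the Debian slim one.
savings_pct() {
  awk -v a="$1" -v d="$2" 'BEGIN { printf "%.1f\n", (1 - a / d) * 100 }'
}

# Sizes (in MB) taken from the table: Alpine first, Debian slim second.
echo "Bun:    $(savings_pct 43.10 63.84)%"
echo "Node:   $(savings_pct 46.83 70.08)%"
echo "Python: $(savings_pct 17.59 45.36)%"
echo "Ruby:   $(savings_pct 40.14 74.36)%"
```

<p>For these four base images the reduction ranges from roughly a third (Bun, Node) to about three fifths (Python).</p>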
<h2 id='switching-distributions' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#switching-distributions' aria-label='Anchor'></a>Switching Distributions</h2>
<p>Switching distributions (and switching back!) is easy.</p>
<p>The first change is to replace <code>-slim</code> with <code>-alpine</code> in <code>FROM</code> statements in your <code>Dockerfile</code>.</p>
<p>Next is to replace <code>apt-get update</code> with <code>apk update</code> and <code>apt-get install</code> with <code>apk add</code>. Delete any options you may have like <code>-y</code> and <code>--no-install-recommends</code> - they aren’t needed.</p>
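<p>Those two mechanical steps can be roughed out with <code>sed</code>. The following is only a sketch under some assumptions: <code>to_alpine</code> is an invented name, it handles only tags that end in <code>-slim</code>, and it leaves package names alone, so treat the output as a starting point to review rather than a finished <code>Dockerfile</code>.</p>

```shell
#!/bin/sh
# Rewrite a Debian-slim Dockerfile (stdin) toward Alpine (stdout).
to_alpine() {
  sed -E \
    -e '/^FROM /s/-slim( |$)/-alpine\1/' \
    -e 's/apt-get update/apk update/' \
    -e 's/apt-get install/apk add/' \
    -e '/apk add/s/ --no-install-recommends//g' \
    -e '/apk add/s/ -y( |$)/\1/g'
}

# Example input; the image tag and package are placeholders.
printf '%s\n' \
  'FROM node:20-slim AS base' \
  'RUN apt-get update && apt-get install -y --no-install-recommends curl' \
  | to_alpine
```

<p>Run against a real file with something like <code>to_alpine &lt; Dockerfile &gt; Dockerfile.alpine</code>, then diff the two. Switching back is the same idea in reverse.</p>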
<p>Now review the names of the packages you are installing. Many are named the same. A few are different.
You can use <a href='https://pkgs.alpinelinux.org/packages' title=''>alpine packages</a> to look for ones to use. Some examples of
differences:</p>
<table class="ml-8 mb-8" style="border-collapse: separate; border-spacing: 1rem 0">
<thead>
<tr>
<th class="px-8 underline text-left">Debian
<th class="px-8 underline text-left">Alpine
</tr>
</thead>
<tbody>
<tr>
<td>build-essential
<td>build-base
</tr>
<tr>
<td>chromium-sandbox
<td>chromium-chromedriver
</tr>
<tr>
<td>default-libmysqlclient-dev
<td>mysql-client
</tr>
<tr>
<td>default-mysqlclient
<td>mysql-client
</tr>
<tr>
<td>freetds-bin
<td>freetds
</tr>
<tr>
<td>libicu-dev
<td>icu-dev
</tr>
<tr>
<td>libjemalloc
<td>jemalloc-dev
</tr>
<tr>
<td>libjpeg-dev
<td>jpeg-dev
</tr>
<tr>
<td>libmagickwand-dev
<td>imagemagick-libs
</tr>
<tr>
<td>libsqlite3-0
<td>sqlite-dev
</tr>
<tr>
<td>libtiff-dev
<td>tiff-dev
</tr>
<tr>
<td>libvips
<td>vips-dev
</tr>
<tr>
<td>node-gyp
<td>gyp
</tr>
<tr>
<td>pkg-config
<td>pkgconfig
</tr>
<tr>
<td>python
<td>python3
</tr>
<tr>
<td>python-is-python3
<td>python3
</tr>
<tr>
<td>sqlite3
<td>sqlite
</tr>
</tbody>
</table>
<p>Note: the above is just an approximation. For example, while <code>libsqlite3-0</code> and <code>sqlite-dev</code> include everything
you need to build an application that uses sqlite3, all that is needed at runtime is <code>sqlite-lib</code>. This relentless attention to detail contributes to smaller final image sizes.</p>
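<p>A lookup like the table above is simple to script. Here is a small sketch covering a handful of its rows; the <code>alpine_pkg</code> function is hypothetical, and names it doesn’t know are passed through unchanged, since many packages are named the same in both distributions.</p>

```shell
#!/bin/sh
# Translate a Debian package name to its rough Alpine counterpart.
# Only a few rows of the table are covered; everything else passes through.
alpine_pkg() {
  case "$1" in
    build-essential)   echo build-base ;;
    libicu-dev)        echo icu-dev ;;
    libjpeg-dev)       echo jpeg-dev ;;
    pkg-config)        echo pkgconfig ;;
    python-is-python3) echo python3 ;;
    sqlite3)           echo sqlite ;;
    *)                 echo "$1" ;;
  esac
}

for pkg in build-essential pkg-config curl; do
  alpine_pkg "$pkg"
done
```

<p>As with the table itself, this is an approximation. Check each result against the Alpine package index before relying on it.</p>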
<p>Note: For Bun, Node, and Rails users, knowledge of how to apply the above changes is included in recent versions of the Dockerfile generators that we provide. After all, computers are good at <code>if</code> statements:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-6dt87hio"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-6dt87hio">bunx dockerfile --alpine
npx dockerfile --alpine
bin/rails generate dockerfile --alpine
</code></pre>
</div>
</div><figure class="post-cta">
<figcaption>
<h1>Choose your own Linux Distribution</h1>
<p>Deploy your project in a few minutes with Fly Launch. Then do more with Fly Machines.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/">
Run your entire stack near your users
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-cat.webp" srcset="/static/images/cta-cat@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='potential-issues' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#potential-issues' aria-label='Anchor'></a>Potential issues</h2>
<p>Over time, we’ve noted a number of issues.</p>
<ul>
<li>Alpine uses <a href='https://musl.libc.org/' title=''>musl</a> as its C standard library. Debian uses <a href='https://www.gnu.org/software/libc/' title=''>glibc</a>. Software tested on glibc may not work as expected on musl. There are other potential compatibility issues too, like <a href='https://bell-sw.com/blog/how-to-deal-with-alpine-dns-issues/' title=''>DNS</a>.
</li><li>Debian includes both <code>adduser</code> and <code>useradd</code>. Alpine, by default, only includes <code>adduser</code>.
This can be addressed by installing a package like <a href='https://pkgs.alpinelinux.org/package/edge/community/armv7/shadow' title=''>shadow</a>, or by switching to <code>adduser</code>.
</li><li>Packages like <a href='https://github.com/nodenv/node-build' title=''>node-build</a> require <code>bash</code> which isn’t included by default. Adding it back in allows <code>node-build</code> to run to completion, but the end result is that a precompiled Debian executable is installed that won’t run on Alpine.
An alternative is to download an <a href='https://unofficial-builds.nodejs.org/' title=''>unofficial build</a>.
</li><li>Release candidates for Alpine may not get the same level of testing as Debian, resulting in problems
like <a href='https://github.com/sparklemotion/sqlite3-ruby/issues/434' title=''>sqlite3-ruby not working on Alpine 3.19</a>.
In cases like this, stay on the previous version of Alpine for a short while, or compile the gem yourself.
These issues are temporary.
</li><li>Some packages, like Chrome, are not available for Alpine. Alternatives like Chromium may be necessary.
</li></ul>
<h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>While not as large a community as Debian, there is a substantial number of happy Alpine users.</p>
<p>For the foreseeable future, the default for both frameworks, and therefore for Fly.io, will remain Debian, but we make it easy to switch.</p>
<p>Try it out! Hopefully this post has given you a sense of what to evaluate before you switch.</p>
<h1>Introducing Fly Kubernetes</h1>
<p><a href="https://fly.io/blog/fks/">https://fly.io/blog/fks/</a> (published 2023-12-18, updated 2024-03-08)</p>
<div class="lead"><p>We’re Fly.io, and if you’ve been following us awhile you probably just did a double-take. We’re building a new public cloud that runs containerized applications with virtual machine isolation on our own hardware around the world. And we’ve been doing it without any K8s. Until now!</p>
</div><div class="callout"><p><strong class="font-[550] text-navy-950">Update, March 2024:</strong> FKS does more stuff now, and you can read about it in <a href="https://fly.io/blog/fks-beta-live/" title="">Fly Kubernetes does more now</a></p>
</div>
<p>We’ll own it: we’ve been snarky about Kubernetes. We are, at heart, old-school Unix nerds. We’re still scandalized by <code>systemd</code>.</p>
<p>To make matters more complicated, the problems we’re working on <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/' title=''>have a lot of overlap with K8s</a>, but <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/#numad' title=''>just enough impedance mismatch</a> that it (<a href='https://www.nomadproject.io/' title=''>or anything that looks like it</a>) is a bad fit for our own platform.</p>
<p>But, come on: you never took us too seriously about K8s, right? K8s is hard for us to use, but that doesn’t mean it’s not a great fit for what you’re building. We’ve been clear about that all along, right? Sure we have!</p>
<p>Well, good news, everybody! If K8s is important for your project, and that’s all that’s been holding you back from <a href='https://fly.io/docs/speedrun/' title=''>trying out Fly.io</a>, we’ve spent the past several months building something for you.</p>
<h2 id='fly-io-for-kubernetians' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#fly-io-for-kubernetians' aria-label='Anchor'></a>Fly.io For Kubernetians</h2>
<p>Fly.io works by transmogrifying Docker containers into filesystems for <a href='https://firecracker-microvm.github.io/' title=''>lightweight hypervisors</a>, and running them on servers we rack in dozens of regions around the world.</p>
<p>You can build something like Fly.io with “standard” orchestration tools like K8s. In fact, that’s what we did to start, too. To keep things simple, we used Nomad, and instead of K8s CNIs, we built our own Rust-based TLS-terminating Anycast proxy (and designed a WireGuard/IPv6-based private network system <a href='https://fly.io/blog/bpf-xdp-packet-filters-and-udp/' title=''>based on eBPF</a>). But the ideas are the same.</p>
<p>The way we look at it, the signature feature of a “standard” orchestrator is the global scheduler: the global eye in the sky that keeps track of vacancies on servers and optimizes placement of new workloads. That’s the problem we ran into. We’re running over 200,000 applications, and we’re doing so on every continent except Antarctica. The speed of light (and a globally distributed network of backhoes) has something to say about keeping a perfectly consistent global picture of hundreds of thousands of applications, and it’s not pleasant.</p>
<p>The other problem we ran into is that our Nomad scheduler kept trying to outsmart us, and, worse, our customers. It turns out that our users have pretty firm ideas of where they’d like their apps to run. If they ask for São Paulo, they want São Paulo, not Rio. But global schedulers have other priorities, like optimally bin-packing resources, and sometimes <code>GIG</code> looks just as good as <code>GRU</code> to them.</p>
<p>To escape the scaling and DX problems we were hitting, we rethought orchestration. Where orchestrators like K8s tend to work through distributed consensus, we keep state local to workers. Each racked server in our fleet is a source of truth about the apps running on it, and provides an API to a market-style “scheduler” that bids on resources in regions. <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/#numad' title=''>You can read more about it here, if you’re interested.</a> We call this system the <a href='https://fly.io/docs/machines/' title=''>Fly Machines API.</a></p>
<p>An important detail to grok about how this all works – a reason we haven’t, like, beaten the CAP theorem by doing this – is that Fly Machines API calls can fail. If Nomad or K8s tries to place a workload on some server, only to find out that it’s filled up or thrown a rod, it will go hunt around for some other place to put it, like a good little robot. The Machines API won’t do this. It’ll just fail the request. In fact, it goes out of its way to fail the request quickly, to deliver feedback; if we can’t schedule work in <code>JNB</code> right now, you might want instead to quickly deploy to <code>BOM</code>.</p>
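<p>Here’s a rough sketch of what that fail-fast contract asks of whoever is driving the API. It’s illustrative Go, not <code>flyctl</code>’s code: <code>placeFunc</code>, <code>placeWithFallback</code>, and <code>ErrNoCapacity</code> are names made up for this example, and the capacity check is simulated rather than a real Machines API call.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoCapacity is a hypothetical error standing in for a placement
// request that failed fast. The real API reports failures over HTTP.
var ErrNoCapacity = errors.New("no capacity in region")

// placeFunc is a hypothetical stand-in for one "create a Machine in this
// region" call. It makes a single attempt and never retries elsewhere.
type placeFunc func(region string) error

// placeWithFallback tries each region in order and returns the first one
// that accepts the workload. The retry policy lives in the caller, not in
// the platform.
func placeWithFallback(place placeFunc, regions []string) (string, error) {
	if len(regions) == 0 {
		return "", errors.New("no regions given")
	}
	var errs []error
	for _, region := range regions {
		err := place(region)
		if err == nil {
			return region, nil
		}
		// Remember why this region was rejected, then move on.
		errs = append(errs, fmt.Errorf("%s: %w", region, err))
	}
	return "", errors.Join(errs...)
}

func main() {
	// Simulated capacity: jnb is full, bom has room.
	full := map[string]bool{"jnb": true}
	place := func(region string) error {
		if full[region] {
			return ErrNoCapacity
		}
		return nil
	}

	region, err := placeWithFallback(place, []string{"jnb", "bom"})
	if err != nil {
		fmt.Println("placement failed:", err)
		return
	}
	fmt.Println("placed in", region)
}
```

<p>The point of the sketch is where the loop lives: the platform answers one question quickly, and the caller decides what to try next.</p>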
<h2 id='pluggable-orchestration-and-fks' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#pluggable-orchestration-and-fks' aria-label='Anchor'></a>Pluggable Orchestration and FKS</h2>
<p>In a real sense what we’ve done here is extract a chunk of the scheduling problem out of our orchestrator and hand it off to other components. For most of our users, that component is <a href='https://github.com/superfly/flyctl' title=''><code>flyctl</code>, our intrepid CLI</a>.</p>
<p>But <a href='https://fly.io/docs/machines/working-with-machines/' title=''>Fly Machines is an API</a>, and anything can drive it. A lot of our users want quick answers to requests to schedule apps in specific regions, and <code>flyctl</code> does a fine job of that. But it’s totally reasonable to want something that works more like the good little robots inside of K8s.</p>
<p>You can build your own orchestrator with our API, but if what you’re looking for is literally Kubernetes, we’ve saved you the trouble. It’s called Fly Kubernetes, or FKS for short.</p>
<p>FKS is an implementation of Kubernetes that runs on top of Fly.io. You start it up using <code>flyctl</code>, by running <code>flyctl ext k8s create</code>.</p>
<p>Under the hood, FKS is a straightforward combination of two well-known Kubernetes projects: <a href='https://k3s.io/' title=''>K3s, the lightweight CNCF-certified K8s distro</a>, and <a href='https://virtual-kubelet.io/' title=''>Virtual Kubelet</a>.</p>
<p>Virtual Kubelet is interesting. In K8s-land, a <code>kubelet</code> is a host agent; it’s the thing that runs on every server in your fleet that knows how to run a K8s Pod. Virtual Kubelet isn’t a host agent; it’s a software component that pretends to be a host, registering itself with K8s as if it was one, but then sneakily proxying the Kubelet API elsewhere.</p>
<p>In FKS, “elsewhere” is <a href='https://fly.io/docs/machines/' title=''>Fly Machines</a>. All we have to do is satisfy the interfaces that Virtual Kubelet expects a provider to implement. For example, the API for the lifecycle of a pod:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-5xi3lf5o">type PodLifecycleHandler interface {
    CreatePod(ctx context.Context, pod *corev1.Pod) error
    UpdatePod(ctx context.Context, pod *corev1.Pod) error
    DeletePod(ctx context.Context, pod *corev1.Pod) error
    GetPod(ctx context.Context, namespace, name string) (*corev1.Pod, error)
    GetPodStatus(ctx context.Context, namespace, name string) (*corev1.PodStatus, error)
    GetPods(context.Context) ([]*corev1.Pod, error)
}
</code></pre>
</div>
</div>
<p>This interface is easy to map to the Fly Machines API. For example:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-jw9k7vp7">CreatePod -> POST /apps/{app_name}/machines
UpdatePod -> POST /apps/{app_name}/machines/{machine_id}
</code></pre>
</div>
</div>
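<p>To make the mapping concrete, here’s a small sketch that turns a pod operation into the HTTP method and path from the two lines above. It only covers those two mappings. The <code>route</code> and <code>podOp</code> types, the <code>routeFor</code> helper, and the app and machine names are invented for illustration; this is not the FKS provider’s code, and it skips the request body entirely.</p>

```go
package main

import "fmt"

// route is a hypothetical pairing of an HTTP method and a Machines API path.
type route struct {
	Method string
	Path   string
}

// podOp names the PodLifecycleHandler methods this sketch covers.
type podOp string

const (
	opCreatePod podOp = "CreatePod"
	opUpdatePod podOp = "UpdatePod"
)

// routeFor returns the Machines API route for a pod operation, following
// the two mappings listed above. machineID is ignored for CreatePod.
func routeFor(op podOp, appName, machineID string) (route, error) {
	switch op {
	case opCreatePod:
		return route{"POST", fmt.Sprintf("/apps/%s/machines", appName)}, nil
	case opUpdatePod:
		if machineID == "" {
			return route{}, fmt.Errorf("%s needs a machine ID", op)
		}
		return route{"POST", fmt.Sprintf("/apps/%s/machines/%s", appName, machineID)}, nil
	default:
		// Anything else is outside what this sketch maps.
		return route{}, fmt.Errorf("no mapping for %q", op)
	}
}

func main() {
	r, _ := routeFor(opCreatePod, "my-fks-app", "")
	fmt.Println(r.Method, r.Path)

	r, _ = routeFor(opUpdatePod, "my-fks-app", "machine-123")
	fmt.Println(r.Method, r.Path)
}
```

<p>A real provider would also translate the Pod spec into a Machine config and track which Machine backs which Pod; the sketch leaves both out.</p>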
<p>K3s, meanwhile, is a stripped-down implementation of all of K8s that fits into a single binary. K3s does a bunch of clever things to be as streamlined as it is, but the most notable of them is <a href='https://github.com/k3s-io/kine' title=''>kine, an API shim that switches <code>etcd</code> out with databases like SQLite</a>. Because of <code>kine</code>, K3s can manage multiple servers, but also gracefully runs on a single server, without distributed state.</p>
<p>So that’s what we do. When you create a cluster, we run K3s and the Virtual Kubelet on a single Fly Machine. We compile a <a href='https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/' title=''>kubeconfig</a>, with which you can talk to your K3s via <code>kubectl</code>. We set the whole thing up to run Pods on individual Fly Machines, so your cluster scales out directly using our platform, but with K8s tooling.</p>
<p>One thing we like about this design is how much of the lifting is already done for us by the underlying platform. If you’re a K8s person, take a second to think of all the different components you’re dealing with: <a href='https://etcd.io/' title=''>etcd</a>, specifically provisioned nodes, the <a href='https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/' title=''>kube-proxy</a>, <a href='https://github.com/flannel-io/flannel' title=''>a CNI </a>binary and configuration and its integration with the host network, containerd, registries. But Fly.io already does most of those things. So this project was mostly chipping away components until we found the bare minimum: CoreDNS, SQLite persistence, and Virtual Kubelet.</p>
<p>We ended up with something significantly simpler than K3s, which is saying something.</p>
<p>Fly Kubernetes has some advantages over plain <code>flyctl</code> and <code>fly.toml</code>:</p>
<ul>
<li>Your deployment is more declarative than it is with the <code>fly.toml</code> file. You declare the exact state of everything down to replica counts, autoscaling rules, volume definitions, and more.
</li><li>When you deploy with Fly Kubernetes, Kubernetes will automatically make the state of the world match your definitions. Machines go down? Kubernetes will whack them back online.
</li></ul>
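<p>If “declarative” feels abstract, this is the shape of the loop that does the whacking. It’s a deliberately tiny sketch, not Kubernetes’ controller code: <code>machineState</code> and <code>reconcile</code> are hypothetical names, the actions are just strings, and it only handles the case where you have fewer running Machines than you asked for.</p>

```go
package main

import "fmt"

// machineState is a hypothetical, simplified view of one Machine.
type machineState struct {
	ID      string
	Running bool
}

// reconcile compares the desired replica count with what was observed and
// returns the actions needed to close the gap. It restarts stopped
// Machines first and only then asks for new ones. Scaling down is out of
// scope for this sketch.
func reconcile(desired int, observed []machineState) []string {
	var actions []string

	// Count what is already running.
	running := 0
	for _, m := range observed {
		if m.Running {
			running++
		}
	}

	// Bring stopped Machines back until the target is met.
	for _, m := range observed {
		if running >= desired {
			break
		}
		if !m.Running {
			actions = append(actions, "start "+m.ID)
			running++
		}
	}

	// Still short? Ask for brand new Machines.
	for desired > running {
		actions = append(actions, "create")
		running++
	}
	return actions
}

func main() {
	observed := []machineState{{"m1", true}, {"m2", false}}
	for _, action := range reconcile(3, observed) {
		fmt.Println(action)
	}
}
```

<p>You declare “three replicas”; the loop works out that one Machine needs a restart and one needs creating. Run it again after those actions land and it returns nothing to do.</p>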
<p>This is a different way to do orchestration and scheduling on Fly.io. It’s not what everyone is going to want. But if you want it, you really want it, and we’re psyched to give it to you: Fly.io’s platform features, with Kubernetes handling configuration and driving your system to its desired state.</p>
<p>We’ve kept things simple to start with. There are K8s use cases we’re a strong fit for today, and others we’ll get better at in the near future, as K8s users drive the underlying platform (and particularly our proxy) forward.</p>
<p><strong class='font-[550] text-navy-950'>Interested in getting early access? Email us at <a href="mailto:sales@fly.io">sales@fly.io</a> and we’ll hook you up.</strong></p>
<figure class="post-cta">
<figcaption>
<h1>Not invested in K8s?</h1>
<p>Nothing has to change for you! You can deploy apps on Fly.io today, in a matter of minutes, without talking to Sales.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/speedrun/">
Deploy an app in minutes.<span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-kitty.webp" srcset="/static/images/cta-kitty@2x.webp 2x" alt="">
</div>
</figure>
<div class="youtube-container" data-exclude-render>
<div class="youtube-video">
<iframe
width="100%"
height="100%"
src="https://www.youtube.com/embed/A3vFfZvUiwo"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen>
</iframe>
</div>
</div>
<h2 id='what-it-all-means' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-it-all-means' aria-label='Anchor'></a>What It All Means</h2>
<p>One obvious thing it means is that if you’ve got an investment in Kubernetes tooling, you can keep it while running things on top of Fly.io. So that’s pretty neat. Buy our cereal!</p>
<p>But the computer science story is interesting, too. We placed a bet on an idiosyncratic strategy for doing global orchestration. We replaced global consensus, which is how Borg, Kubernetes, and Nomad all work, with a market-based system. That system was faster and, importantly, dumber than the consensus system it replaced.</p>
<p>This had costs! Nomad’s global consensus would do truly heroic amounts of work to make sure Fly Apps got scheduled somewhere, anywhere. Like a good capitalist, Fly Machines will tell you in no uncertain terms how much work it’s willing to do for you (“less than a Nomad”).</p>
<p>But that doesn’t mean you’re stuck with the answers Fly Machines gives by itself. Because Fly Machines is so simple, and tries so hard to be predictable, we hoped you’d be able to build more sophisticated scheduling and orchestration schemes on top of it. And here you go: Kubernetes scheduling, as a plugin to the platform.</p>
<p>More to come! We’re itching to see just how many different ways this bet might pay off. Or: we’ll perish in flames! Either way, it’ll be fun to watch.</p>
<h1>Fly.io has GPUs now</h1>
<p><a href="https://fly.io/blog/fly-io-has-gpus-now/">https://fly.io/blog/fly-io-has-gpus-now/</a> (published 2023-12-13, updated 2024-02-29)</p>
<div class="lead"><p>We’re Fly.io, we’re a new public cloud that lets you put your compute where it matters: near your users. Today we’re announcing that you can do this with GPUs too, allowing you to do AI workloads on the edge. Want to find out more? Keep reading.</p>
</div><h2 id='ai-is-pretty-fly' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#ai-is-pretty-fly' aria-label='Anchor'></a>AI is pretty fly</h2>
<p>AI is apparently a bit of a <em>thing</em> (maybe even <em>the</em> thing, come to think of it). We’ve seen entire industries get transformed in the wake of ChatGPT existing (somehow it’s only been around for a year, I can’t believe it either). It’s likely to leave a huge impact on society as a whole in the same way that the Internet did once we got search engines. Like any good venture-capital-funded infrastructure provider, we want to enable you to do hilarious things with AI using industrial-grade muscle.</p>
<p>Fly.io lets you run a full-stack app—or an entire dev platform based on the <a href='https://fly.io/docs/machines/' title=''>Fly Machines API</a>—close to your users. Fly.io GPUs let you attach an <a href='https://www.nvidia.com/en-us/data-center/a100/' title=''>Nvidia A100</a> to whatever you’re building, harnessing the full power of CUDA with more VRAM than your local 4090 can shake a ray-traced stick at. With these cards (or whatever you call a GPU attached to SXM fabric), AI/ML workloads are at your fingertips. You can <a href='https://fly.io/blog/transcribing-on-fly-gpu-machines/' title=''>recognize speech</a>, segment text, summarize articles, synthesize images, and more at speeds that would make your homelab blush. You can even set one up as your programming companion with <a href='https://github.com/deepseek-ai/DeepSeek-Coder' title=''>your model of choice</a> in case you’ve just not been feeling it with the output of <em>other</em> models changing over time.</p>
<p>If you want to find out more about what these cards are and what using them is like, check out <a href='https://fly.io/blog/what-are-these-gpus-really/' title=''>What are these “GPUs” really?</a> It covers the history of GPUs and why it’s ironic that the cards we offer are called “Graphics Processing Units” in the first place.</p>
<h2 id='fly-io-gpus-in-action' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#fly-io-gpus-in-action' aria-label='Anchor'></a>Fly.io GPUs in Action</h2>
<p>We want you to deploy your own code with your favorite models on top of Fly.io’s cloud backbone. Fly.io GPUs make this really easy.</p>
<p>You can get a GPU app running <a href='https://ollama.ai' title=''>Ollama</a> (our friends in text generation) in two steps:</p>
<ol>
<li><p>Put this in your <code>fly.toml</code>:</p>
<div class="highlight-wrapper group relative toml">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-bcwnxd1v"><span class="py">app</span> <span class="p">=</span> <span class="s">"sandwich_ai"</span>
<span class="py">primary_region</span> <span class="p">=</span> <span class="s">"ord"</span>
<span class="py">vm.size</span> <span class="p">=</span> <span class="s">"a100-40gb"</span>
<span class="nn">[build]</span>
<span class="py">image</span> <span class="p">=</span> <span class="s">"ollama/ollama"</span>
<span class="nn">[mounts]</span>
<span class="py">source</span> <span class="p">=</span> <span class="s">"models"</span>
<span class="py">destination</span> <span class="p">=</span> <span class="s">"/root/.ollama"</span>
<span class="py">initial_size</span> <span class="p">=</span> <span class="s">"100gb"</span>
</code></pre>
</div>
</div></li><li><p>Run <code>fly apps create sandwich_ai && fly deploy</code>.</p>
</li></ol>
<p>If you want to read more about how to start your new sandwich empire, check out <a href='https://fly.io/blog/scaling-llm-ollama/' title=''>Scaling Large Language Models to zero with Ollama</a>. It explains how to set up Ollama so that it <em>automatically scales itself down</em> when it’s not in use.</p>
<h2 id='the-speed-of-light-is-only-so-fast' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-speed-of-light-is-only-so-fast' aria-label='Anchor'></a>The speed of light is only so fast</h2>
<p>Being able to spin up GPUs is great, but where Fly.io really shines is inference at the edge.</p>
<p>Let’s say you have an app that lets users enter ingredients they have in their kitchen and receive a sandwich recipe. Your users expect their recipes <em>instantly</em> (or at least as fast as the other leading apps). Seconds count when you need an emergency sandwich.</p>
<div class="left-sidenote"><p><br>
<br>
<br>
It’s depressingly customary in the AI industry to cherry-pick outputs. This was not cherry-picked. I used <a href="https://ollama.ai/library/yi:34b" title="">yi:34b</a> to generate this recipe. I’m not sure what a taco salad sandwich is, but I might be willing to try it.</p>
</div>
<p><img alt="A conversation between a user and an artificial intelligence. The user asks: &quot;What kind of sandwich can I make with an egg, a taco shell, and a pound of ground beef?&quot; The AI responds: &quot;You can make a breakfast taco or a taco salad sandwich with the ingredients you have. Here's how to prepare each: for the breakfast taco. Cook the pound of ground beef in a pan until browned. Season with salt, pepper, and any other desired spices like chili powder or taco seasoning. While the beef is cooking, fry an egg (or scramble it if you prefer) in another pan. Warm up the taco shell in the oven or microwave according to the package instructions. Assemble the taco by placing the ground cooked beef in the taco shell followed by the fried egg. You can also add any additional toppings like diced onions avocados, avocado, salsa or cheese if you have them. Serve with a side of hot sauce or sour cream if desired. For the taco salad sandwich: Prepare the ground beef as in the breakfast taco recipe above. Instead of frying an egg, you can cook it sunny side up or over easy depending on your preference. Crumble the taco shell into small pieces and mix them with the cooked ground beef. Assemble the sandwich by layering the crumbled taco shell mixture and ground beef in two slices of bread. Top with the fried egg and any additional toppings you like such as shredded lettuce, diced tomatoes, or sliced olives. Serve with a side of salsa or ranch dressing if desired. Both options are hardy and can be easily customized based on your taste preferences. Enjoy!&quot;" src="/images/sandwich-recipes.webp" /></p>
<p>In the previous snippet, we deployed our app to Chicago (<code>primary_region = "ord"</code>). The good news is that our model returns a result really quickly and users in Chicago get instant sandwich recipes. It’s a good experience for users near your datacentre, and you can do this on any half-decent cloud provider.</p>
<p>But surely people outside of Chicago need sandwiches too. Amsterdam has sandwich fiends as well. And sometimes it takes too long to have their requests leap across the pond. The speed of light is only so fast after all. Don’t worry, we’ve got your back. Fly.io has GPUs in datacentres all over the world. Even better, we’ll let you run <em>the same program</em> with the same public IP address and the same TLS certificates in any region with GPU support.</p>
<p>Don’t believe us? See how you can scale your app up in Amsterdam with one command:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-16usijcv">fly scale count 2 --region ams
</code></pre>
</div>
</div>
<p>It’s that easy.</p>
<h2 id='actually-on-demand' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#actually-on-demand' aria-label='Anchor'></a>Actually On-Demand</h2>
<p>GPUs are powerful parallel processing packages, but they’re not cheap! Once we have enough people wanting to turn their fridge contents into tasty sandwiches, keeping a GPU or two running makes sense. But we’re just a small app still growing our user base while also funding the latest large sandwich model research. We want to only pay for GPUs when a user makes a request.</p>
<p>Let’s open up that <code>fly.toml</code> again, and add a section called <code>services</code>, and we’ll include instructions on how we want our app to scale up and down:</p>
<div class="highlight-wrapper group relative toml">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ba3j7hgm"><span class="nn">[[services]]</span>
<span class="py">internal_port</span> <span class="p">=</span> <span class="mi">8080</span>
<span class="py">protocol</span> <span class="p">=</span> <span class="s">"tcp"</span>
<span class="py">auto_stop_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">auto_start_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">min_machines_running</span> <span class="p">=</span> <span class="mi">0</span>
</code></pre>
</div>
</div>
<p>Now when no one needs sandwich recipes, you don’t pay for GPU time.</p>
<h2 id='the-deets' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-deets' aria-label='Anchor'></a>The Deets</h2>
<p>We have GPUs ready to use in several US and EU regions and Sydney. You can deploy your sandwich, music generation, or AI illustration apps to:</p>
<ul>
<li><a href='https://www.nvidia.com/en-us/data-center/a100/' title=''>Ampere A100s</a> with 40GB of RAM for $2.50/hr
</li><li><a href='https://www.nvidia.com/en-us/data-center/a100/' title=''>Ampere A100s</a> with 80GB of RAM for $3.50/hr
</li><li><a href='https://resources.nvidia.com/en-us-l40s/l40s-datasheet-28413' title=''>Lovelace L40S</a> cards are coming soon (update: now here!) for $2.50/hr
</li></ul>
<p>By default, anything you deploy to GPUs will use eight heckin’ <a href='https://www.amd.com/en/processors/epyc-server-cpu-family' title=''>AMD EPYC</a> CPU cores, and you can attach volumes up to 500 gigabytes. We’ll even give you discounts for reserved instances and dedicated hosts if you ask nicely.</p>
<p>We hope you have fun with these new cards and we’d love to see what you can do with them! Reach out to us on X (formerly Twitter) or <a href='https://community.fly.io/' title=''>the community forum</a> and share what you’ve been up to. We’d love to see what we can make easier!</p>
/blog/fly-io-has-gpus-now/assets/llama-portal-thumb.webpWhat are these "GPUs" really?https://fly.io/blog/what-are-these-gpus-really/2023-12-11T00:00:00+00:002024-02-20T19:03:55+00:00<div class="lead"><p>Fly.io runs containerized apps with virtual machine isolation on our own hardware around the world, so you can safely run your code close to where your users are. We’re in the process of rolling out GPU support, and that’s what this post is about, but you don’t have to wait for that to try us out: <a href="https://fly.io/docs/speedrun/" title="">your app can be up and running on us in minutes</a>.</p>
</div>
<p>GPU hardware will let our users run all sorts of fun Artificial Intelligence and Machine Learning (AI/ML) workloads near their users. But, what are these “GPUs” really? What can they do? What <em>can’t</em> they do?</p>
<p>Listen here for my tale of woe as I spell out exactly what these cards are, are not, and what you can do with them. By the end of this magical journey, you should understand the true irony of them being called “Graphics Processing Units” and why every marketing term is always bad forever.</p>
<h2 id='how-does-computer-formed' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-does-computer-formed' aria-label='Anchor'></a>How does computer formed?</h2>
<p>In the early days of computing, your computer generally had a few basic components:</p>
<ul>
<li>The CPU
</li><li>Input device and assorted peripherals (keyboard, etc)
</li><li>Output device (monitor, printer, etc)
</li><li>Memory
</li><li>Glue logic chips
</li><li>Video rendering hardware
</li></ul>
<p>Taking the Commodore 64 as an example, it had a CPU, a chip to handle video output, a chip to handle audio output, and a chip to glue everything together. The CPU would read instructions from the RAM and then execute them to do things like draw to the screen, solve sudoku puzzles, play sounds, and so on.</p>
<p>However, even though the CPU by itself was fast by the standards of the time, it could only do a million clock cycles per second or so. Imagine a very small shouting crystal vibrating millions of times per second triggering the CPU to do one part of a task and you’ll get the idea. This is fast, but not fast enough when executing instructions can take longer than a single clock cycle and when your video output device needs to be updated 60 times per second.</p>
<p>The main way they optimized this was by shunting a lot of the video output tasks to a bespoke device called the VIC-II (Video Interface Chip, version 2). This allowed the Commodore 64 to send a bunch of instructions to the VIC-II and then let it do its thing while the CPU was off doing other things. This is called “offloading”.</p>
<p><img src="/images/./deus-ex-machina-cover.webp" /></p>
<p>As technology advanced, the desire to do bigger and better things with both contemporary and future hardware increased. This came to a head when this little studio nobody had ever heard of called id Software released one of the most popular games of all time: DOOM.</p>
<p>Now, even though DOOM was a huge advancement in gaming technology, it was still incredibly limited by the hardware of the time. It was actually a 2D game that used a lot of tricks to make it look (and feel) like it was 3D. It was also limited to a resolution of 320x200 and a hard cap of 35 frames per second. This was fine for the time (most movies were only at 24 frames per second), but it was clear that there was a lot of room for improvement.</p>
<p>One of the main things that DOOM did was combine a pair of techniques, “raycasting” and binary-space partitioning, to draw the world in near real time. This basically means that they drew a bunch of imaginary lines to where points in the map would be to figure out what color everything would be and then eliminated the parts of the map that were behind walls and other objects. This is a very simplified explanation, and if you want to know more, <a href='https://fabiensanglard.net/doomIphone/doomClassicRenderer.php' title=''>Fabien Sanglard explains the rendering</a> of DOOM in more detail.</p>
<h2 id='the-dream-of-3d' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-dream-of-3d' aria-label='Anchor'></a>The dream of 3D</h2>
<p>However, a lot of this was logic that ran very slowly on the CPU, and while the CPU was doing the display logic, it couldn’t do anything else, such as enemy AI or playing sounds. Hence the idea of a “3D accelerator card”. The idea: offload the 3D rendering logic to a separate device that could do it much faster than the CPU could, and free the CPU to do other things like AI, sound, and so on.</p>
<p>This was the dream, but it was a long way off. Then Quake happened.</p>
<div class="right-sidenote"><p>Really, Half-Life is based on Quake so much that the pattern for <a href="https://www.pcgamer.com/half-life-alyxs-lights-flicker-just-like-they-did-in-quake-almost-25-years-later/" title="">blinking lights</a> has carried forward 25 years later to Half-Life: Alyx in VR. If it ain’t broke, don’t fix it.</p>
</div>
<p>Unlike DOOM, Quake was fully 3D on unmodified consumer hardware. Players could look up and down (something previously thought impossible without accelerator hardware!) and designers could make levels with that in mind. Quake also allowed much more complex geometry and textures. It was a huge leap forward in 3D gaming and it was only possible because of the massive leap in CPU power at the time. The Pentium family of processors was such a huge leap that it allowed id Software to bust through and do it in “real time”. Quake has since set the standard for multiplayer deathmatch games, and its source code has lineage to Call of Duty, Half-Life, Half-Life 2, DotA 2, Titanfall, and Apex Legends.</p>
<p>However, the thing that really made 3D accelerator cards leap into the public spotlight was another little-known studio called Core Design and their 1996 release of Tomb Raider. It was built from the ground up to require the use of 3D accelerator cards. The cards flew off the shelves.</p>
<p>“3D accelerator cards” would later become known as “Graphics Processing Units” or GPUs because of how synonymous they became with 3D gaming, engineering tasks such as Computer-Aided Drafting (CAD), and even the entire OS environment with compositors like <a href='https://en.wikipedia.org/wiki/Desktop_Window_Manager' title=''>DWM</a> on Windows Vista, <a href='https://en.wikipedia.org/wiki/Compiz' title=''>Compiz</a> on GNU+Linux, and <a href='https://en.wikipedia.org/wiki/Quartz_(graphics_layer)' title=''>Quartz</a> on macOS. Things became so much easier for everyone when 2D and 3D graphics were integrated into the same device so you didn’t need to chain your output through your 3D accelerator card!</p>
<h2 id='the-gpu-as-we-know-it' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-gpu-as-we-know-it' aria-label='Anchor'></a>The GPU as we know it</h2>
<p>When GPUs first came out, they were very simple devices. They had a few basic components:</p>
<ul>
<li>A framebuffer to store the current state of the screen
</li><li>A command processor to take instructions from the game and translate them into something the hardware can understand
</li><li>Memory to store temporary data
</li><li>Shader processing hardware to allow designers to change how light and textures were rendered
</li><li>A display output that was chained through an existing VGA card so that the user could see what was going on in real time (yes, this is something we actually did)
</li></ul>
<p>This basic architecture has remained the same for the past 20 years or so. The main differences are that as technology advanced, the capabilities of those cards increased. They got faster, more parallel, more capable, had more memory, were made cheaper, and so on. This gradually allowed for more and more complex games like Half-Life 2, Crysis, The Legend of Zelda: Breath of the Wild, Baldur’s Gate 3, and so on.</p>
<p>Over time, as more and more hardware was added, GPUs became computers in their own right (sometimes even bigger than the rest of the computer thanks to the need to cool things more aggressively). This new hardware includes:</p>
<ul>
<li>Video encoding hardware via NVENC and AMD VCE so that content creators can stream and record their gameplay in higher quality without having to impact the performance of the game
</li><li><aside class="left-sidenote">Seriously, once you experience high framerate HDR raytraced Tetris you can’t really go back to the old way.</aside> Raytracing accelerator cores via RTX so that light can be rendered more realistically
</li><li>AI/ML cores to allow for dynamic upscaling to eke out more performance from the card
</li><li>Display output hardware to allow for multiple monitors to be connected to the card
</li><li>Faster and faster memory buses and interfaces to the rest of the system to allow for more data to be processed faster
</li><li>Direct streaming from the drive to GPU memory to allow for faster loading times
</li></ul>
<p>But, at the same time, that AI/ML hardware started to get noticed by more and more people. It was discovered that the shader cores and then the CUDA cores could be used to do AI/ML workloads at ludicrous speeds. This enabled research and development of models like GPT-2, Stable Diffusion, DLSS, and so on. This has led to a Cambrian Explosion of AI/ML research and development that is continuing to this day.</p>
<h2 id='the-quot-gpus-quot-that-fly-io-is-using' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-quot-gpus-quot-that-fly-io-is-using' aria-label='Anchor'></a>The “GPUs” that Fly.io is using</h2>
<p>I’ve mostly been describing consumer GPUs and their capabilities up to this point because that’s what we all have the biggest understanding of. There is a huge difference between the “GPUs” that you get for server tasks and the consumer cards you can buy from a place like Newegg or Best Buy. The main difference is that enterprise-grade Graphics Processing Units do not have any of the hardware needed to process graphics.</p>
<div class="right-sidenote"><p>Author’s note: This will not be the case in the future. Fly.io is going to add <a href="https://www.nvidia.com/en-us/data-center/l40s/" title="">Lovelace L40S GPUs</a> that do have 3D rendering, video encoding, shader cores, and so on. But, that’s not what we’re talking about today.</p>
</div>
<p>Yes. Really. They don’t have rasterization hardware, shader cores, display outputs, or anything useful for trying to run games on them. They are AI/ML accelerator cards more than anything. It’s kinda beautifully ironic that they’re called Graphics Processing Units when they have no ability to process graphics.</p>
<h2 id='what-can-you-do-with-them' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-can-you-do-with-them' aria-label='Anchor'></a>What can you do with them?</h2>
<p>These GPUs are really good at massively parallel tasks. This naturally translates to being very good at AI/ML tasks such as:</p>
<ul>
<li>Summarization (what is this article about in a few sentences?)
</li><li>Translation (what does this article say in Spanish?)
</li><li>Speech recognition (what is a voice clip saying?)
</li><li>Speech synthesis (what does this text sound like?)
</li><li>Text generation (what would a cat say if it could talk?)
</li><li>Basic rote question answering (what is the safe cooking temperature for chicken breasts in Celsius?)
</li><li>Text classification (is this article about cats or dogs?)
</li><li>Sentiment analysis (is this article positive or negative, what could that mean about the companies involved?)
</li><li>Image classification (is this a cat or a dog?)
</li><li>Object detection (where are the cats and dogs in this image?)
</li></ul>
<p>Or any combination/chain of these tasks. A lot of this is pretty abstract building blocks that can be combined in a lot of different ways. This is why AI/ML stuff is so exciting right now. We’re in the early days of understanding what these things are, what they can do, and how to use them properly.</p>
<p>Imagine being able to load articles about the topic you are researching into your queries to find where someone said something roughly similar to what you’re looking for. Queries like “that one recipe with eggs that you fold over with ham in it”. That’s the kind of thing that’s possible with AI/ML (and tools like vector databases) but difficult to impossible with traditional search engines.</p>
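<p>To make the “roughly similar” part concrete, here is a minimal sketch of the ranking step behind that kind of search. The three-dimensional vectors are made up for illustration; a real setup would get its embeddings from a model and would likely keep them in a vector database rather than a dictionary.</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, documents):
    """Return document names sorted from most to least similar to the query."""
    return sorted(
        documents,
        key=lambda name: cosine_similarity(query_vec, documents[name]),
        reverse=True,
    )


# Made-up 3-dimensional "embeddings" (axes: eggs, ham, dessert).
# A real embedding model would produce hundreds of dimensions.
DOCUMENTS = {
    "ham and cheese omelette": [0.9, 0.8, 0.0],
    "chocolate cake": [0.3, 0.0, 0.9],
    "ham sandwich": [0.0, 0.9, 0.1],
}

# Hypothetical embedding of the query
# "that one recipe with eggs that you fold over with ham in it".
QUERY = [0.8, 0.7, 0.0]

ranked = nearest(QUERY, DOCUMENTS)
print(ranked[0])
```

<p>The query never mentions the word “omelette”, but its vector points in nearly the same direction as the omelette recipe, so that recipe ranks first.</p>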
<h2 id='how-to-use-ai-for-reals' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-to-use-ai-for-reals' aria-label='Anchor'></a>How to use AI for reals</h2>
<p>Fortunately and unfortunately, we’re in the Cambrian Explosion days of this industry. Key advances happen constantly. Exact models and tooling change almost as often. This is both a very good thing and a very bad thing.</p>
<p>If you want to get started today, here are a few models that you can play with right now:</p>
<ul>
<li><a href='https://ai.meta.com/llama/' title=''>Llama 2</a> - A generic foundation model with instruction and chat tuned variants. It’s a good starting point for a lot of research and nearly everything else uses the same formats that Llama 2 does.
</li><li><a href='https://openai.com/research/whisper' title=''>Whisper</a> - A speech to text model that transcribes audio files into text better than most professional dictation software. I, the author of this article, wrote most of this article using Whisper.
</li><li><a href='https://huggingface.co/NurtureAI/OpenHermes-2.5-Mistral-7B-16k' title=''>OpenHermes-2.5 Mistral 7B 16k</a> - An instruction-tuned model that can operate on up to 16 thousand tokens (about 40 printed pages of text, 12,000 words) at once. It’s a good starting point for summarization and other tasks that require a lot of context. I use it for my personal AI chatbot named <a href='https://xeiaso.net/characters/#Mimi' title=''>Mimi</a>.
</li><li><aside class="right-sidenote">Seriously Annie, you’re great!</aside> <a href='https://stability.ai/stable-diffusion' title=''>Stable Diffusion XL</a> - A text-to-image model that lets you create high quality images from simple text descriptions. It’s a good starting point for tasks that require image generation, such as when you want to add images to your blog posts but don’t have an artist like Annie to draw you what you want.
</li></ul>
<p>For a practical example, imagine that you have a set of <a href='https://xeiaso.net/talks/' title=''>conference talks that you’ve given over the years</a>. You want to take those talk videos, extract the audio, and transform them into written text because some people learn better from text than video. The overall workflow would look something like this:</p>
<ul>
<li>Use ffmpeg to extract the audio track from the video files
</li><li>Use Whisper to <a href='https://fly.io/blog/transcribing-on-fly-gpu-machines/' title=''>convert the audio files into subtitle files</a>
</li><li>Break the subtitle file into sequences based on significant pauses between topics (humans do this subconsciously, take advantage of it and you can make things seem heckin’ magic)
</li><li>Use a large language model to summarize the segments and create a title for each segment
</li><li>Paste the rest of the text into a markdown document between the segment titles
</li><li>Manually review the documents and make any necessary changes with technical terms that the model didn’t know about or things the model got wrong because English is a minefield of homophones that even trained experts have trouble with (ask me how I know)
</li><li>Publish the documents on your blog
</li></ul>
<p>Then bam, you don’t just have a portfolio piece, you have the recipe for winning downtime from visitors of orange websites clicking on your link so much. You can also use this to create transcripts for your videos so that people who can’t hear can still enjoy your content.</p>
<p>The true advantage of these is not using them as individual parts on their own, but as a cohesive whole in a chain. This is where the real power of AI/ML comes from. It’s not the individual models, but the ability to chain them together to do something useful. This is where the true opportunities for innovation lie.</p>
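<p>The “break the subtitle file into sequences” step in the workflow above is the least obvious link in the chain, so here is a rough sketch of it. It assumes the subtitle file has already been parsed into <code>(start, end, text)</code> tuples with times in seconds; the function name and the two-second threshold are illustrative guesses, not something Whisper gives you.</p>

```python
def split_on_pauses(cues, min_pause=2.0):
    """Group (start, end, text) cues into segments.

    A new segment starts whenever the silence between the end of one cue
    and the start of the next is at least min_pause seconds.
    """
    segments = []
    current = []
    previous_end = None
    for start, end, text in cues:
        if previous_end is not None and start - previous_end >= min_pause:
            segments.append(current)
            current = []
        current.append(text)
        previous_end = end
    if current:
        segments.append(current)
    return [" ".join(segment) for segment in segments]


# Hypothetical cues, as if parsed from a subtitle file.
CUES = [
    (0.0, 2.5, "Hello and welcome."),
    (2.7, 5.0, "Today we talk about GPUs."),
    (8.5, 11.0, "First, some history."),
    (11.2, 14.0, "Computers used to be simple."),
]

SEGMENTS = split_on_pauses(CUES)
for segment in SEGMENTS:
    print(segment)
```

<p>Each resulting segment is what you would hand to the language model for a summary and a title. You will probably need to tune the threshold per speaker.</p>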
<h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>So that’s what these “GPUs” are really: they’re AI/ML accelerator cards. The A100 cards are incapable of processing graphics or encoding video, but they’re really, really good at AI/ML workloads. They allow you to do way more tasks per watt than any CPU ever could.</p>
<p>I hope you enjoyed this tale of woe as I spilled out the horrible truths about marketing being awful forever and gave you ideas for how to <em>actually use</em> these graphics-free Graphics Processing Units to do useful things. But sadly, not for processing graphics unless you wait for the <a href='https://www.nvidia.com/en-us/data-center/l40s/' title=''>Lovelace L40S</a> cards early in 2024.</p>
<p>Sign up for Fly.io today and try our GPUs! I can’t wait to see what you build with them.</p>
/blog/what-are-these-gpus-really/assets/gpu-songstress-thumb.webpScaling Large Language Models to zero with Ollamahttps://fly.io/blog/scaling-llm-ollama/2023-12-06T12:00:00+00:002024-02-29T01:36:09+00:00<div class="lead"><p>We’re Fly.io. We have powerful servers worldwide to run your code close to your users. Including GPUs so you can self host your own AI.</p>
</div>
<p>Open-source self-hosted AI tools have advanced a lot in the past 6 months. They give you new methods of expression (with QR code generation and Stable Diffusion), easy access to summarization powers that would have made Google blush a decade ago (even with untuned foundation models such as LLaMa 2 and Yi), conversational assistants that help people do more with their time, and speech recognition in <em>real time</em> on moderate hardware (with Whisper et al). With all these capabilities comes the need for more and more raw computational muscle to be able to do inference on bigger and bigger models, and eventually do things that we can’t even imagine right now. Fly.io lets you put your compute where your users are so that you can do machine learning inference tasks on the edge with the power of enterprise-grade GPUs such as the Nvidia A100. You can also scale your GPU nodes to zero running Machines, so you only pay for what you actually need, when you need it.</p>
<div class="right-sidenote"><p>It’s worth mentioning that “scaling to zero” doesn’t mean what you may think it means. When you “scale to zero” in Fly.io, you actually stop the running Machine. This means the Machine is still lying around on the same computer box that it runs on, but it’s just put to sleep. If there is a capacity issue then your app may be unable to wake back up. We are working on a solution to this, but for now you should be aware that scaling to zero is not the same as spinning down your Machine and spinning it back up again on a new computer box when you need it.</p>
</div><div class="callout"><p>This is a continuation of the last post in this series about <a href="https://fly.io/blog/transcribing-on-fly-gpu-machines/" title="">how to use GPUs on Fly.io</a>.</p>
</div><h2 id='why-scale-to-zero' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#why-scale-to-zero' aria-label='Anchor'></a>Why scale to zero?</h2>
<p>Running GPU nodes on top of Fly is expensive. Sure, GPUs enable you to do things a lot faster than CPUs ever could on their own, but they will mostly sit idle between uses. This is where scaling to zero comes in. With scaling to zero, you can have your GPU nodes shut down when you’re not using them. When your Machine stops, you aren’t paying for the GPU any more. This is good for the environment and your wallet.</p>
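<p>For a rough sense of scale, here is a back-of-the-envelope sketch. The hourly rate and the number of busy hours are assumptions picked for illustration, and the sketch only counts GPU time, so treat the numbers as a shape rather than a quote.</p>

```python
HOURS_PER_MONTH = 730  # average month: 365 days * 24 hours / 12


def monthly_gpu_cost(hourly_rate, busy_hours):
    """GPU cost for a month in which the Machine runs for busy_hours."""
    return hourly_rate * busy_hours


# Assumed numbers: a $2.50/hr GPU that is actually needed 40 hours a month.
ASSUMED_RATE = 2.50
always_on = monthly_gpu_cost(ASSUMED_RATE, HOURS_PER_MONTH)
scaled_to_zero = monthly_gpu_cost(ASSUMED_RATE, 40)

print(f"always on:      ${always_on:.2f}")
print(f"scaled to zero: ${scaled_to_zero:.2f}")
```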
<p>In this post, we’re going to be using <a href='https://ollama.ai' title=''>Ollama</a> to generate text. Ollama is a fancy wrapper around <a href='https://github.com/ggerganov/llama.cpp' title=''>llama.cpp</a> that allows you to run large language models on your own hardware with your choice of model. It also supports GPU acceleration, meaning that you can use Fly.io’s huge GPUs to run your models faster than your RTX 3060 at home ever would on its own.</p>
<p>One of the main downsides of using Ollama in a cloud environment is that it doesn’t have authentication by default. Thanks to the power of about 70 lines of Go, we are able to shim that in after the fact. This will stop random people on the internet from using your GPU time (and spending your money) to generate text, while still letting you integrate it into your own applications.</p>
<p>Create a new folder called <code>ollama-scale-to-0</code>:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-mk5o8iaj"><span class="nb">mkdir </span>ollama-scale-to-0
</code></pre>
</div>
</div><h2 id='fly-app-setup' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#fly-app-setup' aria-label='Anchor'></a>Fly app setup</h2>
<p>First, we need to create a new Fly app:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-f99kj6i">fly launch <span class="nt">--no-deploy</span>
</code></pre>
</div>
</div>
<p>After selecting a name and an organization to run it in, this command will create the app and write out a <code>fly.toml</code> file for you:</p>
<div class="highlight-wrapper group relative toml">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-ax5wdl7d"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ax5wdl7d"><span class="c"># fly.toml app configuration file generated for sparkling-violet-709 on 2023-11-14T12:13:53-05:00</span>
<span class="c">#</span>
<span class="c"># See https://fly.io/docs/reference/configuration/ for information about how to use this file.</span>
<span class="c">#</span>
<span class="py">app</span> <span class="p">=</span> <span class="s">"sparkling-violet-709"</span>
<span class="py">primary_region</span> <span class="p">=</span> <span class="s">"ord"</span>
<span class="nn">[http_service]</span>
<span class="py">internal_port</span> <span class="p">=</span> <span class="mi">11434</span> <span class="c"># change me to 11434!</span>
<span class="py">force_https</span> <span class="p">=</span> <span class="kc">false</span> <span class="c"># change me to false!</span>
<span class="py">auto_stop_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">auto_start_machines</span> <span class="p">=</span> <span class="kc">true</span>
<span class="py">min_machines_running</span> <span class="p">=</span> <span class="mi">0</span>
<span class="py">processes</span> <span class="p">=</span> <span class="nn">["app"]</span>
</code></pre>
</div>
</div>
<p>This is the configuration file that Fly.io uses to know how to run your application. We’re going to be modifying the <code>fly.toml</code> file to add some additional configuration to it, such as enabling GPU support:</p>
<div class="highlight-wrapper group relative toml">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-o22k19kg"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-o22k19kg"><span class="py">app</span> <span class="p">=</span> <span class="s">"sparkling-violet-709"</span>
<span class="py">primary_region</span> <span class="p">=</span> <span class="s">"ord"</span>
<span class="py">vm.size</span> <span class="p">=</span> <span class="s">"a100-40gb"</span> <span class="c"># the GPU size, see https://fly.io/docs/gpus/gpu-quickstart/ for more info</span>
</code></pre>
</div>
</div>
<p>We don’t want to expose the GPU to the internet, so we’re going to create a <a href='https://fly.io/docs/reference/private-networking/#flycast-private-load-balancing' title=''>flycast</a> address to expose it to other services on your private network. To create a flycast address, run this command:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-lptchb11"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-lptchb11">fly ips allocate-v6 <span class="nt">--private</span>
</code></pre>
</div>
</div>
<p>The <code>fly ips allocate-v6</code> command allocates a unique address in your private network that you can use to access Ollama from your other services. Make sure to add the <code>--private</code> flag, otherwise you’ll get a public IP address instead of a private one.</p>
<p>Next, you may need to remove any public IP addresses from the app to lock it away from the public internet. Get a list of them with <code>fly ips list</code> and then remove them with <code>fly ips release &lt;ip&gt;</code>. Delete everything but your flycast IP.</p>
<p>Next, we need to declare the volume for Ollama to store models in. Without one, your downloaded models are destroyed whenever the app scales to zero, and you have to re-download them every time the server starts. A persistent volume avoids that. Add the following to your <code>fly.toml</code>, which also tells Fly.io to run the <code>ollama/ollama</code> image:</p>
<div class="highlight-wrapper group relative toml">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-td1pkm73"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-td1pkm73"><span class="nn">[build]</span>
<span class="py">image</span> <span class="p">=</span> <span class="s">"ollama/ollama"</span>
<span class="nn">[mounts]</span>
<span class="py">source</span> <span class="p">=</span> <span class="s">"models"</span>
<span class="py">destination</span> <span class="p">=</span> <span class="s">"/root/.ollama"</span>
<span class="py">initial_size</span> <span class="p">=</span> <span class="s">"100gb"</span>
</code></pre>
</div>
</div>
<p>This will create a 100GB volume in the <a href='https://en.wikipedia.org/wiki/O%27Hare_International_Airport' title=''><code>ord</code></a> region when the app is deployed. This will be used to store the models that you download from the <a href='https://ollama.ai/library/' title=''>Ollama library</a>. You can make this smaller if you want, but 100GB is a good place to start from.</p>
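<p>To sanity-check the volume size, a rough rule of thumb is that a model’s weights take about (parameter count × bytes per parameter) on disk. The sketch below is a back-of-the-envelope estimate only: both function names are made up for this example, and real model files add some metadata overhead and are often quantized to fewer bits, so treat the numbers as approximations.</p>

```typescript
// Rough estimate of on-disk weight size; ignores tokenizer and metadata overhead.
// Both helpers are hypothetical, written for this example only.
function estimateWeightsGB(paramsBillions: number, bitsPerParam: number): number {
  const bytes = paramsBillions * 1e9 * (bitsPerParam / 8);
  return bytes / 1e9; // decimal gigabytes
}

// How many models of a given size fit on a volume of a given size.
function modelsThatFit(volumeGB: number, modelGB: number): number {
  return Math.floor(volumeGB / modelGB);
}

const fp16SevenB = estimateWeightsGB(7, 16); // 7B parameters at 16 bits each
console.log(`7B fp16 model: ~${fp16SevenB} GB`);
console.log(`Models of that size that fit in 100 GB: ${modelsThatFit(100, fp16SevenB)}`);
```

<p>By this estimate, a 7B-parameter model stored at 16 bits per weight needs roughly 14 GB, so a 100GB volume leaves room for several models of that size.</p>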
<p>Now that everything is set up, we can deploy this to Fly.io:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-1nrhwe1f"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-1nrhwe1f">fly deploy
</code></pre>
</div>
</div>
<p>This will take a minute to pull the Ollama image, push it to a Machine, provision your volume, and kick everything else off with hypervisors, GPUs and whatnot. Once it’s done, you should see something like this:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-bfk3c3ca"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-bfk3c3ca"> ✔ Machine 17816141f55489 <span class="o">[</span>app] update succeeded
<span class="nt">-------</span>
Visit your newly deployed app at https://sparkling-violet-709.fly.dev/
</code></pre>
</div>
</div>
<p>This is a lie because we just deleted the public IP addresses for this app. You can’t access it from the internet, and by extension, random people can’t access it either. For now, you can run an interactive session with Ollama using an ephemeral Fly Machine:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-3acg9o4h"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3acg9o4h">fly m run <span class="nt">-e</span> <span class="nv">OLLAMA_HOST</span><span class="o">=</span>http://sparkling-violet-709.flycast <span class="nt">--shell</span> ollama/ollama
</code></pre>
</div>
</div>
<p>And then you can pull a model from the <a href='https://ollama.ai/library/' title=''>Ollama library</a> and generate some text:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-nu882eo0"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-nu882eo0"><span class="nv">$ </span>ollama run openchat:7b-v3.5-fp16
<span class="o">>>></span> How <span class="k">do </span>I bake chocolate chip cookies?
To bake chocolate chip cookies, follow these steps:
1. Preheat the oven to 375°F <span class="o">(</span>190°C<span class="o">)</span> and line a baking sheet with parchment paper or silicone baking mat.
2. In a large bowl, mix together 1 cup of unsalted butter <span class="o">(</span>softened<span class="o">)</span>, 3/4 cup granulated sugar, and 3/4
cup packed brown sugar <span class="k">until </span>light and fluffy.
3. Add 2 large eggs, one at a <span class="nb">time</span>, to the butter mixture, beating well after each addition. Stir <span class="k">in </span>1
teaspoon of pure vanilla extract.
4. In a separate bowl, whisk together 2 cups all-purpose flour, 1/2 teaspoon baking soda, and 1/2 teaspoon
salt. Gradually add the dry ingredients to the wet ingredients, stirring <span class="k">until </span>just combined.
5. Fold <span class="k">in </span>2 cups of chocolate chips <span class="o">(</span>or chunks<span class="o">)</span> into the dough.
6. Drop rounded tablespoons of dough onto the prepared baking sheet, spacing them about 2 inches apart.
7. Bake <span class="k">for </span>10-12 minutes, or <span class="k">until </span>the edges are golden brown. The centers should still be slightly soft.
8. Allow the cookies to cool on the baking sheet <span class="k">for </span>a few minutes before transferring them to a wire rack
to cool completely.
Enjoy your homemade chocolate chip cookies!
</code></pre>
</div>
</div>
<p>If you want a persistent wake-on-use connection to your Ollama instance, you can set up a <a href='https://fly.io/docs/reference/private-networking/#discovering-apps-through-dns-on-a-wireguard-connection' title=''>connection to your Fly network using WireGuard</a>. This will let you use Ollama from your local applications without having to run them on Fly. For example, if you want to figure out the safe cooking temperature for ground beef in Celsius, you can query that in JavaScript with this snippet of code:</p>
<div class="highlight-wrapper group relative typescript">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-yz8tgha4"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-yz8tgha4"><span class="kd">const</span> <span class="nx">generateRequest</span> <span class="o">=</span> <span class="p">{</span>
<span class="na">model</span><span class="p">:</span> <span class="dl">"</span><span class="s2">openchat:7b-v3.5-fp16</span><span class="dl">"</span><span class="p">,</span>
<span class="na">prompt</span><span class="p">:</span> <span class="dl">"</span><span class="s2">What is the safe cooking temperature for ground beef in celsius?</span><span class="dl">"</span><span class="p">,</span>
<span class="na">stream</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span> <span class="c1">// <- important for Node/Deno clients</span>
<span class="p">};</span>
<span class="kd">let</span> <span class="nx">resp</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">"</span><span class="s2">http://sparkling-violet-709.flycast/api/generate</span><span class="dl">"</span><span class="p">,</span> <span class="p">{</span>
<span class="na">method</span><span class="p">:</span> <span class="dl">"</span><span class="s2">POST</span><span class="dl">"</span><span class="p">,</span>
<span class="na">body</span><span class="p">:</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">(</span><span class="nx">generateRequest</span><span class="p">),</span>
<span class="p">});</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">resp</span><span class="p">.</span><span class="nx">status</span> <span class="o">!==</span> <span class="mi">200</span><span class="p">)</span> <span class="p">{</span>
<span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span><span class="p">(</span><span class="s2">`error fetching response: </span><span class="p">${</span><span class="nx">resp</span><span class="p">.</span><span class="nx">status</span><span class="p">}</span><span class="s2">: </span><span class="p">${</span><span class="k">await</span> <span class="nx">resp</span><span class="p">.</span><span class="nx">text</span><span class="p">()}</span><span class="s2">`</span><span class="p">);</span>
<span class="p">}</span>
<span class="nx">resp</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">resp</span><span class="p">.</span><span class="nx">json</span><span class="p">();</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">resp</span><span class="p">.</span><span class="nx">response</span><span class="p">);</span> <span class="c1">// Something like "The safe cooking temperature for ground beef is 71 degrees celsius (160 degrees fahrenheit)."</span>
</code></pre>
</div>
</div><h2 id='scaling-to-zero' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#scaling-to-zero' aria-label='Anchor'></a>Scaling to zero</h2>
<p>The best part about all of this is that scaling down to zero running Machines takes no work on your part: the Machine shuts down automatically when it’s idle. Wait a few minutes and then verify it with <code>fly status</code>:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-qbivja7d"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-qbivja7d"><span class="nv">$ </span>fly status
...
PROCESS ID VERSION REGION STATE ROLE CHECKS LAST UPDATED
app 3d8d7949b22089 9 ord stopped 2023-11-14T19:34:24Z
</code></pre>
</div>
</div>
<p>The app has been stopped. This means that it’s not running and you’re not paying for it. When you want it to start up again, just make a request. It will automatically start up and you can use it as normal with the CLI or even just arbitrary calls to <a href='https://github.com/jmorganca/ollama/blob/main/docs/api.md' title=''>the API</a>.</p>
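<p>Because a stopped Machine has to boot before it can answer, the first request after a stop may take noticeably longer than usual, and an impatient client could give up before it completes. One way to cope is to retry with a growing delay. This is a minimal sketch, not something Fly.io or Ollama requires: the helper name, attempt count, and delays are all assumptions chosen for illustration.</p>

```typescript
// Hypothetical helper: retry an async operation with exponential backoff.
// The attempt count and delays are illustrative, not tuned values.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // With the defaults this waits 500ms, 1s, 2s, 4s between attempts.
        const delayMs = baseDelayMs * Math.pow(2, attempt);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed; surface the most recent error to the caller.
  throw lastError;
}
```

<p>You could wrap any call to the app in it, for example <code>retryWithBackoff(() =&gt; fetch("http://sparkling-violet-709.flycast/api/tags"))</code>. Note that <code>fetch</code> only rejects on network-level failures, so check the response status inside the callback if you also want to retry on error responses.</p>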
<p>You can also upload your own models to the Ollama registry by <a href='https://github.com/jmorganca/ollama/blob/main/docs/import.md' title=''>creating your own Modelfile</a> and pushing it (though you will need to install Ollama locally to publish your own models). At this time, the only way to set a custom system prompt is to use a Modelfile and upload your model to the registry.</p>
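<p>For reference, a Modelfile with a custom system prompt can be very short. The sketch below renders one as a string: <code>FROM</code> and <code>SYSTEM</code> are Modelfile instructions, while the function name and the prompt text are placeholders invented for this example.</p>

```typescript
// Hypothetical helper that renders a minimal Modelfile.
// FROM picks the base model; SYSTEM sets the system prompt.
function buildModelfile(baseModel: string, systemPrompt: string): string {
  // The prompt is wrapped in triple quotes, so it cannot contain them itself.
  if (systemPrompt.indexOf('"""') !== -1) {
    throw new Error('system prompt must not contain """');
  }
  return [`FROM ${baseModel}`, 'SYSTEM """', systemPrompt, '"""', ""].join("\n");
}

console.log(
  buildModelfile(
    "openchat:7b-v3.5-fp16",
    "You are a cooking assistant. Always give temperatures in Celsius.",
  ),
);
```

<p>Save the output as <code>Modelfile</code> on a machine with Ollama installed, then build it with <code>ollama create</code> before pushing it to the registry.</p>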
<h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>Ollama is a fantastic way to run large language models of your choice, and the ability to use Fly.io’s powerful GPUs means you can use bigger models with more parameters and a larger context window. This makes your assistants more lifelike, gives your conversations more context, and makes your text generation more realistic.</p>
<p>Oh, by the way, this also lets you use the new <code>json</code> mode to have your models call functions, similar to how ChatGPT would. To do this, have a system prompt that looks like this:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-xsauyx7u">You are a helpful research assistant. The following functions are available for you to fetch further data to answer user questions, if relevant:
{
  "function": "search_bing",
  "description": "Search the web for content on Bing. This allows users to search online/the internet/the web for content.",
  "arguments": [
    {
      "name": "query",
      "type": "string",
      "description": "The search query string"
    }
  ]
}
{
  "function": "search_arxiv",
  "description": "Search for research papers on ArXiv. Make use of AND, OR and NOT operators as appropriate to join terms within the query.",
  "arguments": [
    {
      "name": "query",
      "type": "string",
      "description": "The search query string"
    }
  ]
}
To call a function, respond - immediately and only - with a JSON object of the following format:
{
  "function": "function_name",
  "arguments": {
    "argument1": "argument_value",
    "argument2": "argument_value"
  }
}
If no function needs to be called, respond with an empty JSON object: {}
</code></pre>
</div>
</div>
<p>Then you can use the <a href='https://github.com/jmorganca/ollama/blob/main/docs/api.md#request-json-mode' title=''>JSON format</a> to receive a JSON response from Ollama (hint: <code>--format=json</code> in the CLI or <code>format: "json"</code> in the API). This is a great way to make your assistants more lifelike and more useful. You will need to use something like <a href='https://www.langchain.com/' title=''>LangChain</a> or manual iterations to properly handle the cases where the user doesn’t want to call a function, but that’s a topic for another blog post.</p>
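<p>As a rough sketch of that manual handling, the reply can be pattern matched once it has been decoded into a map. This is not part of Ollama or any library: the <code>FunctionRouter</code> module and the tuples it returns are made up for illustration, and decoding the JSON text is assumed to happen elsewhere.</p>

```elixir
defmodule FunctionRouter do
  # Hypothetical router for replies that follow the prompt format above.
  # Input is the already-decoded JSON object (a map with string keys).

  # Empty object: the model decided no function call was needed.
  def route(reply) when reply == %{}, do: :no_call

  # Known functions from the system prompt, each taking a "query" argument.
  def route(%{"function" => name, "arguments" => %{"query" => query}})
      when name in ["search_bing", "search_arxiv"] and is_binary(query) do
    {:call, name, query}
  end

  # Anything else (unknown function, missing arguments) is treated as invalid.
  def route(_other), do: {:error, :invalid_call}
end

IO.inspect(FunctionRouter.route(%{}))

IO.inspect(
  FunctionRouter.route(%{
    "function" => "search_arxiv",
    "arguments" => %{"query" => "llama AND quantization"}
  })
)
```

<p>Treating unknown shapes as an explicit error, rather than crashing, leaves room to re-prompt the model when it strays from the format.</p>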
<p>For the best results you may want to use a model with a larger context window such as <a href='https://ollama.ai/library/vicuna:13b-v1.5-16k-fp16' title=''>vicuna:13b-v1.5-16k-fp16</a> (16k == 16,384 token window), as JSON is very token-expensive. Future advances in the next few weeks (such as the Yi models gaining ludicrous token windows on the order of 200,000 tokens at the cost of ludicrous amounts of VRAM usage) will make this less of an issue. You can also get away with minifying the JSON in the functions and examples a lot, but you may need to experiment to get the best results.</p>
<p>Happy hacking, y'all.</p>
<p>Rethinking Serverless with FLAME<br />
<a href='https://fly.io/blog/rethinking-serverless-with-flame/' title=''>https://fly.io/blog/rethinking-serverless-with-flame/</a><br />
Published 2023-12-06T00:00:00+00:00, updated 2023-12-13T15:30:20+00:00</p>
<blockquote>Imagine if you could auto scale simply by wrapping any existing app code in a function and having that block of code run in a temporary copy of your app.</blockquote>
<p>The pursuit of elastic, auto-scaling applications has taken us to silly places.</p>
<p>Serverless/FaaS had a couple things going for it. Elastic Scale™ is hard. It’s even harder when you need to manage those pesky servers. It also promised pay-what-you-use costs to avoid idle usage. Good stuff, right?</p>
<p>Well, the charade is over. You offload the complexities of scaling, just to end up needing <em>more complexity</em>. Additional queues, storage, and glue code to communicate back to our app are just the starting point. Dev, test, and CI complexity balloons as fast as your costs. Oh, and you often have to rewrite your app in proprietary JavaScript – even if it’s already written in JavaScript!</p>
<p>At the same time, the rest of us have elastically scaled by starting more webservers. Or we’ve dumped on complexity with microservices. This doesn’t make sense. Piling on more webservers to transcode more videos or serve up more ML tasks isn’t what we want. And granular scale shouldn’t require slicing our apps into bespoke operational units with their own APIs and deployments to manage.</p>
<p>Enough is enough. There’s a better way to elastically scale applications.</p>
<h2 id='the-flame-pattern' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-flame-pattern' aria-label='Anchor'></a>The FLAME pattern</h2>
<p>Here’s what we really want:</p>
<ul>
<li>We don’t want to manage those pesky servers. We already have this for our app deployments via <code>fly deploy</code>, <code>git push heroku</code>, <code>kubectl</code>, etc.
</li><li>We want on-demand, <em>granular</em> elastic scale of specific parts of our app code
</li><li>We don’t want to rewrite our application or write parts of it in proprietary runtimes
</li></ul>
<p>Imagine if we could auto scale simply by wrapping any existing app code in a function and having that block of code run in a temporary copy of the app.</p>
<p>Enter the FLAME pattern.</p>
<blockquote>FLAME - Fleeting Lambda Application for Modular Execution</blockquote>
<p>With FLAME, you treat your <em>entire application</em> as a lambda, where modular parts can be executed on short-lived infrastructure.</p>
<p>No rewrites. No bespoke runtimes. No outrageous layers of complexity. Need to insert the results of an expensive operation into the database? PubSub broadcast the result of some expensive work? No problem! It’s your whole app, so of course you can do it.</p>
<p>The Elixir <a href='https://github.com/phoenixframework/flame' title=''>flame library</a> implements the FLAME pattern. It has a backend adapter for Fly.io, but you can use it on any cloud that gives you an API to spin up an instance with your app code running on it. We’ll talk more about backends in a bit, as well as implementing FLAME in other languages.</p>
<p>First, let’s watch a realtime thumbnail generation example to see FLAME + Elixir in action:</p>
<div class="youtube-container" data-exclude-render>
<div class="youtube-video">
<iframe
width="100%"
height="100%"
src="https://www.youtube.com/embed/l1xt_rkWdic"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen>
</iframe>
</div>
</div>
<p>Now let’s walk through something a little more basic. Imagine we have a function in our Elixir application that transcodes videos to thumbnails after they are uploaded:</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-n4v3enzr"><span class="k">def</span> <span class="n">generate_thumbnails</span><span class="p">(%</span><span class="no">Video</span><span class="p">{}</span> <span class="o">=</span> <span class="n">vid</span><span class="p">,</span> <span class="n">interval</span><span class="p">)</span> <span class="k">do</span>
  <span class="n">tmp</span> <span class="o">=</span> <span class="no">Path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="no">System</span><span class="o">.</span><span class="n">tmp_dir!</span><span class="p">(),</span> <span class="no">Ecto</span><span class="o">.</span><span class="no">UUID</span><span class="o">.</span><span class="n">generate</span><span class="p">())</span>
  <span class="no">File</span><span class="o">.</span><span class="n">mkdir!</span><span class="p">(</span><span class="n">tmp</span><span class="p">)</span>
  <span class="n">args</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"-i"</span><span class="p">,</span> <span class="n">vid</span><span class="o">.</span><span class="n">url</span><span class="p">,</span> <span class="s2">"-vf"</span><span class="p">,</span> <span class="s2">"fps=1/</span><span class="si">#{</span><span class="n">interval</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="s2">"</span><span class="si">#{</span><span class="n">tmp</span><span class="si">}</span><span class="s2">/%02d.png"</span><span class="p">]</span>
  <span class="no">System</span><span class="o">.</span><span class="n">cmd</span><span class="p">(</span><span class="s2">"ffmpeg"</span><span class="p">,</span> <span class="n">args</span><span class="p">)</span>
  <span class="n">urls</span> <span class="o">=</span> <span class="no">VidStore</span><span class="o">.</span><span class="n">put_thumbnails</span><span class="p">(</span><span class="n">vid</span><span class="p">,</span> <span class="no">Path</span><span class="o">.</span><span class="n">wildcard</span><span class="p">(</span><span class="n">tmp</span> <span class="o"><></span> <span class="s2">"/*.png"</span><span class="p">))</span>
  <span class="no">Repo</span><span class="o">.</span><span class="n">insert_all</span><span class="p">(</span><span class="no">Thumb</span><span class="p">,</span> <span class="no">Enum</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="o">&</span><span class="p">%{</span><span class="ss">vid_id:</span> <span class="n">vid</span><span class="o">.</span><span class="n">id</span><span class="p">,</span> <span class="ss">url:</span> <span class="nv">&1</span><span class="p">}))</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>Our <code>generate_thumbnails</code> function accepts a video struct. We shell out to <code>ffmpeg</code> to take the video URL and generate thumbnails at a given interval. We then write the temporary thumbnail paths to durable storage. Finally, we insert the generated thumbnail URLs into the database.</p>
<p>This works great locally, but CPU-bound work like video transcoding can quickly bring our entire service to a halt in production. Instead of rewriting large swaths of our app to move this into microservices or some FaaS, we can simply wrap it in a FLAME call:</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-vr4odag1"><span class="k">def</span> <span class="n">generate_thumbnails</span><span class="p">(%</span><span class="no">Video</span><span class="p">{}</span> <span class="o">=</span> <span class="n">vid</span><span class="p">,</span> <span class="n">interval</span><span class="p">)</span> <span class="k">do</span>
  <span class="no">FLAME</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="no">MyApp</span><span class="o">.</span><span class="no">FFMpegRunner</span><span class="p">,</span> <span class="k">fn</span> <span class="o">-></span>
    <span class="n">tmp</span> <span class="o">=</span> <span class="no">Path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="no">System</span><span class="o">.</span><span class="n">tmp_dir!</span><span class="p">(),</span> <span class="no">Ecto</span><span class="o">.</span><span class="no">UUID</span><span class="o">.</span><span class="n">generate</span><span class="p">())</span>
    <span class="no">File</span><span class="o">.</span><span class="n">mkdir!</span><span class="p">(</span><span class="n">tmp</span><span class="p">)</span>
    <span class="n">args</span> <span class="o">=</span>
      <span class="p">[</span><span class="s2">"-i"</span><span class="p">,</span> <span class="n">vid</span><span class="o">.</span><span class="n">url</span><span class="p">,</span> <span class="s2">"-vf"</span><span class="p">,</span> <span class="s2">"fps=1/</span><span class="si">#{</span><span class="n">interval</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="s2">"</span><span class="si">#{</span><span class="n">tmp</span><span class="si">}</span><span class="s2">/%02d.png"</span><span class="p">]</span>
    <span class="no">System</span><span class="o">.</span><span class="n">cmd</span><span class="p">(</span><span class="s2">"ffmpeg"</span><span class="p">,</span> <span class="n">args</span><span class="p">)</span>
    <span class="n">urls</span> <span class="o">=</span> <span class="no">VidStore</span><span class="o">.</span><span class="n">put_thumbnails</span><span class="p">(</span><span class="n">vid</span><span class="p">,</span> <span class="no">Path</span><span class="o">.</span><span class="n">wildcard</span><span class="p">(</span><span class="n">tmp</span> <span class="o"><></span> <span class="s2">"/*.png"</span><span class="p">))</span>
    <span class="no">Repo</span><span class="o">.</span><span class="n">insert_all</span><span class="p">(</span><span class="no">Thumb</span><span class="p">,</span> <span class="no">Enum</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="o">&</span><span class="p">%{</span><span class="ss">vid_id:</span> <span class="n">vid</span><span class="o">.</span><span class="n">id</span><span class="p">,</span> <span class="ss">url:</span> <span class="nv">&1</span><span class="p">}))</span>
  <span class="k">end</span><span class="p">)</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>That’s it! <code>FLAME.call</code> accepts the name of a runner pool, and a function. It then finds or boots a new copy of our entire application and runs the function there. Any variables the function closes over (like our <code>%Video{}</code> struct and <code>interval</code>) are passed along automatically.</p>
<p>When the FLAME runner boots up, it connects back to the parent node, receives the function to run, executes it, and returns the result to the caller. Based on configuration, the booted runner either waits happily for more work before idling down, or extinguishes itself immediately.</p>
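<p>The closure passing leans on a property of the Erlang VM: an anonymous function and the variables it captured can be serialized and rebuilt wherever the same compiled module is loaded. Here is a minimal single-node sketch of that idea. <code>ClosureDemo</code> is a made-up module, and <code>term_to_binary</code> stands in for the trip over the wire; this is not how the flame library itself is written.</p>

```elixir
defmodule ClosureDemo do
  # Returns a function that closes over `interval`, much like the function
  # handed to FLAME.call closes over `vid` and `interval`.
  def build(interval) do
    fn seconds -> div(seconds, interval) end
  end

  # Serialize the closure to a binary and rebuild it, as would happen when
  # a term is sent to another node running the same code.
  def roundtrip(fun) do
    fun
    |> :erlang.term_to_binary()
    |> :erlang.binary_to_term()
  end
end

thumbnail_count = ClosureDemo.build(10)
shipped = ClosureDemo.roundtrip(thumbnail_count)

# The captured `interval` survived the round trip.
IO.inspect(shipped.(95))
# prints 9
```

<p>The rebuilt function only works because the receiving side has the same module loaded, which is exactly what booting a copy of the whole app gives the runner.</p>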
<p>Let’s visualize the flow:</p>
<p><img alt="visualizing the flow" src="/images/visual.webp?centered" /></p>
<p>We changed no other code and issued our DB write with <code>Repo.insert_all</code> just like before, because we are running our <em>entire application</em>. Database connection(s) and all. Except this fleeting application only runs that little function after startup and nothing else.</p>
<p>In practice, a FLAME implementation will support a pool of runners for hot startup, scale-to-zero, and elastic growth. More on that later.</p>
<h2 id='solving-a-problem-vs-removing-the-problem' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#solving-a-problem-vs-removing-the-problem' aria-label='Anchor'></a>Solving a problem vs removing the problem</h2><blockquote>FaaS solutions help you solve a problem. FLAME removes the problem.</blockquote>
<p>The FaaS labyrinth of complexity defies reason. And it’s unavoidable. Let’s walk through the thumbnail use-case to see how.</p>
<p>We try to start with the simplest building block, like request/response AWS Lambda Function URLs.</p>
<p>The complexity hits immediately.</p>
<p>We start writing custom encoders/decoders on both sides to handle streaming the thumbnails back to the app over HTTP. Phew, that’s done. Wait, is our video transcoding or user upload going to take longer than 15 minutes? Sorry, hard timeout limit – time to split our videos into chunks to stay within the timeout, which means more lambdas to do that. Now we’re orchestrating lambda workflows and relying on additional services, such as SQS and S3, to enable this.</p>
<p>All the FaaS is doing is adding layers of communication between your code and the parts you want to run elastically. Each layer has its own glue integration price to pay.</p>
<p>Ultimately, handling this kind of use-case looks something like this:</p>
<ul>
<li>Trigger the lambda via HTTP endpoint, S3, or API gateway ($)
</li><li>Write the bespoke lambda to transcode the video ($)
</li><li>Place the thumbnail results into SQS ($)
</li><li>Write the SQS consumer in our app (dev $)
</li><li>Persist to DB and figure out how to get events back to active subscribers that may well be connected to other instances than the SQS consumer (dev $)
</li></ul>
<p>This is nuts. We pay the FaaS toll at every step. We shouldn’t have to do any of this!</p>
<p>FaaS provides a bunch of offerings to build a solution on top of. FLAME removes the problem entirely.</p>
<h2 id='flame-backends' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#flame-backends' aria-label='Anchor'></a>FLAME Backends</h2><blockquote>On Fly.io infrastructure the <code>FLAME.FlyBackend</code> can boot a copy of your application on a new <a href="https://fly.io/docs/machines/">Machine</a> and have it connect back to the parent for work within ~3s.</blockquote>
<p>By default, FLAME ships with a <code>LocalBackend</code> and <code>FlyBackend</code>, but any host that provides an API to provision a server and run your app code can work as a FLAME backend. Erlang and Elixir primitives are doing all the heavy lifting here. The entire <code>FLAME.FlyBackend</code> is <a href='https://github.com/phoenixframework/flame/blob/main/lib/flame/fly_backend.ex' title=''>&lt; 200 LOC with docs</a>. The library has a single dependency, <code>req</code>, which is an HTTP client.</p>
<p>Because Fly.io runs our applications as a packaged-up Docker image, we simply ask the Fly API to boot a new Machine for us with the same image that our app is currently running. Also thanks to Fly infrastructure, we can guarantee the FLAME runners are started in the same region as the parent. This optimizes latency and lets you ship whatever data back and forth between parent and runner without having to think about it.</p>
<h2 id='look-at-everything-were-not-doing' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#look-at-everything-were-not-doing' aria-label='Anchor'></a>Look at everything we’re not doing</h2>
<p>With FaaS, just imagine how quickly the dev and testing story becomes a fate worse than death.</p>
<p>To run the app locally, we either need to add some huge dev dependencies to simulate the entire FaaS pipeline, or worse, connect up our dev and test environments directly to the FaaS provider.</p>
<p>With FLAME, your dev and test runners simply run on the local backend.</p>
<p>Remember, this is your app. FLAME just controls where modular parts of it run. In dev or test, those parts simply run on the existing runtime on your laptop or CI server.</p>
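<p>In configuration terms, that split can be as small as choosing the backend per environment. The fragment below is a sketch that assumes the configuration keys shown in the flame README at the time of writing, and that the library falls back to the local backend when nothing is set; check the library docs before relying on either.</p>

```elixir
# config/runtime.exs
import Config

# Assumption: with no :backend configured, the flame library uses
# FLAME.LocalBackend, so dev and test need no extra setup.
if config_env() == :prod do
  # In production, boot runners as Fly Machines.
  config :flame, :backend, FLAME.FlyBackend
  # The token is read from the environment; FLY_API_TOKEN must be set.
  config :flame, FLAME.FlyBackend, token: System.fetch_env!("FLY_API_TOKEN")
end
```

<p>The application code calling <code>FLAME.call</code> stays the same in every environment; only this setting changes where the function runs.</p>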
<p>Using Elixir, we can even send a file across to the remote FLAME application thanks to the distributed features of the Erlang VM:</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3lqqax1t"><span class="k">def</span> <span class="n">generate_thumbnails</span><span class="p">(%</span><span class="no">Video</span><span class="p">{}</span> <span class="o">=</span> <span class="n">vid</span><span class="p">,</span> <span class="n">interval</span><span class="p">)</span> <span class="k">do</span>
  <span class="n">parent_stream</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">stream!</span><span class="p">(</span><span class="n">vid</span><span class="o">.</span><span class="n">filepath</span><span class="p">,</span> <span class="p">[],</span> <span class="mi">2048</span><span class="p">)</span>
  <span class="no">FLAME</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="no">MyApp</span><span class="o">.</span><span class="no">FFMpegRunner</span><span class="p">,</span> <span class="k">fn</span> <span class="o">-></span>
    <span class="n">tmp_file</span> <span class="o">=</span> <span class="no">Path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="no">System</span><span class="o">.</span><span class="n">tmp_dir!</span><span class="p">(),</span> <span class="no">Ecto</span><span class="o">.</span><span class="no">UUID</span><span class="o">.</span><span class="n">generate</span><span class="p">())</span>
    <span class="n">flame_stream</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">stream!</span><span class="p">(</span><span class="n">tmp_file</span><span class="p">)</span>
    <span class="no">Enum</span><span class="o">.</span><span class="n">into</span><span class="p">(</span><span class="n">parent_stream</span><span class="p">,</span> <span class="n">flame_stream</span><span class="p">)</span>
    <span class="n">tmp</span> <span class="o">=</span> <span class="no">Path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="no">System</span><span class="o">.</span><span class="n">tmp_dir!</span><span class="p">(),</span> <span class="no">Ecto</span><span class="o">.</span><span class="no">UUID</span><span class="o">.</span><span class="n">generate</span><span class="p">())</span>
    <span class="no">File</span><span class="o">.</span><span class="n">mkdir!</span><span class="p">(</span><span class="n">tmp</span><span class="p">)</span>
    <span class="n">args</span> <span class="o">=</span>
      <span class="p">[</span><span class="s2">"-i"</span><span class="p">,</span> <span class="n">tmp_file</span><span class="p">,</span> <span class="s2">"-vf"</span><span class="p">,</span> <span class="s2">"fps=1/</span><span class="si">#{</span><span class="n">interval</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="s2">"</span><span class="si">#{</span><span class="n">tmp</span><span class="si">}</span><span class="s2">/%02d.png"</span><span class="p">]</span>
    <span class="no">System</span><span class="o">.</span><span class="n">cmd</span><span class="p">(</span><span class="s2">"ffmpeg"</span><span class="p">,</span> <span class="n">args</span><span class="p">)</span>
    <span class="n">urls</span> <span class="o">=</span> <span class="no">VidStore</span><span class="o">.</span><span class="n">put_thumbnails</span><span class="p">(</span><span class="n">vid</span><span class="p">,</span> <span class="no">Path</span><span class="o">.</span><span class="n">wildcard</span><span class="p">(</span><span class="n">tmp</span> <span class="o"><></span> <span class="s2">"/*.png"</span><span class="p">))</span>
    <span class="no">Repo</span><span class="o">.</span><span class="n">insert_all</span><span class="p">(</span><span class="no">Thumb</span><span class="p">,</span> <span class="no">Enum</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="o">&</span><span class="p">%{</span><span class="ss">vid_id:</span> <span class="n">vid</span><span class="o">.</span><span class="n">id</span><span class="p">,</span> <span class="ss">url:</span> <span class="nv">&1</span><span class="p">}))</span>
  <span class="k">end</span><span class="p">)</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>On line 2 we open a file stream on the parent node for the video path. Then, in the FLAME child, we stream the file from the parent node to the FLAME server in only a couple of lines of code. That’s it! No setup of S3 or HTTP interfaces required.</p>
<p>With FLAME it’s easy to miss everything we’re not doing:</p>
<ul>
<li>We don’t need to write code outside of our application. We can reuse business logic, database setup, PubSub, and all the features of our respective platforms
</li><li>We don’t need to manage deploys of separate services or endpoints
</li><li>We don’t need to write results to S3 or SQS just to pick up values back in our app
</li><li>We skip the dev, test, and CI dependency dance
</li></ul>
<h2 id='flame-outside-elixir' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#flame-outside-elixir' aria-label='Anchor'></a>FLAME outside Elixir</h2>
<p>Elixir is fantastically well suited for the FLAME model because we get so much <a href='https://fly.io/phoenix-files/elixir-and-phoenix-can-do-it-all/' title=''>for free</a> like process supervision and distributed messaging. That said, any language with reasonable concurrency primitives can take advantage of this pattern. For example, my teammate, Lubien, created a proof of concept example for breaking out functions in your JavaScript application and running them inside a new Fly Machine: <a href='https://github.com/lubien/fly-run-this-function-on-another-machine' title=''>https://github.com/lubien/fly-run-this-function-on-another-machine</a></p>
<p>So the general flow for a JavaScript-based FLAME call would be to move the modular executions to a new file, which is executed on a runner pool. Provided the arguments are JSON serializable, the general FLAME flow is similar to what we’ve outlined here. Your application, your code, running on fleeting instances.</p>
<p>A complete FLAME library will need to handle the following concerns:</p>
<ul>
<li>Elastic pool scale-up and scale-down logic
</li><li>Hot vs cold startup with pools
</li><li>Remote runner monitoring to avoid orphaned resources
</li><li>How to monitor and keep deployments fresh
</li></ul>
<p>For the rest of this post we’ll see how the Elixir FLAME library handles these concerns as well as features uniquely suited to Elixir applications. But first, you might be wondering about your background job queues.</p>
<h2 id='what-about-my-background-job-processor' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-about-my-background-job-processor' aria-label='Anchor'></a>What about my background job processor?</h2>
<p>FLAME works great inside your background job processor, but you may have noticed some overlap. If your job library handles scaling the worker pool, what is FLAME doing for you? There are a couple of important distinctions here.</p>
<p>First, we reach for these queues when we need <em>durability guarantees</em>. We often can turn knobs to have the queues scale to handle more jobs as load changes. But durable operations are separate from elastic execution. Conflating these concerns can send you down a similar path to lambda complexity. Leaning on your worker queue purely for offloaded execution means writing all the glue code to get the data into and out of the job, and back to the caller or end-user’s device somehow.</p>
<p>For example, if we want to guarantee we successfully generated thumbnails for a video after the user uploads it, then a job queue makes sense as the <em>dispatch, commit, and retry mechanism</em> for this operation. The actual transcoding could be a FLAME call inside the job itself, so we decouple the ideas of durability and scaled execution.</p>
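<p>As a rough illustration of that split, here’s a sketch where the retry loop knows nothing about where the work runs. <code>RetrySketch</code> is a hypothetical stand-in for a job queue; a real queue would also persist the job so it survives restarts, and the <code>work</code> function is where a FLAME call could go.</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code>defmodule RetrySketch do
  # Durability concern: keep trying until the work succeeds or attempts run out.
  def perform(_work, 0), do: {:error, :retries_exhausted}

  def perform(work, attempts_left) when attempts_left > 0 do
    # Execution concern: `work` decides where and how the operation runs.
    case work.() do
      {:ok, result} -> {:ok, result}
      {:error, _reason} -> perform(work, attempts_left - 1)
    end
  end
end

# Stand-in for the transcoding step; it could just as well wrap a remote call.
RetrySketch.perform(fn -> {:ok, :thumbnails_generated} end, 3)
# {:ok, :thumbnails_generated}
</code></pre>
</div>
</div>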
<p>On the other side, we have operations we don’t need durability for. Take the screencast above where the user hasn’t yet saved their video. Or an ML model execution where there’s no need to waste resources churning a prompt if the user has already left the app. In those cases, it doesn’t make sense to write to a durable store to pick up a job for work that will go right into the ether.</p>
<h2 id='pooling-for-elastic-scale' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#pooling-for-elastic-scale' aria-label='Anchor'></a>Pooling for Elastic Scale</h2>
<p>With the Elixir implementation of FLAME, you define elastic pools of runners. This allows scale-to-zero behavior while also elastically scaling up FLAME servers with max concurrency limits.</p>
<p>For example, let’s take a look at the <code>start/2</code> callback, which is the entry point of all Elixir applications. We can drop in a <code>FLAME.Pool</code> for video transcoding and say we want it to scale to zero, boot a max of 10 runners, and support 5 concurrent <code>ffmpeg</code> operations per runner:</p>
<div class="highlight-wrapper group relative elixir">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-b4v6n5ah"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-b4v6n5ah"><span class="k">def</span> <span class="n">start</span><span class="p">(</span><span class="n">_type</span><span class="p">,</span> <span class="n">_args</span><span class="p">)</span> <span class="k">do</span>
  <span class="n">flame_parent</span> <span class="o">=</span> <span class="no">FLAME</span><span class="o">.</span><span class="no">Parent</span><span class="o">.</span><span class="n">get</span><span class="p">()</span>
  <span class="n">children</span> <span class="o">=</span> <span class="p">[</span>
    <span class="o">...</span><span class="p">,</span>
    <span class="no">MyApp</span><span class="o">.</span><span class="no">Repo</span><span class="p">,</span>
    <span class="p">{</span><span class="no">FLAME</span><span class="o">.</span><span class="no">Pool</span><span class="p">,</span>
     <span class="ss">name:</span> <span class="no">Thumbs</span><span class="o">.</span><span class="no">FFMpegRunner</span><span class="p">,</span>
     <span class="ss">min:</span> <span class="mi">0</span><span class="p">,</span>
     <span class="ss">max:</span> <span class="mi">10</span><span class="p">,</span>
     <span class="ss">max_concurrency:</span> <span class="mi">5</span><span class="p">,</span>
     <span class="ss">idle_shutdown_after:</span> <span class="mi">30_000</span><span class="p">},</span>
    <span class="n">!flame_parent</span> <span class="o">&&</span> <span class="no">MyAppWeb</span><span class="o">.</span><span class="no">Endpoint</span>
  <span class="p">]</span>
  <span class="o">|></span> <span class="no">Enum</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="o">&</span> <span class="nv">&1</span><span class="p">)</span>
  <span class="n">opts</span> <span class="o">=</span> <span class="p">[</span><span class="ss">strategy:</span> <span class="ss">:one_for_one</span><span class="p">,</span> <span class="ss">name:</span> <span class="no">MyApp</span><span class="o">.</span><span class="no">Supervisor</span><span class="p">]</span>
  <span class="no">Supervisor</span><span class="o">.</span><span class="n">start_link</span><span class="p">(</span><span class="n">children</span><span class="p">,</span> <span class="n">opts</span><span class="p">)</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>We use the presence of a FLAME parent to conditionally start our Phoenix webserver when booting the app. There’s no reason to start a webserver if we aren’t serving web traffic. Note we leave other services like the database <code>MyApp.Repo</code> alone because we want to make use of those services inside FLAME runners.</p>
<p>Elixir’s supervised process approach to applications is uniquely great for turning these kinds of knobs.</p>
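<p>If the conditional child trick looks terse, here’s the same idea spelled out as a plain function. This is a sketch with a hypothetical <code>ChildrenSketch</code> module, and it assumes what the snippet above implies: the parent lookup is <code>nil</code> on a regular app node and non-nil on a FLAME runner.</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code>defmodule ChildrenSketch do
  # `flame_parent` stands in for the result of FLAME.Parent.get() (assumed nil on the parent node).
  def children(flame_parent) do
    [
      MyApp.Repo,
      # Only regular app nodes serve web traffic.
      if(flame_parent, do: nil, else: MyAppWeb.Endpoint)
    ]
    |> Enum.reject(fn child -> is_nil(child) end)
  end
end

ChildrenSketch.children(nil)
# [MyApp.Repo, MyAppWeb.Endpoint]

ChildrenSketch.children(%{some: :parent_info})
# [MyApp.Repo]
</code></pre>
</div>
</div>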
<p>We also set our pool to idle down after 30 seconds of no caller operations. This keeps our runners hot for a short while before discarding them. We could also pass a <code>min: 1</code> to always ensure at least one <code>ffmpeg</code> runner is hot and ready for work by the time our application is started.</p>
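<p>Idle shutdown is a pattern the platform already makes cheap. As a sketch (not FLAME’s actual runner code, and <code>IdleSketch</code> is a hypothetical name), a <code>GenServer</code> can return a timeout with every reply and stop itself when no message arrives in that window:</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code>defmodule IdleSketch do
  use GenServer

  # `idle_after` is in milliseconds, like the idle_shutdown_after option above.
  def start(idle_after), do: GenServer.start(__MODULE__, idle_after)

  def work(pid), do: GenServer.call(pid, :work)

  @impl true
  def init(idle_after), do: {:ok, idle_after, idle_after}

  @impl true
  def handle_call(:work, _from, idle_after) do
    # Returning a timeout restarts the idle countdown after every call.
    {:reply, :done, idle_after, idle_after}
  end

  @impl true
  def handle_info(:timeout, idle_after) do
    # No calls arrived within the window, so stop normally.
    {:stop, :normal, idle_after}
  end
end

{:ok, pid} = IdleSketch.start(30_000)
IdleSketch.work(pid)
# :done
</code></pre>
</div>
</div>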
<h2 id='process-placement' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#process-placement' aria-label='Anchor'></a>Process Placement</h2>
<p>In Elixir, stateful bits of our applications are built around the <em>process</em> primitive – lightweight green threads with message mailboxes. Wrapping our otherwise stateless app code in synchronous <code>FLAME.call</code>s or async <code>FLAME.cast</code>s works great, but what about the stateful parts of our app?</p>
<p><code>FLAME.place_child</code> exists to take an existing process specification in your Elixir app and start it on a FLAME runner instead of locally. You can use it anywhere you’d use <code>Task.Supervisor.start_child</code>, <code>DynamicSupervisor.start_child</code>, or similar interfaces. Just like <code>FLAME.call</code>, the process is run on an elastic pool and runners handle idling down when the process completes its work.</p>
<p>And like <code>FLAME.call</code>, it lets us take existing app code, change a single line of code, and continue shipping features.</p>
<p>Let’s walk through the example from the screencast above. Imagine we want to generate thumbnails for a video <em>as it is being uploaded</em>. Elixir and LiveView make this easy. We won’t cover all the code here, but you can view the <a href='https://github.com/fly-apps/thumbnail_generator/blob/main/lib/thumbs/thumbnail_generator.ex' title=''>full app implementation</a>.</p>
<p>Our first pass would be to write a LiveView upload writer that calls into a <code>ThumbnailGenerator</code>:</p>
<div class="highlight-wrapper group relative elixir">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-5m6hmw7u"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-5m6hmw7u"><span class="k">defmodule</span> <span class="no">ThumbsWeb</span><span class="o">.</span><span class="no">ThumbnailUploadWriter</span> <span class="k">do</span>
  <span class="nv">@behaviour</span> <span class="no">Phoenix</span><span class="o">.</span><span class="no">LiveView</span><span class="o">.</span><span class="no">UploadWriter</span>
  <span class="n">alias</span> <span class="no">Thumbs</span><span class="o">.</span><span class="no">ThumbnailGenerator</span>
  <span class="k">def</span> <span class="n">init</span><span class="p">(</span><span class="n">opts</span><span class="p">)</span> <span class="k">do</span>
    <span class="n">generator</span> <span class="o">=</span> <span class="no">ThumbnailGenerator</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">opts</span><span class="p">)</span>
    <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="p">%{</span><span class="ss">gen:</span> <span class="n">generator</span><span class="p">}}</span>
  <span class="k">end</span>
  <span class="k">def</span> <span class="n">write_chunk</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">state</span><span class="p">)</span> <span class="k">do</span>
    <span class="no">ThumbnailGenerator</span><span class="o">.</span><span class="n">stream_chunk!</span><span class="p">(</span><span class="n">state</span><span class="o">.</span><span class="n">gen</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span>
    <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="n">state</span><span class="p">}</span>
  <span class="k">end</span>
  <span class="k">def</span> <span class="n">meta</span><span class="p">(</span><span class="n">state</span><span class="p">),</span> <span class="k">do</span><span class="p">:</span> <span class="p">%{</span><span class="ss">gen:</span> <span class="n">state</span><span class="o">.</span><span class="n">gen</span><span class="p">}</span>
  <span class="k">def</span> <span class="n">close</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">_reason</span><span class="p">)</span> <span class="k">do</span>
    <span class="no">ThumbnailGenerator</span><span class="o">.</span><span class="n">close</span><span class="p">(</span><span class="n">state</span><span class="o">.</span><span class="n">gen</span><span class="p">)</span>
    <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="n">state</span><span class="p">}</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>An upload writer is a behavior that simply ferries the uploaded chunks from the client into whatever we’d like to do with them. Here we have a <code>ThumbnailGenerator.open/1</code> which starts a process that communicates with an <code>ffmpeg</code> shell. Inside <code>ThumbnailGenerator.open/1</code>, we use regular Elixir process primitives:</p>
<div class="highlight-wrapper group relative elixir">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-u49exyj4"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-u49exyj4"><span class="c1"># thumbnail_generator.ex</span>
<span class="k">def</span> <span class="n">open</span><span class="p">(</span><span class="n">opts</span> <span class="p">\\</span> <span class="p">[])</span> <span class="k">do</span>
  <span class="no">Keyword</span><span class="o">.</span><span class="n">validate!</span><span class="p">(</span><span class="n">opts</span><span class="p">,</span> <span class="p">[</span><span class="ss">:timeout</span><span class="p">,</span> <span class="ss">:caller</span><span class="p">,</span> <span class="ss">:fps</span><span class="p">])</span>
  <span class="n">timeout</span> <span class="o">=</span> <span class="no">Keyword</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">opts</span><span class="p">,</span> <span class="ss">:timeout</span><span class="p">,</span> <span class="mi">5_000</span><span class="p">)</span>
  <span class="n">caller</span> <span class="o">=</span> <span class="no">Keyword</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">opts</span><span class="p">,</span> <span class="ss">:caller</span><span class="p">,</span> <span class="n">self</span><span class="p">())</span>
  <span class="n">ref</span> <span class="o">=</span> <span class="n">make_ref</span><span class="p">()</span>
  <span class="n">parent</span> <span class="o">=</span> <span class="n">self</span><span class="p">()</span>
  <span class="n">spec</span> <span class="o">=</span> <span class="p">{</span><span class="bp">__MODULE__</span><span class="p">,</span> <span class="p">{</span><span class="n">caller</span><span class="p">,</span> <span class="n">ref</span><span class="p">,</span> <span class="n">parent</span><span class="p">,</span> <span class="n">opts</span><span class="p">}}</span>
  <span class="p">{</span><span class="ss">:ok</span><span class="p">,</span> <span class="n">pid</span><span class="p">}</span> <span class="o">=</span> <span class="no">DynamicSupervisor</span><span class="o">.</span><span class="n">start_child</span><span class="p">(</span><span class="nv">@sup</span><span class="p">,</span> <span class="n">spec</span><span class="p">)</span>
  <span class="k">receive</span> <span class="k">do</span>
    <span class="p">{</span><span class="o">^</span><span class="n">ref</span><span class="p">,</span> <span class="p">%</span><span class="no">ThumbnailGenerator</span><span class="p">{}</span> <span class="o">=</span> <span class="n">gen</span><span class="p">}</span> <span class="o">-></span>
      <span class="p">%</span><span class="no">ThumbnailGenerator</span><span class="p">{</span><span class="n">gen</span> <span class="o">|</span> <span class="ss">pid:</span> <span class="n">pid</span><span class="p">}</span>
  <span class="k">after</span>
    <span class="n">timeout</span> <span class="o">-></span> <span class="k">exit</span><span class="p">(</span><span class="ss">:timeout</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>The details aren’t super important here, except for the line where we call <code>{:ok, pid} = DynamicSupervisor.start_child(@sup, spec)</code>, which starts a supervised <code>ThumbnailGenerator</code> process. The rest of the implementation simply ferries chunks as stdin into <code>ffmpeg</code> and parses PNGs from stdout. Once a PNG delimiter is found in stdout, we send the <code>caller</code> process (our LiveView process) a message saying “hey, here’s an image”:</p>
<div class="highlight-wrapper group relative elixir">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-fws50f3g"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-fws50f3g"><span class="c1"># thumbnail_generator.ex</span>
<span class="nv">@png_begin</span> <span class="o"><<</span><span class="mi">137</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="mi">78</span><span class="p">,</span> <span class="mi">71</span><span class="p">,</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">26</span><span class="p">,</span> <span class="mi">10</span><span class="o">>></span>
<span class="k">defp</span> <span class="n">handle_stdout</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">ref</span><span class="p">,</span> <span class="n">bin</span><span class="p">)</span> <span class="k">do</span>
  <span class="p">%</span><span class="no">ThumbnailGenerator</span><span class="p">{</span><span class="ss">ref:</span> <span class="o">^</span><span class="n">ref</span><span class="p">,</span> <span class="ss">caller:</span> <span class="n">caller</span><span class="p">}</span> <span class="o">=</span> <span class="n">state</span><span class="o">.</span><span class="n">gen</span>
  <span class="k">case</span> <span class="n">bin</span> <span class="k">do</span>
    <span class="o"><<</span><span class="nv">@png_begin</span><span class="p">,</span> <span class="n">_rest</span><span class="p">::</span><span class="n">binary</span><span class="o">>></span> <span class="o">-></span>
      <span class="k">if</span> <span class="n">state</span><span class="o">.</span><span class="n">current</span> <span class="k">do</span>
        <span class="n">send</span><span class="p">(</span><span class="n">caller</span><span class="p">,</span> <span class="p">{</span><span class="n">ref</span><span class="p">,</span> <span class="ss">:image</span><span class="p">,</span> <span class="n">state</span><span class="o">.</span><span class="n">count</span><span class="p">,</span> <span class="n">encode</span><span class="p">(</span><span class="n">state</span><span class="p">)})</span>
      <span class="k">end</span>
      <span class="p">%{</span><span class="n">state</span> <span class="o">|</span> <span class="ss">count:</span> <span class="n">state</span><span class="o">.</span><span class="n">count</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">current:</span> <span class="p">[</span><span class="n">bin</span><span class="p">]}</span>
    <span class="n">_</span> <span class="o">-></span>
      <span class="p">%{</span><span class="n">state</span> <span class="o">|</span> <span class="ss">current:</span> <span class="p">[</span><span class="n">bin</span> <span class="o">|</span> <span class="n">state</span><span class="o">.</span><span class="n">current</span><span class="p">]}</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
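<p>The framing logic is easier to see in isolation. Here’s a standalone sketch with a hypothetical <code>FrameSketch</code> module that folds stdout chunks into complete frames. Like the snippet above, it assumes a new PNG always starts at the beginning of a chunk, and it skips the messaging and encoding steps.</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code>defmodule FrameSketch do
  # The eight-byte PNG signature, the same bytes as @png_begin above.
  @png_begin :binary.list_to_bin([137, 80, 78, 71, 13, 10, 26, 10])

  # Fold one stdout chunk into {finished_frames, current_chunks}.
  def push({frames, current}, chunk) do
    if starts_frame?(chunk) do
      # A new signature means the previous frame (if any) is complete.
      {finish(frames, current), [chunk]}
    else
      # Chunks are prepended, so `current` holds them newest first.
      {frames, [chunk | current]}
    end
  end

  defp starts_frame?(chunk) do
    byte_size(chunk) >= 8 and binary_part(chunk, 0, 8) == @png_begin
  end

  defp finish(frames, []), do: frames

  defp finish(frames, current) do
    frame = current |> Enum.reverse() |> IO.iodata_to_binary()
    frames ++ [frame]
  end
end

sig = :binary.list_to_bin([137, 80, 78, 71, 13, 10, 26, 10])
chunks = [IO.iodata_to_binary([sig, "aaa"]), "bbb", IO.iodata_to_binary([sig, "ccc"])]

{frames, _in_progress} =
  Enum.reduce(chunks, {[], []}, fn chunk, acc -> FrameSketch.push(acc, chunk) end)

length(frames)
# 1 (the second frame is still in progress until another signature arrives)
</code></pre>
</div>
</div>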
<p>The <code>caller</code> LiveView process then picks up the message in a <code>handle_info</code> callback and updates the UI:</p>
<div class="highlight-wrapper group relative elixir">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-euw7rf7i"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-euw7rf7i"><span class="c1"># thumb_live.ex</span>
<span class="k">def</span> <span class="n">handle_info</span><span class="p">({</span><span class="n">_ref</span><span class="p">,</span> <span class="ss">:image</span><span class="p">,</span> <span class="n">_count</span><span class="p">,</span> <span class="n">encoded</span><span class="p">},</span> <span class="n">socket</span><span class="p">)</span> <span class="k">do</span>
  <span class="p">%{</span><span class="ss">count:</span> <span class="n">count</span><span class="p">}</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="n">assigns</span>
  <span class="p">{</span><span class="ss">:noreply</span><span class="p">,</span>
   <span class="n">socket</span>
   <span class="o">|></span> <span class="n">assign</span><span class="p">(</span><span class="ss">count:</span> <span class="n">count</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">message:</span> <span class="s2">"Generating (</span><span class="si">#{</span><span class="n">count</span> <span class="o">+</span> <span class="mi">1</span><span class="si">}</span><span class="s2">)"</span><span class="p">)</span>
   <span class="o">|></span> <span class="n">stream_insert</span><span class="p">(</span><span class="ss">:thumbs</span><span class="p">,</span> <span class="p">%{</span><span class="ss">id:</span> <span class="n">count</span><span class="p">,</span> <span class="ss">encoded:</span> <span class="n">encoded</span><span class="p">})}</span>
<span class="k">end</span>
</code></pre>
</div>
</div>
<p>The <code>send(caller, {ref, :image, state.count, encode(state)})</code> call is one of the magic parts of Elixir. Everything is a process, and we can message those processes, regardless of their location in the cluster.</p>
<p>It’s like if every instantiation of an object in your favorite OO lang included a cluster-global unique identifier to work with methods on that object. The LiveView (a process) simply receives the image message and updates the UI with new images.</p>
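<p>You can try the same message shape in a plain script. In this sketch the spawned process stands in for a <code>ThumbnailGenerator</code>, and the payload is a made-up string. The code stays the same when the sender lives on another node, because a pid is all it needs to reach the caller.</p>
<div class="highlight-wrapper group relative elixir">
<div class='highlight relative group'>
<pre class='highlight '><code>caller = self()
ref = make_ref()

# The spawned process only needs the caller's pid to reach it.
spawn(fn -> send(caller, {ref, :image, 1, "fake-png-bytes"}) end)

result =
  receive do
    # Pinning the ref ensures we only accept messages meant for this request.
    {^ref, :image, count, encoded} -> {count, encoded}
  after
    1_000 -> :timeout
  end

# result is {1, "fake-png-bytes"}
</code></pre>
</div>
</div>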
<p>Now let’s head back over to our <code>ThumbnailGenerator.open/1</code> function and make this elastically scalable.</p>
<div class="highlight-wrapper group relative diff">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-6y3c6bma"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-6y3c6bma"><span class="gd">- {:ok, pid} = DynamicSupervisor.start_child(@sup, spec)
</span><span class="gi">+ {:ok, pid} = FLAME.place_child(Thumbs.FFMpegRunner, spec)
</span></code></pre>
</div>
</div>
<p>That’s it! Because everything is a process and processes can live anywhere, it doesn’t matter what server our <code>ThumbnailGenerator</code> process lives on. It simply messages the caller with <code>send(caller, …)</code> and the messages are sent across the cluster if needed.</p>
<p>Once the process exits, either from an explicit close, after the upload is done, or from the end-user closing their browser tab, the FLAME server will note the exit and idle down if no other work is being done.</p>
<p>Check out the <a href='https://github.com/fly-apps/thumbnail_generator/blob/main/lib/thumbs/thumbnail_generator.ex' title=''>full implementation</a> if you’re interested.</p>
<h2 id='remote-monitoring' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#remote-monitoring' aria-label='Anchor'></a>Remote Monitoring</h2>
<p>All this transient infrastructure needs failsafe mechanisms to avoid orphaning resources. If a parent spins up a runner, that runner must take care of idling itself down when no work is present and handle failsafe shutdowns if it can no longer contact the parent node.</p>
<p>Likewise, we need to shut down runners when parents are rolled for new deploys, as we must guarantee we’re running the same code across the cluster.</p>
<p>We also have active callers in many cases that are awaiting the result of work on runners that could go down for any reason.</p>
<p>There’s a lot to monitor here.</p>
<p>There are also a number of failure modes that make this sound like a harrowing experience to implement. Fortunately, Elixir has all the primitives to make this an easy task thanks to the Erlang VM. Namely, we get the following for free:</p>
<ul>
<li>Process monitoring and supervision – we know when things go bad, whether in a node-local process or one across the cluster
</li><li>Node monitoring – we know when nodes come up, and when nodes go away
</li><li>Declarative and controlled app startup and shutdown – we carefully control the startup and shutdown sequence of applications as a matter of course. This allows us to gracefully shut down active runners when a fresh deploy is triggered, while giving them time to finish their work
</li></ul>
<p>We’ll cover the internal implementation details in a future deep-dive post. For now, feel free to poke around <a href='https://github.com/phoenixframework/flame' title=''>the flame source</a>.</p>
<h2 id='whats-next' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#whats-next' aria-label='Anchor'></a>What’s Next</h2>
<p>We’re just getting started with the Elixir FLAME library, but it’s ready to try out now. In the future, look for more advanced pool growth techniques and deep dives into how the Elixir implementation works. You can also find me <a href='https://twitter.com/chris_mccord' title=''>@chris_mccord</a> to chat about implementing the FLAME pattern in your language of choice.</p>
<p>Happy coding!</p>
<p>–Chris</p>
/blog/rethinking-serverless-with-flame/assets/flame-thumb.webpThe risks of building apps on ChatGPThttps://fly.io/blog/the-risks-of-building-apps-on-chatgpt/2023-12-05T00:00:00+00:002024-02-20T19:03:55+00:00<div class="lead"><p>If AI will play an essential role in your application, then consider using a self-hosted, open source model instead of a proprietary and externally hosted one. In this post we explore some of the risks for the latter option. We’re Fly.io. We put your code into lightweight microVMs on our own hardware <a href="https://fly.io/docs/reference/regions/" title="">around the world</a>. <a href="https://fly.io/docs/speedrun/" title="">Check us out</a>—your app can be deployed in minutes.</p>
</div>
<p>The topic of “AI” gets a lot of attention and press. Coverage ranges from apocalyptic warnings to utopian predictions. The truth, as always, is likely somewhere in the middle. As developers, we are the ones who either imagine ways that AI can be used to enhance our products or do the herculean work of implementing it inside our companies.</p>
<p>I believe the following statement to be true:</p>
<blockquote>
<p>AI won’t replace humans — but humans with AI will replace humans without AI.</p>
</blockquote>
<p>I believe this can be extended to many products and services and the companies that create them. Let’s express it this way:</p>
<blockquote>
<p>AI won’t replace businesses — but businesses with AI will replace businesses without AI.</p>
</blockquote>
<p>Today I’m assuming your business would benefit from using AI. Or, at the very least, your C-levels have decreed from on high that thou must integrateth with AI. With that out of the way, the next question is how you’re meant to do it. This post is an argument to build on top of open source language models instead of closed models that you rent access to. We’ll take a look at what convinced me.</p>
<h2 id='but-openai-is-the-market-leader' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#but-openai-is-the-market-leader' aria-label='Anchor'></a>But OpenAI is the market leader…</h2>
<p>OpenAI, the creators of the famous ChatGPT, are the strong market leaders in this category. Why wouldn’t you want to use the best in the business?</p>
<p>Early on, stories of employees uploading private corporate documents, and of that private information then leaking out to general ChatGPT users, were a real black eye. <a href='https://www.sciencealert.com/many-companies-are-banning-chatgpt-this-is-why' title=''>Companies began banning employees from using ChatGPT for work</a>. It exposed that people’s interactions with ChatGPT were being used as training data for future versions of the model.</p>
<p>In response, OpenAI recently announced an <a href='https://openai.com/enterprise' title=''>Enterprise</a> offering promising that no Enterprise customer data is used for training.</p>
<p>With the top objection addressed, it should be smooth sailing for wide adoption, right?</p>
<p>Not so fast.</p>
<p>While an Enterprise offering may address that concern, there are other subtle reasons to not use OpenAI, or other closed models, that can’t be resolved by vague statements of enterprise privacy.</p>
<h2 id='what-are-the-risks-for-building-on-top-of-openai' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-are-the-risks-for-building-on-top-of-openai' aria-label='Anchor'></a>What are the risks for building on top of OpenAI?</h2>
<p>Let’s briefly outline the risks we take on when relying on a company like OpenAI for critical AI features in our applications.</p>
<ul>
<li><strong class='font-[550] text-navy-950'>Single provider risk</strong>: Relying deeply on an external service that plays a critical role in our business is risky. The integration is not easily swapped out for another service if needed. Additionally, we don’t want part of our “secret sauce” to actually be another company’s product. That’s some seriously shaky ground! They <em>want</em> to sell the same thing to our competitors too.
</li><li><strong class='font-[550] text-navy-950'>Regulation or Policy change risk</strong>: “AI” is being talked about a lot in politics. What’s acceptable today may be deemed “not allowed” in the future and a corporation providing a newly regulated service must comply.
</li><li><strong class='font-[550] text-navy-950'>Financial risk</strong>: <a href='https://www.washingtonpost.com/technology/2023/06/05/chatgpt-hidden-cost-gpu-compute/' title=''>AI chatbots lose money on every chat.</a> If the financial models that make our business profitable are built on impossible-to-maintain prices, then our business model may be at risk when it’s time to “make the AI engine profitable”, as we’ve seen happen time and time again in every industry from cookware to video games. What might the true cost be? We don’t know. ‘Nuff said.
</li><li><strong class='font-[550] text-navy-950'>Governance and leadership risk</strong>: The co-founder and CEO of OpenAI, <a href='https://openai.com/blog/openai-announces-leadership-transition' title=''>Sam Altman, was forced out of his own company by a coup from his board</a>. This was later resolved with both <a href='https://openai.com/blog/sam-altman-returns-as-ceo-openai-has-a-new-initial-board' title=''>Sam Altman and Greg Brockman returning</a>. This exposes another risk we don’t often consider with our providers. More on this later.
</li></ul>
<p>Let’s look a bit closer at the “Single provider risk”.</p>
<h2 id='single-provider-risk' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#single-provider-risk' aria-label='Anchor'></a>Single provider risk</h2>
<p>For hobby usage, proof-of-concept work, and personal experiments, by all means, use ChatGPT! I do, and I expect to keep doing so. It’s fantastic for prototyping, it’s trivial to set up, and it allows you to throw ink on canvas so much more quickly than any other option out there.</p>
<p>Up until recently, I was all gung-ho for ChatGPT being integrated into my apps. What happened? November 2023 happened. It was a very bad month for OpenAI.</p>
<p>I created a <a href='https://fly.io/phoenix-files/created-my-personal-ai-fitness-trainer-in-2-days/' title=''>Personal AI Fitness Trainer</a> powered by ChatGPT and on the morning of November 8th, I asked my personal trainer about the workout for the day and it failed. OpenAI was having a bad day with an outage.</p>
<p>I don’t fault someone for having a bad day. At some point, downtime happens to the best of us. And given enough time, it happens to <strong class='font-[550] text-navy-950'>all</strong> of us. But when possible, I want to prevent someone <em>else’s</em> bad day from becoming <em>my</em> bad day too.</p>
<h3 id='evaluating-a-critical-dependency' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#evaluating-a-critical-dependency' aria-label='Anchor'></a>Evaluating a critical dependency</h3>
<p>In my case, my personal fitness trainer being unavailable was a minor inconvenience, but I managed. However, it gave me pause. If I had built an AI fitness trainer as a service, that outage would be a much bigger deal and there would be nothing I could have done to fix it until the ChatGPT API came back up.</p>
<p>With services like a Personal AI Fitness Trainer, the AI component is the primary focus and main value proposition of the app. That’s pretty darn critical! If that AI service is interrupted, significantly altered (say, by the model suddenly refusing my requests for fitness information in ways that worked before) or my desired usage is denied (without warning or reason), the application is useless. That’s an existential threat that could make my app evaporate overnight without warning.</p>
<p>This highlights the risk of having a critical dependency on an external service.</p>
<p>Modern applications depend on many services, both internal and external. But how <strong class='font-[550] text-navy-950'>critical</strong> that dependency is matters.</p>
<p>Let’s take a <em>very</em> simple application as an example. The application has a critical dependency on the database and both the app and database have a critical dependency on the underlying VMs/machines/provider. These critical dependencies are so common that we seldom think about them because we deal with them every day we come to work. It’s just how things are.</p>
<p><img alt="Diagram showing an application stack of hosting > Database > My Application and weak dependencies on logging, error reporting, etc. Then a critical dependency on an external AI as a Service. " src="/images/critical-dependency-vs-weak.png" /></p>
<p>The danger comes when we draw a critical dependency line to an <strong class='font-[550] text-navy-950'>external</strong> <strong class='font-[550] text-navy-950'>service</strong>. If the service has a hiccup or the network between my app and their service starts dropping all my packets, the entire application goes down. Someone else’s bad day gets spread around when that happens. 😞</p>
<p>In order to protect ourselves from a risk like that, we should diversify our reliance away from a single external provider. How do we do that? We’ll come back to this later.</p>
<h3 id='we-are-not-without-dependencies' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#we-are-not-without-dependencies' aria-label='Anchor'></a>We are not without dependencies</h3>
<p>It’s really common for apps to have external dependencies. The question is how critical those dependencies are to our service.</p>
<p>What happens to the application when the external log aggregation service, email service, and error reporting services are all unreachable? If the app is designed well, then users may have a slightly degraded experience or, best case, the users won’t even notice the issues at all!</p>
<p>The key factor is that these external services are not essential to our application functioning.</p>
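<p>As a minimal sketch of that idea, assuming a JavaScript app: the hypothetical <code>bestEffort</code> helper below wraps a call to a non-essential service so that an error or a slow response falls back to a default instead of failing the request. The helper name, the 500 ms timeout, and the fallback values are illustrative assumptions, not part of any library.</p>

```javascript
// Hypothetical helper: run a call to a non-essential service, but never let
// its failure (or slowness) take the surrounding request down with it.
async function bestEffort(task, { timeoutMs = 500, fallback = null } = {}) {
  let timer;
  // Resolves with the fallback if the task takes longer than timeoutMs.
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  try {
    // Promise.resolve().then(task) also captures synchronous throws from task.
    return await Promise.race([Promise.resolve().then(task), timeout]);
  } catch (err) {
    // The weak dependency failed: degrade quietly instead of propagating.
    return fallback;
  } finally {
    clearTimeout(timer);
  }
}
```

<p>A call such as <code>await bestEffort(() =&gt; reportError(err))</code>, where <code>reportError</code> stands in for your own client, then degrades quietly when the service is unreachable. The AI call in an app like the fitness trainer can’t be wrapped this way, because there is no useful fallback. That is what makes it a critical dependency.</p>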
<h2 id='regulation-or-policy-change-risk' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#regulation-or-policy-change-risk' aria-label='Anchor'></a>Regulation or Policy change risk</h2>
<p>Our industry has a lot of misconceptions, fear, uncertainty, and doubt around the idea of regulation, but sometimes it’s justified. I don’t want you to think about regulation as a scary thing that yanks away control. Instead, let’s think about regulation as when a government body gets involved to disallow businesses from doing or engaging in specific activities. Given that our industry has been so self-defined for so long, this feels like an existential threat. However, this is a good thing when we think about vehicle safety standards (you don’t want your 4-ton mass of metal exploding while traveling at 70 mph), pollution, health risks, and more. It’s a careful balance.</p>
<p>Ironically, Sam Altman has been a major proponent <a href='https://www.forbes.com/sites/johannacostigan/2023/06/13/openais-sam-altman-makes-global-call-for-ai-regulation-and-includes-china/?sh=4fc007421b47' title=''>for government regulation</a> of the AI industry. Why would he want that?</p>
<p>It turns out that <a href='https://www.cato.org/policy-analysis/regulatory-protectionism-hidden-threat-free-trade' title=''>regulation can also be used as a form of protectionism</a>. Or, put another way, when the people with an early lead see that <a href='https://www.semianalysis.com/p/google-we-have-no-moat-and-neither' title=''>they aren’t defensible against advances with open source AI models</a>, they want to pull up the ladders behind them and have the government make it legally harder, or impossible, for competitors to catch up to them.</p>
<p>If Altman’s efforts are successful, then companies who create AI can expect government involvement and oversight. Added licensing requirements and certifications would raise the cost of starting a competing business.</p>
<p>At this point you may be thinking something like, “But all of that is theoretical, Mark. How would this affect my business’ use of AI today?”</p>
<p>Introducing an external organization that can dictate changes to an AI product risks breaking an existing company’s applications or significantly reducing the effectiveness of the application. And those changes may come without notice or warning.</p>
<p>Additionally, if my business is built on an external AI system protected from competition by regulators, that adds a significant risk. If they are now the only game in town, they can set whatever price they want.</p>
<h2 id='governance-and-leadership-risk' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#governance-and-leadership-risk' aria-label='Anchor'></a>Governance and leadership risk</h2>
<p>In the week following the OpenAI outage (on November 17th, to be precise), the entire tech industry was upended for most of a week by a post on the OpenAI blog <a href='https://openai.com/blog/openai-announces-leadership-transition' title=''>announcing that the OpenAI board had fired the co-founder and CEO, Sam Altman</a>. Then <a href='https://www.forbes.com/sites/richardnieva/2023/11/17/openai-president-and-co-founder-quits-over-sam-altman-firing/?sh=34fe4b621d57' title=''>Greg Brockman, co-founder and acting President, resigned in protest</a>.</p>
<p>OpenAI is partnered with Microsoft and on Nov 20, 2023, <a href='https://twitter.com/satyanadella/status/1726509045803336122' title=''>Satya Nadella (CEO of Microsoft) posted the following on X</a> (formerly Twitter):</p>
<blockquote>
<p>We remain committed to our partnership with OpenAI (OAI) and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett Shear and OAI’s new leadership team and working with them. And <strong class='font-[550] text-navy-950'>we’re extremely excited to share the news that Sam Altman and Greg Brockman, together with colleagues, will be joining Microsoft to lead a new advanced AI research team.</strong> We look forward to moving quickly to provide them with the resources needed for their success.</p>
</blockquote>
<p>Microsoft nearly <a href='https://en.wikipedia.org/wiki/Acqui-hiring' title=''>acqui-hired</a> OpenAI for $0! That’s some serious business Jujutsu.</p>
<p>In the end, after 12 days of very public corporate chaos, <a href='https://openai.com/blog/sam-altman-returns-as-ceo-openai-has-a-new-initial-board' title=''>Sam Altman and Greg Brockman returned to OpenAI in their previous leadership positions</a> as if nothing happened (save the firing of the rest of the board).</p>
<p>With all the drama and uncertainty resolved, you may say, “it all worked out in the end, right? So what’s the problem?”</p>
<p>This highlights the risk of building <em>any</em> critical business system on a product offered and hosted by an external company. When we do that, we implicitly take on all of that company’s risks in addition to the risks our business already has! In this case, it’s taking on all the risks of OpenAI while getting none of their financial benefits!</p>
<h2 id='whats-the-alternative' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#whats-the-alternative' aria-label='Anchor'></a>What’s the alternative?</h2>
<p>The thing big AI providers like OpenAI and Google seem to fear most is competition from open source AI models. And they should be afraid. Open source AI models continue to develop at a rapid pace (there are huge incremental improvements on a weekly basis) and, most importantly, they can be self-hosted.</p>
<p>Additionally, it’s not out of reach for us to <a href='https://huggingface.co/docs/transformers/training' title=''>fine-tune</a> a general model to better fit our needs by adding and removing capabilities, rather than hoping that the capabilities we need suddenly manifest for us.</p>
<p>Doesn’t this all sound like the classic argument in favor of open source?</p>
<p>If we have the model and can host it ourselves, no one can take it away. When we self-host it, we are protected from:</p>
<ul>
<li>service interruptions from an external provider for a critical system
</li><li>changes in licensing or usage fees (such as your provider suddenly doubling inference costs without warning via an email sent at 3AM)
</li><li>government regulators dictating a change to the model that negatively affects our use case (assuming our use isn’t breaking the law of course)
</li><li>company policy changes that change the behavior of the model we rely on
</li><li>rogue boards or a leadership crisis that impacts a provider
</li></ul>
<p>Using an open source and self-hosted model insulates us from these external risks.</p>
<h2 id='i-still-need-gpus' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#i-still-need-gpus' aria-label='Anchor'></a>I still need GPUs!</h2>
<p>Getting dedicated access to a GPU is more expensive than renting limited time on OpenAI’s servers. That’s why a hobby or personal project is better off paying for the brief bits of time when needed.</p>
<p>But let’s face it.</p>
<p>If you really want to integrate AI into your business, you need to host your own models. You can’t control third-party privacy policies, but you can control your own policies when you are the one doing your own inference with your own models. Ideally this means getting your own GPUs and incurring the capital and operating expenditures, but thankfully we’re in the future. We have the cloud now. There are many options for renting GPU access from other companies. This is supported in the big clouds as well as Fly.io. You can check out our <a href='https://fly.io/docs/about/pricing/#gpus-and-fly-machines' title=''>GPU offerings here</a>.</p>
<figure class="post-cta">
<figcaption>
<h1>Fly.io also offers GPUs</h1>
<p>Running inference on your own hosted models can help de-risk critical AI integrations.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/about/pricing/#gpus-and-fly-machines">
GPU resource prices
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-cat.webp" srcset="/static/images/cta-cat@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='closing-thoughts' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#closing-thoughts' aria-label='Anchor'></a>Closing thoughts</h2>
<p>It’s important to take advantage of AI in our applications so we can reap the benefits. It can give us an important edge in the market! However, we should be extra cautious of building any critical features on a product offered by a proprietary external business. <a href='https://www.msn.com/en-us/money/companies/sam-altman-chaos-helped-openai-rivals-says-hugging-face-ceo-cl%C3%A9ment-delangue/ar-AA1kIFQP' title=''>Others are considering the risks of building on OpenAI as well</a>.</p>
<p>Your specific level of risk depends on how central the AI aspect is to your business. If it’s a central component, as in my Personal AI Fitness Trainer, then I risk losing all my customers and even the company if any of the above-mentioned risk factors hit my AI provider. That’s an existential risk that I can’t do anything about without emergency heroic efforts.</p>
<p>If the AI is sprinkled around the edges of the business, then suddenly losing it won’t kill the company. However, if the AI isn’t being well utilized, then the business may be at risk from competitors who place a bigger bet and take a bigger swing with AI.</p>
<p>Oh, what interesting times we live in! 🙃</p>
/blog/the-risks-of-building-apps-on-chatgpt/assets/risks-building-on-chatgpt-thumb.webpPrint on Demandhttps://fly.io/blog/print-on-demand/2023-11-29T00:00:00+00:002024-02-05T16:04:11+00:00<div class="lead"><p>Save money by using appliance machines to only allocate memory and other machine resources when you actually need them.</p>
</div>
<p>Scaling discussions often lead to recommendations to add more memory, more CPU, more machines, more regions, more, more, more.</p>
<p>This post is different. It focuses instead on the idea of decomposing parts of your applications into event handlers, starting up Machines to handle the events when needed, and stopping them when the event is done. Along the way we will see how a few built in Fly.io primitives make this easy.</p>
<p>To make the discussion concrete, we are going to focus on a common requirement: generation of PDFs from web pages. The code that we will introduce isn’t merely an example produced in support of a blog post - rather it is code that was extracted from a production application, and packaged up into an appliance that you can deploy in minutes to add PDF generation to your existing application.</p>
<p>But before we dive in, let’s back up a bit.</p>
<h2 id='motivation' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#motivation' aria-label='Anchor'></a>Motivation</h2>
<p>Normally the way this is approached is to start with a tool like <a href='https://github.com/puppeteer/puppeteer' title=''>Puppeteer</a>, <a href='https://github.com/Studiosity/grover#readme' title=''>Grover</a>, <a href='https://playwright.dev/' title=''>Playwright</a>, <a href='https://github.com/bitcrowd/chromic_pdf' title=''>ChromicPDF</a>, or <a href='https://spatie.be/docs/browsershot/v2/introduction' title=''>BrowserShot</a>. These and other tools ultimately launch a browser like <a href='https://developer.chrome.com/articles/new-headless/' title=''>Chrome headless</a>.</p>
<p>Now a few things about Chrome itself:</p>
<ul>
<li>It likely is bigger than your entire web server.
</li><li>It likely uses more memory than you see with a typical load on your server.
</li><li>All told, people using your server likely spend much less time generating PDFs than they do using the rest of your application.
</li></ul>
<p>Taken together, this makes splitting PDF generation into a completely separate application an easy win. With a smaller image, your application will start faster. Memory usage will be more predictable, and the memory needed to generate PDFs will only be allocated when needed and can be scaled separately.</p>
<h2 id='diving-in' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#diving-in' aria-label='Anchor'></a>Diving in</h2>
<p>Without further ado, the entire application is available on GitHub as <a href='https://github.com/fly-apps/pdf-appliance/#readme' title=''>fly-apps/pdf-appliance</a>. Installation is a simple matter of: clone repository, create app, adjust config, deploy, and scale.</p>
<p>Next, you will need to integrate this into your application. All that is needed is to reply to requests that are intended to produce a PDF with a <a href='https://fly.io/docs/reference/dynamic-request-routing/#the-fly-replay-response-header' title=''>fly-replay</a> response header. This can either be done on individual application routes / controller actions, or it can be done globally via either middleware or a front end like <a href='https://www.nginx.com/' title=''>NGINX</a>. You can find a few examples in the <a href='https://github.com/fly-apps/pdf-appliance/#integrate-with-your-existing-application' title=''>README</a>.</p>
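<p>To make that concrete, here is a rough sketch of the per-route variant in JavaScript. It assumes the appliance was deployed under the hypothetical app name <code>pdf-appliance</code> and that your app uses a fetch-style handler. <code>replayHeadersFor</code>, <code>handle</code>, and <code>renderPage</code> are illustrative names, and the README examples remain the authoritative reference.</p>

```javascript
// Hypothetical name of the deployed PDF appliance app; substitute your own.
const PDF_APP = "pdf-appliance";

// Return the headers that ask Fly's proxy to replay this request to the
// PDF app, or null when the request is not for a PDF.
function replayHeadersFor(pathname, appName = PDF_APP) {
  if (!pathname.toLowerCase().endsWith(".pdf")) return null;
  return { "fly-replay": `app=${appName}` };
}

// Fetch-style handler (Bun, or Node 18+ where Response is a global).
// renderPage is a placeholder for the rest of your application.
function handle(request, renderPage) {
  const { pathname } = new URL(request.url);
  const replay = replayHeadersFor(pathname);
  if (replay) return new Response("", { headers: replay });
  return renderPage(request);
}
```

<p>Keeping the decision in a small pure function makes it easy to reuse from middleware if you later want to apply it globally.</p>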
<p>And, that’s it. The most you might consider doing is issuing an additional HTTP request in anticipation of the user selecting what they want to print as this will <a href='https://github.com/fly-apps/pdf-appliance/#preloading-optional' title=''>preload the machine</a>.</p>
<figure class="post-cta">
<figcaption>
<h1>Scale at your own pace</h1>
<p>Deploy your project in a few minutes with Fly Launch. Then do more with Fly Machines.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/">
Run your entire stack near your users
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-cat.webp" srcset="/static/images/cta-cat@2x.webp 2x" alt="">
</div>
</figure>
<p>If you don’t have an application handy, you can try a demo. Go to <a href='https://smooth.fly.dev/' title=''>smooth.fly.dev</a>. Click on Demo, then on Publish, and finally on Invoices to see a PDF. The PDF you see will likely be underwhelming as you would need to enter students, entries, packages and options to fill out the page. But click refresh anyway and see how fast it responds. If you want to explore further, links to the <a href='https://smooth.fly.dev/showcase/docs/' title=''>documentation</a> and <a href='https://github.com/rubys/showcase#readme' title=''>code</a> can be found on the front page.</p>
<h2 id='implementation-details' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#implementation-details' aria-label='Anchor'></a>Implementation Details</h2>
<p>The basic flow starts when a request for a PDF comes into your app. That request is replayed to the PDF appliance. A Chrome instance in that app then issues a second request to your app for the same URL minus the <code>.pdf</code> extension, and converts the HTML it receives in response to a PDF. That PDF is then returned as the response to the original request.</p>
<p>A single Google Chrome instance per machine will be reused across all requests, which itself is faster than starting a new instance per request. As all HTTP headers will be passed back to your application, this will seamlessly work with your existing session, cookies, and basic authentication.</p>
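<p>The URL and header handling in that flow can be sketched in a few lines. This is an illustration under stated assumptions rather than the appliance’s actual source: <code>upstreamRequestFor</code> is a hypothetical name, and the real code may treat headers differently.</p>

```javascript
// Build the request the appliance would send back to the main app:
// the same URL without the trailing ".pdf", carrying the same headers
// (so sessions, cookies, and basic authentication keep working).
function upstreamRequestFor(pdfUrl, incomingHeaders = {}) {
  const url = new URL(pdfUrl);
  // Only the path changes; the query string is left untouched.
  url.pathname = url.pathname.replace(/\.pdf$/i, "");
  return { url: url.toString(), headers: { ...incomingHeaders } };
}
```

<p>For example, a request for <code>https://example.com/invoices/42.pdf?year=2023</code> maps to <code>https://example.com/invoices/42?year=2023</code>, which is the HTML page that Chrome then renders.</p>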
<p>Stopping idle machines is handled by the <code>auto_stop_machines</code> setting in your <code>fly.toml</code>, and its companion <code>auto_start_machines</code> setting starts them again on demand. With these in place, machines can confidently exit when idle, secure in the knowledge that they will be restarted when needed. See the <a href='https://github.com/fly-apps/pdf-appliance/#scaling' title=''>README</a> for more information on scaling.</p>
<p>Note that different machines can use different languages and frameworks. This code is written in JavaScript and runs on Bun. It was designed to support a Ruby on Rails app, but can be used with any app.</p>
<h2 id='a-reusable-pattern' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-reusable-pattern' aria-label='Anchor'></a>A Reusable Pattern</h2>
<p>If your app is small and your usage is low, scaling may not be much of a
concern, but as your needs grow your first instinct shouldn’t merely be to throw
more hardware at the problem, but rather to partition the problem so that each
machine has a somewhat predictable capacity.</p>
<p>Do this by taking a look at your application and looking for requests that are
somehow different from the rest. Streaming audio and video files, handling websockets,
converting text to speech or performing other AI processing, long running
“background” computation, fetching static pages, producing PDFs, and updating
databases all have different profiles in terms of server load.</p>
<p>It might even be helpful – purely as a thought experiment – to think of
replacing your main server with a proxy that does nothing more than route
requests to separate machines based on the type of workload performed.</p>
<p>Once you have come up with an allocation of functions performed to pools of
machines, Fly-Replay is but one tool available to you. There is also a
<a href='https://fly.io/docs/machines/working-with-machines/' title=''>Machines API</a> that will
enable you to orchestrate whatever topology you can come up with.
<a href='https://fly.io/laravel-bytes/cost-effective-queue-workers-with-fly-io-machines/' title=''>Cost-Effective Queue Workers With Fly.io
Machines</a>
gives a preview of what that would look like with Laravel.</p>
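<p>The thought experiment above, a front door that only routes requests by workload type, might look something like this sketch. The pool names and path patterns are invented for illustration; a real app would derive them from its own traffic.</p>

```shell
# Hypothetical routing table: pool names and path patterns are made up.
# Prints the name of the machine pool that should handle a request path.
pool_for() {
  case "$1" in
    *.pdf)                 echo "pdf" ;;        # PDF generation
    /cable|/cable/*)       echo "websocket" ;;  # long-lived connections
    /assets/*|*.css|*.js)  echo "static" ;;     # static files
    *.mp3|*.mp4)           echo "media" ;;      # streaming audio and video
    *)                     echo "app" ;;        # everything else
  esac
}

for path in /invoices/42.pdf /cable /assets/logo.png /talks/intro.mp4 /students; do
  printf '%s -> %s\n' "$path" "$(pool_for "$path")"
done
```

<p>With Fly-Replay, each branch would answer with a replay instruction naming the target app rather than printing a pool name; the classification step stays the same.</p>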
/blog/print-on-demand/assets/print-on-demand-thumb.webpLaunching to Victoryhttps://fly.io/blog/new-launch/2023-11-28T00:00:00+00:002024-02-05T16:04:11+00:00<div class="lead"><p>Fly.io is the new public cloud for running your applications near your users so it can be faster than ever. When you create a new application, you use the <code>fly launch</code> command to give the platform all the information it needs to send it out into the sky. We’ve made steps towards making launching a new app <em>even easier</em> because first impressions matter. <a href="https://fly.io/docs/speedrun/" title="">Try the new <code>fly launch</code> now</a>; you can have an app up and running in mere minutes.</p>
</div>
<p>Previously when you ran <code>fly launch</code>, you got asked a bunch of hopefully relevant questions to help you get your app up and running. We’ve taken a lot of the guesswork out of the process and made it a lot more streamlined. It turns out that even though Fly.io developers use a variety of frameworks, languages, and toolchains you can fold most of them into a few basic infrastructure shapes.</p>
<h2 id='the-new-launch' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-new-launch' aria-label='Anchor'></a>The new launch</h2>
<p>Now when you run <code>fly launch</code>, the CLI will infer what you want based on the source code of your application. For example, if you have a Rails app with SQLite, it’ll give you an opinionated set of defaults that you can build from. If you don’t, it’ll give you other options so you can craft the infrastructure you need. I took one of my older applications named <a href='https://douglas-adams-quotes.fly.dev/' title=''>douglas-adams-quotes</a> and launched it with the new flow. Here’s what it looks like:</p>
<p><img alt="An animated GIF showing the new fully automated launch process. It starts by guessing what your app is and what needs it has, then presents you with a set of opinionated defaults so that you can confirm or deny. If you confirm it will build your application and deploy it, then give you the URL so you can use it." src="/images/./the-gif-edited.gif" /></p>
<p>If the settings it guessed are good enough, you can launch it into the cloud. If not, then you’ll be taken to a webpage where you can confirm or change the settings it guessed.</p>
<p>Once you say yes or confirm on the web, your app will get built and deployed (unless you asked it not to with <code>--no-deploy</code>). You’ll get a link to your app so you can go check it out. It’s that easy.</p>
<h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>We hope that this can help you look before you <code>fly launch</code> into the wild unknowns of the cloud.</p>
<p>Got any ideas or comments on how we can make this even smoother? Get in touch on our <a href='https://community.fly.io/' title=''>community forum</a>. We’d love to hear from you.</p>
/blog/new-launch/assets/thumb.webpHow I Flyhttps://fly.io/blog/how-i-fly/2023-11-17T00:00:00+00:002023-11-28T14:16:01+00:00<div class="lead"><p>We are Fly.io. We make it easy to run your programs close to your users. We make it easy to update your programs whenever you need to and communicate between your services in an end-to-end encrypted fashion. Today, Xe is going to tell you what they do to use Fly.io effectively. <a href="https://fly.io/docs/speedrun/" title="">Deploy your first app</a> for free and scale it up to production. That’s what Xe did.</p>
</div>
<p>I’m Xe Iaso. I’m a writer, technical educator, and philosopher who focuses on making technology easy to understand and scale to your needs. I use Fly.io to host my website and in nearly all of my personal projects now. Fly.io allows me to experiment with new ideas quickly and then deploy them to the world with ease.</p>
<h2 id='what-is-fly-io' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#what-is-fly-io' aria-label='Anchor'></a>What is Fly.io?</h2>
<p>Fly.io lets you host your applications in data centers close to your users. Fly.io also lets you have rolling updates of your programs and facilitates easy communication between your services inside and outside of your organization’s private network.</p>
<p>I use Fly.io to host my blog, its CDN (named XeDN for reasons which are an exercise for the reader), and a bunch of other supporting services that help make it run. It is easily the most fun I’ve had deploying things since I worked at Heroku.</p>
<h2 id='my-blog' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#my-blog' aria-label='Anchor'></a>My blog</h2>
<p>My blog is made up of several parts: the backend blog server and the CDN. Both are written in Go, my favorite programming language. The back-end blog server runs in Toronto, but XeDN runs in 35 datacenters worldwide. I plan to eventually move my blog to be served from XeDN, but for right now it’s still comfortably running off of a single server in Toronto.</p>
<p><img alt="The entire flow for how things run on Xesite." src="/images/./rebuild-flow.svg" /></p>
<p>Overall, my website’s architecture looks like this. My website listens for updates from Patreon and GitHub to trigger rebuilds because of its <a href='https://xeiaso.net/blog/xesite-v4/' title=''>dystatic nature</a>. When I am working on new posts or building new assets, I upload them to Backblaze B2. Anytime someone tries to access one of the files on a XeDN node, it will download it from Backblaze B2 if it doesn’t have it locally already.</p>
<p>With Fly.io, I don’t have to worry about the user experience being degraded when servers go down. If any individual XeDN server goes down, I can rely on the other XeDN servers worldwide to pick up the slack, because Fly.io will shunt the traffic to the servers that aren’t down. Combine this with some very aggressive caching logic for things like video assets, and I can make sure that my blog is fast for everyone, no matter where they are in the world.</p>
<p>Of course, it doesn’t end here. My CDN server is the back end that helps make my other projects work too. I spent some time working on a <a href='https://xeiaso.net/blog/iaso-fonts/' title=''>custom font</a> for all of my web properties, and I <a href='https://cdn.xeiaso.net/static/pkg/iosevka/specimen.html' title=''>serve it from my CDN</a> so that I can use it in every project of mine. This allows me to integrate it into other projects like <a href='https://arsene.fly.dev/' title=''>Arsène</a> without having to do anything special.</p>
<h2 id='building-on-top-of-projects-with-fly-io' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#building-on-top-of-projects-with-fly-io' aria-label='Anchor'></a>Building on top of projects with Fly.io</h2>
<p>I like making projects that aren’t entirely serious. I love using these projects to explore aspects and bits of technology that I would have never gotten to play with before. One of these is <a href='https://arsene.fly.dev' title=''>Arsène</a>, a project I used to explore what a “dead internet” powered by AI could look like.</p>
<p>Every 12 hours, Arsène will have the ChatGPT API generate new posts and then use Stable Diffusion to create a (hopefully relevant) illustration for that post. I run a copy of the <a href='https://github.com/AUTOMATIC1111/stable-diffusion-webui' title=''>Automatic1111</a> Stable Diffusion API in my private network. When Arsène generates an image, it reaches out to that Stable Diffusion API directly over that private network to make the calls it needs. Since XeDN is in the same private network, I can also have Arsène send the images there to be cached and served all over the world.</p>
<p>Here’s what the total flow looks like:</p>
<p><img alt="The flow of data for Arsène, showing how this lets me reuse projects" src="/images/./reuse-flow.svg" /></p>
<p>This means that when I am creating things, I am not just making one-off things that don’t work with each other. I am creating individual building blocks that interoperate with each other. I am creating opportunities for me to reuse my infrastructure to create brand new things that are robust and scalable with minimal effort on my end.</p>
<h2 id='my-other-projects' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#my-other-projects' aria-label='Anchor'></a>My other projects</h2>
<p>I have some other projects that I’m working on that I don’t want to get into too much detail about yet, but they’re mostly going to involve transforming the basic ideas of using my CDN for distributing things and a webserver for sending HTML to users in new and interesting ways. I love using Fly.io for this because I am just allowed to create things instead of having to worry about how to implement them, where state is going to be stored, or how I’m going to scale them.</p>
<div class="callout"><p>Fly.io is the only platform I’ve used where I can spin up 35 copies of a program as easily as one copy of a program.</p>
</div><h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>If you haven’t given Fly.io a try yet, you’re really missing out. It is utterly trivial to deploy your application across the globe. Not to mention, when your applications are idle, you can have them scale down to zero copies. This means that you only pay for what you actually use. I don’t have to worry about overpaying for my blog by having a giant server in Helsinki running 24/7, even though I’m only using a small sliver of it.</p>
<p>If you want to learn more about Fly.io, you can check out <a href='https://fly.io' title=''>fly.io</a>. My CDN cost me nothing until I started adding cover art per post and the <a href='https://xeiaso.net//blog/how-mara-works-2020-09-30/' title=''>conversation snippets</a> with furry stickers. It definitely went over the bar when I started uploading video. I can see it scaling in the future as my demands scale too.</p>
<p>Of course, this is barely even scratching the surface. Stay tuned for secret tricks you can use to dynamically spin up and spin down machines as you need. Imagine uploading an image, automatically creating a machine to handle compressing it, and uploading it to your storage back end. Imagine what you could do if compute was a faucet that you could turn on and off as you needed it.</p>
<p>You can do it on Fly.io. Try it today, you can run an app on a 256 MB Machine for free. XeDN ran on three 256 MB Machines for a year. Arsène still runs on a 256 MB Machine to this day. It’s more than enough for what you’re going to do. And when it isn’t, scaling up is <a href='https://fly.io/docs/about/pricing/' title=''>cheaper than you can imagine</a>.</p>
/blog/how-i-fly/assets/thumb.webpTranscribing on Fly GPU Machineshttps://fly.io/blog/transcribing-on-fly-gpu-machines/2023-11-13T00:00:00+00:002024-02-20T19:03:55+00:00<div class="lead"><p>Fly.io has GPUs! If you want to run AI (or whatever) workloads, check out how to <a href="https://fly.io/docs/gpus/gpu-quickstart/" title="">get started with GPU Machines</a>!</p>
</div>
<p>Fly.io has GPU Machines, which means we can finally <del>play games</del> <del>mine bitcoin</del> <del>baghold NFTs</del> run AI workloads with just a few API calls.</p>
<p>This is exciting! Running GPU workloads yourself is useful when the community™ builds upon available models to make them faster, more useful, or less restrictive than first-party APIs.</p>
<p>One such tool is the <a href='https://github.com/ahmetoner/whisper-asr-webservice' title=''>Whisper Webservice</a>, which is conveniently packaged in a way that makes it a good candidate to use on Fly GPU Machines.</p>
<p>Let’s see how to use Fly.io GPU by spinning up Whisper Webservice.</p>
<h2 id='whisper-webservice' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#whisper-webservice' aria-label='Anchor'></a>Whisper Webservice</h2>
<p>Whisper is OpenAI’s voice recognition service - it’s used for audio transcription. To use it anywhere that’s not OpenAI’s platform, you need <a href='https://github.com/openai/whisper' title=''>some Python</a>, a few GB of storage, and (preferably) a GPU.</p>
<p>The aforementioned <a href='https://github.com/ahmetoner/whisper-asr-webservice' title=''>Whisper Webservice</a> packages this up for us, while making Whisper faster, more useful, and less restricted than OpenAI’s API:</p>
<ol>
<li>It provides a web API on top of Whisper’s Python library
</li><li>It (optionally) integrates <a href='https://github.com/guillaumekln/faster-whisper' title=''>faster-whisper</a> to make it, you know, faster
</li><li>It (optionally) uses FFmpeg to process the uploaded audio file, useful for getting audio out of video files or converting audio formats
</li></ol>
<p>Luckily for us, and totally <strong class='font-[550] text-navy-950'>not</strong> why I chose this as an example - the project provides GPU-friendly Docker images. We’ll use those to spin up Fly GPU Machines and process some audio files.</p>
<p>(I’ll also show examples of making your own Docker image!)</p>
<h2 id='running-a-gpu-machine' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#running-a-gpu-machine' aria-label='Anchor'></a>Running a GPU Machine</h2>
<p>Spinning up a GPU Machine is very similar to any other Machine. The main difference is the new “GPU kind” option (<code>--vm-gpu-kind</code>), which takes 2 possible values:</p>
<ol>
<li><code>a100-pcie-40gb</code>
</li><li><code>a100-sxm4-80gb</code>
</li></ol>
<p>These are 2 flavors of Nvidia A100 GPUs; the difference worth caring about is <code>40</code> vs <code>80</code> GB of memory (here’s <a href='https://fly.io/docs/about/pricing/#gpus-and-fly-machines' title=''>pricing</a>).</p>
<p>We’ll create machines using <code>a100-pcie-40gb</code> because we don’t need 80 freakin’ GB for what we’re doing.</p>
<p>Using <code>flyctl</code> is a great way to run a GPU Machine. We’ll make an app and run the conveniently created <a href='https://hub.docker.com/r/onerahmet/openai-whisper-asr-webservice' title=''>Whisper Webservice Docker image</a> that supports Nvidia GPUs. The <code>flyctl</code> commands will default us into a <code>performance-8x</code> server size (8 CPUs, 16 GB RAM) unless we specify something different.</p>
<p><strong class='font-[550] text-navy-950'>One caveat:</strong> AI model files are big. Docker images ideally aren’t big - sending huge layers across the network angers the spiteful networking gods. If you shove models into your Docker images, you <em>might</em> have a bad time.</p>
<p>We suggest creating a Fly Volume and making your Docker image download needed models when it first spins up. The Whisper service (and in my experience, OpenAI’s Python library) does that for us.</p>
<p>So, we’ll create a volume to house (and cache) the models. In the case of the Whisper project, the models get placed in <code>/root/.cache/whisper</code> on its first boot, and so we’ll mount our disk there.</p>
<p>Alright, let’s create a GPU Machine. Here’s what the process looks like:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-8wlkoyrz"><span class="nv">APP_NAME</span><span class="o">=</span><span class="s2">"whispering-zines"</span>
fly apps create <span class="nv">$APP_NAME</span> <span class="nt">-o</span> personal
<span class="c"># We "hint" --vm-gpu-kind so the volume</span>
<span class="c"># is provisioned on a GPU host</span>
<span class="c"># We choose region ord, where most Fly GPUs</span>
<span class="c"># currently live</span>
fly volumes create whisper_zine_cache <span class="nt">-s</span> 10 <span class="se">\</span>
<span class="nt">-a</span> <span class="nv">$APP_NAME</span> <span class="nt">-r</span> ord <span class="nt">--vm-gpu-kind</span> a100-pcie-40gb
<span class="c"># Take note of the volume ID from the output ^</span>
<span class="c"># Run a machine that can accept web requests</span>
<span class="c"># from the public internet</span>
fly machines run onerahmet/openai-whisper-asr-webservice:latest-gpu <span class="se">\</span>
<span class="nt">--vm-gpu-kind</span> a100-pcie-40gb <span class="se">\</span>
<span class="nt">-p</span> 443:9000/tcp:tls:http <span class="nt">-p</span> 80:9000/tcp:http <span class="se">\</span>
<span class="nt">-r</span> ord <span class="se">\</span>
<span class="nt">-v</span> &lt;VOLUME_ID&gt;:/root/.cache/whisper <span class="se">\</span>
<span class="nt">-e</span> <span class="nv">ASR_MODEL</span><span class="o">=</span>large <span class="nt">-e</span> <span class="nv">ASR_ENGINE</span><span class="o">=</span>faster_whisper <span class="se">\</span>
<span class="nt">-a</span> <span class="nv">$APP_NAME</span>
<span class="c"># Allocate IPs so we can view it on the web</span>
fly ips allocate-v4 <span class="nt">--shared</span> <span class="nt">-a</span> <span class="nv">$APP_NAME</span>
fly ips allocate-v6 <span class="nt">-a</span> <span class="nv">$APP_NAME</span>
</code></pre>
</div>
</div>
<p>That’s all pretty standard for Fly Machines, <strong class='font-[550] text-navy-950'>except</strong> for the <code>--vm-gpu-kind</code> flags used both for volume <strong class='font-[550] text-navy-950'>and</strong> Machine creation. Volumes are pinned to specific hosts - using this flag tells Fly.io to create the volume on a GPU host. Assuming we set the same region (<code>-r ord</code>), creating a GPU Machine with the just-created volume will tell Fly.io to place the Machine on the same host as the volume.</p>
<p><strong class='font-[550] text-navy-950'>Note:</strong> As my machine started up, I saw a log line <code>WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.</code>, which ended up being an issue of timing. Once everything was running, I was able to confirm things were working by using <code>fly ssh console -a $APP_NAME</code> and running <code>nvidia-smi</code> to check that the VM had a GPU. It also listed the web service (Python in this case) as a running GPU process.</p>
<p>Once everything is running, you should be able to head to <code>$APP_NAME.fly.dev</code> and view it in the browser.</p>
<p>The Whisper Webservice UI will let you try out individual calls in its API. This will also give you the information you need to make those calls from your code. There’s a link to the API specification (e.g. <code>$APP_NAME.fly.dev/openapi.json</code>) you can use to, say, have <a href='https://www.blobr.io/post/create-api-specs-chatgpt' title=''>ChatGPT generate a client</a> in your language of choice.</p>
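<p>As a rough sketch of what such a call might look like from a shell: the helper below only builds the request URL. The <code>/asr</code> path, the <code>task</code> and <code>output</code> query parameters, and the <code>audio_file</code> form field are assumptions based on the upstream Whisper Webservice project, so check them against your app’s <code>openapi.json</code> before relying on them. The function name <code>asr_url</code> is made up.</p>

```shell
# Sketch only: the /asr path and the task/output parameters are assumptions
# based on the upstream project; verify them against /openapi.json.
asr_url() {
  local app="$1" task="${2:-transcribe}" output="${3:-json}"
  printf 'https://%s.fly.dev/asr?task=%s&output=%s\n' "$app" "$task" "$output"
}

asr_url "whispering-zines"

# Example upload (needs a running app and a local audio file; the
# "audio_file" field name is also an assumption):
# curl -X POST -F "audio_file=@interview.mp3" "$(asr_url "$APP_NAME")"
```

<p>Keeping the URL construction separate from the upload makes it easy to point the same script at a different app or output format.</p>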
<h2 id='automating-gpu-machines' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#automating-gpu-machines' aria-label='Anchor'></a>Automating GPU Machines</h2>
<p>If you want to automate this, you can use the <a href='https://fly.io/docs/machines/working-with-machines/' title=''>Machines API</a> (spec <a href='https://docs.machines.dev/swagger/index.html' title=''>here</a>).</p>
<p>An easy way to get started is to spy on the API requests <code>flyctl</code> is making:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-49fuh0zz"><span class="c"># Debug logs will output the API requests / responses</span>
<span class="c"># made to Fly.io's API.</span>
<span class="nv">LOG_LEVEL</span><span class="o">=</span>debug flyctl machine run ...
</code></pre>
</div>
</div>
<p>This helped me figure out why my own initial API attempts failed - it turns out we need some extra parameters in the <code>compute</code> portion of the request JSON for creating a volume, and the <code>guest</code> section for creating a Machine.</p>
<p>For both volumes and Machines, we set the <code>gpu_kind</code> the same way we did in our <code>flyctl</code> command. However, we <em>also</em> need the <code>cpu_kind</code> to be set. Additionally, when creating a Machine, we need to set <code>cpus</code> and <code>memory_mb</code> to <a href='https://fly.io/docs/machines/guides-examples/machine-sizing/' title=''>valid values</a> for <code>performance</code> Machines.</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-v381lcrh"><span class="nv">APP_NAME</span><span class="o">=</span><span class="s2">"whispering-zines"</span>
<span class="c"># Create a volume on a GPU host. Specify both</span>
<span class="c"># cpu_kind and gpu_kind</span>
curl <span class="nt">-H</span> <span class="s2">"Authorization: Bearer </span><span class="sb">`</span>fly auth token<span class="sb">`</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-H</span> <span class="s2">"Accept: application/json"</span> <span class="se">\</span>
<span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
https://api.machines.dev/v1/apps/<span class="nv">$APP_NAME</span>/volumes <span class="se">\</span>
<span class="nt">-d</span> <span class="s1">'{
"name": "whisper_zine_cache",
"region": "ord",
"size_gb": 10,
"compute": {
"cpu_kind": "performance",
"gpu_kind": "a100-pcie-40gb"
}
}'</span>
<span class="c"># Take note of the volume ID from the response ^</span>
<span class="c"># Run a machine that can accept web requests</span>
<span class="c"># from the public internet.</span>
curl <span class="nt">-H</span> <span class="s2">"Authorization: Bearer </span><span class="sb">`</span>fly auth token<span class="sb">`</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-H</span> <span class="s2">"Accept: application/json"</span> <span class="se">\</span>
<span class="nt">-H</span> <span class="s2">"Content-Type: application/json"</span> <span class="se">\</span>
https://api.machines.dev/v1/apps/<span class="nv">$APP_NAME</span>/machines <span class="se">\</span>
<span class="nt">-d</span> <span class="s1">'{
"region": "ord",
"config": {
"env": {
"ASR_ENGINE": "faster_whisper",
"ASR_MODEL": "large",
"FLY_PROCESS_GROUP": "app",
"PRIMARY_REGION": "ord"
},
"mounts": [
{
"path": "/root/.cache/whisper",
"volume": "&lt;VOLUME_ID&gt;",
"name": "data"
}
],
"services": [
{
"protocol": "tcp",
"internal_port": 9000,
"autostop": false,
"ports": [
{
"port": 80,
"handlers": [
"http"
],
"force_https": true
},
{
"port": 443,
"handlers": [
"http",
"tls"
]
}
]
}
],
"image": "onerahmet/openai-whisper-asr-webservice:latest-gpu",
"guest": {
"cpus": 8,
"memory_mb": 16384,
"cpu_kind": "performance",
"gpu_kind": "a100-pcie-40gb"
}
}
}'</span>
</code></pre>
</div>
</div>
<p>After that we can assign the app some IPs. You can use <code>flyctl</code> for this, or the <a href='https://api.fly.io/graphql' title=''>GraphQL API</a>. You can once again use debug mode with <code>flyctl</code> to see what API calls it makes. Side note: Eventually the Machines REST API will include the ability to allocate IP addresses.</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-af5fq7lz">fly ips allocate-v4 <span class="nt">--shared</span> <span class="nt">-a</span> <span class="nv">$APP_NAME</span>
fly ips allocate-v6 <span class="nt">-a</span> <span class="nv">$APP_NAME</span>
</code></pre>
</div>
</div>
<p>If you’re doing this type of work for your business, you may want to keep these Machines inside a private network anyway, in which case you won’t be assigning them IP addresses.</p>
<h2 id='making-your-own-images' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#making-your-own-images' aria-label='Anchor'></a>Making Your Own Images</h2>
<p>There is, luckily (for me, a hardware ignoramus) less dark magic to making GPU-friendly Docker images than you might think. Basically you need to just install the correct Nvidia drivers.</p>
<p>A way to cheat at this is to run <a href='https://github.com/NVIDIA/nvidia-container-toolkit/tree/main' title=''>Nvidia CUDA base images</a>, but if you’re made of sterner stuff, you can also start with a base Ubuntu image and install everything yourself.</p>
<p>While the Whisper webservice image is based on <code>nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04</code>, I got Whisper (plain, not the webservice) working with <code>ubuntu:22.04</code>:</p>
<div class="highlight-wrapper group relative dockerfile">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-wydc3xh4"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-wydc3xh4"><span class="c"># Base image</span>
<span class="k">FROM</span><span class="s"> ubuntu:22.04</span>
<span class="k">RUN </span>apt update <span class="nt">-q</span> <span class="o">&&</span> apt <span class="nb">install</span> <span class="nt">-y</span> ca-certificates wget <span class="se">\
</span> <span class="o">&&</span> wget <span class="nt">-qO</span> /cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb <span class="se">\
</span> <span class="o">&&</span> dpkg <span class="nt">-i</span> /cuda-keyring.deb <span class="o">&&</span> apt update <span class="nt">-q</span> <span class="se">\
</span> <span class="o">&&</span> apt <span class="nb">install</span> <span class="nt">-y</span> <span class="nt">--no-install-recommends</span> ffmpeg libcudnn8 libcublas-12-2 <span class="se">\
</span> git python3 python3-pip
<span class="k">WORKDIR</span><span class="s"> /app</span>
<span class="k">COPY</span><span class="s"> audio.mp3 /app/audio.mp3</span>
<span class="k">COPY</span><span class="s"> run.py /app/run.py</span>
<span class="k">CMD</span><span class="s"> ["python3", "run.py"]</span>
</code></pre>
</div>
</div>
<p>You can find a full, <a href='https://github.com/fly-apps/whisper-example' title=''>working version of this here</a>.</p>
<h2 id='this-time-its-different-i-guess' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#this-time-its-different-i-guess' aria-label='Anchor'></a>This time it’s different, I guess</h2>
<p>AI feels a bit different than previous trends in that it has immediately-obvious benefits. No one needs to throw around catchy phrases with a wink-wink nudge-nudge (“we like the art”) for us to find value.</p>
<p>Since AI workloads run most efficiently on GPUs, GPUs remain a hot commodity. For those of us who didn’t purchase enough $NVDA to retire, we can bring more value to our businesses by adding in AI.</p>
<p>Fly Machines have always been a great little piece of tech to run “ephemeral compute workloads” (wait, do I work at AWS!?) - and this is what I like about GPU Machines. You can mix and match all sorts of AI stuff together to make a chain of useful tools!</p>
/blog/transcribing-on-fly-gpu-machines/assets/whispering-thumb.webpSkip the API, Ship Your Databasehttps://fly.io/blog/skip-the-api/2023-09-13T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>With Fly.io, <a href="https://fly.io/docs/speedrun/" title="">you can get your app running globally in a matter of minutes</a>, and with LiteFS, you can run SQLite alongside your app! Now we’re introducing LiteFS Cloud: managed backups and point-in-time restores for LiteFS. <a href="https://fly.io/docs/litefs/speedrun/" title="">Try it out for yourself</a>!</p>
</div>
<p>My favorite part about building tools is discovering their unintended uses. It’s like starting to write a murder mystery book but you have no idea who the killer is!</p>
<p>History is filled with examples of these accidental discoveries: WD-40 was originally <a href='https://en.wikipedia.org/wiki/WD-40#History' title=''>used to protect ICBMs from rust</a> and now it fixes your squeaky doorknob. Bubble wrap was <a href='https://en.wikipedia.org/wiki/Bubble_Wrap_(brand)#History' title=''>originally sold as wallpaper</a> and now it protects your Amazon packages.</p>
<p>When we started writing <a href='https://fly.io/docs/litefs/' title=''>LiteFS</a>, a distributed SQLite database, we thought it would be used to distribute data geographically so users in, say, Bucharest see response times as fast as users in San Jose. And for the most part, that’s what LiteFS users are doing.</p>
<p>But we discovered another unexpected use: replacing the API layer between services with SQLite databases.</p>
<h2 id='how-it-started' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-it-started' aria-label='Anchor'></a>How it started</h2>
<p>In the early days of LiteFS development, we wanted to find a real-world test bed for our tool so we could smoke out any bugs that we didn’t find during automated tests. Part of our existing infrastructure is a program called <em>Corrosion</em> that gossips state between all our servers. Corrosion tracks VM statuses, health checks, and a plethora of other information for each server and communicates this info with other servers so they can make intelligent decisions about request routing and VM placement. Corrosion keeps a fast, local copy of all this data in a SQLite database.</p>
<p>So we set up a Corrosion instance that also ran on top of LiteFS. This helped root out some bugs but we also found another use for it: making Corrosion accessible to our internal services.</p>
<p><img src="/images/corrosion.png" /></p>
<h2 id='shipping-the-kitchen-sink' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#shipping-the-kitchen-sink' aria-label='Anchor'></a>Shipping the kitchen sink</h2>
<p>The typical approach to making data available between services is to spend weeks designing an API and then building a service around it. Your API design needs to take into account the different use cases of each consuming service so that it can deliver the data it needs efficiently. You don’t want your clients making a dozen API calls for every request!</p>
<p><img src="/images/architecture.png" /></p>
<p>A different approach is to skip the API design entirely and just ship the entire database to your client. You don’t need to consider the consuming service’s access patterns as they can use vanilla SQL to query and join whatever data their heart desires. That’s what we did using LiteFS.</p>
<p>While we could have set up each downstream service as a Corrosion node, gossip protocols can be chatty and we really just needed a one-way stream of updates. Setting up a read-only LiteFS instance for a new service is simple—it just needs the hostname of the upstream primary node to connect to:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-7jrnha7l"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-7jrnha7l">lease:
  type: "static"
  candidate: false
  advertise-url: "http://corrosion-bridge:20202"
</code></pre>
</div>
</div>
<p>And voila! You have a full, read-only copy of the database on your app.</p>
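<p>To make "query whatever your heart desires" concrete, here is a small sketch of a consuming service reading a replicated SQLite file. The table names (<code>machines</code>, <code>health_checks</code>) and the demo data are invented for illustration and are not Corrosion's real schema; <code>make_demo_db</code> just stands in for the file LiteFS would keep up to date.</p>

```python
import os
import sqlite3
import tempfile


def open_read_only(path):
    """Open a SQLite file read-only, as a consumer of a replica would."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def unhealthy_machines(conn):
    """Join two (hypothetical) tables: machine name and failing check count."""
    return conn.execute(
        """
        SELECT m.name, COUNT(*) AS failing
        FROM machines AS m
        JOIN health_checks AS h ON h.machine_id = m.id
        WHERE h.status = 'failing'
        GROUP BY m.name
        ORDER BY m.name
        """
    ).fetchall()


def make_demo_db(path):
    """Stand-in for the replicated file; a real replica is written by LiteFS."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE machines (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE health_checks (machine_id INTEGER, status TEXT);
        INSERT INTO machines VALUES (1, 'web-1'), (2, 'web-2');
        INSERT INTO health_checks VALUES
            (1, 'passing'), (2, 'failing'), (2, 'failing');
        """
    )
    conn.commit()
    conn.close()


def demo():
    """Build the demo file, open it read-only, and run the join."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "state.db")
        make_demo_db(path)
        conn = open_read_only(path)
        try:
            return unhealthy_machines(conn)
        finally:
            conn.close()
```

<p>Opening the file with <code>mode=ro</code> means a stray <code>INSERT</code> in the consumer fails loudly instead of quietly diverging from the primary.</p>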
<h2 id='moving-compute-to-the-client' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#moving-compute-to-the-client' aria-label='Anchor'></a>Moving compute to the client</h2>
<p>API design is notoriously difficult as it’s hard to know what your consuming services will need. Query languages such as <a href='https://graphql.org/' title=''>GraphQL</a> have even been invented for this specific problem!</p>
<p>However, GraphQL has its own limitations. It’s good for fetching raw data but it lacks built-in <a href='https://www.sqlite.org/lang_aggfunc.html' title=''>aggregation</a> & advanced querying capabilities like <a href='https://www.sqlite.org/windowfunctions.html' title=''>windowing</a>. GraphQL is typically layered on top of an existing relational database that uses SQL. So why not just use SQL?</p>
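<p>As a rough illustration of what plain SQL gives you for free, the sketch below runs an aggregate and a window function against an in-memory SQLite database. The <code>requests</code> table and its rows are made up, and the window query assumes your SQLite build is version 3.25 or newer.</p>

```python
import sqlite3


def make_demo_conn():
    """In-memory database with a hypothetical table of request latencies."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE requests (region TEXT, ts INTEGER, latency_ms INTEGER);
        INSERT INTO requests VALUES
            ('gru', 1, 100), ('gru', 2, 200), ('gru', 3, 300),
            ('mad', 1, 50), ('mad', 2, 70);
        """
    )
    return conn


def per_region_summary(conn):
    """Aggregation: request count and worst latency per region."""
    return conn.execute(
        """
        SELECT region, COUNT(*), MAX(latency_ms)
        FROM requests
        GROUP BY region
        ORDER BY region
        """
    ).fetchall()


def moving_average(conn):
    """Windowing: average of the current and previous request, per region."""
    return conn.execute(
        """
        SELECT region, ts,
               AVG(latency_ms) OVER (
                   PARTITION BY region
                   ORDER BY ts
                   ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
               ) AS moving_avg
        FROM requests
        ORDER BY region, ts
        """
    ).fetchall()
```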
<p>Additionally, performing queries on your service means that you need to handle multiple tenants competing for compute resources. Managing these tenants involves rate limiting and query timeouts so that no one client consumes all the resources.</p>
<p>By pushing a read-only copy of the database to clients, these restrictions aren’t a concern anymore. A tenant can use 100% of its CPU for hours if it wants to. It won’t adversely affect any other tenant because the query is running on its own hardware.</p>
<h2 id='so-whats-the-downside' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#so-whats-the-downside' aria-label='Anchor'></a>So what’s the downside?</h2>
<p>There are always trade-offs with any technology, and shipping read-only replicas is no different. One obvious limitation of read-only replicas is that they’re read-only. If your clients need to update data, they’ll still need an API for those mutations.</p>
<p>A less obvious downside is that the contract for a database can be less strict than an API. One benefit to an API layer is that you can change the underlying database structure but still massage data to look the same to clients. When you’re shipping the raw database, that becomes more difficult. Fortunately, many database changes, such as adding columns to a table, are backwards compatible so clients don’t need to change their code. Database views are also a great way to reshape data so it stays consistent—even when the underlying tables change.</p>
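<p>Here is one way the view trick could look, as a sketch with invented table and column names. The underlying table has been renamed and reshaped (<code>machine_v2</code>), while a view keeps presenting the old <code>machines</code> shape that client queries were written against.</p>

```python
import sqlite3


def create_v2_schema(conn):
    """Create the new (hypothetical) table plus a view with the old shape."""
    conn.executescript(
        """
        CREATE TABLE machine_v2 (
            id INTEGER PRIMARY KEY,
            hostname TEXT,
            region_code TEXT
        );
        INSERT INTO machine_v2 VALUES (1, 'web-1', 'ord'), (2, 'web-2', 'gru');

        -- Compatibility view: old table name, old column names.
        CREATE VIEW machines AS
            SELECT id, hostname AS name, region_code AS region
            FROM machine_v2;
        """
    )


def machines_in_region(conn, region):
    """Client query written against the old shape; it needs no changes."""
    return conn.execute(
        "SELECT name FROM machines WHERE region = ? ORDER BY name", (region,)
    ).fetchall()
```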
<p>Finally, shipping a database limits your ability to restrict access to data. If you have a multi-tenant database, you can’t ship that database without the client seeing all the data. One workaround for this is to use a database per tenant. SQLite databases are lightweight since they are just files on disk. This also has the added benefit of preventing queries in your application from accidentally fetching data across tenants.</p>
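<p>A database-per-tenant export can be quite small. The sketch below assumes a shared <code>invoices</code> table with a <code>tenant_id</code> column (both hypothetical) and writes one SQLite file per tenant, each containing only that tenant's rows.</p>

```python
import os
import sqlite3
import tempfile


def split_by_tenant(source, out_dir):
    """Write one SQLite file per tenant_id found in the `invoices` table.

    Returns a dict mapping tenant id to the path of that tenant's file.
    """
    paths = {}
    tenants = [
        row[0]
        for row in source.execute(
            "SELECT DISTINCT tenant_id FROM invoices ORDER BY tenant_id"
        )
    ]
    for tenant in tenants:
        path = os.path.join(out_dir, f"tenant-{tenant}.db")
        dest = sqlite3.connect(path)
        # The per-tenant copy drops tenant_id: every row belongs to one tenant.
        dest.execute(
            "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount_cents INTEGER)"
        )
        rows = source.execute(
            "SELECT id, amount_cents FROM invoices WHERE tenant_id = ?", (tenant,)
        ).fetchall()
        dest.executemany("INSERT INTO invoices VALUES (?, ?)", rows)
        dest.commit()
        dest.close()
        paths[tenant] = path
    return paths


def demo():
    """Split a tiny shared table and return the row count in each tenant file."""
    source = sqlite3.connect(":memory:")
    source.executescript(
        """
        CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id TEXT, amount_cents INTEGER);
        INSERT INTO invoices VALUES
            (1, 'acme', 500), (2, 'globex', 900), (3, 'acme', 250);
        """
    )
    counts = {}
    with tempfile.TemporaryDirectory() as tmp:
        for tenant, path in split_by_tenant(source, tmp).items():
            conn = sqlite3.connect(path)
            counts[tenant] = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
            conn.close()
    return counts
```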
<h2 id='where-do-we-take-this-next' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#where-do-we-take-this-next' aria-label='Anchor'></a>Where do we take this next?</h2>
<p>While this approach has worked well for some internal tooling, how does this look in the broader world of software? APIs are likely to stick around for the foreseeable future, so providing read-only database replicas makes sense for specific use cases where those APIs aren’t a great fit.</p>
<p>Imagine being able to query all your Stripe data or your GitHub data from a local database. You could join that data on to your own dataset and perform fast queries on your own hardware.</p>
<p>While companies such as Stripe or GitHub likely colocate their tenant data into one database, many companies run an event bus using tools like Kafka which could allow them to generate per-tenant SQLite databases to then stream to customers.</p>
<p>Pushing queries out to the end user has huge benefits for both the data provider & the data consumer in terms of flexibility and power.</p>
/blog/skip-the-api/assets/skip-the-api-thumb.webpAutomated Sentry Error Trackinghttps://fly.io/blog/sentry-partnership/2023-09-12T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>We’re Fly.io. We put your code into lightweight microVMs on our own hardware <a href="https://fly.io/docs/reference/regions/" title="">around the world</a>, close to your users. We partnered with <a href="https://sentry.io" title="">Sentry</a> to bring error and performance monitoring to your apps. Deploy your first app, and automatically get a year’s worth of credits for Sentry’s <a href="https://sentry.io/pricing/" title="">Team Plan</a>. <a href="https://fly.io/docs/speedrun/" title="">Check us out</a>: your app can be deployed and instrumented in minutes.</p>
</div>
<p>We’ve been using Sentry since the dawn of the internet. Or at least as far back as the <a href='https://home.cern/science/physics/higgs-boson/how' title=''>discovery</a> of the Higgs boson. Project to project, the familiar Sentry issue detail screen has been our faithful debugging companion.</p>
<p>Today is no exception: all of our Golang, Elixir, Ruby and Rust services report dutifully to Sentry.</p>
<p>So, it felt natural to integrate Sentry as the default error monitoring tool. All new deployments on Fly.io get a Sentry project provisioned automatically. Existing apps can grab theirs with <code>flyctl ext sentry create</code>.</p>
<p>Each Fly.io organization receives, for one year, a generous monthly quota:</p>
<ul>
<li>50,000 Error events
</li><li>100,000 Performance units
</li><li>500 Session Replays
</li><li>1GB of storage for Attachments
</li></ul>
<p>Once your app is instrumented, you’ll automatically get notified of errors, latency issues, and crashes as soon as they occur in production. Sentry’s Team plan also gives you access to over 40 integrations, unlimited seats, and custom alerting.</p>
<h2 id='auto-instrumenting-rails' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#auto-instrumenting-rails' aria-label='Anchor'></a>Auto-instrumenting Rails</h2>
<p>To see Sentry in action, let’s launch our <a href='https://github.com/fly-apps/boomer' title=''>Boomer Rails App</a>. Yes kids, Rails is old school, and it’s the easiest framework to auto-instrument.</p>
<p>When <code>flyctl launch</code> detects a Rails app, the app is automatically set up to use a freshly minted Sentry project. Gems are installed, initializers planted, and finally, the <code>SENTRY_DSN</code> secret is set for deployment. We redacted some output for brevity.</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-bttnqmxb"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-bttnqmxb">fly deploy
</code></pre>
</div>
</div><div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-rm9smiak"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-rm9smiak">==> Verifying app config
...
Your Sentry project is ready. See details and next steps with: flyctl apps errors
Setting the following secrets on boomerang:
SENTRY_DSN
...
Visit your newly deployed app at https://boomerang.fly.dev/
</code></pre>
</div>
</div>
<p>Now, having Sentry configured at launch time means that deployment errors are captured early. This is useful for situations where apps fail to boot, run out of memory, and so on.</p>
<p>Now let’s force an application exception. We visit the app root, which goes Boom, thanks to some hastily written Ruby code.</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-trtinz16"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-trtinz16">flyctl open
</code></pre>
</div>
</div>
<p><img src="/images/boom-cover.webp?card&center" /></p>
<p>Oh shucks. Something went wrong. But, I got an email about this error.</p>
<p><img src="/images/email-cover.webp?card&center" /></p>
<p>We could click “View on Sentry”. Instead, let’s use <code>flyctl</code> to send us to the Sentry issues dashboard.</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-u185qmup"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-u185qmup">flyctl apps errors
</code></pre>
</div>
</div>
<p>We click through to this specific issue.</p>
<p><img src="/images/dash.webp?card&center" /></p>
<p>We successfully debugged our issue. The takeaway: don’t raise when you can call.</p>
<p>Error tracking on Sentry is just scratching the surface. Check out their <a href='https://docs.sentry.io/product/performance/' title=''>performance monitoring</a>, <a href='https://docs.sentry.io/product/session-replay' title=''>session replay</a>, <a href='https://docs.sentry.io/product/alerts/' title=''>alerting</a> and <a href='https://docs.sentry.io/product/' title=''>much more</a>.</p>
<h2 id='next-steps-for-fly-io-and-sentry' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#next-steps-for-fly-io-and-sentry' aria-label='Anchor'></a>Next Steps for Fly.io and Sentry</h2>
<p>For our next trick, we’ll be tracking Fly.io releases in Sentry, so Sentry can link issues to their <a href='https://docs.sentry.io/product/releases/' title=''>release tracking</a> feature.
We’ll also send events like <a href='https://fly.io/docs/getting-started/troubleshooting/#out-of-memory-oom-or-high-cpu-usage' title=''>out-of-memory errors</a> to Sentry. The possibilities are endless.</p>
<p>Got ideas or comments? Get in touch on our <a href='https://community.fly.io/' title=''>community forum</a>.</p>
/blog/sentry-partnership/assets/sentry-thumb.webpTracking Application-Level Consistency with LiteFShttps://fly.io/blog/tracking-consistency-with-litefs/2023-08-30T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>With Fly.io, <a href="https://fly.io/docs/speedrun/" title="">you can get your app running globally in a matter of minutes</a>, and with LiteFS, you can run SQLite alongside your app! Now we’re introducing LiteFS Cloud: managed backups and point-in-time restores for LiteFS. <a href="https://fly.io/docs/litefs/speedrun/" title="">Try it out for yourself</a>!</p>
</div>
<p>When we started the <a href='https://fly.io/docs/litefs/' title=''>LiteFS</a> project a year ago, we started more with an ideal in mind than with a specific implementation. We wanted not only to make it possible to run distributed SQLite but also to make it… <em>gasp</em>… easy!</p>
<p>There were hurdles that we expected to be hard, such as intercepting SQLite transaction boundaries via syscalls or shipping logs around the world while ensuring data integrity. But there was one hurdle that was unexpectedly hard: maintaining a consistent view from the application’s perspective.</p>
<p>LiteFS requires write transactions to be performed only at the primary node, and those transactions are then shipped to replicas instantaneously. Well, almost instantaneously. And therein lies the crux of our problem.</p>
<p>Let’s say your user sends a write request to the primary node in Madrid and the user’s next read request goes to a local read-only replica in Rio de Janeiro. Most of the time LiteFS completes replication quickly and everything is fine. But if the read request arrives a few milliseconds before the data is replicated, then your user sees the database state from before the write occurred. That’s no good.</p>
<p>How exactly do we handle that when our database lives outside the user’s application?</p>
<h2 id='our-initial-series-of-failures-or-how-we-tried-to-teach-distributed-systems-to-users' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#our-initial-series-of-failures-or-how-we-tried-to-teach-distributed-systems-to-users' aria-label='Anchor'></a>Our initial series of failures, or how we tried to teach distributed systems to users</h2>
<p>Our first plan was to let LiteFS users manage consistency themselves. Every application may have different needs and, honestly, we didn’t have a better plan at the time. However, once we started explaining how to track replication state, it became obvious that it was going to be an untenable approach. Let’s start with a primer and you’ll understand why.</p>
<p>Every node in LiteFS maintains a <em>replication position</em> for each database which consists of two values:</p>
<ul>
<li>Transaction ID (TXID): An identifier that monotonically increases with every successful write transaction.
</li><li>Post-Apply Checksum: A checksum of the entire database after the transaction has been written to disk.
</li></ul>
<p>You can read the current position from the <code>-pos</code> file in your LiteFS mount:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-whs7hk7d"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-whs7hk7d">$ cat /litefs/my.db-pos
000000000042478b/8b73bc1d07d84988
</code></pre>
</div>
</div>
<p>This example shows that we are at TXID <code>0x42478b</code> (or 4,343,691 in decimal) and the checksum of our whole database after the transaction is <code>8b73bc1d07d84988</code>. A replica can detect how far it’s lagging behind by comparing its position to the primary’s position. Typically, a monotonic transaction ID alone doesn’t work in asynchronous replication systems like LiteFS, but coupling it with a checksum lets us check for divergence, so the pair works surprisingly well.</p>
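<p>To make the position format concrete, here is a minimal sketch of reading it from application code. It assumes the file always has the shape shown above (16 hex digits, a slash, 16 hex digits). The helper names <code>parsePosition</code>, <code>lagBehind</code>, and <code>hasDiverged</code> are hypothetical and not part of LiteFS.</p>

```javascript
// Parse a LiteFS position string of the form "<txid-hex>/<checksum-hex>".
// Hypothetical helper: assumes the 16/16 hex layout shown in the example.
function parsePosition(text) {
  const match = /^([0-9a-f]{16})\/([0-9a-f]{16})$/i.exec(text.trim());
  if (!match) throw new Error(`unexpected position format: ${text}`);
  // TXIDs are 64-bit values, so use BigInt rather than Number.
  return { txid: BigInt("0x" + match[1]), checksum: match[2].toLowerCase() };
}

// Number of transactions the replica is behind the primary (0n if caught up).
function lagBehind(primary, replica) {
  return primary.txid > replica.txid ? primary.txid - replica.txid : 0n;
}

// Same TXID but a different checksum means the two databases have diverged.
function hasDiverged(a, b) {
  return a.txid === b.txid && a.checksum !== b.checksum;
}
```

<p>In a real application you would pass the contents of the <code>-pos</code> file to <code>parsePosition</code>. Comparing only TXIDs tells you about lag, while the checksum comparison is what catches divergence.</p>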
<p>LiteFS handles the replication position internally. However, it would be up to the application to check it to ensure that its clients saw a consistent view. This meant that the application would have needed its clients to track the TXID from their last write to the primary, and then the application would have to wait until its local replica caught up to that position before it could serve the request.</p>
<p>That would have been a lot to manage. While you may find the nuts and bolts of replication interesting, sometimes you just want to get your app up and running!</p>
<h2 id='lets-use-a-library-er-libraries' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#lets-use-a-library-er-libraries' aria-label='Anchor'></a>Let’s use a library! Er, libraries.</h2>
<p>Teaching distributed systems to each and every LiteFS user was not going to work. So instead, we thought we could tuck that complexity away by providing a LiteFS client library. Just import a package and you’re done!</p>
<p>Libraries are a great way to abstract away the tough parts of a system. For example, nobody wants to roll their own cryptography implementation so they use a library. But LiteFS is a database so it needs to work across all languages which means we needed to implement a library for each language.</p>
<p>Actually, it’s worse than that. We need to act as a traffic cop to redirect incoming client requests to make sure they arrive at the primary node for writes or that they see a consistent view on a replica for reads. We aren’t able to redirect writes at the data layer so it’s typically handled at the HTTP layer. Within each language ecosystem there can be a variety of web server implementations: Ruby has Rails & Sinatra, Go has net/http, gin, fasthttp, and whatever 12 new routers came out this week.</p>
<h2 id='moving-up-the-abstraction-stack' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#moving-up-the-abstraction-stack' aria-label='Anchor'></a>Moving up the abstraction stack</h2>
<p>Abstraction often feels like a footgun. Generalizing functionality across multiple situations means that you lose flexibility in specific situations. Sometimes that means you shouldn’t abstract but sometimes you just haven’t found the right abstraction layer yet.</p>
<p>For better or for worse, HTTP & REST-like applications have become the norm in our industry and some of the conventions provide a great layer for LiteFS to build upon. Specifically, the convention of using <code>GET</code> requests for reading data and the other methods (<code>POST</code>, <code>PUT</code>, <code>DELETE</code>, etc) for writing data.</p>
<p>Instead of developers injecting a LiteFS library into their application, we built a thin HTTP proxy that lives in front of the application.</p>
<p><img alt="Wrapping the application with a proxy & FUSE mount." src="https://slabstatic.com/prod/uploads/p1b436gf/posts/images/25yuWQlLKyLrkHBDFVcbU8to.png" /></p>
<p>This approach has let us manage both the incoming client side via HTTP as well as the backend data plane via our FUSE mount. It lets us isolate the application developer from the low-level details of LiteFS replication while making it feel like they’re developing against vanilla SQLite.</p>
<h2 id='how-it-works' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-it-works' aria-label='Anchor'></a>How it works</h2>
<p>The LiteFS proxy design is simple but effective. As an example, let’s start with a write request. A user creates a new order so they send a <code>POST /orders</code> request to your web app. The LiteFS proxy intercepts the request & parses the HTTP headers to see that it’s a <code>POST</code> write request. If the local node is a replica, the proxy forwards the request to the primary node.</p>
<p>If the local node is the primary, it’ll pass the request through to the application’s web server and the request will be processed normally. When the response begins streaming out to the client, the proxy will attach a cookie with the TXID of the newly-written commit.</p>
<p>When the client then sends a <code>GET</code> read request, the LiteFS proxy again intercepts it and parses the headers. It can see the TXID that was set in the cookie on the previous write and the proxy will check it against the replication position of the local replica. If replication has caught up to the client’s last write transaction, it’ll pass through the request to the application. Otherwise, it’ll wait for the local node to catch up or it will eventually time out. The proxy is built into the <code>litefs</code> binary so communication with the internal replication state is wicked fast.</p>
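<p>The write and read paths described above boil down to one small decision per request. Here is a minimal sketch of that decision, not the actual LiteFS proxy code. The function <code>routeRequest</code>, its parameter names, and the action strings it returns are all hypothetical. It assumes TXIDs are BigInts and that <code>cookieTxid</code> is <code>null</code> when the client has no cookie.</p>

```javascript
// Decide what a proxy in front of the app should do with one request.
// Illustrative only: names and return values are assumptions.
function routeRequest({ method, isPrimary, cookieTxid, localTxid }) {
  // By convention, GET reads data and every other method writes data.
  const isWrite = method.toUpperCase() !== "GET";
  if (isWrite) {
    // Writes must be processed on the primary node.
    return isPrimary ? "pass-through" : "forward-to-primary";
  }
  // A client with no cookie has no earlier write to wait for.
  if (cookieTxid === null) return "pass-through";
  // Serve the read only once the local node has applied the client's write.
  return localTxid >= cookieTxid ? "pass-through" : "wait-for-replication";
}
```

<p>A caller that receives <code>"wait-for-replication"</code> would hold the request until the local TXID reaches the cookie’s TXID, or give up after a timeout.</p>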
<h2 id='preventing-laggards' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#preventing-laggards' aria-label='Anchor'></a>Preventing laggards</h2>
<p>The proxy provides another benefit: health checks. Networks and servers don’t always play nice when they’re communicating across the world and sometimes they get disconnected. The proxy hooks into the LiteFS built-in heartbeat system to detect lag and it can report the node as unhealthy via a health check URL when this lag exceeds a threshold.</p>
<p>If you’re running on Fly.io, we’ll take that node out of rotation when health checks begin reporting issues so users will automatically get routed to a different, healthy replica. When the replica reconnects to the primary, the health check will report as healthy and the node will rejoin.</p>
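<p>As a rough illustration of a lag-based health check, the sketch below maps the time since the last heartbeat to an HTTP status code. This is an assumption about how such a check could work rather than a description of LiteFS internals. The name <code>healthStatus</code>, its parameters, and the choice of measuring lag in milliseconds are hypothetical.</p>

```javascript
// Map replication lag to an HTTP status for a health check endpoint.
// Hypothetical sketch: lag is measured as time since the last heartbeat.
function healthStatus(lastHeartbeatAtMs, nowMs, maxLagMs) {
  // Clamp at zero so small clock skew never yields a negative lag.
  const lagMs = Math.max(0, nowMs - lastHeartbeatAtMs);
  // 503 tells the load balancer to stop routing to this node.
  return lagMs > maxLagMs ? 503 : 200;
}
```

<p>Once heartbeats resume, the same function starts returning <code>200</code> again, which matches the node rejoining the rotation.</p>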
<h2 id='the-tradeoffs-theres-always-tradeoffs' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-tradeoffs-theres-always-tradeoffs' aria-label='Anchor'></a>The Tradeoffs… there’s always tradeoffs!</h2>
<p>Despite how well the LiteFS proxy works in most situations, there’s gonna be times when it doesn’t quite fit. For example, if your application cannot rely on cookies to track application state then the proxy won’t work for you.</p>
<p>There are also frameworks, like <a href='https://www.phoenixframework.org/' title=''>Phoenix</a>, which can rely heavily on websockets for live updates so this circumvents your traditional HTTP request/response approach that LiteFS proxy depends on. Finally, the proxy provides <a href='https://jepsen.io/consistency/models/read-your-writes' title=''>read-your-writes</a> guarantees which may not work for every application out there.</p>
<p>In these cases, <a href='https://github.com/superfly/litefs/issues/new' title=''>let us know how we can improve the proxy</a> to make it work for more use cases! We’d love to hear your thoughts.</p>
<h2 id='diving-in-further' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#diving-in-further' aria-label='Anchor'></a>Diving in further</h2>
<p>The LiteFS proxy makes it easy to run SQLite applications in multiple regions around the world. You can even run many legacy applications with little to no change in the code.</p>
<p>If you’re interested in setting up LiteFS, check out our <a href='https://fly.io/docs/litefs/getting-started-fly/' title=''>Getting Started</a> guide. You can find additional details about configuring the proxy on our <a href='https://fly.io/docs/litefs/proxy/' title=''>Built-in HTTP Proxy</a> docs page.</p>
/blog/tracking-consistency-with-litefs/assets/tracking-consistency-thumb.webpMultiple Logs for Resiliencyhttps://fly.io/blog/redundant-logs/2023-07-21T00:00:00+00:002023-11-21T21:08:37+00:00<p>You’ve done everything right. You are well aware of
<a href='https://en.wikipedia.org/wiki/Murphy%27s_law' title=''>Murphy’s Law</a>.
You have multiple redundant machines. You’ve set up
a regular backup schedule for your database, perhaps you are even using
<a href='https://fly.io/blog/litefs-cloud/' title=''>LiteFS Cloud</a>. You
<a href='https://fly.io/blog/shipping-logs/' title=''>ship your logs</a> to
<a href='https://logtail.com/' title=''>LogTail</a> or perhaps some other
<a href='https://github.com/superfly/fly-log-shipper#provider-configuration' title=''>provider</a>
so you can do forensic analysis should anything go wrong…</p>
<p>Then the unexpected happens. A major network outage causes your application to
misbehave. What’s worse is that your logs are missing crucial data from this
point, perhaps because of the same network outage. Maybe this time you are
lucky and you can find the data you need by using copies of your logs via
<a href='https://fly.io/docs/flyctl/logs/' title=''>flyctl logs</a> or the monitoring tab on the
<a href='https://fly.io/docs/flyctl/dashboard/' title=''>flyctl dashboard</a> before they
disappear forever.</p>
<p>So, what is going on here? Let’s look at the steps. Your application writes
logs to STDOUT. Fly.io will take that output and send it to
<a href='https://nats.io/' title=''>NATS</a>. The <a href='https://github.com/superfly/fly-log-shipper' title=''>Log
Shipper</a> will take that data and
hand it to <a href='https://vector.dev/docs/about/what-is-vector/' title=''>Vector</a>. From
there it is shipped to your third party logging provider. That’s a lot of
moving parts.</p>
<p>All that is great, but just like how you have redundant machines in case of
failures, you may want to have redundant logs in addition to the ones Fly.io
and the log shipper provide. Below are two strategies for doing just that.
You can use either or both, and best of all, the logs you create will be in
addition to your existing logs.</p>
<h2 id='logging-to-multiple-places' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#logging-to-multiple-places' aria-label='Anchor'></a>Logging to multiple places</h2>
<p>The following approach is likely the most failsafe, but often the least
convenient: having your primary application on each machine write to a
separate log file in addition to standard out. This does mean that when
you need this data you will have to fetch it from each machine, and it
will likely be rather raw. But at least it will be there even in the face
of network failures.</p>
<p>For best results put these logs on a
<a href='https://fly.io/docs/reference/volumes/' title=''>volume</a> so that they survive
a restart, and be prepared to rotate logs as they grow in size so
that they don’t eventually fill up that volume.</p>
<p>This approach is necessarily framework specific, but most
frameworks provide some ability to do this. A Rails example:</p>
<div class="highlight-wrapper group relative ruby">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-eidxj8qj"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-eidxj8qj"><span class="n">logger</span> <span class="o">=</span> <span class="no">ActiveSupport</span><span class="o">::</span><span class="no">Logger</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="no">STDOUT</span><span class="p">)</span>
<span class="n">logger</span><span class="p">.</span><span class="nf">formatter</span> <span class="o">=</span> <span class="n">config</span><span class="p">.</span><span class="nf">log_formatter</span>
<span class="n">volume_logger</span> <span class="o">=</span> <span class="no">ActiveSupport</span><span class="o">::</span><span class="no">Logger</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"/logs/production.log"</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">logger</span> <span class="o">=</span> <span class="n">logger</span><span class="p">.</span><span class="nf">extend</span> <span class="no">ActiveSupport</span><span class="o">::</span><span class="no">Logger</span><span class="p">.</span><span class="nf">broadcast</span><span class="p">(</span><span class="n">volume_logger</span><span class="p">)</span>
</code></pre>
</div>
</div>
<p>You probably already have the first two lines in your
<code>config/environments/production.rb</code> file. Adjust and add the last
two lines. That’s it! You now have redundant logs.</p>
<p>See the <a href='https://docs.ruby-lang.org/en/master/Logger.html#class-Logger-label-Log+File+Rotation' title=''>Ruby docs for
Logger</a>
for details on how to handle log rotation.</p>
<p>Some pointers for other frameworks:</p>
<ul>
<li><a href='https://dev.to/darnahsan/elixir-logging-to-multiple-files-using-metadatafilter-3896' title=''>Elixir</a>
</li><li><a href='https://laravel.com/docs/10.x/logging' title=''>Laravel</a>
</li><li><a href='https://docs.python.org/3/howto/logging-cookbook.html#multiple-handlers-and-formatters' title=''>Python</a>
</li><li><a href='https://github.com/winstonjs/winston#multiple-transports-of-the-same-type' title=''>Winston</a> for Node applications
</li></ul>
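<p>If your framework has no built-in way to do this, the idea is small enough to sketch by hand. The example below is a dependency-free illustration for Node, not a replacement for the libraries linked above. The name <code>createTeeLogger</code> and the notion of a “sink” (any function that accepts one line of text) are hypothetical.</p>

```javascript
// Build a logger that sends every line to each sink in turn.
// A sink is any function that takes a string, e.g. a stdout or file writer.
function createTeeLogger(sinks) {
  return function log(message) {
    const line = `${new Date().toISOString()} ${message}\n`;
    for (const sink of sinks) {
      try {
        sink(line);
      } catch (err) {
        // A failing sink must not stop the others from getting the line.
      }
    }
    return line;
  };
}
```

<p>In practice the sinks might be <code>line =&gt; process.stdout.write(line)</code> and <code>line =&gt; fs.appendFileSync("/logs/production.log", line)</code>, with the log path on a volume as described above. Swallowing sink errors is a deliberate choice here: a full disk should not take down logging to standard out.</p>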
<h2 id='custom-log-shipper' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#custom-log-shipper' aria-label='Anchor'></a>Custom log shipper</h2>
<p>This approach is less bulletproof but may result in more immediately usable
results. Instead of using Log Shipper, Vector, and a third party, it is easy
to subscribe directly to NATS and process log entries yourself.</p>
<p>What you are going to want is a separate app running on a separate machine so
that it doesn’t go down when there are problems with the machine you are monitoring,
or even during the times when you are deploying a new version. If the
code you write will be writing to disk, you will want a volume.</p>
<p>Also, as with Log Shipper, you will want to set the following secret:</p>
<div class="highlight-wrapper group relative shell">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-rum30pug"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-rum30pug">fly secrets <span class="nb">set </span><span class="nv">ACCESS_TOKEN</span><span class="o">=</span><span class="si">$(</span>fly auth token<span class="si">)</span>
</code></pre>
</div>
</div>
<p>Here’s a minimal JavaScript example that can be run using Node or Bun:</p>
<div class="highlight-wrapper group relative javascript">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-snsl8chv"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-snsl8chv"><span class="k">import</span> <span class="p">{</span> <span class="nx">connect</span><span class="p">,</span> <span class="nx">StringCodec</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">"</span><span class="s2">nats</span><span class="dl">"</span><span class="p">;</span>
<span class="k">import</span> <span class="nx">fs</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">node:fs</span><span class="dl">'</span><span class="p">;</span>
<span class="c1">// tailor these two constants for your needs</span>
<span class="kd">const</span> <span class="nx">LOG_FILE</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">/log/production.log</span><span class="dl">"</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">ORGANIZATION</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">your-organization-name</span><span class="dl">"</span><span class="p">;</span>
<span class="c1">// create a connection to a nats-server</span>
<span class="kd">const</span> <span class="nx">nc</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">connect</span><span class="p">({</span>
<span class="na">servers</span><span class="p">:</span> <span class="dl">"</span><span class="s2">[fdaa::3]:4223</span><span class="dl">"</span><span class="p">,</span>
<span class="na">user</span><span class="p">:</span> <span class="nx">ORGANIZATION</span><span class="p">,</span>
<span class="na">pass</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">ACCESS_TOKEN</span>
<span class="p">});</span>
<span class="c1">// open log file</span>
<span class="kd">const</span> <span class="nx">file</span> <span class="o">=</span> <span class="nx">fs</span><span class="p">.</span><span class="nx">openSync</span><span class="p">(</span><span class="nx">LOG_FILE</span><span class="p">,</span> <span class="dl">'</span><span class="s1">a+</span><span class="dl">'</span><span class="p">);</span>
<span class="c1">// create a codec</span>
<span class="kd">const</span> <span class="nx">sc</span> <span class="o">=</span> <span class="nx">StringCodec</span><span class="p">();</span>
<span class="c1">// create a simple subscriber and iterate over messages</span>
<span class="c1">// matching the subscription</span>
<span class="kd">const</span> <span class="nx">sub</span> <span class="o">=</span> <span class="nx">nc</span><span class="p">.</span><span class="nx">subscribe</span><span class="p">(</span><span class="dl">"</span><span class="s2">logs.></span><span class="dl">"</span><span class="p">);</span>
<span class="k">for</span> <span class="k">await</span> <span class="p">(</span><span class="kd">const</span> <span class="nx">msg</span> <span class="k">of</span> <span class="nx">sub</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">sc</span><span class="p">.</span><span class="nx">decode</span><span class="p">(</span><span class="nx">msg</span><span class="p">.</span><span class="nx">data</span><span class="p">));</span>
<span class="c1">// build log file entry</span>
<span class="kd">const</span> <span class="nx">log</span> <span class="o">=</span> <span class="p">[</span>
<span class="nx">data</span><span class="p">.</span><span class="nx">timestamp</span><span class="p">.</span><span class="nx">padEnd</span><span class="p">(</span><span class="mi">30</span><span class="p">),</span>
<span class="s2">`[</span><span class="p">${</span><span class="nx">data</span><span class="p">.</span><span class="nx">fly</span><span class="p">.</span><span class="nx">app</span><span class="p">.</span><span class="nx">instance</span><span class="p">}</span><span class="s2">]`</span><span class="p">,</span>
<span class="nx">data</span><span class="p">.</span><span class="nx">fly</span><span class="p">.</span><span class="nx">region</span><span class="p">,</span>
<span class="s2">`[</span><span class="p">${</span><span class="nx">data</span><span class="p">.</span><span class="nx">log</span><span class="p">.</span><span class="nx">level</span><span class="p">}</span><span class="s2">]`</span><span class="p">,</span>
<span class="nx">data</span><span class="p">.</span><span class="nx">message</span>
<span class="p">].</span><span class="nx">join</span><span class="p">(</span><span class="dl">'</span><span class="s1"> </span><span class="dl">'</span><span class="p">)</span> <span class="o">+</span> <span class="dl">"</span><span class="se">\n</span><span class="dl">"</span><span class="p">;</span>
<span class="c1">// write entry to disk</span>
<span class="nx">fs</span><span class="p">.</span><span class="nx">write</span><span class="p">(</span><span class="nx">file</span><span class="p">,</span> <span class="nx">log</span><span class="p">,</span> <span class="nx">error</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">error</span><span class="p">)</span> <span class="nx">console</span><span class="p">.</span><span class="nx">error</span><span class="p">(</span><span class="nx">error</span><span class="p">);</span>
<span class="p">});</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
<p>The above is pretty straightforward. It connects to NATS, opens a file,
subscribes to logs, parses each message, and writes out selected data
to disk. This example is in JavaScript, but feel free to reimplement
this basic approach using your favorite language, as NATS supports
<a href='https://docs.nats.io/using-nats/developer' title=''>plenty</a>.</p>
<p>Things to watch out for: you don’t want recursive errors when exceptions
occur during write. You want to capture errors and reconnect to NATS
when the connection goes down. You may even want to filter messages.
A more complete example implementing a number of these features can be found
<a href='https://github.com/rubys/showcase/blob/main/fly/applications/logger/logfiler.ts' title=''>here</a>.</p>
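<p>As a rough illustration of the filtering idea, here is a minimal sketch that decides which messages to keep before they are written. The severity ranking and the <code>shouldKeep</code> and <code>formatEntry</code> helper names are assumptions made for this example; only the <code>data.fly.region</code>, <code>data.log.level</code>, and <code>data.message</code> fields come from the code above.</p>

```javascript
// Hypothetical severity ranking; adjust it to the levels your apps emit.
const LEVELS = new Map([
  ['debug', 0],
  ['info', 1],
  ['warn', 2],
  ['error', 3],
]);

// Decide whether a parsed log message should be written to disk.
// Messages with a missing or unknown level are kept, so nothing is
// dropped silently.
function shouldKeep(data, minLevel = 'info') {
  const level = data && data.log ? data.log.level : undefined;
  const rank = LEVELS.get(level);
  if (rank === undefined) return true;
  const minimum = LEVELS.get(minLevel) ?? 0;
  return rank >= minimum;
}

// Build a space-separated line in the style of the example above,
// using only region, level, and message.
function formatEntry(data) {
  return [data.fly.region, `[${data.log.level}]`, data.message].join(' ') + '\n';
}
```

<p>In the subscriber loop you would call <code>shouldKeep(data)</code> right after parsing each message and skip the write when it returns <code>false</code>.</p>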
<h2 id='conclusion' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#conclusion' aria-label='Anchor'></a>Conclusion</h2>
<p>Log failures are not common, and perhaps the redundant logs that fly.io already
keeps will be sufficient for your needs. But it may be worth reviewing what
your exposure is and how to mitigate that exposure should your logs fail at the
worst possible time.</p>
<p>Hopefully the approaches listed above give you ideas on how to ensure that
you will always have the log data you need even in the most hostile
environment conditions.</p>
/blog/redundant-logs/assets/lergs-thumb.webpTokenized Tokenshttps://fly.io/blog/tokenized-tokens/2023-07-12T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>We’re Fly.io. We run apps for our users on hardware we host around the world. Building security for a platform like this is tricky, and that’s what the post is about. But you don’t have to read any of this to get an app running on here. See how to <a href="https://fly.io/docs/speedrun/" title="">speedrun getting an app running on Fly.io here</a>.</p>
</div>
<p>We built some little security thingies. We’re open sourcing them, and hoping you like them as much as we do. In a nutshell: it’s a proxy that injects secrets into arbitrary 3rd-party API calls. We could describe it more completely here, but that wouldn’t be as fun as writing a big long essay about how the thingies came to be, so: buckle up.</p>
<p>The problem we confront is as old as Rails itself. Our application started simple: some controllers, some models. The only secrets it stored were bcrypt password hashes. But not unlike a pet baby alligator, it grew up. Now it’s become more unruly than we’d planned.</p>
<p>That’s because frameworks like Rails make it easy to collect secrets: you just create another model for them, <a href='https://guides.rubyonrails.org/active_record_encryption.html' title=''>roll some kind of secret to encrypt them</a>, jam that secret into the deployment environment, and call it a day.</p>
<p>And, at least in less sensitive applications, or even the early days of an app like ours, that can work!</p>
<div class="callout"><p>For what it’s worth, and to the annoyance of some of our Heroku refugees, we’ve never stored customer app secrets this way; our Rails API can write customer secrets, but has never been able to read them. We’ll talk more about how this works in a sec.</p>
</div>
<p>But for us, not anymore. At the stage we’re at, all secrets are hazmat. And Rails itself is the portion of our attack surface we’re least confident about – the rest of it is either outside of our trust boundaries, or written in Rust and Go, strongly-typed memory-safe languages that are easy to reason about, and which have never accidentally treated YAML as an executable file format.</p>
<p>So, a few months back, during an integration with a 3rd party API that relied on OAuth2 tokens, we drew a line: ⚡ <em>henceforth, hazmat shall only be removed from Rails, never added</em> ⚡. This is easier said than done, though: despite prominent “this is not a place of honor” signs all over the codebase, our Rails API is still where much of the action in our system takes place.</p>
<h3 id='how-apps-use-secrets-3-different-approaches' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-apps-use-secrets-3-different-approaches' aria-label='Anchor'></a>How Apps Use Secrets: 3 Different Approaches</h3>
<p><img src="/images/secrets-1.png?2/3&card&center" /></p>
<p>We just gave you one way, probably the most common. Stick ‘em in a model, encrypt them with an environment secret, and watch Dependabot religiously for vulnerabilities in transitively-added libraries you’ve never heard of before.</p>
<p><img src="/images/secrets-2.png?2/3&card&center" /></p>
<p>Here’s a second way, probably the second-most popular: use a secrets management system, like <a href='https://aws.amazon.com/kms/' title=''>KMS</a> or <a href='https://www.hashicorp.com/products/vault' title=''>Vault</a>. These systems, which are great, keep secrets encrypted and allow access based on an intricate access control language, which is great.</p>
<p>That’s what we do for customer app secrets, like <code>DATABASE_URL</code> and <code>API_KEY</code>. We use <a href='https://www.hashicorp.com/products/vault' title=''>HashiCorp Vault</a> (for the time being). Our Rails API has an access token for Vault that allows it to set secrets, but not read any of them back, like a kind of diode. A game-over Rails vulnerability might allow an attacker to scramble secrets, but not to easily dump them.</p>
<p>In the happiest cases with secrets, systems like Vault can keep secret bits from ever touching the application. Customer app secrets are a happy case: Rails never needs to read them, <a href='https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/' title=''>just our orchestrator</a>, to inject them into VM environments. In other happy cases, Vault operates on the app’s behalf: signing a time-limited request URL for AWS, or making a direct request to a known 3rd-party service. Vault calls these features “<a href='https://developer.hashicorp.com/vault/docs/secrets' title=''>secret engines</a>”, and when you can get away with using them, it’s hard to do better.</p>
<p>The catch is, sometimes you can’t get away with them. For most 3rd parties, Vault has no idea how to interact with them. And most secrets are bearer tokens, not request signatures. The only way to use those kinds of secrets is to read them into app memory. If good code can read a secret from Vault, so can a YAML vulnerability.</p>
<div class="callout"><p>Still: this is better than nothing: even if apps can read raw secrets, systems like Vault can provide an audit trail of which secrets were pulled when, and make it much easier to rotate secrets, which you’ll want to do with raw secrets to contain their blast radius. HashiCorp Vault is great, so is KMS, we recommend them unreservedly.</p>
</div>
<p><img src="/images/secrets-3.png?2/3&card&center" /></p>
<p>So that’s why there’s a third way to handle this problem, which is: decompose your application into services so that the parts that have to handle secrets are tiny and well-contained. The bulk of our domain-specific business code can chug along in Rails, and the parts that trade bearer tokens with 3rd parties can be built in a couple hundred lines of Go.</p>
<p>This is a good approach, too. It’s just cumbersome: a big application ends up dealing with lots of different kinds of secrets, and making a trusted microservice for each of them is a drag. What you want is to notice some commonality in how 3rd party API secrets are used, and to come up with some way of exploiting it.</p>
<p>We thought long and hard on this and came up with:</p>
<h3 id='tokenizer-the-fabled-4th-way' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#tokenizer-the-fabled-4th-way' aria-label='Anchor'></a>Tokenizer: The Fabled 4th Way</h3>
<p><img src="/images/secrets-4.png?2/3&card&center" /></p>
<p>We developed a multipurpose secret-using service called the <code>Tokenizer</code>.</p>
<p><code>Tokenizer</code> is a stateless HTTP proxy that holds the private key of a <a href='https://pkg.go.dev/golang.org/x/crypto/nacl/box' title=''>Curve25519 keypair.</a></p>
<p>When we get a new 3rd party API secret, we encrypt it to <code>Tokenizer</code>’s public key; we “tokenize” it. Our API server can handle the (encrypted) tokenized secret, but it can’t read or use it directly. Only <code>Tokenizer</code> can.</p>
<p>When it comes time to talk to the 3rd party API, Rails does so via <code>Tokenizer</code>. Here’s how that works:</p>
<ol>
<li>The API request is proxied, as an ordinary HTTP 1.1 request, through <code>Tokenizer</code>.
</li><li>The request carries one or more additional <code>Proxy-Tokenizer</code> headers.
</li><li>Each <code>Proxy-Tokenizer</code> header carries an encrypted secret and instructions for <code>Tokenizer</code> to rewrite the request in some way, usually by injecting the decrypted plaintext into a header.
</li></ol>
<p>You can think of <code>Tokenizer</code> as a sort of Vault-style “secret engine” that happens to capture virtually everything an app needs secrets for. It can even use decrypted secrets to selectively HMAC parts of requests, for APIs that authenticate with signatures instead of bearer tokens.</p>
<p>Check it out: <a href='https://github.com/superfly/tokenizer' title=''>it’s not super complicated</a>.</p>
<p>Now, our goal is to keep Rails from ever touching secret bits. But, hold on: a game-over Rails vulnerability would give attackers an easy way around <code>Tokenizer</code>: you’d just proxy requests for a particular secret to a service you ran that collected the plaintext.</p>
<p>To mitigate that, we built the obvious feature: you can lock requests for specific secrets down to a list of allowed hosts or host regexp patterns.</p>
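<p>To make the last two ideas concrete, here is a minimal sketch of what the proxy might do once it has decrypted a secret: check the destination host against the secret’s allowlist, then inject the plaintext into a header. This is not the <code>Tokenizer</code> source. The object shapes (<code>token</code>, <code>allowedHosts</code>, <code>allowedHostPattern</code>) and the choice of an <code>Authorization: Bearer</code> header are assumptions for illustration only.</p>

```javascript
// Hypothetical shape of a decrypted secret:
//   { token, allowedHosts?: string[], allowedHostPattern?: string }
// The real Tokenizer's format may differ.

// Return true if the secret may be sent to the given host.
function hostAllowed(host, secret) {
  const allowedHosts = secret.allowedHosts || [];
  const pattern = secret.allowedHostPattern;
  // No restriction configured: any destination is accepted.
  if (allowedHosts.length === 0 && !pattern) return true;
  if (allowedHosts.includes(host)) return true;
  // Anchor the pattern so it must match the whole host name.
  return pattern ? new RegExp(`^(?:${pattern})$`).test(host) : false;
}

// Return a copy of the request with the plaintext injected as a bearer
// token, or throw if the destination is not allowed for this secret.
function rewriteRequest(request, secret) {
  if (!hostAllowed(request.host, secret)) {
    throw new Error(`secret may not be sent to ${request.host}`);
  }
  return {
    ...request,
    headers: { ...request.headers, Authorization: `Bearer ${secret.token}` },
  };
}
```

<p>Anchoring the pattern matters in this sketch: an unanchored regexp for <code>api.example.com</code> would also match a lookalike host that merely contains that string.</p>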
<p>We think this approach to handling secrets is pretty similar to how payment processors tokenize payment card information, hence the name. The advantages are straightforward:</p>
<ul>
<li>Secrets are exposed to a much smaller attack surface that doesn’t include Rails.
</li><li>Virtually every usage of secrets we’re likely to run across is captured by HTTP proxying, without us needing to write per-service code.
</li><li>The tokenizer is a tiny project that’s easy to audit and reason about.
</li><li>Every language we work in already has first-class support for running requests through a proxy (something we already do for <a href='https://github.com/stripe/smokescreen' title=''>SSRF protection</a>.)
</li></ul>
<h3 id='ssokenizer-tokenizing-oauth-sso' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#ssokenizer-tokenizing-oauth-sso' aria-label='Anchor'></a>SSOkenizer: Tokenizing OAuth SSO</h3>
<p>When we created <code>Tokenizer</code>, we were motivated by the problem of OAuth2 tokens that other service providers gave us, for partnership features we build for mutual customers.</p>
<p>We’d also dearly like our customers to use OAuth2/OIDC to log into Fly.io itself; it’s more secure for them, and gives them the full complement of Google MFA features, meaning we don’t immediately have to implement the full complement of Google MFA features. Letting people log into Fly.io with a Google OAuth token means we have to keep track of people’s OAuth tokens. That sounds like a job for the <code>Tokenizer</code>!</p>
<p>But there’s a catch: acquiring those OAuth tokens in the first place means doing the OAuth2 dance, which means that for a brief window of time, Rails is handling hazmat. We’d like to close that window.</p>
<p><img src="/images/ssokenizer.png?2/3&card&center" /></p>
<p>Enter the <code>SSOkenizer</code>.</p>
<p>The job of the <code>SSOkenizer</code> is to perform the OAuth2 dance on behalf of Rails, and then use the output of that process (the OAuth2 bearer token yielded from the OAuth2 code flow, which you can <a href='https://github.com/superfly/ssokenizer#ssokenizer' title=''>see in its cursed majesty here</a>) to drive the <code>Tokenizer</code>.</p>
<p>In other words, where we’d otherwise explicitly encrypt secrets to be tokenized a-priori, the <code>SSOkenizer</code> does that on the fly, passing tokenized OAuth2 credentials back to Rails. Those… tokenized tokens can only be used through the <code>Tokenizer</code> proxy, which is the only component in our system with the private key that unseals them.</p>
<p>We think this is a pretty neat trick. The <code>SSOkenizer</code> itself is tiny, even smaller than the <code>Tokenizer</code> (<a href='https://github.com/superfly/ssokenizer/' title=''>you can read it here</a>), and essentially stateless; in fact, pretty much everything in this system is minimally stateful, except Rails, which is great at being stateful. We even keep almost all of OAuth2 out of Rails and confined to Go code (where it’s practically the hello-world of Go OAuth2 libraries).</p>
<p>A nice side effect-slash-validation of this design: once we got it working for Google, it became a super easy project to get OAuth2 logins working for other providers.</p>
<h3 id='feel-free-to-poach-this' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#feel-free-to-poach-this' aria-label='Anchor'></a>Feel Free To Poach This</h3>
<p>We’re psyched for a bunch of reasons:</p>
<ul>
<li>We’ve got a clear path to rolling out SSO logins.
</li><li>We can do integrations with third-party services now without infecting Rails with more hazmat secrets.
</li><li>We’ve honored the rule of “only removing hazmat from Rails, not adding it”.
</li><li>We’ve also cleared a path to getting all the rest of the hazmat Rails has access to tokenized.
</li></ul>
<p>These are standalone tools with no real dependencies on Fly.io, so they’re easy for us to open source. Which is what we did: if they sound useful to you, check out the <a href='https://github.com/superfly/tokenizer' title=''>tokenizer</a> and <a href='https://github.com/superfly/ssokenizer' title=''>ssokenizer</a> repositories for instructions on deploying and using these services yourself.</p>
/blog/tokenized-tokens/assets/ghosts.pngFly.io ❤️ Bunhttps://fly.io/blog/flydotio-heart-bun/2023-07-11T00:00:00+00:002024-02-05T16:04:11+00:00<p><a href='https://lu.ma/cqk31rvl' title=''>Bun 1.0 comes out September 7th</a>. Fly.io is making preparations.</p>
<p>Previously, we stated that <a href='https://fly.io/blog/flydotio-heart-js/' title=''>Fly.io ❤️ JS</a>, and we understandably started with Node.js. While that work is ongoing, it makes sense to start expanding to other runtimes.</p>
<p>Bun is the obvious next choice given it <a href='https://bun.sh/docs/runtime/nodejs-apis' title=''>aims for complete Node.js API compatibility</a>.</p>
<p>Starting with <a href='https://fly.io/docs/hands-on/install-flyctl/' title=''>flyctl</a> version 0.1.54 and <a href='https://www.npmjs.com/package/@flydotio/dockerfile' title=''>@flydotio/dockerfile</a> version 0.3.3, you can launch and deploy bun applications using <code>fly launch</code> and <code>fly deploy</code>,
provided:</p>
<ul>
<li>You’ve installed bun version 0.5.3 or later
</li><li>You have a <code>package.json</code> that meets at least one of the following conditions:
<ul>
<li>It has a <code>start</code> entry in the <code>scripts</code> section.
</li><li>It has a <code>module</code> entry and specifies <code>module</code> as the <code>type</code>.
</li><li>It has a <code>main</code> entry.
</li></ul>
</li></ul>
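<p>The <code>package.json</code> conditions above can be expressed as a small check. This is a sketch of the rule as stated in the list, not the detection logic that <code>flyctl</code> or <code>@flydotio/dockerfile</code> actually runs; the <code>canLaunch</code> name is made up for the example.</p>

```javascript
// Return true if a parsed package.json meets at least one of the
// conditions listed above.
function canLaunch(pkg) {
  const hasStart = Boolean(pkg.scripts && pkg.scripts.start);
  const hasModule = Boolean(pkg.module) && pkg.type === 'module';
  const hasMain = Boolean(pkg.main);
  return hasStart || hasModule || hasMain;
}
```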
<p>Basically, if you can run <a href='https://bun.sh/docs/quickstart' title=''>Bun’s Quickstart</a> and <a href='https://fly.io/docs/hands-on/' title=''>Fly’s hands-on walk-through</a>, you have all you need to deploy your application on fly.io.</p>
<p>We also have a <a href='https://github.com/fly-apps/bun/' title=''>sample</a> that you can deploy.</p>
<p>Be forewarned that everything is beta at this point. Some issues we encountered while preparing this support:</p>
<ul>
<li><a href='https://github.com/oven-sh/bun/issues/3605' title=''><code>bun install</code> has no <code>--prune</code> option</a>. Our Dockerfiles use this to remove development dependencies after running <code>build</code>. Of course with bun you are less likely to need a build step as TS and JSX are built in.
</li><li><a href='https://github.com/oven-sh/bun/issues/1579' title=''><code>throwIfNoEntry</code> is not supported in <code>fs.statSync</code></a>. <a href='https://github.com/fly-apps/node-demo' title=''><code>fly-apps/node-demo</code></a> uses that.
</li><li>Programs that used <a href='https://nodejs.org/api/readline.html' title=''>readline</a> <a href='https://github.com/oven-sh/bun/issues/3604' title=''>never exit</a>. Switching to <a href='https://bun.sh/docs/api/globals' title=''>global</a>.<a href='https://developer.mozilla.org/en-US/docs/Web/API/Window/prompt' title=''>prompt</a> resolved this issue for <code>@flydotio/dockerfile</code>.
</li></ul>
<p>Undoubtedly there will be bugs in Fly’s dockerfile generator too. But as Node.js and Bun share the same generator, fixes that are made for either runtime will generally benefit both.</p>
<p>If you see a problem,
<a href='https://community.fly.io/' title=''>start a discussion</a>,
<a href='https://github.com/fly-apps/dockerfile-node' title=''>open an issue</a>, or
<a href='https://github.com/fly-apps/dockerfile-node/pulls' title=''>create a pull request</a>.</p>
/blog/flydotio-heart-bun/assets/flydotio-heart-bun-thumb.webpLiteFS Cloud: Distributed SQLite with Managed Backupshttps://fly.io/blog/litefs-cloud/2023-07-05T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>With Fly.io, <a href="https://fly.io/docs/speedrun/" title="">you can get your app running globally in a matter of minutes</a>, and with LiteFS, you can run SQLite alongside your app! Now we’re introducing LiteFS Cloud: managed backups and point-in-time restores for LiteFS—whether your app is running on Fly.io or anywhere else. <a href="https://fly.io/docs/litefs/speedrun/" title="">Try it out for yourself</a>!</p>
</div>
<p>We love <a href='https://fly.io/blog/all-in-on-sqlite-litestream/' title=''>SQLite in production</a>, and we’re all about running apps close to users. That’s why we created LiteFS: an open source distributed SQLite database that lives on the same filesystem as your application, and replicates data to all the nodes in your app cluster.</p>
<p>With LiteFS, you get the simplicity, flexibility, and lightning-fast local reads of working with vanilla SQLite, but distributed (so it’s close to your users)! It’s especially great for read-heavy web applications. Learn more about LiteFS in the <a href='https://fly.io/docs/litefs/' title=''>LiteFS docs</a> and in <a href='https://fly.io/blog/introducing-litefs/' title=''>our blog post introducing LiteFS</a>.</p>
<p>At Fly.io we’ve been using LiteFS internally for a while now, and it’s awesome!</p>
<p>However, something is missing: disaster recovery. Because it’s local to your app, you don’t need to—indeed can't—pay someone to manage your LiteFS cluster, which means no managed backups. Until now, you’ve had to <a href='https://fly.io/docs/litefs/backup/' title=''>build your own</a>: take regular snapshots, store them somewhere, figure out a retention policy, that sort of thing.</p>
<p>This also means you can only restore from a point in time when you happen to have taken a snapshot, and you likely need to limit how frequently you snapshot for cost reasons. Wouldn’t it be cool if you could have super-frequent reliable backups to restore from, without having to implement it yourself?</p>
<p>Well, that’s why we’re launching, in preview, LiteFS Cloud: backups and restores for LiteFS, managed by Fly.io. It gives you painless and reliable backups, with the equivalent of a snapshot every five minutes (8760 snapshots per month!), whether your database is hosted with us, or anywhere else.</p>
<h2 id='how-do-i-use-litefs-cloud' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-do-i-use-litefs-cloud' aria-label='Anchor'></a>How do I use LiteFS Cloud?</h2>
<p>There are a few steps to get started:</p>
<ul>
<li>Upgrade LiteFS to version 0.5.1 or greater
</li><li>Create a LiteFS Cloud cluster in the Fly.io dashboard, <a href='https://fly.io/dashboard/personal/litefs' title=''>LiteFS Cloud section</a>
</li><li>Make the LiteFS Cloud auth token available to your LiteFS
</li></ul>
<p><img alt="Screenshot of Fly.io dashboard, with a red arrow pointing to "LiteFS Cloud" in the left navbar, and another red arrow pointing to the "Create" button on the top right for creating a LiteFS Cloud cluster" src="/images/screenshot1.png" /></p>
<p><a href='https://fly.io/docs/litefs/cloud-backups' title=''>There are some docs here</a>, but that’s literally it. Then your database will start automagically backing up, we’ll manage the backups for you, and you’ll be able to restore your database near instantaneously to any point in time in the last 30 days (with 5 minute granularity).</p>
<p>I want to say that again because I think it’s just wild – you can restore your database to <em>any point in time, with 5 minute granularity</em>. <strong class='font-[550] text-navy-950'><em>Near instantaneously</em></strong>.</p>
<p>Speaking of restores—you can do those in the dashboard too. You pick a date and time, and we’ll take the most recent snapshot before that timestamp and restore it. This will take a couple of seconds (or less).</p>
<p><img alt="Screenshot of popup modal on Fly.io dashboard, with a date and time selector, and a text field with "lfsc-test-runner/db" typed in it, and a red button at the bottom with text "I understand the consequences. Restore from this snapshot."" src="/images/screenshot2.png" /></p>
<p>We’ll introduce pricing in the coming months, but for now LiteFS Cloud is in preview and is free to use. Please go check it out, and let us know how it goes!</p>
<h2 id='the-secret-sauce-ltx-amp-compactions' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-secret-sauce-ltx-amp-compactions' aria-label='Anchor'></a>The secret sauce: LTX & compactions</h2>
<p>LiteFS is built on a simple file format called <a href='https://github.com/superfly/ltx' title=''>Lite Transaction File (LTX)</a> which is designed for fast, flexible replication and recovery in LiteFS itself and in LiteFS Cloud.</p>
<p>But first, let’s start off with what an LTX file represents: <em>a change set of database pages</em>.</p>
<p>When you commit a write transaction in SQLite, it updates one or more fixed-sized blocks called pages. By default, these are 4KB in size. An LTX file is simply a sorted list of these changed pages. Whenever you perform a transaction in SQLite, LiteFS will build an LTX file for that transaction.</p>
<p>The interesting part of LTX is that contiguous sets of LTX files can be merged together into one LTX file. This merge process is called <em>compaction</em>.</p>
<p>For example, let’s say you have 3 transactions in a row that update the following set of pages:</p>
<ul>
<li>LTX A: Pages 1, 5, 7
</li><li>LTX B: Pages 5, 6
</li><li>LTX C: Pages 5, 7
</li></ul>
<p>With LTX compaction, you avoid the duplicate work that comes from overwriting the same pages one transaction at a time. Instead, one LTX file for transactions A through C contains the last version of each page, so the pages are stored and updated only once:</p>
<p><img alt="Compacting three contiguous LTX files into a single LTX file." src="/images/single-level-compaction.png" /></p>
<p>That, in a nutshell, is how a single-level compaction works.</p>
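<p>Here is a minimal sketch of that merge, using the three transactions from the example. Each LTX file is modeled as a <code>Map</code> from page number to page contents, which is a simplification for illustration: the real format is binary and carries headers and checksums that are left out here.</p>

```javascript
// Merge contiguous LTX files (oldest first) into one. For each page,
// the version from the latest transaction wins.
function compact(ltxFiles) {
  const merged = new Map();
  for (const file of ltxFiles) {
    for (const [pageNo, contents] of file) {
      merged.set(pageNo, contents);
    }
  }
  // Keep pages sorted by page number, as LTX files are.
  return new Map([...merged].sort(([a], [b]) => a - b));
}

const ltxA = new Map([[1, 'A1'], [5, 'A5'], [7, 'A7']]);
const ltxB = new Map([[5, 'B5'], [6, 'B6']]);
const ltxC = new Map([[5, 'C5'], [7, 'C7']]);

const compacted = compact([ltxA, ltxB, ltxC]);
// compacted holds pages 1, 5, 6, 7 with contents A1, C5, B6, C7
```

<p>Seven page writes across three files collapse into four pages in one file, and page 5 is stored once instead of three times.</p>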
<h2 id='its-ltx-all-the-way-down' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#its-ltx-all-the-way-down' aria-label='Anchor'></a>It’s LTX all the way down</h2>
<p>Compactions let us take changes for a bunch of transactions and smoosh them down into a single, small file. That’s cool and all but how does that give us fast point-in-time restores? By the magic of multi-level compactions!</p>
<p>Compaction levels are progressively larger time intervals into which we roll up transaction data. In the following illustration, you can see that the highest level (L3) starts with a full snapshot of the database. This occurs daily and it’s our starting point during a restore.</p>
<p>Next, we have an hourly compaction level called L2 so there will be an LTX file with page changes between midnight and 1am, and then another file for 1am to 2am, etc. Below that is L1 which holds 5-minute intervals of data.</p>
<p><img alt="Compaction levels for snapshots (L3), hourly (L2), & every five minutes (L1)." src="/images/multi-level-compaction.png" /></p>
<p>When a restore is requested for a specific timestamp, we can determine a minimal set of LTX files to replay. For example, if we restored to January 10th at 8:15am we would grab the following files:</p>
<ul>
<li>Start with the snapshot for January 10th.
</li><li>Fetch the eight hourly LTX files from midnight to 8am.
</li><li>Fetch the three 5-minute interval LTX files from 8:00am to 8:15am.
</li></ul>
<p>Since LTX files are sorted by page number, we can perform a streaming merge of these twelve files and end up with the state of the database at the given timestamp.</p>
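<p>The file selection for that example can be sketched as follows. The labels are invented for readability, and the sketch assumes that daily snapshots are taken at midnight and that all boundaries are in UTC; neither detail is stated above. The year is arbitrary.</p>

```javascript
const pad = (n) => String(n).padStart(2, '0');

// List the files to replay for a restore to `target` (a Date):
// the daily snapshot, then whole hours, then whole 5-minute intervals.
function restorePlan(target) {
  const day = target.toISOString().slice(0, 10);
  const hour = target.getUTCHours();
  const intervals = Math.floor(target.getUTCMinutes() / 5);

  const files = [`L3 snapshot ${day}`];
  for (let h = 0; h < hour; h++) {
    files.push(`L2 hourly ${pad(h)}:00-${pad(h + 1)}:00`);
  }
  for (let i = 0; i < intervals; i++) {
    const start = i * 5;
    files.push(`L1 5min ${pad(hour)}:${pad(start)}-${pad(hour)}:${pad(start + 5)}`);
  }
  return files;
}

// January 10th at 8:15am, as in the example above.
const plan = restorePlan(new Date(Date.UTC(2023, 0, 10, 8, 15)));
// plan.length is 12: one snapshot, eight hourly files, three 5-minute files
```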
<h2 id='department-of-redundancy-department' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#department-of-redundancy-department' aria-label='Anchor'></a>Department of Redundancy Department</h2>
<p>One of the primary goals of LiteFS is to be simple to use. However, that’s not an easy goal for a distributed database when our industry is moving more and more towards highly dynamic and ephemeral infrastructure. Traditional consensus algorithms require stable membership and adjusting the member set can be complicated.</p>
<p>With LiteFS, we chose to use async replication as the primary mode of operation. This has some trade-offs in durability guarantees but it makes the cluster much simpler to operate. LiteFS Cloud alleviates many of these trade-offs of async replication by writing data out to high-durability, high-availability object storage—for now, we’re using S3.</p>
<p>However, we don’t write every individual LTX file to object storage immediately. The latency is too high and it’s not cost effective when you write a lot of transactions. Instead, the LiteFS primary node will batch up its changes every second and send a single, compacted LTX file to LiteFS Cloud. Once there, LiteFS Cloud will batch these 1-second files together and flush them to storage periodically.</p>
<p>We track the ID of the latest transaction that’s been flushed, and we call this the “high water mark” or HWM. This transaction ID is propagated back down to the nodes of the LiteFS cluster so we can ensure that the transaction file is not removed from any node until it is safely persisted in object storage. With this approach, we have multiple layers of redundancy in case your LiteFS cluster can’t communicate with LiteFS Cloud or if we can’t communicate with S3.</p>
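<p>The retention rule this implies on each node can be sketched in a few lines. The <code>maxTxid</code> field and the <code>partitionByHwm</code> helper are hypothetical names for this example and are not taken from the LiteFS source.</p>

```javascript
// Split a node's local LTX files into those that are safe to remove
// (everything at or below the high water mark is already in object
// storage) and those that must be retained.
function partitionByHwm(localFiles, hwm) {
  const removable = [];
  const retained = [];
  for (const file of localFiles) {
    (file.maxTxid <= hwm ? removable : retained).push(file);
  }
  return { removable, retained };
}
```

<p>If the node loses contact with LiteFS Cloud, the high water mark stops advancing, so nothing new becomes removable and the unflushed transactions stay on disk.</p>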
<h2 id='whats-next-for-litefs-cloud' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#whats-next-for-litefs-cloud' aria-label='Anchor'></a>What’s next for LiteFS Cloud?</h2>
<p>We have a small team dedicated to LiteFS Cloud, and we’re chugging away at new exciting features! Right now, LiteFS Cloud is really just backups and restores, but we are working on a lot of other cool stuff:</p>
<ul>
<li>Upload your database in the Fly.io dashboard. This way you don’t have to worry about figuring out how to initialize your database when you first deploy it, just upload the database in the dashboard and LiteFS will pull it from LiteFS Cloud.
</li><li>Download a point-in-time snapshot of your database from the Fly.io dashboard. You can use this to spin up a local dev env (with production data), do some local analysis, etc.
</li><li>Clone your LiteFS Cloud cluster to a new cluster, which you could use for a staging environment (or on-demand test environments for your CI pipelines) with real data.
</li><li>Features to support apps that run on serverless platforms like Vercel, Google Cloud Run, Deno, and more. We’ll need to develop a number of different features for this, stay tuned for more information in the coming weeks!
</li></ul>
<p>We’re really excited about the future of LiteFS Cloud, so we wanted to share what we’re thinking. We’d also love to hear any feedback you have about these ideas that might inform our work.</p>
/blog/litefs-cloud/assets/litefs-cloud-thumb.webpWe Raised A Bunch Of Moneyhttps://fly.io/blog/we-raised-a-bunch-of-money/2023-06-27T00:00:00+00:002023-11-21T21:08:37+00:00<p><strong class='font-[550] text-navy-950'>This past July, we raised $25MM from A16Z and our existing investors, including Intel Capital and Dell. Recently, we raised an additional $70MM led by EQT Ventures.</strong></p>
<p>Why do startups write announcements like these? We went back and forth on it. There are lots of reasons, most of them dumb.</p>
<p>Our first reason is obvious, and mercenary. It’s the same reason we write anything: to woo customers. We’re all adults here, we can talk about this stuff, right? There are customers who are comfortable engaging with tiny Fly.io, and others who are comfortable engaging with the Fly.io that raised an additional $70MM led by EQT Ventures. Alcoa: ring us up!</p>
<p>More compellingly, it’s an opportunity to gaze deeply into our own navels. We’ve been <a href='https://community.fly.io/' title=''>talking to users, fans, and detractors about what we’ve been doing</a>, for years. We evolved, and got religion about a particular vision of what we’re building. We shared that with investors, and they bought it (suckers). Now we’ll share with you.</p>
<h2 id='the-two-hour-problem' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-two-hour-problem' aria-label='Anchor'></a>The Two Hour Problem</h2>
<p>Here’s what we believed in 2020: apps work better when they run closer to their users. Some kinds of apps, like video or real-time presence, can’t be done without physical locality. So, that’s what we expected to talk about on our <a href='https://news.ycombinator.com/item?id=22616857' title=''>HN launch thread</a>: WebRTC, edge caching, game servers.</p>
<p>What people actually wanted to talk about, though? Databases.</p>
<p>Here’s what we missed: we thought there was a particular kind of “edgy” app that demanded global deployment. But it turns out, most apps want to be edgy… if it’s easy.</p>
<p>What’s going on here? Why is edge deployment table stakes for a game server and an untenable science project for an online bookstore? We think it’s because game servers have to be edgy, and online bookstores don’t. The game server team will bang on edge deployment until it’s solved. The bookstore team will try for about two hours, not find a clear path forward, and then give up and move on to other things.</p>
<p>The result of this is an Internet where all of the world’s CRUD apps are hosted in Loudoun County, VA (motto: “where tradition meets innovation”), at Amazon’s <code>us-east-1</code> in Ashburn, a city with so many Rails apps that one of them was elected to the county Board of Supervisors.</p>
<p>We think everybody understands that it’d be better to run close to users rather than in the Internet’s least worst data center. But with ordinary tooling, getting an app running in more than one city at the same time isn’t a two-hour problem: in two hours, you’ll learn that it’s possible to run simultaneously in Sydney, Frankfurt, and Dallas, but not how to do it, or how long it’ll take.</p>
<p>So our bet is simple: with the right platform and toolchain, people building bookstores, sandwich rating apps, music recommenders, mailing list managers for churches, and every other kind of app will build apps that run fast globally. Not just walking distance from Carolina Brothers BBQ in Ashburn, but in Chicago, or Sydney, or Singapore, or São Paulo. Because being fast in more than one city at the same time is a super valuable feature!</p>
<p>We think this pattern holds for a lot of things. We’re going to track those things down and build them.</p>
<p>For example: sandboxing, code editors and REPLs, and CI/CD applications all have to figure out how to run untrusted customer code. They all figure out how to spin up locked down containers on demand. But being able to spin up a VM on the fly is a super valuable feature for all kinds of apps (as anyone who’s ever debugged a stuck job queue can attest). Why doesn’t everybody do it? Because it isn’t clear after two hours of investigation how to do it. <a href='https://fly.io/blog/fly-machines/' title=''>So we built Fly Machines</a>, which makes spinning up a VM as straightforward as calling a function.</p>
<p>We’ve got more things like this coming. Real-time features and user presence are two-hour features. So are encryption and secret storage. And clustered databases. And hardware-accelerated inferencing.</p>
<p>There are other companies looking to solve “two hour window” problems for developers: distributed databases, data locality, storage, AI, app frameworks. If we get Fly.io right, we’ll give those platforms new primitives to build on top of, get new ideas in front of users faster, and ratchet up the quality of every application anywhere.</p>
<p>Sounds like an investment pitch? Well, yeah, it was.</p>
<h2 id='why-we-raised-a-bunch-of-money' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#why-we-raised-a-bunch-of-money' aria-label='Anchor'></a>Why We Raised A Bunch Of Money</h2>
<p>Here’s what we think it takes to build this kind of platform:</p>
<ul>
<li>A hardware fleet. Fly.io has always run on its own hardware. There are fun, technical, “control your own destiny” reasons to rack hardware instead of layering on top of commodity clouds. But it’s really just economics. If you want to get people to build apps on your platform, you need a shot at being around 10 years from now. Hardware is what makes the margins work.
</li><li>All the regions. This is subtle! We launched with 19 regions, which, if you’re just serving individual application developers, is plenty. But alongside individual apps, we want other platforms running on us — managed databases, developer tools. And those companies need all the regions. We’re up to 33 regions now, and we’re getting much faster at lighting new ones up.
</li><li>Support and reliability. We’re under no illusions about the platform reliability task we’re facing, or about our ability to clever our way through it (we’re not that clever to begin with).
</li></ul>
<p>Those things are all capital intensive, and alongside them we’d like to place more bets: on advanced storage, on security capabilities, on new kinds of hardware. So you see where the money goes.</p>
<h2 id='heres-whats-not-changing' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#heres-whats-not-changing' aria-label='Anchor'></a>Here’s What’s Not Changing</h2>
<p>🎶 There are two kinds of platform companies 🎶 : the kind where you can sign up online and be playing with them in 5 minutes, and the kind where you can sign up online and get a salesperson to call and quote you a price and arrange a demo.</p>
<p>🎶 There are two kinds of platform companies 🎶 : the kind you can figure out without reading the manual, and the kind where publishers have competing books on how to use them, the kind where you can get professionally certified in actually being able to boot up an app on them.</p>
<p>🎶 There are two kinds of platform companies 🎶 : the kind where you can get your Python or Rust or Julia code running nicely, and the kind where you find a way to recompile it to JavaScript.</p>
<p>The kind of platform company we want to be hasn’t changed since 2020. Our features are all generally a command or two in <code>flyctl</code>, and they work for any app that can be packaged in a container.</p>
<p>You can take our word for that, but if you’ve already got a working Docker container for your app, you can put us to the test. From a standing start, you should be able to get it running on Fly.io in single-digit minutes, and on every continent in just a minute or two more.</p>
/blog/we-raised-a-bunch-of-money/assets/fundraising-thumbnail.pngPython Async Workers on Fly Machineshttps://fly.io/blog/python-async-workers-on-fly-machines/2023-05-31T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>In this post we use Fly Machines to implement lightweight background jobs for a Python web application. Check it out: <a href="https://fly.io/docs/languages-and-frameworks/python/" title="">you can be up and running on Fly.io in just minutes.</a></p>
</div>
<p>Last year, while working in what was my day job at the time (before I joined Fly.io!), we had just developed a new internal tool to help an adjacent team with their work. This adjacent team wrote technical content, and they had a lot of issues stemming from differences in library and language versions in the team members’ local environments as compared to our production environment.</p>
<p>There are a lot of possible solutions to this problem, but because of the unique needs and skillset of this team, we decided to build an app for them to work in, and allow them to just get rid of their local environment entirely. This way, we could ensure that all the versions were exactly as expected, and over time we could also add more assistive features.</p>
<p>At the start, this was a super-hastily-thrown-together, barely an MVP tool that kinda sorta met the internal users’ needs most of the time. The first version was only good enough because their previous workflow was just so awful — it was difficult for us to do worse.</p>
<p>One thing our new app needed to do was build and install libraries (the same ones our teammates had been installing locally), and we needed to rebuild them regularly (think: whenever a user clicks a “Build” button in the app).</p>
<p>Initially, we simply implemented these builds in the backend directly. This worked great for a little while, and it was nice to only have to deploy one thing. But then we discovered that (1) for some edge cases, our builds were very slow (occasionally over 30 minutes — far too slow for the HTTP request cycle…), and (2) some builds took a lot of resources, so occasionally, even after over-provisioning, if two builds came in at once, our backend got killed (and the builds never completed).</p>
<p>Based on this less-than-awesome experience, it became clear to us that we needed background jobs!</p>
<p>We ended up configuring Celery, as one does (when one is a Python developer anyway). However, this wasn’t as pain-free as it could have been. There’s some significant configuration required, and Celery was overkill for our very simple use case.</p>
<p>Plus – those expensive builds? We needed to have a worker (or several workers) available to run them any time, even though we only had a handful of team members using the tool, so most of the time the worker was idle. We were paying for resources we weren’t using most of the time — not at all awesome for a bootstrapped startup!</p>
<p>So, how could we have implemented super simple background jobs, and avoided paying for resources we didn’t need?</p>
<p>Well, it turns out that it’s really pretty easy to implement simple background jobs using Fly Machines! I’ll show you how.</p>
<h2 id='how-it-works' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-it-works' aria-label='Anchor'></a>How it works</h2>
<p>First some background. Fly Machines are lightweight VMs based on <a href='https://firecracker-microvm.github.io/' title=''>Firecracker</a> that start up super fast (you can read more details about Machines in <a href='https://fly.io/docs/machines/' title=''>our Machines documentation</a>). They also have a convenient and simple API, making them easy to start, stop, and interact with from your code.</p>
<p>For the purposes of this post, we’ll be building a demo app: a super minimal Flask web application that sends email in a background job (<a href='https://github.com/magdalene/fly-background-jobs' title=''>full code available here</a>). You can also try out the application at <a href='https://darla-send-email.fly.dev/' title=''>darla-send-email.fly.dev</a>. Note: for demonstration purposes, the application I’ve deployed uses the <code>dummy_send_email</code> function, which doesn’t actually send an email! You can also deploy your own version with real <a href='https://www.mailjet.com/' title=''>Mailjet</a> credentials, though.</p>
<p>So, here’s how our implementation works from a high level:</p>
<ul>
<li>The web application (or a library the web app uses) writes some job parameters to Redis
</li><li>The web application calls the Fly Machines API to start a new machine, adding an env variable to tell the worker which Redis key to load job params from
</li><li>When the worker starts, it reads the job params from Redis
</li><li>The worker does its magic! ✨
</li><li>The worker writes results to Redis
</li><li>The web app retrieves the results!
</li></ul>
<p><img alt="Python workers on machines diagram" src="/images/python-workers-diagram.webp" /></p>
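<p>If it helps to see the handoff before any real infrastructure is involved, here’s a minimal sketch of the steps above. It’s a toy, not the demo app’s code: a plain dict stands in for Redis, <code>fake_start_machine</code> is a made-up stand-in for the Machines API call (it just runs the worker inline), and the <code>tasks:</code> and <code>results:</code> key prefixes are assumptions for illustration.</p>

```python
import json
import uuid

# Stand-in for Redis: a plain dict, so the sketch runs without any services.
fake_redis = {}


def worker_main(env):
    """What the worker does once it boots."""
    # The worker learns which keys to use from its environment variables.
    params = json.loads(fake_redis[env["REDIS_TASK_INFO_KEY"]])
    result = sum(params["numbers"])  # placeholder for the real work
    fake_redis[env["REDIS_RESULTS_KEY"]] = json.dumps(
        {"status": "SUCCESS", "result": result}
    )


def fake_start_machine(env):
    """Hypothetical stand-in for starting a Machine; runs the worker inline."""
    worker_main(env)


def submit_job(params):
    task_id = str(uuid.uuid4())
    task_key = f"tasks:{task_id}"        # assumed key naming
    results_key = f"results:{task_id}"   # assumed key naming
    # Write the job parameters first, so they exist before the worker boots.
    fake_redis[task_key] = json.dumps(params)
    # Start a worker and tell it, via env variables, where to look.
    fake_start_machine(
        {"REDIS_TASK_INFO_KEY": task_key, "REDIS_RESULTS_KEY": results_key}
    )
    return results_key


def fetch_result(results_key):
    raw = fake_redis.get(results_key)
    return json.loads(raw) if raw is not None else None


results_key = submit_job({"numbers": [1, 2, 3]})
outcome = fetch_result(results_key)
```

<p>The one ordering detail worth noticing: the job parameters are written before the worker is started, so the worker never boots to an empty key.</p>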
<p>One really cool thing about this implementation is that you only pay for worker resources when your workers are actually, you know, doing work. For infrequent, expensive background jobs, this can make a huge difference in costs!</p>
<p>Before we get into the code, we’ll need to set up a few bits of infrastructure. Let’s check how that’s done.</p>
<h2 id='infrastructure-setup' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#infrastructure-setup' aria-label='Anchor'></a>Infrastructure setup</h2>
<p>I’ll assume you’ve already set up your Fly.io account and installed the <code>flyctl</code> command-line tool. If you haven’t done that yet, follow these instructions to <a href='https://fly.io/docs/hands-on/install-flyctl/' title=''>install <code>flyctl</code></a>, <a href='https://fly.io/docs/hands-on/sign-up/' title=''>sign up</a>, <a href='https://fly.io/docs/hands-on/sign-in/' title=''>log in to fly.io</a>, and then come back here!</p>
<p>After you have your Fly.io account set up and <code>flyctl</code> installed locally, you’ll need to create two pieces of infrastructure: a Fly.io App, which the Machines that run the background jobs will belong to, and a Redis instance, which we’ll use to communicate between the web application and the background job Machines.</p>
<h3 id='create-an-app' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#create-an-app' aria-label='Anchor'></a>Create an app</h3>
<p>Fly.io Machines need to be created in an app, so we’ll need to create an app.</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-z2s5zzrw">fly apps create my-machine-tasks <span class="c"># name your app something appropriate!</span>
</code></pre>
</div>
</div><div class="callout"><p>Warning: app names are unique across all Fly.io users, so you’ll need to pick something unique. You can also call <code>fly apps create</code> without an app name, and let it generate one for you, if you’re stuck.</p>
</div><h3 id='create-an-upstash-redis-instance' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#create-an-upstash-redis-instance' aria-label='Anchor'></a>Create an Upstash Redis instance</h3><div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-efrwrro5">fly redis create
</code></pre>
</div>
</div>
<p>Take note of the Redis URL that’s printed after creation. If you forget it, you can see it again using <code>fly redis status</code>.</p>
<h2 id='the-worker-code' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-worker-code' aria-label='Anchor'></a>The worker code</h2>
<p>First, let’s take a look at the code that we’ll run on the Machine:</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-8sho35dp"><span class="kn">import</span> <span class="nn">importlib</span>
<span class="kn">import</span> <span class="nn">json</span>

<span class="kn">import</span> <span class="nn">redis</span>

<span class="c1"># REDIS_URL, REDIS_TASK_INFO_KEY and REDIS_RESULTS_KEY are assumed to be defined elsewhere in the worker
</span>
<span class="k">def</span> <span class="nf">run_task</span><span class="p">():</span>
    <span class="n">redis_client</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">from_url</span><span class="p">(</span><span class="n">REDIS_URL</span><span class="p">)</span>
    <span class="c1"># retrieve info about the function to run and args to use
</span>    <span class="n">task_info</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">redis_client</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">REDIS_TASK_INFO_KEY</span><span class="p">))</span>
    <span class="n">module_name</span> <span class="o">=</span> <span class="n">task_info</span><span class="p">[</span><span class="s">"module"</span><span class="p">]</span>
    <span class="n">fn_name</span> <span class="o">=</span> <span class="n">task_info</span><span class="p">[</span><span class="s">"function_name"</span><span class="p">]</span>
    <span class="n">args</span> <span class="o">=</span> <span class="n">task_info</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"args"</span><span class="p">,</span> <span class="p">[])</span>
    <span class="n">kwargs</span> <span class="o">=</span> <span class="n">task_info</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"kwargs"</span><span class="p">,</span> <span class="p">{})</span>
    <span class="n">module</span> <span class="o">=</span> <span class="n">importlib</span><span class="p">.</span><span class="n">import_module</span><span class="p">(</span><span class="n">module_name</span><span class="p">)</span>
    <span class="n">task_fn</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">module</span><span class="p">,</span> <span class="n">fn_name</span><span class="p">)</span>
    <span class="c1"># do the actual work
</span>    <span class="n">result</span> <span class="o">=</span> <span class="n">task_fn</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>
    <span class="c1"># write results to Redis
</span>    <span class="n">redis_client</span><span class="p">.</span><span class="nb">set</span><span class="p">(</span><span class="n">REDIS_RESULTS_KEY</span><span class="p">,</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">({</span>
        <span class="s">"status"</span><span class="p">:</span> <span class="s">"SUCCESS"</span><span class="p">,</span>
        <span class="s">"result"</span><span class="p">:</span> <span class="n">result</span>
    <span class="p">}))</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span>
    <span class="n">run_task</span><span class="p">()</span>
</code></pre>
</div>
</div>
<p>You might notice something missing here — the code that actually sends the email. You’ll also need to implement the functions that do the work of the background jobs, and include them in the worker library. You can take a look at <a href='https://github.com/magdalene/fly-background-jobs/blob/main/worker/tasks.py#L10' title=''>the <code>send_email</code> function in <code>tasks.py</code> in the demo code repo</a>, to see the implementation for sending an email!</p>
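<p>To give a feel for what a task function can look like, here’s a small sketch. It is not the <code>send_email</code> implementation from the demo repo: <code>fake_send_email</code> is a hypothetical name, and it only reports what it would have sent. The one real constraint it illustrates comes from the worker code above: the return value gets passed through <code>json.dumps</code>, so it has to be JSON-serializable.</p>

```python
import json


def fake_send_email(to_email, subject, body, to_name=None, from_email=None):
    """Hypothetical task: pretends to send an email and reports what it would send."""
    recipient = f"{to_name} <{to_email}>" if to_name else to_email
    # Return only plain dicts, strings, numbers and booleans, so the worker
    # can serialize the result with json.dumps.
    return {
        "sent": False,  # nothing is really sent in this sketch
        "to": recipient,
        "from": from_email,
        "subject": subject,
        "body_length": len(body),
    }


result = fake_send_email(
    "your-friend@yourfriendsdomain.com",
    "Hello from my Fly.io app!",
    "Hello friend!",
    to_name="Friend's Name",
    from_email="you@yourdomain.com",
)

# This mirrors what the worker does with a task's return value.
payload = json.dumps({"status": "SUCCESS", "result": result})
```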
<p>Here’s an example of the task info that might be stored in Redis for sending an email from our demo app:</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-qggr06hp"><span class="p">{</span>
    <span class="s">"module"</span><span class="p">:</span> <span class="s">"tasks"</span><span class="p">,</span>
    <span class="s">"function_name"</span><span class="p">:</span> <span class="s">"send_email"</span><span class="p">,</span>
    <span class="s">"args"</span><span class="p">:</span> <span class="p">[</span>
        <span class="s">"your-friend@yourfriendsdomain.com"</span><span class="p">,</span>
        <span class="s">"Hello from my Fly.io app!"</span><span class="p">,</span>
        <span class="s">"Hello friend! I sent this email from my app on Fly.io!"</span>
    <span class="p">],</span>
    <span class="s">"kwargs"</span><span class="p">:</span> <span class="p">{</span>
        <span class="s">"to_name"</span><span class="p">:</span> <span class="s">"Friend's Name"</span><span class="p">,</span>
        <span class="s">"from_email"</span><span class="p">:</span> <span class="s">"you@yourdomain.com"</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
<p>We’re sending the module and function name as strings in the task information in Redis. There are more sophisticated options here, but this approach works for our simple use case!</p>
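<p>Here’s a tiny, self-contained look at that name-based lookup, using the standard library’s <code>math.hypot</code> as a stand-in task so it runs anywhere. The <code>ALLOWED_TASKS</code> allowlist is my own addition, not something the demo app does; it’s one possible example of a slightly more careful option, since otherwise anything that can write to Redis gets to choose which importable function runs.</p>

```python
import importlib
import json

# Hypothetical allowlist of (module, function) pairs the worker may run.
ALLOWED_TASKS = {("math", "hypot")}


def resolve_and_call(task_info):
    """Look up a function by module and name, then call it with stored args."""
    key = (task_info["module"], task_info["function_name"])
    if key not in ALLOWED_TASKS:
        raise ValueError(f"task not allowed: {key}")
    module = importlib.import_module(task_info["module"])
    task_fn = getattr(module, task_info["function_name"])
    return task_fn(*task_info.get("args", []), **task_info.get("kwargs", {}))


# Round-trip through JSON, the same way the task info travels via Redis.
stored = json.dumps({"module": "math", "function_name": "hypot", "args": [3, 4]})
result = resolve_and_call(json.loads(stored))
```

<p>Because the task info is plain JSON, arguments are limited to what JSON can carry: strings, numbers, booleans, lists, and dicts.</p>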
<h2 id='code-to-call-the-worker' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#code-to-call-the-worker' aria-label='Anchor'></a>Code to call the worker</h2>
<p>Then, let’s take a look at the code that we’ll use to set up the Machine and kick off the background job:</p>
<div class="highlight-wrapper group relative python">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-wznc48i1"><span class="n">headers</span> <span class="o">=</span> <span class="p">{</span>
<span class="s">"Authorization"</span><span class="p">:</span> <span class="sa">f</span><span class="s">"Bearer </span><span class="si">{</span><span class="n">FLY_API_TOKEN</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
<span class="s">"Content-Type"</span><span class="p">:</span> <span class="s">"application/json"</span>
<span class="p">}</span>
<span class="k">def</span> <span class="nf">run_task</span><span class="p">(</span><span class="n">module_name</span><span class="p">,</span> <span class="n">function_name</span><span class="p">,</span> <span class="n">args</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">kwargs</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
    <span class="n">redis_client</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">from_url</span><span class="p">(</span><span class="n">REDIS_URL</span><span class="p">,</span> <span class="n">decode_responses</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">args</span> <span class="o">=</span> <span class="n">args</span> <span class="ow">or</span> <span class="p">[]</span>
    <span class="n">kwargs</span> <span class="o">=</span> <span class="n">kwargs</span> <span class="ow">or</span> <span class="p">{}</span>
    <span class="n">task_id</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">function_name</span><span class="si">}</span><span class="s">-</span><span class="si">{</span><span class="n">uuid</span><span class="p">.</span><span class="n">uuid4</span><span class="p">()</span><span class="si">}</span><span class="s">"</span>
    <span class="n">redis_task_info_key</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">TASKS_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span>
    <span class="n">redis_results_key</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">RESULTS_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span>
    <span class="n">machine_config</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="n">task_id</span><span class="p">,</span>
        <span class="s">"config"</span><span class="p">:</span> <span class="p">{</span>
            <span class="s">"image"</span><span class="p">:</span> <span class="n">WORKER_IMAGE</span><span class="p">,</span>
            <span class="s">"env"</span><span class="p">:</span> <span class="p">{</span>
                <span class="s">"REDIS_TASK_INFO_KEY"</span><span class="p">:</span> <span class="n">redis_task_info_key</span><span class="p">,</span>
                <span class="s">"REDIS_RESULTS_KEY"</span><span class="p">:</span> <span class="n">redis_results_key</span>
            <span class="p">},</span>
            <span class="s">"processes"</span><span class="p">:</span> <span class="p">[{</span>
                <span class="s">"name"</span><span class="p">:</span> <span class="s">"worker"</span><span class="p">,</span>
                <span class="s">"entrypoint"</span><span class="p">:</span> <span class="p">[</span><span class="s">"python"</span><span class="p">],</span>
                <span class="s">"cmd"</span><span class="p">:</span> <span class="p">[</span><span class="s">"worker.py"</span><span class="p">]</span>
            <span class="p">}]</span>
        <span class="p">}</span>
    <span class="p">}</span>
    <span class="n">redis_client</span><span class="p">.</span><span class="nb">set</span><span class="p">(</span><span class="n">redis_task_info_key</span><span class="p">,</span> <span class="n">json</span><span class="p">.</span><span class="n">dumps</span><span class="p">({</span>
        <span class="s">"module"</span><span class="p">:</span> <span class="n">module_name</span><span class="p">,</span>
        <span class="s">"function_name"</span><span class="p">:</span> <span class="n">function_name</span><span class="p">,</span>
        <span class="s">"kwargs"</span><span class="p">:</span> <span class="n">kwargs</span><span class="p">,</span>
        <span class="s">"args"</span><span class="p">:</span> <span class="n">args</span>
    <span class="p">}))</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">post</span><span class="p">(</span>
        <span class="sa">f</span><span class="s">"https://api.machines.dev/v1/apps/</span><span class="si">{</span><span class="n">FLY_TASKS_APP</span><span class="si">}</span><span class="s">/machines"</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="n">machine_config</span>
    <span class="p">)</span>
    <span class="n">response</span><span class="p">.</span><span class="n">raise_for_status</span><span class="p">()</span>
    <span class="c1"># store the machine id so we can use it later to check if the job has completed
</span>    <span class="n">machine_id</span> <span class="o">=</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()[</span><span class="s">"id"</span><span class="p">]</span>
    <span class="n">redis_client</span><span class="p">.</span><span class="nb">set</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">MACHINE_INFO_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">,</span> <span class="n">machine_id</span><span class="p">)</span>
    <span class="k">return</span> <span class="p">{</span>
        <span class="s">"task_id"</span><span class="p">:</span> <span class="n">task_id</span>
    <span class="p">}</span>
</code></pre>
</div>
</div>
<p>We’ll call this code from our web application whenever the POST endpoint (to send an email) is called. This will kick off the job running on a Fly Machine, and return the task id, which is used to retrieve the results!</p>
<h2 id='code-to-retrieve-results' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#code-to-retrieve-results' aria-label='Anchor'></a>Code to retrieve results</h2>
<p>When we retrieve the results, we need to first check whether the Machine is still running. If it’s still running, we can just return a <code>PENDING</code> status, and expect the client will try again later.</p>
<p>Once the Machine is done, we can retrieve the result that the job wrote to Redis, and return it to the caller!</p>
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-crmppycf"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-crmppycf"><span class="k">def</span> <span class="nf">get_results</span><span class="p">(</span><span class="n">task_id</span><span class="p">):</span>
    <span class="n">redis_client</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">from_url</span><span class="p">(</span><span class="n">REDIS_URL</span><span class="p">,</span> <span class="n">decode_responses</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="c1"># check whether the machine is still running
</span>    <span class="n">machine_id</span> <span class="o">=</span> <span class="n">redis_client</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">MACHINE_INFO_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">get</span><span class="p">(</span>
        <span class="sa">f</span><span class="s">"https://api.machines.dev/"</span>
        <span class="sa">f</span><span class="s">"v1/apps/</span><span class="si">{</span><span class="n">FLY_TASKS_APP</span><span class="si">}</span><span class="s">/machines/</span><span class="si">{</span><span class="n">machine_id</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="n">headers</span><span class="o">=</span><span class="n">headers</span>
    <span class="p">)</span>
    <span class="n">machine_info</span> <span class="o">=</span> <span class="n">response</span><span class="p">.</span><span class="n">json</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">machine_info</span><span class="p">[</span><span class="s">"state"</span><span class="p">]</span> <span class="ow">in</span> <span class="p">(</span><span class="s">"starting"</span><span class="p">,</span> <span class="s">"created"</span><span class="p">,</span> <span class="s">"started"</span><span class="p">):</span>
        <span class="k">return</span> <span class="p">{</span>
            <span class="s">"status"</span><span class="p">:</span> <span class="s">"PENDING"</span>
        <span class="p">}</span>
    <span class="c1"># if the machine is done, get the result!
</span>    <span class="n">result</span> <span class="o">=</span> <span class="n">redis_client</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">RESULTS_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
</code></pre>
</div>
</div>
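<p>One edge case is worth a note: if the Machine stops without writing a result (say, the job crashed), the Redis lookup returns <code>None</code>, and <code>json.loads(None)</code> raises a <code>TypeError</code>. Here is a minimal sketch of one way to guard against that. The <code>decode_result</code> helper and the <code>FAILED</code> status are assumptions made for illustration, not part of the demo app.</p>

```python
import json


def decode_result(raw):
    """Decode the value read from Redis, tolerating a missing result.

    `raw` is whatever redis_client.get(...) returned: a JSON string,
    or None when the key does not exist.
    """
    if raw is None:
        # Hypothetical status: the Machine finished but never wrote a result.
        return {"status": "FAILED", "detail": "no result was written"}
    return json.loads(raw)
```

<p>With a helper like this, the last line of <code>get_results</code> could become <code>return decode_result(result)</code>.</p>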
<p>In our simple demo web application, we have a <code>GET /status/{task_id}</code> endpoint, which calls this function to retrieve the result and then displays it to the user. If the status is <code>PENDING</code>, the user can refresh the page to try again.</p>
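<p>If the caller is a script rather than a person refreshing a page, the same retry can be automated. The following is a rough sketch, not part of the demo app: <code>wait_for_result</code> and its parameters are hypothetical names, and it assumes <code>fetch</code> behaves like <code>get_results</code> above, returning <code>{"status": "PENDING"}</code> while the Machine is still running.</p>

```python
import time


def wait_for_result(fetch, task_id, timeout=30.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch(task_id) until it stops returning PENDING or time runs out.

    `fetch` is assumed to behave like get_results(): it returns
    {"status": "PENDING"} while the job is still running.
    """
    deadline = clock() + timeout
    while True:
        result = fetch(task_id)
        pending = isinstance(result, dict) and result.get("status") == "PENDING"
        if not pending:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        sleep(interval)
```

<p>Calling <code>wait_for_result(get_results, task_id)</code> would then block until the job finishes or the timeout passes. Note that a job whose own result happens to be <code>{"status": "PENDING"}</code> would be indistinguishable from a running one.</p>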
<h2 id='code-to-clean-up-resources' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#code-to-clean-up-resources' aria-label='Anchor'></a>Code to clean up resources</h2>
<p>After results have been retrieved, you’ll want to clean up: remove the Machine, and delete the values stored in Redis.</p>
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-kp6lqxt1"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-kp6lqxt1"><span class="k">def</span> <span class="nf">clean_up</span><span class="p">(</span><span class="n">task_id</span><span class="p">):</span>
    <span class="n">redis_client</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">from_url</span><span class="p">(</span><span class="n">REDIS_URL</span><span class="p">,</span> <span class="n">decode_responses</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">machine_id</span> <span class="o">=</span> <span class="n">redis_client</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">MACHINE_INFO_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="n">requests</span><span class="p">.</span><span class="n">delete</span><span class="p">(</span>
        <span class="sa">f</span><span class="s">"https://api.machines.dev/"</span>
        <span class="sa">f</span><span class="s">"v1/apps/</span><span class="si">{</span><span class="n">FLY_TASKS_APP</span><span class="si">}</span><span class="s">/machines/</span><span class="si">{</span><span class="n">machine_id</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
        <span class="n">headers</span><span class="o">=</span><span class="n">headers</span>
    <span class="p">)</span>
    <span class="n">redis_client</span><span class="p">.</span><span class="n">delete</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">TASKS_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="n">redis_client</span><span class="p">.</span><span class="n">delete</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">RESULTS_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="n">redis_client</span><span class="p">.</span><span class="n">delete</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">MACHINE_INFO_KEY_PREFIX</span><span class="si">}{</span><span class="n">task_id</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
</code></pre>
</div>
</div>
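<p>The three keys deleted here are the same ones built when the task was created, so it can help to derive them in one place. A small sketch of that idea follows; <code>task_keys</code> is a hypothetical helper, and the prefix values are placeholders standing in for whatever the app actually defines.</p>

```python
# Placeholder values: the real prefixes are defined elsewhere in the app.
TASKS_KEY_PREFIX = "tasks:"
RESULTS_KEY_PREFIX = "results:"
MACHINE_INFO_KEY_PREFIX = "machine:"


def task_keys(task_id):
    """Return every Redis key associated with a single task."""
    return {
        "task_info": f"{TASKS_KEY_PREFIX}{task_id}",
        "results": f"{RESULTS_KEY_PREFIX}{task_id}",
        "machine_info": f"{MACHINE_INFO_KEY_PREFIX}{task_id}",
    }
```

<p>Since redis-py’s <code>delete</code> accepts several key names at once, the three calls in <code>clean_up</code> could then collapse into <code>redis_client.delete(*task_keys(task_id).values())</code>.</p>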
<p>And that’s it! Now we have a super-simple implementation of background jobs using Fly Machines. 🎉</p>
<h2 id='whats-next' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#whats-next' aria-label='Anchor'></a>What’s next?</h2>
<p>In this post, I’ve presented a very simple proof-of-concept implementation of background jobs on Fly.io Machines with Python. For some simple apps, you can use this approach as it is, but there’s a lot more you could do without very much effort! Here are some ideas to get you started:</p>
<ul>
<li>Write a generic Python library for this purpose, which could be reused across different apps.
</li><li>Right now there’s a limitation: job request args and results need to be JSON-serializable. This is fine for many use cases, but it could be interesting to explore other alternatives!
</li><li>My implementation uses a separate Docker image for the worker, but you could also use the same image as the web app, similarly to what <a href='https://fly.io/ruby-dispatch/rails-background-jobs-with-fly-machines/' title=''>Brad did in his Ruby background jobs on Machines implementation</a>.
</li><li>Instead of a library, you could create a language-agnostic service. Using <a href='https://fly.io/docs/apps/autostart-stop/' title=''>Fly Machines scale-to-zero capability</a>, this service could avoid incurring unnecessary costs for idle time.
</li></ul>
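<p>To make the JSON limitation above concrete, here is a minimal sketch of checking task arguments before a Machine is launched, so that a bad argument fails fast instead of after a worker has started. The <code>check_json_serializable</code> helper is hypothetical and not part of the demo app.</p>

```python
import json


def check_json_serializable(args=None, kwargs=None):
    """Return the JSON payload for the task arguments, or raise TypeError.

    Mirrors the shape stored in Redis by run_task(): positional args
    as a list and keyword args as a dict.
    """
    payload = {"args": args or [], "kwargs": kwargs or {}}
    try:
        return json.dumps(payload)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"task arguments must be JSON-serializable: {exc}") from exc
```

<p>For example, passing a <code>set</code> or a <code>datetime</code> raises immediately, while strings, numbers, lists, and dicts pass through unchanged.</p>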
/blog/python-async-workers-on-fly-machines/assets/python-async-workers-on-fly-machines-thumbnail.webpVanilla with Candy Sprinkleshttps://fly.io/blog/vanilla-candy-sprinkles/2023-05-18T00:00:00+00:002023-11-21T21:08:37+00:00<p>Recapping where we are to date:</p>
<ul>
<li>There are <a href='https://fly.io/blog/flydotio-heart-js/' title=''>plenty of JavaScript frameworks to choose from</a>, and fly.io loves them all.
</li><li>Pretty much <strong class='font-[550] text-navy-950'>all</strong> of the big name frameworks are
<a href='https://fly.io/blog/js-ecosystem-delightfully-wierd/' title=''>delightfully weird</a>.
</li></ul>
<p>Picking up where we left off, this blog post will describe literally
dozens (and that’s actually an understatement as you will soon see) of
considerably more, dare I say it, <a href='http://vanilla-js.com/' title=''>vanilla</a> frameworks that you can assemble on
your own and deploy to fly.io and elsewhere.</p>
<p>This can be overwhelming, so to make things easier we are going to define a baseline application that will be reimplemented to take advantage of various tools. The result will be:</p>
<ul>
<li>Educational. Seeing a bite-sized working example is a great way to learn how a tool works.
</li><li>Useful starting point. Whereas large frameworks make a number of choices for you, being able to selectively include only the tools you need can provide you with a preconfigured base to build upon.
</li><li>Debugging aid. When a large system doesn’t behave the way you want it to, reproducing the problem on a smaller base not only helps you narrow it down quickly, but also gives you a test case for a bug report.
</li></ul>
<p>Let’s get started!</p>
<h2 id='baseline-requirements' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#baseline-requirements' aria-label='Anchor'></a>Baseline requirements</h2>
<p>What we are looking for is a cross between <a href='https://en.wikipedia.org/wiki/%22Hello,_World!%22_program' title=''>Hello, World!</a> and <a href='https://rosettacode.org/wiki/Rosetta_Code' title=''>Rosetta Code</a>, but for a full stack application. For our purposes, the baseline is a stateful web server. Ideally one that can be deployed around the globe, and can deliver real time updates. But for now we will start small and before you know it we will have grown into the full application.</p>
<p>A simple application that meets these requirements is one that shows a visitors counter. A counter that starts at one, and increments each time you refresh the page, return to the page, or even open the page in another tab, window, browser, or on another machine. It looks something like this:</p>
<p><img alt="welcome counter" src="/images/welcome-counter.webp" /></p>
<p>As <a href='https://fly.io/blog/flydotio-heart-js/#package-json-enters-the-chat' title=''>previously discussed</a>, key to deployment is a <code>package.json</code> file that lists all of your dependencies, optional build instructions, and how to start your application. We are going to start very simple, with no dependencies and no build process, so the <code>package.json</code> file will start out looking like the following:</p>
<div class="highlight-wrapper group relative json">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-3887k8yz"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3887k8yz"><span class="p">{</span><span class="w">
</span><span class="nl">"scripts"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"start"</span><span class="p">:</span><span class="w"> </span><span class="s2">"node server.js"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre>
</div>
</div>
<p>Now to complete this we are going to need not only a <code>server.js</code> file, but also HTML, CSS, and image(s). As with some of the cooking shows you see on television, we are going to skip ahead and pull a completed meal out of the oven. Run the following commands on a machine that has Node.js 16 or later installed:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-cfegwkbh"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-cfegwkbh">mkdir demo
cd demo
npx --yes @flydotio/node-demo@latest
</code></pre>
</div>
</div>
<p>Once these commands complete, you can launch the application locally with <code>npm run start</code>. If you have authenticated and have flyctl version 0.1.6 or later installed, you can deploy it with <code>fly launch</code> followed by <code>fly deploy</code>. When you run <code>fly launch</code>, consider saying <em>yes</em> to deploying Postgres and Redis databases, as we will be using them later.</p>
<figure class="post-cta">
<figcaption>
<h1>You can play with this right now.</h1>
<p>Don’t have Node installed or a fly.io login? Deploy using <a href='https://fly.io/terminal' title=''>Fly.io terminal</a> or see our <a href='https://fly.io/docs/hands-on/' title=''>Hands-on</a> guide that will walk you through the steps.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/about/pricing/#free-allowances">
Try Fly for free <span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-turtle.webp" srcset="/static/images/cta-turtle@2x.webp 2x" alt="">
</div>
</figure>
<p>If you are running it locally, open <code>http://localhost:3000/</code> in your browser. If you have deployed it on fly.io, try <code>fly open</code>. If you are running in a fly.io terminal, there is a handy link you can use in the left-hand pane.</p>
<p>Now take a look at <code>server.js</code>. It is all of 72 lines, including blank lines and comments. In subsequent sections we will show how to make it smaller using available libraries, and how to add features. But before we proceed, let’s save time and keystrokes by installing the node-demo package, which we will use repeatedly to generate variations on this application:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-s30ug48y"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-s30ug48y">npm install @flydotio/node-demo --save-dev
</code></pre>
</div>
</div><h2 id='starting-small' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#starting-small' aria-label='Anchor'></a>Starting small</h2>
<p>If you look at the top of the <code>server.js</code> file you will see a number of calls to <code>require()</code>. These load Node’s <a href='https://nodejs.org/api/modules.html#modules-commonjs-modules' title=''>CommonJS</a> modules. Node also supports <a href='https://nodejs.org/api/esm.html#modules-ecmascript-modules' title=''>ECMAScript</a> modules, which is what all the cool kids are using these days.</p>
<p>This requires opting in. You can let <code>node-demo</code> make the changes for you by running the following command:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-ivjlvyav"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ivjlvyav">npx node-demo --esm
</code></pre>
</div>
</div>
<p>This script will detect what changes need to be made, offer to show you a diff of those changes, and let you accept or reject them. This leads us to the second option: <code>--force</code>, which applies the changes automatically without prompting:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-dsk109cn"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-dsk109cn">npx node-demo --esm --force
</code></pre>
</div>
</div>
<p>Relaunch your application locally using <code>npm run start</code> or redeploy it remotely using <code>fly deploy</code>.</p>
<h2 id='using-a-real-template' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#using-a-real-template' aria-label='Anchor'></a>Using a real template</h2>
<p>Inside the application you can see that the HTML response is produced by reading a template file and replacing a placeholder string with the current count:</p>
<div class="highlight-wrapper group relative javascript">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-zcnoc5pv"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-zcnoc5pv"><span class="nx">contents</span> <span class="o">=</span> <span class="nx">contents</span><span class="p">.</span><span class="nx">replace</span><span class="p">(</span><span class="dl">'</span><span class="s1">@@COUNT@@</span><span class="dl">'</span><span class="p">,</span> <span class="nx">count</span><span class="p">.</span><span class="nx">toString</span><span class="p">());</span>
</code></pre>
</div>
</div>
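<p>As a rough illustration of this placeholder approach, here is a minimal sketch using only built-in string methods. The <code>render</code> function and the sample template are hypothetical names made up for this example, not code taken from the demo application. Note that <code>replace</code> with a string pattern only substitutes the first match, so the sketch uses <code>replaceAll</code> to cover templates that repeat the placeholder:</p>

```javascript
// Sketch of placeholder-based templating with built-in string methods.
// `render` and the sample template are illustrative, not node-demo's code.
function render(template, count) {
  // replaceAll substitutes every occurrence of the placeholder;
  // replace() with a string pattern would only substitute the first one.
  return template.replaceAll('@@COUNT@@', count.toString());
}

const template = '<title>@@COUNT@@ visits</title><h1>Visits: @@COUNT@@</h1>';
console.log(render(template, 42));
```

<p>Anything beyond a single substitution (loops, conditionals, HTML escaping) has to be written by hand with this approach.</p>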
<p>While this is fine for this example, larger projects would be better served with a real template. <code>node-demo</code> supports two such templating engines at the moment: <a href='https://ejs.co/' title=''>ejs</a> and <a href='https://mustache.github.io/' title=''>mustache</a>. Select your favorite, or switch back and forth:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-fcir4ps"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-fcir4ps">npx node-demo --ejs
</code></pre>
</div>
</div>
<p>and</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-kxw4b3r3"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-kxw4b3r3">npx node-demo --mustache
</code></pre>
</div>
</div>
<p>Be sure to add <code>--esm</code> if you want to continue to use <code>import</code> statements.</p>
<h2 id='a-more-substantial-change' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-more-substantial-change' aria-label='Anchor'></a>A more substantial change</h2>
<p>While <code>node:http</code> gives you everything you need to create a capable HTTP server, it leaves you responsible for status codes, MIME types, headers, and other protocol details. <a href='https://expressjs.com/' title=''>express</a> will take care of all of this for you:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-yr1mqe4z"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-yr1mqe4z">npx node-demo --express
</code></pre>
</div>
</div>
<p>Both ejs and mustache have integrations with express. Try switching between the two to see how they differ.</p>
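<p>To give a sense of the protocol details that express takes over, here is a sketch of a bare <code>node:http</code> handler. The <code>handle</code> function, the route, and the response bodies are assumptions made for illustration and are not the demo application's actual code:</p>

```javascript
const http = require('node:http');

// Hypothetical handler: with node:http you choose the status code and
// set the headers yourself for every response.
function handle(req, res) {
  if (req.method !== 'GET') {
    res.writeHead(405, { 'Content-Type': 'text/plain', Allow: 'GET' });
    res.end('Method Not Allowed');
    return;
  }
  if (req.url !== '/') {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
    return;
  }
  const body = '<h1>Hello</h1>';
  res.writeHead(200, {
    'Content-Type': 'text/html; charset=utf-8',
    'Content-Length': Buffer.byteLength(body),
  });
  res.end(body);
}

// createServer does not start listening by itself;
// call server.listen(3000) to actually serve requests.
const server = http.createServer(handle);
```

<p>Every branch has to pick a status code and content type explicitly, which is the bookkeeping a framework handles for you.</p>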
<h2 id='a-real-database' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-real-database' aria-label='Anchor'></a>A real database</h2>
<p>Maintaining a counter in a text file is good enough for a demo, but not suitable for production. Sqlite3 and PostgreSQL are better alternatives:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-hmpm8gpa"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-hmpm8gpa">npx node-demo --sqlite3
</code></pre>
</div>
</div>
<p>and</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-aaiupckr"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-aaiupckr">npx node-demo --postgresql
</code></pre>
</div>
</div>
<p>Sqlite3 is great for development and, when used with <a href='https://fly.io/docs/litefs/' title=''>litefs</a>, is great for deployment too. PostgreSQL can be used in development, and is currently the best choice for production.</p>
<p>To run with PostgreSQL locally, you need to install and start the server and create a database. For macOS:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-cp5lh22b"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-cp5lh22b">brew install postgresql
brew services start postgresql
psql -U postgres -c "drop database if exists $USER;"
psql -U postgres -c "create database $USER;"
export DATABASE_URL=postgresql://$USER:$USER@localhost:5432/$USER
</code></pre>
</div>
</div><h2 id='be-as-weird-as-you-want-to-be' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#be-as-weird-as-you-want-to-be' aria-label='Anchor'></a>Be as weird as you want to be</h2>
<p>The next two options are frankly polarizing. People either love them or hate them. We won’t judge you.</p>
<p>First, <a href='https://tailwindcss.com/' title=''>tailwindcss</a> is a CSS builder that works by parsing the class attributes in your HTML:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-8hauc74h"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-8hauc74h">npx node-demo --tailwindcss
</code></pre>
</div>
</div>
<p>Next is <a href='https://www.typescriptlang.org/' title=''>typescript</a>, which adds type annotations:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-p6sm5k8c"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-p6sm5k8c">npx node-demo --typescript
</code></pre>
</div>
</div>
<p>TypeScript should work with all of the options on this page, in many cases making use of development-only <a href='https://www.npmjs.com/search?q=%40types' title=''>@types</a>. All of this should be handled automatically by <code>node-demo</code>.</p>
<p>Both of these require a build step, which can be run via <code>npm run build</code>. A change to the Dockerfile used to deploy is also required, which can be made using:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-kj9h1alb"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-kj9h1alb">npx dockerfile
</code></pre>
</div>
</div>
<p><a href='https://github.com/fly-apps/dockerfile-node#overview' title=''>dockerfile-node</a> is actually a separate project with its own options for you to explore.</p>
<h3 id='object-relational-mappers-orms' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#object-relational-mappers-orms' aria-label='Anchor'></a>Object Relational Mappers (ORMs)</h3>
<p>Adding databases was the first change we’ve seen that makes the demo application noticeably larger, particularly with PostgreSQL once the code that handles reconnecting to the database after network failures is included. This can be handled by including still more libraries, this time Object Relational Mappers (ORMs). Three popular ones:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-51ulivq2"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-51ulivq2">npx node-demo --drizzle
</code></pre>
</div>
</div>
<p>and</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-hnyqrr5a"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-hnyqrr5a">npx node-demo --knex
</code></pre>
</div>
</div>
<p>and</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-74emnsv2"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-74emnsv2">npx node-demo --prisma
</code></pre>
</div>
</div>
<p>Knex runs just fine with vanilla JavaScript. Prisma can run with vanilla JavaScript, but works better with TypeScript. Drizzle requires TypeScript.</p>
<p>Prisma and Drizzle also require a build step.</p>
<p>A final note: if you switch back and forth between SQLite3 and PostgreSQL, you may end up with migrations generated for the wrong database. If that happens, delete the <code>prisma</code> or <code>src/db/migrations</code> directory and rerun the <code>npx node-demo</code> command to regenerate the migrations.</p>
<h2 id='real-time-updates' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#real-time-updates' aria-label='Anchor'></a>Real Time Updates</h2>
<p>If you open more than one browser window or tab, each will show a different number. This can be addressed by introducing WebSockets:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-tufo2l7v"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-tufo2l7v">npx node-demo --websocket
</code></pre>
</div>
</div>
<p>The server side of WebSockets differs depending on whether or not you are using Express. This is also the first option that provides a client-side script, which is responsible for establishing (and reestablishing) the connection and for updating the <a href='https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction' title=''>DOM</a> when messages are received. This is a chore, and <a href='https://htmx.org/' title=''>htmx</a> is one of the many libraries that can handle it for you:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-gfl4odu5"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-gfl4odu5">npx node-demo --htmx
</code></pre>
</div>
</div>
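<p>If you are curious what that chore looks like when written by hand, here is a minimal sketch of such a client-side script. It is not the script node-demo generates: the <code>/websocket</code> path, the <code>#counter</code> selector, and the backoff numbers are all assumptions made for illustration.</p>

```javascript
// Hypothetical client-side script: keeps a WebSocket open and mirrors
// each message into the page. The "/websocket" path and the "#counter"
// selector are assumptions, not taken from node-demo.

// Delay before reconnect attempt n (0-based): 250ms, 500ms, 1s, ... capped at 10s.
function backoffDelay(attempt, baseMs = 250, maxMs = 10000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function connect(url, onMessage, attempt = 0) {
  const ws = new WebSocket(url);

  ws.addEventListener('open', () => {
    attempt = 0; // connection is healthy again, so reset the backoff
  });

  ws.addEventListener('message', (event) => onMessage(event.data));

  // 'close' also fires after a failed connection attempt, so this one
  // handler covers both dropped and refused connections.
  ws.addEventListener('close', () => {
    setTimeout(() => connect(url, onMessage, attempt + 1), backoffDelay(attempt));
  });

  return ws;
}

// Only run in a browser; elsewhere this file just defines the helpers.
if (typeof document !== 'undefined') {
  const scheme = location.protocol === 'https:' ? 'wss:' : 'ws:';
  connect(`${scheme}//${location.host}/websocket`, (data) => {
    const counter = document.querySelector('#counter');
    if (counter) counter.textContent = data;
  });
}
```

<p>Doubling the delay after each failed attempt keeps a browser from hammering a server that is restarting, and the cap bounds how long a page waits between attempts.</p>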
<p>The next problem is that if you are running multiple servers, each will manage its own pool of WebSockets, so only clients in the same pool will be notified of updates. This can be addressed by using Redis:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-wln8jp06"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-wln8jp06">npx node-demo --redis
</code></pre>
</div>
</div>
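<p>To see why a shared channel fixes this, here is a small sketch of the idea. It assumes nothing about how node-demo wires things up: <code>Broker</code> is an in-memory stand-in for Redis pub/sub, and <code>Server</code> and <code>fakeClient</code> are hypothetical names. Each server publishes updates to the broker instead of writing to its own sockets, and every server relays what it hears to its own pool.</p>

```javascript
// Stand-in for Redis pub/sub so the sketch runs without a Redis server.
class Broker {
  constructor() {
    this.subscribers = new Set();
  }
  subscribe(handler) {
    this.subscribers.add(handler);
  }
  publish(message) {
    for (const handler of this.subscribers) handler(message);
  }
}

// One app server with its own private pool of WebSocket-like clients.
class Server {
  constructor(broker) {
    this.clients = new Set();
    this.broker = broker;
    // Every server listens for updates, including the ones it publishes.
    broker.subscribe((message) => this.broadcastLocal(message));
  }
  addClient(client) {
    this.clients.add(client);
  }
  broadcastLocal(message) {
    for (const client of this.clients) client.send(message);
  }
  // Called when a request handled by this server changes the counter.
  update(count) {
    this.broker.publish(String(count));
  }
}

// Fake client that records what it was sent.
function fakeClient() {
  const received = [];
  return { received, send: (message) => received.push(message) };
}

const broker = new Broker();
const serverA = new Server(broker);
const serverB = new Server(broker);
const alice = fakeClient();
const bob = fakeClient();
serverA.addClient(alice);
serverB.addClient(bob);

// The update happens on server A, yet the client on server B hears about it.
serverA.update(42);
console.log(alice.received, bob.received); // both clients were sent '42'
```

<p>Without the broker, <code>update</code> could only loop over <code>serverA.clients</code>, and the client connected to the other server would keep showing a stale number.</p>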
<p>At this point, if you are using Fly.io, Postgres, and Redis, you can go global:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-oiygy7x0"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-oiygy7x0">fly scale count 8 --region ams,syd,nrt,dfw
</code></pre>
</div>
</div><h2 id='packaging-alternatives' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#packaging-alternatives' aria-label='Anchor'></a>Packaging alternatives</h2>
<p>So far, we have been using <code>npm</code>, but <code>yarn</code> and <code>pnpm</code> are alternatives that may be better for some use cases:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-ub8ai6hl"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-ub8ai6hl">npx node-demo --yarn
</code></pre>
</div>
</div>
<p>and</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-nonsbiou"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-nonsbiou">npx node-demo --pnpm
</code></pre>
</div>
</div>
<p>Each package manager organizes the <code>node_modules</code> directory a bit differently, so for best results when switching, remove the <code>node_modules</code> directory before switching:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-stkw2vg0"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-stkw2vg0">rm -rf node_modules
</code></pre>
</div>
</div>
<p>Windows PowerShell users will want to use the following command instead:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-n3kh4p6h"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-n3kh4p6h">rm -r -fo node_modules
</code></pre>
</div>
</div><h2 id='future-explorations' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#future-explorations' aria-label='Anchor'></a>Future explorations</h2>
<p>While we have explored many options, this only scratches the surface. There are many alternatives to the libraries above, and many more things to explore. Examples:</p>
<ul>
<li><a href='https://react.dev/' title=''>React</a> can be run server side in a number of different ways, and can be run client side using a <a href='https://en.wikipedia.org/wiki/Content_delivery_network' title=''>CDN</a> or self hosted scripts.
</li><li>In addition to React, there are a number of client-side libraries like <a href='https://angularjs.org/' title=''>Angular</a>, <a href='https://lit.dev/' title=''>Lit</a>, <a href='https://www.solidjs.com/' title=''>SolidJS</a>, <a href='https://svelte.dev/' title=''>Svelte</a>, and <a href='https://vuejs.org/' title=''>Vue</a>. Coupling these with bundlers like <a href='https://esbuild.github.io/' title=''>esbuild</a> and <a href='https://rollupjs.org/' title=''>rollup</a>, perhaps in a <a href='https://en.wikipedia.org/wiki/Monorepo' title=''>monorepo</a> using <a href='https://docs.npmjs.com/cli/using-npm/workspaces' title=''>workspaces</a>, would make a good starting point for larger projects.
</li><li>I welcome alternate implementations of this demo, perhaps using decidedly non-vanilla frameworks as a starting point. I’m particularly interested in implementations that support real-time updates and globally distributed replication. If we get enough, perhaps we can maintain a catalog of pointers to these implementations.
</li><li>While this blog post has focused on local development and deployment on Fly.io, there is no lock-in here. Maintaining a catalog of pointers to blog posts that describe how to deploy this application elsewhere would be welcomed too. Again, bonus points for geographic distribution and real-time updates.
</li></ul>
<p><a href='https://github.com/fly-apps/node-demo' title=''>node-demo</a> is open source, so <a href='https://github.com/fly-apps/node-demo/issues' title=''>issues</a>, <a href='https://github.com/fly-apps/node-demo/pulls' title=''>pull requests</a>, and <a href='https://github.com/fly-apps/node-demo/discussions' title=''>discussions</a> are always welcome!</p>
<p>I hope you have found this blog post to be informative, and perhaps some of you will use this information to start your next application “vanilla” with your personal selection of toppings. Yummy!</p>
/blog/vanilla-candy-sprinkles/assets/vanilla-candy-sprinkles-thumb.webpPlaying Traffic Cop with Fly-Replayhttps://fly.io/blog/how-to-fly-replay/2023-05-15T00:00:00+00:002024-02-05T16:04:11+00:00<div class="lead"><p>Fly.io is a platform for compute. You can do a bunch more than just run your average web app! Check out the <a href="/docs/machines/" title="">Machines platform</a> and see how your business might run on Fly.io.</p>
</div>
<p>The <a href='/docs/reference/dynamic-request-routing/' title=''>Fly Replay</a> header is deceptively simple. All your app has to do is respond with a header, and the HTTP request gets re-run somewhere else.</p>
<p>It’s behind the scenes of some pretty interesting apps on Fly.io (we wrote about using it with <a href='/blog/globally-distributed-postgres/' title=''>Globally Distributed Postgres</a>).</p>
<p>We often bring it up when answering questions from those enamored with the <a href='/docs/machines/' title=''>Machines platform</a>.</p>
<p>So, here’s a use case I think is pretty neat.</p>
<h2 id='but-first-what-is-it' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#but-first-what-is-it' aria-label='Anchor'></a>But first: What is it?</h2>
<p>All public network traffic headed into Fly.io goes through the Fly Proxy. The proxy has features!
One of those features involves looking for a <code>fly-replay</code> header in responses.</p>
<p>The <code>fly-replay</code> header tells the Fly Proxy to replay an HTTP request somewhere else. This gives your applications some power.</p>
<p>Depending on the value your app gives the <code>fly-replay</code> header, the Fly Proxy can replay the initial HTTP request on another app, in a different region, on a specific VM, or a mix of those things.
Replays only work between apps within the same Fly.io organization.</p>
<p>Here’s what that looks like.</p>
<p><strong class='font-[550] text-navy-950'>Replay in a Different Region:</strong></p>
<p>I’m going to steal from the <a href='/blog/globally-distributed-postgres/' title=''>Globally Distributed Postgres</a> article (and the corresponding <a href='/docs/postgres/advanced-guides/high-availability-and-global-replication/#detect-write-requests' title=''>docs</a>).</p>
<p>If you have a “leader” database with a bunch of read-replicas, you typically need write queries to go to the leader.</p>
<p>If an HTTP request (e.g. <code>POST /foo</code>) results in writes to your database, then sending that request
to a VM near the leader database has benefits - it’s way faster than opening a DB connection across the globe.</p>
<p>To do this, your application can return a header that looks like this:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-qk03xphi"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-qk03xphi">fly-replay: region=sjc
</code></pre>
</div>
</div>
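<p>As a rough sketch of how an app might decide when to send that header, here is a plain Node.js handler that treats any non-read HTTP method as a write. That heuristic, the 409 status, and the <code>replayTarget</code> helper are assumptions for illustration (the linked docs cover detecting write requests in more detail), and the code expects <code>FLY_REGION</code> and <code>PRIMARY_REGION</code> to be set in the environment.</p>

```javascript
// Assumptions: FLY_REGION is the region this VM runs in and PRIMARY_REGION
// is where the leader database lives. The fallbacks are only for local runs.
const currentRegion = process.env.FLY_REGION || 'local';
const primaryRegion = process.env.PRIMARY_REGION || 'sjc';

// Returns the fly-replay value to send, or null to handle the request here.
// Anything that is not a read-only method is treated as a potential write.
function replayTarget(method, current, primary) {
  const readOnly = ['GET', 'HEAD', 'OPTIONS'].includes(method);
  return !readOnly && current !== primary ? `region=${primary}` : null;
}

// Request handler with the (req, res) shape used by Node's http module.
function handler(req, res) {
  const target = replayTarget(req.method, currentRegion, primaryRegion);
  if (target) {
    // Ask the Fly Proxy to replay this request near the leader database.
    // The 409 status is an arbitrary choice for this sketch.
    res.writeHead(409, { 'fly-replay': target });
    res.end();
    return;
  }
  res.writeHead(200, { 'content-type': 'text/plain' });
  res.end(`handled in ${currentRegion}\n`);
}

console.log(replayTarget('POST', 'ams', 'sjc')); // region=sjc
console.log(replayTarget('GET', 'ams', 'sjc'));  // null
console.log(replayTarget('POST', 'sjc', 'sjc')); // null
```

<p>To try it, pass <code>handler</code> to <code>http.createServer</code> from <code>node:http</code>. Requests that arrive in the primary region never get the header, so a replayed request is not replayed again.</p>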
<p><strong class='font-[550] text-navy-950'>Replay in Other Apps:</strong></p>
<p>You may have a bunch of apps - perhaps because each of your customers gets an app, or you have some microservices, or whatever crazy scheme you trapped yourself into.</p>
<p>You can route requests to specific apps:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-t2n8r0e4"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-t2n8r0e4">fly-replay: app=some-app
</code></pre>
</div>
</div>
<p><strong class='font-[550] text-navy-950'>Replay in Specific VMs:</strong></p>
<p>Maybe you want requests to go to specific VMs! I’ve used this to make sure requests after a file upload landed on the same server.</p>
<p>The <code>fly-replay</code> header was a quick way to accomplish that:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-jm2rea16"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-jm2rea16">fly-replay: instance=00bb33ff
</code></pre>
</div>
</div>
<p>Since Machines can scale down to zero (stop on exit), you can also use this as a tricky way to wake them up - just ship them an HTTP request!</p>
<div class="callout"><p>There’s more you can do than just these examples, so definitely <a href="https://fly.io/docs/reference/dynamic-request-routing/" title="">RTFM</a>.</p>
</div><h2 id='something-about-a-traffic-cop' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#something-about-a-traffic-cop' aria-label='Anchor'></a>Something about a traffic cop?</h2>
<p>We’re going to make a “proxy” - a little app that just responds with a <code>fly-replay</code> header. It’ll tell the Fly Proxy to replay the HTTP request on a different app.</p>
<p>This is useful if you, for example, point <code>*.example.org</code> to that router and have a specific app respond to a request - perhaps based on the hostname.</p>
<p>This particular use case of mine is a bit like a load balancer - a “reverse proxy”, but with some code instead of configuration.</p>
<p>I like Go for HTTP plumbing, so let’s do some of that. We’re going to write the type of “toy” app that accidentally stays in production for 14 years. </p>
<p>This “proxy” app will check the request hostname against a database of known apps, and route the request as needed.</p>
<p>The full(ish) <a href='https://github.com/fly-apps/fly-replay-proxy-example' title=''>code is here</a>.</p>
<h3 id='its-basically-just-this' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#its-basically-just-this' aria-label='Anchor'></a>It’s basically just this</h3>
<p>The important logic is this bit of standard Go HTTP stuff:</p>
<div class="highlight-wrapper group relative go">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-pdem8hlr"><span class="n">http</span><span class="o">.</span><span class="n">HandleFunc</span><span class="p">(</span><span class="s">"/"</span><span class="p">,</span> <span class="k">func</span><span class="p">(</span><span class="n">w</span> <span class="n">http</span><span class="o">.</span><span class="n">ResponseWriter</span><span class="p">,</span> <span class="n">r</span> <span class="o">*</span><span class="n">http</span><span class="o">.</span><span class="n">Request</span><span class="p">)</span> <span class="p">{</span>
<span class="c">// We'll find a customer based on `r.Host`</span>
<span class="n">customer</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">Find</span><span class="p">(</span><span class="n">r</span><span class="o">.</span><span class="n">Host</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
<span class="c">// Customer not found</span>
<span class="n">w</span><span class="o">.</span><span class="n">Header</span><span class="p">()</span><span class="o">.</span><span class="n">Set</span><span class="p">(</span><span class="s">"fly-replay"</span><span class="p">,</span> <span class="s">"app=our-default-app"</span><span class="p">)</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="c">// Replay the request on customer's app</span>
<span class="n">w</span><span class="o">.</span><span class="n">Header</span><span class="p">()</span><span class="o">.</span><span class="n">Set</span><span class="p">(</span><span class="s">"fly-replay"</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Sprintf</span><span class="p">(</span><span class="s">"app=%s"</span><span class="p">,</span> <span class="n">customer</span><span class="o">.</span><span class="n">App</span><span class="p">))</span>
<span class="p">}</span>
<span class="p">})</span>
</code></pre>
</div>
</div>
<p>Go’s HTTP library treats a pattern ending in <code>/</code> as a prefix match on the URL path, so <code>"/"</code> will match anything, which is just what we want.</p>
<p>All we do is find a customer (based on hostname) and respond with a replay header.</p>
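<p>One thing to watch with hostname lookups: <code>r.Host</code> can arrive as <code>host:port</code> and in mixed case. The sketch below shows one way to tidy it up before querying. The <code>normalizeHost</code> helper is hypothetical and is not part of the example repo.</p>

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeHost is a hypothetical helper that strips an optional port and
// lowercases the hostname so lookups match however the client spelled it.
func normalizeHost(host string) string {
	// SplitHostPort returns an error when there is no port; in that case
	// we keep the value as-is.
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	return strings.ToLower(host)
}

func main() {
	fmt.Println(normalizeHost("C1.Example.org:8080"))
	fmt.Println(normalizeHost("c1.example.org"))
}
```

<p>With something like this in place, the handler would call <code>Find(normalizeHost(r.Host))</code> instead of passing <code>r.Host</code> straight through.</p>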
<p>This is great when paired with a SQLite database, as (trigger warning) reads from the local disk are pretty quick relative to network stuff.</p>
<p>The <code>Find</code> function is just a SQL query (but super verbose, because Golang):</p>
<div class="highlight-wrapper group relative go">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-41xbo6dp"><span class="k">func</span> <span class="n">Find</span><span class="p">(</span><span class="n">host</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">Customer</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
<span class="n">row</span> <span class="o">:=</span> <span class="n">db</span><span class="o">.</span><span class="n">QueryRow</span><span class="p">(</span><span class="s">`SELECT id, host, app, instance
FROM customers
WHERE host = ?`</span><span class="p">,</span> <span class="n">host</span><span class="p">)</span>
<span class="n">customer</span> <span class="o">:=</span> <span class="n">Customer</span><span class="p">{}</span>
<span class="n">err</span> <span class="o">:=</span> <span class="n">row</span><span class="o">.</span><span class="n">Scan</span><span class="p">(</span>
<span class="o">&</span><span class="n">customer</span><span class="o">.</span><span class="n">Id</span><span class="p">,</span>
<span class="o">&</span><span class="n">customer</span><span class="o">.</span><span class="n">Host</span><span class="p">,</span>
<span class="o">&</span><span class="n">customer</span><span class="o">.</span><span class="n">App</span><span class="p">,</span>
<span class="o">&</span><span class="n">customer</span><span class="o">.</span><span class="n">Instance</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
<span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"no customer found: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">return</span> <span class="o">&</span><span class="n">customer</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre>
</div>
</div>
<p>Locally, the whole round trip of the HTTP request + database lookup took ~4ms. In the real world, it added ~100ms to hit this proxy and replay the request against another Fly.io app (my crufty blog).</p>
<p>To test this out, I ran a few <code>curl</code> requests:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-qjbcdeei"><span class="c"># Get replayed against the default app</span>
curl <span class="nt">-i</span> <span class="nt">-H</span> <span class="s2">"Host: fake.fideloper.com"</span> https://proxycentral.fly.dev
<span class="c"># Get replayed against an app that is registered</span>
curl <span class="nt">-i</span> <span class="nt">-H</span> <span class="s2">"Host: c1.fideloper.com"</span> https://proxycentral.fly.dev
</code></pre>
</div>
</div><h2 id='preventing-direct-access' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#preventing-direct-access' aria-label='Anchor'></a>Preventing Direct Access</h2>
<p>In this scenario, we want the “proxy” app to be available publicly, while keeping customer apps private.</p>
<p>However, the Fly Proxy needs to know where apps are listening when it directs HTTP requests to them.
Therefore, we need to define <a href='https://fly.io/docs/reference/configuration/#the-services-sections' title=''>services</a> in the <code>fly.toml</code> file.</p>
<div class="callout"><p>You also might be dynamically creating apps, in which case you don’t need a <code>fly.toml</code> file, but will be defining services via <a href="https://fly.io/docs/machines/working-with-machines/" title="">Machine API</a> calls.</p>
</div>
<p>Luckily, we can keep the apps private while still telling the Fly Proxy how to reach them.
The easiest way is to create the app without any public IP addresses via the <code>fly launch</code> command:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-hwc8ztzz"><span class="nb">cd</span> /path/to/my/app
<span class="c"># Ensure we're using appsv2 (Machines) platform</span>
<span class="c"># Older accounts default to an older platform</span>
fly orgs apps-v2 default-on <org-slug>
fly launch <span class="nt">--no-public-ips</span>
</code></pre>
</div>
</div>
<p>The flag <code>--no-public-ips</code> is the key there. However, it requires the newer Machines-based apps platform. Also, if you’re creating apps via the Machines API, having no public IPs is the default.</p>
<p>Now the customer apps are private, <em>and</em> the Fly Proxy can still replay requests against them.</p>
<h2 id='app-discovery' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#app-discovery' aria-label='Anchor'></a>App Discovery</h2>
<p>I used a SQLite database to map domains to apps. If this proxy ran globally, I could have used <a href='https://fly.io/blog/introducing-litefs/' title=''>LiteFS</a> for distributed SQLite across multiple regions.</p>
<p>Another fun possibility is (ab)using Fly’s <a href='https://fly.io/docs/reference/private-networking/#fly-internal-addresses' title=''>.internal addresses</a> to check for the existence of apps (or application instances) via DNS.</p>
<p>Perhaps we could have pinged this occasionally and created/updated an in-memory map of apps and hostnames! Here are two DNS queries that would have been useful for that:</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-rwcjenge"><span class="c"># List apps in the same org</span>
dig TXT _apps.internal
<span class="c"># List apps and their VM instances</span>
<span class="c"># in the same org</span>
dig TXT _instances.internal
</code></pre>
</div>
</div>
<p>So this is pretty neat! The <code>fly-replay</code> header is a simple solution that gives you the ability to do some really neat stuff -
particularly within globally distributed apps.</p>
/blog/how-to-fly-replay/assets/replay-router-thumb.webpThe JavaScript Ecosystem is Delightfully Weirdhttps://fly.io/blog/js-ecosystem-delightfully-wierd/2023-05-11T00:00:00+00:002024-02-05T16:04:11+00:00<p>Note, I’m not saying that JavaScript is weird, though it <a href='https://wtfjs.com/' title=''>definitely is weird</a>. But that’s not the point of this blog post.</p>
<p>Bear with me: instead of starting with <em>how</em> the JavaScript ecosystem is weird, I’m going to start with <em>why</em> the JavaScript ecosystem is weird.</p>
<h2 id='historical-background' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#historical-background' aria-label='Anchor'></a>Historical Background</h2>
<p>Less than 10 years ago, JavaScript sucked bad. It had no imports, no classes, no async, no arrow functions, no template literals, no destructuring assignment, no default parameters, no rest/spread parameters. And the environment it predominantly ran in, namely the browser’s DOM, sucked too. <a href='https://jquery.com/' title=''>jQuery</a> made it suck less. It still sucked, but was — at that point in time — relatively sane.</p>
<p>Bundling JS to run in the browser was the first sign of weirdness. In that process you would also want to both minify and tree shake the source, and perhaps even code split. In general the process involved reading a number of JavaScript sources as input and then producing one or more JavaScript sources as output. This meant that the code you were executing wasn’t the code you wrote. <a href='https://sourcemaps.info/spec.html' title=''>Sourcemaps</a> helped.</p>
<p>Then <a href='https://coffeescript.org/' title=''>CoffeeScript</a> came along. Instead of writing in JavaScript, you would write in a language which was compiled into JavaScript. This is a bit different from languages like <a href='https://elixir-lang.org/' title=''>Elixir</a> and <a href='https://kotlinlang.org/' title=''>Kotlin</a>, which compile into the same byte codes as another language; CoffeeScript actually compiles into the other language. C++ started out this way.</p>
<p>Then came ECMAScript 6 in 2015. JavaScript improved rapidly in the next few years. This eventually mostly displaced CoffeeScript, but presented a different problem: for a while the implementations were not keeping up, so <em>transpilers</em> like <a href='https://babeljs.io/' title=''>Babel</a> came along that compiled current and future versions of JavaScript into older versions of JavaScript that ran on supported environments. Currently <a href='https://esbuild.github.io/' title=''>esbuild</a> is rapidly rising in popularity as a JavaScript bundler/transpiler.</p>
<p>Along the way, <a href='https://emscripten.org/' title=''>Emscripten</a> came along, which compiled LLVM bitcode (produced from languages like C and C++) into a subset of JavaScript, though these days the target for this tool is generally <a href='https://webassembly.org/' title=''>Wasm</a>.</p>
<p>Lately the pace of innovation in JavaScript has slowed, and JavaScript implementations are doing a better job of keeping up, so you would think that the need for transpilers would be waning, particularly on the server side where there is no need for bundlers. But that’s not happening. And the reason why is an interesting story.</p>
<h2 id='nobody-writes-javascript-any-more' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#nobody-writes-javascript-any-more' aria-label='Anchor'></a>Nobody Writes JavaScript Any More</h2>
<p>OK, the title above is clearly hyperbole, but I’ll describe a number of the many ways that people aren’t writing JavaScript any more.</p>
<p>If you write a Rails application, you write it in Ruby. If you write a Django application, you write it in Python. Phoenix, Elixir. Laravel, PHP. Rails gets a lot of flak for doing magic using metaprogramming, and Elixir has macros, but all of the above stay within the boundaries of what can be done by the language.</p>
<p>JavaScript, however, is different. While it nominally is standardized by <a href='https://tc39.es/' title=''>ECMA TC39</a>, if you are using a popular framework like <a href='https://nextjs.org/' title=''>Next.js</a>, <a href='http://remix.run/' title=''>Remix</a>, or <a href='https://svelte.dev/' title=''>Svelte</a> you are <strong class='font-[550] text-navy-950'>not</strong> coding in <a href='https://www.ecma-international.org/publications-and-standards/standards/ecma-262/' title=''>ECMAScript</a> as standardized by ECMA TC39. Four examples:</p>
<ul>
<li>Once upon a time, nearly 20 years ago, the ECMA committee standardized <a href='https://www-archive.mozilla.org/js/language/ECMA-357.pdf' title=''>E4X</a> that enabled XML to be treated as a data type. This lost favor, got deprecated and archived. Years later what once was Facebook (now Meta) had a similar need and invented <a href='https://facebook.github.io/jsx/' title=''>JSX</a>. It differs from E4X in that it compiles into JS.
</li><li>One thing that ECMA TC39 has never standardized is type annotations. Undeterred, Microsoft did it anyway with <a href='https://www.typescriptlang.org/' title=''>TypeScript</a>. It, too, compiles into JS.
</li><li>Svelte has their own <a href='https://www.npmjs.com/package/svelte' title=''>compiler</a> that even <a href='https://svelte.dev/docs#component-format-script-3-$-marks-a-statement-as-reactive' title=''>deliberately misuses the JavaScript label syntax</a> to enable marking a statement as reactive.
</li><li>It doesn’t stop there. When a bundler/transpiler encounters an import statement, it doesn’t necessarily presume that the file being imported is JavaScript or even any of the variants mentioned above. If configured properly and you want to import a CSS or PNG file, it will <a href='https://esbuild.github.io/content-types/' title=''>happily do so for you</a>.
</li></ul>
<p>I mentioned earlier that Rails gets a lot of flak for its use of metaprogramming. Nobody bats an eye at any of the “abuses” of the JavaScript language mentioned above. The JavaScript ecosystem is a Big Tent party.</p>
<h2 id='quot-use-server-quot' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#quot-use-server-quot' aria-label='Anchor'></a>“use server”;</h2>
<p>The latest <a href='https://github.com/reactjs/rfcs/blob/main/text/0188-server-components.md#adoption-strategy' title=''>abuse of the bundler</a> is by <a href='https://legacy.reactjs.org/blog/2020/12/21/data-fetching-with-react-server-components.html' title=''>React Server Components (RSC)</a>. First <a href='https://github.com/reactjs/server-components-demo' title=''>demoed with express</a>, it is now adopted by <a href='https://nextjs.org/docs/getting-started/react-essentials#server-components' title=''>Next.js</a>.</p>
<p>What <code>"use server"</code> and <code>"use client"</code> do, other than being valid JavaScript statements that do absolutely nothing, is change the meaning of the code that follows them. This has gotten mixed reviews, but in my mind is very much in the spirit of <code>"use strict"</code>, which also changes the meaning of the code that follows.</p>
<p>While JSX often compiles to JS, the <a href='https://react.dev/reference/react-dom/server' title=''>Server React DOM APIs</a> enable compilation to HTML. RSC goes a different way, and compiles into a <a href='https://www.plasmic.app/blog/how-react-server-components-work' title=''>stream of tagged JSON</a>. This is all very transparent to you, but what it does enable is a <a href='https://twitter.com/levelsio/status/1654053489004417026' title=''>different style of programming. One that many are comparing to PHP</a> and even Rails:</p>
<p><a style="display: inline" href="https://twitter.com/wobsoriano/status/1654181584357019649"><img style="display: inline; width: 200px; margin: 0.5em" src="https://pbs.twimg.com/media/FvTVW3taUAEc1PM?format=jpg&name=medium"></a>
<a style="display: inline" href="https://twitter.com/jaredpalmer/status/1654178077356720164"><img style="display: inline; width: 200px; margin: 0.5em" src="https://pbs.twimg.com/media/FvTSI9TacAAQ0rg?format=png&name=medium"></a>
<a style="display: inline" href="https://twitter.com/jeremyopendata/status/1654297213781131266"><img style="display: inline; width: 200px; margin: 0.5em" src="https://pbs.twimg.com/media/FvU-T0iWwAIj6i5?format=jpg&name=medium"></a></p>
<p>It is not clear to me whether these comparisons are meant in a positive way, but I will say that from my perspective it is a very good thing.</p>
<p>From a Fly.io perspective, RSC enabling an <a href='https://github.com/reactjs/rfcs/blob/main/text/0188-server-components.md#update-refetch-sequence' title=''>Update (Refetch) Sequence</a> is very much of interest. We’ve always been especially shiny for frameworks that benefit from geographic distribution, like Elixir’s <a href='https://fly.io/blog/how-we-got-to-liveview/' title=''>LiveView</a>, Laravel’s <a href='https://laravel-livewire.com/' title=''>Livewire</a> and Ruby on Rails’ <a href='https://hotwired.dev/' title=''>Hotwire</a>. We want those kinds of frameworks to succeed, because the better they do, the more valuable we are. Now we can add React’s RSC to that list.</p>
<p>Returning to the topic at hand, the fact that such a feature is only made possible through cooperation with bundlers — a statement tantamount to saying a change to the JavaScript language itself — is profound and, dare I say it, delightful.</p>
<h2 id='another-dimension' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#another-dimension' aria-label='Anchor'></a>Another Dimension</h2>
<p>Dan Abramov gave a talk at RemixConf entitled <u>React from Another Dimension</u>:</p>
<div class="youtube-container">
<div class="youtube-video"><iframe width="100%" height="100%" src="https://www.youtube.com/embed/wobP9yhrmhQ?start=21085&end=23591" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="">
</iframe>
</div>
</div>
<p>In Dan’s talk he imagines an alternate universe in which React was first implemented in the late 90s on the server and still managed to converge to where it is today. During the talk he launches a Windows 95 emulator and runs Internet Explorer (specifically, IE6) with React. He even manages to get nine out of ten steps working using that operating system and browser combination.</p>
<p>The mind-bending parts of this presentation are where he first utilizes <code>use server</code> to implement a client-side form action, and then later launches a client-side alert from the server using <code>use client</code>.</p>
<p>And he closes by saying that this requires new generation routers and new generation bundlers.</p>
<p>And to think all of this is made possible by the fact that the JavaScript you write not only isn’t the JavaScript you run, but under closer examination isn’t even JavaScript at all.</p>
/blog/js-ecosystem-delightfully-wierd/assets/js-weird-cover.webpDeploying LangChain to Fly.iohttps://fly.io/blog/deploying-langchain-to-fly-io/2023-05-08T00:00:00+00:002024-02-05T16:04:11+00:00<div class="lead"><p>In this post we deploy a minimal LangChain app to Fly.io using Flask. Check it out: <a href="https://fly.io/docs/languages-and-frameworks/python/" title="">you can be up and running on Fly.io in just minutes.</a></p>
</div>
<p>I hear about Large Language Models (LLMs) everywhere these days! Do you? 🤔</p>
<p>LLMs are a type of natural language processing (NLP) technology that uses advanced deep learning techniques to generate human-like language. If you haven’t heard about LLMs, you’ve probably heard of one of the most notable examples today: <a href='https://openai.com/blog/chatgpt' title=''>ChatGPT</a>. ChatGPT is a language model developed by OpenAI. It was trained on a large amount of text data, which allows it to pick up patterns in language and generate responses to inputs.</p>
<p><a href='https://github.com/hwchase17/langchain' title=''>LangChain</a> is a Python framework that rapidly gained popularity. It was launched as an open source project in October 2022 - yes, a few months ago. This framework was designed to simplify the creation of powerful applications by providing ways to interact with LLMs.</p>
<p>I recently created a <a href='https://github.com/fly-apps/hello-fly-langchain' title=''>minimal application</a> using LangChain and deployed it to Fly.io. This article aims to share the process of how to <a href='https://python.langchain.com/en/latest/deployments.html#fly-io' title=''>deploy this minimal LangChain app to Fly.io using Flask</a>.</p>
<p><a href='https://github.com/pallets/flask/' title=''>Flask</a> is a Python <em>micro</em> framework for building web applications. That’s perfect for our example since it’s designed to make getting started quick and easy. That’s all we need for now.</p>
<p>Let’s get to it! 😎</p>
<h2 id='langchain-models' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#langchain-models' aria-label='Anchor'></a>LangChain Models 🦜 🔗</h2>
<p>LangChain provides an interface to interact with several LLMs.</p>
<p>The <a href='https://github.com/fly-apps/hello-fly-langchain' title=''>template</a> uses the <code>OpenAI</code> LLM wrapper, which, at the time I’m writing this article, uses the <code>text-davinci-003</code> <a href='https://github.com/hwchase17/langchain/blob/master/langchain/llms/openai.py#LL127C15' title=''>model by default</a> - this model belongs to the <a href='https://platform.openai.com/docs/models/gpt-3-5' title=''>GPT-3.5</a> family. Keep in mind that there are more capable and less expensive alternatives, like <code>gpt-3.5-turbo</code>, which is the one recommended by OpenAI because of its lower cost. However, we won’t get into that in this article.</p>
<p>Language models take text as input. This text is what we usually refer to as a <strong class='font-[550] text-navy-950'>prompt</strong>. LangChain facilitates the use of those prompts. To make things a bit more interesting, the template makes use of the <code>PromptTemplate</code>, which combines a fixed question with an input received from the user.</p>
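<p>To get a feel for what a prompt template does, here is a minimal sketch that uses only the Python standard library. It is not LangChain’s implementation: <code>SimplePrompt</code> is a hypothetical name, and it only mimics the idea of declaring input variables and filling them into a template string.</p>

```python
# Minimal stand-in for a prompt template, standard library only.
# SimplePrompt is a hypothetical class, NOT part of LangChain.
from dataclasses import dataclass


@dataclass
class SimplePrompt:
    template: str
    input_variables: list

    def format(self, **values):
        # Fail early if a declared variable was not supplied.
        missing = [name for name in self.input_variables if name not in values]
        if missing:
            raise KeyError(f"missing prompt variables: {missing}")
        return self.template.format(**values)


prompt = SimplePrompt(
    template="What are the 3 best places to eat in {place}?",
    input_variables=["place"],
)
question = prompt.format(place="Berlin")
print(question)
```

<p>The resulting string is what gets sent to the language model as the prompt.</p>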
<h2 id='our-application' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#our-application' aria-label='Anchor'></a>Our Application 🍽</h2>
<p>Our minimal application receives a <code>place</code> (city, country, etc.) as an input and gives us 3 options for where to eat in that <code>place</code>. The default value for <code>place</code> is <code>Berlin</code>.</p>
<p>Our prompt:</p>
<blockquote>
<p><code>What are the 3 best places to eat in &lt;place&gt;?</code></p>
</blockquote>
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-56c37t6n"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-56c37t6n"><span class="c1"># hello.py
</span><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">flask</span> <span class="kn">import</span> <span class="n">Flask</span><span class="p">,</span> <span class="n">render_template</span>
<span class="kn">import</span> <span class="nn">openai</span>
<span class="kn">from</span> <span class="nn">langchain.llms</span> <span class="kn">import</span> <span class="n">OpenAI</span>
<span class="kn">from</span> <span class="nn">langchain.prompts</span> <span class="kn">import</span> <span class="n">PromptTemplate</span>
<span class="n">app</span> <span class="o">=</span> <span class="n">Flask</span><span class="p">(</span><span class="n">__name__</span><span class="p">)</span>
<span class="n">openai</span><span class="p">.</span><span class="n">api_key</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">getenv</span><span class="p">(</span><span class="s">"OPENAI_API_KEY"</span><span class="p">)</span>
<span class="o">@</span><span class="n">app</span><span class="p">.</span><span class="n">route</span><span class="p">(</span><span class="s">"/"</span><span class="p">)</span>
<span class="o">@</span><span class="n">app</span><span class="p">.</span><span class="n">route</span><span class="p">(</span><span class="s">"/&lt;place&gt;"</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">hello</span><span class="p">(</span><span class="n">place</span><span class="o">=</span><span class="s">"Berlin"</span><span class="p">):</span>
    <span class="n">llm</span> <span class="o">=</span> <span class="n">OpenAI</span><span class="p">(</span><span class="n">temperature</span><span class="o">=</span><span class="mf">0.9</span><span class="p">)</span>
    <span class="n">prompt</span> <span class="o">=</span> <span class="n">PromptTemplate</span><span class="p">(</span>
        <span class="n">input_variables</span><span class="o">=</span><span class="p">[</span><span class="s">'place'</span><span class="p">],</span>
        <span class="n">template</span><span class="o">=</span><span class="s">"What are the 3 best places to eat in {place}?"</span><span class="p">,</span>
    <span class="p">)</span>
    <span class="n">question</span> <span class="o">=</span> <span class="n">prompt</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">place</span><span class="o">=</span><span class="n">place</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">render_template</span><span class="p">(</span>
        <span class="s">'hello.html'</span><span class="p">,</span>
        <span class="n">place</span><span class="o">=</span><span class="n">place</span><span class="p">,</span>
        <span class="n">answer</span><span class="o">=</span><span class="n">llm</span><span class="p">(</span><span class="n">question</span><span class="p">).</span><span class="n">split</span><span class="p">(</span><span class="s">"</span><span class="se">\n\n</span><span class="s">"</span><span class="p">)</span>
    <span class="p">)</span>
</code></pre>
</div>
</div>
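<p>The last step of <code>hello.py</code> splits the model’s answer on blank lines so the page can show each chunk separately. Here is a small sketch of that step on its own. The sample text is made up for illustration (a real completion will differ), and dropping empty chunks is an extra assumption that <code>hello.py</code> itself doesn’t make.</p>

```python
# Hypothetical completion text; a real answer from the model will differ.
sample_answer = (
    "1. First Place: known for its currywurst.\n\n"
    "2. Second Place: a cosy spot for brunch.\n\n"
    "3. Third Place: popular for late-night food."
)


def to_paragraphs(text):
    # Split on blank lines and drop empty chunks, which show up when
    # the text starts or ends with newlines.
    return [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]


# Completions often begin with newlines, so we simulate that here.
paragraphs = to_paragraphs("\n\n" + sample_answer)
for paragraph in paragraphs:
    print(paragraph)
```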
<p>You can set your own input variable by passing it in the URL:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-nyi5bpkk"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-nyi5bpkk">http://127.0.0.1:5000/&lt;place&gt;
</code></pre>
</div>
</div>
<p>For example:</p>
<ul>
<li>Country: <code>http://127.0.0.1:5000/norway</code>
</li><li>City: <code>http://127.0.0.1:5000/prague</code>
</li></ul>
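<p>Places with spaces or other special characters need to be percent-encoded before they go into the URL. Here is a small sketch that builds such URLs with the standard library. <code>place_url</code> is a hypothetical helper, and the base URL assumes the Flask development server is running on its default host and port.</p>

```python
from urllib.parse import quote

# Assumes the Flask development server on its default host and port.
BASE_URL = "http://127.0.0.1:5000"


def place_url(place):
    # safe="" percent-encodes every reserved character, including "/".
    return f"{BASE_URL}/{quote(place, safe='')}"


print(place_url("prague"))
print(place_url("New York"))
```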
<p>To illustrate, we use the <code>hello.html</code> template to display the results in the browser.</p>
<p>So, let’s start at the beginning…</p>
<h2 id='setting-up' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#setting-up' aria-label='Anchor'></a>Setting up ⚒️</h2>
<p>We assume the initial setup is already done and you have <a href='https://www.python.org/downloads/' title=''>Python</a> installed. It’s recommended to use the latest version of Python. We are using <a href='https://flask.palletsprojects.com/en/2.3.x/installation/' title=''>Flask 2.2.3</a> and it supports Python 3.8 and newer.</p>
<p>Create and enter your project’s folder:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-pw5y1ato"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-pw5y1ato">mkdir my-fly-langchain
</code></pre>
</div>
</div><div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-w3yk9z6p"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-w3yk9z6p">cd my-fly-langchain
</code></pre>
</div>
</div>
<p>We can go ahead and clone the repository inside your project’s folder using either</p>
<p>HTTPS:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-9jkfs4ow"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-9jkfs4ow">git clone https://github.com/fly-apps/hello-fly-langchain.git .
</code></pre>
</div>
</div>
<p>or SSH:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-g9423ej5"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-g9423ej5">git clone git@github.com:fly-apps/hello-fly-langchain.git .
</code></pre>
</div>
</div><h3 id='virtual-environment' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#virtual-environment' aria-label='Anchor'></a>Virtual Environment</h3>
<p>Choose a virtual environment to manage our dependencies. For simplicity, we’re using <code>venv</code> for this project. Inside your project, create and activate it:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-95z2u4u6"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-95z2u4u6"><span class="c"># Unix/macOS</span>
python3 <span class="nt">-m</span> venv venv
<span class="nb">source </span>venv/bin/activate
<span class="o">(</span>venv<span class="o">)</span> <span class="err">$</span>
<span class="c"># Windows</span>
py <span class="nt">-3</span> <span class="nt">-m</span> venv venv
venv<span class="se">\S</span>cripts<span class="se">\a</span>ctivate
<span class="o">(</span>venv<span class="o">)</span> <span class="err">$</span>
</code></pre>
</div>
</div>
<p>From this point on, the commands won’t be displayed with <code>(venv) $</code> but we assume you have your Python virtual environment <strong class='font-[550] text-navy-950'>activated</strong>.</p>
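<p>If you’re ever unsure whether the environment is active, you can ask Python itself. This is a small sketch; <code>in_virtualenv</code> is a hypothetical helper name, and it relies on the fact that inside a <code>venv</code> the interpreter’s <code>sys.prefix</code> differs from <code>sys.base_prefix</code>.</p>

```python
import sys


def in_virtualenv():
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```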
<h3 id='install-dependencies-from-requirements-txt' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#install-dependencies-from-requirements-txt' aria-label='Anchor'></a>Install Dependencies from <code>requirements.txt</code></h3>
<p>For this minimal example, we have a few dependencies to be installed:</p>
<div class="highlight-wrapper group relative python">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-w4c0g4ah"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-w4c0g4ah"><span class="c1"># requirements.txt
</span><span class="n">Flask</span><span class="o">==</span><span class="mf">2.2</span><span class="p">.</span><span class="mi">3</span>
<span class="n">gunicorn</span><span class="o">==</span><span class="mf">20.1</span><span class="p">.</span><span class="mi">0</span>
<span class="n">langchain</span><span class="o">==</span><span class="mf">0.0</span><span class="p">.</span><span class="mi">148</span>
<span class="n">openai</span><span class="o">==</span><span class="mf">0.27</span><span class="p">.</span><span class="mi">4</span>
<span class="n">python</span><span class="o">-</span><span class="n">dotenv</span><span class="o">==</span><span class="mf">1.0</span><span class="p">.</span><span class="mi">0</span>
</code></pre>
</div>
</div>
<p>Go ahead and install them by running:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-k9of5a3q"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-k9of5a3q">python -m pip install -r requirements.txt
</code></pre>
</div>
</div>
<p>We are using the <code>Flask</code>, <code>langchain</code> and <code>openai</code> packages as minimal requirements for this example. <code>gunicorn</code> (Green Unicorn) is the pure Python WSGI server we will use in production instead of the built-in development server - other options can be found <a href='https://flask.palletsprojects.com/en/2.2.x/deploying/#self-hosted-options' title=''>here</a>. Finally, we use <code>python-dotenv</code> to load the environment variables set in the <code>.env</code> file - more about that in the next section.</p>
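<p>To double-check that the installed packages match the pinned versions, something like the sketch below can help. <code>parse_pins</code> and <code>check_pins</code> are hypothetical helpers. The parser only understands simple <code>name==version</code> lines, which is all our <code>requirements.txt</code> contains.</p>

```python
from importlib import metadata


def parse_pins(requirements_text):
    # Collect "name==version" pins, ignoring comments and blank lines.
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    # Map each package to (pinned version, installed version or None).
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


example = "# requirements.txt\nFlask==2.2.3\ngunicorn==20.1.0\n"
for name, (wanted, installed) in check_pins(parse_pins(example)).items():
    print(f"{name}: pinned {wanted}, installed {installed}")
```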
<h3 id='environment-variables' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#environment-variables' aria-label='Anchor'></a>Environment Variables</h3>
<p>The template contains a <code>.env.dist</code> file. Go ahead and rename it to <code>.env</code>. Our local environment variables will be stored in this <code>.env</code> file:</p>
<div class="highlight-wrapper group relative docker">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-4htk2q3q"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-4htk2q3q"><span class="c"># .env (rename .env.dist file)</span>
FLASK_APP=hello
OPENAI_API_KEY=&lt;your-openai-api-secret-key&gt;
</code></pre>
</div>
</div>
<p>The OpenAI API uses API keys for authentication. You will need an API key to make requests to the API. Log in to your account and check the <a href='https://platform.openai.com/account/api-keys' title=''>OpenAI API Key</a> page to create or retrieve the API key to be set as <code>OPENAI_API_KEY</code>.</p>
<p>Note that <code>OPENAI_API_KEY</code> is required because we are using the <code>OpenAI</code> LLM wrapper - other providers will have different requirements. <a href='https://python.langchain.com/en/latest/modules/models/llms/integrations.html' title=''>Here is a list of multiple LLM providers</a>.</p>
<div class="callout"><p>You can find other options for setting the environment variables <a href="https://flask.palletsprojects.com/en/2.2.x/cli/#environment-variables-from-dotenv" title="">here</a>, like setting them on the command line or creating a <code>.flaskenv</code> file instead.</p>
</div>
<p>The <code>.env</code> file is only used for local development.</p>
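<p>To make the role of the <code>.env</code> file concrete, here is a minimal sketch of what loading it amounts to: each <code>KEY=VALUE</code> line becomes an environment variable, and values already present in the environment take precedence. The <code>load_env_lines</code> helper and the placeholder key are hypothetical and only illustrate the idea; Flask relies on the <code>python-dotenv</code> package for the real parsing, which also handles quoting and other cases this sketch ignores.</p>

```python
import os


def load_env_lines(lines, environ=None):
    """Apply KEY=VALUE lines to a mapping without overriding existing keys."""
    environ = os.environ if environ is None else environ
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values already present in the environment take precedence.
        environ.setdefault(key.strip(), value.strip())
    return environ


# Hypothetical file contents; the key is a placeholder, not a real secret.
example = ["# .env", "FLASK_APP=hello", "OPENAI_API_KEY=placeholder-key"]
env = load_env_lines(example, environ={"FLASK_APP": "already-set"})
print(env)
```

<p>Passing a plain dictionary as <code>environ</code> keeps the example from touching the real process environment.</p>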
<h3 id='local-development' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#local-development' aria-label='Anchor'></a>Local Development</h3>
<p>Now that everything is set up we can run the project:</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-jjvkrmga">flask run
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<div class='highlight relative group'>
<pre class='highlight output'><code id="code-5dymq1g4"> * Serving Flask app 'hello'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
</code></pre>
</div>
</div>
<p>Now, we can head over to <a href='http://127.0.0.1:5000' title=''>http://127.0.0.1:5000</a> 🎉</p>
<div class="callout"><p>Note that the <code>flask run</code> command works because we set <code>FLASK_APP</code> in the <code>.env</code> file, so there was no need to pass the <code>--app</code> option. If <code>FLASK_APP</code> were not set, we would need to run: <code>flask --app <app> run</code>
</p>
</div>
<p>With our LangChain app prepped and running on our local machine, let’s move to the next section and deploy our app to Fly.io!</p>
<h2 id='deploying-to-fly-io' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#deploying-to-fly-io' aria-label='Anchor'></a>Deploying to Fly.io 🚀</h2>
<p><code>flyctl</code> is the command-line utility provided by Fly.io.</p>
<p>If you haven’t installed it yet, follow these <a href='https://fly.io/docs/hands-on/install-flyctl/' title=''>instructions</a>, then <a href='https://fly.io/docs/hands-on/sign-up/' title=''>sign up</a> and <a href='https://fly.io/docs/hands-on/sign-in/' title=''>log in</a> to Fly.io.</p>
<div class="callout"><p>New customers’ organizations use <a href="https://fly.io/docs/reference/apps/#apps-v2" title="">V2 of the Fly Apps platform</a>, running on <a href="https://fly.io/docs/machines/" title="">Fly Machines</a>. If you’re already a customer, you can flip the switch to start deploying your new apps to Apps V2 with <code>fly orgs apps-v2 default-on <org-slug></code>.
</p>
</div><h3 id='launching-our-app' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#launching-our-app' aria-label='Anchor'></a>Launching Our App</h3>
<p>Before deploying, we first need to configure and launch our app on Fly.io using the <code>flyctl</code> command <code>fly launch</code>. During the process, we will:</p>
<ul>
<li><strong class='font-[550] text-navy-950'>Choose an app name</strong>: this will be your dedicated fly.dev subdomain.
</li><li><strong class='font-[550] text-navy-950'>Select the organization</strong>: you can create a new organization or deploy to your personal account (tied to your Fly account and visible only to you).
</li><li><strong class='font-[550] text-navy-950'>Choose the region for deployment</strong>: Fly.io initially suggests the region closest to you, but you can choose another if you prefer.
</li></ul>
<p>This is what it looks like when we run <code>fly launch</code>:</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-wll06w6e">fly launch
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<div class='highlight relative group'>
<pre class='highlight output'><code id="code-7ib4r4vo">Creating app in ../flyio/my-fly-langchain
An existing fly.toml file was found for app hello-fly-langchain
? Would you like to copy its configuration to the new app? Yes
Scanning source code
Detected a Python app
Using the following build configuration:
Builder: paketobuildpacks/builder:base
? Choose an app name (leaving blank will default to 'hello-fly-langchain') my-fly-langchain
? Select Organization: Fly.io (fly-io)
App will use 'ams' region as primary
Created app 'my-fly-langchain' in organization 'fly-io'
Admin URL: https://fly.io/apps/my-fly-langchain
Hostname: my-fly-langchain.fly.dev
? Overwrite "../flyio/my-fly-langchain/.dockerignore"? No
? Overwrite "../flyio/my-fly-langchain/Procfile"? No
? Would you like to set up a Postgresql database now? No
? Would you like to set up an Upstash Redis database now? No
Wrote config file fly.toml
Validating ../flyio/my-fly-langchain/fly.toml
Platform: machines
✓ Configuration is valid
Run "fly deploy" to deploy your application.
</code></pre>
</div>
</div>
<p>If you cloned the template mentioned in this article, you will see output similar to the above.</p>
<p>The template provides an existing <code>fly.toml</code> file, so you can copy its configuration to your app.</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-kw66t018">An existing fly.toml file was found <span class="k">for </span>app hello-fly-langchain
? Would you like to copy its configuration to the new app? Yes
</code></pre>
</div>
</div>
<p>Go ahead and choose your app name and select the organization to deploy the app to.</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-by18dhti">? Choose an app name (leaving blank will default to 'hello-fly-langchain') my-fly-langchain
? Select Organization: Fly.io (fly-io)
</code></pre>
</div>
</div>
<p>The template also provides existing <code>.dockerignore</code> and <code>Procfile</code> files. If those files don’t exist in your project, they are generated for you; in that case, make sure you update them to fit your needs.</p>
<div class="highlight-wrapper group relative bash">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-bwuvqsdh">? Overwrite <span class="s2">"../flyio/my-fly-langchain/.dockerignore"</span>? No
? Overwrite <span class="s2">"../flyio/my-fly-langchain/Procfile"</span>? No
</code></pre>
</div>
</div>
<p>Note that the built-in Python builder (<code>paketobuildpacks/builder:base</code>) automatically copies the contents of the directory into the deployable image.</p>
<div class="highlight-wrapper group relative docker">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-pcjdzzn2"><span class="c"># fly.toml</span>
...
[build]
builder = "paketobuildpacks/builder:base"
...
</code></pre>
</div>
</div>
<p>To keep it simple, a <code>Procfile</code> is used to deploy and run Python applications. The minimal generated <code>Procfile</code> starts the Gunicorn server with our WSGI application.</p>
<div class="highlight-wrapper group relative docker">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-7qzhtkok"><span class="c"># Procfile</span>
web: gunicorn hello:app
</code></pre>
</div>
</div>
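<p>The <code>hello:app</code> argument names a Python module and an attribute inside it: the <code>hello</code> module and its <code>app</code> object. The sketch below illustrates that naming convention only. The <code>resolve_target</code> helper is hypothetical and is not how Gunicorn is implemented, and <code>json:dumps</code> stands in for <code>hello:app</code> so the example runs without the project.</p>

```python
import importlib


def resolve_target(target):
    """Split a "module:attribute" string and return the object it names."""
    module_name, sep, attr_name = target.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {target!r}")
    # Import the module by name, then look up the attribute on it.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# A standard-library target is used here in place of "hello:app".
dumps = resolve_target("json:dumps")
print(dumps({"ok": True}))
```

<p>In our project, the same lookup applied to <code>hello:app</code> would import <code>hello.py</code> and return the Flask application object defined there.</p>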
<p>By now, we are almost ready to deploy our app. Before we do that, we need to set the environment variables to be used in production. Let’s see how that’s done.</p>
<h3 id='environment-variables-2' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#environment-variables-2' aria-label='Anchor'></a>Environment Variables</h3>
<p>As mentioned before, for local development we use a <code>.env</code> file to set our environment variables. In production, we can’t share such a file because it contains sensitive values.</p>
<p>We can specify secret values for our app using the <code>flyctl secrets</code> command by running:</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-r4xvl7r4">fly secrets set OPENAI_API_KEY=<your-openai-api-secret-key>
</code></pre>
</div>
</div>
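<p>Fly.io exposes secrets to the running app as environment variables, so the code reads the key the same way locally and in production. As a minimal sketch, a startup check like the one below makes a missing key fail loudly instead of surfacing later as an authentication error. The <code>require_env</code> helper is hypothetical and not part of the template.</p>

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; use .env locally or 'fly secrets set' in production"
        )
    return value


# A dictionary with a placeholder value stands in for the real environment.
api_key = require_env("OPENAI_API_KEY", environ={"OPENAI_API_KEY": "placeholder-key"})
print("key loaded, length:", len(api_key))
```

<p>In the app itself, calling <code>require_env("OPENAI_API_KEY")</code> with no second argument would read from the real process environment.</p>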
<p>That’s it! We are now ready to deploy our app!</p>
<h3 id='deploying-our-app' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#deploying-our-app' aria-label='Anchor'></a>Deploying Our App</h3>
<p>Let’s simply run:</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-ozmta6xn">fly deploy
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<div class='highlight relative group'>
<pre class='highlight output'><code id="code-6xdf3hxf">==> Verifying app config
Validating ../flyio/my-fly-langchain/fly.toml
Platform: machines
✓ Configuration is valid
--> Verified app config
==> Building image
Remote builder fly-builder-ancient-surf-8247 ready
==> Building image with Buildpacks
--> docker host: 20.10.12 linux x86_64
base: Pulling from paketobuildpacks/builder
...
Paketo Buildpack for Procfile 5.6.1
https://github.com/paketo-buildpacks/procfile
Process types:
web: gunicorn hello:app
...
--> Pushing image done
image: registry.fly.io/my-fly-langchain:deployment-01GYZ27HQF3C7MQ9EB8VGJAE9Z
image size: 378 MB
Provisioning ips for my-fly-langchain
Dedicated ipv6: 2a09:8280:1::37:12bc
Shared ipv4: 66.241.124.47
Add a dedicated ipv4 with: fly ips allocate-v4
Process groups have changed. This will:
* create 1 "app" machine
No machines in group app, launching one new machine
Machine 4d89696a2ed508 [app] update finished: success
Creating a second machine to increase service availability
Machine 4d89699ce71578 [app] update finished: success
Finished launching new machines
Updating existing machines in 'my-fly-langchain' with rolling strategy
Finished deploying
Visit your newly deployed app at https://my-fly-langchain.fly.dev
</code></pre>
</div>
</div>
<p>Our app should be up and running!</p>
<div class="highlight-wrapper group relative cmd">
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-ne8g9owv">fly open
</code></pre>
</div>
</div>
<p>Let’s try it: <code>https://&lt;your-app-name&gt;.fly.dev/&lt;your-city&gt;</code></p>
<p>YAY! 🎉 We just deployed our LangChain app to production! Cool, right? 😎</p>
<figure class="post-cta">
<figcaption>
<h1>Fly.io ❤️ all things Python.</h1>
<p>Fly.io makes it easier to deploy your apps and move them closer to your users!</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/languages-and-frameworks/python/">
Deploy a Python app today! <span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-kitty.webp" srcset="/static/images/cta-kitty@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='whats-next' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#whats-next' aria-label='Anchor'></a>What’s Next?</h2>
<p>Our app does the job of finding new places to eat! Now that we’ve given it a try, you are probably wondering: what’s next?</p>
<p>We got some options for where to eat tonight here in Berlin, where I live! That’s a great start for what is possible with LangChain. But there’s a LOT more!</p>
<p>Let’s say that I’m meeting my best friend in Berlin for dinner tomorrow.</p>
<p>From all the places I could get in <code>Berlin</code>, I want to get the name and address, with working hours, of the ones that serve <code>Italian food</code> (because we all love Italian food, right?) and are close to <code>Neukölln</code> - my best friend’s neighbourhood. The places also need to be top-rated, with a rating higher than <code>4.5</code> on Google Maps, and be open <code>tomorrow</code> at <code>7pm</code>.</p>
<p>And we could go on and on here.</p>
<p>That looks a bit more complex and…</p>
<p>It started to look like a <a href='https://python.langchain.com/en/latest/modules/chains.html' title=''>chain</a> (aha!) of calls that also depend on the user’s input. That’s when simple applications start to become more powerful.</p>
<p>Note that our chain depends on <code>user's input</code>. Not only that, but some of the <strong class='font-[550] text-navy-950'>real-time</strong> information, like current working hours and rating on Google Maps, is not available to us.</p>
<p>AI language models have neither access to real-time data nor the ability to browse the internet.</p>
<h3 id='agents-joined-the-chat' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#agents-joined-the-chat' aria-label='Anchor'></a><a href='https://python.langchain.com/en/latest/modules/agents.html' title=''>Agents</a> joined the chat ⛓</h3>
<p>For these types of chains, we have to interact with the outside world to get some answers!</p>
<p>That’s when agents come into play.</p>
<p>The “agent” has access to a <a href='https://python.langchain.com/en/latest/modules/agents/tools.html' title=''>set of tools</a>. Depending on the user’s input, the agent can decide which, if any, of the available <a href='https://python.langchain.com/en/latest/modules/agents/tools.html' title=''>tools</a> to call - you can also build your own custom <a href='https://python.langchain.com/en/latest/modules/agents/tools/custom_tools.html' title=''>tools</a> and <a href='https://python.langchain.com/en/latest/modules/agents/agents/custom_agent.html' title=''>agents</a>.</p>
<p>Those tools are the way we interact with the rest of the world - in our case, using <a href='https://python.langchain.com/en/latest/modules/agents/tools/examples/google_places.html' title=''>Google Places API</a> to get real-time information such as working hours and rating.</p>
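<p>To make the idea concrete, here is a minimal sketch of an agent picking a tool. It is plain Python and deliberately does not use LangChain’s API: the tool functions, the <code>TOOLS</code> registry and the keyword rule in <code>choose_tool</code> are all hypothetical stand-ins. In a real agent, the language model makes that decision and the tools call real services.</p>

```python
from typing import Callable, Dict, Optional


# Hypothetical tools. Real ones would call an external API for live data.
def lookup_opening_hours(place: str) -> str:
    return f"(would look up opening hours for {place})"


def lookup_rating(place: str) -> str:
    return f"(would look up the rating for {place})"


# The set of tools the agent is allowed to use, keyed by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "hours": lookup_opening_hours,
    "rating": lookup_rating,
}


def choose_tool(question: str) -> Optional[str]:
    """Pick a tool name, or None if no tool is needed.

    This keyword rule is only a stand-in for the language model's decision.
    """
    q = question.lower()
    if "open" in q or "hours" in q:
        return "hours"
    if "rating" in q or "rated" in q:
        return "rating"
    return None


def run_agent(question: str, place: str) -> str:
    """Decide which tool (if any) to call, then call it."""
    tool_name = choose_tool(question)
    if tool_name is None:
        return "(no tool needed; answer from the model alone)"
    return TOOLS[tool_name](place)
```

<p>Calling <code>run_agent("Is it open tomorrow at 7pm?", "Some Trattoria")</code> routes the question to the opening-hours tool, while a question that matches no keyword falls through without calling anything.</p>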
<p>That’s so neat and it doesn’t even scratch the surface. There is so much more out there - and that’s something for future articles! <a href='https://github.com/kyrolabs/awesome-langchain' title=''>Here</a> you can find a curated list of tools and projects using LangChain.</p>
<p>Happy coding! 🤖</p>
<h2 id='got-feedback' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#got-feedback' aria-label='Anchor'></a>Got Feedback?</h2>
<p>For more detailed information on how to deploy a Python App to Fly.io, you can check the <a href='https://fly.io/docs/languages-and-frameworks/python/' title=''>Fly.io Docs</a>.</p>
<p>If you have any questions or comments, reach out on the <a href='https://community.fly.io/c/questions-and-help/11' title=''>Fly.io Community</a>. That’s a great place to share knowledge, help and get help!</p>
<p>📢 Now, tell me… What are the cool ideas you have now using LangChain? 👩🏽‍💻</p>
/blog/deploying-langchain-to-fly-io/assets/deploying-langchain-to-fly-io-thumbnail.webpFly.io ❤️ JShttps://fly.io/blog/flydotio-heart-js/2023-04-24T00:00:00+00:002023-11-21T21:08:37+00:00<p>Fly.io is a great place to run fullstack applications. For most programming languages,
there is a de facto default fullstack framework. For Ruby, there is Rails.
For Elixir, there is Phoenix. For PHP, there is Laravel. For Python, there is Django.</p>
<p>If you don’t know where to look, Node.js appears to be a mess. For starters, there
are <a href='https://stackdiary.com/node-js-frameworks/' title=''>plenty of JS frameworks to choose from</a>.
Then there are three different package managers. Not to mention TypeScript as
an alternative to JavaScript. And if that is not bad enough, Bun and Deno are providing
alternatives to Node itself.</p>
<p>The result is predictable. Fly.io has a number of community contributed templates for
a small number of Node frameworks. Some have had more attention than others.</p>
<p>It is time to clean up the mess.</p>
<h2 id='package-json-enters-the-chat' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#package-json-enters-the-chat' aria-label='Anchor'></a><code>package.json</code> enters the chat</h2>
<p>The key sentence in the preceding section starts with <em>If you don’t know where to look</em>.
The right place to start is <code>package.json</code>. It tells you what dependencies need to be
installed. For most frameworks, it tells you how to start the web server, whether there
is a build step, and whether there are any development dependencies that are needed to
run the build and can be removed prior to deployment.</p>
<p>Given this knowledge, a baseline Dockerfile can be built for any framework that follows
these conventions. Handling different package managers can be accomplished by looking
for <code>yarn.lock</code> and <code>pnpm-lock.yaml</code> files. TypeScript is a devDependency and handled by the
build step. While Deno projects don’t typically have <code>package.json</code> files, some Bun
projects do.</p>
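<p>The conventions above can be sketched in a few lines. This is an illustration in Python, not how dockerfile-node itself is implemented: the function names and the shape of the returned dictionary are my own assumptions, and only the lock file names and <code>package.json</code> fields mentioned above are checked.</p>

```python
import json
from pathlib import Path


def detect_package_manager(project_dir: str) -> str:
    """Guess the package manager from the lock file that is present."""
    root = Path(project_dir)
    if (root / "yarn.lock").exists():
        return "yarn"
    if (root / "pnpm-lock.yaml").exists():
        return "pnpm"
    # No yarn or pnpm lock file found: fall back to npm.
    return "npm"


def summarize_package_json(project_dir: str) -> dict:
    """Collect the facts a Dockerfile generator would need from package.json."""
    pkg = json.loads((Path(project_dir) / "package.json").read_text())
    scripts = pkg.get("scripts", {})
    return {
        "package_manager": detect_package_manager(project_dir),
        "start_command": scripts.get("start"),          # how to start the server
        "has_build_step": "build" in scripts,           # is there a build script?
        "has_dev_dependencies": bool(pkg.get("devDependencies")),
    }
```

<p>Running <code>summarize_package_json(".")</code> in a project directory gives a small summary that a template could then branch on.</p>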
<p>The <a href='https://github.com/fly-apps/dockerfile-node' title=''>dockerfile-node</a> project endeavors to do exactly that:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3fb9n5nw">npx @flydotio/dockerfile
</code></pre>
</div>
</div>
<p>This will create your Dockerfile (or replace an existing one!), as well as ensure that you have a
<code>.dockerignore</code> file, and may optionally create a <code>docker-entrypoint</code> script. You can run
with this Dockerfile locally, or use it to deploy on your favorite cloud
provider. For Fly.io, you would get started by running:</p>
<div class="highlight-wrapper group relative ">
<div class='highlight relative group'>
<pre class='highlight '><code id="code-r7rppz1q">fly launch --dockerfile Dockerfile
</code></pre>
</div>
</div>
<p>The <code>--dockerfile</code> parameter is needed to tell <code>fly launch</code> to use your Dockerfile rather
than trying to generate a new one.</p>
<p>Of course, if you prefer to run your application on Google Cloud Run, Amazon ECS, MRSK, or
even locally, you are welcome to do so.</p>
<figure class="post-cta">
<figcaption>
<h1>You can play with this right now.</h1>
<p>Deploy using <a href='https://fly.io/terminal' title=''>Fly.io terminal</a> or see our <a href='https://fly.io/docs/hands-on/' title=''>Hands-on</a> guide that will walk you through the steps.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/rails/">
Try Fly for free <span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-turtle.webp" srcset="/static/images/cta-turtle@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='devils-in-the-details' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#devils-in-the-details' aria-label='Anchor'></a>Devils in the details</h2>
<p>Not all frameworks are alike.</p>
<p>Some will, by default, start servers that only process requests that come from
the localhost. That, of course, is entirely unsatisfactory.</p>
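<p>The difference comes down to the address the server binds to. Here is a small sketch using Python’s standard library rather than any Node framework; the <code>Hello</code> handler, the <code>make_server</code> helper and port 8080 are just illustrative choices. A server bound to <code>127.0.0.1</code> only accepts connections from inside the same machine or container, while <code>0.0.0.0</code> listens on all IPv4 interfaces.</p>

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class Hello(BaseHTTPRequestHandler):
    """Tiny handler that answers every GET with a plain-text greeting."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str = "0.0.0.0", port: int = 8080) -> HTTPServer:
    """Create (but do not start) a server bound to the given address.

    Passing host="127.0.0.1" instead would make the server reachable
    only from the machine or container it runs in.
    """
    return HTTPServer((host, port), Hello)
```

<p>Calling <code>make_server().serve_forever()</code> starts the server; most frameworks expose the same choice through a host option or environment variable.</p>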
<p>Some require extra steps, for example applications that make use of Prisma.</p>
<p>One (and I won’t mention the name) actually lists the package needed to run the
production server as a development only dependency.</p>
<p>Fortunately, <a href='https://ejs.co/' title=''>ejs</a> templates can include <code>if</code> statements and/or make
use of computed variables that customize the Dockerfiles produced.</p>
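<p>As a rough picture of what that conditional logic buys you, here is the same idea sketched in Python instead of ejs. The <code>render_dockerfile</code> function and its two flags are hypothetical, and the output is a heavily simplified Dockerfile for an npm project, not what the real templates produce.</p>

```python
def render_dockerfile(uses_prisma: bool, has_build_step: bool) -> str:
    """Build a simplified Dockerfile, adding steps only when they apply."""
    lines = [
        "FROM node:slim",
        "WORKDIR /app",
        "COPY . .",
        "RUN npm install",
    ]
    if uses_prisma:
        # Extra step needed by applications that make use of Prisma.
        lines.append("RUN npx prisma generate")
    if has_build_step:
        # Only emitted when package.json defines a build script.
        lines.append("RUN npm run build")
    lines.append('CMD ["npm", "run", "start"]')
    return "\n".join(lines) + "\n"
```

<p>The point is only that one template can serve many frameworks: the framework-specific quirks become flags, and the flags decide which lines are emitted.</p>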
<p>As a starter set, I’ve got templates working for the following frameworks:
<a href='https://expressjs.com/' title=''>Express</a>, <a href='https://www.fastify.io/' title=''>Fastify</a>, <a href='https://www.gatsbyjs.com/' title=''>Gatsby</a>,
<a href='https://nestjs.com/' title=''>Nest</a>, <a href='https://nextjs.org/' title=''>Next</a>, <a href='https://nuxtjs.org/' title=''>Nuxt</a>, and
<a href='https://remix.run/' title=''>Remix</a>. At the moment, I’ve been focusing
on breadth over depth, so what I have working may not be able to handle much more than
the splash screen. But my experience is that getting that far is often the hardest part;
after that point you have all the scaffolding in place and can focus on any specific issue that may come up.</p>
<p>Those are the successes so far. Here’s a list of frameworks that are still being worked on,
along with the current blocking issue:</p>
<ul>
<li><a href='https://trpc.io/' title=''>tRPC</a>: Access to the Postgres database is required during the build step. Worst case,
we do the build step during the deployment of the server, but that is suboptimal for cases
where multiple servers are started.
</li><li><a href='https://strapi.io/' title=''>Strapi</a>: Needs secrets set for JWT and session. This isn’t a problem, and is already
solved for Remix deployments on Fly.io, but at the moment it goes beyond what a Dockerfile generator
can do by itself.
</li><li><a href='https://redwoodjs.com/' title=''>RedwoodJS</a>: No scripts, recommends nginx. Fly.io
already has a template for Redwood, so it presumably
is just a matter of work to figure out how to fit the steps required into the
general purpose template. It may make sense to either encourage Redwood to add
scripts to their <code>package.json</code>, or to add them during the dockerfile
generation. If not, <code>if</code> statements can be used to generate Redwood-specific
steps rather than generic ones.
</li><li><a href='https://kit.svelte.dev/' title=''>SvelteKit</a>: attempting to deploy results in
<code>Could not detect a supported production environment</code>. Again, just appears to
be a matter of work to add a new production environment.
</li><li><a href='https://keystonejs.com/' title=''>KeystoneJS</a>: at build time, I’m seeing <code>✘ [ERROR]
Could not resolve "./keystone"</code>. Works fine on my development machine,
so I probably just missed a step.
</li></ul>
<p>In the fullness of time, these will be picked off one by one. This code is all
open source, so everybody with an interest in a particular framework can
contribute via issues and pull requests. Interest and participation will
definitely affect prioritization of this work.</p>
<h2 id='futures' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#futures' aria-label='Anchor'></a>Futures</h2>
<p>Once this script has a little bit of exposure to real world usage, it will replace the existing
flyctl scanners, much in the way that <a href='https://github.com/rubys/dockerfile-rails' title=''>dockerfile-rails</a>
is the basis for the Dockerfiles produced for Rails applications with Fly.io.
At which point, usage will be as simple as <code>fly launch</code>.</p>
<p>Integration with fly launch will also enable things like setting secrets,
defining volumes, launching databases, and defining health checks as part of
the workflow.</p>
<p>This package will also be designed to be re-run and accept arguments which will customize the Dockerfile
produced. Peruse the <a href='https://github.com/rubys/dockerfile-rails#usage' title=''>usage</a> for dockerfile-rails
to see examples of the types of customizations possible. Some highlights:</p>
<ul>
<li><code>--cache</code> - use build caching to speed up builds
</li><li><code>--swap=n</code> - allocate swap space, enabling larger applications to run on memory-constrained VMs
</li><li><code>--compose</code> - generate a <code>docker-compose.yml</code> file
</li></ul>
<p>The scanner will also be able to do things like detect the inclusion of <code>puppeteer</code> and automatically
install and configure Chrome/Chromium. This is already being done for Rails applications today.</p>
<p>Another thing already being done for Rails applications is to run the web server as a
non-root user for security reasons. Repeating this for Node.js will require
knowledge of what files the application is expected to write to and which are
expected to be read-only. This knowledge is necessarily framework specific,
and may not be possible for minimal and general purpose frameworks like
Express.</p>
<h2 id='got-feedback' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#got-feedback' aria-label='Anchor'></a>Got Feedback?</h2>
<p>If you have questions, comments, or concerns, let us know!</p>
<p>If they are even vaguely Fly.io related, feel free to use our
<a href='https://community.fly.io/' title=''>community forum</a>. Otherwise,
start a <a href='https://github.com/fly-apps/dockerfile-node/discussions' title=''>discussion</a>
on GitHub.</p>
<p>And to those that wish to contribute, perhaps to make support for their
favorite framework(s) better… let’s do this!</p>
/blog/flydotio-heart-js/assets/flydotio-heart-js-thumb.webpGossip Glomershttps://fly.io/blog/gossip-glomers/2023-02-22T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>We’re Fly.io. We run apps for our users on hardware we host around the world. This post isn’t about our platform. Rather, it’s an elaborate plot to get you to write some code just for the hell of it.</p>
</div>
<p>In the field of computer science, the industry is represented by two separate yet equally important groups: the software developers who build Rails applications and mobile games, and the academics who write theory papers about why the problems those apps try to solve are NP-hard. This is a story about both.</p>
<p>Distributed systems span the practical-academic divide. Reading a stack of MIT PhD dissertations may be a good Friday night, but it won’t equip you for debugging a multi-service outage at 2am. That requires real-world experience.</p>
<p>Likewise, building a fleet of microservices won’t give you the conceptual tools to gracefully &amp; safely handle failure. Many failure scenarios are rare. They don’t show up in unit tests. But they’re devastating when they do show up. Nailing down the theory gives you a fighting chance at designing a correct system in the first place.</p>
<p>The practical and academic tracks seldom converge. To fix this, we teamed up with <a href='https://aphyr.com/about' title=''>Kyle Kingsbury</a>, author of <a href='https://jepsen.io/' title=''>Jepsen</a>, to develop a series of distributed systems challenges that combine real code with the academic rigor of Jepsen’s verification system.</p>
<p>We call these challenges the <a href='/dist-sys' title=''>Gossip Glomers</a>.</p>
<div class="callout"><p><strong class="font-[550] text-navy-950">What the f$#* is a Glomer?</strong>
<br>
It’s an elaborate pun about the CAP theorem.</p>
</div><h2 id='how-it-works' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#how-it-works' aria-label='Anchor'></a>How it works</h2>
<p>You know Kyle Kingsbury from his “<a href='https://jepsen.io/analyses' title=''>Call Me Maybe</a>” blog posts that eviscerate distributed databases. You may also know about <a href='https://github.com/jepsen-io/jepsen' title=''>Jepsen</a>, the Clojure-based open-source tooling Kyle uses to conduct these analyses. Well, Kyle also wrote another tool on top of Jepsen called <a href='https://github.com/jepsen-io/maelstrom' title=''>Maelstrom</a>.</p>
<p>Maelstrom runs toy distributed systems on a simulated network. It easily runs on a laptop. Kyle uses it to teach distributed systems. We all thought it’d be neat to build a series of challenges that would teach people around the Internet how to use Maelstrom and, in turn, some distributed systems theory.</p>
<p>Each challenge is composed of several parts:</p>
<ul>
<li>The <em>workload</em> acts as a set of clients to your distributed system. These clients send different types of messages as defined by the challenge and expect certain constraints to be met. These workloads range from a simple distributed counter all the way to multi-operation, transactional database systems.
</li><li>The <em>simulated network</em> injects network partitions or slows messages between nodes.
</li><li>The <em>verification system</em> uses Jepsen to check consistency and availability constraints required by the challenge.
</li><li>And finally, the binary for the <em>node</em> which is written by you!
</li></ul>
<h2 id='pathway-to-distributed-systems-enlightenment' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#pathway-to-distributed-systems-enlightenment' aria-label='Anchor'></a>Pathway to distributed systems enlightenment</h2>
<p>Our challenges start off easy and get more difficult as you move along. They’re organized into six high-level challenges with many of those having several smaller challenges within them.</p>
<p>First, you’ll start with the Echo challenge. This is the “hello world” of distributed systems challenges. It gets you up and running and helps you understand how these challenges work.</p>
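<p>To give a feel for what a node looks like, here’s a rough sketch of an echo node in Python. It assumes Maelstrom’s JSON protocol, where each message arrives as one line on stdin with <code>src</code>, <code>dest</code> and <code>body</code> fields and replies go to stdout. The <code>handle</code> and <code>main</code> names and the <code>state</code> dictionary are our own choices for illustration, not part of any official library.</p>

```python
import json
import sys
from typing import Optional


def handle(message: dict, state: dict) -> Optional[dict]:
    """Return the reply for one incoming message, or None if none is needed."""
    body = message["body"]
    if body["type"] == "init":
        # Maelstrom tells the node its own ID before anything else happens.
        state["node_id"] = body["node_id"]
        reply_body = {"type": "init_ok"}
    elif body["type"] == "echo":
        # Send back exactly what the client asked us to echo.
        reply_body = {"type": "echo_ok", "echo": body["echo"]}
    else:
        return None
    reply_body["in_reply_to"] = body["msg_id"]
    return {"src": state["node_id"], "dest": message["src"], "body": reply_body}


def main() -> None:
    """Read one JSON message per line from stdin and print replies to stdout."""
    state: dict = {}
    for line in sys.stdin:
        reply = handle(json.loads(line), state)
        if reply is not None:
            print(json.dumps(reply), flush=True)
```

<p>Add a call to <code>main()</code> at the bottom of the script and point Maelstrom at it as the node binary; keeping <code>handle</code> separate makes it easy to test without a running cluster.</p>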
<p>Next, you’ll build a totally-available, distributed unique ID generator. In this challenge, nodes will need to be coordination-free and independently generate a unique identifier for any number of clients.</p>
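<p>One common coordination-free approach, sketched here as an illustration rather than as the intended solution, is to combine something unique to the node with a local counter. The <code>IdGenerator</code> class is hypothetical, and it assumes the node never restarts: the counter lives in memory, so a restarted node would start reissuing IDs.</p>

```python
import itertools


class IdGenerator:
    """Generates IDs that are unique across nodes without any coordination."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        # Local, in-memory sequence: 0, 1, 2, ...
        self._counter = itertools.count()

    def next_id(self) -> str:
        # Node IDs differ between nodes and the counter never repeats
        # within a node, so the pair is unique across the cluster.
        return f"{self.node_id}-{next(self._counter)}"
```

<p>Because no node ever has to talk to another node to hand out an ID, this keeps working during a network partition, which is what “totally available” asks for.</p>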
<p>After that, the difficulty starts to ramp up with the broadcast challenge. In this challenge, you’ll need to propagate messages out to all the nodes in the cluster. You’ll need to ensure fault tolerance in the face of network partitions and then work to optimize your message delivery to minimize the number of messages sent within your system.</p>
<p>Once you’ve made it past broadcast, you’ll implement a grow-only counter, or g-counter. The tricky part with this challenge is that you’ll need to build on top of Maelstrom’s <a href='https://jepsen.io/consistency/models/sequential' title=''>sequentially</a> consistent key/value store.</p>
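<p>A typical building block when a counter lives in a shared key/value store is a read followed by a compare-and-set, retried on conflict. The sketch below shows only that loop. <code>InMemoryKV</code> is a hypothetical single-process stand-in for the store, and it hides the hard part of the challenge: a sequentially consistent store is allowed to return stale reads, which this sketch does not deal with.</p>

```python
class InMemoryKV:
    """Single-process stand-in for a store with read and compare-and-set."""

    def __init__(self):
        self._data = {}

    def read(self, key: str) -> int:
        # Missing keys are treated as zero for the purposes of a counter.
        return self._data.get(key, 0)

    def cas(self, key: str, expected: int, new: int) -> bool:
        """Set key to new only if it currently holds expected."""
        if self._data.get(key, 0) != expected:
            return False
        self._data[key] = new
        return True


def add(kv: InMemoryKV, key: str, delta: int, max_attempts: int = 10) -> int:
    """Add delta to the counter, retrying if another writer got there first."""
    for _ in range(max_attempts):
        current = kv.read(key)
        if kv.cas(key, current, current + delta):
            return current + delta
    raise RuntimeError(f"could not update {key!r} after {max_attempts} attempts")
```

<p>The retry loop means concurrent increments are never silently lost: a failed compare-and-set just triggers a fresh read and another attempt.</p>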
<p>Then you’ll dive into the world of replicated logs by building a Kafka-like system. This challenge will build on the <a href='https://jepsen.io/consistency/models/linearizable' title=''>linearizable</a> key/value store provided by Maelstrom but you’ll need to figure out how to not only make it correct but also efficient.</p>
<p>Finally, you’ll wrap up with the totally-available transactions challenge where you’ll build a transactional database at various consistency levels.</p>
<h2 id='a-bit-of-history' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-bit-of-history' aria-label='Anchor'></a>A bit of history</h2>
<p>Over the past year, we’ve been growing like gangbusters. That’s great. But it also means we’ve been hiring, and hiring is hard.</p>
<p>We hire <a href='https://fly.io/docs/hiring/' title=''>resume-blind, based on work-sample tests</a>: we have people write code and design systems, and then score those submissions based on a rubric. We’ve got criteria set up for <a href='https://fly.io/docs/hiring/levels/' title=''>early-career, mid-level, and team-lead developers</a>. But we didn’t have strong criteria for hiring staff engineers.</p>
<p>So we began tossing around ideas. In a previous life, some of us had success with a series of cryptography challenges called <a href='https://cryptopals.com/' title=''>Cryptopals</a>, so we figured we’d try something similar, but with a distributed systems flavor.</p>
<p>That sounded great but how do you actually test distributed systems to know if someone passed or failed? For weeks, we wrote up one iteration after another but none of them felt right.</p>
<p>Finally, we had a brilliant idea. Let’s find someone who lives and breathes distributed system validation! That someone is Kyle Kingsbury.</p>
<p>After working on these challenges with Kyle, we realized that they are too much fun to keep to ourselves as an internal evaluation tool. So we’re releasing them for anyone to play with.</p>
<h2 id='but-wait-theres-more' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#but-wait-theres-more' aria-label='Anchor'></a>But wait… there’s more!</h2>
<p>If you scoff in the face of cascading failures, if you bend consistency levels to your will, and if you read <a href='https://k8s.af/' title=''>k8s.af</a> post-mortems as bedtime stories to your kids, you may be interested in trying our hardest challenge.</p>
<p>We reserved this last challenge for evaluating our staff engineers at Fly.io. So if you think you’d be up to the challenge, <a href='https://fly.io/jobs/' title=''>we’d love to talk to you</a>.</p>
/blog/gossip-glomers/assets/gossip-glomers-thumbnail.webpShipping Logshttps://fly.io/blog/shipping-logs/2023-02-21T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>Fly.io runs apps (globally) in just a few commands. That means a lot of log output! Centralizing logs is important. <a href="/docs/speedrun/" title="">Fire up an app</a> and follow along as we see just how easy it can be.</p>
</div>
<p>Nearly all of our apps are puking output. Sometimes, it’s intentional. Often this output is in the form of structured logs. </p>
<p>Logs are helpful for a variety of use cases - debugging, tracking, collating, correlating, coalescing, and condensing the happenings of your code into useful bits of human-parsable information.</p>
<p>There can be a lot of logs, from a lot of apps. Aggregating logs to a central place is useful for many reasons, but here are my top two favorites:</p>
<ol>
<li><strong class='font-[550] text-navy-950'>Correlation</strong> - Being able to search/query/report on all your logs in one place helps you correlate events (“Joe deleted prod again”) amongst services
</li><li><strong class='font-[550] text-navy-950'>Retention</strong> - Fly.io doesn’t keep your logs around forever. If you want to see them, retain them!
</li></ol>
<h2 id='the-logging-river' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-logging-river' aria-label='Anchor'></a>The Logging River</h2>
<p>Since we grab stdout from the processes run in your apps, whatever an app outputs becomes a log. Logs are constantly flowing through Fly.io’s infrastructure.</p>
<div class="right-sidenote"><p>Image: <a href="https://commons.wikimedia.org/wiki/File%3APhotograph_of_Log_Jam_-_NARA_-_2129372.webp" title="">Wikimedia</a></p>
</div>
<p><img alt="Black-and-white photo of people walking on a river jammed with logs" src="/images/logjam.webp" /></p>
<p>Here’s how that works.</p>
<p>Your apps run in a VM via Firecracker. Inside the VM, we inject an <code>init</code> process (pid 1) that runs and monitors your app. Since we build VMs from Docker images, <code>init</code> is taking <code>ENTRYPOINT</code> + <code>CMD</code> and running that.
The <code>init</code> program (really just a bit of Rust that we named <code>init</code>) is, among other things, gathering process output from stdout and shooting it into a socket.</p>
<p>Outside of the VM, on the host, a bit of Golang takes that output and sends it to <a href='https://vector.dev/' title=''>Vector</a> via yet-another socket.</p>
<p>Vector’s job is to ship logs to other places. In this case, those logs (your app’s output) are shipped to an internal <a href='https://nats.io' title=''>NATS</a> cluster. For the sake of simplicity, let’s call NATS a “fancy, clustered pubsub service”. Clients can subscribe to specific topics, and NATS sends the requested data to those subscribers.</p>
<p>In true Fly.io fashion, a proxy sits in front of NATS. We call this proxy “Flaps” (Fly Log Access Pipeline Server™, as one does). Flaps ensures you only see your own logs.</p>
<p><strong class='font-[550] text-navy-950'>NATS is the fun part!</strong> You can hook into NATS (via Flaps) to get your logs.</p>
<p>To get your logs, all you need is an app that acts as a NATS client, reads the logs, and ships them somewhere. Vector can do just that! It’s fairly simple - in fact, we’ve done the work for you:</p>
<h2 id='the-fly-log-shipper' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#the-fly-log-shipper' aria-label='Anchor'></a>The Fly Log Shipper™</h2>
<p>To ship your logs, you can run an instance of the <a href='https://github.com/superfly/fly-log-shipper' title=''>Fly Log Shipper</a>.</p>
<p>This app configures a Vector <a href='https://vector.dev/docs/reference/configuration/sinks/' title=''>sink</a> of your choosing, and runs Vector. A sink is a “driver” that Vector will ship logs to, for example Loki, Datadog, or (bless your heart) CloudWatch.</p>
<p>I liked the look of <a href='https://logtail.com/' title=''>Logtail</a>, so I tried out its free tier.</p>
<div class="callout"><p>Logtail lets you set Fly.io as a source of logs, but as we’ll see, we’re really just telling Vector to send logs somewhere.
If your log aggregator doesn’t know Fly.io exists, that’s fine. It just needs a <a href="https://vector.dev/docs/reference/configuration/sinks/" title="">Vector sink</a> to exist.
The process is the same no matter what log aggregator you use.</p>
</div>
<p>If you sign up for Logtail, it helpfully gives you instructions on setting that up with Fly.io.</p>
<p><img alt="logtail fly.io setup" src="/images/logtail-1.webp" /></p>
<p>Let’s go ahead and follow those instructions (they’re similar to what you see on the <a href='https://github.com/superfly/fly-log-shipper' title=''>Log Shipper repo</a>).</p>
<h2 id='using-the-log-shipper' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#using-the-log-shipper' aria-label='Anchor'></a>Using the Log Shipper</h2>
<p>The NATS log stream is scoped to your organization. This means that the Fly Log Shipper collects logs from <em>all</em> your applications.</p>
<p>Here’s how to set it up with Logtail:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-wuolodym"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-wuolodym"><span class="c"># Make a directory for</span>
<span class="c"># our log shipper app</span>
<span class="nb">mkdir </span>logshipper
<span class="nb">cd </span>logshipper
<span class="c"># I chose not to deploy yet</span>
fly launch <span class="nt">--image</span> ghcr.io/superfly/fly-log-shipper:latest
<span class="c"># Set some secrets. The secret / env var you set</span>
<span class="c"># determines which "sinks" are configured</span>
fly secrets <span class="nb">set </span><span class="nv">ORG</span><span class="o">=</span>personal
fly secrets <span class="nb">set </span><span class="nv">ACCESS_TOKEN</span><span class="o">=</span><span class="si">$(</span>fly auth token<span class="si">)</span>
fly secrets <span class="nb">set </span><span class="nv">LOGTAIL_TOKEN</span><span class="o">=</span>&lt;token provided by logtail <span class="nb">source</span><span class="o">&gt;</span>
</code></pre>
</div>
</div>
<p>You can configure as many providers as you’d like by adding more secrets. The secrets needed are determined by <a href='https://github.com/superfly/fly-log-shipper#provider-configuration' title=''>which provider(s)</a> you want to use.</p>
<p>Before launching your application, you should edit the generated <code>fly.toml</code> file and delete the entire <code>[[services]]</code> section. Replace it with this:</p>
<div class="highlight-wrapper group relative toml">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-4dbixek5"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-4dbixek5"><span class="nn">[[services]]</span>
<span class="py">http_checks</span> <span class="p">=</span> <span class="p">[]</span>
<span class="py">internal_port</span> <span class="p">=</span> <span class="mi">8686</span>
</code></pre>
</div>
</div>
<p>Then you can deploy it:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-9q59ua1d"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-9q59ua1d">fly deploy
</code></pre>
</div>
</div>
<p>You’ll soon start to see logs appear from all of your apps.</p>
<p><img alt="logtail logs from fly.io" src="/images/logtail-2.webp" /></p>
<p>That wasn’t too bad!</p>
<figure class="post-cta">
<figcaption>
<h1>Try it out yourself!</h1>
<p>You have apps. Apps have logs! Run your apps and ship your logs in just a few commands.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/speedrun">
Deploy your app! <span class='opacity:50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-dog.webp" srcset="/static/images/cta-dog@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='shipping-specific-logs' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#shipping-specific-logs' aria-label='Anchor'></a>Shipping Specific Logs</h2>
<p>So far we’ve seen how to ship logs from every application in your organization.</p>
<p>You can, however, narrow that down by setting a <code>SUBJECT</code> environment variable. That can be set in the <code>fly.toml</code>’s <code>[env]</code> section, or as an application secret.</p>
<p>I opted to add it to my <code>fly.toml</code>, which looked like this:</p>
<div class="highlight-wrapper group relative toml">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-xnmlk2pf"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-xnmlk2pf"><span class="nn">[env]</span>
<span class="py">SUBJECT</span> <span class="p">=</span> <span class="s">"logs.sandwich.>"</span>
</code></pre>
</div>
</div>
<p>The subject has the format <code>logs.&lt;app_name&gt;.&lt;region&gt;.&lt;instance_id&gt;</code>. An example <code>SUBJECT</code> that ships logs only for an application named <code>sandwich</code> (no matter what region it’s in) is:</p>
<div class="highlight-wrapper group relative bash">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-oenww5e4"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-oenww5e4"><span class="nv">SUBJECT</span><span class="o">=</span><span class="s2">"logs.sandwich.>"</span>
</code></pre>
</div>
</div>
<p>See that greater-than symbol <code>></code>? That’s a <a href='https://docs.nats.io/nats-concepts/subjects#wildcards' title=''>NATS wildcard</a>. There’s also a regular wildcard <code>*</code>, which matches exactly one dot-separated token, but the special wildcard <code>></code> is used at the end of the subject to say “and anything to the right of this”.</p>
<p>So, our use of <code>"logs.sandwich.>"</code> says to ship any logs that are from application <code>sandwich</code>, no matter what region or instance they come from. You can (ab)use this to get the logs you’re interested in.</p>
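<p>If the wildcard rules feel abstract, here’s a small Go sketch of how that kind of matching behaves. It’s an illustrative re-implementation, not code from NATS or the Log Shipper; <code>matchSubject</code> and the region and instance values are made up for the example.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// matchSubject reports whether a dot-separated subject matches a pattern
// using NATS-style wildcards: "*" matches exactly one token, and ">"
// (only valid as the final token) matches one or more remaining tokens.
// This is an illustrative re-implementation, not code from NATS itself.
func matchSubject(pattern, subject string) bool {
	pt := strings.Split(pattern, ".")
	st := strings.Split(subject, ".")
	for i, p := range pt {
		if p == ">" {
			// ">" must be last and needs at least one token to consume.
			return i == len(pt)-1 && len(st) > i
		}
		if i >= len(st) {
			return false
		}
		if p != "*" && p != st[i] {
			return false
		}
	}
	return len(pt) == len(st)
}

func main() {
	subjects := []string{
		"logs.sandwich.ord.abc123", // hypothetical region and instance ID
		"logs.sandwich.nrt.def456",
		"logs.pickle.ord.ghi789",
	}
	for _, pattern := range []string{"logs.sandwich.>", "logs.*.ord.*"} {
		for _, s := range subjects {
			fmt.Printf("%-16s %-26s %v\n", pattern, s, matchSubject(pattern, s))
		}
	}
}
```

<p>Note that <code>></code> needs at least one token to its right, so <code>logs.sandwich.></code> would not match a bare <code>logs.sandwich</code>.</p>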
<p>Go forth and ship logs!</p>
/blog/shipping-logs/assets/slinging-logs-in-nats-thumbnail.webpCarving The Scheduler Out Of Our Orchestratorhttps://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/2023-02-01T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>We’re Fly.io, a global sandwich-rating company with a hosting problem. Even if you don’t have a sandwich to rate, you might benefit from the hosting platform we built. Check it out: with a working Docker image, <a href="https://fly.io/docs/speedrun/" title="">you can be up and running on Fly.io in just minutes.</a></p>
</div>
<p>So, you want to build an app to rate sandwiches. Well, the world has a lot of different sandwiches. <a href='https://en.wikipedia.org/wiki/Pit_beef' title=''>Pit beefs</a> in Baltimore, <a href='https://www.bonappetit.com/recipe/pork-katsu-sandwich' title=''>Tonkatsu sandos</a> in Shinjuku, and <a href='https://en.wikipedia.org/wiki/Cemita' title=''>Cemitas</a> in Puebla. You want real-time sandwich telemetry, no matter the longitude of the sandwich. So you need to run it all over the world, without a lot of ceremony.</p>
<p>We built one of those at Fly.io. We’ve written a bunch <a href='https://fly.io/blog/docker-without-docker/' title=''>about one important piece of the puzzle</a>: how we take Docker images from our users and efficiently run them as virtual machines. You can run a Docker image as a VM. You’re almost done! Time to draw the rest of the owl.</p>
<p>To turn our Docker transmogrification into a platform, we need to go from running a single job to running hundreds of thousands. That’s an engineering problem with a name:</p>
<h2 id='orchestration' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#orchestration' aria-label='Anchor'></a>Orchestration</h2>
<p>Orchestrators link clusters of worker servers together and offer up an API to run jobs on them. <a href='https://kubernetes.io/' title=''>Kubernetes</a> is an orchestrator; the Kleenex of orchestrators. Then, HashiCorp has <a href='https://www.nomadproject.io/' title=''>Nomad</a>, which we use, and about which more in a bit.</p>
<p>Find a serverside developer complaining about how much harder it is to deliver an app in 2023 than it was in 2005, and odds are, <a href='https://news.ycombinator.com/item?id=20774712' title=''>they’re complaining about orchestration.</a> They’re not wrong: Kubernetes is fractally complicated. But the idea isn’t.</p>
<p>Let’s write an orchestrator. Start by writing a supervisor.</p>
<div class="highlight-wrapper group relative go">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-6oowgteq"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-6oowgteq"> <span class="k">for</span> <span class="n">_</span><span class="p">,</span> <span class="n">task</span> <span class="o">:=</span> <span class="k">range</span> <span class="n">tasks</span> <span class="p">{</span>
<span class="n">wg</span><span class="o">.</span><span class="n">Add</span><span class="p">(</span><span class="m">1</span><span class="p">)</span>
<span class="k">go</span> <span class="k">func</span><span class="p">(</span><span class="n">t</span> <span class="n">Task</span><span class="p">)</span> <span class="p">{</span>
<span class="k">defer</span> <span class="n">wg</span><span class="o">.</span><span class="n">Done</span><span class="p">()</span>
<span class="n">argv</span> <span class="o">:=</span> <span class="n">strings</span><span class="o">.</span><span class="n">Split</span><span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">Command</span><span class="p">,</span> <span class="s">" "</span><span class="p">)</span>
<span class="k">for</span> <span class="p">{</span>
<span class="n">cmd</span> <span class="o">:=</span> <span class="n">exec</span><span class="o">.</span><span class="n">Command</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="m">0</span><span class="p">],</span> <span class="n">argv</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="p">]</span><span class="o">...</span><span class="p">)</span>
<span class="n">cmd</span><span class="o">.</span><span class="n">Start</span><span class="p">()</span>
<span class="n">cmd</span><span class="o">.</span><span class="n">Wait</span><span class="p">()</span>
<span class="k">if</span> <span class="o">!</span><span class="n">t</span><span class="o">.</span><span class="n">Restart</span> <span class="p">{</span>
<span class="k">break</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}(</span><span class="n">task</span><span class="p">)</span>
<span class="p">}</span>
<span class="n">wg</span><span class="o">.</span><span class="n">Wait</span><span class="p">()</span>
</code></pre>
</div>
</div>
<p>I believe this design is so powerful it does not need to be discussed.</p>
<div class="right-sidenote"><p>A build-your-own-light-saber tool if ever there was one.</p>
</div>
<p>There are, like, <a href='http://supervisord.org/' title=''>100</a> <a href='https://github.com/DarthSim/overmind' title=''>different</a> <a href='https://mmonit.com/monit/' title=''>supervisors</a>. You can write a program to run a shell command, you can write a supervisor. Come on. You’ve already written a supervisor. Let’s stop kidding each other.</p>
<p>Let’s turn ours into an orchestrator.</p>
<p>For illustrative purposes, our supervisor takes a JSON configuration:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-3v45o516"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-3v45o516">[
{
"Name": "frick",
"Command": "sleep 1",
"Restart": true
},
{
"Name": "frack",
"Command": "sleep 5"
}
]
</code></pre>
</div>
</div>
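<p>To make that concrete, here’s a hedged sketch of the whole supervisor as one runnable Go program that decodes that JSON shape and runs it. The <code>Task</code> struct, <code>parseTasks</code>, and <code>supervise</code> are names assumed for illustration, and it swaps in <code>strings.Fields</code> for <code>strings.Split</code> so that doubled spaces don’t produce empty arguments.</p>

```go
package main

import (
	"encoding/json"
	"log"
	"os/exec"
	"strings"
	"sync"
)

// Task mirrors one entry of the JSON configuration.
type Task struct {
	Name    string
	Command string
	Restart bool
}

// parseTasks decodes a JSON array of tasks.
func parseTasks(data []byte) ([]Task, error) {
	var tasks []Task
	err := json.Unmarshal(data, &tasks)
	return tasks, err
}

// supervise runs every task concurrently and returns once all have stopped.
// A task with Restart set is re-run every time it exits, so supervise
// never returns while such a task exists.
func supervise(tasks []Task) {
	var wg sync.WaitGroup
	for _, task := range tasks {
		wg.Add(1)
		go func(t Task) {
			defer wg.Done()
			argv := strings.Fields(t.Command)
			if len(argv) == 0 {
				log.Printf("%s: empty command", t.Name)
				return
			}
			for {
				// Run is Start followed by Wait.
				if err := exec.Command(argv[0], argv[1:]...).Run(); err != nil {
					log.Printf("%s: %v", t.Name, err)
				}
				if !t.Restart {
					break
				}
			}
		}(task)
	}
	wg.Wait()
}

func main() {
	// Restart is left unset here so the demo terminates.
	config := []byte(`[{"Name": "frick", "Command": "sleep 1"},
	                   {"Name": "frack", "Command": "sleep 2"}]`)
	tasks, err := parseTasks(config)
	if err != nil {
		log.Fatal(err)
	}
	supervise(tasks)
}
```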
<p>Instead of reading this configuration from a file, like a dumb old supervisor, read it from an HTTP API, like a majestic orchestrator. “Workers” run our simple supervisor code, and a <code>server</code> doles out tasks. Here’s an API:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-1dcql70a"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-1dcql70a">GET /sched/jobs # polled
POST /sched/claim/{name}
POST /sched/release/{name}
GET /sched/cancellations # polled
POST /sched/submit
GET /sched/status/{name}
POST /sched/cancel/{name}
</code></pre>
</div>
</div>
<p>The server implementing this is an exercise for the reader. Don’t overthink it.</p>
<p>Workers poll <code>/jobs</code>. They <code>/claim</code> them by name. The <code>server</code> decides which claim wins, awarding it a <code>200</code> HTTP response. The worker runs the job, until it stops, and posts <code>/release</code>.</p>
<p>End-users drive the orchestrator with the same API; they post JSON tasks to <code>/submit</code>, check to see where they’re running, kill them by name with <code>/cancel</code>. Workers poll <code>/cancellations</code> to see what to stop running.</p>
<p>There. That’s an orchestrator. It’s just a client-server process supervisor.</p>
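<p>For the curious, here’s one not-overthought way the server side of claiming could look: a minimal sketch, assuming an in-memory job table behind a mutex. <code>Scheduler</code> and its methods are hypothetical names; only submit, jobs, and claim are wired up, and HTTP method checks are skipped.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"sync"
)

// Task matches the JSON task format used by the supervisor.
type Task struct {
	Name    string
	Command string
	Restart bool
}

// Scheduler is a hypothetical in-memory job table guarded by a mutex.
type Scheduler struct {
	mu      sync.Mutex
	tasks   map[string]Task
	claimed map[string]bool
}

func NewScheduler() *Scheduler {
	return &Scheduler{tasks: map[string]Task{}, claimed: map[string]bool{}}
}

// Submit registers a task under its name.
func (s *Scheduler) Submit(t Task) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tasks[t.Name] = t
}

// Claim succeeds for exactly one caller per known, unclaimed task.
func (s *Scheduler) Claim(name string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.tasks[name]; !ok || s.claimed[name] {
		return false
	}
	s.claimed[name] = true
	return true
}

// Unclaimed lists the tasks workers may still bid on.
func (s *Scheduler) Unclaimed() []Task {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := []Task{}
	for name, t := range s.tasks {
		if !s.claimed[name] {
			out = append(out, t)
		}
	}
	return out
}

// Handler exposes three of the routes; it skips method checks for brevity.
func (s *Scheduler) Handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/sched/submit", func(w http.ResponseWriter, r *http.Request) {
		var t Task
		if err := json.NewDecoder(r.Body).Decode(&t); err != nil || t.Name == "" {
			http.Error(w, "bad task", http.StatusBadRequest)
			return
		}
		s.Submit(t)
	})
	mux.HandleFunc("/sched/jobs", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(s.Unclaimed())
	})
	mux.HandleFunc("/sched/claim/", func(w http.ResponseWriter, r *http.Request) {
		// Writing nothing yields the implicit 200 that marks the winning claim.
		if !s.Claim(strings.TrimPrefix(r.URL.Path, "/sched/claim/")) {
			http.Error(w, "already claimed or unknown", http.StatusConflict)
		}
	})
	return mux
}

func main() {
	s := NewScheduler()
	s.Submit(Task{Name: "frick", Command: "sleep 1", Restart: true})
	fmt.Println(s.Claim("frick"), s.Claim("frick")) // prints: true false
	// To serve it for real: http.ListenAndServe(":8080", s.Handler())
}
```

<p>The mutex is what lets the server “decide which claim wins”: whichever request takes the lock first gets the <code>200</code>, and everyone else gets a conflict. Release, cancel, and status would follow the same pattern.</p>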
<p>I see a lot of hands raised in the audience. I’ll take questions at the end. But let’s see if I can head some of them off:</p>
<ul>
<li>Sure, it’s unusual for an orchestrator to run shell commands. <aside class="right-sidenote">Though <a href='https://mesos.apache.org/documentation/latest/architecture/' title=''>Mesos</a>, I’m told by a veteran Meson, did start out running shell commands as well.</aside> A serious orchestrator would run Docker containers (or some agglomeration of multiple Dockerfiles called a Pod or a Brood or a Murder). But that’s just a detail; a constant factor of new lines calling the containerd SDK. <a href='https://pkg.go.dev/github.com/containerd/containerd' title=''>Knock yourself out!</a>
</li><li>Yeah, if you were running this in some big enterprise, you’d need some kind of <a href='https://awspolicygen.s3.amazonaws.com/policygen.html' title=''>security and access control</a>; this thing is just <code>rsh</code>. These are just details.
</li><li>“That’s a stupid API” isn’t a question.
</li><li>No, I don’t know what should happen if the server goes down. Something sane. Cancel and restart all the tasks.
</li><li>OK, having all the workers stampeding to grab conflicting jobs is inefficient. But at most cluster sizes, who cares? Have the workers wait a random interval before claiming. Have them randomize the job they try to claim. It’ll probably scale fine.
</li><li>Yes, you could just do this with Redis and <code>BLPOP</code>.
</li></ul>
<p>You there in the back hollering… this isn’t a real orchestrator, why? Oh, because we’re not</p>
<h2 id='scheduling' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#scheduling' aria-label='Anchor'></a>Scheduling</h2>
<p>Scheduling means deciding which worker to run each task on.</p>
<p>Scheduling is to an orchestrator what a routing protocol is to a router: the dilithium crystal, the contents of Marsellus Wallace’s briefcase, the thing that, ostensibly, makes the system Difficult.</p>
<p>It doesn’t have to be hard. Assume our cluster is an undifferentiated mass of identical workers on the same network. Decide how many jobs a worker can run. Then: just tell a worker not to bid on jobs when it’s at its limit.</p>
<p>But no mainstream orchestrator works this way. All of them share some notion of centralized scheduling: an all-seeing eye that allocates space on workers the way a memory allocator doles out memory.</p>
<p>Even centralized scheduling doesn’t complicate our API that much.</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-p56luvvt"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-p56luvvt">POST /sched/register # {'cpu':64,'mem':256,'diskfree':'4t'}
GET /sched/assigned
POST /sched/started/{name}
POST /sched/stopped/{name}
GET /sched/cancellations
</code></pre>
</div>
</div>
<p>Instead of rattling off all the available jobs and having workers stampede to claim them, our new API assigns them directly. Easier for the workers, harder for the server, which is now obligated to make decisions.</p>
<p>Here’s the rough outline of a centralized scheduler:</p>
<ol>
<li>Filter out workers that fail to match constraints, like sufficient disk space or CPUs or microlattice shapecasters.
</li><li>Rank the surviving workers.
</li></ol>
<p>The textbook way to rank viable workers is “<a href='https://www.ics.uci.edu/~goodrich/teach/cs165/notes/BinPacking.pdf' title=''>bin packing</a>”. Bin packing is a classic computer science problem: given a series of variably-sized objects and fixed-size containers, fit all the objects in the smallest number of containers. The conventional wisdom about allocating jobs in a cluster is indeed that of the clown car: try to make servers as utilized as possible, so you can minimize the number of servers you need to buy.</p>
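<p>The two steps, with a bin-packing flavor of ranking, might be sketched like this. The <code>Worker</code> and <code>Job</code> types, the two resources, and the “average utilization after placement” score are all assumptions made up for the example, not any real scheduler’s formula.</p>

```go
package main

import (
	"fmt"
	"sort"
)

// Worker is a hypothetical inventory record; totals are assumed to be > 0.
type Worker struct {
	Name              string
	CPUFree, CPUTotal int
	MemFree, MemTotal int // GB
}

// Job is a hypothetical resource request.
type Job struct{ CPU, Mem int }

// feasible is step 1: drop workers that can't fit the job at all.
func feasible(ws []Worker, j Job) []Worker {
	var out []Worker
	for _, w := range ws {
		if w.CPUFree >= j.CPU && w.MemFree >= j.Mem {
			out = append(out, w)
		}
	}
	return out
}

// utilizationAfter averages CPU and memory utilization with the job placed.
func utilizationAfter(w Worker, j Job) float64 {
	cpu := float64(w.CPUTotal-w.CPUFree+j.CPU) / float64(w.CPUTotal)
	mem := float64(w.MemTotal-w.MemFree+j.Mem) / float64(w.MemTotal)
	return (cpu + mem) / 2
}

// Rank is step 2: order the survivors so the fullest worker comes first,
// which is the clown-car instinct behind bin packing.
func Rank(ws []Worker, j Job) []Worker {
	out := feasible(ws, j)
	sort.SliceStable(out, func(a, b int) bool {
		return utilizationAfter(out[a], j) > utilizationAfter(out[b], j)
	})
	return out
}

func main() {
	workers := []Worker{
		{Name: "idle", CPUFree: 60, CPUTotal: 64, MemFree: 200, MemTotal: 256},
		{Name: "busy", CPUFree: 8, CPUTotal: 64, MemFree: 32, MemTotal: 256},
		{Name: "full", CPUFree: 2, CPUTotal: 64, MemFree: 16, MemTotal: 256},
	}
	for _, w := range Rank(workers, Job{CPU: 4, Mem: 16}) {
		fmt.Println(w.Name)
	}
}
```

<p>Here <code>full</code> gets filtered out, and <code>busy</code> outranks <code>idle</code> because placing the job there leaves the fewest servers half-empty.</p>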
<p>So far, the mechanics of what I’m describing are barely an afternoon coding project. But real clusters tend to run Kubernetes. Even small clusters: people run K8s for apps like <code>ratemysandwich.com</code> all the time. But K8s was designed to host things like <code>all of Google</code>. So K8s has a fussy scheduling system.</p>
<p>To qualify as “fussy”, a scheduler needs at least 2 of the following 3 properties:</p>
<ol>
<li>Place jobs on workers according to some optimum that is theoretically NP-hard to obtain (but is in practice like 2 nested <code>for</code> loops).
</li><li>Account for varying resource requirements for jobs using a live inventory of all the workers and something approximating a constraint solver.
</li><li>Scale to huge clusters, without a single point of failure, so that the scheduler itself becomes a large distributed system.
</li></ol>
<p>These tenets of fussiness hold true not just for K8s, but for all mainstream orchestrators, including the one we use.</p>
<div class="right-sidenote"><p><br><br><br><a href="https://scholar.harvard.edu/files/mickens/files/thesaddestmoment.pdf" title="">The Mickensian aspect.</a></p>
</div>
<p><img src="/images/mickens.webp?3/4&border" /></p>
<h2 id='nomad' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#nomad' aria-label='Anchor'></a>Nomad</h2>
<p>Let’s start by reckoning with what’s going on with Kubernetes.</p>
<p>The legends speak of a mighty orchestrator lurking within the halls of Google called “<a href='https://research.google/pubs/pub43438/' title=''>Borg</a>”. Those of us who’ve never worked at Google have to take the word of those who have that Borg actually exists, and the word of other people that <a href='https://kubernetes.io/blog/2015/04/borg-predecessor-to-kubernetes/' title=''>K8s is based on the design of Borg.</a></p>
<p>The thing about Borg is that, if it exists, it exists within an ecosystem of other internal Google services. This makes sense for Google the same way having <a href='https://aws.amazon.com/s3/' title=''>S3</a>, <a href='https://aws.amazon.com/sqs/' title=''>SQS</a>, <a href='https://aws.amazon.com/ecs/' title=''>ECS</a>, <a href='https://aws.amazon.com/lambda/' title=''>Lambda</a>, <a href='https://aws.amazon.com/ebs/' title=''>EBS</a>, <a href='https://aws.amazon.com/elasticloadbalancing/' title=''>ALBs</a>, <a href='https://aws.amazon.com/cloudwatch/' title=''>CloudWatch</a>, <a href='https://aws.amazon.com/cognito/' title=''>Cognito</a>, <a href='https://aws.amazon.com/efs/' title=''>EFS</a>, <a href='https://aws.amazon.com/redshift/' title=''>RedShift</a>, <a href='https://aws.amazon.com/route53' title=''>Route53</a>, <a href='https://aws.amazon.com/glacier/' title=''>Glacier</a>, <a href='https://aws.amazon.com/sns/' title=''>SNS</a>, <a href='https://aws.amazon.com/vpc/' title=''>VPC</a>, <a href='https://aws.amazon.com/certificate-manager/' title=''>Certificate Manager</a>, <a href='https://universalpaperclips.fandom.com/wiki/Quantum_Foam_Annealment' title=''>QFA</a>, <a href='https://aws.amazon.com/iam/' title=''>IAM</a>, <a href='https://aws.amazon.com/kms/' title=''>KMS</a>, <a href='https://aws.amazon.com/codecommit/' title=''>CodeCommit</a>, <a href='https://aws.amazon.com/opsworks/' title=''>OpsWorks</a>, <a href='https://aws.amazon.com/cloudformation/' title=''>Cloudformation</a>, <a href='https://aws.amazon.com/snowball/' title=''>Snowball</a>, <a href='https://aws.amazon.com/xray/' title=''>X-Ray</a>, <a href='https://www.youtube.com/watch?v=KWotO76SuXE' title=''>Price List Marketplace Metering Service Entitlement Modernization</a>, and <a href='https://aws.amazon.com/ec2/' title=''>EC2</a> does for AWS. Like, somewhere within Google there’s a team that’s using each of these kinds of service.</p>
<div class="right-sidenote"><p>You can’t argue with the success Kubernetes has had. I get it.</p>
</div>
<p>It makes less sense for a single piece of software to try to wrap up all those services. But <a href='https://landscape.cncf.io/' title=''>Kubernetes seems to be trying</a>. Here’s some perspective: K8s is, some people say, essentially Borg but with Docker Containers instead of <a href='https://www.usenix.org/sites/default/files/conference/protected-files/lisa_2014_talk.pdf' title=''>Midas packages</a>. Midas is neat, but it in turn relies on <a href='https://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf' title=''>BigTable</a> and <a href='https://cloud.google.com/blog/products/storage-data-transfer/a-peek-behind-colossus-googles-file-system' title=''>Colossus</a>, two huge Google services. And that’s just packages, the lowest level primitive in the system. It’s an, uh, ambitious starting point for a global open source standard. </p>
<p>At any rate, our customers want to run Linux apps, not Kubernetes apps. So Kubernetes is out.</p>
<p>Sometime later, a team inside Google took it upon themselves to redesign Borg. Their system was called <a href='https://research.google/pubs/pub41684/' title=''>Omega</a>. I don’t know if it was ever widely used, but it’s influential. Omega has these properties:</p>
<ul>
<li>Distributed scheduling, so that scheduling decisions could be made on servers across the cluster instead of a monolithic single central scheduler.
</li><li>A complete, up-to-date picture of available resources on the cluster (via a Paxos-replicated database) provided to all schedulers.
</li><li>Optimistic transactions: if a proposed decision fails, because it conflicts with some other claim on the same resources, your scheduler just tries again.
</li></ul>
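<p>The optimistic-transaction idea is easier to see in code than in prose. This is a toy under loud assumptions: a single in-process <code>Cell</code> with one version counter stands in for the replicated cluster state, and any concurrent change counts as a conflict, which is much coarser than what the Omega paper describes. All names are invented for the example.</p>

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict means the shared state moved on after we took our snapshot.
var ErrConflict = errors.New("conflict: state changed since snapshot")

// Cell is a hypothetical stand-in for shared cluster state:
// free CPUs per worker, plus a version bumped on every commit.
type Cell struct {
	mu      sync.Mutex
	version int
	free    map[string]int
}

// NewCell copies the initial inventory into a fresh cell.
func NewCell(free map[string]int) *Cell {
	c := &Cell{free: make(map[string]int, len(free))}
	for w, n := range free {
		c.free[w] = n
	}
	return c
}

// Snapshot returns a private copy of the state and the version it was read at.
func (c *Cell) Snapshot() (map[string]int, int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	cp := make(map[string]int, len(c.free))
	for w, n := range c.free {
		cp[w] = n
	}
	return cp, c.version
}

// Commit applies a claim only if nothing changed since the snapshot.
func (c *Cell) Commit(version int, worker string, cpus int) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if version != c.version {
		return ErrConflict
	}
	if c.free[worker] < cpus {
		return errors.New("insufficient capacity")
	}
	c.free[worker] -= cpus
	c.version++
	return nil
}

// Schedule plans against a snapshot, tries to commit, and retries on conflict.
func Schedule(c *Cell, cpus, maxTries int) (string, error) {
	for i := 0; i < maxTries; i++ {
		free, version := c.Snapshot()
		worker := ""
		for w, n := range free {
			if n >= cpus && (worker == "" || n > free[worker]) {
				worker = w // prefer the worker with the most free CPUs
			}
		}
		if worker == "" {
			return "", errors.New("no worker has room")
		}
		err := c.Commit(version, worker, cpus)
		if err == nil {
			return worker, nil
		}
		if !errors.Is(err, ErrConflict) {
			return "", err
		}
	}
	return "", errors.New("gave up after repeated conflicts")
}

func main() {
	cell := NewCell(map[string]int{"worker-a": 8, "worker-b": 2})
	worker, err := Schedule(cell, 4, 3)
	fmt.Println(worker, err)
}
```

<p>No scheduler ever holds a lock while it thinks. It decides on its own copy, and the only serialized step is the cheap version check at commit time.</p>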
<p>Hashicorp <a href='https://developer.hashicorp.com/nomad/docs/concepts/scheduling/scheduling' title=''>took Google’s Omega paper and turned it into an open source project</a>, called Nomad.</p>
<p>Omega’s architecture is nice. But the real win is that Nomad is lightweight. It’s conceptually not all that far from the API we designed earlier, <a href='https://github.com/hashicorp/raft' title=''>plus Raft</a>.</p>
<p>Nomad can run Unix programs directly, or in Docker containers. We do neither. Not a problem: Nomad will orchestrate jobs for anything that conforms to <a href='https://pkg.go.dev/github.com/hashicorp/nomad/plugins/drivers' title=''>this interface</a>:</p>
<div class="highlight-wrapper group relative go">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-zaz6mvze"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-zaz6mvze"> <span class="n">RecoverTask</span><span class="p">(</span><span class="o">*</span><span class="n">TaskHandle</span><span class="p">)</span> <span class="kt">error</span>
<span class="n">StartTask</span><span class="p">(</span><span class="o">*</span><span class="n">TaskConfig</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">TaskHandle</span><span class="p">,</span> <span class="o">*</span><span class="n">DriverNetwork</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span>
<span class="n">WaitTask</span><span class="p">(</span><span class="n">ctx</span> <span class="n">context</span><span class="o">.</span><span class="n">Context</span><span class="p">,</span> <span class="n">taskID</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o"><-</span><span class="k">chan</span> <span class="o">*</span><span class="n">ExitResult</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span>
<span class="n">StopTask</span><span class="p">(</span><span class="n">taskID</span> <span class="kt">string</span><span class="p">,</span> <span class="n">timeout</span> <span class="n">time</span><span class="o">.</span><span class="n">Duration</span><span class="p">,</span> <span class="n">signal</span> <span class="kt">string</span><span class="p">)</span> <span class="kt">error</span>
<span class="n">DestroyTask</span><span class="p">(</span><span class="n">taskID</span> <span class="kt">string</span><span class="p">,</span> <span class="n">force</span> <span class="kt">bool</span><span class="p">)</span> <span class="kt">error</span>
<span class="n">InspectTask</span><span class="p">(</span><span class="n">taskID</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">TaskStatus</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span>
<span class="c">// plus some other goo</span>
</code></pre>
</div>
</div>
<p>For the year following <a href='https://news.ycombinator.com/item?id=22616857' title=''>our launch</a>, Fly.io’s platform was a Rust proxy and a Golang Nomad driver. The driver could check out a Docker image, convert it to a block device, and start Firecracker on it. In return for coding to the driver interface, we got:</p>
<ul>
<li>Constraint-based deployments that let us tell a specific Fly app to run in Singapore (har cheong gai burger), Sydney (hamdog), and Frankfurt (doner kebab), on dedicated CPU instances with 2 cores and at least 4 gigs of memory, say.
</li><li>The ability to move Fly Apps around our fleet, draining them off specific machines for maintenance.
</li><li>Opt-in integration <a href='https://fly.io/blog/a-foolish-consistency/' title=''>with Consul</a>, which we used for request routing and to glue our API to our platform backend.
</li></ul>
<p>About Nomad itself, we have nothing but nice things to say. Nomad is like Flask to K8s’s Django, Sinatra to K8s’s Rails. It’s unopinionated, easy to set up, and straightforward to extend. Use Nomad.</p>
<div class="callout"><p>Another very cool system to look at in this space is <a href="https://github.com/flynn/flynn" title="">Flynn</a>. Flynn was an open source project that started before Docker was stable and grew up alongside it. They set out to build a platform-as-a-service in a box, one that bootstraps itself from a single-binary install. It does so much stuff! If you’ve ever wondered what all the backend code for something like Fly.io must be like (multiple generations of schedulers and all), check out what they did.</p>
</div>
<p>But we’ve outgrown it, because:</p>
<p><strong class='font-[550] text-navy-950'>Bin packing is wrong for platforms like Fly.io</strong>. Fussy schedulers are premised on minimizing deployed servers by making every server do more. That makes a lot of sense if you’re Pixar. We rent out server space. So we buy enough of them to have headroom in every region. As long as they’re running, we’d want to use them.</p>
<p><a href='https://web.archive.org/web/20150323040137/https://www.columbia.edu/~cs2035/courses/ieor4405.S13/datacenter_scheduling.ppt' title=''>Here’s a Google presentation</a> on the logic behind Nomad’s first-fit bin packing scheduler. It was designed for a cluster where 0% utilization was better, for power consumption reasons, than < 40% utilization. Makes sense for Google. Not so much for us.</p>
<p>With strict bin packing, we end up with Katamari Damacy scheduling, where a couple overworked servers in our fleet suck up all the random jobs they come into contact with. Resource tracking is imperfect and neighbors are noisy, so this is a pretty bad customer experience.</p>
<p>Nomad added a “<a href='https://developer.hashicorp.com/nomad/tutorials/advanced-scheduling/spread' title=''>spread scheduling</a>” option, which just inverts the bin pack scoring they use by default. But that’s not necessarily what we want. What we want is complicated! We’re high-maintenance! In a geographically diverse fleet with predictable usage patterns, the best scheduling plans are intricate, and we don’t want to fight with a scheduler to implement them.</p>
<p><strong class='font-[550] text-navy-950'>We Run One Global Cluster.</strong> This isn’t what Nomad expects. Nomad wants us to run <a href='https://developer.hashicorp.com/nomad/tutorials/manage-clusters/federation' title=''>a bunch of federated clusters</a> (one in Dallas, one in Newark, and so on).</p>
<p>There are two big reasons we don’t federate Nomad:</p>
<ul>
<li>It changes the semantics of how apps are orchestrated, which would require fiddly engineering for us to wire back into our UX. For instance: there isn’t an obvious, clean way to roll back a failing app deploy across a dozen regions all at once. We have lots of regions, but offer one platform to our users, so we run into lots of stuff like this.
</li><li>Even if we did that work, Nomad pricing looks at us and sees Apple, Inc. More power to them! But, like, no.
</li></ul>
<p><strong class='font-[550] text-navy-950'>We Outgrew The Orchestration Model.</strong> Nomad scheduling is asynchronous. You submit a job to a server. All the servers convene a trustees meeting, solicit public comment, agree on the previous meeting’s minutes, and reach consensus about the nature of the job requested. A plan is put into motion, and the implicated workers are informed. Probably, everything works fine; if not, the process starts over again, and again, until seconds, minutes, hours, or days later, it does work.</p>
<p>This is not a bad way to handle a <code>flyctl deploy</code> request. But it’s no way to handle an HTTP request, and that’s what we want: for a request to land at our network edge in São Paulo, and for us to <strong class='font-[550] text-navy-950'>scale from zero</strong> to handle it in our <code>GRU</code> region, starting a Fly Machine on a particular server, synchronously.</p>
<div class="callout"><p>The Fly.io step in there that costs the most is pulling containers from registries. People’s containers are huge! That makes the win from caching large – and just not captured by the Nomad scheduler.
<br><br>
Nomad autoscaling is elegant, and just not well matched to our platform. How <a href="https://github.com/hashicorp/nomad-autoscaler" title="">the autoscaler</a> works is, it takes external metrics and uses them to adjust the count constraint on jobs. We scrape metrics every 15 seconds, and then Nomad’s scheduling work adds a bunch of time on top of that, so it never really worked effectively.</p>
</div>
<p>At this point, what we’re asking our scheduler to do is to consider Docker images themselves to be a resource, like disk space and memory. The set of images cached and ready to deploy on any given server is changing every second, and so are the scheduling demands being submitted to the orchestrator. Crazy producers. Crazy consumers. It’s a lot to ask from a centralized scheduler.</p>
<p>So we built our own, called</p>
<h2 id='numad' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#numad' aria-label='Anchor'></a>Nümad</h2><div class="right-sidenote"><p>I would also accept “nonomad”, “yesmad”, “no, mad”, and “fauxmad” for this dad joke.</p>
</div>
<p>Just kidding, we call it <code>flyd</code>.</p>
<p>There is a <a href='https://queue.acm.org/detail.cfm?id=3173558' title=''>long and distinguished literature</a> of <a href='https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44843.pdf' title=''>cluster scheduling</a>, <a href='https://ieeexplore.ieee.org/document/4634' title=''>going back into the 1980s</a>. We decided not to consult it, and just built something instead.</p>
<p><code>flyd</code> has a radically different model from Kubernetes and Nomad. Mainstream orchestrators are like sophisticated memory allocators, operating from a reliable global picture of all capacity everywhere in the cluster. Not <code>flyd</code>.</p>
<p>Instead, <code>flyd</code> operates like a market. Requests to schedule jobs are bids for resources; workers are suppliers. Our orchestrator sits in the middle like an exchange. <code>ratemysandwich.com</code> asks for a Fly Machine with 4 dedicated CPU cores in Chennai (sandwich: bun kebab?). Some worker in <code>MAA</code> offers room; a match is made, the order is filled.</p>
<p>Or, critically: the order is not filled. That’s fine too! What’s important is that the decision be made quickly, so that it can be done synchronously. What we don’t want is a <code>pending</code> state waiting for the weather to clear up.</p>
<p><img src="/images/flaps.webp?3/4" /></p>
<p>Our system has a cast of three characters:</p>
<ul>
<li><code>flyd</code> is the source of truth for all the VMs running on a particular worker.
</li><li><code>flaps</code> is a stateful proxy for all the <code>flyd</code> instances.
</li><li><code>flyctl</code> is our CLI. You know it, you love it.
</li></ul>
<p>The engine of this system is <code>flyd</code>.</p>
<p>In Nomad-land, our Firecracker driver doesn’t keep much state. That’s the job of huge scheduling servers, operating in unlighted chambers beyond time amidst the maddening beating and monotonous whine of the <a href='https://www.imdb.com/title/tt5073642/' title=''>Raft consensus protocol</a>.</p>
<div class="right-sidenote"><p>Unlike Nomad, which goes through some effort to keep the entire map of available resources in the cluster in memory, nothing in <code>flyd</code> is cached; everything is just materialized on-demand from disk.</p>
</div>
<p>In <code>flyd</code>-land, state-keeping is very much the worker’s problem. Every worker is its own source of truth. Every <code>flyd</code> keeps a <code>boltdb</code> database of its current state, which is an append-only log of all the operations applied to the worker.</p>
<p><code>flyd</code> is rigidly structured as a collection of state machines, like “create a machine” or “delete a volume”. Each has a concrete representation both in the code (using Go generics) and in <code>boltdb</code>. Everything happening in <code>flyd</code> (in logs, traces, metrics or whatever) happens at a particular state for a particular resource ID. Easy to reason about. And, of course, if we bounce <code>flyd</code>, it picks up right where it left off.</p>
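<p>To make “state machines with a concrete representation on disk” less abstract, here’s a rough sketch of the pattern. It is not <code>flyd</code>’s code: <code>FSM</code>, <code>Step</code>, and the state names are invented, and a plain map called <code>Store</code> stands in for <code>boltdb</code> so the example runs without dependencies.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInterrupted simulates the process dying partway through an operation.
var ErrInterrupted = errors.New("interrupted")

// Store is a hypothetical stand-in for the on-disk database:
// resource ID -> the state to resume from.
type Store map[string]string

// Step does the work for one state and names the state that follows.
type Step[R any] func(res R) (next string, err error)

// FSM is a named set of steps over some resource type R.
type FSM[R any] struct {
	Start string
	Done  string
	Steps map[string]Step[R]
}

// Run resumes from the persisted state (or Start) and records each
// transition before moving on, so a restart never repeats finished steps.
func (m FSM[R]) Run(store Store, id string, res R) error {
	state, ok := store[id]
	if !ok {
		state = m.Start
	}
	for state != m.Done {
		step, ok := m.Steps[state]
		if !ok {
			return fmt.Errorf("resource %s: no step for state %q", id, state)
		}
		next, err := step(res)
		if err != nil {
			return fmt.Errorf("resource %s in state %q: %w", id, state, err)
		}
		store[id] = next
		state = next
	}
	return nil
}

// Machine is a hypothetical resource.
type Machine struct{ Image string }

func main() {
	pulls, failOnce := 0, true
	create := FSM[*Machine]{
		Start: "pull_image",
		Done:  "started",
		Steps: map[string]Step[*Machine]{
			"pull_image": func(m *Machine) (string, error) { pulls++; return "boot", nil },
			"boot": func(m *Machine) (string, error) {
				if failOnce {
					failOnce = false
					return "", ErrInterrupted
				}
				return "started", nil
			},
		},
	}
	store := Store{}
	m := &Machine{Image: "ratemysandwich:latest"}
	fmt.Println(create.Run(store, "machine-1", m)) // dies in "boot"
	fmt.Println(create.Run(store, "machine-1", m)) // resumes at "boot"
	fmt.Println(pulls, store["machine-1"])
}
```

<p>The second <code>Run</code> picks up at <code>boot</code> and never pulls the image again, which is the “bounce it and it picks up right where it left off” property in miniature.</p>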
<div class="callout"><p><code>flyd</code> operates off of a local <code>boltdb</code> database, but our platform also has an SQLite view of all the resources allocated systemwide. We built it by caching Consul, but, in keeping with our ethos of “if you see Raft anywhere, something went wrong”, we’ve replaced it with something simpler. We call it Corrosion.
<br><br>
Corrosion is what would happen if you looked at Consul, realized every server is its own source of truth and thus distributed state wasn’t a consensus problem at all but rather just a replication problem, built a <a href="https://fly.io/blog/building-clusters-with-serf/#what-serf-is-doing" title="">SWIM gossip system</a>, and made it spit out SQLite. Also you decided it should be written in Rust. Corrosion is neat, and we’ll eventually write more about it.</p>
</div>
<p>All the <code>flyd</code> instances in (say) Madrid form a <code>MAD</code> cluster. But it’s not a cluster in the same sense Nomad or K8s uses: no state is shared between the <code>flyd</code> instances, and no consensus protocol runs.</p>
<p>To get jobs running on a <code>flyd</code> in <code>MAD</code>, you talk to <code>flaps</code>. <code>flaps</code> is running wherever you are (in my case, <code>ORD</code>).</p>
<p><code>flaps</code> uses Corrosion to find all the workers in a particular region. It has direct connectivity to every <code>flyd</code>, because our network is meshed up with WireGuard. <code>flyd</code> exposes an internal HTTP API to <code>flaps</code>, and <code>flaps</code> in turn exposes this API:</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-k5550m4y"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-k5550m4y">GET /v1/apps/my-app/machines # list
POST /v1/apps/my-app/machines # create
GET /v1/apps/my-app/machines/{machineid} # show
DELETE /v1/apps/my-app/machines/{machineid}
POST /v1/apps/my-app/machines/{machineid}/start
POST /v1/apps/my-app/machines/{machineid}/stop
</code></pre>
</div>
</div>
<p>“Creating” a Fly Machine reserves space on a worker in some region.</p>
<p><img alt="A utility function for, say, CPU capacity" src="/images/curve.webp?1/4&wrap-right" />
To reserve space in Sydney, <code>flaps</code> collects capacity information from all the <code>flyd</code>s in <code>SYD</code>, and then runs a quick best-fit ranking over the workers with space: a simple linear interpolation that ranks workers as more or less desirable at different utilizations of different resources.</p>
<p><br/><br/></p>
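<p>A ranking built on linear interpolation could look something like the sketch below. The curve’s knots, the choice of CPU and memory as the two resources, and summing the two scores are all assumptions for illustration; the real utility functions aren’t spelled out here.</p>

```go
package main

import (
	"fmt"
	"sort"
)

// Point is one knot on a utility curve: at utilization U (0..1),
// a worker is worth Score.
type Point struct{ U, Score float64 }

// Curve is a piecewise-linear utility function; knots must be sorted by U.
type Curve []Point

// At linearly interpolates the score at utilization u,
// clamping to the end knots outside the curve's range.
func (c Curve) At(u float64) float64 {
	if len(c) == 0 {
		return 0
	}
	if u <= c[0].U {
		return c[0].Score
	}
	for i := 1; i < len(c); i++ {
		if u <= c[i].U {
			a, b := c[i-1], c[i]
			t := (u - a.U) / (b.U - a.U)
			return a.Score + t*(b.Score-a.Score)
		}
	}
	return c[len(c)-1].Score
}

// Capacity is a hypothetical report from one worker: utilization per resource.
type Capacity struct {
	Worker   string
	CPU, Mem float64
}

// Rank drops workers with no room and orders the rest by summed utility.
func Rank(ws []Capacity, cpuCurve, memCurve Curve) []Capacity {
	var out []Capacity
	for _, w := range ws {
		if w.CPU < 1 && w.Mem < 1 {
			out = append(out, w)
		}
	}
	score := func(w Capacity) float64 { return cpuCurve.At(w.CPU) + memCurve.At(w.Mem) }
	sort.SliceStable(out, func(i, j int) bool { return score(out[i]) > score(out[j]) })
	return out
}

func main() {
	// Made-up shape: moderately loaded workers are most desirable,
	// nearly full ones least.
	curve := Curve{{0, 0.6}, {0.5, 1.0}, {0.9, 0.2}, {1, 0}}
	workers := []Capacity{
		{"syd-1", 0.25, 0.30},
		{"syd-2", 0.70, 0.95},
		{"syd-3", 1.00, 0.40},
	}
	for _, w := range Rank(workers, curve, curve) {
		fmt.Println(w.Worker)
	}
}
```

<p>The appeal of this shape is that policy lives in a handful of knots. Want to steer work away from hot servers sooner? Move a point.</p>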
<div class="right-sidenote"><p>† (and, in the future, the Cartesian product of regions and new hardware products, like space modulator coprocessors)</p>
</div>
<p>Rather than forming distributed consensus clusters, Fly.io regions like <code>MAD</code> and <code>SYD</code>† are like products listed on an exchange. There are multiple suppliers of <code>MAD</code> VMs (each of our workers in Madrid) and you don’t care which one you get. <code>flaps</code> acts like a broker. Orders come in, and we attempt to match them. <code>flaps</code> does some lookups in the process, but it doesn’t hold on to any state; the different <code>flaps</code> instances around the world don’t agree on a picture of the world. The whole process can fail, the same way an immediate-or-cancel order can in a financial market. That’s OK!</p>
<p>Here’s what doesn’t happen in this design: jobs don’t arrive and then sit on the book in a “pending” state while the orchestrator does its best to find some place, any place to run them. If you ask for VMs in <code>MAD</code>, you’re going to get VMs in <code>MAD</code>, or you’re going to get nothing. You won’t get VMs in <code>FRA</code> because the orchestrator has decided “that’s close enough”. That kind of thing happened to us all the time with Nomad.</p>
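<p>The immediate-or-cancel behavior fits in a few lines. This is only a sketch: <code>Offer</code>, <code>Fill</code>, and first-match selection are assumptions, and a real broker would rank the candidates rather than take the first one with room.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoCapacity is the "order not filled" outcome. It is a normal result,
// not a reason to queue the request.
var ErrNoCapacity = errors.New("order not filled: no capacity in region")

// Offer is a hypothetical supply record: one worker's spare room in a region.
type Offer struct {
	Worker   string
	Region   string
	FreeCPUs int
}

// Fill matches an order against offers in exactly one region, right now.
// There is no pending state and no fallback to a nearby region.
func Fill(offers []Offer, region string, cpus int) (string, error) {
	for i := range offers {
		o := &offers[i]
		if o.Region == region && o.FreeCPUs >= cpus {
			o.FreeCPUs -= cpus
			return o.Worker, nil
		}
	}
	return "", ErrNoCapacity
}

func main() {
	offers := []Offer{
		{Worker: "mad-1", Region: "mad", FreeCPUs: 2},
		{Worker: "fra-1", Region: "fra", FreeCPUs: 16},
	}
	fmt.Println(Fill(offers, "mad", 4)) // fails, even though fra has room
	fmt.Println(Fill(offers, "mad", 2)) // filled by mad-1
}
```

<p>The caller gets an answer in one round trip either way, and what to do about an unfilled order is the caller’s decision.</p>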
<h2 id='scheduling-reconsidered' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#scheduling-reconsidered' aria-label='Anchor'></a>Scheduling, Reconsidered</h2>
<p>If you’re a certain kind of reader, you’ve noticed that this design doesn’t do everything Fly Apps do. What happens when an app crashes? How do we deploy across a bunch of regions? How does a rollback work? These are problems Nomad solved. It doesn’t look like <code>flaps</code> and <code>flyd</code> solve them.</p>
<p>That’s because they don’t! Other parts of the platform — most notably, <code>flyctl</code>, our beloved CLI — take over those responsibilities.</p>
<p>For example: how do we handle a crashed worker? Now, <code>flyd</code> will restart a crashed VM, of course; that’s an easy decision to make locally. But some problems can’t be fixed by a single worker. Well, one thing we do is: when you do a deploy, <code>flyctl</code> creates multiple machines for each instance. Only one is started, but others are prepped on different workers. If a worker goes down, <code>fly-proxy</code> notices, and sends a signal to start a spare.</p>
<p>What we’re doing more generally is carving complex, policy-heavy functionality out of our platform, and moving it out to the client. <a href='https://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf' title=''>Aficionados of classic papers</a> will recognize this as an old strategy.</p>
<div class="right-sidenote"><p>Networks, boy I tell ya.</p>
</div>
<p>What we had with Nomad was a system that would make a lot of sense if we were scheduling a relatively small number of huge apps. But we schedule a huge number of relatively small apps, and the intelligent decisions our platform made in response to stimuli were often a Mad Hatter’s tea party. For instance: many times when Europe lost connectivity to <code>us-east-1</code> S3, apps would flake, and Nomad would in response cry “change places!” and reschedule them onto different machines.</p>
<p>What we’ve concluded is that these kinds of scheduling decisions are actually the nuts and bolts of how our platform works. They’re things we should have very strong opinions about, and we shouldn’t be debating a bin packer or a constraint system to implement them. In the new design, the basic primitives are directly exposed, and we just write code to configure them the way we want.</p>
<p>Internally, we call this new system “AppsV2”, because we’re good at naming things. If you’re deploying an app in January of 2023, you’re still using Nomad; if you’re deploying one in December of 2023, you’ll probably be interacting with <code>flyd</code>. If we do it right, you mostly won’t have to care.</p>
<figure class="post-cta">
<figcaption>
<h1>You can play with this stuff right now.</h1>
<p>The Fly Machines API runs on flyd and reserves, starts, and stops individual VMs.</p>
<a class="button bg-white text-violet-500 hover:bg-opacity-75 hover:text-violet-700 transition-colors" href="https://fly.io/docs/speedrun/">
Try Fly for free <span class='opacity-50'>→</span>
</a>
</figcaption>
<div class="image-container">
<img src="/static/images/cta-dog.webp" srcset="/static/images/cta-dog@2x.webp 2x" alt="">
</div>
</figure>
<h2 id='drawing-most-of-the-owl' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#drawing-most-of-the-owl' aria-label='Anchor'></a>Drawing Most Of The Owl</h2>
<p>Over the last couple years, we’ve written about most of the guts of Fly.io:</p>
<ul>
<li><a href='https://fly.io/blog/docker-without-docker/' title=''>How we run containers as VMs</a> in the first place,
</li><li>how our <a href='https://fly.io/blog/ipv6-wireguard-peering/' title=''>WireGuard-backed private networking layer</a> works,
</li><li>how we <a href='https://fly.io/blog/32-bit-real-estate/' title=''>provision Internet Anycasting</a>, and
</li><li>how our <a href='https://fly.io/blog/a-foolish-consistency/' title=''>control plane works</a> (gonna have to rewrite that one soon!).
</li></ul>
<p>It took us a while, but we’re glad to have finally written down our thoughts about one of the last remaining big pieces. With an execution engine, a control plane, and an orchestrator, you’ve got most of our platform! The only huge piece left is <code>fly-proxy</code>, which we have not yet done justice.</p>
<p>We hope this is interesting stuff even if you never plan on running an app here (or building a platform of your own on top of ours). We’re not the first team to come up with a bidding-style orchestrator — they’re documented <a href='https://ieeexplore.ieee.org/document/4634' title=''>in that 1988 paper above</a>! But given an entire industry of orchestrators that look like Borg, it’s good to get a reminder of how many degrees of freedom we really have.</p>
/blog/carving-the-scheduler-out-of-our-orchestrator/assets/starry-containers-thumbnail.webpFly.io is in Johannesburghttps://fly.io/blog/fly-in-johannesburg/2023-01-31T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>Fly.io is busily adding servers in new regions, but here’s one we prepared earlier. Launch a full-stack app in Johannesburg, South Africa! It’s easy to <a href="/docs/speedrun" title="">get started</a>.</p>
</div>
<p>Did you know that we’re in Johannesburg? There’s rugby and cricket. Hearty kota and Gatsby sandwiches. Braai under sunny skies and low-latency full-stack apps. Front end, Postgres, Redis, the works: if your users support the Springboks and Banyana Banyana, you should put your whole app in JNB.</p>
/blog/fly-in-johannesburg/assets/jnb-kota-thumbnail.webpWAL Mode in LiteFShttps://fly.io/blog/wal-mode-in-litefs/2023-01-04T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p><a href="https://fly.io/docs/litefs/" title="">LiteFS</a> is a distributed file system that magically replicates your SQLite databases. Make an update on one server and, voilà, your change is instantly available to your other servers on the edge. Take a look at our <a href="https://fly.io/docs/litefs/getting-started/" title=""><em>Getting Started</em></a> guide to see how to add LiteFS to your application.</p>
</div>
<p>By and large, SQLite is configuration-free. You can get pretty far by just using the default settings. As your application grows and you start tweaking settings, one of the first knobs you’ll come across is the <a href='https://www.sqlite.org/pragma.html#pragma_journal_mode' title=''>journal mode</a>. This setting determines how SQLite performs transactions on disk and there are essentially two modes: the rollback journal & the write-ahead log, or WAL.</p>
<p>The rollback journal was the original transaction mechanism and it’s still the default. The WAL mode is the shiny new transaction mode. If you start reading blog posts & forums about SQLite, one tip you will repeatedly hear is, <em>“use WAL mode!”</em></p>
<p>If your database is slow, you should use the WAL mode.</p>
<p>If you have concurrent users, you should use the WAL mode.</p>
<p>WAL mode. WAL mode. WAL mode.</p>
<p>In the SQLite world, the write-ahead log is as close to a <a href='https://en.wikipedia.org/wiki/No_Silver_Bullet' title=''>silver bullet</a> as you can find. It’s basically magic fairy dust that makes your database better and you should <a href='https://sqlite.org/wal.html#overview' title=''>almost</a> always use it.</p>
<p>However, <a href='https://github.com/superfly/litefs' title=''>LiteFS</a>, our distributed SQLite file system, only supported the rollback journal mode. Until now! With the release of <a href='https://github.com/superfly/litefs/releases/tag/v0.3.0' title=''>LiteFS v0.3.0</a>, we now support all journaling modes.</p>
<h2 id='quick-primer-on-journal-modes' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#quick-primer-on-journal-modes' aria-label='Anchor'></a>Quick primer on journal modes</h2>
<p>We’ve written about the internals of the <a href='https://fly.io/blog/sqlite-internals-rollback-journal/' title=''>rollback journal</a> and the <a href='https://fly.io/blog/sqlite-internals-wal/' title=''>WAL mode</a> in previous posts, but here’s a refresher.</p>
<p>With the rollback journal, SQLite:</p>
<ul>
<li>Writes new pages directly to the database file.
</li><li>Copies the previous contents to a separate rollback journal file.
</li><li>Deletes the rollback journal on commit.
</li></ul>
<p>Because the pages in the database file are moving around and being deleted, this mode does not allow read transactions & write transactions to occur at the same time.</p>
<p>The WAL works the opposite way:</p>
<ul>
<li>New pages are written to a separate write-ahead log file.
</li><li>The last page written has a “commit” flag to indicate the end of the transaction.
</li></ul>
<p>Since the original data is never changed during the transaction, readers can continue running in parallel while another process is writing to the database. In addition to improved concurrency, the WAL also tends to have better write performance.</p>
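<p>If you want to see the switch for yourself, here’s a small sketch using Python’s built-in <code>sqlite3</code> module against plain SQLite (no LiteFS involved). The file name <code>example.db</code> and the table are placeholders.</p>

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.db")
    conn = sqlite3.connect(path)

    # Ask SQLite to switch this database to the write-ahead log.
    # The pragma returns the journal mode that is now in effect.
    wal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.commit()

    # Committed pages land in a separate "-wal" file next to the database.
    wal_exists = os.path.exists(path + "-wal")
    conn.close()

print(wal_mode, wal_exists)
```

<p>The WAL setting is stored in the database file, so later connections pick it up without running the pragma again.</p>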
<h2 id='databases-as-a-history-of-change-sets' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#databases-as-a-history-of-change-sets' aria-label='Anchor'></a>Databases as a history of change sets</h2>
<p>Most developers think of databases as just a collection of tables & rows. And that’s how you should view it when you’re building an application. However, when designing database tooling like LiteFS, it’s better to think in terms of change sets.</p>
<p>A good analogy is baseball card collections. You might start off buying a pack of cards to start your collection. Over time, you may buy more packs or you might trade cards with friends. Each of these actions is a “change set”, adding and/or removing a set of cards from your collection.</p>
<p>Eventually, word gets out about your sweet baseball card collection and your friends want to have the same set. So each time you make a change, you send each friend a list of which cards were added and removed so they can update their collections. Now everyone has the same collection just by communicating change sets.</p>
<p>That, in a nutshell, is how LiteFS nodes keep distributed copies of your database in sync. However, instead of baseball cards, these LiteFS nodes communicate change sets of fixed-sized blocks called <em>pages</em>.</p>
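<p>Here’s the baseball card idea as a tiny Python sketch. It’s a toy model and not LiteFS code: a “database” is just a dict from page number to page bytes, and <code>commit</code> and <code>apply_change_set</code> are hypothetical helpers.</p>

```python
PAGE_SIZE = 4096


def commit(db, writes):
    """Apply a write transaction on the primary and return its change set."""
    db.update(writes)
    return dict(writes)  # the set of pages that changed


def apply_change_set(db, change_set):
    """Bring a replica up to date by overwriting the changed pages."""
    for pgno, data in change_set.items():
        assert len(data) == PAGE_SIZE
        db[pgno] = data


primary, replica = {}, {}

# Two transactions on the primary; each one ships only the pages it touched.
for writes in (
    {1: b"a" * PAGE_SIZE, 2: b"b" * PAGE_SIZE},
    {2: b"c" * PAGE_SIZE},
):
    change_set = commit(primary, writes)
    apply_change_set(replica, change_set)

print(primary == replica)  # True
```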
<p>SQLite applies these change sets of pages safely & atomically by using either a rollback journal or the write-ahead log. These two methods take different approaches but, at the end of the day, they both transactionally update a set of pages in a SQLite database.</p>
<p>In LiteFS, we track the beginning and end of these transactions through the file system API. We can see which pages have changed and bundle them up in an internal file format called <a href='https://github.com/superfly/ltx' title=''>LTX</a>.</p>
<h3 id='detecting-page-sets-with-the-rollback-journal' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#detecting-page-sets-with-the-rollback-journal' aria-label='Anchor'></a>Detecting page sets with the rollback journal</h3>
<p>The rollback journal is a simple mechanism, which makes it easy for LiteFS to determine when write transactions start & end. At a high level, SQLite implements transactions like this:</p>
<ol>
<li>Obtain an exclusive lock on the <code>SHARED</code> & <code>RESERVED</code> lock bytes.
</li><li>Create a <code>-journal</code> file.
</li><li>Write changes to the database file with <code>write(2)</code> & copy old versions of pages to the journal.
</li><li><code>fsync(2)</code> the database file & <code>unlink(2)</code> the journal file.
</li><li>Release the locks.
</li></ol>
<p>LiteFS acts as a passthrough file system so it can see all these file system calls. On the initial journal creation, it begins watching for page changes. On <code>write(2)</code>, it marks a page as changed. And finally, on <code>unlink(2)</code> it will copy the page change set to an LTX file and then delete the journal.</p>
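<p>The bookkeeping described above can be sketched roughly like this. It’s an illustration in Python, not the actual LiteFS implementation; the <code>JournalTracker</code> class and its method names are invented, and it only records which pages changed rather than writing an LTX file.</p>

```python
class JournalTracker:
    """Tracks which database pages change between journal create and unlink."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.in_txn = False
        self.dirty = set()

    def on_create_journal(self):
        # Journal creation marks the start of a write transaction.
        self.in_txn = True
        self.dirty.clear()

    def on_write_database(self, offset, length):
        # Map the byte range of a write(2) onto page numbers.
        if not self.in_txn:
            return
        first = offset // self.page_size
        last = (offset + length - 1) // self.page_size
        for index in range(first, last + 1):
            self.dirty.add(index + 1)  # SQLite page numbers start at 1

    def on_unlink_journal(self):
        # Journal removal marks the commit: hand back the page change set.
        pages = sorted(self.dirty)
        self.in_txn = False
        self.dirty = set()
        return pages


tracker = JournalTracker()
tracker.on_create_journal()
tracker.on_write_database(offset=0, length=4096)
tracker.on_write_database(offset=8192, length=4096)
pages = tracker.on_unlink_journal()
print(pages)  # [1, 3]
```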
<h3 id='detecting-page-sets-with-the-wal' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#detecting-page-sets-with-the-wal' aria-label='Anchor'></a>Detecting page sets with the WAL</h3>
<p>SQLite’s operations when it uses the WAL mode are a bit more complicated but it still has similar start & end triggers.</p>
<ol>
<li>Obtain the <code>SHARED</code> lock byte in the database file but also obtain WAL-specific locks such as <code>WAL_WRITE_LOCK</code>.
</li><li>Write new pages to the end of the WAL using <code>write(2)</code>.
</li><li>On the last page write, the <code>commit</code> field is set in the WAL frame header. This indicates the end of the transaction and also the ending size of the database.
</li><li>Release locks.
</li></ol>
<p>LiteFS can read the list of changed pages from the WAL and copy them out to an LTX file when the final WAL write for the transaction comes in. Again, both the rollback journal and WAL are implementation details so we end up with the same LTX format with either one.</p>
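<p>As a rough illustration of the commit marker, here’s a Python sketch that walks WAL frames and groups page numbers by transaction. It relies on SQLite’s documented WAL layout (a 32-byte file header, then frames with a 24-byte header whose first two big-endian fields are the page number and the database size after commit). The <code>transactions</code> and <code>frame</code> helpers are hypothetical, the WAL bytes are synthetic, and the sketch skips the salt and checksum validation a real reader needs.</p>

```python
import struct

WAL_HEADER_SIZE = 32
FRAME_HEADER_SIZE = 24


def transactions(wal, page_size):
    """Yield (page_numbers, db_size_in_pages) for each committed transaction."""
    pages = []
    offset = WAL_HEADER_SIZE
    frame_size = FRAME_HEADER_SIZE + page_size
    while offset + frame_size <= len(wal):
        pgno, commit = struct.unpack_from(">II", wal, offset)
        pages.append(pgno)
        if commit != 0:  # a non-zero commit field ends the transaction
            yield pages, commit
            pages = []
        offset += frame_size
    # Frames left over without a commit frame are not yielded.


def frame(pgno, commit, page_size):
    """Build a synthetic frame with zeroed salts, checksums, and page data."""
    return struct.pack(">IIIIII", pgno, commit, 0, 0, 0, 0) + bytes(page_size)


page_size = 4096
wal = (
    bytes(WAL_HEADER_SIZE)
    + frame(2, 0, page_size)
    + frame(5, 6, page_size)  # commit frame: database is now 6 pages
    + frame(3, 0, page_size)  # start of a transaction that has not committed
)
result = list(transactions(wal, page_size))
print(result)  # [([2, 5], 6)]
```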
<p>In WAL mode, SQLite also maintains a shared-memory file (aka SHM) and uses it as an index to look up pages in the WAL. This piece is managed by SQLite so LiteFS doesn’t touch it during a write.</p>
<h3 id='applying-transactions-to-the-replica' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#applying-transactions-to-the-replica' aria-label='Anchor'></a>Applying transactions to the replica</h3>
<p>Once an LTX file is created on the primary LiteFS node, it will send it to all connected replica LiteFS nodes. These replicas will validate the file, perform some consistency checks, and then apply the change set to the SQLite database.</p>
<p>The LiteFS replica imitates a SQLite client and takes the same locks in order to apply the transaction. That means it looks like just another SQLite client doing an update so it’s safe across other processes using the database.</p>
<h2 id='bootstrapping-made-easy' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#bootstrapping-made-easy' aria-label='Anchor'></a>Bootstrapping made easy</h2>
<p>Previously, it was tough to convert an existing application to use LiteFS. You’d need to create a SQL dump of your database and import it using the <code>sqlite3</code> command line. That was a pain.</p>
<p>We’ve improved this workflow with the new <a href='/docs/litefs/import/' title=''><code>litefs import</code></a> command. This command lets you remotely send a SQLite database to your LiteFS cluster and it will transactionally replace it. That means you can start a cluster with an existing database or you can even revert to an old snapshot on a live application.</p>
<div class="highlight-wrapper group relative sh">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-hd1b1t96"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-hd1b1t96"><span class="nv">$ </span>litefs import <span class="nt">-name</span> my.db /path/to/my.db
</code></pre>
</div>
</div><h2 id='reworking-checksumming' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#reworking-checksumming' aria-label='Anchor'></a>Reworking checksumming</h2>
<p>LiteFS uses a fast, incremental checksum for ensuring the state of the entire database is consistent across all nodes at every transaction. The method is simple: we XOR the <a href='https://en.wikipedia.org/wiki/Cyclic_redundancy_check' title=''>CRC64</a> checksums of every page in the database together. This approach let us incrementally update individual pages by XOR'ing out the old checksum for a page and XOR'ing in the new checksum for the page. That’s pretty cool.</p>
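<p>Here’s a minimal Python sketch of that incremental trick. Two assumptions to flag: Python’s standard library has no CRC64, so <code>zlib.crc32</code> stands in for it, and mixing the page number into each page’s checksum is a choice made for this example (it keeps identical pages from cancelling each other out), not something taken from the LiteFS source.</p>

```python
import zlib


def page_checksum(pgno, data):
    # Stand-in for CRC64: CRC32 over the page number plus the page data.
    return zlib.crc32(pgno.to_bytes(4, "big") + data)


def full_checksum(pages):
    """XOR the checksums of every page together."""
    total = 0
    for pgno, data in pages.items():
        total ^= page_checksum(pgno, data)
    return total


def update_checksum(total, pgno, old, new):
    """XOR out the old page checksum and XOR in the new one."""
    total ^= page_checksum(pgno, old)
    total ^= page_checksum(pgno, new)
    return total


pages = {1: b"alpha", 2: b"beta", 3: b"gamma"}
total = full_checksum(pages)

# Change page 2 without rescanning the other pages.
new_total = update_checksum(total, 2, pages[2], b"delta")
pages[2] = b"delta"

print(new_total == full_checksum(pages))  # True
```

<p>Note that the update needs the exact previous contents of the page. Pass in the wrong <code>old</code> value and the running checksum is silently wrong from then on.</p>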
<p>However, in practice, it was difficult to ensure we were calculating the correct previous checksum for a page every time we performed an update as page data is spread across the database file, journal file, & WAL file. The edge cases for determining the previous page data were too easy to get wrong.</p>
<p>So in v0.3.0, we decided to rework the database checksum. It still uses the same algorithm of XOR'ing page checksums but now we maintain a map of the current checksum of every page in the database so they can be XOR’d together on commit. We no longer need to track the previous checksum and this change made a lot of edge cases disappear.</p>
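<p>A rough sketch of the map-based variant, again in Python with <code>zlib.crc32</code> standing in for CRC64. The <code>ChecksumMap</code> class and its method names are hypothetical and only illustrate the shape of the idea.</p>

```python
import zlib
from functools import reduce
from operator import xor


def page_checksum(pgno, data):
    # Stand-in for CRC64: CRC32 over the page number plus the page data.
    return zlib.crc32(pgno.to_bytes(4, "big") + data)


class ChecksumMap:
    """Keeps the current checksum of every page, keyed by page number."""

    def __init__(self):
        self.by_page = {}

    def set_page(self, pgno, data):
        # Overwrite the entry; the previous page contents are never needed.
        self.by_page[pgno] = page_checksum(pgno, data)

    def database_checksum(self):
        # On commit, XOR every page checksum together.
        return reduce(xor, self.by_page.values(), 0)


updated = ChecksumMap()
for pgno, data in {1: b"alpha", 2: b"beta", 3: b"gamma"}.items():
    updated.set_page(pgno, data)
updated.set_page(2, b"delta")  # page 2 changes in a later transaction

fresh = ChecksumMap()
for pgno, data in {1: b"alpha", 2: b"delta", 3: b"gamma"}.items():
    fresh.set_page(pgno, data)

print(updated.database_checksum() == fresh.database_checksum())  # True
```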
<p>This approach is not without its trade-offs though. First, it requires additional memory. The map keys are 4-byte unsigned integers and the values are 8-byte hash values so we need about 12 bytes per page. SQLite uses 4KB pages by default so that’s 262,144 pages per gigabyte. Our total memory overhead for our map of page hashes ends up being about 3MB of RAM per gigabyte of on-disk SQLite database data. LiteFS targets database sizes between 1 and 10 GB so that seemed like a reasonable trade-off.</p>
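<p>The arithmetic is easy to check. This Python snippet counts only the raw key and value bytes; a real hash map carries some extra bookkeeping on top, which is an overhead this estimate ignores.</p>

```python
PAGE_SIZE = 4096   # SQLite's default page size in bytes
KEY_BYTES = 4      # page number
VALUE_BYTES = 8    # page checksum
GIB = 1024 ** 3

pages_per_gib = GIB // PAGE_SIZE
overhead_bytes = pages_per_gib * (KEY_BYTES + VALUE_BYTES)
overhead_mib = overhead_bytes / 1024 ** 2

print(pages_per_gib)  # 262144
print(overhead_mib)   # 3.0
```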
<p>Second, this approach adds CPU overhead after each commit. Map iteration and XOR computation are quite fast but these do begin to show up in performance profiles as the database grows. In our tests, we’ve found it adds about 5ms per gigabyte of SQLite data. That’s pretty high. Fortunately, much of this iteration can be cached since XORs are associative. We’ll be implementing this cache in the next version of LiteFS.</p>
<h2 id='improving-debugging-with-the-trace-log' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#improving-debugging-with-the-trace-log' aria-label='Anchor'></a>Improving debugging with the trace log</h2>
<p>One benefit to having checksum bugs in v0.2.0 was that it gave us plenty of time to get our hands dirty with debugging. The best tools come out of necessity and the LiteFS trace log is one of those tools.</p>
<p>Debugging a failed database or distributed system is <a href='https://twitter.com/honest_update/status/651897353889259520' title=''>a bit like a murder mystery</a> in that you know how it ended but you need to put the pieces together to figure out how it happened.</p>
<p>In the previous version of LiteFS, we didn’t have many clues when one of these failures happened so it required a Sherlock Holmes level of deductive reasoning to figure out the mystery. The trace log simplifies this process by writing out every internal event to a log file so we can see where things went awry after the fact.</p>
<p>SQLite uses the POSIX file system API so debugging with a normal <code>strace</code> would look like a series of seemingly opaque system calls. LiteFS translates these system calls back into SQLite related actions such as <code>WriteDatabase()</code> or <code>LockSHM()</code>. When we write those events to the trace log, we can decorate the log lines with additional information such as page numbers and checksums. All this makes reading the trace much more straightforward.</p>
<p>The trace log is not without its costs though. It will increase I/O to your disk as there are a lot of events that are written. It’s typical to see your disk I/O double when you enable the trace log. However, it does cap the total size of the trace log by using a rolling log so you don’t need much space available. By default, it will roll over to a new log file every 64MB and it will retain the last 10 logs in a gzipped format.</p>
<p>The trace log is disabled by default. However, you can review the <a href='https://fly.io/docs/litefs/config/#trace-log' title=''>trace log documentation</a> if you need it to debug any LiteFS issues.</p>
<h2 id='upcoming-work' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#upcoming-work' aria-label='Anchor'></a>Upcoming work</h2>
<p>The WAL support & stability improvements have been huge steps in moving LiteFS to be production ready but there’s still more work to come. In the next release, we’ll be focused on making LiteFS easier to integrate into your application by adding support for <a href='https://github.com/superfly/litefs/issues/56' title=''><em>write forwarding</em></a>. That will let you write to your database from any node and have LiteFS automatically forward those writes to the primary instead of having your application redirect writes.</p>
<p>We’ll also be making performance improvements by adding <a href='https://en.wikipedia.org/wiki/LZ4_(compression_algorithm)' title=''>LZ4 compression</a> to the LTX files. This will reduce latency between nodes and it will significantly cut down on bandwidth costs.</p>
<h2 id='thank-you' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#thank-you' aria-label='Anchor'></a>Thank you!</h2>
<p>Finally, we’d like to give a huge shoutout to everyone who has tried LiteFS and given feedback. It makes a world of difference! <a href='https://www.youtube.com/@KentCDodds-vids' title=''>Kent C. Dodds</a> even live streamed his experience with LiteFS and it gave us incredible, detailed feedback. Thank you!</p>
/blog/wal-mode-in-litefs/assets/wal-mode-thumbnail.webpLaunching Redis by Upstashhttps://fly.io/blog/launching-redis-by-upstash/2022-12-15T00:00:00+00:002023-11-21T21:08:37+00:00<div class="lead"><p>We’re Fly.io. We put your code into lightweight microVMs on our own hardware <a href="https://fly.io/docs/reference/regions/" title="">around the world</a>, close to your users. <a href="https://fly.io/docs/reference/redis/" title="">Redis by Upstash</a> is managed Redis living right next door to your Fly.io apps. <a href="https://fly.io/docs/speedrun/" title="">Check us out</a>—your app and database can be running close to your users within minutes.</p>
</div>
<p>We love databases that scale globally. As an <a href='https://fly.io/blog/how-we-built-fly-postgres/' title=''>ambivalent</a> database provider, we built a global, automated <a href='https://fly.io/docs/postgres/advanced-guides/high-availability-and-global-replication/' title=''>Postgres</a>, and we <a href='https://fly.io/blog/last-mile-redis/' title=''>tinkered with global Redis</a> on scrappy startup weekends. But the Fly.io forecast called for integration over invention. So we partnered up on launching a simple, global, low-latency Redis service built by the intrepid crew at <a href='https://upstash.com' title=''>Upstash</a>.</p>
<p><em>Redis by Upstash</em> sounds good enough to launch a cologne. We think it’s as big a deal. Oh, and there’s a <a href='https://fly.io/docs/reference/redis/#pricing' title=''>generous free tier</a>.</p>
<p>Keep reading to learn how our first integration came to life. Or, just <a href='https://fly.io/docs/getting-started/log-in-to-fly/' title=''>sign up for Fly</a> and give it a try:</p>
<div class="highlight-wrapper group relative cmd">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-5p27qt48"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight cmd'><code id="code-5p27qt48">flyctl redis create
</code></pre>
</div>
</div><div class="highlight-wrapper group relative output">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-4cbq5771"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight output'><code id="code-4cbq5771">? Select Organization: fly-apps (fly-apps)
? Choose a Redis database name (leave blank to generate one): redis-for-lovers
? Choose a primary region: Madrid, Spain (mad)
? Would you like to enable eviction? Yes
? Optionally, choose one or more replica regions: Amsterdam, Dallas, São Paulo, Johannesburg
? Select an Upstash Redis plan Free: 100 MB
Your Upstash Redis database redis-for-lovers is ready.
</code></pre>
</div>
</div><h2 id='a-better-redis-for-global-deployments' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-better-redis-for-global-deployments' aria-label='Anchor'></a>A Better Redis for Global Deployments</h2>
<p>So what’s special here? I assure you: this isn’t stock Redis with a price tag slapped on.</p>
<p>Complex features like global read replicas demand good DX to get noticed. But in the managed Redis market, read replicas are elusive, hidden behind sales calls, enterprise pricing plans and confusing UI.</p>
<p>With <code>flyctl redis update</code> and a few keystrokes, you can spin up global Redis replicas in seconds, with <a href='https://fly.io/docs/reference/redis/#writing-to-replica-regions' title=''>write forwarding</a> switched on. Reads <em>and</em> writes make their way to the geographically-nearest replica, which happily forwards writes along to its primary, <a href='#ryow' title=''>ensuring read-your-write consistency</a> along the way. So, with a single Redis URI, you can safely experiment with global deployment without changing your app configuration.</p>
<p>VM-to-Redis requests are reliably fast, in every region, because your apps run on the same bare metal hardware as your databases, one network hop away at most. Check out Upstash’s <a href='https://latency.upstash.io/fly.io/read/1kb' title=''>live latency measurements</a> to compare Fly.io with serverless platforms like Vercel or AWS. This comparison is not entirely fair, as we run apps on real VMs, not in JavaScript isolates. But we love the colors.</p>
<p>Finally, it’s worth mentioning these databases are secure: only reachable through your Fly.io encrypted, private IPv6 network.</p>
<h2 id='like-a-surgeon' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#like-a-surgeon' aria-label='Anchor'></a>Like a Surgeon</h2>
<p>When this integration was on the cards, we had two clear goals: don’t expose Redis to the internet, and give Upstash full control of their service without compromising customer app security. Serendipity struck as we pondered this.</p>
<p>We were knee-deep in fresh platform plumbing — the <a href='https://fly.io/docs/reference/machines/' title=''>Machines API</a> and <a href='https://fly.io/docs/reference/private-networking/#private-load-balancing-aka-flycast' title=''>Flycast private load balancing</a>. The API grants precise control over where and how VMs launch. And Flycast yields anycast-like powers to apps on the private, <a href='https://fly.io/docs/about/healthcare/#wireguard-everywhere' title=''>global WireGuard mesh</a>.</p>
<p>So Upstash Redis is a standard Fly.io app — a multitenant megalith running on beefy VMs in all Fly.io regions. These VMs gossip amongst themselves over their private IPv6 network. Upstash uses our API to deploy. We support Upstash like any other customer. Awesome.</p>
<p>But Redis runs in its own Fly.io organization, and therefore, in its own isolated network. And customer apps, each in their own. We needed a way to securely connect two Fly applications. Enter Flycast, stage left.</p>
<p>Flycast is a beautiful, complex cocktail of BPF, iptables and tproxy rules: fodder for another post! Flycast offers public proxy features — geo-aware load balancing, concurrency control and TLS termination — between apps that share a private network. With a small tweak, Flycast could now surgically join services with customer networks.</p>
<p>Customer apps can connect to their provisioned Redis, but not to anything else in the Upstash private network. Upstash can’t access the customer’s network at all. Mission accomplished.</p>
<h2 id='a-tale-of-provisioning' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-tale-of-provisioning' aria-label='Anchor'></a>A Tale of Provisioning</h2>
<p>You might be curious how provisioning Redis works, end-to-end.</p>
<p>Your <code>flyctl redis create</code> hits the Fly.io API. We mint a fresh Flycast IP address on your network and pass that IP along to Upstash’s API with the desired database configuration.</p>
<p>In the same request, Upstash informs their Fly.io mega-deployment about your database, and we (Fly.io) point the Flycast address at Upstash’s app. We blast this info to our global proxies. They’ll now proxy connections on this IP to the nearest healthy mega-Redis instance. This all happens in a matter of seconds.</p>
<p>Alright, so now you have a Redis connection URL to chuck requests at.</p>
<p>Remember that Upstash’s Redis deployment is <em>multitenant</em>. Upstash hosts scores of customer databases within a single OS process. With a clever shuffling of data from RAM to <a href='https://fly.io/docs/app-guides/git-gogs-server/#persistent-disks' title=''>persistent disks</a>, many, many more databases can fit in this instance than your average Redis running on its own VM.</p>
<p>But multitenancy poses a problem. How can mega-Redis identify the target database for a given request?</p>
<p>Your Redis URL includes a unique database password (remember this is all private, encrypted traffic). Could we use this password to identify your database? Technically, yes, but if you leak your Redis password on a live coding stream, anyone else with a Redis database could hijack yours! Yeah, let’s not.</p>
<p>Earlier, we passed your Flycast IP address to Upstash, so they have it on record. Could they match that against the source address of the incoming Redis TCP connection? Not quite! Connections to Redis pass through our proxy. So, traffic will appear to arrive from the proxy itself, not from your Flycast IP.</p>
<p>No worries! We’ve got another trick up our sleeve.</p>
<h2 id='a-protocol-for-proxies' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-protocol-for-proxies' aria-label='Anchor'></a>A Protocol for Proxies</h2>
<p>Bonus: our proxy supports prepending <a href='https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt' title=''>proxy protocol</a> headers to TCP requests.</p>
<p>This curious 10-year-old internet resident is understood by most web servers and programming languages. At the top of the protocol <a href='https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt' title=''>spec</a>, we spot our problem:</p>
<blockquote>
<p>Relaying TCP connections through proxies generally involves a loss of the
original TCP connection parameters such as source and destination addresses,
ports, and so on.</p>
</blockquote>
<p>Redis runs on port 6379, just because. Here’s a typical header for Redis connection initiation:</p>
<p><code>PROXY TCP6 fdaa:0:47fb:0:1::19 fdaa:0:47fb:0:1::16 6379 6379</code></p>
<p>Here we have two IPs, source and destination, on the same lovingly-named network, <code>fdaa:0:47fb</code>. The source IP belongs to the application VM; it is assigned randomly and is of little use here. But the destination address is the Flycast IP assigned to our particular database. Ace.</p>
<p>Now we’re in the home stretch. Redis parses this header, plucks out that Flycast IP, finds the associated customer database, and forwards traffic to it. In wafts the sweet aroma of victory.</p>
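<p>To make that lookup concrete, here is a minimal sketch of parsing a version 1 (text) proxy protocol header and mapping the destination address to a database. It is an illustration under assumptions, not Upstash’s implementation: the <code>DATABASES</code> table, the <code>database_for</code> helper and the database name are hypothetical, and only the <code>TCP4</code>/<code>TCP6</code> forms of the header are handled.</p>

```python
import ipaddress
from typing import Dict, NamedTuple, Optional, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


class ProxyHeader(NamedTuple):
    src: Address
    dst: Address
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> ProxyHeader:
    """Parse one text-format (version 1) PROXY protocol header line."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported PROXY header")
    _, _, src, dst, src_port, dst_port = parts
    return ProxyHeader(
        ipaddress.ip_address(src),
        ipaddress.ip_address(dst),
        int(src_port),
        int(dst_port),
    )


# Hypothetical tenant table: Flycast IP -> customer database name.
DATABASES: Dict[Address, str] = {
    ipaddress.ip_address("fdaa:0:47fb:0:1::16"): "cookherenow-redis",
}


def database_for(line: bytes) -> Optional[str]:
    """Pick the tenant database by the header's destination (Flycast) address."""
    return DATABASES.get(parse_proxy_v1(line).dst)
```

<p>Keying on the destination address rather than the password means a leaked password alone is not enough to reach someone else’s database: the connection also has to arrive on that database’s Flycast IP.</p>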
<h2 id='a-need-for-speed' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#a-need-for-speed' aria-label='Anchor'></a>A Need for Speed</h2>
<p>Let’s talk about a clear-cut use case for global Redis: caching HTML at the edge.</p>
<p>Last year we turbo-boosted our Paris-based recipe finder Rails app by <a href='https://fly.io/ruby-dispatch/run-ordinary-rails-apps-globally/' title=''>deploying Postgres replicas around the globe</a>. But our database has grown. We don’t need to replicate all of its contents, and we’re too busy to spend time optimizing our queries. Let’s just lean on a lightweight HTML cache, which Rails is good at.</p>
<p>We know we can get similar or better performance by caching HTML in Redis alongside our deployed VMs. And we can do this in a few minutes, really. First, let’s add a few read replicas in distant, exotic lands.</p>
<div class="highlight-wrapper group relative ">
<button
type="button"
class="bubble-wrap z-20 absolute right-9 -mr-0.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-wrap-target="#code-a66jp22y"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35" stroke-linecap="round" stroke-linejoin="round"><g buffered-rendering="static"><path d="M9.912 8.037h2.732c1.277 0 2.315-.962 2.315-2.237a2.325 2.325 0 00-2.315-2.31H2.959m10.228 9.01H2.959M6.802 8H2.959" /><path d="M11.081 6.466L9.533 8.037l1.548 1.571" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-9px] tail text-navy-950">
Wrap text
</span>
</button>
<button
type="button"
class="bubble-wrap z-20 absolute right-1.5 top-1.5 text-transparent group-hover:text-gray-400 group-hover:hocus:text-white focus:text-white bg-transparent group-hover:bg-gray-900 group-hover:hocus:bg-gray-700 focus:bg-gray-700 transition-colors grid place-items-center w-7 h-7 rounded-lg outline-none focus:outline-none"
data-copy-target="sibling"
>
<svg class="w-4 h-4 pointer-events-none" viewBox="0 0 16 16" fill="none" stroke="currentColor" stroke-width="1.35"><g buffered-rendering="static"><path d="M10.576 7.239c0-.995-.82-1.815-1.815-1.815H3.315c-.995 0-1.815.82-1.815 1.815v5.446c0 .995.82 1.815 1.815 1.815h5.446c.995 0 1.815-.82 1.815-1.815V7.239z" /><path d="M10.576 10.577h2.109A1.825 1.825 0 0014.5 8.761V3.315A1.826 1.826 0 0012.685 1.5H7.239c-.996 0-1.815.819-1.816 1.815v1.617" /></g></svg>
<span class="bubble-sm bubble-tl [--offset-l:-6px] tail [--tail-x:calc(100%-30px)] text-navy-950">
Copy to clipboard
</span>
</button>
<div class='highlight relative group'>
<pre class='highlight '><code id="code-a66jp22y">~ $ fly redis update cookherenow-redis
? Choose replica regions, or unselect to remove replica regions: [Use arrows to move, space to select, &lt;right&gt; to all, &lt;left&gt; to none, type to filter]
> [ ] Amsterdam, Netherlands (ams)
[x] Denver, Colorado (US) (den)
[ ] Dallas, Texas (US) (dfw)
[ ] Secaucus, NJ (US) (ewr)
[ ] Frankfurt, Germany (fra)
[x] São Paulo (gru)
[ ] Hong Kong, Hong Kong (hkg)
[ ] Ashburn, Virginia (US) (iad)
[x] Johannesburg, South Africa (jnb)
[ ] Los Angeles, California (US) (lax)
[ ] London, United Kingdom (lhr)
[ ] Chennai (Madras), India (maa)
[ ] Madrid, Spain (mad)
[ ] Miami, Florida (US) (mia)
[x] Santiago, Chile (scl)
</code></pre>
</div>
</div>
<p>Then, with <a href='https://github.com/jsierles/cookherenow/commit/042c29d2fe8b7f28578dd35091a310e278af5983' title=''>a sprinkle of Rails magic</a>, our naive HTML cache is on the scene. Metrics can be boring, so, trust us that our <a href='https://en.wikipedia.org/wiki/Time_to_first_byte' title=''>Time To First Byte</a> is still in the low milliseconds, globally, for GET requests on cached recipe pages.</p>
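<p>The linked commit does this with Rails caching. For readers who want the general shape, here is a rough cache-aside sketch in Python. It is not the code from that commit: a plain dict stands in for the nearby Redis replica, and <code>HtmlCache</code>, the key format and the TTL are made up for illustration.</p>

```python
import time
from typing import Callable, Dict, Tuple


class HtmlCache:
    """Cache-aside HTML cache; a dict stands in for the region-local Redis."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self.store: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, html)

    def fetch(self, key: str, render: Callable[[], str]) -> str:
        """Return cached HTML if it is still fresh, else render and store it."""
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database query, no template render
        html = render()      # cache miss: do the slow work once
        self.store[key] = (now + self.ttl, html)
        return html
```

<p>The slow path (database query plus template render) runs once per key per TTL window, and every other request is a single nearby key lookup.</p>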
<h2 id='ryow' class='group flex items-start whitespace-pre-wrap relative mt-14 sm:mt-16 mb-4 text-navy-950 font-heading'><a class='inline-block align-text-top relative top-[.15em] w-6 h-6 -ml-6 after:hash opacity-0 group-hover:opacity-100 transition-all' href='#ryow' aria-label='Anchor'></a>RYOW</h2>
<p>Now and then, one must write. And <a href='https://jepsen.io/consistency/models/read-your-writes' title=''>read-your-own-write consistency</a> is a thing you need to care about when hitting speed-of-light latency in global deployments. That’s life, kids.</p>
<p>Readers hitting database replicas may not be served the very freshest of writes. We’re OK with that. Except in one case: when that replica is serving the author of the write. Good UX demands that a writer feel confident about the changes they’ve made, even if they have to wait a few hundred milliseconds.</p>
<p>To that end, Upstash Redis replicas take one of two paths to ensure a consistent read-your-own-write experience, with some trade-offs. Let’s talk it out.</p>
<p>Isa, one of our recipe editors in Santiago, is worried that the recipe for <a href='https://cookherenow.com/recipes/288370' title=''>Humitas Chilenas</a> mentions New Mexico Green Chiles. While they may be the first chiles <a href='https://www.nasa.gov/feature/chile-peppers-start-spicing-up-the-space-station/' title=''>grown in outer space</a>, they’re generally not tossed into <em>humitas</em>. So she makes corrections and proudly smashes that <strong class='font-[550] text-navy-950'>ENVIAR</strong> button.</p>
<p>Meanwhile, Santiago Redis has been diligently keeping track of the unique IDs of the writes that pass through Isa’s Redis connection.</p>
<p>So, that write is forwarded on to Paris, securely, over the WireGuard mesh. Santiago Redis blocks on the write command, waiting for replication to catch up to <em>this specific write</em>. On a clear internet day, we might wait 150ms, and Isa is redirected to the recipe page and sees her updated recipe sans chiles.</p>
<p>But under poor network conditions, we may need to wait longer, and we don’t want to wait forever. Editing must go on. This kind of thing can happen, and we need to be prepared for it.</p>
<p>So, the less happy path: Santiago Redis waits up to 500ms for the written value to return via replication. After that, the Redis client connection is released, suggesting to the Redis client that the write completed. Now, this is risky business. If we redirect Isa to her recipe before her write makes that round trip, she gets spicy <em>Humitas</em> once again. New Mexican space chiles haunt her confused mind.</p>
<p>No fear - Santiago Redis has our back. Remember that it was tracking writes? When Isa’s recipe read is attempted, Santiago grabs the ID of the most recently tracked write on her connection. It checks to see if that ID exists in the replicated database contents. If so, Isa gets a fast, correct read of her updated recipe.</p>
<p>But if her change didn’t arrive yet, Santiago <em>forwards the read operation</em> to our source of truth, Paris Redis, at the cost of another full round trip to Europe. Such is the price of consistency.</p>
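<p>Both paths fit in a small simulation. This is a sketch of the behavior described above, not Upstash’s code: the <code>Primary</code> and <code>Replica</code> classes are hypothetical, write IDs are assumed to be minted on the replica, and the <code>link_up</code> flag is a test knob that fakes a slow or broken replication link.</p>

```python
import itertools
import time
from typing import Dict, List, Optional, Set, Tuple


class Primary:
    """Source of truth (Paris in the story)."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.backlog: List[Tuple[int, str, str]] = []  # writes not yet replicated

    def write(self, write_id: int, key: str, value: str) -> None:
        self.data[key] = value
        self.backlog.append((write_id, key, value))

    def read(self, key: str) -> Optional[str]:
        return self.data.get(key)


class Replica:
    """Regional replica (Santiago) that tracks write IDs per client connection."""

    def __init__(self, primary: Primary, wait_timeout: float = 0.5) -> None:
        self.primary = primary
        self.wait_timeout = wait_timeout
        self.data: Dict[str, str] = {}
        self.applied: Set[int] = set()        # write IDs seen via replication
        self.last_write: Dict[str, int] = {}  # connection -> most recent write ID
        self.forwarded_reads = 0
        self._ids = itertools.count(1)

    def pull_replication(self) -> None:
        """Apply queued writes; stands in for the real replication stream."""
        for write_id, key, value in self.primary.backlog:
            self.data[key] = value
            self.applied.add(write_id)
        self.primary.backlog.clear()

    def write(self, conn: str, key: str, value: str, link_up: bool = True) -> bool:
        """Forward the write, then block until it replicates back or we time out."""
        write_id = next(self._ids)
        self.last_write[conn] = write_id
        self.primary.write(write_id, key, value)
        deadline = time.monotonic() + self.wait_timeout
        while time.monotonic() < deadline:
            if link_up:
                self.pull_replication()
            if write_id in self.applied:
                return True   # happy path: our own write came back in time
            time.sleep(0.005)
        return False          # less happy path: release the client anyway

    def read(self, conn: str, key: str) -> Optional[str]:
        """Serve locally unless this connection's latest write is still missing."""
        write_id = self.last_write.get(conn)
        if write_id is None or write_id in self.applied:
            return self.data.get(key)
        self.forwarded_reads += 1
        return self.primary.read(key)  # pay the round trip for consistency
```

<p>Only the connection that made the write ever pays for a forwarded read. Other readers in the region keep getting fast local answers, which may be slightly stale.</p>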
/blog/launching-redis-by-upstash/assets/love-thumbnail.webp