Macaroons Escalated Quickly

Evil cookies!
Image by Annie Ruygt

We’re Fly.io, and we transmute containers into VMs, running them on our hardware around the world. We built a new security token system, and can I tell you the good news about our lord and savior the Macaroon?


Let’s implement an API token together. It’s a design called “Macaroons”, but don’t get hung up on that yet.

First, some imports and helpers. Then:

import sys
import os
import json
import hmac as hm
from base64 import b64encode, b64decode
from hashlib import sha256

def hmac(k, v): return hm.new(k, v, sha256).digest()
def enc(x): return b64encode(x)
def dec(x): return b64decode(x)
def blank_token(uid, key):
  nonce = enc(":".join([str(uid), os.urandom(16)]))
  return json.dumps([nonce, enc(hmac(key, nonce))])

Bearer tokens: like cookies, blobs you attach to a request (usually in an HTTP header).

We’re going to build a minimally-stateful bearer token, a blob signed with HMAC. Nothing fancy so far. Rails has done this for a decade and a half.

There’s a fashion in API security for stateless tokens, which encode all the data you’d need to check any request accompanied by that token – without a database lookup. Stateless tokens have some nice properties, and some less-nice. Our tokens won’t be stateless: they carry a user ID, with which we’ll look up the HMAC key to verify it. But they’ll stake out a sort of middle ground.

def attenuate(macStr, cav):
    mac = json.loads(macStr)
    cavStr = json.dumps(cav)
    oldTail = dec(mac[-1])
    newTail = enc(hmac(oldTail, cavStr))
    return json.dumps(mac[0:-1] + [cavStr, newTail])

keys = {10: os.urandom(16)}  # stand-in for the server-side table of per-user HMAC keys
m0 = blank_token(10, keys[10])
m1 = attenuate(m0, {'path': '/images'})
m2 = attenuate(m1, {'op': 'read'})

Let’s add some stuff.

The meat of our tokens will be a series of claims we call “caveats”. We call them that because each claim restricts further what the token authorizes. After {'path': '/images'}, this token only allows operations that happen underneath the /images directory. Then, after {'op': 'read'}, it allows only reads, not writes.

(I guess we’re building a file sharing system. Whatever.)

Some important things about this design. First: by implication from the fact that caveats further restrict tokens, a token with no caveats restricts nothing. It’s a god-mode token. Don’t honor it.

Second: the rule of checking caveats is very simple: every single caveat must pass, evaluating True against the request that carries it, in isolation and without reference to any other caveat. If any caveat evaluates False, the request fails. In that way, we ensure that adding caveats to a token can only ever weaken it.

In other words: the ordering of caveats doesn’t matter.

With that in mind, take a closer look at this code:

oldTail = dec(mac[-1])
newTail = enc(hmac(oldTail, cavStr))

Every caveat is HMAC-signed independently, which is weird. Weirder still, the key for that HMAC is the output of the last HMAC. The caveats chain together, and the HMAC of the last caveat becomes the “tail” of the token.

Creating a new blank token for a particular user requires a key that the server (and probably only the server) knows. But adding a caveat doesn’t! Anybody can add a caveat. In our design, you, the user, can edit your own API token.

def verify(macStr, keys):
    mac = json.loads(macStr)
    nonce = dec(mac[0]).split(":")
    key = keys[int(nonce[0])]
    tail = ""
    for cav in mac[:-1]:
        tail = hmac(key, cav)
        key = tail
    return hm.compare_digest(tail, dec(mac[-1]))

verify(m2, keys) # => True

For completeness, and to make a point, there’s the verification code. Look up the original secret key from the user ID, and then it’s chained HMAC all the way down. The point I’m making is that Macaroons are very simple.
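The snippets above are Python 2. As a rough cross-check, here is the same chain as a self-contained Python 3 sketch. It is a port under my own assumptions (hex-encoded randomness in the nonce, `sort_keys=True` when serializing caveats, a made-up `keys` table), not the production format. It also tries the obvious attack: dropping the last caveat while keeping its tail.

```python
import base64
import hashlib
import hmac
import json
import os


def mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode("ascii")


def blank_token(uid: int, key: bytes) -> str:
    # Nonce = user ID plus randomness; hex keeps the ":" separator unambiguous.
    nonce = b64(f"{uid}:{os.urandom(16).hex()}".encode())
    return json.dumps([nonce, b64(mac(key, nonce.encode()))])


def attenuate(token: str, caveat: dict) -> str:
    parts = json.loads(token)
    cav = json.dumps(caveat, sort_keys=True)
    # The old tail is the HMAC key for the new caveat; no server secret needed.
    new_tail = mac(base64.b64decode(parts[-1]), cav.encode())
    return json.dumps(parts[:-1] + [cav, b64(new_tail)])


def verify(token: str, keys: dict) -> bool:
    parts = json.loads(token)
    uid = int(base64.b64decode(parts[0]).split(b":")[0])
    key = keys[uid]
    for item in parts[:-1]:  # the nonce first, then each caveat in order
        key = mac(key, item.encode())
    return hmac.compare_digest(key, base64.b64decode(parts[-1]))


keys = {10: os.urandom(32)}  # hypothetical per-user key table
m0 = blank_token(10, keys[10])
m2 = attenuate(attenuate(m0, {"path": "/images"}), {"op": "read"})

# Try to widen the token: delete the last caveat but keep its tail.
parts = json.loads(m2)
forged = json.dumps(parts[:-2] + parts[-1:])

print(verify(m2, keys), verify(forged, keys))
```

The forged token fails because the tail it carries was keyed by a caveat that is no longer in the list, and nobody holding only `m2` can recompute the earlier tail.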


Back in 2014, Google published a paper at NDSS introducing “Macaroons”, a new kind of cookie. Since then, they’ve become a sort of hipster shibboleth. But they’re more talked about than implemented, which is a nice way to say that practically nobody uses them.

Until now! I dragged Fly.io into implementing them. Suckers!

We had a problem: our API tokens were much too powerful. We needed to scope them down and let them express roles, and I scoped up that project to replace OAuth2 tokens altogether. We now have what I think is one of the more expansive Macaroon implementations on the Internet.

I dragged us into using Macaroons because I wanted us to use a hipster token format. Google designed Macaroons for a bigger reason: they hoped to replace browser cookies with something much more powerful.

The problem with simple bearer tokens, like browser cookies or JWTs, is that they’re prone to being stolen and replayed by attackers.

game-over: pentest jargon for “very bad”

Worse, a stolen token is usually a game-over condition. In most schemes, a bearer token is an all-access pass for the associated user. For some applications this isn’t that big a deal, but then, think about banking. A banking app token that authorizes arbitrary transactions is a recipe for having a small heart attack on every HTTP request.

(Perfectly minimized API tokens: a software security holy grail)

Macaroons are user-editable tokens that enable JIT-generated least-privilege tokens. With minimal ceremony and no additional API requests, a banking app Macaroon lets you authorize a request with a caveat like, I don’t know, {'maxAmount': '$5'}. I mean, something way better than that, probably lots of caveats, not just one, but you get the idea: a token so minimized you feel safe sending it with your request. Ideally, a token that only authorizes that single, intended request.
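A minimal Python 3 sketch of what "JIT-generated" could look like on the client: right before sending a request, pile on caveats that pin the token to that one request. The `op` and `path` caveats reuse the toy format from earlier; `not_after` and the helper name `token_for_request` are made up for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def attenuate(token: str, caveat: dict) -> str:
    """Append a caveat, keyed by the previous tail (no server secret involved)."""
    parts = json.loads(token)
    cav = json.dumps(caveat, sort_keys=True)
    old_tail = base64.b64decode(parts[-1])
    new_tail = hmac.new(old_tail, cav.encode(), hashlib.sha256).digest()
    return json.dumps(parts[:-1] + [cav, base64.b64encode(new_tail).decode()])


def token_for_request(token, op, path, ttl_seconds=30, now=None):
    """Narrow a broad token to one operation, one path, and a short lifetime."""
    now = int(time.time() if now is None else now)
    for caveat in ({"op": op}, {"path": path}, {"not_after": now + ttl_seconds}):
        token = attenuate(token, caveat)
    return token


# Stand-in for a long-lived token handed out by the server.
root_tail = hmac.new(b"server-side key", b"nonce-for-user-10", hashlib.sha256).digest()
broad = json.dumps(["nonce-for-user-10", base64.b64encode(root_tail).decode()])

narrow = token_for_request(broad, "read", "/images/cat.png", now=1_700_000_000)
print(json.loads(narrow)[1:-1])
```

The broad token never leaves the client; only `narrow` goes over the wire, and no API call was needed to produce it.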


That’s not why we like Macaroons. We already assume our tokens aren’t being stolen.

In most systems, the developers come up with a permissions system, and you’re stuck with it. We run a public cloud platform, and people want a lot of different things from our permissions. The dream is, we (the low-level platform developers on the team) design a single permission system, one time, and go about our jobs never thinking about this problem again.

Instead of thinking of all of our “roles” in advance, we just model our platform with caveats:

  1. Users belong to Organizations.
  2. Organizations own Apps.
  3. Apps contain Machines and Volumes.
  4. To any of these things, you can Read, Write, Create, Delete, and/or Control.
  5. Some administrivia, like expiration (ValidityWindow), locking tokens to specific Fly Machines (FromMachineSource), and escape hatches like Mutation (for our GraphQL API).

(this is a vibes-based notation, don’t think too hard about it)

Simplistic. But it expresses admin tokens:

Organization 4721, mask=*

And it expresses normal user tokens:

Organization 4721, mask=read,write,control
(App 123, mask=control), (App 345, mask=read, write, control)

And also an auditor-only token for that user:

Organization 4721, mask=read,write,control
(App 123, mask=control), (App 345, mask=read, write, control)
Organization 4721, mask=read

(our deploy tokens are more complicated than this)

Or a deployment-only token, for a CI/CD system:

Organization 4721, mask=write,control
(App 123, mask=*)

Those are just the roles we came up with. Users can invent others. The important thing is that they don’t have to bother me about them.
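To make the vibes-based notation slightly less vibes-based, here is a Python 3 sketch of how stacked caveats intersect. Everything in it (`Request`, `org_caveat`, `apps_caveat`, `allowed`, and the exact semantics of an App mask) is an assumption for illustration, not our real caveat code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    org: int
    app: int
    action: str


def org_caveat(org, mask):
    """Passes only for requests inside `org` whose action is in `mask`."""
    def check(req):
        return req.org == org and req.action in mask
    return check


def apps_caveat(masks):
    """Passes only for listed apps, and only for the actions listed per app."""
    def check(req):
        return req.action in masks.get(req.app, frozenset())
    return check


def allowed(caveats, req):
    # Every caveat must pass on its own; a caveat-free token is never honored.
    return bool(caveats) and all(check(req) for check in caveats)


user = [
    org_caveat(4721, {"read", "write", "control"}),
    apps_caveat({123: {"control"}, 345: {"read", "write", "control"}}),
]
# The auditor token is the user token plus one more caveat.
auditor = user + [org_caveat(4721, {"read"})]

write_345 = Request(org=4721, app=345, action="write")
print(allowed(user, write_345), allowed(auditor, write_345))
```

Adding the `mask=read` caveat did not grant anything new; it only removed what the user token already allowed, which is the whole trick.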


Astute readers will have noticed by now that we haven’t shown any code that actually evaluates a caveat. That’s because it’s boring, and I’m too lazy to write it out. Got an Organization token for image-hosting that allows Reads? Ok; check and make sure the incoming request is for an asset of image-hosting, and that it’s a Read. Whatever code you came up with, it’d be fine.

These straightforward restrictions are called “first party caveats”. The first party is us, the platform. We’ve got all the information we need to check them.

Let’s kit out our token format some more.

def third_party_caveat(ka, tail, msg, url):
    crk = os.urandom(16)
    ticket = enc(encrypt(ka, json.dumps({
        'crk': enc(crk),
        'msg': msg
    })))
    challenge = enc(encrypt(tail, crk))
    return { 'url': url, 'ticket': ticket, 'challenge' : challenge }

key = bytes("YELLOW SUBMARINE")
url = "https://canary.service"
tail = dec(json.loads(m2)[-1])  # the current tail of m2 encrypts the challenge
# arguments follow the signature: (ka, tail, msg, url)
c3 = third_party_caveat(key, tail, json.dumps({'user': 'bobson.dugnutt'}), url)
m3 = attenuate(m2, c3)

Up till now, we’ve gotten by with nothing but HMAC, which is one of the great charms of the design. Now we need to encrypt. There’s no authenticated encryption in the Python standard library, but that won’t stop us.

# do i really need to say that i'm not serious about this?

def hmactr(k, n):
    ks = hm.new(k, n, sha256)
    for counter in xrange(sys.maxint):
        ks.update(str(counter))  # mix in the counter so each keystream block differs
        kbs = ks.digest()
        for i in xrange(16): yield kbs[i]

def encrypt(k, buf):
    # no digestmod given: Python 2's hmac defaults to MD5, hence the 16-byte tags
    ak = hm.new(k, 'auth').digest()
    nonce = os.urandom(16)
    cipher = hmactr(hm.new(k, 'enc').digest(), nonce)
    ctxt = bytearray(buf)
    for i in xrange(len(buf)):
        ctxt[i] ^= ord(next(cipher))
    res = nonce + str(ctxt)
    return res + hm.new(ak, res).digest()

def decrypt(k, buf):
    ak = hm.new(k, 'auth').digest()
    # check the tag before touching the ciphertext
    if not hm.compare_digest(buf[-16:], hm.new(ak, buf[:-16]).digest()):
        return False
    nonce = buf[:16]
    cipher = hmactr(hm.new(k, 'enc').digest(), nonce)
    ptxt = bytearray(buf[16:-16])
    for i in xrange(len(buf[16:-16])):
        ptxt[i] ^= ord(next(cipher))
    return str(ptxt)

With “third-party” caveats comes a cast of characters. We’re still the first party. You’ll play the second party. The third party is any other system in the world that you trust: an SSO system, an audit log, a revocation checker, whatever.

Here’s the trick of the third-party caveat: our platform doesn’t know what your caveat means, and it doesn’t have to. Instead, when you see a third-party caveat in your token, you tear a ticket off it and exchange it for a “discharge Macaroon” with that third party. You submit both Macaroons together to us.

Let’s attenuate our token with a third-party caveat hooking it up to a “canary” service that generates a notice approximately any time the token is used.

To build that canary caveat, you first make a ticket that users of the token will hand to your canary, and then a challenge that Fly.io will use to verify the discharges your checker spits out. The ticket and the challenge are both encrypted. The ticket is encrypted under KA, so your service can read it. The challenge is encrypted under the previous Macaroon tail, so only Fly.io can read it. Both hide yet another key, the random HMAC key CRK (“caveat root key”).

In addition to CRK, the ticket contains a message, which says whatever you want it to; Fly.io doesn’t care. Typically, the message describes some kind of additional checking you want your service to perform before spitting out a discharge token.

def discharge(ka, ticket):
    ptxt = decrypt(ka, dec(ticket))
    if ptxt == False: return False
    tbody = json.loads(ptxt)
    # not shown: do something with tbody['msg']
    return json.dumps([ticket, enc(hmac(dec(tbody['crk']), ticket))])

To authorize a request with a token that includes a third-party caveat for the canary service, you need to get your hands on a corresponding discharge Macaroon. Normally, you do that by POSTing the ticket from the caveat to the service.

Discharging is simple. The service, which holds KA, uses it to decrypt the ticket. It checks the message and makes some decisions. Finally, it mints a new macaroon, using CRK, recovered from the ticket, as the root key. The ticket itself is the nonce.

If it wants, the third-party service can slap on a bunch of first-party caveats of its own. When we verify the Macaroon, we’ll copy those caveats out and enforce them. Attenuation of a third-party discharge macaroon works like a normal macaroon.

def verify_third_party(tag, cav, discharges=[]):
    crk = decrypt(tag, dec(cav['challenge']))
    if crk == False: return False
    discharge = None
    for dcs in discharges:
        if json.loads(dcs)[0] == cav['ticket']:
            discharge = dcs
    if not discharge: return False
    mac = json.loads(discharge)
    key = crk
    # boring old stuff ---------------------
    tag = ""
    for cav in mac[:-1]:
        tag = hmac(key, cav)
        key = tag
    return hm.compare_digest(tag, dec(mac[-1]))

To verify tokens that have third-party caveats, start with the root Macaroon, walking the caveats like usual. At each third-party caveat, match the ticket from the caveat with the nonce on the discharge Macaroon. The root Macaroon’s tail as of that caveat (the running HMAC key) decrypts the challenge in the caveat, recovering CRK, which cryptographically verifies the discharge.

(The Macaroons paper uses different terms: “caveat identifier” or cId for “ticket”, and “verification-key identifier” or vId for “challenge”. These names are self-evidently bad and our contribution to the state of the art is to replace them.)
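Putting the third-party pieces together, here is an end-to-end Python 3 sketch of the ticket, challenge, and discharge dance. It is a loose port of the toy code above under my own assumptions: a homemade HMAC-SHA256 encrypt-then-MAC stands in for real authenticated encryption (still not serious), `mint` takes a plain string nonce, and the verifier only checks cryptography, not what any caveat means.

```python
import base64
import hashlib
import hmac
import json
import os


def mac(key, msg): return hmac.new(key, msg, hashlib.sha256).digest()
def b64(raw): return base64.b64encode(raw).decode("ascii")
def unb64(text): return base64.b64decode(text)


def keystream(key, nonce, n):
    out, counter = b"", 0
    while len(out) < n:
        out += mac(key, nonce + counter.to_bytes(8, "big"))
        counter += 1
    return out[:n]


def encrypt(key, ptxt):
    nonce = os.urandom(16)
    stream = keystream(mac(key, b"enc"), nonce, len(ptxt))
    body = nonce + bytes(a ^ b for a, b in zip(ptxt, stream))
    return body + mac(mac(key, b"auth"), body)  # encrypt-then-MAC


def decrypt(key, blob):
    body, tag = blob[:-32], blob[-32:]
    if not hmac.compare_digest(tag, mac(mac(key, b"auth"), body)):
        return None
    stream = keystream(mac(key, b"enc"), body[:16], len(body) - 16)
    return bytes(a ^ b for a, b in zip(body[16:], stream))


def mint(root_key, nonce):
    return json.dumps([nonce, b64(mac(root_key, nonce.encode()))])


def attenuate(token, cav_str):
    parts = json.loads(token)
    tail = mac(unb64(parts[-1]), cav_str.encode())
    return json.dumps(parts[:-1] + [cav_str, b64(tail)])


def third_party_caveat(ka, tail, msg, url):
    crk = os.urandom(32)
    ticket = b64(encrypt(ka, json.dumps({"crk": b64(crk), "msg": msg}).encode()))
    return json.dumps({"url": url, "ticket": ticket, "challenge": b64(encrypt(tail, crk))})


def discharge(ka, ticket):
    ptxt = decrypt(ka, unb64(ticket))
    if ptxt is None:
        return None
    # A real service would inspect the ticket's "msg" before minting anything.
    return mint(unb64(json.loads(ptxt)["crk"]), ticket)


def verify(token, root_key, discharges=()):
    parts = json.loads(token)
    key = mac(root_key, parts[0].encode())
    for cav_str in parts[1:-1]:
        cav = json.loads(cav_str)
        if "ticket" in cav:
            crk = decrypt(key, unb64(cav["challenge"]))  # key is the tail so far
            match = [d for d in discharges if json.loads(d)[0] == cav["ticket"]]
            if crk is None or not match or not verify(match[0], crk):
                return False
        key = mac(key, cav_str.encode())
    return hmac.compare_digest(key, unb64(parts[-1]))


root_key, ka = os.urandom(32), os.urandom(32)  # ka is shared with the canary
token = attenuate(mint(root_key, "user-10"), json.dumps({"op": "read"}))
tail = unb64(json.loads(token)[-1])
token = attenuate(token, third_party_caveat(ka, tail, "notify on use", "https://canary.service"))

ticket = json.loads(json.loads(token)[-2])["ticket"]
d = discharge(ka, ticket)
print(verify(token, root_key), verify(token, root_key, [d]))
```

The first party never learns KA and the third party never learns the root key; CRK is the only secret they end up sharing, and only for this one caveat.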

There are two big applications for third-party caveats in Popular Macaroon Thought. First, they facilitate microservice-izing your auth logic, because you can stitch arbitrary policies together out of third-party caveats. Second, they seem like fertile ground for an ecosystem of interoperable Macaroon services: Okta and Google could stand up SSO dischargers, for instance, or someone can do a really good revocation service.

Neither of these light us up. We’re allergic to microservices. As for public protocols, well, it’s good to want things. So we almost didn’t even implement third-party caveats.


I’m glad we did though, because they’ve been pretty great.

The first problem third-party caveats solved for us was hazmat tokens. To the extent possible, we want Macaroon tokens to be safe to transmit between users. Our Macaroons express permissions, but not authentication, so it’s almost safe to email them.

The way it works is, our Macaroons all have a third-party caveat pointing to a “login service”, either identifying the proper bearer as a particular user or as a member of some Organization. To allow a request with your token, you first need to collect the discharge from the login service, which requires authentication.

The login discharge is very sensitive, but there isn’t much reason to pass it around. The original permissions token is where all the interesting stuff is, and it’s not scary. So that’s nice.

Ben then came up with third-party caveats that require Google or GitHub SSO logins. If your token has one of those caveats, when you run flyctl deploy, a browser will pop up to log you into your SSO IdP (if you haven’t done so recently already).

We’ve put a bunch of work into getting the guts of our SSO system working, but that work has mostly been invisible to customers. But Macaroon-ized SSO has a subtle benefit: you can configure Fly.io to automatically add SSO requirements to specific Organizations (so, for instance, a dev environment might not need SSO at all, and prod might need two).

SSO requirements in most applications are a brittle pain in the ass. Ours are flexible and straightforward, and that happened almost by accident. Macaroons, baby!

Here’s a fun thing you can do with a Macaroon system: stand up a Slack bot, and give it an HTTP POST handler that accepts third-party tickets. Then:

So, the bot is cute, but any platform could do that. What’s cool is the way our platform doesn’t work with Slack; in fact, nothing on our platform knows anything about Slack, and Slack doesn’t know anything about us. We didn’t reach out to a Slack endpoint. Everything was purely cryptographic.

That bot could, if I sunk some time into it, enforce arbitrary rules: it could selectively add caveats for the requests it authorizes, based on lookups of the users requesting them, at specific times of day, with specific logging. Theoretically, it could add third-party caveats of its own.

The win for us for third-party caveats is that they create a plugin system for our security tokens. That’s an unusual place to see a plugin interface! But Macaroons are easy to understand and keep in your head, so we’re pretty confident about the security issues.


Obviously, we didn’t write our Macaroon code in Python, or with HMAC-SHA256-CTR.

We landed on a primary implementation in Golang (Ben subsequently wrote an Elixir implementation). Our hash is SHA256, our cipher is Chapoly. We encode in MsgPack.

We didn’t use the pre-existing public implementation because we were warned not to. The Macaroon idea is simple, and it exists mostly as an academic paper, not a standard. The community that formed around building open source “standard” Macaroons decided to use untyped opaque blobs to represent caveats. We need things to be as rigidly unambiguous as they can be.

The big strength of Macaroons as a cryptographic design — that it’s based almost entirely on HMAC — makes it a challenge to deploy. If you can verify a Macaroon, you can generate one. We have thousands of servers. They can’t all be allowed to generate tokens.

What we did instead:

  • We split token checking into “verification” of token HMAC tags and “clearing” of token caveats.
  • Verification occurs only on a physically isolated token-verification service; to verify a token’s tag, you HTTP POST the token to the verifier.
  • Clearing of token caveats can happen anywhere. Token caveat clearing is domain-specific and subject to change; token verification is simple cryptography and changes rarely.
  • A token verification is cacheable. The client library for the token verifier does that, which speeds things up by exploiting the locality of token submissions.
  • The verification service is backed by a LiteFS-distributed SQLite database, so verification is fast globally — a major step forward from our legacy OAuth2 tokens, which are only fast in Ashburn, VA.
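The caching bullet is easy to picture in code. This Python 3 sketch wraps a remote verifier in a small TTL cache; `CachingVerifier`, the TTL, and the injected `remote_verify` callable are all hypothetical stand-ins, not our client library's actual interface.

```python
import hashlib
import time


class CachingVerifier:
    """Remember recent verification results so repeat tokens skip the round trip."""

    def __init__(self, remote_verify, ttl=60.0, clock=time.monotonic):
        self._remote = remote_verify  # e.g. a function that POSTs to the verifier
        self._ttl = ttl
        self._clock = clock
        self._cache = {}              # sha256(token) -> (result, expires_at)

    def verify(self, token: str) -> bool:
        digest = hashlib.sha256(token.encode()).digest()
        now = self._clock()
        hit = self._cache.get(digest)
        if hit is not None and hit[1] > now:
            return hit[0]
        ok = self._remote(token)
        self._cache[digest] = (ok, now + self._ttl)
        return ok


calls = []

def fake_remote(token):
    # Stand-in for the isolated verification service.
    calls.append(token)
    return token.startswith("good")

clock = [0.0]
verifier = CachingVerifier(fake_remote, ttl=60.0, clock=lambda: clock[0])

verifier.verify("good-token")
verifier.verify("good-token")   # served from the cache
calls_before_expiry = len(calls)

clock[0] = 61.0                 # pretend the TTL has passed
verifier.verify("good-token")
print(calls_before_expiry, len(calls))
```

Only the tag check is cached here. Clearing caveats still has to run against each request, since the same token can be fine for one request and wrong for the next.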

Now buckle up, because I’m about to try to get you to care about service tokens.

We operate “worker servers” all over the world to host apps for our customers. To do that, those workers need access to customer secrets, like the key to decrypt a customer volume. To retrieve those secrets, the workers have to talk to secrets management servers.

We manage a lot of workers. We trust them. But we don’t trust them that much, if you get my drift. You don’t want to just leave it up to the servers to decide which secrets they can access. The blast radius of a problem with a single worker should be no greater than the apps that are supposed to run there.

The gold standard for approving access to customer information is, naturally, explicit customer authorization. We almost have that with Macaroons! The first time an app runs on a worker, the orchestrator code has a token, and it can pass that along to the secret stores.

The problem is, you need that token more than once; not just when the user does a deploy, but potentially any time you restart the app or migrate it to a new worker. And you can’t just store and replay user Macaroons. They have expirations.

This is like dropping privilege with things like pledge(2), but in a distributed system.

So our token verification service exposes an API that transforms a user token into a “service token”, which is just the token with the authentication caveat and expiration “stripped off”.

What’s cool is: components that receive service tokens can attenuate them. For instance, we could lock a token to a particular worker, or even a particular Fly Machine. Then we can expose the whole Fly Machines API to customer VMs while keeping access traceable to specific customer tokens. Stealing the token from a Fly Machine doesn’t help you since it’s locked to that Fly Machine by a caveat attackers can’t strip.
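Stripping a caveat is the one thing a token holder can't do, so it has to happen somewhere that knows the root key. Here is a guess, in Python 3, at what that transformation could look like in the toy format from this post: verify, drop the unwanted caveats, rebuild the HMAC chain. The caveat names (`auth`, `not_after`, `machine`) and helpers are invented for illustration, and the real service certainly differs.

```python
import base64
import hashlib
import hmac
import json


def mac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def chain(root_key: bytes, nonce: str, caveats: list) -> str:
    """Build a token from scratch; only a holder of the root key can do this."""
    key = mac(root_key, nonce)
    for cav in caveats:
        key = mac(key, cav)
    return json.dumps([nonce, *caveats, base64.b64encode(key).decode()])


def attenuate(token: str, caveat: dict) -> str:
    *head, tail = json.loads(token)
    cav = json.dumps(caveat, sort_keys=True)
    new_tail = mac(base64.b64decode(tail), cav)
    return json.dumps(head + [cav, base64.b64encode(new_tail).decode()])


def to_service_token(token: str, root_key: bytes, strip=("auth", "not_after")) -> str:
    nonce, *caveats, tail = json.loads(token)
    expected = json.loads(chain(root_key, nonce, caveats))[-1]
    if not hmac.compare_digest(expected, tail):
        raise ValueError("token does not verify")
    # Keep every caveat except the kinds being stripped, then re-sign the chain.
    kept = [c for c in caveats if not set(json.loads(c)) & set(strip)]
    return chain(root_key, nonce, kept)


root_key = b"known only to the verification service"
user = chain(root_key, "nonce-1", [])
for cav in ({"org": 4721}, {"auth": "user-10"}, {"not_after": 1_700_000_000}):
    user = attenuate(user, cav)

service = to_service_token(user, root_key)
# Anyone holding the service token can narrow it further, no root key needed.
locked = attenuate(service, {"machine": "hypothetical-machine-id"})
print(json.loads(locked)[:-1])
```

The nonce survives the transformation, so the service token stays traceable to the customer token it came from.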


If a customer loses their tokens to an attacker, we can’t just blow that off and let the attacker keep compromising the account!

Every Macaroon we issue is identified by a unique nonce, and we can revoke tokens by that nonce. This is just a basic function of the token verification service we just described.

Revoking a nonce cancels every token derived, through attenuation, from the token that carries it.

We host token caches all over our fleet. Token revocation invalidates the caches. Anything with a cache checks frequently whether to invalidate. Revocation is rare, so just keeping a revocation list and invalidating caches wholesale seems fine.
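A small Python 3 sketch of the "revocation list plus wholesale cache invalidation" idea. The generation counter, class names, and the JSON-list token shape are assumptions borrowed from the toy format earlier in this post, not a description of our actual service.

```python
import json


class RevocationList:
    def __init__(self):
        self.revoked = set()
        self.generation = 0  # bumped on every change so caches notice

    def revoke(self, nonce: str) -> None:
        if nonce not in self.revoked:
            self.revoked.add(nonce)
            self.generation += 1


class TokenCache:
    """Caches verified tokens; dumps everything when the revocation list changes."""

    def __init__(self, revocations: RevocationList):
        self._revocations = revocations
        self._seen = revocations.generation
        self._verified = set()

    def _sync(self) -> None:
        if self._seen != self._revocations.generation:
            self._verified.clear()  # revocation is rare, so wholesale is fine
            self._seen = self._revocations.generation

    def remember(self, token: str) -> None:
        self._sync()
        self._verified.add(token)

    def is_cached(self, token: str) -> bool:
        self._sync()
        return token in self._verified


def is_revoked(token: str, revocations: RevocationList) -> bool:
    # Attenuation never changes the first element, so children share the nonce.
    return json.loads(token)[0] in revocations.revoked


revocations = RevocationList()
cache = TokenCache(revocations)

parent = json.dumps(["nonce-1", "tail-a"])
child = json.dumps(["nonce-1", '{"op": "read"}', "tail-b"])
cache.remember(child)

revocations.revoke("nonce-1")
print(is_revoked(parent, revocations), is_revoked(child, revocations), cache.is_cached(child))
```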


I get it, it’s tough to get me to shut up about Macaroons.

A couple years ago, I wrote a long survey of API token designs, from JWTs (never!) to Biscuits. I had a bunch to say about Macaroons, not all of it positive, and said we’d be plowing forward with them at Fly.io.

My plan had been to follow up soon after with a deep dive on Macaroons as we planned them for Fly.io. I’m glad I didn’t do that, not just because it would’ve been embarrassing to announce a feature that took us over 2 years to launch, but also because the process of working on this with Ben Toews changed a lot of my thinking about them.

I think if you asked Ben, he’d say he had mixed feelings about how much complexity we wrangled to get this launched. On the other hand: we got a lot of things out of them without trying very hard:

  • Security tokens you can (almost) email to your users and partners without putting your account at risk.
  • A flexible permission system, encoded directly into the tokens, that users can drive without talking to our servers.
  • A plugin system that users can (when we clean up the tooling) use themselves, to add things like Passkeys or two-person-approval rules or audit logging, without us getting in the middle.
  • An SSO system that can stack different IdPs, mandate SSO login, and do that on a per-Organization basis.
  • Inter-service authorization that is traceable back to customer actions, so our servers can’t just make up which apps they’re allowed to look at.
  • An elegant way of exposing our own APIs to customer Fly Machines with ambient authentication, but without the AWS IMDSv1 credential theft problem.

There are downsides and warts! I’m mostly not telling you about them! Pure restrictive caveats are an awkward way to express some roles. And, blinded by my hunger to get Macaroons deployed, I spat in the face of science and used internal database IDs as our public caveat format, an act for which JP will never forgive me.

If I’ve piqued your interest, the code for this stuff is public, along with some more detailed technical documentation.