# May 15: Bad TLS cert update broke Consul (16:12 UTC)
A configuration automation run accidentally overwrote Consul TLS certificates with invalid ones, causing Consul lookups to fail across parts of the fleet. We have spent much of the past couple of years removing Consul as a key dependency, so most of our platform was not directly impacted: all running Machines were unaffected and fly-proxy routing remained functional. The impact was concentrated on the few control-plane operations that still rely on Consul, mainly fly ssh console and OIDC tokens. We restored the correct certificates and restarted the Consul agents to pick up the fixed TLS configuration, after which errors and alerts subsided.
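For illustration only, here is a minimal sketch of a pre-deploy check that could catch this class of failure before certificates are written to hosts. It is not a description of our actual automation: the function name and file names are hypothetical. It uses Python's standard `ssl` module to confirm that a PEM certificate and private key parse and match each other. It does not check expiry, hostnames, or whether the certificate chains to the expected CA.

```python
import ssl
import tempfile
from pathlib import Path


def cert_pair_loads(cert_path: str, key_path: str) -> bool:
    """Return True if the PEM certificate and private key load together.

    Hypothetical helper: load_cert_chain raises if either file is missing,
    is not valid PEM, or if the key does not match the certificate.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except (ssl.SSLError, OSError):
        return False
    return True


def demo_rejects_garbage() -> bool:
    """Write two invalid files (hypothetical names) and run the check on them."""
    with tempfile.TemporaryDirectory() as tmp:
        cert = Path(tmp) / "agent.pem"
        key = Path(tmp) / "agent-key.pem"
        cert.write_text("not a certificate\n")
        key.write_text("not a key\n")
        return cert_pair_loads(str(cert), str(key))
```

An automation run could call a check like this on the staged files and refuse to overwrite the live certificates when it returns False.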