April 23: Extension provider polling overloaded Postgres

April 23: Extension provider polling overloaded Postgres (11:12 UTC)

An extension provider increased how often they polled one of our private API endpoints. The endpoint ran an expensive Postgres query, and at the higher rate it saturated CPU on the database backing our dashboard and GraphQL API. This caused intermittent 500s on the dashboard and GraphQL API endpoints for about 40 minutes. The provider reverted the polling frequency change and traffic dropped back to normal.

The query was checking whether an organization had a registered extension with the provider, but it was scanning far more rows than it needed to. We rewrote it to short-circuit on the first match. We are also adding rate limiting on this endpoint to stop a similar spike from saturating the database again.
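As a rough illustration of both remediation items, here is a minimal sketch, not the actual query or limiter, neither of which is shown in this report. It assumes a hypothetical `extensions` table with `org_id` and `provider_id` columns, and uses Python's built-in `sqlite3` as a stand-in for Postgres so the example runs without a server. `SELECT EXISTS (...)` behaves the same way in Postgres: it stops at the first matching row instead of reading every match. The `TokenBucket` class, its `rate` and `capacity` parameters, and the function name are likewise illustrative assumptions.

```python
import sqlite3
import time


def org_has_extension(conn, org_id, provider_id):
    """Return True if the org has at least one extension with the provider.

    EXISTS lets the database stop at the first matching row rather than
    scanning or counting every match. Table and column names are hypothetical.
    """
    row = conn.execute(
        "SELECT EXISTS ("
        " SELECT 1 FROM extensions"
        " WHERE org_id = ? AND provider_id = ?"
        ")",
        (org_id, provider_id),
    ).fetchone()
    return bool(row[0])


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested deterministically
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a setup like this, the endpoint handler would call `allow()` on a per-caller bucket before running the query and reject the request (for example with HTTP 429) when it returns `False`, so that a polling spike is shed before it reaches the database.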