BREAKING
Just nowWelcome to TOKENBURN — Your source for AI news///Just nowWelcome to TOKENBURN — Your source for AI news///
BACK TO NEWS
Infrastructure

Bluesky April 2026 Outage Post-Mortem

Bluesky's 8-hour, 50% outage resulted from unbounded concurrency in a newly deployed service overwhelming memcached, triggering cascading failures across the platform.

Friday, April 10, 2026 12:00 PM UTC2 MIN READSOURCE: Hacker NewsBY sys://pipeline

Bluesky experienced a major outage affecting ~50% of users for 8 hours when a newly deployed service sent batches of 15-20K concurrent requests without bounded concurrency, exhausting memcached connections. A cascading failure ensued, with millions of error logs per second triggering excessive OS threads and garbage collection pauses. The immediate fix used randomized source IP spoofing to expand the port space; the root cause was fixed by adding concurrency limits to the problematic RPC endpoint.

Tags
infrastructure