The Hot-Shard Spiral
It's 14:03 on Black Friday. MegaToys' flash sale went live at 13:30. Checkout is green, payments are clearing — but the warehouse queue is empty. orders_fact is 41 minutes stale and the gap is growing. At 13:00 the pipeline was sub-30s fresh. Fulfillment, fraud, and the live 'low stock' banners all read this table. Sale ends 15:00. You're on call.
The incident
It's Black Friday, and MegaToys' flash sale opened at 13:30. Orders are pouring in: checkout is green and payments are clearing, so to a customer everything looks fine. But the warehouse queue is empty. Every system that reads orders_fact (fulfillment, fraud scoring, the live 'low stock' banners) is now working off data that's 41 minutes stale, and that gap is widening by about 1.4 minutes every minute. At 13:00 this pipeline was fresh to within 30 seconds. The producers are healthy and Postgres is writing in milliseconds, so orders are being created and would land instantly if they reached it. Instead, they're getting stuck somewhere on the path between Kafka and Flink, and the stall is not clearing on its own.
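To put the widening gap in perspective, here is a rough sketch that extrapolates the staleness to the end of the sale at 15:00. It assumes the reported growth rate of about 1.4 minutes per minute holds linearly from 14:03 onward, which a real incident may not honor; the function name `projected_staleness_min` is made up for illustration.

```python
from datetime import datetime


def projected_staleness_min(current_min, growth_per_min, now, until):
    """Linearly extrapolate staleness (in minutes) from `now` to `until`.

    Assumption: the growth rate stays constant over the whole interval.
    """
    elapsed_min = (until - now).total_seconds() / 60
    return current_min + growth_per_min * elapsed_min


# Times and figures taken from the incident description.
now = datetime.strptime("14:03", "%H:%M")
sale_end = datetime.strptime("15:00", "%H:%M")

projected = projected_staleness_min(
    current_min=41, growth_per_min=1.4, now=now, until=sale_end
)
print(f"Projected staleness at 15:00: {projected:.1f} min")
```

Under that assumption the table would be roughly two hours behind by the time the sale closes, against a 5-minute SLA.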
Symptoms on the table
- orders_fact freshness 41 min (SLA 5 min) and climbing
- order-indexer group lag 18.4M messages and rising
- partition 7 alone holds 17.9M of the 18.4M lag (97%)
- order-indexer rebalancing every ~7 seconds
- checkout, payments, and producer throughput all nominal
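The partition-skew symptom above can be checked mechanically. The sketch below takes per-partition lag for a consumer group and flags any partition holding a disproportionate share of it. Only the partition 7 figure and the group total come from the symptoms; the partition count (11) and the even spread of the remaining lag are invented for illustration, and `hot_partitions` is a hypothetical helper. In practice the per-partition numbers would come from something like the `kafka-consumer-groups` tool's `--describe` output for the order-indexer group.

```python
def hot_partitions(lag_by_partition, share_threshold=0.5):
    """Return (partition, share_of_total_lag) pairs at or above the threshold.

    Results are sorted by lag, largest first.
    """
    total = sum(lag_by_partition.values())
    if total == 0:
        return []
    ranked = sorted(lag_by_partition.items(), key=lambda kv: kv[1], reverse=True)
    return [(p, lag / total) for p, lag in ranked if lag / total >= share_threshold]


# Hypothetical layout: 11 partitions (0..10); the 0.5M of lag outside
# partition 7 is spread evenly. Only partition 7 and the total are from
# the incident's symptom list.
lag = {p: 50_000 for p in range(11)}
lag[7] = 17_900_000
total = sum(lag.values())

hot = hot_partitions(lag)
for partition, share in hot:
    print(f"partition {partition}: {share:.0%} of {total:,} lagging messages")
```

A single partition carrying nearly all of the lag points at skew (a hot key or a stuck consumer on that partition) rather than a group that is uniformly under-provisioned.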
Systems on the board
The real components in play for this incident — the surface you investigate when the clock starts.
What you'll practice
This is a timed, hands-on incident-response challenge. You diagnose the symptom, trace it to a root cause across real components, and ship a fix before the clock runs out: the same loop you run on call, without the production blast radius.