
The Incident Challenge
Production Debugging Games for Software Engineers
185 followers
Compete in realistic incident simulations where you find the root cause, fix the system, and race the leaderboard.
This is the 2nd launch from The Incident Challenge.
The Incident Challenge
Launching today
The Incident Challenge is a live debugging challenge for the AI age, where engineers investigate realistic failures across logs, code, runtime, docs, and architecture, then ship the fix and race the leaderboard. Bring your agent. Maybe it helps. Maybe it doesn’t :)
The challenge is live now for 24 hrs. New challenge every two weeks.





Free



The Incident Challenge
Thank you to the 300+ devs who played the last challenge.
The results were insanely close. First and second place were only 13 seconds apart.
Your feedback helped us improve, add features, and make the challenge better.
What's new:
Public leaderboard
Enhanced challenge UI
Added Terminal & SQL shortcuts
Private Challenge Circle (beta) - compete against friends!!
A special challenge, Final Boss (live now for 24 hrs; check out the video):
Players defeat the boss.
K.O. plays.
Staging accepts the win.
Production rejects the ranked result under load.
Beautiful bug. Terrible day in prod.
Go play, test your engineering skills and compete globally!
The idea behind the challenge:
AI made writing code easier. But when production breaks, can it actually understand the system?
The Incident Challenge drops engineers into realistic production failures. You investigate logs, code, runtime, docs, and architecture, then ship the fix and race the leaderboard.
Agents are allowed. Encouraged, actually.
But the challenge is built around the messy parts of engineering that don’t fit neatly into a prompt: dependencies, runtime behavior, race conditions, misleading logs, and staging behaving nothing like prod.
Would love feedback from engineers, founders, devtools people, and anyone thinking about how engineering skills should be tested in the age of AI.
Re_gent
@avi_ct LETS GO
@avi_ct congrats on the launch, Avi. I'd be interested to know what patterns the top performers share: faster log/code navigation, better hypothesis discipline, knowing when not to trust the agent, etc.?
The Incident Challenge
@zolani_matebese Thanks!
The top performers seem to move in a pretty disciplined loop: form a hypothesis, verify it against logs/runtime behavior, use the agent for acceleration, then stay skeptical when the system behavior doesn’t match the explanation. The weaker attempts often look fast at first, but they trust the first plausible answer too much. So yes: log/code navigation matters, but the real edge seems to be knowing what evidence is actually enough to prove root cause.
Re_gent
Hey hunters! 👋 Maker here.
Just shipped our hardest incident yet!
The setup: it's a retro arcade ranked mode. Staging confirms the win. Production rejects it. Same fight, same replay - and it only breaks during leaderboard rushes.
Here's the twist that's been wrecking playtesters: ranked combat got moved off a single synchronous path onto regional shards, each with its own worker pool. So now you get this:
A region that looks clean in one capture is not safe under load. The validator tells you how many regions are still wrong - never which ones. So you can't whack-a-mole it.
And the cheap fixes are all traps:
Trust the client? The UI is lying - it's the whole point.
Add a delay knob? Not a fix.
Collapse back to synchronous? You just deleted the feature.
To win you have to make ranked finalization deterministic in every region under real concurrency - no stale canonical reads, no client authority, no synchronous cop-out. It's a genuine distributed-systems race condition dressed up as a boss fight.
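For anyone who wants a feel for the class of bug before jumping in, here is a minimal sketch of one way to avoid stale canonical reads under concurrency. It is not the challenge's code and not its solution. Everything in it (`RegionStore`, `finalize_ranked`, `run_rush`, the match and player IDs) is hypothetical and made up for illustration. It assumes a single in-memory region store and uses a compare-and-set retry loop so duplicate finalizations from a worker pool are applied exactly once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    version: int
    value: dict


class RegionStore:
    """Hypothetical per-region canonical store with compare-and-set."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = Versioned(0, {"finalized": {}, "scores": {}})

    def read(self) -> Versioned:
        with self._lock:
            return self._state

    def compare_and_set(self, expected_version: int, new_value: dict) -> bool:
        with self._lock:
            if self._state.version != expected_version:
                return False  # another worker wrote first; the caller's read is stale
            self._state = Versioned(expected_version + 1, new_value)
            return True


def finalize_ranked(store: RegionStore, match_id: str, player: str, points: int) -> bool:
    """Idempotently finalize one match. Returns True only if this call applied it."""
    while True:
        snap = store.read()
        if match_id in snap.value["finalized"]:
            return False  # already finalized; do not score it twice
        # Build a new state from the snapshot instead of mutating it in place.
        finalized = {**snap.value["finalized"], match_id: player}
        scores = dict(snap.value["scores"])
        scores[player] = scores.get(player, 0) + points
        new_value = {"finalized": finalized, "scores": scores}
        if store.compare_and_set(snap.version, new_value):
            return True
        # The write lost the race: loop and retry against a fresh read.


def run_rush(workers: int = 8, duplicates: int = 50):
    """Simulate a leaderboard rush: many workers try to finalize the same match."""
    store = RegionStore()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(finalize_ranked, store, "match-1", "p1", 100)
            for _ in range(duplicates)
        ]
        applied = sum(f.result() for f in futures)
    return applied, store.read().value["scores"]
```

Calling `run_rush()` submits the same finalization 50 times across 8 workers; with the version check in place, only one of them should apply. A real sharded system would need the same guarantee from whatever its datastore offers (conditional writes, transactions, or similar), which this sketch does not model.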
Curious who can actually clear it without nuking the architecture. Post your region count when you're stuck 😏 We're reading every one.
The Incident Challenge
@shayliv the genius behind it all.