
What building an AI incident agent actually taught us

We've spent the last 6 months pointing an agent at real production errors. Going in, we were sure the hard problem was "can an AI write the fix." Almost everything that mattered turned out to be somewhere else. Three surprising things we learned:

1. Most errors were never worth a page

Across real traffic, ~70% of errors triage out as noise: they are not actionable and no human is needed. This rate holds surprisingly consistently across tenants and services. I still remember integrating with our first design partner. The first Slack notification comes in and the team is high-fiving, pure excitement. This thing actually works! Next thing we know, our Slack inbox blows up as a burst of 12 notifications arrives. So we built a relevance gate to make sure notifications only go out when something truly breaks. We set out to fix bugs and discovered that the bigger win was the time we give back on the ~70% of errors that never needed a human.
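For anyone curious what a relevance gate can look like, here is a minimal sketch in Python. It is not our production code. The `ErrorEvent` fields, the `actionable` flag (standing in for whatever triage step decides noise vs. real), the per-fingerprint cooldown, and the 300-second default are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ErrorEvent:
    fingerprint: str   # hypothetical grouping key, e.g. exception type + top frame
    actionable: bool   # hypothetical verdict from an upstream triage step
    timestamp: float   # seconds since some reference point


@dataclass
class RelevanceGate:
    # Assumed cooldown: repeats of one error inside this window stay silent.
    cooldown_seconds: float = 300.0
    _last_notified: Dict[str, float] = field(default_factory=dict)

    def should_notify(self, event: ErrorEvent) -> bool:
        # Errors triaged as noise never reach a human.
        if not event.actionable:
            return False
        last = self._last_notified.get(event.fingerprint)
        # Suppress repeats of the same error inside the cooldown window.
        if last is not None and event.timestamp - last < self.cooldown_seconds:
            return False
        self._last_notified[event.fingerprint] = event.timestamp
        return True


# A burst of 12 identical actionable errors, five seconds apart.
gate = RelevanceGate()
burst = [ErrorEvent("TimeoutError@checkout", True, 5.0 * i) for i in range(12)]
sent = sum(gate.should_notify(e) for e in burst)
print(sent)  # 1
```

With a gate like this, a burst of identical errors produces one notification instead of twelve, and anything triaged as noise produces none. A real gate would likely weigh more signals than a single boolean and a timer.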

Yumi Joh

4h ago

OurBase - AI finds the bug. You ship the fix.

We launch today, before Father's Day weekend, to honor my dad. He was the on-call warrior, and our agent shares his name. Bohun watches your alerts (Sentry, Grafana, GCP), reads the stack trace, pulls your source files from GitHub/GitLab, classifies severity, and opens a PR with a fix for engineers to review and merge. There is always a human in the loop.