The "five-minute AI task" that wasted 40 minutes of my life

by•9d ago

I used to plan my lunch breaks around AI tasks. Pick the one task that's easy enough to just hand off, start it, go eat.

Came back more than once to find the agent had stopped two minutes in over some dumb question. Not even a permissions thing. It just sat there waiting for me to say "go ahead", and I didn't know it was stuck until I checked.

Anyone actually solved this or are we all just checking constantly?

Replies

Retime

I’d split this into two separate problems: unnecessary questions and necessary approvals.

For low-risk coding tasks, the agent should usually continue, write down the assumption it made, and give you a tight summary at the end. For actions with real blast radius, it should pause, but the pause has to include enough context that you can decide quickly instead of opening the laptop just to reconstruct the state.

That distinction has shaped how I think about KubeAgent for Kubernetes/on-call work. Read-only steps (logs, events, pod state, rollout history, etc.) should keep moving, because the cost of a wrong assumption is mostly a correction. Anything that mutates production or could widen an incident needs an incident note, the evidence it found, the smallest proposed action, and explicit approval.

Otherwise you end up with the two bad modes people are describing here: babysitting the agent, or yolo mode with fingers crossed. The useful middle rule is: auto-continue when the cost of being wrong is a small correction; interrupt when the cost is user-visible, security-sensitive, or hard to roll back.
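To make that rule concrete, here is a minimal Python sketch of the gate. It is not KubeAgent's actual code; `Action`, `decide`, and `approval_request` are hypothetical names, and the risk flags are simply the categories listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"  # proceed, and note the assumption in the final summary
    PAUSE = "pause"        # stop and ask for explicit approval


@dataclass
class Action:
    # Hypothetical description of one step the agent wants to take.
    description: str
    mutates_production: bool = False
    user_visible: bool = False
    security_sensitive: bool = False
    hard_to_roll_back: bool = False


def decide(action: Action) -> Decision:
    """Pause only when being wrong costs more than a small correction."""
    risky = (
        action.mutates_production
        or action.user_visible
        or action.security_sensitive
        or action.hard_to_roll_back
    )
    return Decision.PAUSE if risky else Decision.CONTINUE


def approval_request(action: Action, evidence: list[str], incident_note: str) -> str:
    """Bundle enough context that a human can decide without reconstructing state."""
    lines = [f"Incident: {incident_note}", "Evidence:"]
    lines += [f"  - {item}" for item in evidence]
    lines.append(f"Proposed action: {action.description}")
    lines.append("Approve? [y/N]")
    return "\n".join(lines)
```

With this, `decide(Action("read pod logs"))` continues, while `decide(Action("restart deployment", mutates_production=True))` pauses and would go out with an `approval_request(...)` message attached. Defaulting every flag to `False` is a choice worth questioning: a stricter version would default unknown actions to `PAUSE`.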

8d ago

the fix is to never trust the "finish" state. three things that compound:

1. wrap every agent call in a timeout that pings you on stall. claude code and cursor both let you set this. 90 seconds is the sweet spot — long enough for real thinking, short enough that you aren't gone for lunch.

2. when you start a task, write the expected output format down before kicking it off. then have a second cheap model check that the output matches that shape. if it doesn't match, that triggers the ping. 80 percent of stalls are "waiting for clarification", not actual compute time.

3. batch the small stuff. five 5-minute tasks queued sequentially with a shared context buffer is way safer than five tabs of you nervously refreshing. the agent does not lose context the way you do.
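a rough sketch of points 1 and 2 as a generic wrapper, in Python. this is not a built-in feature of any particular tool: `run_with_stall_ping`, `matches_shape`, and the `notify` callback are made-up names, it assumes the agent is a CLI process that prints as it works, and it treats silence as a proxy for "stalled". the shape check here is a plain JSON key check standing in for the second cheap model.

```python
import json
import queue
import subprocess
import threading


def run_with_stall_ping(cmd, stall_seconds, notify):
    """Run cmd and call notify() each time it goes stall_seconds without output."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    lines = queue.Queue()

    def pump():
        # Forward stdout line by line so the main loop can time out on silence.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)  # sentinel: the process closed stdout

    threading.Thread(target=pump, daemon=True).start()

    output = []
    while True:
        try:
            line = lines.get(timeout=stall_seconds)
        except queue.Empty:
            notify(f"no output for {stall_seconds}s, agent may be waiting on you")
            continue
        if line is None:
            break
        output.append(line)
    proc.wait()
    return "".join(output)


def matches_shape(raw_output, required_keys):
    """Stand-in for the second-model check: is it a JSON object with the expected keys?"""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and all(key in data for key in required_keys)
```

usage would look something like `out = run_with_stall_ping(["my-agent", "run", "task.md"], 90, send_push)` followed by `if not matches_shape(out, ["summary", "files_changed"]): send_push("output shape mismatch")`, where `my-agent` and `send_push` are placeholders for whatever CLI and notifier you actually use. note it pings repeatedly while the process stays silent, so rate-limit the notifier if that gets noisy.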

the meta-fix: stop using AI for tasks that need your real-time attention. those are for the morning. AI is for the stuff you stack while in meetings.

8d ago

What killed the mid-run stalls for me was making the agent treat a missing answer as a finding instead of a reason to stop. If it can't confirm something it writes 'couldn't verify X, went with Y' into the output and keeps moving. I'm building a research agent that runs unattended, so dead time mid-run is basically the whole problem. Now the gaps land in the final summary and I just fix the two that actually matter, instead of babysitting the full 40 minutes.
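Roughly, the pattern looks like the Python sketch below. It is a simplified illustration rather than my actual agent code: `RunLog`, `resolve`, and the `lookup` callable are hypothetical names, and `lookup` is assumed to return `None` when it can't confirm something.

```python
from dataclasses import dataclass, field


@dataclass
class RunLog:
    findings: list = field(default_factory=list)
    gaps: list = field(default_factory=list)

    def resolve(self, question, lookup, fallback):
        """Try to confirm an answer; if that fails, record the gap and keep going."""
        answer = lookup(question)
        if answer is None:
            # A missing answer becomes a finding instead of a reason to stop.
            self.gaps.append(f"couldn't verify {question}, went with {fallback}")
            return fallback
        self.findings.append(f"{question}: {answer}")
        return answer

    def summary(self):
        """Final report: confirmed findings first, then the gaps to review."""
        parts = ["Findings:"] + [f"  - {item}" for item in self.findings]
        parts += ["Unverified (review these):"] + [f"  - {item}" for item in self.gaps]
        return "\n".join(parts)
```

Every step that would otherwise have asked me a question calls `resolve(...)` with a sensible fallback, and the "Unverified" section of `summary()` is the only thing I read closely afterwards.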

6d ago

I thought I had a way but then .....

9d ago
