Lessons Learned from Building Agents
PH builders: what are the key lessons you’ve learned, whether technical, product, or GTM, from building agents? This is still such a new discipline that it would be great to share them with this community of builders.
I’ll kick off with an experience that had us scratching our heads for months last year…
Our product, @Tonkotsu, runs a bunch of coding agents in parallel. They’re built on top of Sonnet 4.5 and do coding tasks by repeatedly calling tools to read and write code. We review task failures daily and started noticing something strange: some sessions would devolve into a seemingly endless loop of Sonnet repeating the same incorrect tool call over and over (sometimes up to 50 times!) until the session hit its limits.
What was absolutely wild was that the model knew it was stuck and even started berating itself for its mistakes:
I’ve failed 17 consecutive times with the exact same error. I keep calling replace_file with only the file_path parameter and never include the content parameter.
After 17 consecutive failures, I need to break this pattern.
It took multiple rounds of experimentation to fix, but the experience gave us a window into LLM behavior at the edges. Surprisingly, there are real parallels with human behavior too. If you’re interested, full write-up here: https://blog.tonkotsu.ai/p/ive-failed-17-consecutive-times-with
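For anyone who wants to spot this kind of loop early, here is a minimal Python sketch of a guard that counts consecutive identical failing tool calls. It is an illustration only and is not presented as the fix from the write-up. The class name RepeatedFailureGuard, the max_repeats threshold, and the example path are all hypothetical; only replace_file, file_path, and content come from the session quoted above.

```python
import json


class RepeatedFailureGuard:
    """Hypothetical helper: flags when the same failing tool call repeats."""

    def __init__(self, max_repeats=3):
        # max_repeats is an assumed threshold; tune it for your own agent.
        self.max_repeats = max_repeats
        self._last_key = None
        self._count = 0

    def record(self, tool_name, arguments, ok):
        """Record one tool call. Returns True once the same failing call
        has been seen max_repeats times in a row."""
        if ok:
            # Any successful call resets the streak.
            self._last_key = None
            self._count = 0
            return False
        # Serialize arguments with sorted keys so key order does not matter.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        if key == self._last_key:
            self._count += 1
        else:
            self._last_key = key
            self._count = 1
        return self._count >= self.max_repeats


# Usage: the call below mirrors the failure in the quote, where
# replace_file is sent with file_path but no content parameter.
guard = RepeatedFailureGuard(max_repeats=3)
bad_call = {"file_path": "src/app.py"}  # hypothetical path
flags = [guard.record("replace_file", bad_call, ok=False) for _ in range(3)]
```

What to do when the guard fires (end the session, rewrite the error message, trim the context) is a separate design choice that this sketch leaves open.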
What lessons have you learned, whether technical or product or GTM? I would love for this thread to be a place for us to learn real-world lessons from each other.

Replies