Taimoor Khan

Co-founder, Stonepath Labs. Building TAQ

Badges

Tastemaker
Gone streaking
Gone streaking 5

Recently Supported

Runtime: Sandboxed coding agents for everyone on your team
Agentspan: Open-source runtime for durable AI agents
Agent Sandbox: Your agent's personal remote computer and drive

Forums

Taimoor Khan

2d ago

built an open source SDK for catching AI agent regressions before you ship

been building agents for a while and kept hitting the same problem. you fix a failure, then change the prompt or model, and the same failure quietly comes back. nobody catches it until a user does.

built replayd to solve this. captures failed agent runs as regression tests and replays them before you deploy. if the same failure returns after a prompt, model, or tool change, it catches it.
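a rough sketch of the capture-and-replay idea, for illustration only. this is not replayd's actual API: `capture`, `dump`, `load`, `replay`, the case fields, and the toy agent are all hypothetical names, and the agent is assumed to be a plain function from input text to output text.

```python
import json
from typing import Callable, Dict, List

Case = Dict[str, str]  # hypothetical case shape: id, input, failure


def capture(cases: List[Case], case_id: str, user_input: str, failure: str) -> None:
    """Record a failed run as a regression case."""
    cases.append({"id": case_id, "input": user_input, "failure": failure})


def dump(cases: List[Case]) -> str:
    """Serialize cases as JSON lines so they can be stored next to the code."""
    return "\n".join(json.dumps(c) for c in cases)


def load(text: str) -> List[Case]:
    """Read cases back from JSON lines, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def replay(
    cases: List[Case],
    agent: Callable[[str], str],
    failure_returned: Callable[[Case, str], bool],
) -> List[str]:
    """Run every stored input through the current agent.

    Returns the ids of cases whose original failure came back.
    """
    return [c["id"] for c in cases if failure_returned(c, agent(c["input"]))]


# toy usage: the agent and the check are stand-ins, not real components
cases: List[Case] = []
capture(cases, "case-1", "cancel my order", "called delete_account")
stored = dump(cases)

agent = lambda text: "tool:delete_account"           # agent that regressed
check = lambda case, output: "delete_account" in output

regressed = replay(load(stored), agent, check)
print(regressed)  # ['case-1']
```

a pre-deploy gate could then fail the build whenever the returned list is non-empty.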

the grading part was the interesting problem. can't use exact output matching because LLM output is non-deterministic. so instead of checking the text, it checks whether the specific failure came back. a wrong tool call gets a hard assertion. a policy violation gets an LLM judge.
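a minimal sketch of that split, under stated assumptions. none of these names come from replayd: `Run`, `failure_returned`, and the failure kinds are hypothetical, and `keyword_judge` is a crude offline stand-in for a real LLM judge so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Run:
    """What a replayed run produced (hypothetical shape)."""
    tools_called: List[str]
    final_text: str


# a judge takes (policy, output text) and returns True if the policy is violated
Judge = Callable[[str, str], bool]


def failure_returned(run: Run, kind: str, detail: str, judge: Judge) -> bool:
    """Check whether one specific recorded failure came back."""
    if kind == "wrong_tool":
        # hard assertion: deterministic, no model involved
        return detail in run.tools_called
    if kind == "policy_violation":
        # soft check: delegate to a judge, since the wording varies run to run
        return judge(detail, run.final_text)
    raise ValueError(f"unknown failure kind: {kind}")


def keyword_judge(policy: str, text: str) -> bool:
    """Stand-in for an LLM judge. A real one would prompt a model with both arguments."""
    return "refund" in text.lower()


run = Run(tools_called=["lookup_order"], final_text="Sure, I have issued a full refund.")

print(failure_returned(run, "wrong_tool", "delete_account", keyword_judge))
# False: the bad tool was not called this time
print(failure_returned(run, "policy_violation", "never promise refunds", keyword_judge))
# True: the stand-in judge flags the refund promise
```

the point of the split is that the deterministic check never flakes, while the judge is only used where no exact rule can express the failure.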

v0.1.2, early but works end to end. zero runtime dependencies in the core.
