Alessandro Potenza

MSc AI - Polimi

#94269710 followers 0 following

⚡️ 4 day streak

>10,000 All time

0 KP

Badges

Tastemaker

Gone streaking

Maker History

agentrial - Run your AI agent 20x. Get confidence intervals, not vibes.
Feb 2026

🎉

Joined Product Hunt: February 6th, 2026

Forums

5mo ago

agentrial - Run your AI agent 20x. Get confidence intervals, not vibes.

Your AI agent passed the test. But would it pass again? LLMs are non-deterministic: a task that passes once can still fail on 30% of repeated runs. agentrial runs each test case N times and reports confidence intervals instead of a single pass/fail: Wilson confidence intervals on pass rates, failure attribution via Fisher's exact test, real API cost tracking, and CI/CD regression detection. Works with LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, or any Python callable. YAML config, MIT license.
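To make the two statistics named above concrete, here is a minimal standard-library sketch of a Wilson score interval on a pass rate and a two-sided Fisher exact test on a 2x2 table. It is not agentrial's code or API: the function names `wilson_interval` and `fisher_exact_two_sided`, the run counts, and the framing of "failure attribution" as "does a given step show up more often in failed runs than in passed runs" are all assumptions made for illustration.

```python
from math import comb, sqrt


def wilson_interval(passes: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a pass rate (z = 1.96 gives roughly 95%)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = passes / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is at most as likely as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties are not lost to rounding.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical numbers: the agent passed 14 of 20 repeated runs.
    low, high = wilson_interval(14, 20)
    print(f"pass rate 70%, 95% Wilson CI: [{low:.3f}, {high:.3f}]")

    # Hypothetical attribution table: a step appeared in 5 of 6 failed
    # runs and in 2 of 14 passed runs.
    p_value = fisher_exact_two_sided(5, 1, 2, 12)
    print(f"Fisher exact p-value: {p_value:.4f}")
```

With only 20 runs the interval is wide, which is the point of reporting an interval rather than a single pass or fail. A Wilson interval is a common choice here because it behaves sensibly for small N and for rates near 0 or 1.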
