Alessandro Potenza

MSc AI - Polimi

Badges

Tastemaker
Gone streaking

Maker History

  • agentrial
    Run your AI agent 20x. Get confidence intervals, not vibes.
    Feb 2026
  • 🎉
    Joined Product Hunt, February 6th, 2026

Forums

agentrial - Run your AI agent 20x. Get confidence intervals, not vibes.

Your AI agent passed the test. But would it pass again? LLMs are non-deterministic: a task that passed once can still fail 30% of the time across repeated runs. agentrial runs each test case N times and reports confidence intervals instead of a single pass/fail. It provides Wilson confidence intervals on pass rates, failure attribution via Fisher's exact test, real API cost tracking, and regression detection in CI/CD. It works with LangGraph, CrewAI, AutoGen, the OpenAI Agents SDK, or any Python callable. Configuration is YAML; the license is MIT.
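To make the "confidence intervals, not vibes" idea concrete, here is a minimal sketch of a Wilson score interval computed over repeated runs. It is not agentrial's actual API: the names `wilson_interval`, `run_trials`, and `flaky_agent` are hypothetical, and the simulated 70% pass rate is an assumption chosen only for illustration.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(passes: int, runs: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a pass rate (z=1.96 is roughly 95% confidence)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= passes <= runs:
        raise ValueError("passes must be between 0 and runs")
    p_hat = passes / runs
    denom = 1 + z * z / runs
    center = (p_hat + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / runs + z * z / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the edges.
    return max(0.0, center - half), min(1.0, center + half)


def run_trials(task: Callable[[], bool], runs: int = 20) -> int:
    """Call the task `runs` times and count how many calls returned True."""
    return sum(1 for _ in range(runs) if task())


def flaky_agent() -> bool:
    # Hypothetical stand-in for a real agent call; passes about 70% of the time.
    return random.random() < 0.7


if __name__ == "__main__":
    runs = 20
    passes = run_trials(flaky_agent, runs)
    low, high = wilson_interval(passes, runs)
    print(f"pass rate {passes}/{runs}, 95% Wilson CI: [{low:.2f}, {high:.2f}]")
```

With 14 passes out of 20 runs, this gives an interval of roughly 0.48 to 0.85, which shows how wide the uncertainty still is at N = 20 and why a single passing run says little about reliability.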