Launching today

agentrial
Run your AI agent 20x. Get confidence intervals, not vibes.
2 followers
Your AI agent passed the test. But would it pass again? LLMs are non-deterministic: a task that passes on one run can still fail on a sizable share of later runs, for example 30% of them. agentrial runs each test case N times and reports confidence intervals instead of a single pass/fail result. It provides Wilson confidence intervals on pass rates, failure attribution via Fisher's exact test, real API cost tracking, and CI/CD regression detection. It works with LangGraph, CrewAI, AutoGen, the OpenAI Agents SDK, and any Python callable. Configuration is YAML; the license is MIT.
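To make the "confidence intervals, not vibes" idea concrete, here is a minimal sketch of a Wilson score interval on a pass rate, using only the Python standard library. It is an illustration of the statistic, not agentrial's implementation or API: the names `wilson_interval` and `pass_rate_interval` are hypothetical, and the fixed z = 1.96 (roughly a 95% interval) is an assumption.

```python
import math
from typing import Callable, Tuple


def wilson_interval(passes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial pass rate.

    Hypothetical helper for illustration; z = 1.96 gives roughly 95% coverage.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= passes <= trials:
        raise ValueError("passes must be between 0 and trials")

    p_hat = passes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials**2)) / denom

    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


def pass_rate_interval(task: Callable[[], bool], trials: int = 20) -> Tuple[int, Tuple[float, float]]:
    """Run a task `trials` times and return (passes, Wilson interval).

    `task` is any zero-argument callable returning True on success (an assumption
    made for this sketch).
    """
    passes = sum(1 for _ in range(trials) if task())
    return passes, wilson_interval(passes, trials)


if __name__ == "__main__":
    low, high = wilson_interval(14, 20)
    print(f"14/20 passes: pass rate 0.70, interval [{low:.3f}, {high:.3f}]")
```

With 14 passes out of 20 runs, the interval spans roughly 0.48 to 0.85, which shows how little a single green run says about an agent's true reliability. Even 20 passes out of 20 leaves a lower bound well below 1.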
