Coval

Simulation & evals to ship delightful voice & chat AI agents

Coval helps developers build reliable voice and chat agents faster with seamless simulation and evals. Create custom metrics, run 1000s of scenarios, trace workflows and integrate with CI/CD pipelines for actionable insights and peak agent performance.

Brooke Hopkins
Hi Product Hunt Community 🐱👋 I’m Brooke, Founder of Coval! Today, we’re excited to launch Coval, a platform that transforms how you test, debug, and monitor voice and chat agents. Simulate thousands of scenarios from a few test cases. You create the prompts, we simulate environments to test your agents from all directions.

👉 Why did I build Coval?
Before founding Coval, I led the evaluation job infrastructure team at Waymo, building simulation tools that tested every code change to ensure the Waymo Driver improved with every iteration. This shift from manual testing on racetracks to scalable, automated simulation transformed autonomous vehicles from early prototypes into reliable systems now navigating the streets of San Francisco. Today, AI agents face similar challenges: promising prototypes often hit reliability roadblocks as they scale. Drawing on my Waymo experience, I built Coval to bring automated simulation and evaluation to AI agents, helping teams move faster and deliver reliable, real-world performance.

Coval’s mission? To ensure AI agents can be trusted with critical tasks, just as simulation helped unlock the potential of self-driving cars. It’s a tool built by developers, for developers - designed to save time, increase confidence, and eliminate the headaches of conversational AI development.

❓ What Problems Does Coval Solve?
👉 Manual Testing Wastes Time: Manually calling or chatting with agents is inefficient. Coval integrates into your CI/CD pipeline, running 1000s of simulations automatically with each prompt change. This saves time, increases test coverage, and boosts confidence in production performance.
👉 Debugging is a Nightmare: Fixing one issue often breaks something else. Coval eliminates this frustration by providing actionable insights into agent workflows, tracking metrics for each simulation to help you pinpoint and resolve problems effectively.
👉 Production Monitoring is Hard: Identifying the root cause of agent mistakes in production can be a nightmare. Coval’s monitoring offers immediate, actionable insights into custom metrics like LLM-as-a-Judge or tool calls, making it easier to ensure reliable performance.

❓ Why Us?
Our team brings deep experience in LLM evaluations at Berkeley & Stanford, building distributed systems for Fortune 500 companies, and crafting intuitive user interfaces.

🚀 Special Launch Offer
As part of our Product Hunt launch, enjoy a free 2-week trial with personalized onboarding. We’ll help you set up custom metrics, run your first evaluations, and get the most out of Coval.
👉 Start Your Free Trial: https://www.coval.dev
👉 Check out our Docs: https://docs.coval.dev/overview
👉 Book a Demo Call: https://cal.com/bnhopkins/demo

Excited to help you ship reliable AI agents faster! P.S. Drop by the comments - we’d love your feedback!
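For readers curious what the CI/CD integration described above could look like in practice, here is a minimal sketch in Python. The endpoint URL, payload fields, and response shape are illustrative assumptions for this example, not Coval's actual API - see https://docs.coval.dev/overview for the real interface.

```python
# Hypothetical CI gate: trigger a batch of simulated conversations and fail the
# build if the pass rate drops. Endpoint, payload, and response shape are
# assumptions for illustration only.
import os
import sys

import requests

API_URL = "https://api.example.com/v1/eval-runs"  # placeholder, not a real Coval endpoint
API_KEY = os.environ["EVAL_API_KEY"]              # assumed to be set as a CI secret


def run_simulations(test_set: str, min_pass_rate: float = 0.95) -> None:
    """Kick off simulations for a test set and exit non-zero if too many fail."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"test_set": test_set, "metrics": ["tool_calls", "llm_judge"]},
        timeout=300,
    )
    resp.raise_for_status()
    result = resp.json()  # assumed shape: {"passed": int, "total": int}

    pass_rate = result["passed"] / result["total"]
    print(f"{result['passed']}/{result['total']} scenarios passed ({pass_rate:.1%})")
    if pass_rate < min_pass_rate:
        sys.exit(1)  # non-zero exit fails the CI job


if __name__ == "__main__":
    run_simulations("voice-agent-regression")
```

A script like this would run as a step after every prompt or code change, so regressions surface in the pull request rather than in production.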
Tony Tong
@brooke_hopkins3 Coval feels like a step up for building smarter voice and chat agents. The seamless simulation and custom metrics are intriguing: how deeply can it analyze edge cases? I’m curious if it uncovers patterns that might otherwise go unnoticed. Sounds like a tool that pushes beyond the basics!
Martee
@brooke_hopkins3 Seeing all the 100-hour weeks you put into this, I’m so proud of you and the team! Congrats!
Brooke Hopkins
@tonyabracadabra Great question, Tony! Yes, we definitely help developers spot edge cases.
For example, we offer tool-call evaluations that check the function calls your agent makes - these have helped our customers debug incorrect tool calls by pinpointing exactly where the error occurred. We also offer topic analysis to catch new topics emerging in conversations, as well as workflow monitoring that shows you where and how often your agents go off the beaten path. Make sure to check out the product - we're offering a free trial, and you can get started running your own evals very easily --> coval.dev
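As a rough illustration of the tool-call evaluation idea described above (checking the function calls an agent makes and pinpointing where it diverged), here is a small self-contained sketch. The data structures and checks are hypothetical, not Coval's implementation.

```python
# Illustrative tool-call check (not Coval's implementation): compare the calls
# an agent actually made against the calls a scenario expects, and report
# exactly where the trace diverged.
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    args: dict


def check_tool_calls(expected: list[ToolCall], actual: list[ToolCall]) -> list[str]:
    """Return human-readable mismatches; an empty list means the trace passed."""
    errors = []
    for i, exp in enumerate(expected):
        if i >= len(actual):
            errors.append(f"step {i}: expected a call to '{exp.name}', but the agent made no call")
            continue
        act = actual[i]
        if act.name != exp.name:
            errors.append(f"step {i}: expected '{exp.name}', agent called '{act.name}'")
        elif act.args != exp.args:
            errors.append(f"step {i}: '{exp.name}' called with {act.args}, expected {exp.args}")
    for extra in actual[len(expected):]:
        errors.append(f"unexpected extra call to '{extra.name}'")
    return errors


# Example: the agent should look up the order before refunding the right amount.
expected = [ToolCall("lookup_order", {"order_id": "A123"}),
            ToolCall("issue_refund", {"order_id": "A123", "amount": 20.0})]
actual = [ToolCall("lookup_order", {"order_id": "A123"}),
          ToolCall("issue_refund", {"order_id": "A123", "amount": 200.0})]
for err in check_tool_calls(expected, actual):
    print(err)  # reports the mismatched refund amount at step 1
```

Running checks like this over thousands of simulated conversations is what turns a vague "the agent sometimes refunds the wrong amount" into a precise, reproducible failure.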
Brooke Hopkins
@martees Thank you marti!! Thanks for all your support my love
Fiona Meyer-Teruel
@brooke_hopkins3 This seems so necessary to meet market needs! Congrats on the launch!
Kwindla Kramer
It's exciting to see this launch. Congratulations, Brooke and team! When we all first started building voice agents on top of the new generation of LLMs, the hard problems were things like phrase endpointing, interruption handling, and squeezing latency out of all the steps in the processing pipeline. Now we have really good frameworks -- both open source and proprietary -- that make it possible to build flexible, capable voice agents that perform well "most of the time." But most of the time isn't good enough for lots of use cases. To get to "performs well all the time" we need great tooling for evals (testing), real-time observability, and performance metrics. Coval's tools are a huge step in this direction and we're all going to benefit from their work.
Brooke Hopkins
@kwindla Thank you! You’ve hit the nail on the head. Getting AI agents to perform reliably every time is the next frontier. At Coval, we’re focused on helping teams close that gap with powerful eval tools and real-time monitoring. Excited to be part of this journey and looking forward to seeing how we can all push the boundaries of voice agent performance together with Daily!
Katka Sabo
Congrats @brooke_hopkins3 & the team! I can see your platform becoming an integral part of the tech stack for developing and deploying AI agents. I've seen people outsource testing to cheaper countries; your solution beats the manual way by far - I'll be happy to spread the word about it. I'm also curious: what's the wordplay behind the Coval name?
Brooke Hopkins
@katka_sabo thank you soooo much! Coval draws inspiration from Sofya Kovalevskaya, the first woman to earn a Ph.D. in mathematics!
Katka Sabo
@brooke_hopkins3 this is super helpful! I wanted to mention your startup to someone the other day but I couldn't recall the name - now I'll remember - best of luck!
Rhea Pokorny
Hey! I'm Rhea, the founding engineer at Coval. I'm so proud to be part of this amazing team that @brooke_hopkins3 has put together. Hanging around in the comments to answer any questions :)
Jon Hopkins
If I am using my agents as a front line of interaction with my clients, I need to make sure I have covered every scenario. Coval makes sure that I cover all my edge cases - the ones that are sure to pop up!
Brooke Hopkins
@jonwhopkins thank you!
Kamil Ruczynski
Congrats on the launch, Brooke and team! 👏🏻 This is a much-needed step forward for building reliable AI agents that can walk the talk (and debug the chat). The founder-market fit here is spot on—bringing simulation testing from self-driving to conversational AI is a genius move. Can’t wait to see Coval put agents through the wringer and help teams level up!
Brooke Hopkins
@unable0 Thanks Kamil!! We're also excited to level agents up - higher-quality agents build trust in agents across all verticals, and with Wordware, agents are even more accessible. A power combo!
Ashit Vora
Congrats on the launch @brooke_hopkins3 @mwseibel. Can the app integrate with any tech stack, or is it focused on specific tech stacks?
Brooke Hopkins
@mwseibel @ashitvora We integrate with any tech stack! It's stack-agnostic because we simply call your API or phone number.
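As a rough sketch of what "we call your API" can mean on the agent side: the agent only needs to expose an HTTP endpoint that accepts a message and returns a reply, regardless of the stack behind it. The route and JSON fields below are assumptions for illustration, not a contract defined by Coval.

```python
# Minimal example of an agent exposed over HTTP so a simulator can drive it.
# The route and JSON fields are illustrative assumptions; any language or
# framework that serves HTTP would work just as well.
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.post("/chat")
def chat():
    user_message = request.json["message"]
    # ... invoke your actual agent stack here (any framework, any language) ...
    reply = f"You said: {user_message}"  # placeholder agent logic
    return jsonify({"reply": reply})


if __name__ == "__main__":
    app.run(port=8000)
```

Because the integration point is just an endpoint (or a phone number for voice agents), the testing platform never needs to know which LLM, framework, or telephony stack sits behind it.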