
Agenta
Open-source prompt management & evals for AI teams
576 followers
Agenta is an open-source LLMOps platform for building reliable AI apps. Manage prompts, run evaluations, and debug traces. We help developers and domain experts collaborate to ship LLM applications faster and with confidence.

Agenta
Hi there!
Agenta is a workspace where AI teams collaborate effectively to build reliable AI applications.
Whether you’re building interactive chat apps, single-prompt workflows, or more agentic systems, Agenta keeps everything in one place instead of having prompts, experiments, and evaluations scattered across different tools.
We’d really appreciate your feedback and ideas. Thanks for taking the time to check it out at cloud.agenta.ai (free forever), or reach us at agenta-hq.slack.com for a demo!
Zeef
I really like Agenta and chose it as my #1 tool for prompt engineering when I researched different tools. I needed a tool to teach my students at the University of Calgary how to do systematic prompt engineering studies, and this one was the best one for a non-technical audience to access all the professional tools for such studies in one place. I'm planning to get our PhD students into it now for prompt engineering studies that can turn into full-fledged research papers.
Agenta
@keyhanimo That is awesome! Let us know if you have any questions or feedback!
@mabrouk Love seeing more momentum in the LLMOps space, especially with an open-source approach. Most teams trying to ship AI features hit the same wall: lots of prompts, zero visibility, and no reliable way to evaluate or debug what’s actually happening under the hood.
A platform that unifies prompts, evals, and trace debugging feels like a real unlock for both devs and domain experts who don’t want to depend on guesswork.
Curious: what’s been the biggest challenge so far: capturing consistent traces, defining evaluation metrics, or helping teams collaborate around prompt changes?
Agenta
@fernando_scharnick That depends on the team. For dev-only teams, the starting point is usually observability: they want to debug their agent and go from there. For cross-functional teams, the biggest pain is usually collaboration on prompts.
@mabrouk Makes total sense, devs want to see inside the black box first, while cross-functional teams just need a shared place to iterate without stepping on each other.
Always interesting how the same AI stack creates totally different bottlenecks depending on who’s using it.
TBF this looks promising. Curious how Agenta handles traces when you have async, high-latency LLM calls. We've seen trace sampling drop important edge cases in our infra, and that bit us in prod. Are evaluators configurable to run off real traffic vs. synthetic test sets? Also, where are the logs stored if self-hosted? Does that require extra infra, or is it included?
Nice launch, but TBF I'm curious about approval workflows and audit trails. Henricook raised a good point: role-based access is useful, but some orgs need strict approvals before a prompt hits prod. Does Agenta provide an out-of-the-box approval queue, or webhooks so we can tie it into our Jira/GitOps flow? Also wondering about immutable audit logs for compliance; that's non-negotiable for us. Is there any plan to add approval workflows, or will we rely on external scripts?
Triforce Todos
Huge congrats! Love seeing more open-source tooling that actually helps teams ship with confidence.
Agenta
Thank you @abod_rehman ! Let us know if you have any feedback!
Congrats on the launch! Super clean workflow, love how Agenta brings the whole team into one place. How do you handle version control for prompts?