
PandaProbe
open source agent engineering platform
PandaProbe is an open-source agent engineering platform that gives you deep observability into AI agent applications. Use it to trace, evaluate, monitor and debug your AI agents in development and production.
👋 Hey Product Hunt!
I’m Sina, founder of PandaProbe.
Building AI agents is getting easier, but understanding and trusting them in production is still way too hard.
Once agents call LLMs, tools, APIs, MCP servers, and sub-agents, logs alone aren't enough. You need to know what happened, why it failed, whether quality regressed, and whether the agent stayed reliable across a full session.
PandaProbe is my attempt to solve that: an open-source agent engineering platform for tracing, evaluation, monitoring, and debugging AI agent applications.
The goal is simple: help developers move from “the agent runs on my laptop” to “I understand what happened in production, I can measure quality, and I can improve it continuously.”
What PandaProbe provides
🔎 Tracing — capture agent executions as traces and spans across LLM calls, tool calls, agents, and custom logic.
🧵 Sessions — group related traces to understand the full lifecycle of an agent.
📊 Evaluations — score traces and sessions with built-in agent-focused metrics.
⏱️ Monitoring — schedule recurring evaluations for new traces and sessions.
🛠️ Open source + cloud — build from source on GitHub and self-host, or use PandaProbe Cloud.
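
To make the trace/span/session model above concrete, here's a minimal, self-contained Python sketch of the idea — spans (LLM calls, tool calls) nested inside a trace, traces grouped under a session. The `Tracer` class and its API are illustrative assumptions for this post, not PandaProbe's actual SDK (see the docs for the real interface):

```python
import time
import uuid
from contextlib import contextmanager

# Illustrative sketch only — class and method names here are hypothetical,
# not PandaProbe's real SDK. Shows the trace/span/session data model.

class Tracer:
    def __init__(self, session_id):
        self.session_id = session_id  # groups related traces into one session
        self.traces = []              # each trace holds an ordered list of spans

    @contextmanager
    def trace(self, name):
        spans = []
        self.traces.append({"id": str(uuid.uuid4()), "name": name, "spans": spans})
        yield spans

    @contextmanager
    def span(self, spans, name, kind):
        start = time.time()
        record = {"name": name, "kind": kind}  # kind: "llm", "tool", "agent", ...
        try:
            yield record
        finally:
            record["duration_s"] = time.time() - start
            spans.append(record)

tracer = Tracer(session_id="demo-session")
with tracer.trace("answer_question") as spans:
    with tracer.span(spans, "plan", kind="llm") as s:
        s["output"] = "call the weather tool"
    with tracer.span(spans, "get_weather", kind="tool") as s:
        s["output"] = "22°C, sunny"
```

Each span records what ran, what kind of step it was, and how long it took; evaluations then score whole traces or sessions rather than individual log lines.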
Who it’s for
🧑💻 AI engineers — debug agent behavior across LLMs, tools, and workflows.
🏗️ Agent platform teams — monitor quality, regressions, and reliability in production.
🔬 Teams experimenting with agents — understand failures faster and compare iterations.
🚀 Startups building AI products — add observability and evaluation early before agents become impossible to reason about.
Quick links
GitHub: https://github.com/chirpz-ai/pandaprobe
Docs: https://docs.pandaprobe.com
Cloud: https://www.pandaprobe.com/
I’ll be here all day answering questions and collecting feedback.
If you’re building agents today, what’s the hardest part to debug or evaluate?
Thanks for checking it out 🙏
— Sina