Open Source Agent Monitoring

Start new thread

Latitude - Fix what's breaking in your AI agent

Tabstack by Mozilla

•10d ago

Open-source AI agent monitoring platform. Latitude automatically detects all the ways your agents fail at scale, and gives your coding agent the tools to fix it.

Replies

Best

What does onboarding look like for an existing agent fleet is there meaningful setup required or is it closer to drop-in instrumentation?

Report

10d ago

Have you tested this against multi agent systems where failures cascade across agents rather than staying contained to one? That seems like the harder problem.

Report

10d ago

What's the security model for agent run data especially for teams whose agents touch sensitive internal systems or customer data?

Report

10d ago

The clustering of conversations into discrete failure modes is the clever part. Most observability tools dump raw traces and leave you to find patterns yourself. We've spent time manually sifting logs to spot recurring failures. How does the automatic issue detection work? Does it use embedding clustering on trace outputs, or is there a rule-based approach?

Report

10d ago

The issue abstraction is the important move. Raw traces are necessary, but small teams need a release gate: is this failure mode understood, reproducible, and covered by an eval before we ship again?

Report

10d ago

Humalike

Congrats on the launch! <3 desde bcn

Report

10d ago

the "issues not logs" framing resonates. i've lost hours scrolling through agent execution traces trying to find why something broke, only to realize the actual failure happened 6 steps earlier. how do the evals work here, do you define failure criteria upfront or does it infer patterns from the traces?

Report

10d ago

The phrase all the ways your agents fail is ambitious is failure detection here pattern based on known anti patterns or does it learn failure signatures from your own agent's history over time?

Report

10d ago

While most agent tools stop at dumping logs, auto-building an eval from each failure cluster looks totally spot on! One thing I'd poke at - how do you stop those auto-evals from overfitting to the exact transcripts that triggered them instead of the general failure mode? Thanks!

Report

10d ago

How does Latitude differentiate a genuine failure from an agent that's thinking out loud through a messy but ultimately correct reasoning path?

Report

10d ago

1 2 3 4 5