Open-source AI agent monitoring platform. Latitude automatically detects all the ways your agents fail at scale, and gives your coding agent the tools to fix it.
What does onboarding look like for an existing agent fleet? Is there meaningful setup required, or is it closer to drop-in instrumentation?
Have you tested this against multi-agent systems where failures cascade across agents rather than staying contained to one? That seems like the harder problem.
What's the security model for agent run data, especially for teams whose agents touch sensitive internal systems or customer data?
The clustering of conversations into discrete failure modes is the clever part. Most observability tools dump raw traces and leave you to find patterns yourself. We've spent time manually sifting logs to spot recurring failures. How does the automatic issue detection work? Does it use embedding clustering on trace outputs, or is there a rule-based approach?
The issue abstraction is the important move. Raw traces are necessary, but small teams need a release gate: is this failure mode understood, reproducible, and covered by an eval before we ship again?
the "issues not logs" framing resonates. i've lost hours scrolling through agent execution traces trying to find why something broke, only to realize the actual failure happened 6 steps earlier. how do the evals work here, do you define failure criteria upfront or does it infer patterns from the traces?
The phrase "all the ways your agents fail" is ambitious. Is failure detection here pattern-based on known anti-patterns, or does it learn failure signatures from your own agent's history over time?
While most agent tools stop at dumping logs, auto-building an eval from each failure cluster looks totally spot on! One thing I'd poke at - how do you stop those auto-evals from overfitting to the exact transcripts that triggered them instead of the general failure mode? Thanks!
How does Latitude differentiate a genuine failure from an agent that's thinking out loud through a messy but ultimately correct reasoning path?
Humalike
Congrats on the launch! <3 from BCN