Polarity - The Self-Improvement Stack for Agents

Polarity

•2mo ago

Polarity monitors every agent decision in production, surfaces failure patterns before users hit them, and turns trajectories into evals that compound your agent’s reliability over time!

Replies

Polarity

Maker

Hey Product Hunt 👋 Alex here, founder of Polarity.

Most agent teams I've talked to have a 95% pass rate on their eval suite and a 60% pass rate in production. That gap is where products die, and most teams find out from a customer ticket hours later.

Polarity closes that gap:
→ Craft agent behaviors in the dashboard
→ Polarity learns from agent behavior and finds new opportunities for tracking
→ Slack alerts the second your agent misbehaves

Wrong tool call, skipped guardrail, latency going past thresholds: it all shows up in your team's Slack channel with the trace.

Three SDKs are currently supported:
→ Go
→ Python
→ TypeScript

Leave any feedback in the comments. Thank you, Product Hunt!

- Alex ❤️

2mo ago

Polarity

Maker

Hi everyone! My name is Jay and I'm glad you're reading this :)

We're super excited to have @polarityco out and ready for devs to start integrating within their Slack channels!
Given the validation with design partners, VCs, and testers, we're excited to release this to the public after many days of ideating and building.

With a full revamp of the site and its core, we'd love to hear how you find the product launch: www.polarity.so
We're accepting as many demos as time allows this week; request here

Don't forget to follow the company page for future releases!🫡

Polarity Team -- I’m in the corner ;p

2mo ago

@polarityco @jaychopra love it! let's fuckin gooo!

2mo ago

Mailwarm

How much labeling does it need from humans before the evaluations are actually useful?

2mo ago

Polarity

Maker

@othman_katim Great question! Depending on the agent’s functions and where it’s incorporated, we’ve found that small- to medium-sized PRs that include the agent are often enough to give teams the metrics they need for evaluations to become useful.

TL;DR: not much labeling is required for accurate results, though more always helps :)

If you want more info, check out our docs: https://docs.polarity.so/

2mo ago

Would love a free trial of this to confirm fit for my 31 agentic/queue 36 skills, TAM authority enforcement system. AI citation readiness and answers interpretation verification infra. Python(11), Node(16) and TS(4)?

2mo ago

🧐 Good find

The landing page's hero section has an issue. Instead of showing the case study for cal, it navigates to ohm, or the link was put in the wrong place.

2mo ago

The 95% eval / 60% production gap is the most honest stat I've seen on a launch page in a while.. that's exactly the failure mode I hit adding LangSmith tracing to my own agentic project. Evals pass because you wrote them against known failure cases, production breaks on the ones you didn't think of. Polarity learning new tracking opportunities from actual agent behavior is the right direction.. you can't write evals for failure modes you haven't seen yet.

Curious how the failure pattern detection actually works under the hood.. is it clustering similar failed trajectories, or something more structured like anomaly detection against a baseline of successful runs? That distinction matters for how quickly it catches genuinely novel failure modes vs just variations of known ones.

2mo ago