Launching today

OpenInterpretability
Open-source toolkit to audit what your LLM knows
3 followers
The first mech interp toolkit that runs inside Claude Code, Cursor, and Cline via MCP. Production probes (FabricationGuard, agent-probe-guard) catch hallucinations and agent failures. ProbeBench leaderboard, plus SAE training that scales from a 30-minute free Colab to paper-grade runs. Apache-2.0.

Hey PH, Caio here, maker of OpenInterpretability.
When something breaks inside an LLM app (a hallucination, a silent agent failure, "works on prompt A but not on prompt B"), you usually have no way to see inside the model. Mech interp can answer those questions, but the tools have been research-only: H100s, deep domain knowledge, weeks of setup.
So I built the first mech interp MCP server. It plugs straight into Claude Code, Cursor, and Cline. Once installed, your AI assistant can call interpretability tools directly during a session: capture activations, look up SAE features, run probes, test causal interventions. No separate notebook, no context switch.
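For Claude Code, MCP servers are typically registered in a project-level `.mcp.json` file. Here's a sketch of what that registration might look like; the server name and launch command below are assumptions for illustration, so check openinterp.org/start for the actual one-liner:

```json
{
  "mcpServers": {
    "openinterp": {
      "command": "python",
      "args": ["-m", "openinterp.mcp"]
    }
  }
}
```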
→ One-line install: openinterp.org/start
Two production probes ship with it today:
FabricationGuard: drop-in hallucination detector built on Qwen3.6-27B. → openinterp.org/products/fabricationguard
agent-probe-guard: detects silent coding-agent failures, also built on Qwen3.6-27B. ~18% budget cut at 86% accuracy.
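For anyone new to the technique: both products are activation probes, i.e. small classifiers trained on a model's hidden states to read out a property like "is this claim grounded?". A minimal self-contained sketch of the idea (synthetic activations and a hand-rolled logistic-regression probe; this is not the OpenInterpretability API, which may look different):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # hidden size of the (hypothetical) layer we probe

# Synthetic stand-in for captured activations: "grounded" vs
# "fabricated" examples separated along one random direction,
# the way a real concept can be linearly readable in a model.
direction = rng.normal(size=D)
direction /= np.linalg.norm(direction)
grounded = rng.normal(size=(200, D)) + 1.5 * direction
fabricated = rng.normal(size=(200, D)) - 1.5 * direction

X = np.vstack([grounded, fabricated])
y = np.array([1] * 200 + [0] * 200)

# Train a logistic-regression probe with plain gradient descent.
w, b = np.zeros(D), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

acc = np.mean(((X @ w + b) > 0) == (y == 1))
print(f"probe accuracy: {acc:.2f}")
```

The production probes do the same thing in spirit, just on real activations captured from the model while it answers.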
→ pip install openinterp
All Apache-2.0.
What I'd love feedback on:
- Which IDE workflow would you want this in next?
- What LLM failure mode do you wish you could actually see into?
Happy to answer anything.