Launching today

Vexp
Cut AI coding costs 58% with pre-indexed codebase context
13 followers
vexp is a local-first context engine for AI coding agents. It pre-indexes your codebase into a dependency graph and serves only relevant context via MCP, so Claude Code, Cursor, and other agents stop wasting tokens on blind file exploration. Benchmarked on FastAPI (42 runs): 58% less cost, 63% fewer output tokens, 90% fewer tool calls. Runs 100% locally, no cloud, no account required. Free tier available.






Hey PH! I'm Nicola, solo dev behind Vexp.
I built this because I was tired of watching Claude Code burn through tokens reading files it didn't need. On a real benchmark it averaged ~23 tool calls per task just to orient itself, loading 40K+ tokens before writing a single line.
vexp pre-indexes your codebase into a dependency graph (Rust + tree-sitter + SQLite) and serves ranked context in one MCP call. The agent gets exactly what's relevant, nothing else.
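To make "ranked context from a dependency graph" concrete, here is a toy sketch of the idea. The symbol names, the graph, and the breadth-first ranking are all illustrative assumptions for this example, not vexp's actual data model or ranking algorithm:

```python
from collections import deque

# Hypothetical toy dependency graph: nodes are code symbols, edges are
# "depends on" relationships extracted at index time (vexp's real graph
# and ranking are not shown here; this only illustrates the concept).
DEPS = {
    "create_user": ["UserModel", "hash_password", "db_session"],
    "UserModel": ["db_session"],
    "hash_password": [],
    "db_session": [],
    "unrelated_report": ["db_session"],
}

def rank_context(entry: str, limit: int = 3) -> list[str]:
    """Breadth-first walk outward from the symbol the task touches:
    closer dependencies rank higher, and anything unreachable from the
    entry point is never sent to the agent at all."""
    seen, ranked = {entry}, []
    queue = deque(DEPS.get(entry, []))
    while queue and len(ranked) < limit:
        sym = queue.popleft()
        if sym in seen:
            continue
        seen.add(sym)
        ranked.append(sym)
        queue.extend(DEPS.get(sym, []))
    return ranked

print(rank_context("create_user"))  # ['UserModel', 'hash_password', 'db_session']
```

Note that `unrelated_report` never appears in the result: that pruning, done before the agent sees anything, is what replaces the ~23 exploratory tool calls.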
The benchmark results surprised me. Cost dropped 58%, which I expected. But total tokens processed went UP 20% while cost went DOWN. The reason: output tokens dropped 63% because the model stops "thinking out loud" when it gets focused input. And the structured context hits cache at 95.3%, so the extra input is nearly free.
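The "more tokens, less cost" effect falls out of the pricing structure. Here is a back-of-envelope model; the per-million-token prices and the token counts are assumptions chosen only to illustrate the mechanism (cached input reads are typically billed at a steep discount, and output tokens cost several times input tokens), not the benchmark's actual figures:

```python
# Assumed prices per million tokens -- illustrative, not any vendor's
# actual rate card. The key ratios: cached input ~10x cheaper than
# fresh input, output ~5x more expensive than fresh input.
PRICE = {"input": 3.00, "cached": 0.30, "output": 15.00}

def cost(tokens: dict[str, int]) -> float:
    """Dollar cost of a task given token counts per billing category."""
    return sum(tokens[k] / 1e6 * PRICE[k] for k in tokens)

# Baseline: blind exploration -- mostly fresh input, verbose output.
baseline = {"input": 38_000, "cached": 2_000, "output": 12_000}
# Pre-indexed: ~20% more total tokens, but almost all input is cache
# hits, and output shrinks because the model stops narrating.
indexed = {"input": 2_700, "cached": 55_300, "output": 4_400}

print(f"baseline: ${cost(baseline):.3f}, indexed: ${cost(indexed):.3f}")
print(f"total tokens: {sum(baseline.values())} -> {sum(indexed.values())}")
```

Under these assumed numbers the indexed run processes more total tokens yet costs roughly a third as much, because the expensive categories (fresh input, output) shrank while the cheap one (cached input) grew.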
Other things worth noting:
- Session memory that links observations to code symbols and auto-stales when code changes
- Works with 12 agents (Claude Code, Cursor, Copilot, Cline, Aider, Zed, etc.)
- PreToolUse hook that blocks wasteful grep/glob when the daemon is running
- 100% local, single Rust binary, no cloud, no account
- Free tier: 2K nodes, single-repo workspace, no time limit (all plans work on unlimited individual repositories; the workspace limit only defines how many repos can be linked together for cross-repo queries)
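For a sense of what the PreToolUse hook mentioned above could look like: Claude Code's hooks receive the pending tool call as JSON on stdin, exit code 0 allows it, and exit code 2 blocks it with stderr fed back to the model. The daemon check and the message below are illustrative assumptions, not vexp's actual hook:

```python
import json
import sys

def decide(event: dict, daemon_running: bool) -> tuple[int, str]:
    """Return (exit_code, message) for a pending tool call.
    Exit code 0 lets the call proceed; exit code 2 blocks it, and
    Claude Code shows the stderr message to the model as the reason."""
    tool = event.get("tool_name", "")
    if daemon_running and tool in {"Grep", "Glob"}:
        return 2, "Blocked: query the vexp MCP index instead of scanning files."
    return 0, ""

def main() -> None:
    # Claude Code pipes the pending tool call to the hook as JSON on stdin.
    event = json.load(sys.stdin)
    # A real hook would probe whether the indexing daemon is actually up.
    code, message = decide(event, daemon_running=True)
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)

# As an installed hook script you would invoke main() under an
# `if __name__ == "__main__":` guard.
```

The point of the redirect message is that the model does not just get refused: it is told which tool to reach for instead, so the next call goes to the index.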
Would love feedback from anyone using AI coding agents daily. What's the biggest friction point in your workflow?
Congrats on your launch! One thing I’m curious about: you say output tokens drop 63% with pre-indexed context. Does that mean the actual code quality changes too, or is it just less “thinking out loud” from the model? Because if the model is writing better code when it gets cleaner input, that’s a much bigger deal than just cost savings.
@paolo_rossi6 Great question, it's both.
The 63% output reduction is mostly the "thinking out loud" disappearing. When Claude gets 40K tokens of unfiltered context, it generates narration like "Let me look at this file... I can see that this function..." That's orientation, not code. With pre-indexed context, it skips straight to the answer.
But code quality also improves, and the mechanism is subtle: when the agent explores blind, it patterns off whatever code it finds first, including the worst code in your repo. When it gets graph-ranked context, it only sees the structurally relevant code. So the reference material it works from is better, which means the output is better.
I didn't formally measure code quality in the benchmark (hard to quantify), but the variance data tells the story indirectly: cost standard deviation dropped 6-24x across task types. More consistent cost = more consistent behavior = more consistent output.
Congrats on the launch!! The idea of saving on tokens is crucial now more than ever. How can I integrate it into Cursor, and is that even possible?
Congrats on the launch, will definitely go try it today.