I've been benchmarking AI coding agents for the past few months, and one finding keeps coming up: agents spend ~80% of their context budget on orientation (reading files, grepping, exploring) before they write a single line of code.
On a FastAPI codebase (~800 files), Claude Code averaged 23 tool calls per task just to figure out what was relevant. That's 40K+ tokens burned before any actual work happens.
I've been building a solution to this, and we're launching it on Product Hunt tomorrow. Before we do, I'm curious:
What's the biggest source of token waste in your workflow? Is it the exploration phase, context window exhaustion mid-task, session restarts, or something else entirely?
vexp is a local-first context engine for AI coding agents. It pre-indexes your codebase into a dependency graph and serves only the relevant context via MCP, so Claude Code, Cursor, and other agents stop wasting tokens on blind file exploration. Benchmarked on FastAPI (42 runs): 58% lower cost, 63% fewer output tokens, 90% fewer tool calls. Runs 100% locally: no cloud, no account required. Free tier available.
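The core idea, serving a dependency-graph slice instead of letting the agent grep blindly, can be sketched in a few lines. This is a toy illustration under my own assumptions, not vexp's actual implementation; the module names and the `relevant_modules` helper are made up for the example:

```python
import ast
from collections import deque

def imports_of(source: str) -> set[str]:
    """Extract top-level module names imported by a Python source string."""
    mods = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return mods

def relevant_modules(sources: dict[str, str], entry: str, depth: int = 2) -> set[str]:
    """BFS over the import graph: start at the module the task touches,
    follow imports up to `depth` hops, and return only that slice."""
    graph = {name: imports_of(src) & sources.keys() for name, src in sources.items()}
    seen, queue = {entry}, deque([(entry, 0)])
    while queue:
        mod, d = queue.popleft()
        if d == depth:
            continue
        for dep in graph.get(mod, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, d + 1))
    return seen

# Toy codebase: a task on `routes` only needs its direct deps, not `unrelated`.
sources = {
    "routes": "import models\nimport auth\n",
    "models": "import db\n",
    "auth": "import db\n",
    "db": "",
    "unrelated": "import db\n",
}
print(sorted(relevant_modules(sources, "routes", depth=1)))
# with depth=1: ['auth', 'models', 'routes']
```

An agent that receives only this slice skips the exploratory grep/read loop, which is where the tool-call and token savings above come from.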