AI coding tools don't understand your codebase. 40+ projects are trying to change that.
AI coding tools are brilliant at generating code. They're blind when it comes to understanding your codebase.
They can't tell you who calls a function. They don't know what breaks if you change something. They have no idea how your team actually writes code. Every session, they start from scratch: grep, read files, guess, burn tokens.
A whole category is emerging to fix this. I've been tracking it for a few weeks and found 40+ tools that appeared in the last 6 months alone. Three approaches:
Context packing (flatten the repo into text and paste it in; e.g. Repomix, GitIngest)
Embeddings (semantic search, what Cursor and Copilot do under the hood)
Structural graphs (parse symbols and relationships, persist, query via MCP)
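To make the structural-graph idea concrete, here is a minimal sketch in Python using only the standard library's `ast` module. It is not how any tool named in this post works. The sample source and the helper names (`build_call_graph`, `blast_radius`) are made up for illustration, and it only resolves direct calls by bare name within one file. Real tools persist the graph and handle many languages.

```python
import ast
from collections import defaultdict

# Hypothetical sample module; any Python source text would do.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def report(path):
    return len(parse(path))

def unused():
    return 42
'''

def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the set of functions that call it."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    callers = defaultdict(set)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only direct calls by bare name (foo()), not methods or aliases.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                if node.func.id in defined:
                    callers[node.func.id].add(func.name)
    return dict(callers)

def blast_radius(callers: dict[str, set[str]], name: str) -> set[str]:
    """Every function that directly or transitively calls `name`."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen

graph = build_call_graph(SOURCE)
print(sorted(graph.get("load", ())))        # who calls load() directly
print(sorted(blast_radius(graph, "load")))  # what might break if load() changes
```

Once the graph exists, "who calls this?" is a dictionary lookup and "what breaks if I change this?" is a short traversal. Neither needs a grep or another file read.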
Why the explosion? Three things converged: MCP gave AI tools a universal way to query external services. Tree-sitter made multi-language parsing one dependency. Quantized embeddings made local semantic search possible without API keys.
The token economics alone are wild. I saw a dead code search burn 56 tool calls and nearly a million tokens. Same search with a structural graph: 3 calls, 60 seconds. Same model. Same question.
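As a rough illustration of why that kind of query gets cheap once the graph exists, here is a sketch of dead-code detection as a reachability check. It does not reproduce the run above: the edge data, the function names, and the `dead_code` helper are all hypothetical, and it assumes the call edges have already been extracted.

```python
# Hypothetical call edges (caller -> callees), as a persisted graph might store them.
CALLS = {
    "main": {"load_config", "serve"},
    "serve": {"handle_request"},
    "handle_request": {"render"},
    "load_config": set(),
    "render": set(),
    "legacy_export": {"render"},
    "old_helper": set(),
}

def dead_code(calls: dict[str, set[str]], entry_points: list[str]) -> list[str]:
    """Return functions that no entry point can reach, in sorted order."""
    reachable = set()
    stack = list(entry_points)
    while stack:
        fn = stack.pop()
        if fn in reachable:
            continue
        reachable.add(fn)
        stack.extend(calls.get(fn, ()))
    return sorted(set(calls) - reachable)

print(dead_code(CALLS, ["main"]))
```

A static check like this misses anything invoked dynamically (reflection, framework hooks, string-based dispatch), so the result is a list of candidates to review, not a verdict.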
A few observations from mapping the space:
🧱 Deployment friction kills adoption. Tools that ship as a single binary get tried. Tools that need Python + Docker + Ollama get bookmarked and forgotten.
🎛️ The scope question is unresolved. Some tools expose 100+ MCP capabilities. Others expose 4. Nobody knows the sweet spot yet.
⚠️ IDE absorption risk. JetBrains ships a built-in MCP server starting with 2025.2. Every standalone tool in this space has to assume the IDE may eventually do its job natively.
Full disclosure: I'm building here too (Sense - structural graph + semantic search + blast radius + convention detection as a single Go binary via MCP. Pre-alpha, open source: https://luuuc.github.io/sense).
I wrote up the full landscape, with an evaluation framework, if you want the deep dive: https://medium.com/@lucdiallo/codebase-intelligence-in-the-age-of-ai-a-map-of-the-space-5fa7d349887d
What are you using to give your AI tools better codebase understanding? Anything working well that I missed?