Something I have been thinking about in the AI governance space that I do not see discussed enough: provenance capture is not like most tooling categories.
With most observability or audit tooling, the reasoning is "we should have this so we're better positioned going forward." You can turn it on when the need becomes clear. You lose some history, but the tooling from that point forward is complete.
AI code provenance does not work this way.
The prompt a developer submits to Claude Code exists for a few hundred milliseconds in transit. After the model returns its response and the editor applies the change, that prompt is gone. Git records the diff. Nothing else records the origin by default. There is no reconstruct operation.
Something I built led to a design decision I want to get feedback on.
LineageLens is a free VS Code extension that captures every AI code insertion and scores it for risk on a 0-100 scale. It works with Cursor, Copilot, Claude Code, and Gemini CLI. Zero config on install: just start using your AI tools and your insertions start showing up in the sidebar.
The scoring uses deterministic rules: +28 for credential-like material, +24 for eval/exec patterns, +22 for subprocess calls, +14 for landing in an auth or payments file, and so on. Fully traceable. No ML, no black box.
The design decision that surprised me: a missing prompt capture (the extension records a file insertion but has no record of what was asked) adds +24 to the risk score. That is the same weight as detecting an eval() call.
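
To make the additive rules concrete, here is a minimal Python sketch of how scoring like this could work. It is not the LineageLens implementation. The weights are the ones quoted above; the regexes, the rule names, the `score_insertion` function, and the cap at 100 are assumptions made for illustration.

```python
import re

# Weights quoted in the post. The rule names are hypothetical labels.
RULE_WEIGHTS = {
    "credential_like_material": 28,
    "eval_exec_pattern": 24,
    "subprocess_call": 22,
    "sensitive_file": 14,
    "missing_prompt": 24,
}

# Illustrative detectors only; a real tool would need far more careful patterns.
CREDENTIAL_RE = re.compile(
    r"(api[_-]?key|secret|password|token)\s*[:=]\s*['\"][^'\"]+['\"]",
    re.IGNORECASE,
)
EVAL_EXEC_RE = re.compile(r"\b(eval|exec)\s*\(")
SUBPROCESS_RE = re.compile(r"\bsubprocess\.\w+\s*\(|\bos\.system\s*\(")
SENSITIVE_PATH_RE = re.compile(r"auth|payment", re.IGNORECASE)


def score_insertion(path, code, prompt):
    """Return (score, fired_rules) for one AI insertion.

    `prompt` is None when no prompt was captured for the insertion.
    """
    fired = []
    if CREDENTIAL_RE.search(code):
        fired.append("credential_like_material")
    if EVAL_EXEC_RE.search(code):
        fired.append("eval_exec_pattern")
    if SUBPROCESS_RE.search(code):
        fired.append("subprocess_call")
    if SENSITIVE_PATH_RE.search(path):
        fired.append("sensitive_file")
    if prompt is None:
        fired.append("missing_prompt")

    # Assumption: the total is capped so it stays on the 0-100 scale.
    total = min(100, sum(RULE_WEIGHTS[rule] for rule in fired))
    return total, fired
```

Because every point comes from a named rule, the list of fired rules doubles as the explanation for the score. Under these assumed detectors, an eval() call landing in an auth file with no captured prompt would score 24 + 14 + 24 = 62.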
Something that keeps coming up when I talk to teams about AI code governance: everyone focuses on capturing records, but almost nobody asks how confident they are in those records.
There are two very different things you can have. Record A: a file-watcher noticed 47 lines appeared in auth.py and Cursor was probably running. Record B: a proxy intercepted the Anthropic API call, matched it to the editor insertion via request UUID, measured 1.4 seconds between the API response and the code appearing, and computed 0.81 trigram similarity between the model output and what landed in the file.
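
For readers unfamiliar with trigram similarity, here is a small Python sketch of one common definition: Jaccard overlap between sets of character trigrams. The post does not say which variant LineageLens uses, so the whitespace normalization, the Jaccard formula, and the function names here are assumptions.

```python
def char_trigrams(text):
    """Return the set of 3-character substrings of whitespace-normalized text."""
    # Assumption: collapse runs of whitespace so reformatting by the editor
    # (indentation, line wrapping) does not dominate the comparison.
    normalized = " ".join(text.split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_similarity(a, b):
    """Jaccard similarity of character trigram sets, from 0.0 to 1.0."""
    trigrams_a = char_trigrams(a)
    trigrams_b = char_trigrams(b)
    if not trigrams_a and not trigrams_b:
        # Two inputs too short to have trigrams are treated as identical.
        return 1.0
    return len(trigrams_a & trigrams_b) / len(trigrams_a | trigrams_b)
```

A value near 1.0 means the text that landed in the file is close to what the model returned. A lower value suggests the developer edited the suggestion, or that the match between API call and insertion is weaker than it looks.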
Both produce a row in your audit database. The second is dramatically more defensible, but most governance tooling treats them identically.
In LineageLens, every record gets a confidence score from 0.0 to 1.0, broken into five independent evidence signals. Easy Mode captures (VS Code extension, no proxy) score around 0.27, which is honest about what you actually know. Power Mode captures (proxy running, full request interception) score up to 1.0. The score is not about whether the record is useful. It is about how much you can defend it when someone asks.