Launching today

projectmem
Memory + judgment for AI coding agents (local, MIT)
5 followers
Lightweight memory + judgment layer for AI coding agents. No daemon, no ports — just stdio MCP + plain-text JSONL in your repo. Captures bugs, failed attempts, and fixes inside your project, then warns at git commit before you repeat a known dead-end. 14 tools, ~600 LOC, open source, MIT, 100% local — your AI finally remembers what it tried last week.

Memory for coding agents makes sense, but the interesting engineering challenge is relevance filtering. A codebase with months of history has far more context than any session can use. The 'judgment' part of the name suggests projectmem has an opinion on what to surface and what to leave out. Would love to understand the mechanism behind that: is it semantic similarity, recency, or something more task-aware?
@ayushi18 Thanks, Ayushi! Really good question, and you put your finger on exactly the part I think about most.
Honest answer for v0.1.3: the "judgment" today is structural, not semantic. There are no embeddings yet. The filtering happens on three axes:
1. Structure. Events aren't a flat chat log; they're typed: issue, attempt(worked|failed|partial), fix, decision, note, gotcha. So when an agent calls get_context, it's not getting raw history; it's getting open issues + recent decisions + the latest fix per issue. Closed-and-fixed stuff drops out by default.
2. File-path keying. This is where the pre-commit warning gets its bite. When you git commit styles.css, the hook runs precheck_file styles.css, which surfaces only the failed attempts and gotchas tied to that path. So "months of history" never floods in; you get the slice that's literally about the code you're touching right now. Task-aware in the narrow sense.
3. Recency + caps. get_context defaults to the last N events with a hard cap (configurable). search_events is keyword + type filter, not vector search.
So today: opinionated because of the schema, not because of an ML ranker.
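To make the three axes concrete, here is a minimal sketch in Python. It is illustrative only, not projectmem's actual code: the field names (id, type, text, file, issue, outcome), the function signatures, and the sample events are all assumptions, and get_context is simplified to open issues plus decisions.

```python
import json
from dataclasses import dataclass
from typing import Optional


# Hypothetical event shape; the real projectmem schema may differ.
@dataclass
class Event:
    id: int
    type: str                      # issue | attempt | fix | decision | note | gotcha
    text: str
    file: Optional[str] = None     # path the event is tied to, if any
    issue: Optional[int] = None    # id of the issue an attempt/fix belongs to
    outcome: Optional[str] = None  # attempts only: worked | failed | partial


def load_events(jsonl_text: str) -> list[Event]:
    """Parse one JSON object per line, skipping blank lines."""
    return [Event(**json.loads(line))
            for line in jsonl_text.splitlines() if line.strip()]


def get_context(events: list[Event], limit: int = 20) -> list[Event]:
    """Axes 1 and 3: unfixed issues plus decisions, newest first, hard-capped."""
    fixed = {e.issue for e in events if e.type == "fix"}
    keep = [e for e in events
            if (e.type == "issue" and e.id not in fixed) or e.type == "decision"]
    recent = keep[-limit:] if limit > 0 else []
    return recent[::-1]


def precheck_file(events: list[Event], path: str) -> list[Event]:
    """Axis 2: only failed attempts and gotchas tied to this exact path."""
    return [e for e in events
            if e.file == path
            and (e.type == "gotcha"
                 or (e.type == "attempt" and e.outcome == "failed"))]


def search_events(events: list[Event], keyword: str,
                  event_type: Optional[str] = None) -> list[Event]:
    """Plain case-insensitive keyword match plus optional type filter."""
    kw = keyword.lower()
    return [e for e in events
            if kw in e.text.lower()
            and (event_type is None or e.type == event_type)]


# Made-up sample log, one JSON event per line (JSONL).
SAMPLE = """
{"id": 1, "type": "issue", "text": "navbar overlaps hero", "file": "styles.css"}
{"id": 2, "type": "attempt", "issue": 1, "outcome": "failed", "text": "z-index: 9999", "file": "styles.css"}
{"id": 3, "type": "issue", "text": "login redirect loop", "file": "auth.py"}
{"id": 4, "type": "fix", "issue": 3, "text": "clear stale cookie", "file": "auth.py"}
{"id": 5, "type": "gotcha", "text": "grid gap unsupported in old Safari", "file": "styles.css"}
{"id": 6, "type": "decision", "text": "stay on plain CSS"}
"""

events = load_events(SAMPLE)
context = get_context(events)                    # fixed issue 3 drops out
warnings = precheck_file(events, "styles.css")   # failed attempt + gotcha only
```

With this sample, a commit touching styles.css would surface the failed z-index attempt and the Safari gotcha, while the already-fixed auth.py issue never appears in either result. No ranking model is involved; the typed schema and the path key do all the filtering.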
Semantic similarity is the obvious next layer; it's on the v0.3 roadmap (local embeddings, no cloud). The harder question I keep circling is task-aware ranking: knowing whether the agent is debugging a layout bug or refactoring auth, and weighting events differently. That probably needs the agent itself to pass a query intent, not just a keyword. Still figuring it out.
Appreciate the sharp question. It's exactly the kind of feedback I was hoping launch day would surface.