Launching today

Glia
Local-first AI memory bridge between browser chats and IDEs
62 followers
Glia is a 100% offline, open-source memory bridge. A Chrome extension auto-saves your web-based Claude/ChatGPT chats, while a native MCP server lets Cursor/Claude Code query those decisions locally from your shared SQLite database.



Glia
@eshaannair Congrats on the launch, Eshaan. Very cool. How do you extract the real convo gems instead of just producing a context dump that might not be fully read?
Congrats on the launch! This is something I do feel daily: solve something in the Claude web app, switch to Claude Code in terminal, lose all that context. The MCP server + local SQLite combo is a great architectural bet for this.
Quick q for you: most of what happens in a Claude/ChatGPT chat is exploration, dead ends, half-formed ideas. How does Glia decide what to index as a meaningful technical decision vs noise? Is it post-hoc LLM extraction at save time, user-marked, or scored on whether the result actually got applied in code?
Glia
@ferdi_sigona Great question, and you've identified exactly the hardest unsolved problem in this space.
Right now Glia uses post-hoc LLM extraction at ingest time. When you save a chat, it runs an extraction pass that pulls out structured knowledge triples (subject → relation → object) and chunks the raw text for semantic search. The LLM is prompted to focus on decisions, facts, and technical conclusions rather than exploratory back-and-forth, but you're right that it has no signal on whether something was actually applied or just considered.
The honest answer is that "was this used in code?" is a signal I don't have yet. That's a genuinely hard problem: it would require IDE-level telemetry to close the loop. User-marked memory is on the roadmap as a lighter-weight version of that signal.
The current bet is that extraction quality plus RAG retrieval is good enough that noise gets naturally de-ranked by relevance at retrieval time, even if it gets indexed. But I'd love to hear how you'd approach the signal problem. This feels like exactly the right design question to get right in v2.
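To make the ingest step above concrete, here is a minimal sketch of turning an extraction pass into triples and chunks. It is not Glia's actual code: the `Triple` class, the `subject -> relation -> object` line format, and the 500-character chunk size are all assumptions chosen for illustration, and the LLM call itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One extracted fact. The field names are hypothetical."""
    subject: str
    relation: str
    obj: str


def parse_triples(llm_output: str) -> list[Triple]:
    """Keep lines shaped like 'subject -> relation -> object'; skip the rest."""
    triples = []
    for line in llm_output.splitlines():
        parts = [p.strip() for p in line.split("->")]
        if len(parts) == 3 and all(parts):
            triples.append(Triple(*parts))
    return triples


def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack paragraphs into chunks of roughly max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Lines that do not fit the triple shape are dropped at parse time, which is one cheap way to keep exploratory chatter out of the structured store while the raw chunks still remain searchable.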
@Glia The Hybrid RAG setup caught my eye immediately. Fusing sentence vectors, chunk vectors, and FTS5 keyword search together feels way more solid than what most memory systems are doing right now. I’ve been working on a Corrective RAG setup myself, and one of the biggest headaches was retrieval completely falling apart once the query got rephrased or drifted semantically from the original context. HyDE honestly feels like a really smart workaround for that. Generating a hypothetical answer first and then searching using that embedding instead of the raw query makes a lot of sense in practice.
What I’m curious about though is whether the synthetic embedding step adds noticeable latency during recall. In my experience even tiny delays start compounding very quickly once everything is happening inside an agent loop, especially when multiple retrieval passes are involved.
The shared SQLite bridge between the browser extension and MCP server is also honestly a really elegant design choice. One database, two interfaces, no extra sync layer headaches. But I’d genuinely love to know how you’re handling write concurrency there. SQLite’s single-writer lock can get annoying fast, and if Cursor plus the browser extension both try writing context at the same time, does Glia queue the writes internally or can one request fail silently? Feels like the kind of issue that would be super subtle to debug once an agent session is already running and actively mutating state.
Glia
@akshaypal_bishnoi Really appreciate the detailed breakdown. You clearly know this space well. Two great questions:
On HyDE latency: yes, it adds a step. Glia generates a hypothetical answer via Ollama before embedding the query. In practice the latency hit is ~200-500ms on a mid-range local machine, which is acceptable for a single retrieval call but would absolutely compound in an agent loop with multiple passes. The tradeoff is worth it for the semantic drift problem you described: querying with the raw rephrased input was noticeably worse in my tests. That said, I'm considering making HyDE opt-in for latency-sensitive setups.
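A rough sketch of the HyDE flow described above, with the opt-in switch included. Everything here is an assumption for illustration: `generate` and `embed` are placeholders for whatever local model calls are used (for example via Ollama), `use_hyde` is a hypothetical flag, and the index is a plain in-memory list rather than Glia's storage.

```python
import math
from typing import Callable

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hyde_search(
    query: str,
    generate: Callable[[str], str],   # placeholder for the local LLM call
    embed: Callable[[str], Vector],   # placeholder for the embedding call
    index: list[tuple[str, Vector]],  # (chunk text, chunk embedding) pairs
    top_k: int = 3,
    use_hyde: bool = True,            # hypothetical opt-in switch
) -> list[str]:
    # With HyDE on, embed a drafted answer instead of the raw query.
    # This is the extra generation step that costs latency.
    text_to_embed = generate(query) if use_hyde else query
    query_vec = embed(text_to_embed)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

With `use_hyde=False` the generation call is skipped entirely, so a latency-sensitive agent loop would pay only for the embedding and the ranking.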
On SQLite write concurrency: writes from both the extension and the MCP server go through the same Node.js HTTP backend, so they're already serialized through the async job queue before touching SQLite. They never write directly in parallel. The bigger edge case is a PROCESSING job mutating state while a new ingest comes in, which I handle by resetting ghost jobs on startup. It's not bulletproof, but it's been stable in practice. WAL mode is enabled so reads never block. Would love to hear how you're handling this in your Corrective RAG setup.
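The pattern above (WAL mode, one writer draining a job queue, ghost jobs reset on startup) could look roughly like the following. This is a sketch in Python's `sqlite3` rather than the Node.js backend, and the `jobs` table, its columns, the `PENDING` and `DONE` status names, and the function names are all assumptions; only the `PROCESSING` status and the reset-on-startup idea come from the reply.

```python
import sqlite3
from typing import Callable


def open_db(path: str) -> sqlite3.Connection:
    """Open the database, request WAL mode, and ensure the jobs table exists."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, "
        "payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    conn.commit()
    return conn


def reset_ghost_jobs(conn: sqlite3.Connection) -> int:
    """Requeue jobs left in PROCESSING by a crash; returns how many were reset."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'PENDING' WHERE status = 'PROCESSING'"
    )
    conn.commit()
    return cur.rowcount


def run_next_job(conn: sqlite3.Connection, handler: Callable[[str], None]) -> bool:
    """Process the oldest pending job; returns False when the queue is empty."""
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'PENDING' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, payload = row
    conn.execute("UPDATE jobs SET status = 'PROCESSING' WHERE id = ?", (job_id,))
    conn.commit()
    handler(payload)  # if the process dies here, the job stays in PROCESSING
    conn.execute("UPDATE jobs SET status = 'DONE' WHERE id = ?", (job_id,))
    conn.commit()
    return True
```

Calling `reset_ghost_jobs` once at startup, before the worker loop begins, means an interrupted job is retried rather than left stuck. The handler then needs to tolerate running twice for the same payload.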
Really interesting approach to local memory. The SQLite + MCP combo is clean. Curious how you handle context relevance: when Cursor queries past decisions, how does it decide which memories are actually useful vs noise from older conversations?
Glia
@harshalvc_ai Good question! Relevance filtering happens at two layers:
Retrieval scoring - chunks are ranked by cosine similarity against the HyDE-augmented query embedding, with keyword boosting applied if the query entities match chunk content. Lower-scoring chunks get dropped naturally.
Character budget - the top-ranked chunks fill a fixed character budget (6000 chars per session), so noisy or older low-relevance context gets crowded out before it ever reaches the LLM.
There's no explicit time-decay penalty on older memories yet, so it's purely relevance-driven. That's something I want to add in v2, since a decision made 6 months ago probably deserves less weight than one made last week even if it's semantically similar.
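The two layers described above can be sketched as a rank-then-fill routine. The 6000-character budget comes from the reply. The rest is assumed for illustration: the `Scored` class, the additive boost of 0.1 per matched entity, and the choice to stop at the first chunk that does not fit.

```python
from dataclasses import dataclass


@dataclass
class Scored:
    """A chunk plus its precomputed cosine similarity to the query embedding."""
    text: str
    similarity: float


def keyword_boost(query_entities: list[str], text: str, boost: float = 0.1) -> float:
    """Add a fixed bonus for each query entity found in the chunk text."""
    lowered = text.lower()
    return boost * sum(1 for e in query_entities if e.lower() in lowered)


def select_context(
    chunks: list[Scored],
    query_entities: list[str],
    budget: int = 6000,
) -> list[str]:
    """Rank by similarity plus keyword boost, then fill the character budget."""
    ranked = sorted(
        chunks,
        key=lambda c: c.similarity + keyword_boost(query_entities, c.text),
        reverse=True,
    )
    picked, used = [], 0
    for chunk in ranked:
        if used + len(chunk.text) > budget:
            break  # assumption: stop at the first chunk that would overflow
        picked.append(chunk.text)
        used += len(chunk.text)
    return picked
```

Because the fill is strictly in rank order, low-scoring chunks only make it in when there is room left over, which is how older or noisier context gets crowded out without an explicit cutoff.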
Great work! The 7-platform Chrome extension surface is probably the trickiest bet here? Claude, ChatGPT, Gemini all rework their DOM constantly and each one breaks differently.
Glia
@artstavenka1 On point! Maintaining the DOM extraction is the trickiest part, so I made a selector staleness checker that runs every week and creates an issue if the DOM changes on any of the supported platforms.
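As a rough idea of what such a staleness check might do, the sketch below flags selectors that no longer match anything in a saved HTML snapshot. It is a simplified stand-in, not the real checker: it supports only `tag` and `tag.class` selectors, the selector labels are hypothetical, and the weekly scheduling and issue creation are left out.

```python
from html.parser import HTMLParser


class _ElementCollector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: list[tuple[str, dict]] = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs)))


def _matches(selector: str, tag: str, attrs: dict) -> bool:
    """Match a 'tag' or 'tag.class' selector against one element."""
    name, _, cls = selector.partition(".")
    if name and name != tag:
        return False
    if cls and cls not in (attrs.get("class") or "").split():
        return False
    return True


def stale_selectors(html: str, selectors: dict[str, str]) -> list[str]:
    """Return the labels of selectors that match no element in the snapshot."""
    collector = _ElementCollector()
    collector.feed(html)
    return [
        label
        for label, selector in selectors.items()
        if not any(_matches(selector, tag, attrs) for tag, attrs in collector.elements)
    ]
```

A scheduled job could run this against a fresh snapshot of each supported platform and open an issue whenever the returned list is non-empty.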
mailX by mailwarm (YC S20)
ok this does tackle a huge discomfort of mine. Good job & congrats on your launch
Glia
@naimz Thank you Naim, that means a lot coming from you! The context-switching pain is real, and I'm hoping Glia makes that a lot less frustrating. Would love to hear how it holds up in your workflow if you give it a spin.