Winnow

Keep the signal. Drop the noise.

2 followers

Winnow compresses RAG prompts before they reach your LLM, cutting token costs by 50% or more while preserving meaning. It combines question-guided filtering with LLMLingua-2 for semantic accuracy.

Key features:
• FastAPI server with OpenAI-compatible proxy
• Batch compression API
• Question-aware filtering that keeps answer-relevant tokens
• Docker self-hosting, pip-installable SDK
• MIT licensed
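To make the idea of question-aware filtering concrete, here is a minimal sketch in Python. It is not Winnow's implementation and does not use LLMLingua-2: it swaps in a simple word-overlap score as a stand-in for a learned relevance model. The function name `filter_context`, the stopword list, and the `budget` parameter are all hypothetical, chosen only for illustration.

```python
import re

# Small illustrative stopword list (an assumption, not taken from Winnow).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "what", "which", "who"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def filter_context(question, passages, budget):
    """Keep the passages that best match the question, within a token budget.

    Relevance here is plain word overlap with the question, a toy stand-in
    for the learned scoring a real compressor would use.
    """
    q_terms = set(tokenize(question)) - STOPWORDS
    scored = []
    for idx, passage in enumerate(passages):
        tokens = tokenize(passage)
        overlap = sum(1 for t in tokens if t in q_terms)
        score = overlap / len(tokens) if tokens else 0.0
        scored.append((score, idx, len(tokens)))

    kept, used = [], 0
    # Visit passages from most to least relevant; ties keep original order.
    for score, idx, n_tokens in sorted(scored, key=lambda s: (-s[0], s[1])):
        if score > 0 and used + n_tokens <= budget:
            kept.append(idx)
            used += n_tokens

    # Return survivors in their original order so the context still reads naturally.
    return [passages[i] for i in sorted(kept)]


if __name__ == "__main__":
    docs = [
        "The Eiffel Tower is in Paris.",
        "Bananas are rich in potassium.",
        "Paris is the capital of France.",
    ]
    print(filter_context("What is the capital of France?", docs, budget=10))
```

In this toy example only the third passage shares non-stopword terms with the question, so it is the only one kept. A production compressor would work at the token level and use a trained model rather than lexical overlap.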

Free

Launch tags: Open Source • Developer Tools • Artificial Intelligence

Launch Team

Maker

🚀 Winnow is live (beta)! Are RAG pipelines eating your token budget? Winnow compresses prompts by 50% or more while keeping answer-relevant tokens, using question-guided filtering + LLMLingua-2.

Try the live demo → https://trywinnow.vercel.app
GitHub → https://github.com/itsaryanchauh...

Currently testing and fixing along the way, so feedback is welcome! What would make this better for your stack?

3mo ago

Reviews