Launched this week

GitHub
Transparent semantic cache for LLM API calls on Redis VS
6 followers
Khazad is a transparent semantic cache for LLM API calls. It intercepts LLM HTTP traffic at the httpx transport layer and serves semantically equivalent requests from a Redis 8 vector cache, with zero code changes. It works with OpenAI, Anthropic, Gemini, Azure OpenAI, and Mistral, and offers model-aware and conversation-aware caching, full streaming support, TTL-based expiry, and tunable similarity thresholds. Stop paying for the same prompt twice in dev, CI, demos, or production. Open source (MIT).
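To make the idea of a model-aware semantic cache with a TTL and a similarity threshold concrete, here is a minimal in-memory sketch. It is not Khazad's code or API: the `SemanticCache` class, the `embed` and `cosine` helpers, and the default values are all hypothetical, the bag-of-words "embedding" stands in for a real embedding model, and a plain dictionary stands in for the Redis vector store.

```python
import math
import time
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a real cache would use an embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical illustration only: entries are partitioned by model and expire after a TTL."""

    def __init__(self, threshold=0.8, ttl_seconds=3600.0):
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # model name -> list of (vector, response, expires_at)

    def put(self, model, prompt, response, now=None):
        now = time.time() if now is None else now
        entry = (embed(prompt), response, now + self.ttl_seconds)
        self._entries.setdefault(model, []).append(entry)

    def get(self, model, prompt, now=None):
        """Return the best unexpired response at or above the threshold, else None."""
        now = time.time() if now is None else now
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response, expires_at in self._entries.get(model, []):
            if expires_at <= now:
                continue  # expired entries are never served
            score = cosine(query, vector)
            if score >= self.threshold and score > best_score:
                best_score, best_response = score, response
        return best_response


if __name__ == "__main__":
    cache = SemanticCache(threshold=0.8, ttl_seconds=60)
    cache.put("model-a", "what is the capital of France", "Paris")
    # A near-duplicate prompt for the same model can be served from the cache.
    print(cache.get("model-a", "what is the capital city of France"))
    # The same prompt for a different model is a miss, because lookups are per model.
    print(cache.get("model-b", "what is the capital of France"))
```

Raising the threshold trades hit rate for safety: fewer prompts match, but a cached answer is less likely to be served for a question that only looks similar.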