Launched this week
GitHub

Transparent semantic cache for LLM API calls on Redis VS

Khazad is a transparent semantic cache for LLM API calls. It intercepts LLM HTTP traffic at the httpx transport layer and serves semantically equivalent requests from a Redis 8 vector cache, with zero code changes. It works with OpenAI, Anthropic, Gemini, Azure OpenAI, and Mistral. Features include model-aware and conversation-aware caching, full streaming support, TTL-based expiry, and tunable similarity thresholds. Stop paying for the same prompt twice in dev, CI, demos, or production. Open source (MIT).
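To make the idea of a model-aware semantic lookup with a TTL and a similarity threshold concrete, here is a minimal sketch in plain Python. It is not Khazad's code or API: the class `ToySemanticCache`, the helpers `embed` and `cosine`, the default threshold of 0.8, and the model names are all hypothetical. It also swaps in a toy bag-of-words vector and an in-memory list where a real setup would presumably use an embedding model and a Redis vector index.

```python
import math
import time
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySemanticCache:
    """Hypothetical in-memory stand-in for a vector cache; not Khazad's implementation."""

    def __init__(self, threshold=0.8, ttl_seconds=3600.0):
        self.threshold = threshold      # minimum similarity that counts as a hit
        self.ttl_seconds = ttl_seconds  # how long an entry stays valid
        self._entries = []              # (model, vector, response, expires_at)

    def put(self, model, prompt, response, now=None):
        now = time.time() if now is None else now
        self._entries.append((model, embed(prompt), response, now + self.ttl_seconds))

    def get(self, model, prompt, now=None):
        now = time.time() if now is None else now
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for entry_model, vector, response, expires_at in self._entries:
            # Model-aware: never reuse an answer produced by a different model.
            # TTL: skip entries that have expired.
            if entry_model != model or expires_at <= now:
                continue
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        # Serve from cache only when the best match clears the threshold.
        return best_response if best_score >= self.threshold else None


cache = ToySemanticCache(threshold=0.8, ttl_seconds=60)
cache.put("model-a", "What is the capital of France", "Paris", now=0)
hit = cache.get("model-a", "what is the capital of france today", now=1)
miss = cache.get("model-b", "What is the capital of France", now=1)
```

In this sketch `hit` is served from the cache because the two prompts are nearly identical, while `miss` is `None` because the entry was stored under a different model. Raising the threshold trades hit rate for a lower risk of returning an answer to a question that only looks similar.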
Free