Vector Cache

A Python Library for Efficient LLM Query Caching

As AI applications gain traction, the costs and latency of using large language models (LLMs) can escalate. VectorCache addresses these issues by caching LLM responses based on semantic similarity, thereby reducing both costs and response times.

Shivendra Soni
Dear Makers and Product Hunt community, I am Shivendra, and I am incredibly happy to introduce Vector Cache, a streamlined Python library that enhances LLM query performance through semantic caching, making responses faster and more cost-effective.

Vector Cache optimizes data retrieval by intelligently caching semantically similar requests. This means faster response times and reduced load on your databases, which is perfect for applications dealing with large volumes of data or complex queries.

📈 Benefits: Vector Cache, akin to a more nuanced Redis, enables efficient caching by recognizing not just exact matches but also semantically similar queries. This is particularly useful in domains where queries within a specific topic or field are frequent. Vector Cache is built to be LLM agnostic, supports multiple static caches (Redis, Mongo, Postgres, etc.) and even more vector stores (ChromaDB, Deeplake, PGVector, etc.), and includes a dynamic thresholding feature which adjusts the similarity threshold based on cache hit and miss rates.

I would love for you all to try this and give me feedback (a star on the repo would also be great :D). Do create issues for features you want to see.
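To make that concrete, here's a rough sketch of the semantic-cache flow. The names below are illustrative only, not Vector Cache's actual API:

```python
# Illustrative sketch of semantic caching (hypothetical names, not the real API).

def cached_completion(query, embed, vector_store, kv_store, llm, threshold=0.9):
    """Serve semantically similar queries from cache; fall back to the LLM."""
    query_vec = embed(query)                 # embed the incoming query
    match = vector_store.nearest(query_vec)  # closest previously seen query
    if match is not None and match.similarity >= threshold:
        return kv_store.get(match.key)       # cache hit: reuse the stored response
    answer = llm(query)                      # cache miss: pay for an LLM call
    key = kv_store.put(answer)               # persist the new response
    vector_store.add(query_vec, key)         # index the query for future hits
    return answer
```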
Alexander Moore
💡 Bright idea
Interesting product, Shivendra! Dealing with large datasets and optimizing queries can be such a pain. The semantic caching bit sounds like a game-changer for performance. Just curious, how does Vector Cache handle scaling when dealing with multiple vector stores like Chromadb and deeplake? Also, loving the LLM agnostic approach, opens up a lot of possibilities!
Shivendra Soni
@alexandermoore Thank you, Alexander. Right now, the way to use it is: you instantiate a vector store and a KV store and pass them to the vector cache object. To use multiple vector stores, you'd currently have to create multiple instances of the vector cache. If there's a specific use case, please create an issue on GitHub, and I'd love to help you out. Again, thanks for the kind words.
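In code, that wiring might look roughly like this (the import paths and constructor names here are assumptions for illustration, not the documented API):

```python
# All names below are assumptions, not Vector Cache's documented API.
from vector_cache import VectorCache                 # assumed import path
from vector_cache.stores import RedisStore           # assumed KV store adapter
from vector_cache.vector_stores import ChromaStore   # assumed vector store adapter

# One vector store + one KV store per cache instance.
cache = VectorCache(
    vector_store=ChromaStore(collection="queries"),
    kv_store=RedisStore(url="redis://localhost:6379"),
)

# To target a second vector store, create a second cache instance:
support_cache = VectorCache(
    vector_store=ChromaStore(collection="support_queries"),
    kv_store=RedisStore(url="redis://localhost:6379"),
)
```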
Shivendra Soni
@alexandermoore Also, Alex, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
Mary Garcia
Great to see innovative solutions like VectorCache coming out, @shivendra_soni! The emphasis on semantic caching to enhance LLM performance is exactly what the industry needs. This could significantly lower costs and boost efficiency for developers working with large datasets. Definitely upvoting this! 💡
Shivendra Soni
@marygarcia Thank you, Mary, it means a lot. That was exactly the plan: to let orgs save LLM costs on similar queries.
Shivendra Soni
@marygarcia Also, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
Michael Green
Great to see a project like VectorCache making waves, Shivendra! The whole semantic caching approach sounds like a game-changer. It's impressive how you're tackling both latency and cost with such a nuanced solution. The fact that it supports multiple static caches and vector stores is a huge plus, especially for folks working in diverse database environments. I'm curious about the dynamic thresholding feature—those details about adjusting based on cache hit rates are definitely a clever touch. It’s like giving LLMs a superpower to understand context better. I can see this being extremely beneficial for B2B applications where performance is key. Excited to give it a spin and see how it fares against traditional setups. Keep pushing the envelope, and I'll definitely star the repo! 🚀
Shivendra Soni
@michaelgreen Thank you, Michael, for the kind words. Do create an issue on GitHub for any feature requirement or use case you might have, and I'll be happy to build it for you or assist you with it.
Shivendra Soni
@michaelgreen Also, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
Elke
This sounds impressive, Shivendra! I'm curious how Vector Cache handles scaling with increased query volume. Do you have any benchmarks yet on its performance compared to traditional caching solutions like Redis? Also, how does the dynamic thresholding work in practice? It seems like a valuable addition!
Shivendra Soni
@elke_qin Hi Elke, unfortunately I haven't been able to test scaling yet. But since it's an application-level library, scale essentially depends on what the underlying database and vector store can handle. Dynamic thresholding is a simple implementation: it adjusts the threshold until a desired cache hit rate is met. There are of course some corner cases here, hence I have explicitly called it out as an experimental feature.
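In spirit, the experimental logic looks something like the sketch below: nudge the similarity threshold until the observed hit rate approaches a target. The class and update rule are simplified illustrations, not the library's exact implementation:

```python
# Simplified sketch of dynamic thresholding; names and the update rule
# are illustrative assumptions, not the library's actual code.

class DynamicThreshold:
    def __init__(self, threshold=0.9, target_hit_rate=0.3, step=0.01,
                 lo=0.5, hi=0.99):
        self.threshold = threshold   # current similarity cutoff
        self.target = target_hit_rate
        self.step = step             # how aggressively to adjust
        self.lo, self.hi = lo, hi    # hard bounds on the threshold
        self.hits = 0
        self.total = 0

    def record(self, hit: bool) -> None:
        """Log a cache hit/miss and nudge the threshold toward the target rate."""
        self.hits += int(hit)
        self.total += 1
        rate = self.hits / self.total
        if rate < self.target:
            # Too few hits: loosen the cutoff so more queries match.
            self.threshold = max(self.lo, self.threshold - self.step)
        elif rate > self.target:
            # Too many hits: tighten it to guard against false positives.
            self.threshold = min(self.hi, self.threshold + self.step)
```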
Shivendra Soni
@elke_qin Also, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
dashuai
Great to see the launch of VectorCache, @shivendra_soni! 📊 The concept of semantic caching to improve LLM query performance is truly game-changing. In today’s landscape where response time and cost management are paramount, this could make a significant impact on many projects, particularly those in B2B and data-intensive applications. The ability to use multiple static caches and vector stores really enhances its versatility, and I’m looking forward to testing the dynamic thresholding feature. Kudos to you and the team for this innovation! Will definitely be keeping an eye on your updates and looking to provide feedback. Keep up the great work!
Shivendra Soni
@ezshine Thank you for the kind words. Please check out the GitHub repo and star it or fork it if possible.
Sukhmani Kaur
Cheers for the launch! @shivendra_soni I have a vector store which is not yet part of the library; how can I use it?
Shivendra Soni
@sukhmani_kaur_paperguide Hi, there is a base interface which you can implement for your own vector store, and then pass it as a param to the vector cache object instantiation.
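As a sketch, a custom store might look like the toy implementation below; the method names are assumptions about the interface's shape, not the actual base class:

```python
# Toy in-memory vector store; the add/nearest interface shape is an
# assumption about the library's base class, not its real definition.
import math
from typing import Optional, Tuple

class InMemoryVectorStore:
    """Brute-force cosine-similarity search over stored embeddings."""

    def __init__(self):
        self._items = []  # list of (key, vector) pairs

    def add(self, key: str, vector: list) -> None:
        self._items.append((key, vector))

    def nearest(self, vector: list) -> Optional[Tuple[str, float]]:
        """Return (key, similarity) of the closest stored vector, if any."""
        best = None
        for key, stored in self._items:
            sim = self._cosine(vector, stored)
            if best is None or sim > best[1]:
                best = (key, sim)
        return best

    @staticmethod
    def _cosine(a, b) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

# An instance would then be passed as the vector store when constructing
# the cache object.
```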