Vector Cache

A Python Library for Efficient LLM Query Caching

As AI applications gain traction, the costs and latency of using large language models (LLMs) can escalate. VectorCache addresses these issues by caching LLM responses based on semantic similarity, thereby reducing both costs and response times.

Shivendra Soni
Dear Makers and Product Hunt community, I am Shivendra, and I am incredibly happy to introduce Vector Cache, a streamlined Python library that enhances LLM query performance through semantic caching, making responses faster and more cost-effective.

Vector Cache optimizes data retrieval by intelligently caching semantically similar requests. This means faster response times and reduced load on your databases, which is perfect for applications dealing with large volumes of data or complex queries.

📈 Benefits: Vector Cache, akin to a more nuanced Redis, enables efficient caching by recognizing not just exact matches but also semantically similar queries. This is particularly useful in domains where queries within a specific topic or field are frequent. Vector Cache is built to be LLM agnostic, supports multiple static caches (Redis, Mongo, Postgres, etc.) and even more vector stores (ChromaDB, Deeplake, PGVector, etc.), and includes a dynamic thresholding feature which adjusts the similarity threshold based on cache hit and miss rates.

I would love for you all to try this and give me feedback (a star on the repo would also be great :D). Do create issues for features you want to see.
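To make that concrete, here's a rough sketch of the semantic-cache flow. The names below are illustrative only, not Vector Cache's actual API:

```python
# Illustrative sketch of semantic caching (hypothetical names, not the real API).

def cached_completion(query, embed, vector_store, kv_store, llm, threshold=0.9):
    """Serve semantically similar queries from cache; fall back to the LLM."""
    query_vec = embed(query)                 # embed the incoming query
    match = vector_store.nearest(query_vec)  # closest previously seen query
    if match is not None and match.similarity >= threshold:
        return kv_store.get(match.key)       # cache hit: reuse the stored response
    answer = llm(query)                      # cache miss: pay for an LLM call
    key = kv_store.put(answer)               # persist the new response
    vector_store.add(query_vec, key)         # index the query for future hits
    return answer
```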
Alexander Moore
💡 Bright idea
Interesting product, Shivendra! Dealing with large datasets and optimizing queries can be such a pain. The semantic caching bit sounds like a game-changer for performance. Just curious, how does Vector Cache handle scaling when dealing with multiple vector stores like Chromadb and deeplake? Also, loving the LLM agnostic approach, opens up a lot of possibilities!
Shivendra Soni
@alexandermoore Thank you, Alexander. Right now, the way to use it is: you instantiate a vector store and a KV store and pass them to the vector cache object. To use multiple vector stores, you'd currently have to create multiple instances of the vector cache. If there's a specific use case, please create an issue on GitHub, and I'd love to help you out. Again, thanks for the kind words.
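In code, that wiring might look roughly like this (the import paths and constructor names here are assumptions for illustration, not the documented API):

```python
# All names below are assumptions, not Vector Cache's documented API.
from vector_cache import VectorCache                 # assumed import path
from vector_cache.stores import RedisStore           # assumed KV store adapter
from vector_cache.vector_stores import ChromaStore   # assumed vector store adapter

# One vector store + one KV store per cache instance.
cache = VectorCache(
    vector_store=ChromaStore(collection="queries"),
    kv_store=RedisStore(url="redis://localhost:6379"),
)

# To target a second vector store, create a second cache instance:
support_cache = VectorCache(
    vector_store=ChromaStore(collection="support_queries"),
    kv_store=RedisStore(url="redis://localhost:6379"),
)
```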
Shivendra Soni
@alexandermoore Also, Alex, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
Mary Garcia
Great to see innovative solutions like VectorCache coming out, @shivendra_soni! The emphasis on semantic caching to enhance LLM performance is exactly what the industry needs. This could significantly lower costs and boost efficiency for developers working with large datasets. Definitely upvoting this! 💡
Shivendra Soni
@marygarcia Thank you, Mary, it means a lot. That was exactly the plan: to let orgs save LLM costs on similar queries.
Shivendra Soni
@marygarcia Also, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
Michael Green
Great to see a project like VectorCache making waves, Shivendra! The whole semantic caching approach sounds like a game-changer. It's impressive how you're tackling both latency and cost with such a nuanced solution. The fact that it supports multiple static caches and vector stores is a huge plus, especially for folks working in diverse database environments. I'm curious about the dynamic thresholding feature—those details about adjusting based on cache hit rates are definitely a clever touch. It’s like giving LLMs a superpower to understand context better. I can see this being extremely beneficial for B2B applications where performance is key. Excited to give it a spin and see how it fares against traditional setups. Keep pushing the envelope, and I'll definitely star the repo! 🚀
Shivendra Soni
@michaelgreen Thank you, Michael, for the kind words. Do create an issue on GitHub for any feature requirement or use case you might have, and I'll be happy to build it for you or assist you with it.
Shivendra Soni
@michaelgreen Also, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
Elke
This sounds impressive, Shivendra! I'm curious how Vector Cache handles scaling with increased query volume. Do you have any benchmarks yet on its performance compared to traditional caching solutions like Redis? Also, how does the dynamic thresholding work in practice? It seems like a valuable addition!
Shivendra Soni
@elke_qin Hi Elke, unfortunately I haven't been able to test scaling yet. But since it's an application-level library, scale essentially depends on what the underlying database and vector store can handle. Dynamic thresholding is a simple implementation: it adjusts the threshold until a desired cache hit rate is met. There are of course some corner cases here, hence I have explicitly called it out as an experimental feature.
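In spirit, the experimental logic looks something like the sketch below: nudge the similarity threshold until the observed hit rate approaches a target. The class and update rule are simplified illustrations, not the library's exact implementation:

```python
# Simplified sketch of dynamic thresholding; names and the update rule
# are illustrative assumptions, not the library's actual code.

class DynamicThreshold:
    def __init__(self, threshold=0.9, target_hit_rate=0.3, step=0.01,
                 lo=0.5, hi=0.99):
        self.threshold = threshold   # current similarity cutoff
        self.target = target_hit_rate
        self.step = step             # how aggressively to adjust
        self.lo, self.hi = lo, hi    # hard bounds on the threshold
        self.hits = 0
        self.total = 0

    def record(self, hit: bool) -> None:
        """Log a cache hit/miss and nudge the threshold toward the target rate."""
        self.hits += int(hit)
        self.total += 1
        rate = self.hits / self.total
        if rate < self.target:
            # Too few hits: loosen the cutoff so more queries match.
            self.threshold = max(self.lo, self.threshold - self.step)
        elif rate > self.target:
            # Too many hits: tighten it to guard against false positives.
            self.threshold = min(self.hi, self.threshold + self.step)
```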
Shivendra Soni
@elke_qin Also, if you see any value here, please consider starring the GitHub repo. It would give me the visibility and motivation to keep maintaining and growing it.
dashuai
Great to see the launch of VectorCache, @shivendra_soni! 📊 The concept of semantic caching to improve LLM query performance is truly game-changing. In today’s landscape where response time and cost management are paramount, this could make a significant impact on many projects, particularly those in B2B and data-intensive applications. The ability to use multiple static caches and vector stores really enhances its versatility, and I’m looking forward to testing the dynamic thresholding feature. Kudos to you and the team for this innovation! Will definitely be keeping an eye on your updates and looking to provide feedback. Keep up the great work!
Shivendra Soni
@ezshine Thank you for the kind words. Please check out the GitHub repo and star it or fork it if possible.
Sukhmani Kaur
Cheers for the launch! @shivendra_soni I have a vector store which is not yet part of the library; how can I use it?
Shivendra Soni
@sukhmani_kaur_paperguide Hi, there is a base interface which you can implement for your own vector store, and then pass it as a param to the vector cache object instantiation.
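As a sketch, a custom store might look like the toy implementation below; the method names are assumptions about the interface's shape, not the actual base class:

```python
# Toy in-memory vector store; the add/nearest interface shape is an
# assumption about the library's base class, not its real definition.
import math
from typing import Optional, Tuple

class InMemoryVectorStore:
    """Brute-force cosine-similarity search over stored embeddings."""

    def __init__(self):
        self._items = []  # list of (key, vector) pairs

    def add(self, key: str, vector: list) -> None:
        self._items.append((key, vector))

    def nearest(self, vector: list) -> Optional[Tuple[str, float]]:
        """Return (key, similarity) of the closest stored vector, if any."""
        best = None
        for key, stored in self._items:
            sim = self._cosine(vector, stored)
            if best is None or sim > best[1]:
                best = (key, sim)
        return best

    @staticmethod
    def _cosine(a, b) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

# An instance would then be passed as the vector store when constructing
# the cache object.
```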