Hey makers and hunters! I'm so happy to announce that we're finally launching Supavec on PH 😸
Last year, I saw many founders/devs confused when the shutdown of Carbon AI was announced, and that got me thinking... RAG platforms should be open source, just like @Supabase.
I couldn't help but start building one last Christmas Eve, and here we are!
What we're trying to achieve with Supavec: integrate your data with LLMs through simple APIs.
Today you can do all of this with Supavec very easily:
Upload PDFs/text files (and embed them)
Delete files
Get a list of your uploaded files
Run queries against your files
Obviously we have a long way to go, but I'd like to mark our great start here on PH 🙏
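The four operations above can be pictured with a toy in-memory store. To be clear, this is not the real Supavec API; class and method names are made up for illustration, keyword overlap stands in for embedding similarity, and a real service would embed each chunk with an embedding model:

```python
class ToyVecStore:
    """Toy stand-in for a file-based RAG store: upload, list, delete, query."""

    def __init__(self):
        self.files = {}  # file_id -> {"name": ..., "chunks": [...]}
        self.next_id = 1

    def upload(self, name, text, chunk_size=200):
        # Split the text into fixed-size chunks; a real service would
        # also embed each chunk here.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        file_id = self.next_id
        self.next_id += 1
        self.files[file_id] = {"name": name, "chunks": chunks}
        return file_id

    def list_files(self):
        return [{"id": fid, "name": f["name"]} for fid, f in self.files.items()]

    def delete(self, file_id):
        self.files.pop(file_id, None)

    def query(self, question, top_k=2):
        # Naive keyword overlap stands in for vector similarity.
        terms = set(question.lower().split())
        scored = [
            (len(terms & set(chunk.lower().split())), chunk)
            for f in self.files.values()
            for chunk in f["chunks"]
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [chunk for score, chunk in scored[:top_k] if score > 0]

store = ToyVecStore()
fid = store.upload("notes.txt", "Supavec embeds uploaded files for retrieval.")
print(store.list_files())                     # one file listed
print(store.query("What does Supavec embed?"))  # the matching chunk
store.delete(fid)
print(store.list_files())                     # []
```

The point of the shape is that each operation is one small call; the real service wraps the same lifecycle behind HTTP endpoints.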
@taishi_kato Interesting. Why do you think Carbon AI was shut down? And another question: do you see the concept of RAG always being a part of the future of AI and LLMs? What do you think RAG looks like in 5-10 years?
@taishi_kato The timing with Carbon AI's shutdown really highlights why open source alternatives are so crucial for building sustainable AI infrastructure! Great work!
@sentry_co Hey André!
Carbon AI was shut down because they got acquired by Perplexity.
I think in the AI agent era, RAG will still be important even though the context windows of AI models are getting bigger.
Given compute costs and the need for accurate context and answers, we should feed AI only the necessary information, imo.
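That "feed only the necessary information" step is, concretely, top-k retrieval: rank chunks by similarity to the query and pass only the best ones into the prompt. A minimal sketch, using toy hand-made embedding vectors (a real pipeline would use an embedding model):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    # chunks: list of (text, embedding) pairs
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

chunks = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.9, 0.0]),
    ("company history", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.1]  # pretend this embeds "how do refunds work?"
context = top_k(query, chunks, k=1)
print(context)  # only the most relevant chunk goes into the LLM prompt
```

Even with million-token context windows, this keeps the prompt small, cheap, and focused on relevant material.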
@jasonleowsg Thanks Jason!
Very cool that you're launching this as open source. Are you planning to turn it into a sustainable business as well, or what's your vision for the future?
@per_clingweld Thank you Per! I have a plan to make money from the cloud hosting version of it!
@taishi_kato Ah, that makes sense! Classic: pay for ease-of-use and infra.
Congrats on the launch! Excited to see more open-source fundamental tools for building AI applications.
BTW, my question is: since what you are doing is generally Jina Embeddings + Jina Segmenter + Jina Reranker (maybe you have a reranker here?) + Milvus/ChromaDB (plenty of such vector DBs) + a web UI, how do you stand out when a user wants:
finer control over the pipeline (custom embedding model / reranker model)
local deployment (privacy, accessibility)
hybrid search (Sparse BM25 + Dense Embedding)
query augmentation strategy (HyDE)
etc...
These were extremely useful in some of the projects I took part in, and I wonder if you have further plans to support such features.
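Of the features listed, hybrid search is the easiest to sketch: fuse a sparse lexical score with a dense embedding score. In the toy version below, plain term overlap stands in for BM25, the embeddings are hand-made, and the fusion weight is illustrative only:

```python
import math

def sparse_score(query, doc):
    # Stand-in for BM25: fraction of query terms found in the document.
    terms = query.lower().split()
    doc_terms = set(doc.lower().split())
    return sum(t in doc_terms for t in terms) / len(terms)

def dense_score(qv, dv):
    # Cosine similarity between query and document embeddings.
    dot = sum(x * y for x, y in zip(qv, dv))
    norms = math.sqrt(sum(x * x for x in qv)) * math.sqrt(sum(x * x for x in dv))
    return dot / norms if norms else 0.0

def hybrid_rank(query, qv, docs, alpha=0.5):
    # docs: list of (text, embedding); alpha balances sparse vs dense.
    scored = [
        (alpha * sparse_score(query, text) + (1 - alpha) * dense_score(qv, dv), text)
        for text, dv in docs
    ]
    return [text for _, text in sorted(scored, reverse=True)]

docs = [
    ("open source rag platform", [1.0, 0.0]),
    ("closed saas vector api", [0.0, 1.0]),
]
print(hybrid_rank("open source search", [0.9, 0.1], docs))
```

Sparse scoring catches exact terms the embedding may smooth over (IDs, rare names), while the dense score catches paraphrases, which is why the combination tends to beat either alone.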
@riafmz Thank you HY! It all depends on what customers want!
Suuuuper interesting & ambitious product!
Browsed the GitHub and it looks solid, keeping this in my toolbelt 💪
@dan_mindru Thanks Dan! Lmk if you have any questions or feature requests!
Let's go Taishi, seriously exciting stuff. People only ever want things to be simpler and simpler, so you're without a doubt hitting the pain points!
The UI is looking clean & I look forward to trying it out this weekend :)
Best of luck w/ the launch!!
@cranqnow Thanks Sam!
Let me know if you have any questions or feature requests!
I like this idea. Why was Supabase chosen to host your vectors? Just wondering, as I'm considering them for vector hosting.
Can the application reason about the data files that have been uploaded? For example, suppose I upload a 50-page document of rules and regulations; when users send natural language queries, will the API respond with results based on the knowledge contained in the document?