Michael Seibel

AnyParser - Accurate, private and configurable document retrieval LLM

AnyParser empowers financial services by accurately extracting insights and mapping text, tables, and charts from PDFs and images to databases, doubling insights from fivefold data. Designed with client privacy and enterprise integration as priorities.

Rachel Hu
Hey everyone 🎉 This is Rachel, Cofounder of Cambio ML.
Every day, cutting-edge AI applications extract data from millions of documents. However, traditional OCR models often struggle with accuracy, especially when pulling tables and charts, and they frequently miss crucial details in diverse document formats. That's why we've developed AnyParser, a multimodal model designed to accurately extract text, tables, and chart information from PDFs, PPTs, and images. With AnyParser, your AI applications can access a richer trove of information!
What do our customers love about AnyParser?
* 🛡 Privacy Protection: Activate the "Remove Private Information" feature, and AnyParser will automatically redact PII (Personally Identifiable Information) during document extraction.
* 🔐 Configurability: You can instruct the model to include or omit page numbers, headers, footers, figures, charts, etc.
* 📊 Diverse Extraction: AnyParser doesn’t just extract text and tables; it also retrieves figures, charts, and footnotes packed with vital information.
* 📈 High Accuracy: Bid farewell to the jumbled tables and chaotic layouts that plague traditional OCR-based models.
Over the past few months, AnyParser has helped dozens of financial services clients extract data from hundreds of thousands of document pages!
Ready to get started?
* If you're a financial analyst or wealth advisor tired of manual data entry, try your hand directly in our Playground: https://www.cambioml.com/playground!
* If you’re a developer at a financial institution working on RAG or LLM applications, book a demo for a FREE API testing key today: https://calendly.com/cambio-intr...!
We’re here to answer your questions and discuss how we can help enhance your AI applications.
Cheers,
Team Cambio ML
Max Savonin
@rachel_hu Hi Rachel and the Cambio ML team, AnyParser sounds like a valuable tool for developers working with AI applications that rely on accurate data extraction from documents! The focus on structured data extraction, including tables and charts, and the privacy protection features are impressive. Here are some questions and how I can help:
- Can you share more details about the AnyParser machine learning model? Is it based on optical character recognition (OCR) or a different approach?
- Does AnyParser allow customization of the output format (e.g., JSON, CSV) to integrate seamlessly with different AI applications?
As a developer, I'm interested in the technical aspects behind the model and its ability to handle complex document layouts. If those align with my expertise (natural language processing), I might be able to contribute to future development. Overall, AnyParser seems like a powerful tool for improving the accuracy and efficiency of data extraction for AI applications. I'd love to learn more about the underlying machine learning model and customization options. I'll be sure to check out the playground and potentially reach out for an API testing key!
M Sulaiman
@rachel_hu Congratulations on the launch of AnyParser! Looking forward to seeing how it transforms data processing.
Rachel Hu
@rachel_hu @max_savonin1 Hey Max, great questions.
- For question 1, we originally trained "larger" OCR-based models, but they were not learning... OCR can handle basic text extraction (i.e. unstructured data) but struggles with structured data like tables and charts. So we trained this new multimodal model to handle complex layouts.
- For question 2, AnyParser can output JSON or CSV, with a user-defined schema!
Richard Song
We have used AnyParser extensively in our RAG pipeline at Epsilla. We compared AnyParser with Unstructured.io and LlamaParse, and AnyParser performs significantly better than the other options, especially on table and chart extraction. Garbage in, garbage out: with AnyParser, our RAG pipeline performs much better in financial and healthcare use cases on unstructured data sources.
Andrew Aikawa
I've used and trained some OCR models before, but it was always super hard to structure data from tables, especially since the existing OCR solutions focused only on extracting text without making the connection between entries. It's exciting to see this new capability, which I feel was lacking from previous generations of OCR solutions.
Rachel Hu
@asai Exactly!! We originally tried training "larger" OCR-based models, but they were not learning... OCR can handle basic text extraction (i.e. unstructured data) but struggles with structured data like tables and charts.
Andriy Semenets
Congratulations on the launch! Is it possible to use it as an API? It would be nice to mass-parse lots of documents all at once. Also, what file formats are supported for export?
Rachel Hu
@semanser Hey Andriy, sure, shoot an email to info@cambioml.com and we will share a FREE testing API key.
Ditarth Desai
Congratulations on the successful launch, @rachel_hu
Rachel Hu
@ditarth_wbs Thank you Ditarth!
Pavel Bocharov
Congrats on the launch and kudos to the team! This is a very useful product, my accountant will definitely appreciate it. Upvoted!
Jojo Ortiz
@pavel_bocharov Thanks! Great to hear that your accountant will appreciate it. Let us know if you have any feedback!
Eddie Guo
Great work, Rachel! Looking forward to trying the tool out
Alex Chen
Congrats! Really amazing product!
Hazel Pan
Congratulations on the launch of AnyParser on Product Hunt! As a user, I'm excited to see this powerful document retrieval LLM hit the market. The accurate, private, and configurable features are exactly what financial services need to efficiently extract insights and map data. I'm looking forward to seeing how AnyParser can streamline my workflow and unlock new opportunities.
Rachel Hu
@hazellllpan Thanks Hazel!
Jing Conan Wang
This is an awesome tool that allows us to analyze many more documents that I couldn’t analyze before. Highly recommend it!
Rachel Hu
@jingconan Thanks Jing! We are going to automate more tedious document tasks!
Yulong Liu
Congratulations on the launch, Rachel! AnyParser is amazing—making it easy to parse tables and charts from PDFs and images is indeed an important problem to solve. Great job!
Rachel Hu
@brianliu Thank you Yulong!
Mariam Ahmed
Congrats on the launch! This is awesome! We could actually use this at Menza...
Rachel Hu
@mariam_ahmed7 Thank you Mariam!! Let's talk!!
Eli Silas
I appreciate the thoughtfulness behind AnyParser's design. It’s clear that a lot of effort went into making it user-friendly. Congrats on the launch!
Jojo Ortiz
@eli_silas Thanks Eli!
Alena Bacharova
Congratulations on the launch! The ability to accurately extract text, tables, and charts while maintaining privacy and configurability sounds incredible!
Jojo Ortiz
@alena_bach Thanks Alena!
Sumanpreet Kaur
Incredible work to simplify the financial data. Great approach!
Jojo Ortiz
@sumanpreet_kaur3 Thanks Sumanpreet!
Daniel Tian
@rachel_hu Congrats on the launch! This looks incredible
Honglei Liu
Super cool! We have been trying to find a pdf parser like this for a while. This looks like a great solution!
M Sulaiman
Congratulations on the launch of AnyParser! This tool is a game-changer for financial services, enabling accurate extraction of insights from PDFs and images while prioritizing client privacy and enterprise integration. The ability to map text, tables, and charts to databases and double the insights from data is impressive. Looking forward to seeing how it transforms data processing.
Jojo Ortiz
@b2bsulaiman Thanks! We're excited to keep working on transforming data processing
Congrats on the launch! Big news for the financial services industry!
Jojo Ortiz
@ruming_zhen Thanks Ruming!!
Zack Li
Huge congrats on your launch! I know Rachel personally: @rachel_hu is an amazing founder, and we've known each other since our time at Amazon. She is very talented and sharp. I've tried the PDF parser functionality of her product and found it very convenient.
Rachel Hu
@zack_learner Thanks for the support, Zack!! You and NEXA's work are also impressive!!