Kadoa uses AI to explore, extract, and transform web data. Save hours of time setting up and creating web scrapers. Extract the data you need effortlessly with Kadoa.
Hi PH!
We got frustrated with the time and effort required to code and maintain custom web scrapers, so we built an LLM-based solution that can extract data from any website in the format you want. AI should automate tedious and uncreative work, and web scraping definitely fits this description.
We're leveraging large language models to semantically understand websites and generate the DOM selectors for them. Calling GPT for every single data extraction, as most comparable tools do, would be far too expensive and slow. Using LLMs only to generate the scraper code, and later to adapt it when a website changes, is much more efficient.
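To make the cost argument concrete, here is a minimal Python sketch of the "generate once, repair on failure" idea. It is an illustration under stated assumptions, not the production implementation: `SelectorCache`, `fake_llm_generate_selector`, and `extract_by_class` are hypothetical names, the "LLM" is a regex stand-in, and a "selector" is reduced to a plain class name.

```python
import re
from typing import Callable, Dict, Optional


def fake_llm_generate_selector(html: str) -> str:
    """Stand-in for the expensive LLM call (assumption for this sketch).

    Returns the class of the first element whose text starts with "$".
    """
    match = re.search(r'class="([^"]+)">\$', html)
    if match is None:
        raise ValueError("no price-like element found")
    return match.group(1)


def extract_by_class(html: str, class_name: str) -> Optional[str]:
    """Cheap extraction step: text of the first element with this class."""
    pattern = r'class="' + re.escape(class_name) + r'">([^<]*)<'
    match = re.search(pattern, html)
    return match.group(1) if match else None


class SelectorCache:
    """Reuses a generated selector per site and regenerates it only
    when it is missing or no longer matches the page."""

    def __init__(
        self,
        generate: Callable[[str], str],
        extract: Callable[[str, str], Optional[str]],
    ) -> None:
        self.generate = generate
        self.extract = extract
        self.selectors: Dict[str, str] = {}
        self.llm_calls = 0  # counts how often the expensive step ran

    def scrape(self, site: str, html: str) -> Optional[str]:
        selector = self.selectors.get(site)
        if selector is not None:
            value = self.extract(html, selector)
            if value is not None:
                return value  # cheap path: no LLM call needed
        # Cache miss, or the site changed and the old selector broke.
        self.llm_calls += 1
        selector = self.generate(html)
        self.selectors[site] = selector
        return self.extract(html, selector)
```

With this shape, scraping the same site repeatedly costs one generation call up front, plus one more each time the markup changes enough to break the cached selector.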
Try it out for free on our playground https://kadoa.com/playground and let us know what you think! And please don't bankrupt us :)
Here are a few examples:
- Product Listings (Specialized Bikes) https://www.kadoa.com/playground...
- Financial Data (Yahoo Finance) https://www.kadoa.com/playground...
- Player Stats (LeagueOfGraphs) https://www.kadoa.com/playground...
How it works (the playground uses a simplified version of this):
- Loading the website: automatically decide which kind of proxy and browser we need
- Analysing network calls: try to find the desired data in the network calls
- Preprocessing the DOM: remove all unnecessary elements and compress it into a structure that GPT can understand
- Slicing: slice the DOM into multiple chunks while still keeping the overall context
- Selector extraction: use GPT (or Flan-T5) to find the desired information with the corresponding selectors
- Data extraction: pull out the data in the desired format
- Validation: hallucination checks and verification that the data is actually on the website and in the right format
- Data transformation: clean and map the data (e.g. if we need to aggregate data from multiple sources into the same format). LLMs are great at this task too.
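Three of the steps above (preprocessing, slicing, and validation) can be sketched with the Python standard library alone. This is a rough illustration under assumptions, not the real pipeline: the names `DomCompressor`, `preprocess`, `slice_dom`, and `validate` are hypothetical, the chunk budget is counted in characters rather than tokens, and the hallucination check is a plain substring match against the visible page text.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple

SKIPPED_TAGS = {"script", "style", "noscript", "svg"}  # dropped with their content
VOID_TAGS = {"br", "img", "input", "meta", "link", "hr"}  # dropped, no content
KEPT_ATTRS = {"id", "class"}  # the attributes most useful for selectors


class DomCompressor(HTMLParser):
    """Collects a compact version of the DOM plus its visible text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []  # compact tags and text, in document order
        self.texts: List[str] = []  # visible text only, used for validation
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in VOID_TAGS:
            return
        kept = "".join(f' {k}="{v}"' for k, v in attrs if k in KEPT_ATTRS and v)
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self.skip_depth:
            self.parts.append(text)
            self.texts.append(text)


def preprocess(html: str) -> Tuple[List[str], str]:
    """Returns (compact DOM parts, visible page text)."""
    parser = DomCompressor()
    parser.feed(html)
    parser.close()
    return parser.parts, " ".join(parser.texts)


def slice_dom(parts: List[str], context: str, max_chars: int) -> List[str]:
    """Packs parts into chunks of roughly max_chars, each prefixed with
    the same context line so no chunk loses the overall picture."""
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for part in parts:
        if current and size + len(part) > max_chars:
            chunks.append(context + "\n" + "".join(current))
            current, size = [], 0
        current.append(part)
        size += len(part)
    if current:
        chunks.append(context + "\n" + "".join(current))
    return chunks


def validate(record: Dict[str, str], page_text: str) -> Dict[str, bool]:
    """Flags each extracted value that does not literally appear on the page."""
    return {field: value in page_text for field, value in record.items()}
```

A value that shows up in the extracted record but not in the visible page text (for example, one that only appears inside a script tag, or one the model made up) comes back as `False` from `validate` and can be rejected or retried.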
The vision is a fully autonomous, cost-efficient, and reliable web scraper :)
Finally AI will automate my job!
@neelrawat51 Hey, how good was the accuracy of the data?
It seems to me that everyone has needed a tool like this for a long time!
It helps save time, especially for those whose work involves a lot of reading and searching for the right information.
Thanks for making my job easier :)