Zaid Ahmed

caniscrape - Know before you scrape.

Know before you scrape. Analyze any website's anti-bot protections in seconds. Save hours of trial and error with ML-powered detection.

Zaid Ahmed

While trying to build a comprehensive database of tech parts, I ended up scraping Newegg. I spent about an hour building the scraper and finished around 1 AM. Because it used Playwright, the run was going to take a while, so I decided to let it go overnight. Little did I know that the 200 status codes I was seeing in the console were actually Cloudflare challenge pages, and Newegg had me stuck in an infinite pagination loop. I gave up, thinking it was too tough.
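For anyone who wants to guard against the same two failure modes, here is a rough sketch of the kind of check that would have caught them. It is not how caniscrape works internally. The marker strings are assumptions based on commonly seen Cloudflare challenge pages and are not exhaustive, and `fetch_page` is a hypothetical callable standing in for whatever the scraper uses to load a page.

```python
import hashlib

# Assumed, non-exhaustive markers; real challenge pages vary and change over time.
CHALLENGE_MARKERS = ("just a moment", "/cdn-cgi/challenge-platform/", "cf-chl")


def looks_like_challenge(headers, body):
    """Heuristically flag a response that is a challenge page, whatever its status code."""
    normalized = {key.lower(): value.lower() for key, value in headers.items()}
    # Cloudflare can mark challenged responses with this header.
    if normalized.get("cf-mitigated") == "challenge":
        return True
    lowered = body.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


def crawl_pages(fetch_page, max_pages=1000):
    """Collect page bodies until a challenge page, a repeated page, or max_pages.

    fetch_page is a hypothetical callable: page_number -> (headers, body).
    Returns (pages, reason_for_stopping).
    """
    seen = set()
    pages = []
    for page_number in range(1, max_pages + 1):
        headers, body = fetch_page(page_number)
        if looks_like_challenge(headers, body):
            return pages, f"challenge page at page {page_number}"
        # Identical content on a "new" page suggests a pagination loop.
        fingerprint = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if fingerprint in seen:
            return pages, f"repeated content at page {page_number}"
        seen.add(fingerprint)
        pages.append(body)
    return pages, "reached max_pages"
```

With Playwright for Python, the headers and body could come from the response returned by `page.goto(...)` (its `headers` property) and from `page.content()`. The exact-hash comparison only catches byte-identical pages, so a site that changes a timestamp or token on every load would need a looser comparison, such as hashing only the extracted product IDs.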

However, after a lot of research, I now know that anything is possible if you put your mind to it. I created caniscrape so that nobody else has to go through the same experience, and to save people a lot of time and money. My CLI, which I released as an open-source, lightweight version of my website, has already gotten around 6,000 PyPI downloads and around 230 stars on GitHub in about two weeks. After overwhelming feedback asking for a web version, I decided to release this. Not only does it have everything from the CLI, but I have also built my own ML model that can analyze bot protections, specifically the ones that only show up once you start scraping. Hopefully this product will save you the time and money I wish it could have saved me!