Featured

Crawly

Never write another web scraper

Featured comment

Dru Wynings@druwynings · Head of Business Development, Diffbot
Hey ProductHunters! Crawly is a free tool I built that uses Diffbot's automatic article extraction API to turn web content into structured data. I've used it for creating a centralized database of all of our content, but you could also use it to do content audits / migrations or analyze your competitors' content. It's currently limited to 200 pages and only articles at the moment, but I plan to add support for scraping products in the future. Any other features you'd like to see added?
Luka@prokotasic · @lukaivicev - Co-Founder at Penta
@druwynings images?
Erik Dungan@callmeed · Engineer @ Fame
@druwynings You should really update that page to clarify the "articles only" caveat. Especially when the tagline on your home page is "No rules required"
Dru Wynings@druwynings · Head of Business Development, Diffbot
@callmeed Will do! Like I mentioned, support for products, discussions, images, and videos is in the works.
Discussion
Blaine Hatab@blainehatab · Co-founder, Open Minded Innovations
@datarade god mode scraping collection.
Cesare D. Forelli@cdf1982 · Indie developer
@datarade wow, thanks!
Robin Wouters@robinwo · Robin
@datarade This list should be a collection!
Nick Kwan@nwkwan · Head of Growth, Pakible (YC W15)
@datarade Great list! Do any of these have @Kimonify-like capabilities to generate APIs? @skrypt
Dru Wynings@druwynings · Maker · Head of Business Development, Diffbot
@nwkwan diffbot does =)
Chris Chinchilla@chrischinch
@datarade +1 to Diffbot
Chuck Pos@raritan
@datarade wow, sick collection
Matt Gardner@thatmattgardner · Co-Founder, RouteThis.com
Awesome!!! Looking for a replacement for Kimono (https://www.kimonolabs.com/) since they got acquired by Palantir. Need something to power my Slack menu bot ;)
Eric Iannaccone@iannaccone15 · Software Engineer, Priceline.com
I would love to be able to scrape sports stats easily!
Neil Cocker@neilcocker · Founder - Ramptshirts.com
Looks great. Congrats on the launch! I can see some nice applications for this. One suggestion - I understand that it might take a while to scrape the data, but an instant email to say it will be X minutes, or just a notification after email input would be good, to manage expectations. I used it 10 mins ago, and am on the verge of tears that I still have no email... ;-)
Neil Cocker@neilcocker · Founder - Ramptshirts.com
Cannot GET /results/56ebdf254b0bfe03003ef0d8 :-(
Dru Wynings@druwynings · Maker · Head of Business Development, Diffbot
@neilcocker Hey Neil, things should be back to normal. Servers were crumbling under the PH load :)
Dru Wynings@druwynings · Maker · Head of Business Development, Diffbot
@neilcocker I didn't want to inundate people with unnecessary emails, but I don't want people crying either...
Neil Cocker@neilcocker · Founder - Ramptshirts.com
@druwynings Tears are over. All working now. Very impressive. Good work - this will definitely be very useful.
Dru Wynings@druwynings · Maker · Head of Business Development, Diffbot
@neilcocker