@datarade Thanks for posting - educational!
@datarade Kimono is closing down
@datarade Sounds like you should make a collection of scrapers.
@datarade Wow this is an amazing list! Thanks!
I've worked for Scrapinghub for the past two years, and I'm happy to answer any questions about Portia, its big brother Scrapy, or any part of our platform... or even web scraping in general! We and our users are currently crawling 3.5 billion pages per month, or around 80,000 pages per minute. So we know a little bit about scraping. :)
@jpmillions We're open source guys (and gals)… so we're definitely saddened to see users of a closed platform treated this way. OTOH, we've seen a lot of customers coming to try Portia from Kimono. :) We're actively working on ways to help people port their Kimono crawlers, so we're keen to hear from anyone in this boat! Email me directly (gabriel@scrapinghub.com) or sign up to the mailing list at the bottom of this post (https://goo.gl/CGxsFl). Both Portia and Scrapy are fully open source, and any crawlers (created or running) on our platform are fully exportable and interoperable with open technologies. We are focused on the long term, so we doubt our platform will be shut down any time soon, but if that ever happened, all of our users would be able to export their crawlers and run them on their own infrastructure. We've done this ourselves for some of our Professional Services clients who want us to build scrapers but prefer to run things on their own infrastructure.
@gpuliatti I tried scraping a list of 27,000 URLs... the browser crashed. Is there an easier way of adding URLs to the scrape?
The ScrapingHub crew have been doing this a long time, and they deliver good service. Since the untimely demise/acquihire of KimonoLabs, I'll be giving this a try.
Looks cool. I am curious to learn what the top use cases for Portia are. I understand that people scrape, but what interesting things do they do with the data?
@tribaling A few use cases from our past client projects:
- Scrape eCommerce sites that sell your products, to check for price violations and review data.
- Build a broad crawler covering thousands of sites to automatically discover contact and profile information for a specific industry.
- Parse all shop locations for a number of big brands to provide a locator for users looking for a specific type of shop.
- Build a database of interesting candidates to hire, by matching various sources of internet profiles against a series of filters that you or the HR team are interested in.

I know people building boutique businesses on basic web scraping… like someone who uses our platform to offer a service that lets people monitor Amazon Kindle book pricing and get alerted when the price drops or the book goes on sale. In effect, bringing Amazon's data "back to the people" to allow them to make better choices. But of course, most of the $$$ value comes from being a Fortune 500 company and being able to understand a lot more about the world, your industry and your competition. We help companies both large and small increase their reach and get access to the best technology. :)
Cool and freaky logo!