ScrapeOwl is a simple and powerful web scraping API that manages proxies, headless browsers, and HTML parsing. We develop the best tooling for you to get what you need by removing the complexity. Simply specify the website and the element you want.
Hi Folks 🙌
Thank you @musharofchy for hunting 🙏
I am super excited to share our first ever product with you all 🎉🥳😍
Web scraping is difficult and scaling a scraper is hard, so we are excited to announce https://scrapeowl.com
Simple and powerful scraping API with:
JavaScript Rendering
Proxies
Headless Chrome
Extracting data from the page
ScrapeOwl turns a web page into formatted JSON with its powerful HTML parser, which supports both CSS selectors and XPath.
The goal is to make scraping as easy as using a simple API.
Thanks,
Athar
😍 Anything for Product Hunters?
We are offering a 30% recurring lifetime discount on any plan. Apply coupon code PHSPECIAL during checkout.
Congrats on the launch.
Heads up that the "Try the Extraction API" demo does not seem to work. I added https://aboutsnack.com, it loads for a second, and nothing happens next. I tried a few other URLs, so it does not seem to be URL-specific.
@kuizinas Thanks for trying it out, that's rather strange — I just tested it with a few URLs and it works fine. Here is the example response from the URL you mentioned.
The demo box on the homepage can take up to 1 minute to show the details, depending on how long it takes to extract them from the page.
Let me know if you need to test the demo with actual elements.
Thanks
@nizamaniathar Oh I see what's going on. I simply typed "aboutsnack.com". I looked at the API response and that fails with "The url format is invalid."
@kuizinas Thanks for pointing that out. It is a minor oversight on my part in the site's demo. I'm fixing it right now so that it always asks for a URL with the prefix.
The validator requires the URL to begin with either "http://" or "https://"
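For anyone calling the API from their own code, the same check is easy to do client-side before sending a request. This is a minimal sketch using only the Python standard library; `has_http_scheme` and `normalize_url` are hypothetical helper names, and defaulting a bare domain to "https" is an assumption, not something ScrapeOwl does.

```python
from urllib.parse import urlparse


def has_http_scheme(url):
    """Return True if url begins with http:// or https:// and names a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def normalize_url(url, default_scheme="https"):
    """Prepend a scheme when the user typed a bare domain like 'example.com'.

    Assumption: bare domains default to https. URLs that carry some other
    scheme (ftp://, file://, ...) are rejected rather than rewritten.
    """
    url = url.strip()
    if has_http_scheme(url):
        return url
    if "://" in url:
        raise ValueError(f"unsupported URL scheme: {url!r}")
    return f"{default_scheme}://{url}"
```

With this, `normalize_url("aboutsnack.com")` gives `"https://aboutsnack.com"`, while a URL that already has the prefix is returned unchanged.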
Thanks!
@georgewillaman All of the solutions that you mentioned are simple APIs that return raw HTML.
We return parsed data in JSON format organized by element (h1, p, etc.). We also return the HTML if a user wants to parse it on their own.
@georgewillaman @nizamaniathar Hi Athar, I will take a look at ScrapeOwl. I am not sure returning the data as JSON like that is useful, since someone then needs to build a page parser over JSON when it is much easier to use XPath or CSS on the HTML. I am interested to know the use case for this and how it would be more useful than plain HTML.
@georgewillaman @pgordon Hi Paul,
ScrapeOwl does not return JSON by default. You send the selectors (CSS or XPath) along with the request; it scrapes the page, extracts the elements that match those selectors, and returns them as JSON, so you do not need to parse the HTML yourself.
But if you want to do the parsing yourself, you can leave the selectors empty and it will return the full HTML of the page.
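To make the idea concrete, here is a small local illustration of "selectors in, JSON out". It is not ScrapeOwl's API or implementation: it is a sketch built on Python's standard `xml.etree.ElementTree`, which only accepts well-formed markup and supports a limited XPath subset. The `extract` function and the selector names `title` and `paragraphs` are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def extract(markup, selectors):
    """Return matched text as a JSON string, or the raw markup if no selectors.

    selectors maps a caller-chosen name to an XPath expression
    (ElementTree's XPath subset only).
    """
    if not selectors:
        # No selectors given: hand back the page as-is for the caller to parse.
        return markup
    root = ET.fromstring(markup)
    result = {}
    for name, xpath in selectors.items():
        # Collect the full text content of every element the XPath matches.
        result[name] = ["".join(el.itertext()).strip() for el in root.findall(xpath)]
    return json.dumps(result)


page = "<html><body><h1>Snack</h1><p>First</p><p>Second</p></body></html>"
output = extract(page, {"title": ".//h1", "paragraphs": ".//p"})
```

Here `output` is a JSON string with one list of matched text per selector name, and `extract(page, {})` returns the original markup untouched, mirroring the two modes described above.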
You can also have a look at our docs to better understand this https://scrapeowl.com/docs
I hope that helps.
You can reach out to me via live chat on our website if you need help setting up a demo.
Thanks
@kylegawley If setting cookies is enough to establish the login session, then yes: you can send the cookies you get after logging in, and the site should treat the request as logged in.
But right now our API does not provide any specific parameters to log in first and then scrape the data.
Can you email me your use case at athar@scrapeowl.com?
Thanks