Instaparser

Cleanly pull content from any website

14 followers

Launch tags: Web App • API • Tech

Dabble me

@bthdonohue I'm curious about the reason to build in-house vs. use something like Embedly for the video section. This looks like a direct competitor to them (at a 2.5x price increase per call). What are the main reasons someone would use Instaparser over Embedly? I have no official association with Embedly, though I'm a happy user of both services.

10yr ago

Publish Current Page to Medium

Maker

@parterburn hi Paul! We did outsource some of our parsing for a while to Diffbot (https://www.diffbot.com), but we ran into a lot of issues using their service: everything from inaccurate parses, to returned elements that were incompatible with Instapaper, to slow speeds. When we rewrote Instapaper's original parser for a newer, more modern web and replaced all of our parsing with the new parser, we saw a 10x drop in parsing time, among other benefits like increased accuracy and better integration with Instapaper (e.g. inline video support). I'm not sure how you're figuring the 2.5x price increase per call for Embedly. We're pretty competitive in pricing, although slightly more expensive on the lower end and slightly less expensive on the upper end: https://s3-us-west-2.amazonaws.c... Thanks for your questions!

10yr ago

Dabble me

@bthdonohue Great explanation. I was only looking at the lowest tier, and I applaud you for "choosing your customers" via price. It's not something every company is comfortable doing. @Shpigford just published a great article on this.

10yr ago

@bthdonohue this is really awesome! Thanks for building this. Now that you no longer use Diffbot (and they have a competing product) you should probably request that they remove Instapaper from their website.

10yr ago

Publish Current Page to Medium

Maker

Brian from Instapaper here! Over the past few years we've gotten a significant number of requests from developers to have access to Instapaper's parser. Yesterday we launched Instaparser, an API to access Instapaper's parser. Instaparser is a paid service, but there's a free tier under https://www.instaparser.com/sign... that can be used for testing or just quick weekend hacks. Personally, this is the first developer-focused product I've launched, and I'm very excited to get it out into the community and see what people will do with it.

10yr ago

aiden.ai

@bthdonohue This looks very interesting. I am not trying to be negative here, but I am just curious (as a potential customer): how do you guys compare to open source (and frankly: popular) solutions such as Newspaper? https://github.com/codelucas/new...

10yr ago

Publish Current Page to Medium

Maker

@cam_pj Hi PJ! I'm unfamiliar with Newspaper, so I just took a look through the source code to get a feel for how they're doing the article parsing. It looks like a great tool for an open source parsing framework, and also appears to be at least somewhat influenced by the Readability parser (similar paragraph scoring, checking sibling nodes, etc). I think the major difference here is that, in order to have a large coverage for as many domains as possible, you need to implement and maintain a flexible system for domain-by-domain parser configurations. We have a dedicated support/community person that's trained to resolve parsing issues on a domain-by-domain basis when they do come up, and we use a variety of signals in order to make sure the parser is up-to-date. We have signals coming from the "Report a Problem" button in the Instapaper app, scheduled integration tests against our most popular domains, recorded failures from the Instaparser API, and we use a combination of those signals and domain popularity to prioritize fixes in parsing issues both on a proactive and reactive basis. Creating an accurate parser requires constant maintenance from a dedicated team and while I'm sure there are open source projects out there that will come up with 65%-75% accuracy, getting to 90%+ accuracy is the really tricky bit. Hope that's helpful!
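The prioritization described in the comment above (combining failure signals with domain popularity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `DomainSignals` fields, the weights, and the scoring formula are hypothetical and are not Instapaper's or Instaparser's actual system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomainSignals:
    """Hypothetical per-domain signals, loosely following the comment above."""
    domain: str
    user_reports: int    # e.g. taps on a "Report a Problem" button
    test_failures: int   # failed scheduled integration tests
    api_failures: int    # parse failures recorded through the API
    popularity: float    # relative share of traffic, between 0 and 1


def priority(s: DomainSignals) -> float:
    # Assumed weighting: a failed integration test counts double,
    # and the total is scaled by how popular the domain is.
    failure_score = s.user_reports + 2 * s.test_failures + s.api_failures
    return failure_score * s.popularity


def rank_domains(signals: List[DomainSignals]) -> List[DomainSignals]:
    # Highest priority first, so the most impactful fixes come first.
    return sorted(signals, key=priority, reverse=True)


if __name__ == "__main__":
    queue = rank_domains([
        DomainSignals("a.example", user_reports=10, test_failures=0, api_failures=5, popularity=0.1),
        DomainSignals("b.example", user_reports=2, test_failures=1, api_failures=0, popularity=0.5),
        DomainSignals("c.example", user_reports=0, test_failures=0, api_failures=0, popularity=0.9),
    ])
    for s in queue:
        print(s.domain, round(priority(s), 2))
```

In this sketch a popular domain with no recorded failures still sorts last, while a moderately popular domain with a few failures can outrank a rarely visited one with many; a real system would likely tune the weights against observed data.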

10yr ago

aiden.ai

@bthdonohue Understood. It makes sense. Like you said - the last 20% are always tricky with data extraction. Thanks for clarifying this.

10yr ago

Rxbot

Great news! I'm a huge fan of Instapaper. So, I'm very excited to see more products based on your Instaparser. Special thanks for a free tier :)

10yr ago

Publish Current Page to Medium

Maker

@suholet Thanks Dmitry! I was really impressed with the Yandex browser when it came out in 2014. I haven't used it much since, but I loved the innovations in the browser interface. Nice to have some mutual admiration! :-)

10yr ago

Rxbot

@bthdonohue Brian, I'm really impressed that you've heard of our browser ) How did you find it?

10yr ago

Publish Current Page to Medium

Maker

@suholet I think it was this article from TNW in late 2014: http://thenextweb.com/apps/2014/...

10yr ago

Rxbot

@bthdonohue haha thanks for the link :) Meh... Russians suck at promoting their products :)

10yr ago

Happy to see the positive responses, perhaps mainly because the people here are devs and content creators for the most part. I've been a subscriber for years, so the announcement made me feel better about people using the service for free and taking up resources that could be put into the paid customers' experience. Thank you. Edit: by the way, this is just a supplemental service, not a restructuring of the old business model, correct? As in, will I have to transition my subscription to this plan?

10yr ago

Publish Current Page to Medium

Maker

@kleerkoat hey there – that's right, this is supplemental and not a restructuring. Instaparser is our second paid product. :-)

10yr ago

Digg Deeper

Well done, Brian! It's a really useful service. Parsing is super-valuable for good mobile UX, and Instaparser does it speedily, cheaply, and with good documentation.

10yr ago

Citationsy

How exciting it is to be able to look behind the scenes of a technology that powers the app I use most. And I love the logo! Instantly recognisable as part of the Instapaper brand and as something meant for devs. Very impressive work.

10yr ago

Great tool for people who blog heavily and need to produce a ton of content quickly, like I do. Will definitely give this a shot.

10yr ago
