Hello everyone, I am the product owner at Proxycurl, and I am here to share our new LinkedIn API.
We love (big) data and crawlers at our company, and many times we have tried building products on top of the big data we have crawled. But one of the hardest things to crawl is LinkedIn. While we have the technology and the crawling network to do it, most of our customers really wanted a turnkey solution at a price that scales.
Eventually, after losing many inbound leads because developers could not work with our rudimentary crawling API, we decided to overhaul the API and build in a dedicated LinkedIn endpoint.
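To give a feel for the turnkey flow we are aiming for, here is a minimal sketch of pulling a structured profile through the endpoint. The URL, header, parameter, and response field names below are illustrative placeholders, not the documented interface:

```python
import requests

API_KEY = "your_api_key"  # placeholder credential

# Illustrative request: the endpoint URL and parameter names stand in
# for the dedicated LinkedIn endpoint; check the docs for the real shape.
response = requests.get(
    "https://nubela.co/proxycurl/api/v2/linkedin",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"url": "https://www.linkedin.com/in/some-profile/"},
    timeout=30,
)
response.raise_for_status()

profile = response.json()  # structured profile data, already parsed
print(profile.get("full_name"), "-", profile.get("headline"))
```

The point is that the caller never touches raw HTML or a headless browser; the response is plain JSON.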
The hard thing about LinkedIn profiles is that there is very little semantic HTML markup to parse. Or maybe it is intended that way, to deter scrapers.
Most of the content is shipped as data blobs and rendered later by their frontend JavaScript code, so a plain HTML parser recovers almost nothing (a sketch of the extraction this involves is below). This is a cat-and-mouse game between developers and LinkedIn. Instead of fighting them one developer at a time, we think a managed service for LinkedIn profile crawling might be useful. Your thoughts?
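For the curious, here is roughly what the extraction side of that game looks like. This sketch rests on one assumption, that the profile data is embedded as HTML-escaped JSON inside <code> blocks in the page source; LinkedIn changes this layout regularly, which is exactly the maintenance burden a managed service absorbs:

```python
import html
import json
import re

def extract_embedded_json(page_html: str) -> list[dict]:
    """Pull candidate JSON data blobs out of a raw profile page.

    Assumes profile data ships as HTML-escaped JSON inside <code>
    blocks, one pattern we have observed; when LinkedIn moves the
    data, this regex and the downstream parsing both break.
    """
    blobs = []
    for match in re.finditer(r"<code[^>]*>(.*?)</code>", page_html, re.DOTALL):
        try:
            blobs.append(json.loads(html.unescape(match.group(1))))
        except json.JSONDecodeError:
            continue  # many <code> blocks are not JSON; skip them
    return blobs
```

Every scraper ends up maintaining some version of this, and every layout change breaks it.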