Nanonets OCR

Intelligent text extraction using OCR and deep learning

#5 Product of the Week, August 16, 2019
Transform unstructured, human-readable text into structured, validated data, using OCR and deep learning to extract the relevant information and key fields. Digitize everything from documents and PDFs to number plates and utility meters.
Discussion
6 Reviews, 4.2/5
Hello fellow hunters,

Thank you for stopping by to have a look at Nanonets' OCR product. I'm one of the co-founders of Nanonets, and I would like to give a quick overview of it.

We set out to simplify integrating OCR into your product, especially for automating manual data entry and validation in your pipelines. With this integration, users can easily build production-ready OCR models. For a little background, Nanonets is a machine learning API that lets developers integrate cutting-edge ML into their products.

Here is a quick walkthrough of this feature:

1. Assume you generate a large number of invoices every day, and you have an entire team dedicated to digitizing these images and extracting key fields from them.
2. With Nanonets, you can upload these images and teach your model what to look for. For example, for invoices you can build a model to extract the product names and prices.
3. Once your annotations are done and your model is built, integrating it is as easy as copying 2 lines of code :)

I would urge you to take a look at the product webpage. We have built the product with a lot of passion and would love to have your feedback on it. Happy to answer any questions.

Prathamesh
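To make the invoice walkthrough concrete, here is a minimal sketch of what the "structured and validated" end of such a pipeline might look like once a model has returned its predictions. Everything in it is an assumption for illustration: the prediction format (dicts with `label`, `text`, and `confidence` keys), the label names `product_name` and `price`, and the helper names `parse_price` and `extract_fields` are hypothetical and are not the Nanonets response schema or client code.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical model output: one dict per detected field. The keys and
# label names are assumptions for this sketch, not a documented schema.
SAMPLE_PREDICTIONS = [
    {"label": "product_name", "text": "USB-C Cable", "confidence": 0.97},
    {"label": "price", "text": "$12.50", "confidence": 0.93},
    {"label": "product_name", "text": " Laptop Stand ", "confidence": 0.91},
    {"label": "price", "text": "1,299.00", "confidence": 0.88},
    {"label": "price", "text": "N/A", "confidence": 0.35},
]


def parse_price(text):
    """Return the price as a finite Decimal, or None if it cannot be parsed."""
    # Strip a leading currency symbol and thousands separators.
    cleaned = text.strip().lstrip("$€£").replace(",", "")
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Decimal accepts strings such as "NaN" and "Infinity"; reject those.
    return value if value.is_finite() else None


def extract_fields(predictions, min_confidence=0.5):
    """Group predictions by label and set aside anything that fails validation."""
    names, prices, rejected = [], [], []
    for pred in predictions:
        if pred["confidence"] < min_confidence:
            rejected.append(pred)  # too uncertain; leave for manual review
        elif pred["label"] == "product_name":
            names.append(pred["text"].strip())
        elif pred["label"] == "price":
            value = parse_price(pred["text"])
            if value is None:
                rejected.append(pred)  # recognized as a price but not numeric
            else:
                prices.append(value)
        # Predictions with any other label are ignored in this sketch.
    return {"product_names": names, "prices": prices, "rejected": rejected}


result = extract_fields(SAMPLE_PREDICTIONS)
```

The confidence threshold and the rejected list are one possible way to keep a human in the loop for uncertain fields; the right cutoff would depend on your own data.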
Looks promising!! Are these ready-to-use APIs, or do you always use custom models?
@anup_surana Thanks! Currently you build your own custom models with a small sample of your own data. We've seen that one-size-fits-all models don't work out too well.
Isn't this template specific again? Or have you generalised it?
@yash_agarwal8 Hey, it isn't template specific. So if you have, say, 50 different document types containing similar data, we're able to pull that data out for you. Hope this helps.
Awesome! Does it work only with specific file formats, or with any image?
@pramod_kk It works for most image types. For a few document digitization customers, we have processed PDFs as well. Are you looking for support for a specific file format?
@pjuvatkar Looking at making the OCR model, it doesn't look like PDFs are supported?
Does this also work for handwritten documents?
@shikhar_khanna2 Hey Shikhar, that's a great question. Given enough examples, we're definitely able to make it work on handwritten text.