
HunyuanOCR
Lightweight end-to-end OCR VLM for 100+ languages
2 followers
HunyuanOCR is a 1B-parameter multimodal VLM delivering SOTA OCR across detection, recognition, complex multilingual document parsing, open-field information extraction, video subtitle extraction, photo translation, and document QA. It runs end to end in a single inference pass and supports 100+ languages.




Mom Clock
I tested the HunyuanOCR demo, and it handled noisy video frames impressively. It could be great for devs building transcription, localization, or archive tools.