Launching today

Extend
Parse any PDF layout with SOTA accuracy for AI pipelines
63 followers
Parse, extract, and split your hardest documents with unmatched accuracy. Read any layout with specialized vision models, and ship reliable pipelines in minutes, not months.


Kilo Code
"Over 1 billion PDFs are created every day, and your agents still can't read them reliably."
@Extend announced Parse 2.0, their new document parsing API.
Founder and CEO @kbyatnal on X:
The real unlock here isn’t OCR accuracy; it’s preserving semantic reading order under structural ambiguity.
Most pipelines break not on extraction but on downstream assumptions about hierarchy (especially in tables and forms, where “correct text” ≠ “correct meaning flow”).
Curious: how do you handle evaluation when the ground-truth layout interpretation is subjective (e.g. multi-table docs or mixed narrative/forms)?
Tech Marketing Framework
@new_user___1452026946a93788355af99 the challenge with multi-table and mixed narrative docs comes down to reading order. Irregular forms mean you sometimes have to read a whole column before moving to the next one, rather than going prescriptively left to right and top to bottom. For reading order, ground truth is how a human would read the doc to extract meaning.
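To make the reading-order point concrete, here is a minimal Python sketch. It is not Extend's model or API. It is a toy illustration that assumes each text block has a known position, that columns can be separated with a simple horizontal gap threshold, and that a human-annotated order is available as ground truth. The names `Block`, `naive_order`, `column_first_order`, `pairwise_agreement`, and `column_gap` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Block:
    text: str
    x: float  # left edge of the block
    y: float  # top edge of the block (smaller means higher on the page)


def naive_order(blocks):
    """Read strictly top to bottom, then left to right."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]


def column_first_order(blocks, column_gap=50.0):
    """Group blocks into columns by x position, then read each column top to bottom.

    column_gap is an assumed threshold: a block starts a new column when its
    left edge is more than column_gap away from the previous block's left edge.
    """
    columns = []
    for b in sorted(blocks, key=lambda b: b.x):
        if columns and b.x - columns[-1][-1].x <= column_gap:
            columns[-1].append(b)
        else:
            columns.append([b])
    return [b.text for col in columns for b in sorted(col, key=lambda b: b.y)]


def pairwise_agreement(predicted, human):
    """Fraction of block pairs whose relative order matches the human order."""
    rank = {text: i for i, text in enumerate(predicted)}
    pairs = list(combinations(human, 2))
    agree = sum(rank[a] < rank[b] for a, b in pairs)
    return agree / len(pairs)


# A two-column page: column A on the left, column B on the right.
blocks = [
    Block("A1", 0, 0), Block("B1", 300, 0),
    Block("A2", 0, 100), Block("B2", 300, 100),
]
human = ["A1", "A2", "B1", "B2"]  # how a person would read this page

print(naive_order(blocks))         # interleaves the two columns
print(column_first_order(blocks))  # finishes column A before column B
print(pairwise_agreement(naive_order(blocks), human))
print(pairwise_agreement(column_first_order(blocks), human))
```

On this toy page the naive sort interleaves the two columns, while the column-first pass reproduces the human order. Real documents need a learned layout model rather than a fixed gap threshold, but scoring a predicted order against a human-annotated one works the same way.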
How do your specialized vision models handle multi-column layouts, mixed tables, or low-quality scanned PDFs compared to standard LLMs?
Tech Marketing Framework
hey @ingvar_borzov great question! Standard LLMs are general-purpose and can be quite costly, with high latency, for doc parsing, especially on docs with the complex components you listed. You also get a lot less config control, and relying on prompt engineering is brittle. Our VLMs are fine-tuned to handle specific layout components like tables, forms, handwriting, barcodes, etc. And we layer on an optional agentic OCR loop for especially challenging edge cases.
Here's a benchmark if you're interested in objective measures of performance! https://www.extend.ai/resources/realdocbench
Tech Marketing Framework
Hi everyone! If anyone tells you that PDFs are solved, they probably haven't worked with the PDFs our customers see in production. We're talking bills of lading in shipping and logistics, clinical reports, IRS forms, etc.
Parse 2.0 lets your agents actually work with reliable inputs, no matter how hard the documents are. This allows you to build:
RAG systems that accurately answer questions with precise citation sourcing
Automated workflows that accelerate document processing
Agents that take action on documents (e.g. routing, classification, extraction, etc.)
Parse 2.0 is a SOTA, layout-first document parsing API for agents that need reliable inputs. It features:
A completely rebuilt layout model trained on 1M+ of the hardest docs
New specialized OCR and VLM downstream models to handle specific doc components (e.g. forms, tables, handwriting, etc.)
New reading order model to preserve semantic meaning (not every doc should be read left to right, top to bottom)
If you need accurate PDF parsing, check it out and let us know what you think!