Dmitriy Pospelov

98% Accuracy | KnoDL - Turn fractured data into clean knowledge — in real time

Your data is 95% noise. Duplicates, conflicts, fragmented records — and nobody can explain how it got that way. KnoDL is a graph-native reasoning engine that cleans, deduplicates and connects data in real time: 91–98% accuracy, <120ms, no GPU, full explainability. Drop it into your stack via API and finally trust your data.

Dmitriy Pospelov
Hey Product Hunt! 👋 We built KnoDL because we kept running into the same problem across industries: organizations sit on massive datasets, but only 2–5% of that data is ever actually used for decisions. The rest? Fragmented, duplicated, inconsistent, and nobody can explain how it got that way.

The typical fix is throwing more ETL at it, or switching to a vector DB and hoping embeddings will sort things out. They don't. We've seen 51% duplicate rates in production telecom catalogs. We've seen government registries where 8.3 million citizens had identification mismatches across agencies. We've even found that the Arabic subset of OpenAI's MMMLU benchmark, a dataset used to evaluate AI models globally, contains 380 duplicate question groups, including 23 cases where identical questions have conflicting correct answers. We found those in minutes.

The root issue is architectural. Most tools are built for storage or indexing, not for reasoning about meaning. KnoDL is different: it applies deterministic symbolic logic, the way a careful analyst would, but at 2 million events per second.

What we're most proud of: it's a "white box." Every match has a traceable path. In regulated industries (banking, healthcare, government) that's not optional.

We'd love to hear from data engineers, ML teams, and anyone dealing with messy real-world data. What's the worst data quality problem you've hit? Drop it in the comments and we'll tell you honestly whether KnoDL would help.

→ knodl.tech | info@knodl.tech
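To make the benchmark finding above concrete (duplicate question groups, some with conflicting answers), here is a minimal sketch of that kind of check. It is not KnoDL's method: it only catches copies that become identical after simple text normalization. The `question` and `answer` field names and the toy rows are assumptions for illustration, not taken from MMMLU.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Fold Unicode form, case, and whitespace so trivially different copies match."""
    text = unicodedata.normalize("NFKC", text).casefold()
    return re.sub(r"\s+", " ", text).strip()


def find_duplicate_groups(rows):
    """Group rows by normalized question text.

    Returns (duplicates, conflicts): groups with more than one row, and the
    subset of those groups whose rows disagree on the answer.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[normalize(row["question"])].append(row)
    duplicates = {k: v for k, v in groups.items() if len(v) > 1}
    conflicts = {
        k: v for k, v in duplicates.items()
        if len({r["answer"] for r in v}) > 1
    }
    return duplicates, conflicts


# Hypothetical toy rows (assumed schema), not real benchmark data.
rows = [
    {"question": "What is 2 + 2?", "answer": "B"},
    {"question": "what is  2 + 2?", "answer": "C"},
    {"question": "Capital of France?", "answer": "A"},
    {"question": "Capital of France? ", "answer": "A"},
    {"question": "Largest planet?", "answer": "D"},
]
duplicates, conflicts = find_duplicate_groups(rows)
print(len(duplicates), len(conflicts))  # prints: 2 1
```

Exact matching after normalization misses paraphrases and near-duplicates, so treat the counts from a check like this as a lower bound.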
Dmitriy Pospelov

🐳 Try it yourself — one command

Want to see KnoDL deduplication on your own data without signing up for anything?

We have a free Docker image with a full CLI:


docker run --rm -ti -v "$(pwd)":/opt/data knodlang/kdlfree:kdl

Drop in your CSV → run s2_dedup.sh → get clean data with similarity scores. It works offline, so your data never leaves your machine.

👉 https://hub.docker.com/r/knodlang/kdlfree

This is the same core algorithm as the enterprise version, just without streaming, knowledge graph output, and the API. It's a good way to feel the accuracy difference vs. what you're using today.
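If "what you're using today" is nothing at all, a rough baseline to compare against might look like the sketch below. It uses plain string similarity from Python's standard library and is not KnoDL's algorithm. The `similar_pairs` helper, the 0.85 threshold, and the sample records are assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(records, threshold=0.85):
    """Return (i, j, score) for every pair of records at or above the threshold.

    Compares all pairs, so cost grows quadratically with the number of records.
    """
    # Light cleanup: fold case and collapse runs of whitespace.
    cleaned = [" ".join(r.casefold().split()) for r in records]
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(cleaned), 2):
        score = SequenceMatcher(None, a, b).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 3)))
    return pairs


# Hypothetical sample records, for illustration only.
records = [
    "Acme Corp, 12 Main St",
    "ACME Corp., 12 Main St",
    "Globex Inc, 9 Elm Rd",
]
pairs = similar_pairs(records)
for i, j, score in pairs:
    print(f"{records[i]!r} ~ {records[j]!r} (score {score})")
```

A character-level baseline like this has no notion of meaning, which makes it a useful floor when judging any dedup tool on the same CSV.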