Monkt

Transform files and web pages into AI-ready Markdown or JSON

356 followers

Monkt converts PDFs, Word files, Excel sheets, PowerPoint presentations and web pages into structured Markdown or JSON while preserving semantic structure. Apply custom schemas, process in batches, and use predefined templates through the REST API or the web interface.

Simeon Emanuilov
Hey Product Hunt community, I built Monkt to solve a recurring challenge in ML pipelines - converting various document formats into structured data while preserving their semantic structure. After building custom document processing solutions for different projects, I decided to package these patterns into a cloud service. The key features emerged from practical needs: ✔ Convert files/URLs to Markdown or JSON; ✔ Apply custom JSON schemas for validation; ✔ Process documents in batches; ✔ Use predefined templates for common patterns; ✔ Simple REST API integration. Would love your feedback on the approach and use cases you see for structured document processing in ML workflows. Have a great year!
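As a rough illustration of the "custom JSON schemas for validation" feature, here is a minimal sketch of checking a converted record against an expected shape. It does not use Monkt's API or its schema format, neither of which is described in this thread; `INVOICE_SCHEMA`, `validate`, and the field names are hypothetical, and the check is a simplified stand-in for full JSON Schema validation.

```python
import json

# Hypothetical schema: field name -> expected Python type.
# This is NOT Monkt's schema format, only a stand-in for a
# JSON Schema "required" + "type" check.
INVOICE_SCHEMA = {"title": str, "total": float, "line_items": list}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


# Assumed example of JSON produced by a document conversion step.
converted = json.loads('{"title": "Invoice 17", "total": 99.5, "line_items": []}')
print(validate(converted, INVOICE_SCHEMA))
```

Running a check like this right after conversion lets a pipeline reject or retry malformed output before it reaches downstream ML steps.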
Simeon Emanuilov
Hey everyone, I appreciate your support; the feedback I've received so far has been amazing. You made my day on January 1st! :D I’ve added some extra capacity since we experienced some spikes in traffic over the last hour. Everything is running smoothly now. I can offer a lifetime discount for anyone who has experienced technical difficulties! Just send me a screenshot through my contact channels. Thank you once again!
Simeon Emanuilov
@julia_zakharova2 Hi Julia, Thanks a lot for the nice words, appreciated!
André J
If I may ask: what does it mean to be "AI-ready" in this context?
Leo
@sentry_co I guess the author means that the transformed files can be understood directly by AI systems (such as ChatGPT).
Simeon Emanuilov
@sentry_co Hi André, Thank you for the question. By "AI-ready" I mean that the files are in a format suitable for prompting large language models or for creating knowledge bases that can be utilized by various AI tools. Additionally, I want to "send a message" that this tool is primarily designed for AI practitioners, although it can also be used for other purposes.
Kunal Kumar
Hi @simeon_emanuilov, your product is a pretty good fit for what I have been looking for. Do you have any plans to launch on AppSumo or to offer a lifetime deal directly?
Simeon Emanuilov
@kkofficial Hello Kunal, Yes, but in a few months. I appreciate your interest.
Richard Song
Impressive work, @simeon_emanuilov! Monkt's ability to handle different document formats and convert them into AI-ready formats is very innovative. The option to use predefined templates for common patterns is a huge plus. I'm curious, how does Monkt ensure data privacy and security during the conversion process?
Simeon Emanuilov
@renchu_song Hey Richard, Thanks for the nice words; I appreciate them! Regarding your question, the conversion to Markdown happens purely on our server. For JSON schemas, we use an LLM with a large context window. We do not keep the original files, and we do not train on them. Storage of the derived files is encrypted. I hope that answers your question.
Leo
I just tried this tool, and it does not yet handle files with charts very well. I also guess you use an LLM to transform the files; that approach has some problems, such as content length limits.
Simeon Emanuilov
@tibelf Hi Leo, Can you share some examples of problematic cases with me? I would take a look. You can reach me via X, LinkedIn, or support channels. In short: conversion to Markdown should work without any trouble. Transforming to JSON could be further improved in my opinion, but for relatively short docs it should work great. Predefined prompts could help with more complex operations, like summarization, translation, etc. Thanks.
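One common way to work around the content length limits raised above is to split the Markdown output at headings and pack the sections into chunks that fit a size budget. The sketch below is an illustration of that general technique, not a description of how Monkt works; `split_by_headings`, `pack_sections`, and the character budget are hypothetical names and values.

```python
import re


def split_by_headings(markdown: str) -> list[str]:
    """Split Markdown into sections, each starting at a heading line."""
    # Zero-width split before any line that starts with 1-6 '#' and a space.
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [part for part in parts if part.strip()]


def pack_sections(sections: list[str], max_chars: int) -> list[str]:
    """Greedily merge consecutive sections into chunks of at most max_chars.

    A single section longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for section in sections:
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        current += section
    if current:
        chunks.append(current)
    return chunks


doc = "# A\nalpha\n## B\nbeta\n# C\ngamma\n"
chunks = pack_sections(split_by_headings(doc), max_chars=20)
for chunk in chunks:
    print(repr(chunk))
```

Splitting at headings rather than at fixed offsets keeps each chunk semantically coherent, which is the main reason structure-preserving Markdown is convenient as LLM input.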
Gaurav Singhal
Looks great. I am more interested in seeing how exactly you are doing it. There are many open source repos that do this, including ones from big companies like Facebook and Microsoft. Do you have a Hacker News discussion for this, or a technical discussion somewhere?
Simeon Emanuilov
@krazygaurav93 Hey Gaurav, Thank you for the nice words. I have published this article with more information: https://medium.com/@simeon.emanu... Will record a few videos in the next two days. I think the API can bring a lot of value, so I will first make a video overview of this part.
Gaurav Singhal
@simeon_emanuilov I see you showed "MarkItDown". It's good, but as you correctly pointed out, it's not scalable. Also, using an LLM to parse documents is just nonsense: it becomes expensive, slow, and non-scalable. However, you have not disclosed your solution there, and I understand why. In case you want to brainstorm on the solution for further improvement, let me know. This is a very common scientific problem you are trying to solve. I personally like "OpenParse", but it has limitations. I then started to use "Marker", which is really good but is commercial and open for research use only. I think OCR is the right way to go; I am thinking of training my own model on a huge set of PDFs to solve problems like headings, two-sided PDFs, multi-page tables, etc. Best of luck.
Jorge Alcántara
Good use case, seeing a lot of interest in Firecrawl and multiOn’s features related to this; are there any particular differences to those services you’d like to make clear for the readers?
Simeon Emanuilov
@jalcantara Thanks Jorge. The main difference is our focus on ML/AI document processing rather than general web crawling. We preserve complete document structure and semantic relationships for Markdown exports, support custom JSON schemas, and optimize output for ML pipelines.