Prem Chaurasiya

28d ago

How are you currently turning websites into RAG-ready data?

While building OpenFetcher, I noticed a common pain point across RAG projects:

Most web crawlers give you raw HTML, broken text, or too much noise, and cleaning that up costs time and tokens.

OpenFetcher approaches crawling differently:

  • Crawls full domains (even when sitemaps are missing or broken)

  • Converts content into clean, structured Markdown

  • Optimizes output for embeddings, agents, and context windows
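
To make the first bullet concrete, here is a minimal sketch of crawling a domain without a sitemap by following same-host links breadth-first. This is not OpenFetcher's actual code; `crawl_domain`, `LinkParser`, the `fetch` callable, and the in-memory `FAKE_SITE` are all hypothetical names used for illustration, and a real crawler would also need robots.txt handling, rate limiting, and retries.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_domain(start_url, fetch, max_pages=100):
    """Breadth-first crawl of one host by following links (no sitemap needed).

    `fetch` is a hypothetical callable: URL -> HTML string, or None on failure.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" so pages are not revisited.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Tiny in-memory "site" so the sketch runs without network access.
FAKE_SITE = {
    "https://example.com/": '<a href="/docs">Docs</a> <a href="https://other.org/">Ext</a>',
    "https://example.com/docs": '<a href="/docs/api#intro">API</a> <a href="/">Home</a>',
    "https://example.com/docs/api": "<p>API reference</p>",
}

pages = crawl_domain("https://example.com/", FAKE_SITE.get)
print(sorted(pages))
```

Passing `fetch` in as a parameter keeps the traversal logic separate from the HTTP layer, so the same loop could be pointed at a real HTTP client later.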

Prem Chaurasiya

27d ago

OpenFetcher - Turn entire websites into RAG-ready Markdown, automatically

OpenFetcher is an LLM-native web crawler built for the RAG era. It crawls full domains (even without sitemaps) and converts pages into clean, structured Markdown optimized for embeddings and AI agents. Designed as a lightweight, open-source alternative to Firecrawl and Jina Reader, it focuses on signal over noise. It is well suited to building chatbots, knowledge bases, and AI search.
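
For anyone wondering what "signal over noise" could look like in practice, here is a rough sketch of HTML-to-Markdown conversion that drops navigation, scripts, and footers. It is an assumption about the general technique, not OpenFetcher's implementation; `MarkdownExtractor`, `html_to_markdown`, `NOISE_TAGS`, and `HEADINGS` are hypothetical names, and only headings, paragraphs, and list items are handled.

```python
from html.parser import HTMLParser

# Hypothetical choice of elements treated as noise.
NOISE_TAGS = {"script", "style", "nav", "header", "footer", "aside"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}


class MarkdownExtractor(HTMLParser):
    """Keeps headings, paragraphs, and list items; skips noise elements."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished Markdown blocks
        self.prefix = None    # Markdown prefix of the open block, or None
        self.text = []        # text fragments of the open block
        self.noise_depth = 0  # > 0 while inside a noise element

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1
        elif self.noise_depth == 0:
            if tag in HEADINGS:
                self._open(HEADINGS[tag])
            elif tag == "p":
                self._open("")
            elif tag == "li":
                self._open("- ")

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self.noise_depth = max(0, self.noise_depth - 1)
        elif tag in HEADINGS or tag in ("p", "li"):
            self._close()

    def handle_data(self, data):
        if self.noise_depth == 0 and self.prefix is not None:
            self.text.append(data)

    def _open(self, prefix):
        self._close()  # flush any block left open by unclosed tags
        self.prefix = prefix
        self.text = []

    def _close(self):
        if self.prefix is not None:
            body = " ".join("".join(self.text).split())  # collapse whitespace
            if body:
                self.blocks.append(self.prefix + body)
        self.prefix = None
        self.text = []


def html_to_markdown(html):
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    parser._close()  # flush a trailing unclosed block
    return "\n\n".join(parser.blocks)


sample = """
<html><head><style>body {color: red}</style></head><body>
<nav><a href="/">Home</a></nav>
<h1>Install</h1>
<p>Run the   <b>installer</b> first.</p>
<ul><li>Step one</li><li>Step two</li></ul>
<footer><p>Copyright</p></footer>
</body></html>
"""
print(html_to_markdown(sample))
```

In this sample the nav link, the stylesheet, and the footer are dropped, leaving only the heading, the paragraph, and the two list items, which is the kind of output that is cheaper to chunk and embed.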