Extract clean text from any website

URLtoText - Extract clean text from any website

URLtoText

1yr ago

Extract clean text or markdown from any website. Then paste into your favorite AI.

Replies

URLtoText

Maker

Hey Product Hunt community! I'm thrilled to launch urltotext.com today.

Urltotext.com started as an internal debugging tool for the web scraper behind another product of ours, but it quickly became indispensable for our customers in extracting clean data from various websites.

When working with LLMs, especially for RAG (retrieval-augmented generation), clean data input is crucial. Urltotext.com excels at:

1. Extracting clean text from raw HTML, reducing token bloat
2. Intelligently isolating main content using AI-driven heuristics
3. Rendering JavaScript and using residential IPs to overcome common extraction hurdles

We're exploring a paid version with higher rate limits, a fully documented API for programmatic access, and advanced features like CAPTCHA solving.

If urltotext.com sounds useful for your projects, I'd love to hear your thoughts! Please share your feedback and use cases in the comments.
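
To make the first point concrete, here is a minimal sketch of what "extracting clean text from raw HTML" can look like. It is not URLtoText's implementation, which the thread does not describe. It uses only Python's standard-library `html.parser`, and the names `TextExtractor` and `html_to_text` are made up for illustration. It drops tags plus the contents of `script` and `style` elements, which is where much of the token bloat tends to come from; it does not attempt main-content isolation or JavaScript rendering.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores script/style-like elements."""

    # Elements whose contents are never shown to a reader (assumed list).
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse runs of whitespace inside each text node.
            text = " ".join(data.split())
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `html_to_text(page_source)` on a fetched page returns plain text that is usually far shorter than the markup it came from.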

2yr ago

Cycle

@timothybramlett congrats on the launch! What tech do you use under the hood? Firecrawl?

1yr ago

URLtoText

Maker

@chethan_bm 🙏

1yr ago

URLtoText

Maker

@thibautnyssens custom tech basically

1yr ago

URLtoText

Maker

@a_zelenkov sounds good! 👍

1yr ago

@timothybramlett Congrats on your launch day! Wishing you great success and new opportunities. What challenges did you overcome to get here?

1yr ago

TTSynth.com

This sounds really promising, @timothybramlett! I'm curious about the AI-driven heuristics you mentioned. Can you elaborate on how they isolate main content? Also, do you have any plans to integrate this tool with popular LLMs for easier use?

1yr ago

URLtoText

Maker

@robertthomas2 How do you think that integration could work? Such an interesting idea!

1yr ago

Hurrayy

Congrats on the launch! This tool is actually really useful! Are you planning to add extraction for multiple pages as well?

1yr ago

URLtoText

Maker

@twoheads I could definitely add that! Would that help your use case?

1yr ago

Hurrayy

@timothybramlett Not a specific use case, just a general question! Keep up the good work 👍

1yr ago

Really interested in this and would make very heavy use of an API. @timothybramlett when do you think a paid API will be available? Any chance of beta access? I could also help a little with development, as this is something I need and would otherwise put hours into building on my own platform haha. I would rather use yours, as it seems a bit further along.

1yr ago

URLtoText

Maker

@newms87 do you just need one page at a time? Or multiple?

1yr ago

@timothybramlett Probably just 1 page at a time. Multiple pages could certainly be a nice feature, but I'm not sure whether I'd want that directly or would rather decide on my end which page comes next and then request each page as needed. It would be a cool feature to get pagination fields returned, though, as something I could pass back as parameters to save a bit of time on my end.
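
One way to picture the "pagination fields" idea, purely as a sketch: no such API exists in this thread, so the `PageResult` shape, its `next_url` field, and the `crawl_pages` helper are all hypothetical. The caller stays in control of one-page-at-a-time requests and simply follows the returned pointer until it runs out or a page limit is hit.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class PageResult:
    """Hypothetical response for one extracted page."""
    url: str
    text: str
    next_url: Optional[str] = None  # assumed pagination field


def crawl_pages(
    start_url: str,
    fetch: Callable[[str], PageResult],
    max_pages: int = 10,
) -> Iterator[PageResult]:
    """Request one page at a time, following next_url until it is None."""
    seen = set()
    url: Optional[str] = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)  # guards against pagination loops
        result = fetch(url)
        yield result
        url = result.next_url
```

Here `fetch` stands in for whatever client call ends up returning a single page, so the loop can be tried out with a stub before any real API is available.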

1yr ago

URLtoText

Maker

@newms87 okay, let me think on this over the weekend and get back to you. How can I contact you? Just through here? Are you on Twitter?

1yr ago

@timothybramlett I don't use Twitter, but IG maybe? @dan.el.martillo. Or I can contact you on LinkedIn and send you my email.

1yr ago

URLtoText

Maker

@newms87 nice, okay, LinkedIn works

1yr ago

Mailfox

Hey @timothybramlett, really nice! Your service has huge potential 👌🏻 I've been using Jina AI lately, but in comparison I love the simplicity of your service. As feedback, I would add a validation message when a URL without a scheme is entered. Congrats on the launch 🚀

1yr ago

URLtoText

Maker

@crebuh Oh, that is a good point! And you mean without the http or https prefix?

1yr ago

Mailfox

@timothybramlett yes, I entered www.mailfox.dev but I had to check the network tab to see what was wrong :)

1yr ago

URLtoText

Maker

@crebuh oh good point I will add that!
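
For anyone curious what that check could look like, here is a small sketch using Python's standard `urllib.parse`. The function name `normalize_url` and the choice to default to `https` are assumptions for illustration, not how the site handles it. An input like `www.mailfox.dev` gets a scheme prepended, and anything that still lacks a usable host raises an error that a form could show as a validation message.

```python
from urllib.parse import urlparse


def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Return a URL with an http(s) scheme, or raise ValueError with a reason."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("URL is empty")

    # Bare hosts like "www.mailfox.dev" have no scheme, so add one.
    if "://" not in candidate:
        candidate = f"{default_scheme}://{candidate}"

    parsed = urlparse(candidate)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError(f"No host found in {raw!r}")
    return candidate
```

Whether to silently add the scheme or to reject the input with a message is a design choice; the sketch does the former and reserves errors for input that cannot be repaired.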

1yr ago

Telebugs

Hey Timothy, I'm curious about how it handles dynamic content. Does it wait for JavaScript to load before extracting the text? How does it perform with websites that have complex layouts or a lot of nested content? It would be interesting to see a comparison of your results versus other text extraction methods. Congrats on the launch!

1yr ago

URLtoText

Maker

@kyrylosilin yes, it waits for the JS to render

1yr ago

Congratulations on the launch @timothybramlett. As we can see you are a solo maker, which makes URLtoText even more impressive. Very practical tool!

1yr ago

This feature is very useful. Can it also extract content from pages where copying is disabled?

1yr ago

URLtoText

Maker

@mu_lin1 it should be able to. Is that useful to you?

1yr ago

@timothybramlett Very useful, thank you for developing such a handy tool~

1yr ago

URLtoText

Maker

@mu_lin1 👍

1yr ago

Hello my guys

1yr ago

URLtoText

Maker

@muhammad_salihu2 👋

1yr ago

Hell yeah!! This is dope

1yr ago

URLtoText

Maker

@jaydesilva 💪 Thanks!

1yr ago
