
Geekflare API
Extract, Search, Scrape, Screenshot and more...
Scrape sites, Take screenshots, Search for AI, Test load times, Check DNS records, Ping IPs, Audit security for your site, Run Lighthouse and a lot more with Geekflare APIs.
This is the 3rd launch from Geekflare API.
Geekflare Scraping API v2
Launched this week
Feeding raw data directly into your AI agents eats up context windows and spikes your OpenAI and Anthropic costs.
Earlier this year, we launched standard HTML, JSON, and Markdown extraction. Today, we are introducing outputs built entirely for AI: markdown-llm, text-llm, and html-llm. We automatically strip out navbars, footers, ads, and scripts, delivering only the context your models actually need.
With the text-llm output format, you can save up to 85% on tokens compared to raw HTML.
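To get a feel for where savings like this come from, here is a minimal local sketch of boilerplate stripping. It is not Geekflare's pipeline: the `SKIP_TAGS` set, the `strip_boilerplate` helper, and the `size_reduction` helper are all assumptions made for illustration, and character count is only a rough proxy for tokens.

```python
from html.parser import HTMLParser

# Tags whose contents this sketch treats as boilerplate.
# This list is an assumption, not Geekflare's actual rule set.
SKIP_TAGS = {"nav", "footer", "header", "aside", "script", "style", "noscript"}


class BoilerplateStripper(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.skip_depth == 0 and text:
            self.chunks.append(text)


def strip_boilerplate(html: str) -> str:
    """Return the text outside navbars, footers, scripts and similar blocks."""
    parser = BoilerplateStripper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def size_reduction(raw: str, cleaned: str) -> float:
    """Fraction of characters removed; a rough stand-in for token savings."""
    return 1 - len(cleaned) / len(raw)


if __name__ == "__main__":
    page = (
        "<html><head><script>var a = 1;</script></head><body>"
        '<nav><a href="/">Home</a></nav>'
        "<main><p>Real content.</p></main>"
        "<footer>Copyright</footer></body></html>"
    )
    cleaned = strip_boilerplate(page)
    print(cleaned)
    print(f"{size_reduction(page, cleaned):.0%} smaller")
```

The actual reduction depends heavily on the page; markup-heavy pages with little body text shrink the most.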






Geekflare
Hello, everyone! 👋
Earlier this year, we launched the Geekflare Scraping API with standard Markdown, JSON, and HTML support. We prioritized your feedback about feeding our scraping results directly into AI agents and RAG pipelines.
Today we are launching our new -llm endpoints (markdown-llm, text-llm, html-llm). We do the heavy lifting behind the scenes to clean the DOM, strip the boilerplate, and return optimized structured content ready for generation.
Refer to the API reference for all supported formats.
You save up to 85% on tokens compared to raw HTML, speed up your LLM response times, and get better AI accuracy because the noise is gone.
I will be hanging out in the comments all day. Please let me know what you think and what you are building!
@chandankumar How consistent is the DOM cleaning across different CMSs, like Webflow vs WordPress? Any "gotcha" content types you've seen trip up the llm endpoints?
Geekflare
@dayal_punjabi Hello Dayal,
Our DOM cleaning is consistent across platforms because we don’t rely on CMS-specific class names. Instead, our engine uses a mix of semantic HTML analysis (<article>, <main>, etc.) and text-to-DOM density scoring to isolate the primary content block and strip away the noise.
We continuously tune the engine as we come across issues with tables and <pre> or <code> tags.
If you run into any issues, please let me know.
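For readers curious what "semantic HTML analysis plus density scoring" can look like, here is a rough sketch, not the engine described above. It assumes a very simple score (characters of text per element in a subtree) and prefers `<article>` or `<main>` when present; the names `DensityParser` and `extract_main_text` are hypothetical, and a production scorer would likely also weigh link density and block length.

```python
from html.parser import HTMLParser

SEMANTIC_TAGS = {"article", "main"}              # preferred containers
CANDIDATE_TAGS = SEMANTIC_TAGS | {"section", "div"}
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
IGNORED_TEXT_TAGS = {"script", "style"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.texts = []      # text chunks in this subtree, in document order
        self.tag_count = 1   # elements in this subtree, including this one

    @property
    def density(self):
        # Characters of text per element: high for prose, low for menus.
        return sum(len(t) for t in self.texts) / self.tag_count


class DensityParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.current = Node("root")
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:             # no end tag, so just count it
            self.current.tag_count += 1
            return
        self.current = Node(tag, self.current)
        if tag in CANDIDATE_TAGS:
            self.candidates.append(self.current)

    def handle_data(self, data):
        text = data.strip()
        if text and self.current.tag not in IGNORED_TEXT_TAGS:
            self.current.texts.append(text)

    def _close_current(self):
        # Fold the finished subtree's totals into its parent.
        node, parent = self.current, self.current.parent
        parent.texts.extend(node.texts)
        parent.tag_count += node.tag_count
        self.current = parent

    def handle_endtag(self, tag):
        node = self.current
        while node.parent is not None and node.tag != tag:
            node = node.parent
        if node.parent is None:
            return                       # stray end tag, ignore it
        while self.current is not node:  # close unclosed children first
            self._close_current()
        self._close_current()

    def finish(self):
        while self.current.parent is not None:
            self._close_current()


def extract_main_text(html: str) -> str:
    parser = DensityParser()
    parser.feed(html)
    parser.close()
    parser.finish()
    with_text = [n for n in parser.candidates if n.texts]
    if not with_text:
        return ""
    semantic = [n for n in with_text if n.tag in SEMANTIC_TAGS]
    best = max(semantic or with_text, key=lambda n: n.density)
    return " ".join(best.texts)
```

Because nothing here depends on class names, the same scoring applies to any CMS's markup, which is the property the reply above points to.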
@chandankumar Thanks a lot for the response. And congrats on the launch!
Geekflare
@dayal_punjabi thank you so much!
Bababot
Congratulations on the launch.
Geekflare
@emma_watson21 Thank you so much!