Tabstack Dev Tools - Ditch your scraper. Make one API call with any tool.

Ditch your scraper. One API gives your code everything it needs from the web: structured JSON, clean markdown, cited research, and browser automation. No browser, LLM, or pipeline for you to run. Use it from the tools you already work in: an MCP server, CLI, Raycast extension, or as an Agent Skill. Grab a key and make your first call in less than three minutes. Mozilla-backed. Your data is never sold, never trained on.

Built at Mozilla definitely got my attention. Curious how well it handles websites that change their layout frequently.

 Thanks! The Mozilla part means a lot to us.

Layout changes are exactly where this approach holds up. You define a schema for what you want, and the model reads the page to fill it. No CSS selectors or XPath to maintain, so when a site reshuffles its markup, your extraction keeps working. That's the whole reason we went schema-first instead of selector-based.

Honest caveat: if a site actually removes the data or buries it behind new clicks, that's a content change, not a layout change, and you'd feel it. For redesigns and DOM churn, you generally don't touch a thing.
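As a rough illustration of the schema-first idea described above, here is a minimal Python sketch. The schema itself is ordinary JSON Schema. The `build_extract_request` helper and the `url` / `json_schema` payload keys are hypothetical names chosen for illustration, not Tabstack's documented API.

```python
import json

# An ordinary JSON Schema describing the fields we want from a product page.
# Nothing here refers to CSS selectors, XPath, or page structure.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}


def build_extract_request(url: str, schema: dict) -> str:
    """Serialize a 'URL + schema' request body (hypothetical payload shape)."""
    return json.dumps({"url": url, "json_schema": schema}, sort_keys=True)


body = build_extract_request("https://example.com/product/42", PRODUCT_SCHEMA)
print(body)
```

The point of the sketch is what is absent: the request names the fields by meaning, so a markup change on the page leaves this code untouched.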

Built at Mozilla definitely got my attention.

From my perspective, this makes the difference. It sets expectations.

Every call runs on a Mozilla-backed platform. The pages you extract, the answers you research, and the tasks you automate stay yours, handled responsibly and never used to train models. See exactly how Tabstack sources and handles data in the docs:

You're in good hands. Get started for free here:

The schema-first approach is interesting. Have you found that users spend more time defining the schema they want, or cleaning up the extracted data afterward? Curious where the bottleneck usually ends up.

 Great question. Schema-first moves the bottleneck to the front, and shrinks it.

With most extraction, the work lives on the back end. You get messy output and clean it, every run, forever. Schema-first flips that. You define the shape once, and Tabstack does the cleanup for you. The model reads the raw page, normalizes the values, and returns data that's already typed and in your desired shape. No separate cleanup pass on your side.

So the time does go into defining the schema. But that's a one-time decision, not a per-run tax. And it's mostly just deciding what you actually want out of the page, which is the part you wanted to think about anyway.
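To make "already typed and in your desired shape" concrete, here is a small sketch of what checking a result against a schema could look like on the caller's side. `conforms` is a hypothetical helper covering only a tiny subset of JSON Schema (flat objects with primitive types), and both sample results are made up.

```python
# Map JSON Schema primitive type names to Python types.
_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def conforms(result: dict, schema: dict) -> bool:
    """Check required keys are present and non-null values have the declared type."""
    for name in schema.get("required", []):
        if name not in result:
            return False
    for name, spec in schema["properties"].items():
        value = result.get(name)
        if value is None:
            continue  # null is allowed: it signals a gap, not a type error
        # bool is a subclass of int in Python, so reject it for "number".
        if spec["type"] == "number" and isinstance(value, bool):
            return False
        if not isinstance(value, _TYPES[spec["type"]]):
            return False
    return True


schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "price": {"type": "number"}},
    "required": ["title"],
}

typed = {"title": "Blue Mug", "price": 12.5}      # the shape you asked for
messy = {"title": "Blue Mug", "price": "$12.50"}  # the kind of output that needs cleanup

print(conforms(typed, schema))  # True
print(conforms(messy, schema))  # False
```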

If you keep an eye out, you'll see our launch next week that's focused on schemas. 😉

If you keep an eye out, you'll see our launch next week that's focused on schemas.

 spoiler alert 🙈

"URL + schema in, clean JSON out" is exactly what I keep wishing for. I drive a lot of browser automation for my own agents and the thing that always bites me isn't the first run, it's the site quietly changing its DOM a week later and everything breaking silently. Does the schema-based extraction hold up when a page's layout changes, or does it need re-tuning? Mozilla-backed and "never trained on your data" is a strong trust angle too. Congrats on the launch.

 Thank you, that means a lot. This is exactly the problem we built around.

It holds up without re-tuning. Nothing's tied to the DOM, so when a site changes its markup, extraction keeps working. The model finds your fields by what they mean, not where they sit on the page.

And nothing breaks silently. If the data ever leaves the page, the field comes back null, so you see the gap instead of shipping bad data.
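A minimal sketch of how a caller might surface those null fields before using the data. The `missing_fields` helper and the sample result are hypothetical, written only to illustrate the "null instead of bad data" behavior described above.

```python
def missing_fields(result: dict, expected: list[str]) -> list[str]:
    """Return the expected fields that are absent or null in `result`."""
    return [name for name in expected if result.get(name) is None]


# Made-up extraction result where the price is no longer on the page.
result = {"name": "Blue Mug", "price": None}

gaps = missing_fields(result, ["name", "price"])
if gaps:
    print("fields came back null:", gaps)
```

A check like this turns a quiet content change into a visible signal you can log or alert on.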

Trust matters to us as much as to you. That's the Mozilla manifesto. 🫶

Privacy, transparency, and control. You can read the Mozilla manifesto here:

It looks very interesting! How does it compare to other AI scraper solutions currently on the market? What specific use cases did you test during development, and which ones is the agent optimized for?

 thanks for the support, Corentin! and great questions. TL;DR:

  • focused on turning web pages into AI-ready structured content

  • optimized for fast adoption

  • lighter-weight and purpose-built for content extraction and structured output

  • built at Mozilla, i.e. private by default and transparent by design

if you have a specific product in mind, you could find a more detailed comparison in the docs:

hope it helps!

random idea: migration guides to help users switch to Tabstack

Congrats on the launch. The MCP + schema path is a nice fit for agents that need web context without owning brittle scrapers.

The thing I’d test is provenance across a multi-source run: exact URL, fetch time, cache/nocache state, and which fields came from which source. Do you return that beside the JSON, or mostly through citations today?
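For readers wondering what such a provenance record might look like, here is one possible shape, sketched in Python. It models only the four items listed in the question (URL, fetch time, cache state, per-field source). Every name here is hypothetical, and this is not a description of what Tabstack returns.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SourceRecord:
    """Provenance for one fetched source in a multi-source run (hypothetical)."""
    url: str
    fetched_at: datetime
    from_cache: bool
    fields: list[str] = field(default_factory=list)


def field_sources(records: list[SourceRecord]) -> dict[str, list[str]]:
    """Invert the records into a field -> source URLs mapping."""
    mapping: dict[str, list[str]] = {}
    for record in records:
        for name in record.fields:
            mapping.setdefault(name, []).append(record.url)
    return mapping


fetched = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    SourceRecord("https://example.com/a", fetched, False, ["price"]),
    SourceRecord("https://example.com/b", fetched, True, ["price", "rating"]),
]
print(field_sources(records))
```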

This is pretty cool.

 i do think so. what's your favorite part about this product/launch?

Congratulations on the launch! When you say one API call for any tool, are there really no restrictions on which tools can be used? If not, which tools have you seen used the most?

Thanks for the support, Scott! And yes, you can use Tabstack from the tools you already work in: an MCP server, CLI, Raycast extension, or as an Agent Skill.

The MCP + agent skill angle is the part that stands out to me. Scraping is usually treated as a one-off API call, but making it usable directly inside an agent workflow feels much more practical. Curious how you think about guardrails when agents browse or extract from messy pages.

 great question. Tabstack is schema-based. You define a JSON schema for the fields you want, and every call reads the page and maps it to that schema. You maintain the schema, not the scraping logic. And the schema only changes when the data you want changes.

Start for free here: and let us know what you think!

 anything else Tabstack should build/improve/fix from your perspective?

The MCP server angle is the interesting bit for me. When an agent pulls structured web data this way, how do you handle pages where the schema is slightly wrong or the DOM changes mid-run?

That's the best part: when a site reshuffles its DOM but still shows the same info, nothing on your end changes.

I've been thinking about this from the opposite direction. I spend time making my own site readable by AI agents (llms.txt, structured markdown endpoints, that sort of thing), and the annoying part isn't getting the content out there. It's that every agent reads the page differently, and you have no idea what they actually pulled.

Schema-first extraction kind of flips that. If agents start requesting a schema instead of just scraping raw HTML, the site owner can finally tell what the agent actually saw. That feels more like a contract than a crawl.

Has anyone tried pointing Tabstack at their own pages to check what an agent would get back? Feels like a quick way to sanity-check your own site's AI readiness.