
Tabstack is a web data and automation API that delivers reliable structured output. Pass a URL and a schema, get back JSON that matches every time. Run research in one call and get cited answers back. Automate browsers without running infrastructure. The intelligence is built into every API call. No scraper to build, maintain, or watch break when a site changes. Built at Mozilla.
This is the 3rd launch from Tabstack by Mozilla.
Tabstack Structured Extraction
Launched this week
Define a schema, pass a URL, get back JSON that matches. Tabstack's extract endpoint turns any web page into structured output, with no parsing code and no LLM call to maintain. The generate endpoint adds AI instructions for reasoned answers, not raw fields. Both enforce your schema on every call, even when the page changes. Tune speed with effort levels and target any country with geo_target. Mozilla-backed: your data is never sold or used to train models. 10,000 free credits to start.









I've built a few scraping workflows before, and maintenance was always the painful part. How does it handle sites that change their structure frequently?
Tabstack by Mozilla
@erkan this is the main reason Tabstack exists. The maintenance pain you are describing comes from selector-based extraction: you pin to CSS paths or DOM structure, and the moment a site reships its markup, those break.
Tabstack is schema-driven, not selector-driven. You define the JSON shape you want, the meaning of the data, not where it lives in the DOM. A model reads the rendered page and maps content to your schema. So when a site redesigns its layout, there is no selector pinned to the old structure to break. The same schema keeps returning the same shape against the new markup.
Honest boundary: this insulates you from layout and markup churn, not from the content itself changing. If a field genuinely disappears from the page, changes in meaning, or moves behind a new interaction, you may still need to adjust. But the day-to-day "they shipped a redesign and my extractor broke" maintenance mostly goes away.
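To make the schema-driven idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the product fields (`name`, `price`, `in_stock`) and the helper `missing_or_mistyped` are hypothetical and not part of Tabstack's API. The schema names what the data means and contains no selectors, and the check surfaces the boundary case described above, where a field has genuinely disappeared from the page.

```python
from typing import Any

# Hypothetical JSON Schema for a product page. It describes the data wanted,
# with no CSS selectors or DOM paths anywhere.
PRODUCT_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price", "in_stock"],
}

# Map the JSON Schema type names used above to Python types.
_PY_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def missing_or_mistyped(result: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return the required fields that are absent, null, or of the wrong type."""
    problems = []
    for field in schema["required"]:
        expected = _PY_TYPES[schema["properties"][field]["type"]]
        value = result.get(field)
        # bool is a subclass of int in Python, so reject it where a number is expected.
        bool_as_number = expected is not bool and isinstance(value, bool)
        if value is None or not isinstance(value, expected) or bool_as_number:
            problems.append(field)
    return problems
```

A check like this does not stop a field from vanishing, but it turns a silent gap into a visible signal that the schema or the workflow needs a look.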
Firma.dev
Seems like an interesting concept. How well do you handle sites with heavy JS rendering?
Tabstack by Mozilla
@chris_davis23 Heavy JS rendering is handled through the effort parameter on the extract and generate endpoints:
- max: full headless browser rendering. It executes JavaScript and waits for dynamic content to load before pulling data. This is the setting for SPAs (React, Vue, Angular, Next.js client-side), lazy-loaded content, and pricing or product grids that only appear after JS runs.
- standard (default): lighter JS handling that covers most pages.
- min: static HTML only, no JS, for lowest latency.
So for a JS-heavy site you set effort: 'max' and extract against the fully rendered DOM.
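A rough Python sketch of such a request, under stated assumptions: the endpoint URL is a placeholder, and the body field names `url` and `json_schema` and the bearer-token header are guesses, not confirmed Tabstack API details. Only the `effort` values (`min`, `standard`, `max`) come from this thread. The function builds the request without sending it, so the shape can be inspected safely.

```python
import json
import urllib.request

# Assumption: placeholder URL, not a documented Tabstack endpoint.
EXTRACT_URL = "https://api.example.com/extract"

# Effort levels as named in the thread, lowest latency first.
EFFORT_LEVELS = ("min", "standard", "max")


def build_extract_request(url, schema, api_key, effort="standard"):
    """Build (but do not send) a POST request asking for schema-shaped JSON."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    # Assumption: the body field names "url" and "json_schema" are illustrative.
    body = {"url": url, "json_schema": schema, "effort": effort}
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: bearer-token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


# For a JS-heavy page, ask for full rendering.
request = build_extract_request(
    "https://example.com/pricing",
    {"type": "object", "properties": {"plan": {"type": "string"}}},
    api_key="YOUR_API_KEY",
    effort="max",
)
```

Passing the built request to `urllib.request.urlopen` would send it; check the real endpoint and field names against the official documentation first.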
The same effort control applies to markdown extraction and the generate endpoint. And if a page only reveals content after interaction (click, scroll, log in), the automate endpoint drives a real browser to do that first, then hands back structured output.
I have been using Tabstack for quite some time and loving it :)
Tabstack by Mozilla
@natwar86 that's amazing! Feel free to provide us a review if you ship anything on Product Hunt! I would love to support your efforts across the board so reach out if I can help in any way.
jared.so
Solid launch for Tabstack by Mozilla: Extract web data and automate browsers, no scraper required. What was the hardest part to get right so far?
Tabstack by Mozilla
@borrellbr for sure the marketing! hahah. Nah, I'm just biased. We have had a lot of challenges from getting agents to navigate the web to ensuring we're great stewards of Mozilla's manifesto for data privacy and transparency. It's been a really fun adventure, though.
@srbiv might have something else to share here that could be fun to hear.
Tabstack by Mozilla
@borrellbr Great question, two things come to mind:
- Building a product that puts privacy first means we're often making decisions with incomplete data. We have to get creative with the few signals we do have to understand our users' needs.
- In the same vein, we're tuning the system to use the most efficient fetching strategy for any given URL. Even with continuous learning and tuning, we don't always get it right, so we give the caller the ability to control the effort of the fetch. If they aren't happy with the results, they can increase the effort and we'll spend more time and resources to get back as much content as we can.
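The "increase the effort if you aren't happy" pattern from the second point could look roughly like this on the caller's side. This is a sketch, not Tabstack code: `extract_with_escalation`, `fetch`, and `is_good_enough` are hypothetical names, and `fake_fetch` stands in for a real API call so the example runs offline. Only the effort names come from the thread.

```python
from typing import Any, Callable, Optional, Tuple

# Effort levels as named in the thread, cheapest first.
EFFORT_LEVELS = ("min", "standard", "max")


def extract_with_escalation(
    fetch: Callable[[str], Any],
    is_good_enough: Callable[[Any], bool],
    start: str = "standard",
) -> Tuple[Optional[str], Any]:
    """Call fetch(effort) at rising effort levels until the result is acceptable.

    Returns (effort, result) for the first acceptable result, or for the
    last attempt if no level produced one.
    """
    effort, result = None, None
    for effort in EFFORT_LEVELS[EFFORT_LEVELS.index(start):]:
        result = fetch(effort)
        if is_good_enough(result):
            break
    return effort, result


def fake_fetch(effort: str) -> dict:
    """Stand-in for a real call: the price only shows up at full rendering."""
    return {"price": 19.0} if effort == "max" else {"price": None}


used_effort, data = extract_with_escalation(
    fake_fetch, lambda result: result["price"] is not None
)
```

Starting low and escalating only on a poor result keeps most calls fast while still reaching content that needs a fully rendered page.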
The structured extraction angle is useful, especially if it keeps schema drift visible instead of just returning a 'best effort' JSON blob. Not sure if I missed it, but can teams version extraction rules per site/workflow?
The schema-first approach is what caught my eye. Scrapers usually work great until a site changes one small thing.
Tabstack by Mozilla
@furkan_topcuoglu exactly. Tabstack takes a different approach that helps you avoid constant scraper maintenance. I had an audit tool reviewing developer documentation, and the pricing page was a constant pain point: brands present pricing in so many different ways. I implemented Tabstack (when I was interviewing) and realized how much more effective it is. I haven't had to adjust my app in this area since.
A privacy-forward solution to a technical headache? I'm listening... ;)
Tabstack by Mozilla
@mark_toubman2 exactly what I said when I was interviewing to join the team! Such an awesome win.