
Tabstack gives AI agents and apps finished output from the live web in a single API call. Extract structured data to a schema you define, convert pages to Markdown, run cited multi-source research, and automate browser tasks. Every call returns exactly what you asked for. Built for developers shipping autonomous agents and those adding web interaction to an existing app or stack. Built by Mozilla, with ephemeral processing, no model training on your data, and robots.txt compliance by default.
This is the 6th launch from Tabstack by Mozilla.
Tabstack Browser Automation
Launched this week
Give /automate a task in plain English and it drives a real browser to do it: navigate a site, click through a multi-step flow, fill a form, reach a page that only renders after interaction. The result streams back in one API call.
It's an API you call, not a framework you install. Browser and LLM included, nothing to host, no concurrency ceiling. Accessibility-tree automation spends 60 to 80% fewer tokens than screenshot-based agents.
Built by Mozilla. Ephemeral, no training on your data.













ChatWebby AI
The accessibility-tree approach for 60-80% fewer tokens vs screenshot agents is a smart tradeoff. Curious how it holds up on pages where the a11y tree is sparse or mislabeled (canvas-heavy apps, unlabeled divs): does it fall back to vision, or does the task just fail there?
Tabstack by Mozilla
@zain_sheikh Honest answer: it doesn't silently switch to a pixel/vision mode when the tree is thin. /automate works from the accessibility snapshot, so if an interactive element is genuinely absent or mislabeled in the tree (canvas-drawn controls, unlabeled divs with click handlers), the agent can't reliably target it and that step will struggle rather than fall back to vision.
So the tradeoff is real in both directions: on structured, semantically-labeled pages you get the token savings and reliability; on pixel-only or canvas-heavy UIs, a tree-based approach is weakest, and we won't paper over that today. If your workflow lives mostly in that canvas-heavy territory, it's worth testing on your actual target pages before you commit.
Is this running remotely on another server? Seems really cool. How do things like Google auth work?
Tabstack by Mozilla
@campak Yeah, fully hosted: the browser runs on our infrastructure, so you just call the API and get the result back, nothing to run on your side.
Auth flows are the honest edge of the hosted API. It's stateless per call and doesn't manage sessions or credentials, so an interactive login (OAuth, "Sign in with Google," anything that hands back a session you need to hold onto) isn't something it carries across calls. It's built for public pages. Two ways people handle that:
1. Run the engine locally. /automate is powered by Pilo, our open browser-automation engine. Run it in your own environment, complete the login yourself, and it drives your authenticated session directly. That's the right path when the task genuinely needs to be signed in.
2. Split the work. Keep the authenticated steps in your own code and point hosted Tabstack at whatever's reachable without a session. You own the login; Tabstack does the structured extraction and automation around it.
Thanks, glad it caught your eye!
Congrats on the launch, folks! Turning "give agents the web" into five clean endpoints instead of a black box is an interesting call.
Just a curious question: when an agent's task requires a page that robots.txt disallows, does the API fail with a clear signal to reroute, or is that boundary invisible until production?
Tabstack by Mozilla
@soumya_ranjan_mohapatra thanks! It fails fast with a clear signal. A disallowed URL comes back as a 422 with the message "blocked by robots.txt", surfaced in the SDK as a typed UnprocessableEntityError, so you can catch that specific case and reroute (skip the URL, try an allowed path, hand it back to your planner) instead of guessing. The error is raised synchronously on the call, and it fires the same way mid-/automate, so an agent hits the boundary the moment it tries the page, not later.
One detail worth knowing: the check fails open. If a site's robots.txt can't be fetched or parsed, the request proceeds rather than blocking, so you only get the 422 on an actual disallow, not on an ambiguous or missing file.
Compliance is on by default, so this is the intended behavior rather than something you configure.
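A minimal sketch of the catch-and-reroute pattern described above, in Python. The UnprocessableEntityError class here is a local stand-in that only borrows the SDK's error name; its attributes, the fetch callable, and fetch_first_allowed are hypothetical, not the actual SDK surface.

```python
class UnprocessableEntityError(Exception):
    """Local stand-in for the SDK's typed 422 error (attribute names are assumed)."""

    def __init__(self, message):
        super().__init__(message)
        self.status_code = 422
        self.message = message


def fetch_first_allowed(fetch, candidate_urls):
    """Try each URL in order; skip the ones blocked by robots.txt.

    `fetch` is any caller-supplied function that takes a URL and either
    returns a result or raises UnprocessableEntityError.
    Returns (result, skipped_urls); result is None if every URL was blocked.
    """
    skipped = []
    for url in candidate_urls:
        try:
            return fetch(url), skipped
        except UnprocessableEntityError as err:
            # Only reroute on the robots.txt case; re-raise any other 422.
            if "blocked by robots.txt" not in err.message:
                raise
            skipped.append(url)
    return None, skipped
```

The skipped list can be handed back to a planner so it knows which URLs to avoid on the next attempt.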
I have some experience with workflow automation, so this caught my attention. Is it fair to think of Tabstack as browser automation powered by AI, rather than a traditional RPA workflow? I'd love to understand where the biggest differences are.
Tabstack by Mozilla
Correct. Extract structured data to a schema you define, convert pages to Markdown, run cited multi-source research, and automate browser tasks using @Tabstack by Mozilla.
The key differentiator? @Mozilla. Every call runs on a Mozilla-backed platform. The pages you extract, the answers you research, and the tasks you automate stay yours, handled responsibly and never used to train models.
Private by default, transparent by design. @Tabstack by Mozilla is just different.
@fmerian Thanks for the clarification! The privacy-first approach is definitely a strong differentiator, especially for workflows that involve sensitive business data. Looking forward to seeing how Tabstack evolves.
Tabstack by Mozilla
awesome! really looking forward to seeing what you're building with @Tabstack by Mozilla - feel free to add your review here: https://www.producthunt.com/products/tabstack/reviews
Tabstack by Mozilla
oh and please help us spread the word on LinkedIn! repost this
How does it actually handle sites with heavy anti-bot protection or JavaScript-heavy SPAs that need real browser rendering under the hood?
Tabstack by Mozilla
@bavcichali87886 Thanks for getting into the mechanics, this is the right question to ask.
JS-heavy SPAs: these run on a real browser under the hood, not a plain HTTP fetch, so client-rendered content works. If a page loads its data lazily or after interaction (pricing tables, product grids, dashboards), set effort to max. That's full browser rendering built for exactly this case. standard is faster but can return empty fields on the heaviest SPAs, so max is the lever when you see that happen.
Anti-bot: Tabstack is built by Mozilla and respects robots.txt by default, so it's made for accessing the open web, not for defeating sites that don't want automated access. Real browser rendering plus geo-aware routing (geo_target) handles a lot of what trips up naive HTTP clients, but aggressive stealth and CAPTCHA-solving aren't what we do. If your use case leans on that, I'd rather tell you now that we're probably not the right fit.
If you've got a specific URL that's giving you trouble, send it over and I'll take a look.
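The standard-then-max advice for JS-heavy SPAs can be sketched as a small retry wrapper in Python. Only the effort parameter and its standard and max values come from the reply above; the extract callable, its signature, and the shape of the returned record are assumptions made for illustration.

```python
def has_empty_fields(record):
    """True if any field in the extracted record came back empty."""
    return any(value in (None, "", [], {}) for value in record.values())


def extract_with_escalation(extract, url):
    """Call `extract` at standard effort, then retry once at max if fields are empty.

    `extract` is a caller-supplied function (hypothetical signature):
    extract(url, effort=...) -> dict of field name to value.
    """
    record = extract(url, effort="standard")
    if has_empty_fields(record):
        # Lazily loaded content may need full browser rendering.
        record = extract(url, effort="max")
    return record
```

Starting at standard keeps the common case fast and pays for max only on pages that actually return empty fields.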
How does the robots.txt compliance actually work when an agent needs to interact with a page that blocks scraping? Does the API just refuse, or is there a way to get the structured data another way?
Tabstack by Mozilla
@azadfndk143609 robots.txt is respected by default, and it's checked per URL against the user-agent, not as a blanket on/off. If a site's robots.txt disallows the path you're pointing at, Tabstack treats it as blocked and won't fetch it. There's no compliant bypass; that's deliberate on Mozilla's side. In /research you'll even see blocked URLs counted in the robotsBlocked stat.
The part that usually matters: robots.txt is path-specific, so sites rarely disallow everything. The page you actually want is often allowed even when other parts of the site aren't. So it's less "the API refuses you" and more "it respects exactly what the site published."
And worth separating, since people lump them together: a robots.txt disallow is different from a page throwing bot-detection at you. Only the first is a robots.txt question. If you hit blockers, share the URL and I'll tell you which you're hitting.
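To see how path-specific rules play out, here is a small Python illustration using the standard library's urllib.robotparser. It shows how robots.txt rules in general apply per path and per user-agent; it is not Tabstack's implementation, and the rules, the "example-agent" name, and the example.com URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: only two paths are disallowed, the rest of the site is open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /search
"""


def is_allowed(robots_txt, user_agent, url):
    """Check one URL against robots.txt rules for a given user-agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/products/42",
        "https://example.com/account/settings",
    ):
        print(url, is_allowed(ROBOTS_TXT, "example-agent", url))
```

The product page is allowed while the account page is not, even though both live on the same site.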
How does the robots.txt compliance work when an agent needs data from a page that's blocked but technically accessible via the rendered DOM?
Tabstack by Mozilla
@zafer175063 If robots.txt disallows the path, we don't fetch or render it, whether or not the data would technically be sitting in the DOM. "Technically accessible" isn't the same as "allowed," and we go by allowed. That's the Mozilla line: we won't route around a site's stated policy. If the path isn't disallowed, the agent reads it normally.