Tabstack Dev Tools - Ditch your scraper. Make one API call with any tool.

Ditch your scraper. One API gives your code everything it needs from the web: structured JSON, clean markdown, cited research, and browser automation. No browser, LLM, or pipeline for you to run. Use it from the tools you already work in: an MCP server, CLI, Raycast extension, or as an Agent Skill. Grab a key and make your first call in less than three minutes. Mozilla-backed. Your data is never sold, never trained on.


keeps cooking.

Today, the team is launching not 1, not 2, not 3, but 4 new features. Introducing:

  • Tabstack CLI for quick automation and scripting -

  • Tabstack MCP server that gives your AI assistant direct access to Tabstack -

  • Tabstack agent skill for your or Hermes agent

  • Tabstack for Raycast - an extension for scraping data without leaving

Go scrape something today.

 Converting data scraped from a website into a schema is a universal problem. I will surely give it a try.

 yes! and with this launch, we hope it fits perfectly into your existing dev workflow.

Looking forward to seeing what you're building with it.

 btw is going live later today at 12 PM SF time to walk through these new features.

The unified API abstraction on top of scraping is clever. We've hit the selector-maintenance problem building data pipelines where a single HTML change breaks weeks of work. Does it use headless browser pooling or something more lightweight for dynamic content, and how do you handle rate-limiting per domain when multiple callers share the same API key?

 Excellent questions!

For content extraction we use several different strategies, including headless browsers. However, as you alluded to, not every site needs a full headless browser; sometimes a simple HTTP request will do the trick. Tabstack aims to pick the most efficient strategy based on the requested URL. Extraction effort is also configurable; you can read more about it here:

To prevent multiple callers from hammering the site over and over we use caching and honor robots.txt directives that target our user agent.
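For readers wondering what "honoring robots.txt directives that target our user agent" can look like in practice, here is a minimal sketch using Python's standard urllib.robotparser. The user agent name ExampleExtractorBot and the rules below are made up for illustration; this is not Tabstack's actual crawler name or code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: ExampleExtractorBot
Disallow: /private/

User-agent: *
Disallow: /
"""

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A group that names the bot takes precedence over the wildcard group, so in this example the named bot may fetch /docs but not /private/, while any other agent is blocked everywhere.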

The unified API abstraction on top of scraping is clever. We've hit the selector-maintenance problem building data pipelines where a single HTML change breaks weeks of work.

Curious: Is this something you've experienced yourself with another product? Would love to have your feedback about the first-time experience using - start here:

4 new features and it's our 4th launch! How fun is that! 🚀

Try all the tools and tell us what you think. Really curious but don't have anything to build right now? Here are a few ideas of things to build:

  • Spec watcher that alerts you when TC39 proposals, Node.js, or TypeScript ship a breaking change: one API call per source, diff the rest yourself

  • Competitor intelligence monitor that runs weekly, extracts structured data from any product homepage, and pings you when something changes

  • Podcast prep agent that researches a host's last 20 episodes and returns a one-page brief before you record

  • Vendor due diligence tool that pulls pricing, HN mentions, changelog velocity, and Reddit complaints into a structured brief before you sign a contract

  • CFP discovery agent that scrapes Papercall, Sessionize, and conference homepages and returns open calls filtered to your topic areas

  • Dependency security monitor that reads CVE databases and package changelogs for your exact stack and tells you if you're actually affected—not just "vulnerability found"

  • API docs watcher that diffs the docs for any API you depend on and tells you what changed since last week

  • HN front page tracker that extracts structured data daily—title, score, domain, category—and builds a dataset over time for content strategy

  • Job posting intelligence tool that monitors hiring pages for companies you care about and extracts structured signals: what they're building, what stack they're moving to
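Several of these ideas (the spec watcher, the API docs watcher) come down to "fetch a snapshot, then diff it yourself." A minimal sketch of the diff half, assuming you have already stored last week's and today's page text as plain strings; the function name is hypothetical and only Python's standard difflib is used:

```python
import difflib

def changed_lines(old: str, new: str) -> list[str]:
    """Return the added (+) and removed (-) lines between two text snapshots."""
    diff = list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="last_week",
            tofile="today",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )
    # Skip the two file-header lines, then drop the "@@" hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]
```

An empty list means nothing changed, so the watcher can stay quiet; anything else is the payload for the alert.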

 always be launching 🔥

Very interesting approach. Most web extraction tools eventually struggle when sites change their structure. How does Tabstack handle schema reliability over time without developers constantly updating extraction rules? Is there a point where human intervention is still required, or is the adaptation fully automated?

 Schema-based, not selector-based. You define the fields you want and a short description of each, and the model maps page content by meaning instead of position. So when a site reshuffles its DOM but still shows the same info, nothing on your end changes.

Human intervention comes in when what you want changes, not when the page does. New field, you add it to the schema. And if a page stops carrying something, extract.json returns null for that field instead of failing, so you catch it instead of getting silently wrong data.

So layout churn is handled for you. Deciding what to pull is still yours.
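To make the "null instead of failing" point concrete, here is a small sketch of how a caller might catch fields that stopped appearing on a page. The schema, its field names, and the helper are hypothetical, and the result is a plain dict standing in for a parsed response; the exact request and response shapes of extract.json are not shown in this thread, so treat this as an assumption.

```python
# Hypothetical schema: field names and descriptions are made up for illustration.
PRICING_SCHEMA = {
    "type": "object",
    "properties": {
        "plan_name": {"type": "string", "description": "Name of the pricing plan"},
        "monthly_price": {"type": "number", "description": "Price per month in USD"},
        "free_trial": {"type": "boolean", "description": "Whether a free trial is offered"},
    },
}

def missing_fields(schema: dict, result: dict) -> list[str]:
    """List schema fields that came back as null (None) or are absent."""
    return [name for name in schema["properties"] if result.get(name) is None]
```

Checking for `is None` rather than truthiness matters here: a legitimate `False` or `0` is real data, and only a null should raise a flag.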

Most web extraction tools eventually struggle when sites change their structure.

You're spot on. Is this something you've experienced yourself with another product? Would love to have your feedback about the first-time experience using - get started here:

   Thanks for the detailed explanation! The schema-based approach makes a lot of sense, especially compared to brittle selector-based extraction.

I've seen similar challenges in test automation, where UI changes can break scripts even when the underlying user workflow hasn't changed. It's interesting to see the same problem being solved from a data extraction perspective.

I'm curious, how do you evaluate extraction quality over time? Do you have any automated validation or confidence scoring to detect when the model might be returning plausible but incorrect data?

the 'no scraper to maintain' pitch hits different when you've actually spent time babysitting selectors after a site redesign. schema → JSON output that reliably matches is the right abstraction. curious what the rate limits look like at scale - the mozilla backing + no training on your data is a genuinely good differentiator

 Thanks for the support, Gal! and good question re: rate limits. At , rate limits are per account. Not per API key or endpoint. The plan limits:

  • Trial: 10 requests per minute

  • Individual: 10 requests per minute

  • Team: 25 requests per minute

  • Pro: 100 requests per minute

To learn more about rate limits and usage, . Hope it helps!
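Because the limits are per account rather than per key, every caller sharing the account draws from the same budget. A minimal client-side sketch of staying under a per-minute limit follows; the class is hypothetical, and the sliding 60-second window is an assumption, since the thread does not say how the service measures a minute.

```python
import time
from collections import deque

# Requests-per-minute figures quoted in the reply above.
PLAN_LIMITS = {"trial": 10, "individual": 10, "team": 25, "pro": 100}

class MinuteLimiter:
    """Client-side sliding-window limiter: at most `limit` calls per 60 s."""

    def __init__(self, limit: int, clock=time.monotonic):
        self.limit = limit
        self.clock = clock
        self.sent = deque()  # timestamps of calls made in the last 60 s

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a call was just sent."""
        self.sent.append(self.clock())
```

In use, one limiter instance would be shared by everything that talks to the account: sleep for `wait_time()` seconds, make the call, then `record()` it.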

 Is this something you've experienced yourself with another product btw? Would love to have your feedback about the first-time experience using - start here:

Looking forward to it!

The underrated part is not scraping; it is giving the agent a stable contract back from the web. Agents get much more useful when the web step returns schema and citations instead of a brittle browser transcript that has to be re-interpreted every run.

 Couldn't agree more. A schema beats a raw page dump every time, especially when an agent has to act on it. Appreciate you supporting the launch!

 framing this!

 follow-up question: what should launch next from your perspective? take the poll here in

One API call for structured JSON, markdown, and browser automation is a solid combo. Does the schema validation handle edge cases well when a site's layout changes, or does it need manual updates?

 Great question. The trick is you're not writing selectors. You define a JSON schema for the fields you want, and every call re-reads the page and maps it to that schema. So when a site ships a redesign, there's nothing to patch. No selectors to go stale.

Couple things to know: if a field straight up isn't on the page anymore, you get null back for it instead of the whole call blowing up. And for heavy JS pages, set effort to 'max' so it fully renders first.

So you maintain the schema, not the scraping logic. And the schema only changes when the data you want changes.
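One way a caller might combine the two tips above (null fields and effort 'max') is to try a normal extraction first and retry once with maximum effort only when something comes back null. This is a sketch of a possible client-side pattern, not documented Tabstack behavior: the `extract` callable is a hypothetical stand-in for whatever function actually makes the API call, and only the 'max' effort value comes from the reply above.

```python
from typing import Callable, Optional

def extract_with_fallback(
    extract: Callable[[str, Optional[str]], dict],
    url: str,
) -> dict:
    """Call extract(url, None) first; retry once with effort 'max' on nulls."""
    result = extract(url, None)
    if any(value is None for value in result.values()):
        # A null may mean the page needed full rendering, or the data is gone.
        result = extract(url, "max")
    return result
```

If the retry still returns null for a field, that is the signal that the page no longer carries it, which is the point where a human looks at the schema.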

 Thanks for your support, and great question. Random idea here: a tool to get the JSON Schema for any URL.  wdyt?

Back at it! Love the idea of creating a competitor intelligence monitor that runs weekly, especially knowing it doesn't train on my data.

S/O to for the great work! a real-world example built with -

 thanks Flo. I'm obsessed with Rival. It's 🔥

 launching soon on ? 👀

 you know it!

For launch/community ops, I would use the MCP server to turn docs, changelog pages, and competitor pages into a weekly pre-release brief. The edge case I would test first is source freshness. If a teammate asks the agent to re-check one URL after a cached extraction, can Tabstack force a fresh read for that source while still using cache for the rest?

 Love this use case, and you picked exactly the right edge case to poke at.

Yes, you can. The nocache flag is per-request, not a global mode. A brief like that is really a fan-out of individual extract calls, one per URL, so cache is decided per source. When your teammate wants to re-verify a single page, set nocache: true on that one call and leave it off the others. That source gets a fresh read while the rest of the brief still answers from cache, so you never have to bust the whole run to re-check one link.

If you want everything fresh later, setting nocache: true on every call does that too, but for your scenario the per-source control is exactly what you're after.
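The fan-out described above can be sketched as one request body per URL, with nocache set only on the sources that need a fresh read. The `nocache` flag is the one named in the reply; the `url` field name and the helper function are assumptions for illustration, and no network call is made here.

```python
def build_requests(urls: list[str], refresh: set[str]) -> list[dict]:
    """Build one request body per URL; nocache is set only for URLs in refresh."""
    requests = []
    for url in urls:
        body = {"url": url}
        if url in refresh:
            body["nocache"] = True  # force a fresh read for this source only
        requests.append(body)
    return requests
```

Passing every URL in `refresh` gives the "everything fresh" run; passing an empty set lets the whole brief answer from cache.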

That per-source nocache model is exactly what I was hoping for. For a weekly brief, I’d probably mark changelog and pricing pages fresh while leaving older docs cached, so the fan-out shape makes sense. Thanks for clarifying the boundary.

 awesome! give it a spin and let us know what you think. you can leave a review here:

 follow-up question: what should launch next from your perspective? take the poll here:

Congratulations on your launch today!

 thanks for your support, Thami! what's your favorite new feature in this launch?
