
BrowserAct
Web browser automation for AI agents
1.7K followers
BrowserAct is built for agents using the web. It gives agents a browser layer for real websites, so they can get past blocked pages, adapt to real scenarios, run multiple tasks safely, and return clean web data for reasoning. Use BrowserAct when an agent needs to browse, click, extract, fill forms, upload files, work inside logged-in sites, handle verification, or run repeatable browser workflows.






ZapDigits
Is this any different from @Tabstack by Mozilla?
BrowserAct
@malithmcrdev Yes, a bit different. Tabstack feels more like a managed API for clean structured output, while BrowserAct is more like a browser layer for agents to actually interact with real sites. We focus more on using real Chrome sessions, handling logins/CAPTCHAs, and letting agents keep going when websites get messy.
Congrats on the launch. Does it also bypass the security layer of websites protected by Cloudflare?
@johnsy
Thanks for your thoughtful question!
BrowserAct handles verifications through a tiered workflow.
It runs commands like `solve-captcha` to resolve standard verification challenges automatically.
For steps that demand real human identity or manual input (including logins, 2FA/MFA, OAuth, QR sign-ins, SMS/phone checks, security keys, biometric confirmation, drag CAPTCHAs, and manual approvals), BrowserAct triggers a human handoff via the `remote-assist` feature.
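For readers wiring this into their own agent loop, the tiered decision can be sketched roughly as below. This is only an illustration of the routing idea, not BrowserAct's implementation: the category labels, the `Route` enum, and `route_verification` are all hypothetical names, and only the two command names `solve-captcha` and `remote-assist` come from this thread. It also assumes anything not on the human-only list is tried automatically first.

```python
from enum import Enum


class Route(Enum):
    # Values mirror the two command names mentioned in this thread.
    AUTO_SOLVE = "solve-captcha"
    HUMAN_HANDOFF = "remote-assist"


# Hypothetical labels for the step types the reply lists as needing a person.
HUMAN_ONLY = frozenset({
    "login", "2fa", "mfa", "oauth", "qr-sign-in", "sms-check",
    "security-key", "biometric", "drag-captcha", "manual-approval",
})


def route_verification(kind: str) -> Route:
    """Pick a tier for a verification step from a hypothetical label.

    Assumption: steps outside HUMAN_ONLY are attempted automatically.
    """
    if kind.strip().lower() in HUMAN_ONLY:
        return Route.HUMAN_HANDOFF
    return Route.AUTO_SOLVE
```

An agent would call `route_verification("2FA")` before deciding which command to issue, then fall back to the handoff path if the automatic attempt fails.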
Mailwarm
How does BrowserAct handle verification flows? Does it pause for a human, or does it have any built-in solve options?
@thamibenjelloun Yes. BrowserAct supports both paths.
If a verification flow is supported, the agent can use `solve-captcha` to handle it automatically. For flows that need human identity or judgment, the agent can use `remote-assist` with an objective, so a person can step into the active browser session, complete the check, and let the agent continue from the same state.
Have all aspects of browser operation been tested?
@chen_rex We test all core BrowserAct capabilities, including browser control, state reading, extraction, sessions, isolation, verification handling, and human handoff.
That said, real websites vary a lot, so we’re always expanding edge-case coverage. If you try it on your own workflow, we’d really appreciate your feedback.
Prevention is the right first line. The part I keep getting stuck on is the residual: even with good stealth, some fraction still gets quietly degraded, and at that point the session looks healthy so nothing in the browser state tells the agent the content is junk. The only thing that worked for us was a content-level canary, asserting some value we know a real logged-in response should carry. Does BrowserAct expose anything at that layer, or is the bet that prevention keeps the residual small enough to ignore?
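To make the canary idea concrete, a minimal sketch might look like the following. It is not a BrowserAct feature and uses no BrowserAct API; `check_canaries`, `CanaryResult`, and the marker strings are hypothetical, and it assumes the agent already has the extracted page text as a plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryResult:
    ok: bool
    missing: tuple[str, ...]


def check_canaries(page_text: str, expected_markers: list[str]) -> CanaryResult:
    """Report which known-good markers are absent from the extracted content.

    A marker is any value a genuine logged-in response should carry,
    such as the account name or a label that only appears after login.
    """
    missing = tuple(m for m in expected_markers if m not in page_text)
    return CanaryResult(ok=not missing, missing=missing)
```

The agent would treat a failed check as "content is suspect" and retry or escalate, even though the session itself still looks healthy.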
The browser task that still breaks agents for me is not just the click path. It is proving what happened after a login, upload, or verification handoff.
For agent workflows I would want BrowserAct to return a small run receipt: final URL, key actions taken, files uploaded, form fields changed, and the exact point where a human stepped in. That makes it easier to reuse safely instead of treating each browser run as a black box.
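As a rough illustration of what such a receipt could contain, here is a small sketch. The `RunReceipt` class and its field names are hypothetical and do not describe anything BrowserAct returns today; they simply mirror the items listed above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RunReceipt:
    final_url: str
    actions: list[str] = field(default_factory=list)
    files_uploaded: list[str] = field(default_factory=list)
    fields_changed: dict[str, str] = field(default_factory=dict)
    # Index into `actions` of the first step a human performed, if any.
    human_handoff_step: Optional[int] = None

    def record(self, action: str, human: bool = False) -> None:
        """Append an action and remember where a human first stepped in."""
        self.actions.append(action)
        if human and self.human_handoff_step is None:
            self.human_handoff_step = len(self.actions) - 1

    def to_json(self) -> str:
        """Serialize the receipt so it can be logged or compared across runs."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Storing the handoff point as an index into the action list keeps the receipt compact while still showing exactly which step the person completed.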
BrowserAct
@divvsaxena Thanks for your interest! We do have demo videos available. If you'd like a more in-depth walkthrough, we'd be happy to schedule a video call demo at your convenience.