Codex in Chrome is an extension that lets the Codex app control your browser. It writes code to navigate websites, fill forms, and complete tasks in background tab groups using your active logins.
Browser automation is where coding agents start to feel less like autocomplete and more like actual operators. I'd be most interested in how it handles permissions, reversible actions, and knowing when to stop.
An agent that uses your active logins is the right unlock: most browser-automation tools live or die on auth, not logic. The interesting failure mode is the long tail: forms that work fine in normal flow but break when an automation hits them out of context (custom dropdowns, lazy-loaded fields, third-party widgets). The pattern shows up in a different shape on StoryRoute, a small browser-based travel app I built where users want "narrate this city as I walk it"; the navigation is mostly DOM-walks plus geolocation handoffs, and edge cases are where the experience falls apart. Question: when Codex hits a flow it doesn't recognize, does it fall back to LLM-vision-of-the-page, or does it ask the user to demonstrate the click once?
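The lazy-loaded-field problem this comment describes usually reduces to polling for an element that isn't in the DOM yet when the automation fires. A minimal sketch of a generic wait-for helper, assuming a `lookup` callback that stands in for a real DOM query such as `() => document.querySelector('#city')`:

```javascript
// Poll a lookup function until it returns a truthy value or a deadline passes.
// In a browser extension, lookup would typically wrap document.querySelector;
// here it is any function, so the helper also works outside the DOM.
async function waitFor(lookup, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const result = lookup();
    if (result) return result; // element (or value) has appeared
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`waitFor: nothing found within ${timeoutMs}ms`);
}
```

Polling is the bluntest approach; in a real extension a `MutationObserver` on the form's container avoids busy-waiting, but the timeout-plus-retry shape is the part that makes out-of-context automation survive lazy loading.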
Really cool to see an agent working in background tabs. It's a huge time saver for research. Nice upgrade!
@OpenAI how can I use Codex in Chrome for lead prospecting?
It's really cool to see an actual agent working with a browser. Curious to know how the permissions and all of that would be handled!