Billy Enrizky

OpenBrowser-AI
Building https://openbrowser.me/

Same model, same tasks. 4 browser automation tools used wildly different amounts of tokens. Why?

I watched Claude read the same Wikipedia page 6 times to extract one fact. The answer was right there after the first read. But something about the tool interface kept making it look again.

That got me curious. If every browser automation tool can get the right answer, what actually determines how much it costs to get there?

So I ran a benchmark. 4 CLI browser automation tools. Same model (Claude Sonnet 4.6). Same 6 real-world tasks against live websites. Same single Bash tool wrapper. Randomized approach and task order. 3 runs each. 10,000-sample bootstrap confidence intervals.
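For readers unfamiliar with the technique, a 10,000-sample percentile bootstrap over per-task token counts can be sketched in a few lines (an illustrative sketch, not the benchmark harness itself; function and variable names are mine):

```python
import random

def bootstrap_ci(samples, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean:
    resample with replacement, take the mean of each resample,
    and read off the alpha/2 and 1-alpha/2 percentiles."""
    rng = random.Random(seed)
    n = len(samples)
    means = sorted(
        sum(rng.choices(samples, k=n)) / n
        for _ in range(n_resamples)
    )
    lo = means[int(n_resamples * (alpha / 2))]
    hi = means[int(n_resamples * (1 - alpha / 2))]
    return lo, hi

# e.g. per-task token counts (in thousands) from three runs of one tool
tokens = [42.1, 38.7, 45.3, 40.2, 39.9, 44.0]
low, high = bootstrap_ci(tokens)
```

The appeal for a benchmark with only 3 runs x 6 tasks is that the bootstrap makes no normality assumption about the token distribution.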

The results (average tokens per task / wall time / tool calls):

OpenBrowser-AI - Connect AI agents to the browser through raw CDP

OpenBrowser connects AI agents to the browser through raw CDP, with no abstraction layer. The LLM writes Python in a persistent namespace, batching operations per call. Page state comes in at ~450 characters. Benchmarked against 3 frameworks on 6 real tasks: 100% accuracy across the board, 2.6x fewer tokens, and 59% lower inference costs. The methodology is public and reproducible. MIT licensed. CLI + MCP server. 15 LLM providers. Two published RL studies training open-source models for browser control.
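The persistent-namespace idea is easy to illustrate. A toy sketch (not OpenBrowser's actual implementation; `PersistentSession` and its method names are hypothetical): each tool call executes a model-written snippet in the same dictionary, so variables survive between calls and one call can batch many operations instead of paying a round-trip per step.

```python
import io
import contextlib

class PersistentSession:
    """Toy persistent execution namespace for model-generated Python."""

    def __init__(self):
        self.namespace = {}  # shared dict; survives across tool calls

    def run(self, code: str) -> str:
        """Execute one snippet in the shared namespace; return its stdout."""
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, self.namespace)
        return buf.getvalue()

session = PersistentSession()
session.run("results = []")           # call 1: set up state
session.run("results.append(2 + 2)")  # call 2: reuses `results` from call 1
out = session.run("print(results)")   # call 3: one batched readout
```

In the real tool the namespace would also hold live CDP handles, so a single snippet can navigate, wait, and extract in one tool call rather than three.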