New User left a comment
Hey everyone! I built Clawd Cursor because I was frustrated with how slow and expensive existing AI desktop agents are. They screenshot everything and send it to GPT-4V for every single click. My approach: use screen reader accessibility APIs first. The OS already knows what's on screen — button names, text fields, menu items. Why ask an AI to figure that out from pixels? The result: 80% of...

Clawd Cursor: Desktop AI agent using screen reader APIs, not screenshots
Most AI desktop agents screenshot your screen and send it to a vision model for every action. Clawd Cursor takes a different approach — it uses screen reader accessibility APIs first, falling back to vision only when needed. The result: 80% of tasks need zero LLM calls. It's 6x faster and 30x cheaper than screenshot-based agents. Built with TypeScript, it connects via VNC and uses a smart action router that tries accessibility APIs, then task decomposition, then AI vision as a last resort.
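A minimal TypeScript sketch of that tiered routing idea is below. The `Resolver` interface, the helper `queryAccessibilityTree`, and the stubbed tiers are illustrative assumptions for this example, not Clawd Cursor's actual API; the point is just that each tier is tried in order, so a vision-model call only happens when the cheaper tiers fail.

```typescript
// Sketch of a tiered action router: accessibility tree first, then task
// decomposition, then vision. All names below are illustrative placeholders.

interface DesktopAction {
  kind: "click" | "type";
  target: string;   // accessibility label of the element to act on
  text?: string;    // payload for "type" actions
}

interface Resolver {
  name: string;
  // Attempt the action; resolve true if this tier handled it.
  tryPerform(action: DesktopAction): Promise<boolean>;
}

// Tier 1: query the OS accessibility tree (no LLM call, cheap and fast).
// A real implementation would call platform bindings (AX / UIA / AT-SPI).
async function queryAccessibilityTree(
  label: string
): Promise<{ click(): void } | null> {
  return label === "Submit"
    ? { click: () => console.log(`clicked "${label}" via accessibility API`) }
    : null;
}

const accessibilityResolver: Resolver = {
  name: "accessibility",
  async tryPerform(action) {
    const element = await queryAccessibilityTree(action.target);
    if (!element) return false;
    element.click();
    return true;
  },
};

// Tier 2: break the task into simpler sub-actions and retry tier 1 (stubbed).
const decompositionResolver: Resolver = {
  name: "decomposition",
  async tryPerform(action) {
    console.log(`decomposing task for "${action.target}" (stub)`);
    return false;
  },
};

// Tier 3: screenshot + vision model, used only when cheaper tiers fail (stubbed).
const visionResolver: Resolver = {
  name: "vision",
  async tryPerform(action) {
    console.log(`falling back to vision for "${action.target}" (stub)`);
    return true;
  },
};

async function routeAction(action: DesktopAction): Promise<string> {
  const tiers = [accessibilityResolver, decompositionResolver, visionResolver];
  for (const tier of tiers) {
    if (await tier.tryPerform(action)) return tier.name;
  }
  throw new Error(`no tier could handle action on "${action.target}"`);
}

// Example: this action is resolved by the accessibility tier, zero LLM calls.
routeAction({ kind: "click", target: "Submit" }).then((tier) =>
  console.log(`handled by the ${tier} tier`)
);
```

Routing this way is why most actions never touch a model: the expensive path exists, but it sits behind two cheaper tiers that usually succeed first.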

