Launching today

Rosply
AI agent that controls your computer autonomously
52 followers
Rosply is an AI agent that sees your screen and controls your computer. It moves the mouse, types, clicks, and completes real tasks autonomously on Windows, Mac, and Linux. It works with any vision-capable model via OpenRouter, so you are never locked into one provider. Native Claude Code and MCP integration lets developers plug it into their workflow as an agent that executes tasks instead of only writing code.



Rosply
The hard part with screen-driving agents is not just whether they can click/type; it is whether the run stays legible after 20 autonomous steps.
For a tool like this I’d want a tiny run summary: apps touched, last action, waiting-on-human state, time spent, and maybe a loop/stuck signal. That is usually the difference between “neat demo” and something I would trust to leave running.
Are you thinking about per-run history or guardrails for when Rosply gets into a repetitive loop?
Rosply
@jaemin_song Good question, and you're right that legibility is the real differentiator here, not just whether it can click.
On loops: Rosply already has loop detection built in and pulls itself out if it detects it's repeating the same failed action.
On the broader run summary idea (apps touched, last action, waiting-on-human state, time spent): that's not fully there yet in the way you're describing it. Right now you get a live step-by-step log of what it's doing, but not a condensed "state" view that tells you at a glance whether it's stuck or just slow. That's a genuinely good idea for something I'd want to leave running unattended, so I'm going to think about adding it.
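To make the condensed "state" view concrete, here is a minimal Python sketch of what such a summary could track. It is purely illustrative and not part of Rosply today: the `RunSummary` class, its field names, and the `stuck_after` threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunSummary:
    """Hypothetical condensed view of one agent run (not Rosply's real data model)."""
    apps_touched: List[str] = field(default_factory=list)
    last_action: Optional[str] = None
    waiting_on_human: bool = False
    seconds_elapsed: float = 0.0
    repeated_failures: int = 0  # consecutive failures of the same action

    def record(self, app: str, action: str, ok: bool, seconds: float) -> None:
        """Fold one step from the live log into the summary."""
        if app not in self.apps_touched:
            self.apps_touched.append(app)
        if not ok and action == self.last_action:
            # Same action failed again: extend the failure streak.
            self.repeated_failures += 1
        elif not ok:
            # A different action failed: start a new streak.
            self.repeated_failures = 1
        else:
            self.repeated_failures = 0
        self.last_action = action
        self.seconds_elapsed += seconds

    def status(self, stuck_after: int = 3) -> str:
        """Return 'waiting', 'stuck', or 'running' for an at-a-glance view."""
        if self.waiting_on_human:
            return "waiting"
        if self.repeated_failures >= stuck_after:
            return "stuck"
        return "running"
```

Calling `record(...)` once per logged step and reading `status()` would be enough to tell "stuck" apart from "just slow" without scrolling the full log.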
What does it use under the hood to control the computer?
Rosply
@devakash Great question! Under the hood, Rosply works in three main steps:
1. Vision: It takes a screenshot of your screen and overlays a coordinate grid, which lets the AI model read exact pixel positions instead of guessing.
2. Reasoning: That screenshot gets sent to a vision-capable AI model (you can pick from multiple models via OpenRouter, run it locally with Ollama, or even plug in Claude Code as the brain). The model looks at the screen and decides the next action: click, type, scroll, drag, etc.
3. Action: Rosply executes that action on your actual mouse/keyboard using low-level system calls, then takes a new screenshot to see the result and loops back to step 1.
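The three steps above can be sketched as a simple loop. This is a rough illustration under stated assumptions, not Rosply's actual implementation: `capture`, `decide`, and `execute` are hypothetical callables standing in for the screenshot, model, and input layers, and the `Action` shape and `"done"` signal are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str         # e.g. "click", "type", "scroll", "drag", or "done"
    detail: str = ""  # e.g. coordinates or text; this format is hypothetical


def run_agent(
    capture: Callable[[], bytes],
    decide: Callable[[bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Screenshot -> model decision -> action, repeated until done or capped."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                # step 1: see the screen
        action = decide(screenshot, history)  # step 2: model picks the next action
        if action.kind == "done":             # assumed stop signal from the model
            break
        execute(action)                       # step 3: drive mouse/keyboard
        history.append(action)                # the next iteration sees the result
    return history
```

Passing the three layers in as callables keeps the loop itself independent of which model or input backend is plugged in, and `max_steps` gives a hard stop even if the model never signals completion.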
On top of that there's persistent memory (so it doesn't lose context across steps), loop detection (so it doesn't get stuck repeating the same failed click), and a coarse-to-fine grounding system for more accurate clicking.
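For a sense of how loop detection of this kind can work, here is a minimal sketch. It assumes a "failed" step is one where the action had no visible effect, and that a loop means the same failed action several times in a row. The `LoopDetector` class and its `window` parameter are hypothetical, and Rosply's real detector may work differently.

```python
from collections import deque
from typing import Deque, Tuple


class LoopDetector:
    """Flags a run that keeps repeating the same failed action (illustrative only)."""

    def __init__(self, window: int = 3) -> None:
        self.window = window
        # Keep only the most recent `window` steps as (action, succeeded) pairs.
        self.recent: Deque[Tuple[str, bool]] = deque(maxlen=window)

    def observe(self, action: str, succeeded: bool) -> bool:
        """Record one step; return True if the last `window` steps were
        all the same action and all failed."""
        self.recent.append((action, succeeded))
        if len(self.recent) < self.window:
            return False
        first_action = self.recent[0][0]
        return all(a == first_action and not ok for a, ok in self.recent)
```

When `observe` returns True, the agent could switch strategy, for example by asking the model for a different approach or pausing for the user.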
Everything's also written up in detail in the docs if you want the full breakdown: rosply.com/docs
Happy to answer anything else!
Is it local-first, or is there a server component?
Rosply
@divya_kothari1 Local-first. Rosply runs entirely on your machine: screenshots, action execution, and the loop all happen locally. The only external call is to whichever AI model you choose as the brain: OpenRouter if you want cloud models, or Ollama running locally if you want zero external calls at all. No backend of mine in the middle, no telemetry, no accounts.
Hey, it's a really cool product tbh. Can I get a trial?
If anything, what I would do is create a closed-source product and expose its API through an interface that people can connect to for a monthly subscription, instead of giving away the source code.
I would be very happy if you would allow me to have a free trial or something without having me pay anything, because I just want to know if it can really help me or not.