Launching today

SlimSnap
Your AI doesn't know which button you mean
123 followers
The AI reads your screenshot as a pixel blob and guesses which button you meant. SlimSnap converts the screenshot plus your annotation into structured JSON: every element has coordinates and an ID, and your arrow points at a specific one. Around 700 tokens vs 1,568 for the raw screenshot on Sonnet. Free Mac app. The schema and Claude Code skill are open source under the MIT license. Runs entirely on-device.

SlimSnap
This is a real pain with Claude Code and Cursor. The agent usually understands the general UI, but still touches the wrong element. Does SlimSnap keep enough context when there are multiple similar buttons or inputs on the same screen?
SlimSnap
@farrukh_butt1 Yes, exactly the case the schema was built for. Each element gets a unique ID regardless of how visually similar it is to others. OCR text + bbox coordinates + (if present) parent context disambiguate the duplicates. So if there are five "Submit" buttons on the screen, they show up as e_button_5, e_button_8, e_button_11 (or whatever IDs they get), and your arrow annotation points at exactly one of them.
The edge case where it still struggles: identical floating elements with no surrounding container or distinguishing text (rare but possible in canvas-based apps). For 95% of UI work, the ID + bbox + annotation combo holds up.
What kind of UI are you hitting this with most? Cursor with React forms? Claude Code with admin dashboards? Useful for prioritizing where to harden the schema.
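For anyone curious how an arrow can land on exactly one of several identical buttons, here is a minimal sketch of the idea, not SlimSnap's actual implementation. The `Element` class, the `[x, y, width, height]` bbox layout, and the "smallest containing box wins" rule are all assumptions made for illustration; only the ID style (`e_button_5` and so on) comes from the reply above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    """Hypothetical element record; field names are assumed, not the real schema."""
    id: str
    text: str
    bbox: tuple[float, float, float, float]  # assumed layout: x, y, width, height


def contains(bbox: tuple[float, float, float, float], x: float, y: float) -> bool:
    """Return True if the point (x, y) lies inside the bounding box."""
    bx, by, bw, bh = bbox
    return bx <= x <= bx + bw and by <= y <= by + bh


def resolve_target(elements: list[Element], tip_x: float, tip_y: float) -> Optional[str]:
    """Map an arrow tip to one element ID, or None if it hits nothing."""
    hits = [e for e in elements if contains(e.bbox, tip_x, tip_y)]
    if not hits:
        return None
    # Smallest area wins, so a button beats any container drawn around it.
    return min(hits, key=lambda e: e.bbox[2] * e.bbox[3]).id


# Three visually identical "Submit" buttons that differ only in position.
elements = [
    Element("e_button_5", "Submit", (40, 100, 120, 32)),
    Element("e_button_8", "Submit", (40, 300, 120, 32)),
    Element("e_button_11", "Submit", (40, 500, 120, 32)),
]

print(resolve_target(elements, 90, 310))
```

The text is the same on all three buttons, so position is what separates them in this sketch. That is also why identical floating elements with no container or distinguishing text are the hard case.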
SlimSnap
One follow-up question for anyone scrolling: when you paste a screenshot into your AI tool (ChatGPT, Claude, Cursor, Lovable, whatever), what's the #1 thing the AI gets wrong about it? Trying to figure out which gap to close next.
SlimSnap
@montverde That's the exact failure mode the target_ref field tries to address. When you annotate the misaligned button and the agent sees annotation.target_ref = e_button_3, it has a stronger anchor for what to touch and what to leave alone. Doesn't eliminate scope creep entirely (the agent still decides whether layout shifts are necessary), but it shifts the default from "rewrite the whole component" toward "fix the specific element referenced."
The backtracking compounds in longer sessions. Which AI tool does this happen with most for you? Different agents handle scope differently, and knowing that helps me figure out which auto-loader to build next.
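To make the `target_ref` idea concrete, here is a rough sketch of how a tool on the agent side might use it. Only `annotation.target_ref` and the `e_button_3` ID come from the reply above. The rest of the JSON shape (`elements`, `type`, `text`, `bbox`) and both helper functions are hypothetical, so check the published schema for the real field names.

```python
import json

# Hypothetical capture; apart from annotation.target_ref the field names are assumed.
capture_json = """
{
  "elements": [
    {"id": "e_button_3", "type": "button", "text": "Save", "bbox": [412, 220, 96, 32]},
    {"id": "e_input_4", "type": "input", "text": "Email", "bbox": [120, 220, 280, 32]}
  ],
  "annotation": {"type": "arrow", "target_ref": "e_button_3"}
}
"""


def find_target(capture: dict) -> dict:
    """Return the element that annotation.target_ref points at."""
    target_id = capture["annotation"]["target_ref"]
    by_id = {el["id"]: el for el in capture["elements"]}
    if target_id not in by_id:
        raise KeyError(f"target_ref {target_id!r} matches no element")
    return by_id[target_id]


def scoped_instruction(capture: dict, request: str) -> str:
    """Build a prompt that names the one element the user marked."""
    target = find_target(capture)
    return (
        f"{request} Change only element {target['id']} "
        f"({target['type']}, text {target['text']!r}); "
        "leave other elements untouched unless the fix requires it."
    )


capture = json.loads(capture_json)
print(scoped_instruction(capture, "Fix the alignment."))
```

Naming the element explicitly gives the agent a stated default scope. As noted above, it can still decide that a wider layout change is necessary.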
SlimSnap
@montverde Yeah, that matches what I've seen. Claude tends to respect the "change only this" intent better than GPT does, even before SlimSnap. With the Claude Code skill the loop gets tighter still: it auto-loads the latest capture so you don't even paste the JSON, just type "fix what I marked" and the agent reads the spec.
Curious if you're on Claude Code specifically or claude.ai / API. If it's Claude Code, the skill is at github.com/bickov/slimsnap-skill, MIT, install instructions in the README.