WebMarker

Mark web pages for use with multimodal large language models

WebMarker adds labeled visual markings to elements on a web page. These markings can be used for Set-of-Mark (SoM) prompting, which improves the visual grounding abilities of vision-language models such as GPT-4o, Claude 3.5, and Google Gemini 1.5.
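To make the idea concrete, here is a minimal sketch of the bookkeeping behind Set-of-Mark prompting: give each visible element a short label, list the labels for the model, and map the model's chosen label back to an element. It is not WebMarker's actual API. The types and the functions `assignMarks`, `describeMarks`, and `resolveLabel` are hypothetical names made up for illustration, and the sketch assumes element bounding boxes have already been collected from the page.

```typescript
// Hypothetical types for illustration only; not part of WebMarker.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface PageElement {
  selector: string; // how the element can be found again later
  box: Box;         // bounding box measured on the rendered page
}

interface Mark {
  label: string;    // the short label drawn next to the element
  selector: string;
  box: Box;
}

// Give every element with a non-empty box a sequential numeric label.
// Elements with zero width or height are skipped, on the assumption
// that they are not visible in a screenshot.
function assignMarks(elements: PageElement[]): Mark[] {
  return elements
    .filter((el) => el.box.width > 0 && el.box.height > 0)
    .map((el, i) => ({ label: String(i), selector: el.selector, box: el.box }));
}

// Produce a plain-text legend that can accompany the marked screenshot
// in the prompt sent to a vision-language model.
function describeMarks(marks: Mark[]): string {
  return marks.map((m) => `[${m.label}] ${m.selector}`).join("\n");
}

// Map a label named in the model's answer back to an element selector.
// Returns undefined when no mark carries that label.
function resolveLabel(marks: Mark[], label: string): string | undefined {
  for (const m of marks) {
    if (m.label === label) {
      return m.selector;
    }
  }
  return undefined;
}
```

In this sketch, a model that answers "click 1" is pointing at whichever element received the label `1`, so `resolveLabel(marks, "1")` gives back the selector to act on. Drawing the labels onto the page is left out; that is the part a tool like WebMarker handles.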
Free