VDA (Virtual Desktop Assistant) is a powerful, local, AI-driven assistant that controls your computer like a human. It features advanced vision-based UI element detection (OCR & SIFT), voice commands, and smart agent loops. Built with Python & PyQt6. - PCJIRON/VDA
I’m incredibly excited to introduce VDA (Virtual Desktop Assistant) today! 🚀
As developers and power users, we do a lot of repetitive tasks on our computers. While there are many AI agents out there, most of them are cloud-dependent, expensive, or struggle with non-standard UI elements. I built VDA to solve this by creating a 100% open-source, local, and vision-driven desktop companion.
🤔 What is VDA? VDA controls your computer exactly like a human does—by seeing the screen, moving the mouse, and typing on the keyboard. You can instruct it via voice or text, and it figures out the rest!
✨ Key Features:
👁️ Vision-First Approach: Instead of relying purely on DOM/HTML, VDA uses OpenCV, UI Automation, and RapidOCR to locate elements on the screen and click exactly where it needs to.
🧠 Smart Error Recovery: If a button moves or changes, VDA's agent loop re-evaluates the live screenshot and tries a fallback, such as a keyboard shortcut (Ctrl+L or Tab).
🔌 Bring Your Own AI: Connect your favorite provider seamlessly—OpenCode Zen, OpenRouter, Gemini, or DeepSeek.
🛠️ Customizable Skills: Teach VDA your specific workflows by simply uploading a Markdown (.md) file.
👻 Non-Intrusive UI: A sleek, translucent PyQt6 floating widget that stays out of your way.
💻 Tech Stack: Python 3.9+, PyQt6, OpenCV, RapidOCR.
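To make the error-recovery idea above concrete, here is a minimal Python sketch of a retry-with-fallback loop. It is an illustration only, not VDA's actual code: `Strategy`, `run_step`, and the `capture` callback are hypothetical names, and the screen is modeled as a plain string rather than a real screenshot.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Strategy:
    """One way of performing a step (hypothetical structure, for illustration)."""
    name: str
    # Receives the latest screen capture and returns True if the action succeeded.
    attempt: Callable[[str], bool]


def run_step(
    capture: Callable[[], str],
    strategies: List[Strategy],
    max_rounds: int = 2,
) -> Optional[str]:
    """Try each strategy in order, re-capturing the screen before every attempt.

    Returns the name of the first strategy that succeeds, or None if all fail.
    """
    for _ in range(max_rounds):
        for strategy in strategies:
            screen = capture()  # always act on a fresh view of the screen
            if strategy.attempt(screen):
                return strategy.name
    return None


if __name__ == "__main__":
    # Simulated screen where the expected button is not visible.
    strategies = [
        Strategy("click_button", lambda screen: "Go button" in screen),
        Strategy("shortcut_ctrl_l", lambda screen: True),  # fallback always works here
    ]
    print(run_step(lambda: "address bar hidden", strategies))
```

In this toy run the click strategy fails because the button text is absent from the simulated screen, so the loop falls through to the keyboard-shortcut strategy. A real agent would replace the string with a screenshot and the lambdas with vision and input calls.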
VDA is fully open-source (GNU GPLv3), and I’m actively looking for contributors and feedback to make it the ultimate desktop assistant.
I’d love to hear your thoughts and feedback, and which specific workflows you would want to automate with VDA! Drop your questions below. I’ll be hanging out here all day answering them. 👇
Cheers! ☕