vAquilla

Deploy local LLMs with smart and auto GPU management

vAquilla is an open-source AI model inference manager. It combines the simplicity of a CLI with the production performance of vLLM and the isolation of Docker, backed by smart, automated GPU management. It orchestrates everything for you: like an eagle soaring over your infrastructure, it analyzes your GPU state in real time, calculates a safe memory ratio, and deploys the vLLM Docker container invisibly and securely.
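The core idea described above — read the GPU's current memory state, derive a utilization ratio, and launch vLLM in Docker with that ratio — can be sketched roughly as follows. This is a hypothetical illustration, not vAquilla's actual implementation; the function names, the 1 GiB headroom, and the 0.95 cap are assumptions, and a real manager would obtain `total_mib`/`free_mib` by querying `nvidia-smi`.

```python
import shlex


def gpu_memory_ratio(total_mib: int, free_mib: int, headroom_mib: int = 1024) -> float:
    """Fraction of total GPU memory vLLM may claim.

    Leaves a safety headroom (hypothetical default: 1 GiB) and caps the
    ratio at 0.95, since vLLM needs some memory left for CUDA context.
    """
    usable = max(free_mib - headroom_mib, 0)
    ratio = usable / total_mib
    return round(min(ratio, 0.95), 2)


def vllm_docker_command(model: str, ratio: float, port: int = 8000) -> str:
    """Assemble a docker run invocation for the official vLLM server image."""
    args = [
        "docker", "run", "--rm", "--gpus", "all",
        "-p", f"{port}:8000",
        "vllm/vllm-openai:latest",
        "--model", model,
        "--gpu-memory-utilization", str(ratio),
    ]
    return shlex.join(args)


# Example: a 24 GiB card with 20 GiB currently free.
ratio = gpu_memory_ratio(total_mib=24576, free_mib=20480)
print(vllm_docker_command("meta-llama/Llama-3.1-8B-Instruct", ratio))
```

The point of computing the ratio from *free* rather than *total* memory is that other processes may already occupy the GPU; deploying with a fixed utilization value would then cause vLLM to fail at startup.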