Chameleon

Run any LLM on demand — zero idle VRAM.

Chameleon is a stateless AI runtime that becomes any LLM on demand. Instead of keeping models resident, it routes each request to the best-suited model, loads it just-in-time, executes, and fully unloads, leaving zero idle VRAM. One runtime serves multiple models without wasted memory or system restarts.
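The request lifecycle described above can be sketched as a tiny Python loop. All names and loader functions here are illustrative assumptions, not Chameleon's actual API:

```python
# Hedged sketch of a just-in-time model lifecycle: route a request,
# load the chosen model, execute, then fully unload.
from typing import Callable, Dict

# Hypothetical registry mapping model names to loader functions.
# Real loaders would allocate VRAM; these stand-ins just return callables.
MODEL_LOADERS: Dict[str, Callable[[], Callable[[str], str]]] = {
    "small-chat": lambda: (lambda prompt: f"[small-chat] {prompt}"),
    "code-gen": lambda: (lambda prompt: f"[code-gen] {prompt}"),
}

def route(prompt: str) -> str:
    """Pick the best model for a request (trivial keyword heuristic here)."""
    return "code-gen" if "code" in prompt else "small-chat"

def handle_request(prompt: str) -> str:
    """Load just-in-time, execute, then fully unload so no memory stays idle."""
    name = route(prompt)
    model = MODEL_LOADERS[name]()   # just-in-time load
    try:
        return model(prompt)        # execute
    finally:
        del model                   # fully unload; nothing stays resident

print(handle_request("write some code"))
print(handle_request("hello there"))
```

The key design point is the `try`/`finally`: the unload step runs even if execution fails, so no model lingers in memory between requests.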

Chameleon makers

Here are the founders, developers, designers, and product people who worked on Chameleon.