
MiniCPM
Ultra-efficient on-device AI, now even faster
154 followers
MiniCPM 4.0 is a family of ultra-efficient, open-source models for on-device AI. It delivers significant speed-ups on edge chips and strong performance, and includes highly quantized BitCPM versions.
This is the 5th launch from MiniCPM.
MiniCPM-o 4.5
Launching today
A 9B omni-modal model that sees, listens, and speaks simultaneously. Features full-duplex streaming (no turn-taking lag) and proactive interaction. Outperforms GPT-4o on vision benchmarks. Runs locally via llama.cpp & Ollama.

Free
Launch Team

Hi everyone!
True "listen-while-speaking" capability is still hard to get right, especially for open models. Interruptions often don't feel quite seamless.
MiniCPM-o 4.5 fixes this with full-duplex. It listens while it speaks, so you can interrupt it naturally. It feels much more like a real conversation.
The crazy part is that it does this locally (it's only 9B!). It supports llama.cpp and Ollama out of the box, so you can run it on your own device with no network latency.
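If you want to poke at it from code, here's a minimal sketch using the Ollama Python client. The model tag `minicpm-o` is my guess, so swap in whatever the official registry entry ends up being, and note this only exercises the text side, not the full audio/vision pipeline:

```python
# Minimal sketch: streaming chat with MiniCPM-o 4.5 through the Ollama
# Python client (pip install ollama). The "minicpm-o" tag is an
# assumption -- replace it with the actual tag from the Ollama library.
import ollama

stream = ollama.chat(
    model='minicpm-o',  # hypothetical tag
    messages=[{'role': 'user',
               'content': 'Summarize full-duplex speech in one line.'}],
    stream=True,  # tokens arrive as they are generated
)

# Print the response incrementally; a real app could stop reading
# mid-stream when the user barges in.
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
print()
```

Streaming is the part that matters here: because tokens arrive incrementally, a wrapper app can abandon the stream the moment the user interrupts, which is the text-level analogue of the model's full-duplex behavior.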
Worth a try if you want a Gemini Live-style experience, but offline :P