MiniCPM

Ultra-efficient on-device AI, now even faster

MiniCPM 4.0 is a family of ultra-efficient, open-source models for on-device AI. It delivers significant speed-ups on edge chips and strong performance, and includes highly quantized BitCPM versions.
This is the 5th launch from MiniCPM.

MiniCPM-o 4.5

Launching today
Real-time, full-duplex multimodal AI on your device
A 9B omni-modal model that sees, listens, and speaks simultaneously. Features full-duplex streaming (no turn-taking lag) and proactive interaction. Outperforms GPT-4o on vision benchmarks. Runs locally via llama.cpp & Ollama.
MiniCPM-o 4.5 gallery images
Free
Launch Team

Zac Zuo

Hi everyone!

True "listen-while-speaking" capability is still hard to get right, especially for open models. Interruptions often don't feel quite seamless.

MiniCPM-o 4.5 fixes this with full-duplex. It listens while it speaks, so you can interrupt it naturally. It feels much more like a real conversation.

The crazy part is that it does this locally (it's only 9B!). It supports llama.cpp and Ollama out of the box, so you can run it on your own device with no cloud round-trip latency.
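If you've already pulled a GGUF build into Ollama, a first local test can be as simple as this sketch with the ollama Python client (pip install ollama). The "minicpm-o4.5" tag and photo.jpg are placeholders, not the official names; use whatever tag you actually pulled.

import ollama

# Stream a multimodal chat turn locally. Token chunks print as they are
# generated, which is what makes the low-latency feel possible on-device.
stream = ollama.chat(
    model="minicpm-o4.5",  # placeholder tag; substitute your pulled model
    messages=[{
        "role": "user",
        "content": "What is in this picture?",
        "images": ["photo.jpg"],  # path to a local image file
    }],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)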

Worth a try if you want a kinda Gemini Live experience but offline :P