A 9B omni-modal model that sees, listens, and speaks simultaneously. Features full-duplex streaming (no turn-taking lag) and proactive interaction. Outperforms GPT-4o on vision benchmarks. Runs locally via llama.cpp & Ollama.
MiniCPM 4.1 is an 8B open-source model designed for edge devices. Built on a novel trainable sparse-attention architecture, it brings efficient "deep thinking" and long-context capabilities to on-device AI, achieving state-of-the-art performance for its size.
MiniCPM 4.0 is a family of ultra-efficient, open-source models for on-device AI. It delivers significant speed-ups on edge chips and strong performance, and includes highly quantized BitCPM variants.