
Soniqo Speech
On-device speech AI for Mac, Windows, Linux & Android
61 followers
Every speech capability you'd normally rent from a cloud API — transcription, expressive TTS, voice cloning, speaker-aware diarization, denoising, full-duplex speech-to-speech — all on-device on Apple Silicon, Windows, Linux, and Android, via MLX, CoreML, ONNX Runtime, and LiteRT. Ships a CLI, a local HTTP server, and Swift/Kotlin/C++ APIs. Plus Speech Studio, a new open-source desktop voice-cloning app for creators (macOS to start). Apache 2.0.
This is the 2nd launch from Soniqo Speech.
Soniqo Speech
Launched this week
Build voice agents on a complete on-device speech stack: ASR (NVIDIA Nemotron, multilingual + streaming), TTS, voice cloning, diarization, denoising, and full-duplex speech-to-speech (NVIDIA PersonaPlex) — plus a voice agent pipeline for turn-taking, interruptions, and queuing. Runs on Mac, Windows, Linux, and mobile (iPhone + Android), with NPU-optimized inference (CoreML, NNAPI). Swift, Kotlin, and C++ APIs. Plus Speech Studio, a desktop voice-cloning app for creators. Apache 2.0, on-device.


Free


That's awesome, had no idea this was possible on-device. What kind of devices have you tested on? What are the hardware requirements?
@willsmithte It depends on the model. Some of the basic ones can run on Android or iPhone, while the heavier ones need a Mac with 16 GB to 32 GB. For example, Kokoro runs on a MacBook Air, and so does NVIDIA Nemotron. PersonaPlex might need at least 8 GB with int4 quantization, and int4 also degrades the model a bit, so int8 with at least 16 GB is the better choice.
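As a rough illustration of why the quantization level drives those memory figures, here is a minimal sketch that estimates the size of the weights alone at different bit widths. The parameter count is a hypothetical value chosen for illustration, not a published figure for PersonaPlex or any other model mentioned here, and the helper name `weight_memory_gib` is made up for this example. Real memory use is higher than this estimate, since activations, caches, and the runtime itself also take space.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# HYPOTHETICAL parameter count, used only to show the scaling.
ASSUMED_PARAMS = 7e9

if __name__ == "__main__":
    for bits in (4, 8, 16):
        gib = weight_memory_gib(ASSUMED_PARAMS, bits)
        print(f"{bits}-bit weights: ~{gib:.1f} GiB")
```

Halving the bit width halves the weight footprint, which is why an int4 build can fit on a smaller device than an int8 build of the same model, at some cost in quality.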
Really proud to see Soniqo Speech live on Product Hunt.
The goal here was to build a practical on-device speech stack for developers, not just another wrapper around a cloud API. It includes ASR, TTS, voice cloning, diarization, denoising, and speech-to-speech, exposed through APIs that can be used in real products.
For voice agents, creator tools, accessibility, automotive, and embedded use cases, running locally makes a real difference: lower latency, offline support, better privacy, predictable costs, and more product control.
Excited to see what people build with it.