Maker
Hey Product Hunt!
I created ctrlSPEAK to fill the gaps I found in other speech-to-text tools as a product builder. Here’s why it stands out:
Open-Source: 100% transparent—dive into the code on GitHub.
Local: Built with NVIDIA’s Parakeet model, which offers a great balance of speed and accuracy.
Offline: Runs locally, keeping your data private and secure.
Easy Setup:
brew tap patelnav/ctrlspeak && brew install ctrlspeak
Give it a spin if you’re after a straightforward, developer-friendly tool. I’d love to hear your thoughts!
Hey Nav,
Congrats on making this work for your needs!
I'm really curious though: what is the advantage over using a phone that does this natively without many errors? I mean, I'm using a 4-year-old Google Pixel 6 and it has never failed at voice-to-text transcription. It even accepts punctuation such as commas by voice command, and of course it runs locally on Google silicon.
I'm not saying that your app could not be useful, I just had this question in my mind.
Best of luck with your launch,
The Wolf
Maker
@peterwolf Yup, Google has always been ahead of the curve on speech-to-text quality.
You can run this build with NVIDIA's Canary model. It comes with punctuation, and is more accurate, but much slower. Here are my benchmarks.
The goal of the project is twofold: when a newer/better model comes out, it can easily be plugged in, and the code is transparent, so you know no shenanigans are happening with your data.
All that said, if Apple's Dictation wasn't atrocious, I would trust it, and we likely wouldn't be here. 😅
@navpatel I get your point about Apple. I've never owned an iPhone, so I genuinely had no idea. But that market is quite big, so I can see some space for your app for sure.
Keep pushing!