Creating with text-to-speech software, even with the best AI tools like ElevenLabs, is a highly iterative process: trying out voices, editing scripts for pacing and pronunciation, and so on, especially if you need multiple speakers for a podcast, an audiobook, or a radio play with multiple characters. Vois.so supports multiple speakers, with automatic recognition when importing scripts. Very few TTS systems do that at present. Combine that with its killer feature: it runs locally on your computer and does not charge on a token or time basis, just a very reasonable fixed monthly or annual cost. So however many interactions or generations you run, or however much text you throw at it, the cost is the same. This is new software, with a responsive developer who is actively supporting users and has considerable plans to build on a strong foundation. At a monthly cost that is a fraction of its competitors', this is well worth trying. There is even a free version with unlimited text and 10 generations a day, and full audio export for just $5 a go. I have been using this software for a few weeks now and can highly recommend it.
Vois
@praney_behl Hi Praney. Congrats on the launch. What datasets were used to train the voice models?
Vois
@kimberly_ross Thanks! Great question. The TTS engines use models trained on publicly available speech datasets commonly used in speech synthesis research: clean, studio-quality speech corpora.
The 63+ production voices in the library were created using voice design techniques (generating voice characteristics from text descriptions) - they're not clones of real people.
For the voice cloning feature, the app requires users to confirm they have the voice owner's explicit consent before processing.
Happy to go deeper on any of this!
The no-uploads angle is the one that would actually sell me. I never loved the idea of sending scripts to a cloud service just to get audio back. How does the voice quality hold up on longer-form content like a full chapter of an ebook? That's usually where these tools start to sound robotic.
Vois
@zerodarkhub Nice. Well, you can go virtually as long as you want. Vois has a built-in complex optimization and memory management module that keeps everything in check and functional. The scripts functionality lets you split chapters individually, not only for management but also for export control. Ultimately, the time it takes to export comes down to how powerful the computer running Vois is, but Vois has been optimized to work at decent speeds even on older computers. As a matter of fact, the narration in all the Vois tutorials and demo videos was created within Vois itself. The background music and effects were also done within the Vois app. Vois is free to try!
https://vois.so/tutorials
Sounds amazing, this could tie into my voice server for Claude Code.
One question, does it support African voices and intonations? Big gap here industry-wide.
Vois
@clement_ozemoya Absolutely, we are launching an agent skill and an accompanying vois-cli for programmatic access soon.
Super! Your back story is inspiring, and congrats on the launch. Will give it a shot and let you know my feedback :)
Vois
@abhinavramesh Thanks Abhinav, I look forward to it. I hope you enjoy trying the app as much as I enjoyed building it.