Muyan-TTS

Open-source, high-quality TTS for podcasts & voice cloning

Muyan-TTS is an open-source TTS model for podcasts, trained on 100k+ hours of audio. It offers high-quality zero-shot voice generation and speaker adaptation from minutes of speech.
Free

Zac Zuo

Hi everyone!

There's a new open-source text-to-speech model called Muyan-TTS, from the MYZY-AI team, designed specifically for podcast applications.

What's notable is that Muyan-TTS was pre-trained on over 100,000 hours of podcast audio. This allows it to generate high-quality voices zero-shot, meaning it can take a short audio sample and generate speech in that voice without any additional training. For more customized voices, the fine-tuned version (Muyan-TTS-SFT) can adapt to a specific speaker with just dozens of minutes of their audio. The team has also been transparent about development, mentioning that the model was built within a ~$50k budget.

The models (both base zero-shot and the SFT version for speaker adaptation) and training code are all released.