Higgs Audio v2 - Lifelike, emotionally competent voice generation
Higgs Audio v2 by BosonAI is an open-source audio foundation model. It excels at generating expressive, multi-speaker dialogues and long-form audio. It outperforms gpt-4o-mini-tts on the Emotions category of the EmergentTTS-Eval benchmark and is now available to developers.

Replies
Flowtica Scribe
Hi everyone!
BosonAI has open-sourced Higgs Audio v2, a new and powerful audio foundation model.
It’s built for creating lifelike, multi-speaker conversations and long-form audio. The model also has a strong focus on emotional expression, achieving a 75.7% win rate over gpt-4o-mini-tts on the Emotions category of the EmergentTTS-Eval benchmark.
Because it's open source, it's very accessible. The smaller models can even run on a Jetson Orin Nano!