Fish Audio

Expressive Text-to-Speech and Voice Cloning

4.5
6 reviews

995 followers

Fish Audio is an expressive, emotionally rich text-to-speech model. It generates lifelike voices that capture emotion, rhythm, and nuance with remarkable realism. Fish Audio Voice Clone recreates a natural voice from just 10 seconds of audio, preserving accent, tone, and speaking habits. It is proudly built by the open-source team behind So-VITS-SVC and Bert-VITS2, with the aim of giving a soul to every voice.
This is the 4th launch from Fish Audio.
Fish Audio S2

Launching today
Real Expressive AI Voices
We've open-sourced Fish Audio S2, a new generation of expressive TTS that lets you direct voices with natural language. Add cues like [whisper] or [laughing nervously], generate multi-speaker dialogue in one pass, and create scary-real voices across 80+ languages.
Free