Realtime TTS-2 - Voice AI that feels as good as it sounds

Raycast

•28d ago

Realtime TTS 1.5 is #1 on Artificial Analysis, voted best in blind tests by thousands of real users. TTS-2 builds on that with six major upgrades, including natural language voice direction for tone, emotion, speed, and pitch; text-based voice design, where you describe a voice in words and generate it; cross-lingual synthesis across 100+ languages that preserves speaker identity; IPA phonetic control for brand names and rare words; and improved alphanumeric pronunciation. Try it free at inworld.ai/tts.

Replies

So I tried it, speech to speech. It confuses itself and hallucinates very quickly with just basic questions and conversation. I asked both bots how are you, what are you doing today, and what are you doing for dinner. Both gave me a completely different spectrum of answers. They gave a lot of filler responses like hey, hmm, huh, which I can understand why those are there. But Jason started telling me how to increase the gain of my television set, and Sarah thought I was going to a party. Also, the vocal fidelity leaves a lot to be desired in speech to speech. Just my honest feedback so far. Keep at it.

27d ago

Inworld

Maker

@frank_cefalu thanks for sharing this feedback and for creating a new Product Hunt account to post it. Just to be clear, it's a voice synthesis technology, so it's not clear how LLM hallucinations apply here.

27d ago

What about less popular languages, like Swedish or Danish? Officially, ChatGPT supports them too, but in practice its speech recognition produces a lot of errors, which creates many problems, for example for AI support projects.

23d ago
