A family of SOTA speech models (0.6B & 1.7B) supporting 10 languages. Features prompt-based Voice Design, 3s zero-shot cloning, and extreme low-latency streaming.
I’ve been using Qwen for building a simple code and website generator, and it works really well for fast iterations. Great for prototyping and lightweight generation.
What needs improvement
I'd like more from the history pages: a section where we can re-edit the input, process, and output with an easy UX. Basically, better handling of edge cases without extra prompting.
vs Alternatives
I chose Qwen because it's fast, lightweight, and great for turning ideas into simple, working code or websites. It was also the first web-based tool I explored for code generation, which made it easy to start prototyping right away.
How accurate is Qwen3 on real coding tasks you tried?
Quite good, but the output still needs some touch-up, especially on the logic.
Does Qwen3-Coder reduce PR review time or defects?
I’ve been trying Qwen alongside GPT-4o, and honestly it feels great — it’s noticeably faster and cheaper, yet most of the time the answer quality is hard to tell apart. For quick everyday tasks, I barely notice any trade-offs, which makes it a super practical choice.
Flowtica Scribe
Hi everyone!
The Qwen team just dropped what might be the most comprehensive open-source TTS release we have seen. Qwen3-TTS combines three things that are usually mutually exclusive: SOTA quality, extreme speed, and creative control.
The "Voice Design" feature is really robust—just describing the persona (e.g., "sad old man") works surprisingly well.
Technically, the efficiency is wild. They use a 12Hz tokenizer to compress speech without losing detail, bringing the latency down to just 97ms 🤯
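For a rough sense of what a 12Hz token rate means, here is a minimal back-of-the-envelope sketch. It assumes the tokenizer emits exactly 12 frames per second of audio, which is my reading of the "12Hz" figure and not something confirmed here. The function names are hypothetical and this is not the Qwen3-TTS API.

```python
import math

# Assumption: the tokenizer emits exactly 12 frames per second of audio,
# as the "12Hz" figure suggests. Codebooks per frame are not modeled.
FRAME_RATE_HZ = 12.0


def frame_duration_ms(frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Milliseconds of audio covered by one token frame."""
    return 1000.0 / frame_rate_hz


def frames_for_clip(seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of token frames needed to cover a clip of the given length."""
    return math.ceil(seconds * frame_rate_hz)


if __name__ == "__main__":
    print(f"one frame covers {frame_duration_ms():.1f} ms of audio")
    print(f"a 3 s reference clip spans {frames_for_clip(3.0)} frames")
```

Under that assumption, one frame covers roughly 83 ms of audio, so the 97ms figure is in the same range as a single frame. How the model actually gets to 97ms is not something I can confirm from the launch notes.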
Open source TTS just raised the bar again. If you are building anything with voice, you might wanna check this out.
Demo Here.
97ms latency, that's faster than I can decide what to have for lunch! This is a massive win for the open-source community. The voice design sounds like a dream for creators who are tired of hearing the same 3 robotic voices everywhere. Can't wait to try describing a caffeinated marketing manager on a Monday morning - that would be my perfect persona :D Congrats on the launch!
Camocopy
Okay, but which languages? Why not show the 10 languages more obviously?