ThinkSound

Your AI sound designer with a Chain-of-Thought

ThinkSound by Alibaba Tongyi Labs is the first open-source audio generation model to use Chain-of-Thought. It reasons about video scenes to create high-fidelity, synchronized spatial audio.
Free

Zac Zuo

Hi everyone!

The new ThinkSound model from Alibaba Tongyi Labs brings a new idea to audio generation: letting the AI "think" before it creates sound.

It's the first open-source model to apply Chain-of-Thought to audio generation. Instead of just matching sounds to visuals, it analyzes a video's events step by step to create high-fidelity, synchronized audio that really fits the scene. The results are pretty stunning!

The code is open-source under Apache 2.0. Just a heads up on the license for the model itself: it's available for research and educational use, but you'll need to contact the team for commercial licensing. Still, I do have a feeling this new approach to audio models will inspire and speed up the arrival of commercial ones.

You can try the demo here.