Zac Zuo

Wan2.2-S2V - Film-grade AI animation from a photo & audio

Wan2.2-S2V is an open-source model by the Wan team that creates film-grade digital human videos from a single image and an audio file. It generates natural expressions, lip-sync, and smooth body movements, with text prompts for extra control over the scene.

Zac Zuo

Hi everyone!

The Wan team has released a new model, Wan2.2-S2V.

Perfect lip-sync is already a basic requirement for video models. Where this one really shines is in how natural its character movement looks. It can create full-body actions, and you can even control the scene with text prompts.

A really practical tool for creators, and it's open-source.

Leo Cobain

It’s perfect for storytelling. Can it be used in real time, or is it limited to pre-rendered content?

Brooklyn Campbell

How does the model deal with different input qualities? For example, if someone uploads a lower-resolution photo, can it still produce convincing results, or is high-quality input necessary for the best outcome?

Grayson Parker

Is there a limit to how long the generated video can be, or can it scale up to full presentations and interviews rather than just short clips?

Jude Gray

The natural expressions and lip-sync are on another level. Can it manage multiple characters in a single scene?

Graham Weaver

I’m really impressed by the realism in body movements. Can we adjust gestures using prompts?

Amelia Smith

Creating film-quality digital human videos from just an image and audio is a huge leap for both creators and studios. The level of realism in expressions and body movements sounds incredible. Do you think indie creators will adopt this first, or will it be the larger production houses?