Wan2.2-S2V - Film-grade AI animation from a photo & audio
Wan2.2-S2V is an open-source model by the Wan team that creates film-grade digital human videos from a single image and an audio file. It generates natural expressions, lip-sync, and smooth body movements, with text prompts for extra control over the scene.


Replies
Flowtica Scribe
Hi everyone!
The Wan team has released a new model, Wan2.2-S2V.
Perfect lip-sync is already a basic requirement for video models. Where this one really shines is in the high level of natural character movement it generates. It can create full-body actions and you can even control the scene with text prompts.
A really practical tool for creators, and it's open-source.
It’s perfect for storytelling. Can it be used in real time, or is it limited to pre-rendered content?
Is there a limit to how long the generated video can be, or can it scale up to full presentations and interviews rather than just short clips?
The natural expressions and lip-sync are on another level. Can it manage multiple characters in a single scene?
I’m really impressed by the realism in body movements. Can we adjust gestures using prompts?