Pippo: High-Res 3D Humans from a Single Photo.

Hi everyone!

Sharing Pippo, a new project from Meta Reality Labs that generates high-resolution turnaround videos of a person from a single photo!

📸 One Photo, Full Turnaround: Create a complete 3D turnaround video from a single full-body or face-only photo.
🎥 Multi-View from Video: Generate multi-view videos from monocular (single-camera) video input.
✨ 1K Resolution: The output is high-resolution (1024x1024).
⚡ One Forward Pass: All views in the video are generated jointly in one pass of the model, rather than one view at a time.
⚙️ Advanced Tech: It's based on a Diffusion Transformer, with techniques like ControlMLP (for pixel-aligned control) and Attention Biasing (for generating longer videos with more views than the model saw during training).
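
For anyone wondering what "Attention Biasing" could look like in practice, here's a rough sketch of the general idea, not Pippo's actual code. When a transformer attends over many more tokens than it was trained on, the softmax tends to spread out (its entropy rises), and one common remedy is to scale the attention logits up as the token count grows. The function names and the log-ratio scaling rule below are my own assumptions for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def length_scale(n_tokens, n_train_tokens):
    """Hypothetical rule: grow the logit scale with the log of the token count.

    This is an assumed formula for illustration, not taken from Pippo.
    """
    return math.log(n_tokens) / math.log(n_train_tokens)


def attention_weights(query, keys, scale=1.0):
    """Scaled dot-product attention weights for one query, with an extra logit scale."""
    d = len(query)
    logits = [
        scale * sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(logits)


if __name__ == "__main__":
    rng = random.Random(0)
    d = 16
    n_train, n_infer = 64, 256  # assumed token counts at training vs. inference
    query = [rng.gauss(0, 1) for _ in range(d)]
    keys = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_infer)]

    plain = attention_weights(query, keys)
    biased = attention_weights(query, keys, scale=length_scale(n_infer, n_train))

    # The biased weights are sharper, so their entropy is lower.
    print("entropy without bias:", entropy(plain))
    print("entropy with bias:   ", entropy(biased))
```

A scale above 1 sharpens the softmax, which counteracts the blurring you'd otherwise get from attending over a longer token sequence. Check the paper and repo for the exact formulation Pippo uses.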

While this is currently a code-only release (no pre-trained weights yet), it looks promising for creating realistic avatars, enhancing video editing workflows, and more.