
MIDI
Create Complete 3D Scenes from a Single Image
7 followers
MIDI is an open-source model for generating 3D scenes from a single image. It generates multiple 3D objects simultaneously, with correct spatial relationships, in about 40 seconds.






Flowtica Scribe
Hi everyone!
Found something cool – MIDI, a new open-source project that generates a complete 3D scene from a single image!
What's special:
🖼️ A single image (with objects segmented) creates a full 3D scene.
🧩 Multi-instance diffusion generates all objects simultaneously, ensuring correct scene layout.
🤝 A new multi-instance attention mechanism handles interactions between objects.
💨 It's fast – generating a scene in as little as 40 seconds.
🔓 Code, model, and a demo are all open-source.
This significantly advances single-image 3D generation. A collaboration between VAST and university researchers made it happen.
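For anyone curious what "multi-instance attention" could look like in practice, here is a rough toy sketch in plain Python. It is not MIDI's actual code: the function names are made up for illustration, the token vectors are hand-picked, and learned query/key/value projections are skipped entirely. The only idea it tries to show is that each object's tokens attend to the tokens of every object in the scene, instead of only their own.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out


def per_instance_attention(instances):
    """Baseline: each object's tokens attend only to that object's own tokens."""
    return [attend(tokens, tokens, tokens) for tokens in instances]


def multi_instance_attention(instances):
    """Hypothetical sketch: each object's tokens attend to every object's tokens."""
    scene_tokens = [tok for tokens in instances for tok in tokens]
    return [attend(tokens, scene_tokens, scene_tokens) for tokens in instances]


# Toy scene: two objects with hand-picked 2-D token features (assumed values).
chair = [[1.0, 0.0], [0.9, 0.1]]
table = [[0.0, 1.0]]

isolated = per_instance_attention([chair, table])
joint = multi_instance_attention([chair, table])
```

In the isolated version the table's single token can only look at itself, so it comes out unchanged. In the joint version it also mixes in information from the chair's tokens, which is the kind of cross-object interaction the bullet points above describe.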
Try the MIDI demo to see the magic happen!