Zac Zuo

Gemini Robotics 1.5 - Bringing AI agents to the physical world

Gemini Robotics 1.5 is a new agentic framework from Google that enables robots to perceive, plan, and act. It combines a reasoning model for complex planning and tool use with a vision-language-action model for physical execution.

Zac Zuo
Hi everyone! The biggest challenge in robotics has always been bridging the gap between simple, pre-programmed tasks and the complexity of the real world. A robot that can put a block in a box is one thing; a robot that can figure out which box to use based on local recycling rules is another. What makes the new Gemini Robotics 1.5 interesting is that it's not a single model but an agentic framework. It pairs a reasoning model (ER 1.5) that acts as the "brain" (planning, reasoning, and even using Google Search) with an action model (VLA 1.5) that executes the physical tasks. The "thinking before acting" capability is a big deal, but the most impressive part for me is its ability to learn across different embodiments. Seeing a skill transfer from a simple robot arm to a full humanoid without retraining is a major step towards truly general-purpose robotics.
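
For anyone who wants a concrete picture of the planner/executor split, here is a rough toy sketch in Python. It is not the Gemini Robotics API: every name in it (`ReasoningModel`, `ActionModel`, `Step`, `run`, `fake_search`) is a hypothetical stand-in, and the "search" tool just returns canned recycling rules. It only illustrates the control flow: the reasoning side consults a tool and produces steps, and the action side carries them out on whatever embodiment it is attached to.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One natural-language instruction handed to the action model."""
    instruction: str


@dataclass
class ReasoningModel:
    """Hypothetical stand-in for the planner ("brain"); not the real ER 1.5 API."""
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def plan(self, goal: str) -> List[Step]:
        # Consult an external tool first (e.g. to look up local recycling rules).
        rules = self.tools["search"](goal) if "search" in self.tools else ""
        steps: List[Step] = []
        # Assumed rule format: one "item -> bin" pair per line.
        for line in rules.splitlines():
            item, _, bin_name = line.partition("->")
            if bin_name:
                steps.append(
                    Step(f"put the {item.strip()} in the {bin_name.strip()} bin")
                )
        return steps


class ActionModel:
    """Hypothetical stand-in for the executor; not the real VLA 1.5 API."""

    def __init__(self, embodiment: str) -> None:
        self.embodiment = embodiment
        self.log: List[str] = []

    def execute(self, step: Step) -> None:
        # A real action model would emit motor commands; this one only logs.
        self.log.append(f"[{self.embodiment}] {step.instruction}")


def run(goal: str, planner: ReasoningModel, actor: ActionModel) -> List[str]:
    """Plan first, then act: each planned step goes to the action model."""
    for step in planner.plan(goal):
        actor.execute(step)
    return actor.log


def fake_search(query: str) -> str:
    """Canned tool result standing in for a web search (made-up rules)."""
    return "bottle -> recycling\nbanana peel -> compost"


if __name__ == "__main__":
    planner = ReasoningModel(tools={"search": fake_search})
    # The same plan is handed to two different embodiments.
    for body in ("arm", "humanoid"):
        for entry in run("sort the trash", planner, ActionModel(body)):
            print(entry)
```

The same `planner` drives both the "arm" and the "humanoid" executor here only because the toy executor ignores the hardware entirely. In the real system, getting one skill to carry over between embodiments is the hard, learned part, and this sketch does not attempt to model it.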