Gemini Robotics 1.5 is a new agentic framework from Google that enables robots to perceive, plan, and act. It combines a reasoning model for complex planning and tool use with a vision-language-action model for physical execution.
Hi everyone!
The biggest challenge in robotics has always been bridging the gap from simple, pre-programmed tasks to handling the complexity of the real world. A robot that can put a block in a box is one thing; a robot that can figure out which box based on local recycling rules is another.
What makes the new Gemini Robotics 1.5 so interesting is that it's not a single model, but an agentic framework. It pairs a reasoning model (ER 1.5) that acts as the "brain" (planning, reasoning, and even using Google Search) with an action model (VLA 1.5) that executes the physical tasks.
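To make the split concrete, here's a rough toy sketch of how a planner/executor pairing like this could be wired together. It is not the Gemini Robotics API: `plan`, `execute`, `run_agent`, and `stub_lookup` are hypothetical names, and the "search" step is just a stub standing in for a real tool call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    instruction: str  # natural-language sub-task handed to the action model


def plan(task: str, lookup: Callable[[str], str]) -> List[Step]:
    """Hypothetical stand-in for the reasoning model: split a task into steps,
    consulting an external tool (here a stubbed lookup) for missing facts."""
    bin_name = lookup(task)
    return [
        Step(f"locate the item for: {task}"),
        Step("pick up the item"),
        Step(f"place the item in the {bin_name} bin"),
    ]


def execute(step: Step) -> str:
    """Hypothetical stand-in for the action model: a real system would emit
    motor commands; this one only reports what it would do."""
    return f"done: {step.instruction}"


def run_agent(task: str, lookup: Callable[[str], str]) -> List[str]:
    # Think first (plan), then act on each step in order.
    return [execute(step) for step in plan(task, lookup)]


def stub_lookup(query: str) -> str:
    # Assumption: a real lookup would query a search tool for local rules.
    return "recycling"


log = run_agent("sort the plastic bottle", stub_lookup)
for line in log:
    print(line)
```

The point of the split is that the planner never touches motors and the executor never has to know the recycling rules, so either side could be swapped out independently.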
The "thinking before acting" capability is a big deal, but the most impressive part for me is its ability to learn across different embodiments. Seeing a skill transfer from a simple robot arm to a full humanoid without retraining is a major step towards truly general-purpose robotics.