Zac Zuo

Rubber Duck - Cross-model reviews in GitHub Copilot CLI

Rubber Duck is a new experimental mode in GitHub Copilot CLI that uses a second model from a different AI family to review plans, implementations, and tests before the agent moves forward. It is designed to catch architectural mistakes, edge cases, and cross-file issues earlier.

Zac Zuo

Hi everyone!

@GitHub Copilot CLI has added Rubber Duck as an experimental mode.

It triggers a second model (it can be from a different model family) that reviews the main agent’s plan and output at key moments: after planning, after big implementations, and before tests. Think of it as an independent rubber duck that actually talks back.
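
To make the flow concrete, here's a rough sketch of what "review at key moments" could look like. Everything in it (the `Checkpoint` names, `run_with_reviews`, the stand-in `toy_reviewer`) is hypothetical and only illustrates the idea; it isn't Copilot CLI's actual implementation or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Checkpoint(Enum):
    # The three review moments mentioned above (names are hypothetical).
    AFTER_PLANNING = "after planning"
    AFTER_IMPLEMENTATION = "after a big implementation"
    BEFORE_TESTS = "before tests"


@dataclass
class Review:
    checkpoint: Checkpoint
    concerns: List[str]


# A reviewer takes a checkpoint plus the main agent's work and returns concerns.
Reviewer = Callable[[Checkpoint, str], List[str]]


def run_with_reviews(artifacts: Dict[Checkpoint, str], reviewer: Reviewer) -> List[Review]:
    """Ask the second model for feedback at each checkpoint that has work to review."""
    reviews = []
    for checkpoint in Checkpoint:  # iterates in definition order
        work = artifacts.get(checkpoint)
        if work is None:
            continue  # nothing produced at this stage
        reviews.append(Review(checkpoint, reviewer(checkpoint, work)))
    return reviews


def toy_reviewer(checkpoint: Checkpoint, work: str) -> List[str]:
    # Stand-in for a call to a second model: flags work that still contains a TODO.
    return [f"unresolved TODO {checkpoint.value}"] if "TODO" in work else []


reviews = run_with_reviews(
    {
        Checkpoint.AFTER_PLANNING: "1. add retry logic  2. TODO: decide on backoff",
        Checkpoint.BEFORE_TESTS: "tests cover the happy path and timeouts",
    },
    toy_reviewer,
)
```

In a real setup the reviewer would be a call to the second model, and the main agent would decide what to do with each list of concerns before moving on.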

Early numbers are solid: Claude Sonnet + Rubber Duck closes 74.7% of the performance gap between Sonnet and Opus on SWE-Bench Pro, especially on complex multi-file work.
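
For anyone wondering how a "gap closed" number like that is usually worked out: it's the share of the distance between the weaker and the stronger baseline that the combined setup recovers. A quick sketch with made-up scores (the function name and the numbers are placeholders, not the actual SWE-Bench Pro results):

```python
def gap_closed(baseline: float, combined: float, target: float) -> float:
    """Fraction of the baseline-to-target gap recovered by the combined setup."""
    if target == baseline:
        raise ValueError("baseline and target scores are equal, so there is no gap")
    return (combined - baseline) / (target - baseline)


# Placeholder scores for illustration only.
share = gap_closed(baseline=40.0, combined=46.0, target=48.0)
print(f"{share:.1%}")  # 75.0%
```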

If you’re already living in Copilot CLI, just turn on /experimental and you’ll probably stay afloat🛟

Rob

The good ol' Rubber Duck. Every developer knows how useful that can be (I use colleagues as rubber duckies sometimes tho, they don't always appreciate it).