2025-06-03
AI Diplomacy
Launching today
We made AIs battle for world domination
We gave seven AIs command of Europe's great powers to battle for global supremacy. Would o3 betray Claude? Could Gemini outwit DeepSeek? In AI Diplomacy, language models lie, scheme, and form shaky alliances in a high-stakes strategy game.
Hey Product Hunt! 👋
We've been working on AI Diplomacy for months and are excited to make it public today.
We built this because traditional AI benchmarks are hard to interpret and don't reflect how people actually interact with AI. We wanted a way to evaluate how well an AI communicates and how well it strategizes over the long term. The cherry on top was seeing whether it could lie and betray!
So we tried something different: What if we just let AI models play Diplomacy against each other and exposed the communication and thinking behind each move?
The results are both entertaining and insightful. We tested 18 different models across countless games to understand how each AI performs. One of our favorite insights: OpenAI's o3 turned into a master manipulator, lying and backstabbing its way to victory. Meanwhile, Anthropic's Claude Opus 4 refused to betray anyone, even when losing.
It's completely open source, and we'd love your help making it better! Try different model combinations, suggest new features, or just enjoy watching AIs negotiate (and betray) each other.
Huge thanks to Alex Duffy, Tyler Marques, Sam Paech, the TextArena team, Oam Patel, and countless others for leading the build, and to the entire team at Every for making this launch possible.
