Qwen3.5 Small 0.8B-9B native multimodal with more intelligence, less compute

gpt-realtime-1.5 by OpenAI Tighter instruction adherence in speech agents

Grok 4.2 Four AI agents debate internally to build your answer

Lyria 3 by Google Deepmind Turn any photo or thought into a custom song inside Gemini

Gemini 3.1 Pro A smarter model for your most complex tasks

Sonnet 4.6 The most capable Sonnet model yet

Qwen3.5 The 397B native multimodal agent with 17B active params

MiniMax-M2.5 The first open model to beat Sonnet made for productivity

GLM-5 Open-weights model for long-horizon agentic engineering

Model Council in Perplexity Consult a council of multiple frontier models at once

Claude Opus 4.6 Claude’s most advanced model for agentic tasks

Voxtral Transcribe 2 by Mistral Real-time speech-to-text with speaker diarization

Kling 3.0 Native 4K output and longer videos from just a prompt

The new v0 Full-stack vibe coding platform, created by Vercel

Codex by OpenAI A command center for working with agents

Grok Imagine API SOTA video generation across quality, cost, and latency

Agentic Vision in Gemini Agentic visual reasoning with code execution

Qwen3-TTS Voice design, cloning & 97ms streaming

FastMCP 3.0 The fast, Pythonic way to build MCP servers and clients

Cowork Turn Claude into your digital coworker
