Devstral 2 is the new SOTA open-weight coding family, achieving 72.2% on SWE-bench Verified. It ships with Mistral Vibe, an open-source CLI agent for end-to-end code automation. Currently free via API.
Replies
Flowtica Scribe
Hi everyone!
Mistral just raised the bar for open-weight coding models. Devstral 2 (123B) hits 72.2% on SWE-bench Verified, effectively making it the new SOTA in the open-source space.
It rivals larger open models like DeepSeek V3.2 and gets surprisingly close to closed models like Claude Sonnet 4.5, but at a fraction of the inference cost. The smaller 24B version runs locally on consumer hardware but still punches above its weight.
They also released Mistral Vibe, a native CLI agent that handles end-to-end code automation right in your terminal.
The API is currently free to use!
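Since the weights are open, the flip side of the free API is that you can also serve the model yourself. Here is a minimal sketch of what a request looks like, assuming you run an OpenAI-compatible server (e.g. vLLM or llama.cpp) locally; the endpoint URL and the model name `devstral-2` are placeholders, not official values:

```python
import json

# Hypothetical local OpenAI-compatible endpoint (vLLM, llama.cpp, etc.).
API_URL = "http://localhost:8000/v1/chat/completions"

def build_request(prompt: str, model: str = "devstral-2") -> str:
    """Serialize a minimal chat-completions payload for a coding prompt."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,  # low temperature suits code generation
    }
    return json.dumps(payload)

body = build_request("Write a function that reverses a linked list.")
# POST `body` to API_URL with the HTTP client of your choice.
```

The same payload works against the hosted API by swapping the base URL and adding an API key header, which is the main appeal of OpenAI-compatible serving.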
Camocopy
Yeah, what can I say? It's another Mistral release within a week, and I tested it yesterday. I must admit it's really good, and Devstral Small punches above its weight. Runs smoothly on my M4 chip, can't complain yet 👌🏻
Congratulations... finally something from Europe. I currently use Claude Code. How does it fare in terms of data protection (is user data used for model training)? Now it just needs to work as well as Claude Code.
72.2% on SWE-bench is legit. Open-weight coding models being competitive with closed ones is huge for dev autonomy.
Q: How does the latency compare for real-time IDE integration? Also, is the Vibe CLI available now, or just the model weights?
Shipping this matters! 🚀
Would anyone know of a Cursor-like alternative that can use such models running locally? I know about Cursor plus a locally running model exposed via ngrok, but I'm looking for something a bit more solid.
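One option worth checking is Continue, an open-source VS Code/JetBrains extension that can point at any OpenAI-compatible local server. A config sketch, assuming the legacy `config.json` schema and a local server on port 8000 — field names and the model tag may differ by version, so treat this as an illustration rather than a verified config:

```json
{
  "models": [
    {
      "title": "Devstral Small (local)",
      "provider": "openai",
      "model": "devstral-small",
      "apiBase": "http://localhost:8000/v1",
      "apiKey": "none"
    }
  ]
}
```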
It's wild that a 7B model is beating Llama 13B on reasoning benchmarks. The sliding-window attention seems to be doing a lot of the heavy lifting here. Has anyone tried fine-tuning this for specific RAG tasks yet? I'm wondering how fragile the reasoning gets once you saturate the context window.
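For anyone unfamiliar with the mechanism the comment refers to: under sliding-window attention, each token attends only to itself and the previous `window - 1` tokens instead of the full history. A toy sketch of the attention mask (illustration only, not Mistral's implementation):

```python
# Toy sliding-window causal attention mask: mask[i][j] is True when
# token i is allowed to attend to token j.
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    return [
        [max(0, i - window + 1) <= j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(seq_len=5, window=3)
# Token 4 may attend to positions 2, 3, 4 only:
# mask[4] == [False, False, True, True, True]
```

Information from outside the window can still propagate indirectly across layers, which is why the effective receptive field grows with depth even though each layer's window is fixed.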
Wow, Mistral AI looks amazing! The Devstral 2 SWE-bench score is incredible. How easily does Mistral Vibe integrate with existing CI/CD pipelines for automated testing?