What's the best AI model for coding?

by fmerian

New AI models pop up every week. Some developer tools like @Cursor, @Zed, and @Kilo Code let you choose between different models, while more opinionated products like @Amp and @Tonkotsu default to a single model.

Curious what the community recommends for coding tasks. Any preferences?



Replies

nxnze

I've been using Opus 4.5 via Claude Code. Full disclosure, I've been quite skeptical about AI development, but have been consulting for an Anthropic partner and have been using it for some internal tooling. It can be pretty powerful if used correctly, although it still makes a lot of questionable decisions and mistakes, and needs decent oversight.

A few tips:
👉 You want to stay within about 40% of the context window. It's the sweet spot to get the best results.

👉 I first plan, and ask CC to create a feature spec, ideally broken down into stages.

👉 ALWAYS start with a clean working tree.

👉 I start a new session and ask it to read through the feature spec and implement stage 1.

👉 Review the diff and make any necessary changes.

👉 Commit, and start a new session.

👉 Repeat.

I also try to stay away from very in-depth claude.md instructions. They tend to overload the model, and it ends up skipping instructions. I also keep a directory with feature definitions (generated by CC) as well as a folder for specific todos. (Basically skills, but let's be real: agents and skills are just prompt text files.) This way, at the start of a new session I can point it to a specific feature definition without it having to read through the codebase to figure out how the feature works, which saves a lot of context.
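The per-stage loop above can be sketched as a small shell function. This is only a sketch under assumptions: it assumes the `claude` CLI from Claude Code (whose `-p` flag runs a single non-interactive prompt); the `CLAUDE_BIN` override and the spec path are hypothetical conveniences, not part of any real tool.

```shell
# Sketch of the per-stage loop described above. Assumes the `claude`
# CLI from Claude Code (`-p` runs one non-interactive prompt); the
# CLAUDE_BIN override and the spec path are hypothetical.
run_stage() {
  local stage="$1"
  local spec="$2"                          # e.g. docs/features/my-feature.md
  local claude_bin="${CLAUDE_BIN:-claude}"

  # ALWAYS start with a clean working tree: abort on staged or unstaged changes
  if ! git diff --quiet || ! git diff --cached --quiet; then
    echo "working tree not clean; commit or stash first" >&2
    return 1
  fi

  # Fresh session per stage: point it at the spec, not the whole codebase
  "$claude_bin" -p "Read $spec and implement stage $stage only. Do not start later stages."

  # Show the diff for manual review; commit happens by hand afterwards
  git diff
}
```

After reviewing the diff, commit and call `run_stage` again with the next stage number, so each stage runs in a fresh session with a small context.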

Nikita Ivanov

@nxnze can confirm about context: once you start stuffing the whole repo into the model, the quality of the solutions drops, especially with backend code.

Kirill Ilichev

They say it's Claude, but in Cursor I never see the difference, especially since ChatGPT has a bigger context window.

fmerian

@ilichev any thoughts on Composer 1?

Anton Ponikarovskii

@ilichev but it's also about speed of development. Try Composer instead of ChatGPT: when you have 5 agents coding at the same time, it gives you a significant productivity boost without quality loss.

andrew zakonov

Not that simple in my experience; it depends on tech stack and task complexity (I see different winners on different datasets). Overall, I'd go with Claude Opus 4.5 for complex tasks, but Gemini Pro for UI/frontend and most tasks where speed matters.

JP

Opus 4.5?

fmerian

@dynamo my favorite. To quote @rauchg, Opus is "on a different level, unreasonably good at @Next.js" (source).

Invince

@dynamo it's the best, but also the most expensive.

Derek Cheng

For @Tonkotsu, we use Sonnet 4.5 under the covers (though we're always evaluating the best model), as we've found it has the right mix of strong agentic coding performance while being relatively fast. Other models are also quite good but tend to have very high latency and go heads-down for a long time.

What we've learned from our users is that while they want to operate at a high level, they also want to see granular progress from the agents — classic manager behavior. Sonnet has the right mix of capability and speed for this.

fmerian

@derekattonkotsu Oh I like the reasoning, i.e. finding a balance between speed and capabilities.

What would be the close 2nd? And if we look at the capabilities only, what would be the best model from your POV? Also, why not Opus 4.5!?

Derek Cheng

@fmerian Opus is definitely a contender. It's more expensive but very capable, and at least in my experience not terribly slower. I've found Codex to be quite smart, but it likes to go "heads-down" for a long time before coming back with an answer, which doesn't fit the pattern of usage we see from our users.

Alina Petrova

Does pricing also count as a criterion? 😅 Sonnet is quite expensive. Running GLM 4.7 and Codex 5.2 full-time is only a few percent behind Sonnet and costs 3–7x less.

fmerian
haha yes! there are definitely many variables we could take into account: capabilities, speed, pricing...

Zypressen

@alina_petrova3 Absolutely, cost matters a lot.

Yuanyuan Zhang

In Cursor, I have tried Opus 4.5 and GPT 5.2 in plan mode, and personally I prefer the former. However, I'm still torn on the best setup for fixing bugs. What are your preferences for debug mode? Do you stick with the same model or switch to a new one?

Abhisek Basu

@yuanyuan_zhang0104 For debugging, nothing beats GPT-5.2x High right now

fmerian

@brightmirror oh good to know! thanks for the suggestion

Adi Ghiuro

In Cursor, we use Opus 4.5... I couldn't find a better model than that. Too bad it's not on the list.

fmerian

@atomer any experiences with Composer 1?

Adi Ghiuro

@fmerian I tested it on small tasks and it’s … ok. But not for building full features.

fmerian

"I tested it on small tasks and it’s … ok. But not for building full features."

@atomer @Claude by Anthropic ftw haha

Siarhei

@fmerian Sonnet 4.5 is solid for agentic coding — fast and strong on reasoning. We use something similar in HireXHub for probing interview questions (real-time follow-ups on resume gaps). Claude latency can be a killer, though. What do you think is the best model right now for live voice AI agents? #AIHiring
fmerian

"What do you think is the best model right now for live voice AI agents?"

great question! definitely ping @aidanhornsby from @Layercode on this - they are the experts here

Siarhei

@fmerian Thanks for the ping! For live voice agents right now, the Grok + ElevenLabs combo crushes it on natural flow and low latency. But memory is the real bottleneck in 2026. Building HireXHub voice interviews on top — any recs for persistent memory layers?
Janefrances Christopher

Opus 4.5. The only downside is that it burns through usage very quickly. So my second choice is Sonnet 4.5.
