Reviewers broadly see Cursor as a fast, deeply integrated AI coding tool that fits naturally into everyday development, especially for debugging, refactoring, multi-file edits, and navigating larger codebases without leaving the editor. Many say it beats Copilot on context awareness, autocomplete, and workflow, while also helping non-engineers and small teams ship faster. The main complaints are practical: pricing and token use feel confusing or expensive, Auto mode can be slow or verbose, and some users report instability, weak context handling in complex projects, and friction from it being a separate VS Code fork.
Flowtica Scribe
Hi everyone!
More training scale, still aggressive pricing, and a better model for long-running coding work.
That is Composer 2.5 in a nutshell.
It continues from the same base as Composer 2, and this version is trained to be better at sustained work, complex instructions, communication style, and effort calibration inside Cursor.
Targeted textual feedback helps the model correct specific mistakes inside long rollouts, while 25x more synthetic tasks push it into harder coding problems grounded in real codebases.
Looking forward to the next larger model trained from scratch with 10x more compute!
@zaczuo S/O to Moonshot's Kimi K2.5!
Specializing a model for code editing workflows rather than pure generation is the right call. Generic models fall apart on multi-file edits because they lose track of interdependencies. Building RetainSure, we've hit that quality cliff when an agent touches more than three or four tightly coupled files. How does Composer 2.5 handle semantic conflicts when simultaneous edits across files introduce inconsistencies?
The focus on long-horizon agentic tasks is the right unlock for real coding workflows. We've run into this building RetainSure where the model drops context mid-task once the chain gets 20+ steps deep. How does Composer 2.5 handle state management across very long multi-file agentic sessions, and is there a hard limit on task length before it degrades?
The effort calibration piece stands out to me. We've had agents lose coherence around step 6 or 7 in a long agentic chain, and it's tough to know if that's model drift or context decay. Curious how the targeted textual feedback is applied: at the step level within a rollout, or on the full trajectory?
Has the team experimented with giving the agent more awareness of runtime behavior, such as logs or error traces, so it can reason about what's actually happening rather than just what the code says? Curious if that's on the roadmap.
The shift from Composer as a feature to Composer as a versioned product with its own release cadence is an interesting signal about where Cursor thinks the value actually lives. It's not the editor anymore, it's the agent layer on top of it.
What I liked most is that Cursor actually understands the repo structure instead of only using nearby code as context. The agent mode fixing errors and rerunning tests feels much more useful than normal AI autocomplete. Curious how performance scales on really large codebases or monorepos.