Thank you to everyone who upvoted, left thoughtful comments, and engaged with our launch on social media! We're delighted to be finishing Trophy 1.0 at #3 with 323 points! We're collating all the questions and feedback and integrating them into our roadmap, and we plan to share product updates in this forum each month as we launch them.
We also have a few more launches planned for the coming months, so stay tuned...
Most AI benchmarks are built backwards. Someone sits down, dreams up hard problems, and then measures how well agents solve them. The results are interesting, sure. But they don't always tell you what matters: how agents perform on the actual work that's sitting in your queue.
That's why we built cto.bench.
Instead of hypothetical tasks, we're building our benchmark from real work. Every data point on cto.bench comes directly from how cto.new users are actually using our platform.