fmerian

Kimi K2.6 vs. Claude Opus 4.7

Kimi K2.6 launched last week on Product Hunt, 4 days after @Claude by Anthropic Opus 4.7.

How do they really compare? The @Kilo Code team ran the comparison. They gave both models the same workflow orchestration spec and reviewed the resulting code. Here's what the review turned up.

Key takeaways

  • Claude Opus 4.7 ran 31 tests, all green. 1 real bug.

  • Kimi K2.6 ran 20 tests, all green. 6 confirmed issues.

  • Opus 4.7 scored 91/100 at $3.56

  • Kimi K2.6 reached 75% of this score (68/100) at 19% of the cost ($0.67)
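For anyone who wants to check the percentages, here is a minimal sketch that recomputes them from the four figures quoted above. The dictionary layout and the `relative_percent` helper are illustrative assumptions, and the points-per-dollar line is a derived figure that is not reported in the write-up.

```python
# Figures quoted in the post above (score out of 100, cost in USD).
runs = {
    "Claude Opus 4.7": {"score": 91, "cost_usd": 3.56},
    "Kimi K2.6": {"score": 68, "cost_usd": 0.67},
}


def relative_percent(value, baseline):
    """Return value as a whole-number percentage of baseline (hypothetical helper)."""
    return round(100 * value / baseline)


opus, kimi = runs["Claude Opus 4.7"], runs["Kimi K2.6"]
score_pct = relative_percent(kimi["score"], opus["score"])
cost_pct = relative_percent(kimi["cost_usd"], opus["cost_usd"])
print(f"Kimi K2.6: {score_pct}% of the score at {cost_pct}% of the cost")

# Points per dollar is derived here for illustration only.
for name, run in runs.items():
    print(f"{name}: {run['score'] / run['cost_usd']:.1f} points per dollar")
```

Both ratios round to the 75% and 19% quoted in the takeaways. A single run per model is a small sample, so the per-dollar numbers are best read as a rough guide.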

Full write-up here →

As pointed out in another thread comparing @MiniMax M2.7 with Opus 4.6, [1] the gap between open-weight and frontier models has narrowed significantly over the past year. For prototyping or exploring a design, the $0.67 run is a good deal. For work requiring correctness and accuracy, Opus 4.7 remains ahead.

Any experiences coding with open-weight models?

[1]: MiniMax M2.7 vs. Claude Opus 4.6

Replies

Lennart Rikk

Can you do Sonnet 4.6 vs. Kimi K2.6? That would be a more appropriate comparison cost-wise, imo.

fmerian

@yodalr good idea, thanks! it's definitely another angle. cc @dax1

Stoyan Minchev

I see people using MiniMax or Kimi for code review and other operations that don't really touch the code. They then prepare a descriptive report of the findings and pass it to Opus for the actual implementation. This way, they save on tokens and get "another point of view" during review.

fmerian

neat approach

Stan Kolotinskiy

Interesting approach - I usually do it the other way around: Sonnet/Opus for planning, then Kimi/MiniMax (or Gemma/Qwen locally) for the implementation.

Amit Raj
Is there any platform for renting GPUs at a reasonable price?
Paul Plessing

@amit_raj25 absolutely: Scaleway, Nebius, Inceptron for example

Stan Kolotinskiy

Given that I don't trust an LLM to write all my code without watching it closely, getting this level of performance at a fraction of the cost is really impressive!

fmerian

oss ftw

Stan Kolotinskiy

@fmerian hahaha, hard to disagree - especially for sensitive tasks

Vittal Bharadwaj

Nice breakdown. Indie dev here — I build desktop apps, AI scrapers, and complex stuff. Kimi is amazing for prototyping at 19% of the cost. But for production where correctness matters? Claude still wins. Gap is closing, but not closed yet.

fmerian

exactly - curious what your preferred AI model is when coding? see this thread

David Sherer

Is everyone now just getting VC backing and creating their own AIs and data centers?