Model Kombat by HackerRank

The AI Code Arena

164 followers

Coding LLMs go head-to-head on real programming tasks. Developers vote on which solution they'd actually ship. These votes become training data for better models. No synthetic tests. Just code, performance, and brutal honesty.
Free

Rafik Matta
Hi Product Hunt community 👋🏻 I'm Rafik from HackerRank, and we're excited to introduce Model Kombat, our live coding arena where LLMs fight for developer approval on real programming tasks.

What is it?
Model Kombat is a public evaluation arena where coding LLMs go head-to-head, generating solutions live. Developers vote on which code they'd actually ship to production. These votes become Direct Preference Optimization (DPO) training data, creating a continuous feedback loop that makes coding LLMs better for everyone.

Why now?
Current LLM benchmarks are fundamentally broken. They rely on synthetic tests and crowd-labeled data from non-experts while companies bet millions on models that might fail at basic production tasks. Model Kombat solves this by putting real developers in charge. No more "trust me, bro, my model is best." Prove it in the arena, or lose.

What's included?
Live Model Battles: Two models generate solutions side by side, with problem statements always visible. You vote for the code that would pass your actual code review.
Language Specific Leaderboards: Track which models dominate Python vs SQL vs JavaScript. Understand model strengths and weaknesses with precision.
DPO Eval Pipeline: Every vote captures programming language, task type, difficulty level, model patterns, and developer comments. This rich metadata makes future models understand what production-ready code actually looks like.
Full Transparency: All evaluation data is public. Leaderboards by language, task type, and difficulty level. No hidden benchmarks or cherry-picked results.
For Developers, By Developers: Built as both a fun game for devs and a serious evaluation platform for model builders. Run controlled evaluations, test fine-tuned variants, and see how your model stacks up publicly.

Welcome to the arena. The fight starts now! 💪🏼

Would love to hear what you think or which models you'd like to see battle next!
Mohsin Ali ✪

@rafik_matta_hr I really like the “would you ship this?” framing.
Will you also surface why devs picked one solution over another?

Rafik Matta

@mohsinproduct yes! That's part of the next release. We're keeping the experience super light right now, but we will enable dev-written feedback pretty soon.

Mahdi Nouri

Congrats on the launch. Just tried it out and it's quite fun! :) When can we start testing on our own custom problems?

Rafik Matta

@mahdi_nouri we're planning to enable that feature after the launch campaign! Let us know which models you'd like to see and the type of format for problems you're interested in.

namira taif

Love the template theme!

Emily Campbell

@namira_taif Thanks Namira!

Nisa Meray

Love the launch! The branding especially. It’s super engaging.

Emily Campbell

@nisa_meray Thanks Nisa, it was a ton of fun to create

Michael Becker

Love the idea of models battling it out on real coding tasks instead of lab tests. Makes the results feel so much more authentic and useful beyond just the tech crowd. Congrats on the launch!

Daniil Poletaev

Nice, congratulations on your launch. It took me some time to understand what the product does ;)

Rafik Matta

@danpole Hey Dan, thanks for the feedback! Please let me know anything that would help make it clearer.

Jyotiska Khasnabish

Looks really cool, especially the sound effects! Congratulations to the team for shipping this 🙌
