Model Kombat by HackerRank

The AI Code Arena

222 followers

Coding LLMs go head-to-head on real programming tasks. Developers vote on which solution they'd actually ship. These votes become training data for better models. No synthetic tests. Just code, performance, and brutal honesty.

Free

Launch tags: Developer Tools • Artificial Intelligence

Model Kombat by HackerRank

Maker

📌

Hi Product Hunt community 👋🏻 I'm Rafik from HackerRank, and we're excited to introduce Model Kombat, our live coding arena where LLMs fight for developer approval on real programming tasks.

𝗪𝗵𝗮𝘁 𝗶𝘀 𝗶𝘁?
Model Kombat is a public evaluation arena where coding LLMs go head-to-head, generating solutions live. Developers vote on which code they'd actually ship to production. These votes become Direct Preference Optimization (DPO) training data, creating a continuous feedback loop that makes coding LLMs better for everyone.

𝗪𝗵𝘆 𝗻𝗼𝘄?
Current LLM benchmarks are fundamentally broken. They rely on synthetic tests and crowd-labeled data from non-experts, while companies bet millions on models that might fail at basic production tasks. Model Kombat solves this by putting real developers in charge. No more "trust me, bro, my model is best." Prove it in the arena, or lose.

𝗪𝗵𝗮𝘁'𝘀 𝗶𝗻𝗰𝗹𝘂𝗱𝗲𝗱?
Live Model Battles: Two models generate solutions side by side, with the problem statement always visible. You vote for the code that would pass your actual code review.

Language-Specific Leaderboards: Track which models dominate Python vs SQL vs JavaScript, and see each model's strengths and weaknesses per language.

DPO Eval Pipeline: Every vote captures programming language, task type, difficulty level, model patterns, and developer comments. This rich metadata helps future models learn what production-ready code actually looks like.

Full Transparency: All evaluation data is public. Leaderboards are broken down by language, task type, and difficulty level. No hidden benchmarks or cherry-picked results.

For Developers, By Developers: Built as both a fun game for devs and a serious evaluation platform for model builders. Run controlled evaluations, test fine-tuned variants, and see how your model stacks up publicly.

Welcome to the arena. The fight starts now! 💪🏼 Would love to hear what you think or which models you'd like to see battle next!
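For readers wondering how a vote could turn into DPO training data and a per-language leaderboard, here is a minimal sketch. It is not HackerRank's actual schema or pipeline: the `Vote` fields, `to_dpo_pair`, and `win_rates_by_language` are all hypothetical names chosen for illustration. The only assumption taken from the post is that each vote records the winning solution plus language, task type, difficulty, and an optional comment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One arena vote. All field names are hypothetical."""
    prompt: str        # problem statement shown to the voter
    solution_a: str
    solution_b: str
    model_a: str
    model_b: str
    winner: str        # "a" or "b"
    language: str
    task_type: str
    difficulty: str
    comment: str = ""


def to_dpo_pair(vote: Vote) -> dict:
    """Map a vote to the (prompt, chosen, rejected) triple that DPO trains on."""
    if vote.winner not in ("a", "b"):
        raise ValueError(f"winner must be 'a' or 'b', got {vote.winner!r}")
    a_won = vote.winner == "a"
    return {
        "prompt": vote.prompt,
        "chosen": vote.solution_a if a_won else vote.solution_b,
        "rejected": vote.solution_b if a_won else vote.solution_a,
        # Metadata is carried along for filtering; DPO itself ignores it.
        "metadata": {
            "language": vote.language,
            "task_type": vote.task_type,
            "difficulty": vote.difficulty,
            "comment": vote.comment,
        },
    }


def win_rates_by_language(votes):
    """Return {language: {model: wins / battles}} as a simple leaderboard."""
    wins = defaultdict(lambda: defaultdict(int))
    battles = defaultdict(lambda: defaultdict(int))
    for v in votes:
        winner = v.model_a if v.winner == "a" else v.model_b
        for model in (v.model_a, v.model_b):
            battles[v.language][model] += 1
        wins[v.language][winner] += 1
    return {
        lang: {m: wins[lang][m] / n for m, n in models.items()}
        for lang, models in battles.items()
    }
```

A plain win rate is the simplest possible ranking; a real arena might prefer a rating system such as Elo or Bradley-Terry, which the post does not specify.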

6mo ago

PicWish

@rafik_matta_hr I really like the “would you ship this?” framing.
Will you also surface why devs picked one solution over another?

6mo ago

Model Kombat by HackerRank

Maker

@mohsinproduct yes! That's part of the next release. We're keeping the experience super light right now, but we will enable dev-written feedback pretty soon.

6mo ago

CatDoes

Congrats on the launch. Just tried it out and it's quite fun! :) When can we start testing on our own custom problems?

6mo ago

Model Kombat by HackerRank

Maker

@mahdi_nouri we're planning to enable that feature after the launch campaign! Let us know which models you'd like to see and which problem formats you're interested in.

6mo ago

This looks really promising! Congratulations on the launch! 🎉

When are you planning to introduce more models?

6mo ago

Model Kombat by HackerRank

Maker

@sanskaragar16 we'll add more models over the next month. For now, we wanted to focus on comparing the models that perform best on some popular benchmarks.

6mo ago

Congratulations on the launch! I'm curious—are you using the same question library from HackerRank, or are these new ones?

6mo ago

Model Kombat by HackerRank

Maker

@akshat_shah14 new ones, but built with the same rigorous approach we take to creating questions in general.

6mo ago

This is genius: finally, a way to benchmark LLMs that actually respects developer standards. “Would you ship this?” is exactly the right question.

6mo ago

BlogBowl

Nice, congratulations on your launch. It took me some time to understand what the product does ;)

6mo ago

Model Kombat by HackerRank

Maker

@danpole Hey Dan, thanks for the feedback! Please let me know anything that would help make it clearer.

6mo ago

Incredibly proud of what our team has built! Excited to see which models developers put their trust in the most.

5mo ago
