Ken Mueller

About

No single AI model is right all the time, and their errors don't strongly correlate. So I built Sup AI: a confidence-weighted ensemble that synthesizes multiple LLMs into one better answer. My dad Scott (AI Research Scientist, TRI/UCLA) is my research partner. Stanford CS.

Badges

Tastemaker
Gone streaking

Maker History

  • Sup AI
    Sup AI - AI ensemble that scored #1 on Humanity's Last Exam
    Apr 2026
  • 🎉
    Joined Product Hunt on May 10th, 2023

Forums

Ken Mueller

1mo ago

Sup AI - AI ensemble that scored #1 on Humanity's Last Exam

Every LLM hallucinates. They just don't hallucinate the same things. Sup AI runs multiple LLMs (selected from a pool of 339) in parallel, then synthesizes their answers by measuring confidence on every segment. High entropy = likely hallucination, downweighted. Low entropy = likely accurate, amplified. Result: 52.15% on Humanity's Last Exam, 7.41 percentage points ahead of any individual model. $10 starter credit. Card verified. No auto-charge.
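
The entropy idea above can be illustrated with a minimal sketch. This is not Sup AI's actual implementation, which the post does not describe in detail. It assumes each model exposes a probability distribution over candidate tokens at every position of a segment, and it uses a hypothetical mapping from mean entropy to weight, exp(-H). The function names and the example data are invented for illustration.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of one probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def segment_confidence(token_distributions):
    """Turn a segment's mean token entropy into a weight in (0, 1].

    Assumption: weight = exp(-mean entropy), so a segment with zero
    entropy gets weight 1.0 and higher entropy gets a smaller weight.
    """
    total = sum(shannon_entropy(d) for d in token_distributions)
    mean_entropy = total / len(token_distributions)
    return math.exp(-mean_entropy)

def pick_segment(candidates):
    """Pick the candidate segment with the highest confidence weight.

    candidates: list of (model_name, text, token_distributions).
    Returns a (weight, model_name, text) tuple.
    """
    scored = [
        (segment_confidence(dists), name, text)
        for name, text, dists in candidates
    ]
    return max(scored)

# Hypothetical data: two models answer the same segment.
candidates = [
    ("model_a", "Paris", [[0.97, 0.02, 0.01]]),  # peaked, low entropy
    ("model_b", "Lyon", [[0.40, 0.30, 0.30]]),   # flat, high entropy
]
weight, winner, text = pick_segment(candidates)
```

Here the peaked distribution from model_a yields the larger weight, so its segment is kept. A fuller ensemble would presumably blend or cross-check segments rather than take a single winner; this sketch only shows the downweighting of high-entropy output.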