Archie Chaudhury left a comment
We are a team of developers, engineers, and data scientists who constantly found ourselves asking, "What is the best AI model for X?" As it turns out, this was not a straightforward question. Most benchmarks for frontier AI models came from the model creators themselves, or relied on crowdsourced "arena"-style leaderboards, which often felt subjective. Objective benchmarks did exist, but...
Atlas: Independent Evals and Benchmarks for GenAI models
Atlas, by LayerLens, is a community resource that provides insight into the performance of top AI models through evals on benchmarks such as MATH, HumanEval, and MMLU. We are data-first and provide a full suite of analytics for our benchmarks.
