Launching today

Cli Modelarium
Compare LLMs with real statistics, right from your terminal
4 followers
CLI tool for comparing AI language models with statistical rigor. Supports 8 cloud providers (OpenAI, Anthropic, Google, xAI, DeepSeek, Mistral, Groq, OpenRouter) plus local models. Features: bootstrap confidence intervals, paired significance tests, hallucination detection, LLM-as-judge panels, and cost tracking with hard caps. One pip install, no infrastructure. Available on Linux, macOS, and Windows. Requires Python 3.11+. Licensed under Apache 2.0.
Install: pip install cli-modelarium
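
For readers unfamiliar with the statistics named above, here is a minimal sketch of a paired percentile-bootstrap confidence interval for the score difference between two models on the same prompts. It is a generic illustration using only the Python standard library, not cli-modelarium's own code or API; the function name paired_bootstrap_ci, its parameters, and the example scores are all hypothetical.

```python
import random
import statistics


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-prompt difference (A - B).

    Hypothetical helper for illustration; not part of cli-modelarium.
    Returns (observed_mean_difference, ci_low, ci_high).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")

    # Pairing: both models were scored on the same prompts, so we
    # resample per-prompt differences rather than each model separately.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    # Resample the differences with replacement and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )

    # Read the interval bounds off the sorted resampled means.
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(diffs), low, high


if __name__ == "__main__":
    # Made-up per-prompt scores, for demonstration only.
    model_a = [0.82, 0.91, 0.77, 0.88, 0.95, 0.70, 0.84, 0.90]
    model_b = [0.79, 0.85, 0.80, 0.81, 0.90, 0.66, 0.83, 0.86]
    mean_diff, low, high = paired_bootstrap_ci(model_a, model_b)
    print(f"mean difference: {mean_diff:.3f}, 95% CI: [{low:.3f}, {high:.3f}]")
```

If the interval excludes zero, the observed gap between the two models is unlikely to come from prompt sampling alone. With only a handful of prompts the interval will be wide, which is the point of reporting it alongside the raw averages.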


