We're trying something new on Thursday: Alpha Day.
The idea is simple. If this is the first time you're launching your product anywhere, you can tag it alpha and get a boost to your points (and land on a special leaderboard).
TAB independently benchmarks AI agents so you don't have to trust the builder's word. 299 benchmarks across 21 specialty domains test security, hallucination, sycophancy, contamination, and provenance. 59 models from Anthropic, OpenAI, Google, xAI, and more via OpenRouter. Every score published, including the failures. Pay-as-you-go: $0.03 per text test, $0.10 per tool-use test, $0.25 per browser test. No subscriptions. No advertising. Free security screening for your first agent. SDKs on PyPI and npm.
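
To get a feel for what a run might cost at the per-test prices above, here is a rough sketch. It is not part of the TAB SDK: `PRICE_PER_TEST` and `estimate_cost` are hypothetical names made up for this example, and it ignores the free security screening for a first agent.

```python
from decimal import Decimal

# Per-test prices in USD, copied from the listing above.
# The dictionary keys are labels chosen for this sketch, not official identifiers.
PRICE_PER_TEST = {
    "text": Decimal("0.03"),
    "tool-use": Decimal("0.10"),
    "browser": Decimal("0.25"),
}


def estimate_cost(test_counts):
    """Return the estimated total in USD for a mapping of test type -> test count."""
    total = Decimal("0")
    for test_type, count in test_counts.items():
        if test_type not in PRICE_PER_TEST:
            raise ValueError(f"unknown test type: {test_type!r}")
        if count < 0:
            raise ValueError("test counts must be non-negative")
        total += PRICE_PER_TEST[test_type] * count
    return total


if __name__ == "__main__":
    # Example mix: 100 text tests, 20 tool-use tests, 4 browser tests.
    print(estimate_cost({"text": 100, "tool-use": 20, "browser": 4}))
```

`Decimal` is used instead of floats so that the cents add up exactly.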