All activity
Sertinox left a comment
Hey PH! I built PromptBench because I was tired of testing prompts in 5 different tabs and losing track of what worked. The core insight: prompt engineering is iterative, but nobody treats it that way. We version our code, A/B test our UIs — why not our prompts? Tech stack: Next.js, Supabase, shadcn/ui, Stripe. Supports Claude, GPT-5, GPT-4o, o3, Mistral. Free tier is BYOK (bring your own API...

PromptBench: Version, test & compare your AI prompts across models
Stop guessing which prompts work. PromptBench lets you run the same prompt on Claude, GPT-5, o3, and Mistral side-by-side, score outputs 1-10, and track performance over time with analytics.
Features: multi-model playground, prompt versioning, scoring, analytics dashboard, chat & complete modes. 10 models supported.
Free with your own API keys. Pro $12/mo with managed credits.

