MindTrial

Puts AI models to the test.

Test a single large language model (LLM) or evaluate multiple models side by side. MindTrial supports providers including OpenAI, Google, Anthropic, DeepSeek, Mistral AI, xAI, and Alibaba. You can create your own custom tasks with text prompts, plain-text or structured JSON response formats, optional file attachments, and tool use; validate responses through exact value matching or an LLM judge for semantic evaluation; and get results in easy-to-read HTML and CSV formats.

Petr Malik
Maker

I’m happy to share MindTrial, a project I worked on: a tool for evaluating and comparing AI language models on text-based tasks, with optional file and image attachments.

  • Compare multiple AI models side by side (OpenAI, Google, Anthropic, DeepSeek, and more).

  • Create custom test tasks using simple YAML files (see the sketch after this list).

  • Attach files or images to prompts for visual tasks.

  • Fine-tune model behavior with advanced parameters (temperature, top-p, etc.).

  • Get results in HTML and CSV formats.

  • Prevent API overload with smart rate limiting.

  • Run interactively using a terminal-based UI.
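
To give a rough idea of what a task definition could look like, here is a minimal sketch. The field names and structure below are illustrative assumptions, not MindTrial's documented schema; the README describes the actual format.

  # Illustrative sketch only: field names are assumptions, not the documented schema.
  tasks:
    - name: capital-city
      prompt: "What is the capital of Australia? Reply with the city name only."
      expected-answer: "Canberra"
    - name: chart-reading
      prompt: "What is the highest value shown in the attached chart?"
      files:
        - ./data/sales-chart.png
      expected-answer: "42"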

See the README for setup, usage, and configuration details.