Hey Product Hunt!

I'm Babangida, and I'm sharing ArtemisKit, an open-source toolkit for testing LLM and agentic applications.

Why I built this:

While working with clients on LLM integrations, we developed internal testing practices: scenario definitions, adversarial probing, load testing, and more. As we scaled across more projects, we realized we were rebuilding similar tooling each time.

ArtemisKit is the productization of those practices: the testing toolkit we wished existed when we started.

What makes it different:

1. Integrated: Functional testing, security testing, and performance testing in one CLI
2. YAML-based: Define tests declaratively, version them with your code
3. Multi-provider: Same tests work across OpenAI, Anthropic, Azure, Vercel AI (more coming soon)
4. Truly open: Apache-2.0, no telemetry, self-host everything

What I'd love feedback on:

- What testing challenges do you face with LLM apps?
- What features would make this more useful for you?
- What's confusing or could be improved?

We're here to answer questions!

GitHub: https://github.com/code-sensei/a...
Website: https://artemiskit.vercel.app
Docs: https://artemiskit.vercel.app/docs

ArtemisKit v0.1.7

Open-source testing toolkit for LLM applications