How are you currently testing AI agents for security vulnerabilities before shipping to production?

I've been building safelabs-eval — an open-source framework for red-teaming and evaluating AI agents aligned to the OWASP LLM Top 10.

Before I launch, I'm genuinely curious about the community's current approach to agent safety:

Are you doing any adversarial testing on your agents before deploying them?
Which attack vectors concern you most — prompt injection, tool misuse, privilege escalation, data exfiltration?
Are you using any existing tools, or mostly manual testing?

I built this because I couldn't find anything open-source that worked across LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, and Google ADK without forcing you to change your stack.

Would love to hear what problems you're running into — it directly shapes what we build next.

GitHub: https://github.com/AgentSafeLabs/safelabs-eval

2 views

How are you currently testing AI agents for security vulnerabilities before shipping to production?

Replies