I've been building safelabs-eval, an open-source framework, aligned to the OWASP LLM Top 10, for red-teaming and evaluating AI agents.
Before I launch, I'm genuinely curious about the community's current approach to agent safety:
Are you doing any adversarial testing on your agents before deploying them?
Which attack vectors concern you most: prompt injection, tool misuse, privilege escalation, or data exfiltration?
Are you using any existing tools, or mostly manual testing?
I built this because I couldn't find anything open-source that worked across LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, and Google ADK without forcing you to change your stack.
I just published an open-source framework for red-teaming AI agents.
Not LLM chatbots, but agents: the kind built on LangChain, CrewAI, or AutoGPT-style architectures that use tools, call APIs, and take multi-step actions in the world.
GitHub: https://lnkd.in/eCSea5ak
If you're building agents and you've hit unexpected failure modes, I'd like to hear about them.