Waqar Javed

1d ago

How are you currently testing AI agents for security vulnerabilities before shipping to production?

I've been building safelabs-eval, an open-source framework for red-teaming and evaluating AI agents, aligned to the OWASP LLM Top 10.

Before I launch, I'm genuinely curious about the community's current approach to agent safety:

  • Are you doing any adversarial testing on your agents before deploying them?

  • Which attack vectors concern you most: prompt injection, tool misuse, privilege escalation, or data exfiltration?

  • Are you using any existing tools, or mostly manual testing?

I built this because I couldn't find anything open-source that worked across LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, and Google ADK without forcing you to change your stack.

Waqar Javed

1d ago

Open-source eval framework for AI agents, aligned to the OWASP Agentic Security Initiative Top 10

I just published an open-source framework for red-teaming AI agents. Not LLM chatbots: agents, the kind built on LangChain, CrewAI, and AutoGPT-style architectures that use tools, call APIs, and take multi-step actions in the world.

GitHub: https://lnkd.in/eCSea5ak

If you're building agents and you've hit unexpected failure modes, I'd like to hear about them.