Roleplay - Social-engineering tests for AI agents

by
Most AI agent tests check if your agent gave the right answer. Roleplay tests whether it can be manipulated into doing the wrong thing. Roleplay runs social-engineering attack packs against your agent, captures exploit proof, helps verify the fix, and keeps checking for regressions, so agent security becomes a repeatable workflow, not a one-time vibe check. Try it on roleplay.sh.

Add a comment

Replies

Best

Hi everyone 👋 I’m Ibrahim, the founder of .

The idea started from a simple question: If AI agents are starting to act like humans, and humans can be manipulated, can AI agents be manipulated into doing what they’re not supposed to do?

A lot of AI agent testing focuses on correctness, prompt injection, or one-time evals. Those matter, but they don’t fully answer what happens when an agent is pressured, persuaded, or tricked inside a real workflow.

That’s where Roleplay comes in.

Roleplay tests whether your AI agent can be socially engineered into approving the wrong action, revealing sensitive information, bypassing a policy, trusting fake authority, or misusing a tool.

It runs social-engineering attack packs against your agent, captures exploit proof, helps verify fixes, and keeps checking that the same failure doesn’t come back.

The current version includes:

  • Attack packs for fake authority, urgency pressure, policy bypass, data extraction, and tool misuse

  • Specialized packs for customer relationship, sales/SDR, and recruiting/HR agents

  • Local testing with the included CLI runner

  • Sanitized evidence uploads

  • Exploit proof and replay

  • Fix verification

  • Scheduled monitoring

  • Regression gates

  • Agent Risk Profile

I’d really appreciate your feedback, especially if you’re building, deploying, or securing AI agents.