Roleplay: Social-engineering tests for AI agents

Hi everyone 👋 I’m Ibrahim, the founder of @Roleplay.

The idea started from a simple question: If AI agents are starting to act like humans, and humans can be manipulated, can AI agents be manipulated into doing what they’re not supposed to do?

A lot of AI agent testing focuses on correctness, prompt injection, or one-time evals. Those matter, but they don’t fully answer what happens when an agent is pressured, persuaded, or tricked inside a real workflow.

That’s where Roleplay comes in.

Roleplay tests whether your AI agent can be socially engineered into approving the wrong action, revealing sensitive information, bypassing a policy, trusting fake authority, or misusing a tool.

It runs social-engineering attack packs against your agent, captures exploit proof, helps verify fixes, and keeps checking that the same failure doesn’t come back.

The current version includes:

Attack packs for fake authority, urgency pressure, policy bypass, data extraction, and tool misuse
Specialized packs for customer relationship, sales/SDR, and recruiting/HR agents
Local testing with the included CLI runner
Sanitized evidence uploads
Exploit proof and replay
Fix verification
Scheduled monitoring
Regression gates
Agent Risk Profile

I’d really appreciate your feedback, especially if you’re building, deploying, or securing AI agents.