Before launching our AI assistant, we worked with a red-teaming vendor (let's call them L) to check how safe our product really was.
We were expecting a few corner cases or prompt injection attempts.
What we got was a pretty eye-opening report: infinite output loops, system prompt leaks, injection attacks that bypassed moderation, and even scenarios where users could insert malicious content via email inputs.