Open-source AI penetration testing agent that automates recon, vulnerability discovery, and analysis. It executes security tests using installed tools, maintains immutable audit trails, and delivers findings with OWASP mapping and remediation guidance.

First off, a huge thank you for checking out the project.
As a security professional, my terminal is my home. While I love the power of modern AI, I found a disconnect between its capabilities and the hands-on, command-line workflows of penetration testing. I wanted to create a tool that didn't try to replace the terminal but supercharged it instead. I wanted an assistant that felt native to the command line: not a scripted workflow, but an actual agent.
What's new and unique about it?
It's an AI terminal agent, but what sets it apart is the Project Management Engine. It's not just about running commands; it's about conducting a proper, organized engagement. From the moment you start, it creates a dedicated project database that automatically handles:
Scope Management: You define your boundaries (URLs, CIDR ranges, etc.), and it helps you stay within them. This is huge for safety and professionalism.
Automatic Evidence Collection: When you find something, CortexAI captures the related HTTP traffic and links it to the vulnerability, saving you a ton of time on documentation.
Centralized Vulnerability Tracking: Every finding is logged in a structured way, so you have a single source of truth for your entire assessment.
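To make the scope idea concrete, here is a minimal Python sketch of how a target could be checked against a list of allowed domains and CIDR ranges before a command runs. This is not CortexAI's actual code: the function name `in_scope`, the helper `_hostname`, and the scope-entry format are all assumptions made for illustration, and it only handles the simple cases (IPv4 ranges and domain suffix matching).

```python
from __future__ import annotations

import ipaddress
from urllib.parse import urlparse


def _hostname(value: str) -> str | None:
    """Extract the host from a full URL or a bare host[:port][/path] string."""
    parsed = urlparse(value if "://" in value else "//" + value)
    return parsed.hostname


def in_scope(target: str, scope: list[str]) -> bool:
    """Return True if target falls inside any scope entry (hypothetical helper).

    Scope entries are assumed to be domains/URLs or CIDR ranges.
    """
    host = _hostname(target)
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # target is a hostname, not an IP literal

    for entry in scope:
        # First try to read the entry as a CIDR range or single IP.
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            network = None
        if network is not None:
            if ip is not None and ip in network:
                return True
            continue

        # Otherwise treat the entry as a domain: match it or any subdomain.
        scope_host = _hostname(entry)
        if ip is None and scope_host and (
            host == scope_host or host.endswith("." + scope_host)
        ):
            return True
    return False


scope = ["example.com", "10.0.0.0/24"]
print(in_scope("https://app.example.com/login", scope))
print(in_scope("http://10.0.1.5:8080/", scope))
```

Matching on a leading dot (".example.com") rather than a plain suffix keeps look-alike hosts such as evil-example.com out of scope.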
It aims to handle the more tedious parts of pentesting so you can focus on the creative, problem-solving parts.
It orchestrates your entire assessment. As the terminal examples in the repo show, it's not just a command-and-response tool. It forms a plan, executes it, analyzes the output, and when it hits a wall (like a DNS error or a WAF), it reevaluates on its own and suggests a new path forward. It's a true dialogue.
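The plan, execute, analyze, reevaluate cycle described above can be sketched roughly like this in Python. It is only an illustration under stated assumptions: `run_engagement`, `StepResult`, and the `execute` / `replan` callables are hypothetical names, and in the real agent the planning would come from the AI model (with you approving the suggested path) rather than from the hard-coded stand-ins used here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    step: str
    output: str
    blocked: bool  # True when the step hit an obstacle (DNS error, WAF, CAPTCHA)


def run_engagement(
    plan: list[str],
    execute: Callable[[str], StepResult],
    replan: Callable[[list[StepResult]], list[str]],
    max_steps: int = 10,
) -> list[StepResult]:
    """Work through a plan, asking for a new plan whenever a step is blocked."""
    history: list[StepResult] = []
    queue = list(plan)
    while queue and len(history) < max_steps:
        result = execute(queue.pop(0))
        history.append(result)
        if result.blocked:
            # Drop the remaining steps and let the planner propose a new path.
            queue = replan(history)
    return history


# Stand-ins for the tool runner and the model-driven planner (assumptions).
def fake_execute(step: str) -> StepResult:
    blocked = step == "crawl main site"
    return StepResult(step, "captcha challenge" if blocked else "ok", blocked)


def fake_replan(history: list[StepResult]) -> list[str]:
    return ["enumerate subdomains"]


history = run_engagement(
    ["resolve dns", "crawl main site", "fuzz parameters"], fake_execute, fake_replan
)
print([result.step for result in history])
# → ['resolve dns', 'crawl main site', 'enumerate subdomains']
```

The `max_steps` cap is there so a planner that keeps proposing blocked steps cannot loop forever.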
The moment you start, everything you do is captured in a structured project database:
Scope Management: You define your playground once, and it keeps you from accidentally stepping out of bounds. This is a lifesaver for staying professional and safe.
Automatic Evidence Collection: This is the feature I always wanted. When the agent identifies a potential vulnerability, it automatically captures the related HTTP requests, API endpoints, responses, etc. as evidence. No more manual copy-pasting. Every finding is logged with its severity, status, and linked evidence, creating a single source of truth for your report later.
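As a rough picture of what "a finding with linked evidence" could look like in a project database, here is a small Python sketch using SQLite. The table layout, column names, and the helpers `open_project` and `log_finding` are assumptions for illustration only; CortexAI's real schema may look quite different.

```python
import sqlite3

# Hypothetical schema: one row per finding, any number of evidence rows per finding.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id       INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    severity TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT 'open'
);
CREATE TABLE IF NOT EXISTS evidence (
    id         INTEGER PRIMARY KEY,
    finding_id INTEGER NOT NULL REFERENCES findings(id),
    request    TEXT NOT NULL,
    response   TEXT NOT NULL
);
"""


def open_project(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a project database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_finding(conn, title: str, severity: str, request: str, response: str) -> int:
    """Store a finding and its HTTP evidence in one transaction; return the finding id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO findings (title, severity) VALUES (?, ?)", (title, severity)
        )
        finding_id = cur.lastrowid
        conn.execute(
            "INSERT INTO evidence (finding_id, request, response) VALUES (?, ?, ?)",
            (finding_id, request, response),
        )
    return finding_id


conn = open_project()
finding_id = log_finding(
    conn,
    "Reflected XSS in search",
    "medium",
    "GET /search?q=%3Cscript%3E HTTP/1.1",
    "HTTP/1.1 200 OK",
)
rows = conn.execute(
    "SELECT f.title, f.severity, f.status, e.request "
    "FROM findings f JOIN evidence e ON e.finding_id = f.id"
).fetchall()
print(rows)
```

Writing the finding and its evidence in a single transaction means a report never ends up with a finding that has lost its supporting traffic.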
What am I most proud of?
That back-and-forth interaction. Seeing the agent hit a Cloudflare CAPTCHA, acknowledge it, and then pivot to subdomain enumeration to find a different way in. It's not a script; it's a thinking partner that helps you navigate (and learn) the complexities of a test.
What's next?
I have a pretty ambitious vision for this!
For the Community: First up, I'm building out more open-source features. Think easy-to-use report exporters (Markdown, JSON), a "CortexOS" Docker image that comes pre-loaded with dozens of popular security tools, and a simple desktop companion app for visually managing your projects.
I'm also working on a Debian variant of CortexOS, so your options would be CortexAI (the terminal agent), the CortexOS Docker image, or CortexOS as a live operating system.
Immediate Community Tools: I just got plugins implemented, so you can build your own plugins for CortexAI. Next up is re-architecting CortexAI into a client-server model. This will allow it to support multiple AI providers (OpenAI, Gemini, Claude, and even self-hosted models via Ollama) and lay the groundwork for other possibilities.
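One common way to support several AI providers behind a single server is a small provider interface plus a registry. The Python sketch below shows that general pattern only; it is not the planned CortexAI design. `ModelProvider`, `EchoProvider`, and `ProviderRegistry` are hypothetical names, and the stand-in provider deliberately makes no real API calls.

```python
from typing import Protocol


class ModelProvider(Protocol):
    """Anything with a name and a complete() method can act as a provider."""

    name: str

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in provider for testing; a real one would call a vendor API here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class ProviderRegistry:
    """Maps provider names to implementations so the server can switch between them."""

    def __init__(self) -> None:
        self._providers: dict[str, ModelProvider] = {}

    def register(self, provider: ModelProvider) -> None:
        self._providers[provider.name] = provider

    def complete(self, provider_name: str, prompt: str) -> str:
        try:
            provider = self._providers[provider_name]
        except KeyError:
            raise ValueError(f"unknown provider: {provider_name}") from None
        return provider.complete(prompt)


registry = ProviderRegistry()
registry.register(EchoProvider("local"))
print(registry.complete("local", "summarize the nmap output"))
# → [local] summarize the nmap output
```

With this shape, adding a new backend means writing one class with a `complete` method and registering it; the rest of the agent does not need to change.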
The API will power a professional web interface designed for teams. Imagine logging a vulnerability in your terminal, and it instantly appearing in your teammate's dashboard, ready for them to review, comment on, and triage.
The long-term dream is to integrate the core features of giants like Burp Suite directly into CortexAI. Think a full intercepting proxy, AI-powered passive traffic analysis, and a modular exploit framework where the agent can say, "I found Log4Shell, I have an exploit module ready. Shall I proceed?" The goal is to evolve CortexAI into a truly autonomous agent that can conduct entire security engagements with human oversight, making security assessments more efficient and comprehensive.
I'm really not good with words, and this sounds like a sales pitch. I hope you'll just go check out the repo yourself if you're interested :)