How are you safely executing code generated by your AI agents?

Lineage Lens

2mo ago

Hey Product Hunt community! 👋

As a solo maker, I've been diving deep into the world of autonomous AI agents (LangChain, LlamaIndex, etc.). One of the biggest bottlenecks I kept hitting was code execution.

When an agent needs to analyze data, scrape a site, or run a simulation, it writes code. But where do you safely run it?

exec() or subprocess on the host machine? 🚨 Terrifying. One bad prompt injection and the LLM accesses your .env files or exfiltrates data.
Standard Docker containers? 🐢 Too slow and heavy for rapid-fire agent tool calls.
Cloud sandboxes? ☁️ Great, but I didn't want to send my local data or proprietary agent logic to a third-party API just to run a simple pandas script.

To solve this, I started building Vela (powered by the Aegis runtime). It’s a local-first, open-source execution guard designed specifically for AI agents and SaaS platforms.

A quick look under the hood:

Rust Daemon + Firecracker: Uses pre-warmed Firecracker microVMs for hardware-level (KVM-based) isolation. Pre-warming keeps startup near-instant, so each tool call avoids the cold-start cost of booting a fresh VM or container.
HMAC Capability Tokens: Instead of blanket permissions, you issue scoped, time-bound tokens per request (e.g., "allow read/write to /tmp, block all network access, max 64MB RAM").
Framework Adapters: Built a drop-in AegisPythonREPL tool for LangChain so agents can route dangerous tool calls into the sandbox transparently.
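
To make the capability-token idea concrete, here is a minimal sketch of an HMAC-signed, scoped, time-bound token using only the Python standard library. It is an illustration of the general technique, not Vela's actual token format: the SECRET key, the field names (fs_rw, network, max_ram_mb, exp), and the issue_token / verify_token helpers are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical signing key; a real deployment would load this from a secret store.
SECRET = b"demo-secret-change-me"


def issue_token(caps: dict, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Serialize the capabilities plus an expiry, then sign them with HMAC-SHA256."""
    now = time.time() if now is None else now
    payload = dict(caps, exp=now + ttl_seconds)
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    sig = hmac.new(SECRET, body, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(body).decode()
        + "."
        + base64.urlsafe_b64encode(sig).decode()
    )


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the capability payload if the signature is valid and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        body_b64, sig_b64 = token.split(".")
        body = base64.urlsafe_b64decode(body_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SECRET, body, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        return None
    payload = json.loads(body)
    if payload["exp"] < now:
        return None
    return payload


# Example: read/write to /tmp only, no network, 64 MB RAM, valid for 30 seconds.
token = issue_token({"fs_rw": ["/tmp"], "network": False, "max_ram_mb": 64}, ttl_seconds=30)
print(verify_token(token))
```

The sandbox only needs the shared key to check a token, so the agent process can hand tokens around without being able to widen its own permissions.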

I’m officially launching Vela on Product Hunt very soon, but I wanted to start a discussion here first to gather feedback and see how others are tackling this.

I’d love to hear from other makers, AI devs, and security folks:

How are you currently handling the code-execution bottleneck in your AI workflows?
Do you prefer local-first sandboxes for data privacy, or are you fully relying on cloud execution APIs?
If you were using this, what framework adapters would you want to see next?
GitHub repo: https://github.com/karnati-praveen/VELA

Replies

I’d keep the execution layer boring and auditable: per-run filesystem/network caps, short-lived credentials, and a small artifact that says what was allowed, what actually ran, and what was blocked. Local-first is attractive when the agent touches customer or repo data, but the key trust point is whether a teammate can replay the policy decision later. For adapters, I’d prioritize Claude Code / OpenAI-style tool calls and a simple Python REPL path first.

2mo ago

Lineage Lens

@new_user___2672025cf1bc18102609b53 Thanks, Wang — I completely agree.

One thing we've learned while building Vela is that isolation alone isn't enough. Teams also need evidence. If a security incident happens six months later, the question isn't just "was this sandboxed?" but "what permissions were granted, what code executed, and what actually happened?"

That's why we're focusing heavily on policy-driven execution and auditability. Every execution carries explicit capabilities, resource limits, and an audit trail so teams can understand and replay decisions later rather than relying on trust alone.
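
As a rough illustration of the "what was allowed, what ran, what was blocked" artifact, here is a small Python sketch. It is not Vela's actual record format: AuditRecord, check, audit_run, and replay are hypothetical names, and the policy is reduced to a flat dict of boolean capabilities for brevity.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AuditRecord:
    capabilities: dict   # what the run was permitted to do
    code_sha256: str     # fingerprint of the code that actually ran
    allowed: list = field(default_factory=list)
    blocked: list = field(default_factory=list)


def check(action: str, capabilities: dict) -> bool:
    """Deny by default: an action passes only if explicitly granted."""
    return bool(capabilities.get(action, False))


def audit_run(code: str, capabilities: dict, requested_actions: list) -> AuditRecord:
    """Record every policy decision made for one execution."""
    record = AuditRecord(
        capabilities=capabilities,
        code_sha256=hashlib.sha256(code.encode()).hexdigest(),
    )
    for action in requested_actions:
        target = record.allowed if check(action, capabilities) else record.blocked
        target.append(action)
    return record


def replay(record_json: str) -> bool:
    """Re-evaluate the stored decisions and confirm they still match the stored policy."""
    data = json.loads(record_json)
    caps = data["capabilities"]
    return all(check(a, caps) for a in data["allowed"]) and not any(
        check(a, caps) for a in data["blocked"]
    )


record = audit_run(
    "print('hi')",
    {"fs_read": True, "network": False},
    ["fs_read", "network", "spawn"],
)
stored = json.dumps(asdict(record), sort_keys=True)
print(stored)
print(replay(stored))
```

Because the record carries both the policy and the decisions, a teammate can rerun replay on the stored JSON later without access to the original sandbox.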

And you're spot on regarding adapters. Claude Code, OpenAI tool-calling workflows, and a lightweight Python REPL path are at the top of our list because that's where we're seeing the most real-world agent activity today.

Really appreciate the thoughtful feedback — you're describing exactly the direction we think this space needs to move toward.

2mo ago