Launching today

Agentspan
Open-source runtime for durable AI agents
67 followers
Agentspan is an open-source server and SDK for running AI agents as durable workflows. You can define agents programmatically, execute them server-side, and inspect each run and execution state in the UI. Agentspan adds crash recovery, human-in-the-loop approvals, guardrails, tool history, and observability around the agent frameworks and LLMs you already use. MIT licensed.




Starnus
@nickorkes Congrats! Looks amazing. It's super cool for people who want to focus on the code and not spend too much time on the infra.
QQ, maybe trivial since I didn't check the codebase in detail, but by server, you mean it's still local, right? Not based on any specific cloud provider. It would be amazing to see adapters/connectors/versions for the major cloud providers too, and to have it super easy to deploy with a few lines of code (then there's no need to learn anything major from the provider side).
@khashayar_mansourizadeh1 Thanks! Agentspan can definitely be installed locally, but it doesn't have to be. See https://agentspan.ai/docs/deployment/. Great feedback on cloud-specific connectors though. That would make it very easy to get up and running.
Building durable AI agents that survive failures and interruptions is one of the harder infrastructure problems right now. Open-sourcing the runtime is a real commitment to the ecosystem. At RetainSure we've been building in the customer success space for developer tool companies, and Agentspan touches on something we think about a lot: how agent persistence changes what's possible in long-running business workflows. What's your approach to handling state when agents run for hours or days?
@shivam_jaiswal21 We think of state in terms of long-term durable workflows. Each agent run persists server-side as a workflow with a long-lived execution ID, backed by a DB. If something interrupts the agent's execution, it can then resume from wherever it left off.
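
To make the idea in this reply concrete, here is a minimal sketch of resuming a run from persisted step state. It is not Agentspan's API or implementation: `open_store`, `run_workflow`, the `steps` table, and the `run-1` execution ID are all hypothetical names chosen for illustration, and SQLite stands in for whatever database the server actually uses.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a (hypothetical) step store; one row per completed step."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS steps ("
        "execution_id TEXT, step_index INTEGER, output TEXT, "
        "PRIMARY KEY (execution_id, step_index))"
    )
    return db


def run_workflow(db, execution_id, steps, initial_state):
    """Run steps in order, skipping those already recorded for execution_id."""
    rows = db.execute(
        "SELECT step_index, output FROM steps "
        "WHERE execution_id = ? ORDER BY step_index",
        (execution_id,),
    ).fetchall()
    # Resume from the output of the last completed step, if there is one.
    state = json.loads(rows[-1][1]) if rows else initial_state
    for index in range(len(rows), len(steps)):
        state = steps[index](state)
        with db:  # commit each step's output before moving on
            db.execute(
                "INSERT INTO steps VALUES (?, ?, ?)",
                (execution_id, index, json.dumps(state)),
            )
    return state


def demo():
    """Simulate a crash in the second step, then resume the same run."""
    calls = []
    crash = {"armed": True}

    def fetch(state):
        calls.append("fetch")
        return {**state, "docs": 3}

    def summarize(state):
        calls.append("summarize")
        if crash["armed"]:
            crash["armed"] = False
            raise RuntimeError("simulated crash")
        return {**state, "summary": "ok"}

    db = open_store()
    steps = [fetch, summarize]
    try:
        run_workflow(db, "run-1", steps, {})
    except RuntimeError:
        pass  # the process "died" after fetch was persisted
    final = run_workflow(db, "run-1", steps, {})
    return final, calls
```

In this sketch the second call to `run_workflow` with the same execution ID skips `fetch`, because its output is already stored, and re-runs only `summarize`.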
Crash recovery for agents is the thing nobody talks about until it breaks in production. We've had workflows silently fail partway through with no state to resume from. Human-in-the-loop approvals are the other piece teams always bolt on at the last minute. Does Agentspan support branching approvals, where different steps route to different reviewers?
@dhiraj_patel5 yes, crash recovery is super important and a primary factor in us building this. Agentspan supports approvals as a first-class tool, though the branching logic would live in your agent/workflow code today.
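
As a rough illustration of what "branching logic in your agent/workflow code" could look like, the sketch below routes each step to a reviewer before it runs. Everything here is an assumption made for the example: `ApprovalRequest`, `route_approval`, `run_step`, the reviewer names, and the `approve` callback are hypothetical and are not taken from the Agentspan SDK.

```python
from dataclasses import dataclass


@dataclass
class ApprovalRequest:
    step: str
    reviewer: str
    payload: dict


# Hypothetical routing table: which reviewer signs off on which step.
REVIEWERS = {"refund": "finance-team", "deploy": "platform-oncall"}
DEFAULT_REVIEWER = "team-lead"


def route_approval(step, payload):
    """Pick the reviewer for a step, falling back to a default."""
    reviewer = REVIEWERS.get(step, DEFAULT_REVIEWER)
    return ApprovalRequest(step=step, reviewer=reviewer, payload=payload)


def run_step(step, payload, approve):
    """Ask the routed reviewer for approval, then report the decision.

    `approve` stands in for whatever approval mechanism the runtime
    provides; here it is a plain callback returning True or False.
    """
    request = route_approval(step, payload)
    if not approve(request):
        return f"{step}: rejected by {request.reviewer}"
    return f"{step}: approved by {request.reviewer}"
```

For example, `run_step("refund", {"amount": 40}, lambda r: True)` routes to `finance-team`, while a step with no entry in the table falls back to `team-lead`.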
The durability layer is the piece most agent frameworks skip. We're building AI workflows at RetainSure and the biggest headache isn't the LLM calls, it's what happens when a step fails partway through and the state is gone. Keeping execution state server side while defining agents client side is a clean separation. Does Agentspan support partial retries, or does a failure restart the whole run?
@dhiraj_patel5 yes, that's part of the design. We worked hard on making crash resume a core part of the project for the reasons you mentioned. How the reconciliation works may need to be part of the workflow code you write, as it might vary from agent to agent. But the fact that history and run state persist server-side makes that possible.
The durable runtime angle is the part I’d look at first. For agent teams, the hard bit is usually not starting a run, it’s resuming state, handling approvals, and seeing exactly what changed after a long task.
@new_user___2672025cf1bc18102609b53 exactly. Those are core production failure modes this project works hard to address.