We're missing an entire layer of the AI stack

I've spent the last few years building agentic systems. Going into that work, I assumed the hardest problems would be about the models themselves: better reasoning, larger context windows, stronger planning, and richer tool use. That's exactly where most of the industry's attention has been, and the progress has been remarkable. 

Over time, though, the models gradually stopped being the thing I thought about every day. 

Instead, I found myself spending more time on a different class of problems: the same kinds of problems every operating system, database, or distributed platform must eventually solve. What is this agent actually allowed to do? How do we know when the work is complete? Where does the state live? What happens when something fails halfway through? If the model changes next month, what will survive?

None of those questions are about model intelligence, and they do not disappear just because the next frontier model scores a few more points on a benchmark. 

Most of the work I've seen in the industry starts with the agent and builds outward. The assumption is that if we keep making agents more capable, the rest will eventually take care of itself. I no longer believe that's true. Better models absolutely make autonomous systems more capable, but they don't solve the operational problems that appear once those systems begin doing real work. 

That realization forced me back to first principles. Instead of treating the model as the center of the system, I started treating it as one participant in a broader execution environment. Authority, evidence, recovery, cost attribution, lifecycle management, and organizational memory became first-class concerns because they have to survive changes in models, providers, and infrastructure. The work has to outlive the process that created it, and it certainly has to outlive the particular model that happened to perform it. 

Following that line of thinking eventually led to FAFO™ AgentOS. It grew out of a much simpler question: if autonomous work is going to become part of how organizations operate, what kind of operating system does that work actually require?  

Software engineering is simply where we're proving the idea first, because the work there is measurable: we can inspect the artifacts, validate the outcome, replay failures, and quantify the cost. I don't think that's where the story ends, though.

The underlying problem is much broader. Any organization that expects autonomous systems to perform consequential work eventually runs into the same questions. Engineering, legal, finance, operations, and compliance all require different domain expertise, but they all depend on the same operational properties. Someone has to define what autonomous systems are allowed to do, prove what they did, recover when they fail, and provide enough evidence that another human can trust the result. These operational requirements are remarkably consistent regardless of the domain. 
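To make those properties concrete, here is a minimal sketch of what "define what is allowed, prove what happened, and survive failure" could look like in code. It is an illustration only, not how FAFO™ AgentOS is implemented; the names `ExecutionEnvironment`, `EvidenceRecord`, and the permission table are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class EvidenceRecord:
    """One entry in the audit trail: who tried what, and what happened."""
    agent: str
    action: str
    allowed: bool
    outcome: str


@dataclass
class ExecutionEnvironment:
    # Hypothetical policy table: agent name -> action names it may perform.
    permissions: Dict[str, Set[str]]
    # Append-only log kept by the environment, not by the agent or the model.
    evidence: List[EvidenceRecord] = field(default_factory=list)

    def run(self, agent: str, action: str, work: Callable[[], str]) -> bool:
        """Check authority, attempt the work, and record evidence either way."""
        allowed = action in self.permissions.get(agent, set())
        if not allowed:
            self.evidence.append(EvidenceRecord(agent, action, False, "denied"))
            return False
        try:
            outcome = work()
        except Exception as exc:
            # A failure is recorded rather than lost, so it can be reviewed later.
            self.evidence.append(
                EvidenceRecord(agent, action, True, f"failed: {exc}")
            )
            return False
        self.evidence.append(EvidenceRecord(agent, action, True, outcome))
        return True


if __name__ == "__main__":
    env = ExecutionEnvironment(permissions={"reviewer": {"read_repo"}})
    env.run("reviewer", "read_repo", lambda: "ok")   # permitted, recorded
    env.run("reviewer", "push_main", lambda: "ok")   # not permitted, recorded as denied
    for record in env.evidence:
        print(record)
```

The point of the sketch is that the authority check and the evidence log sit outside the agent. Swapping the model behind `work` changes neither of them, and nothing in it is specific to software engineering.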

Once I started thinking about the problem that way, it became difficult to see AgentOS as just another application. It increasingly felt like infrastructure, the kind of software that other systems are built on rather than another system sitting beside them. 

Stepping back, I think we're still missing an entire layer of the AI stack. We have foundation models, inference engines, orchestration frameworks, and applications, but we don't really have an operating system for autonomous work. That's the layer we set out to build with FAFO™ AgentOS. 

We're launching FAFO™ AgentOS shortly, but what I'm really interested in is whether other people building production AI systems have run into the same shift. If you've been running autonomous systems in production, did you see it too?
