Rithul Kamesh

Continuum - A runtime that reuses computation across AI workflows

AI workflows today are built from disconnected calls that do not share work. The same prompts and tokens are recomputed at every step, which wastes time and money. Continuum treats workflows as executable graphs in which tokens, tensors, and tools live in one system, so it can reuse shared computation, such as the processing of long prompt prefixes, and optimize execution across runs. The result is faster agents, lower costs, and a system that understands what it is doing.
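The prefix-reuse idea can be sketched in a few lines. This is a toy illustration, not Continuum's actual API: a rolling integer "state" stands in for the expensive per-token work (e.g. a transformer forward pass) that a real runtime would cache, and `PrefixCache` is a hypothetical name.

```python
# Toy sketch of token-prefix reuse. The rolling integer "state" stands in
# for real per-token compute; PrefixCache is an illustrative name only.

class PrefixCache:
    def __init__(self):
        self._states = {(): 0}     # token prefix -> state after processing it
        self.computed_tokens = 0   # how much per-token work was actually done

    def run(self, tokens):
        tokens = tuple(tokens)
        # Walk back to the longest prefix whose state is already cached.
        cut = len(tokens)
        while tokens[:cut] not in self._states:
            cut -= 1
        state = self._states[tokens[:cut]]
        # Compute only the uncached tail, caching each new prefix state.
        for i in range(cut, len(tokens)):
            state = (state * 31 + hash(tokens[i])) & 0xFFFFFFFF
            self.computed_tokens += 1
            self._states[tokens[:i + 1]] = state
        return state

cache = PrefixCache()
cache.run(("system", "long", "prompt", "step-1"))   # computes 4 tokens
cache.run(("system", "long", "prompt", "step-2"))   # computes only 1 more
print(cache.computed_tokens)  # 5, not 8
```

The second call skips the three shared prefix tokens entirely, which is the core of the cost saving described above.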

Rithul Kamesh
Maker

Thanks for checking out Continuum.

The idea came from a simple frustration. In most AI systems, we kept recomputing the same prompts and tokens across steps, even when most of the work was identical. Caching helped a bit, but only at the output level.

Continuum takes a different approach. It treats AI workflows as programs, so it can reuse computation across steps and across runs. For example, if multiple calls share most of a long prompt, it skips recomputing that shared part instead of starting from scratch each time.
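One way to picture "workflows as programs" is a graph of steps memoized by their operation and inputs, so shared work runs once across steps and runs. The sketch below assumes this framing; `Step` and `run_graph` are illustrative names, not Continuum's API.

```python
# Hedged sketch: a workflow as a graph of steps memoized by (name, inputs),
# so shared dependencies execute once. Step/run_graph are illustrative names.
import hashlib
import json

class Step:
    """A workflow node: an operation plus the steps it depends on."""
    def __init__(self, name, fn, *deps):
        self.name, self.fn, self.deps = name, fn, deps

_memo = {}   # content hash -> cached result, shared across runs
calls = []   # which steps actually executed (for demonstration)

def run_graph(step):
    """Execute a step, reusing any dependency already computed."""
    inputs = tuple(run_graph(d) for d in step.deps)
    key = hashlib.sha256(json.dumps([step.name, inputs]).encode()).hexdigest()
    if key not in _memo:            # only do work we haven't seen before
        calls.append(step.name)
        _memo[key] = step.fn(*inputs)
    return _memo[key]

# Two steps that share an expensive context-building dependency.
context = Step("build_context", lambda: "shared prefix")
summarize = Step("summarize", lambda c: f"summary of {c}", context)
classify = Step("classify", lambda c: f"label for {c}", context)

run_graph(summarize)
run_graph(classify)   # build_context is NOT re-executed here
```

Keying the memo on content rather than call site is what lets reuse work across separate runs, not just within one.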

Early tests show consistent latency and cost reductions on multi-step agents.

Would love feedback, especially from people building complex workflows or agents.

Rithul Kamesh

Most AI systems recompute the same prompts across steps.

Continuum skips that work by reusing computation, not just outputs.

In our tests, a 5-step workflow with a shared 3k-token prefix drops compute by ~70% after the first call.
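As a back-of-the-envelope check of how a per-call saving near 70% can arise: only the 3k-token shared prefix comes from the numbers above; the 1,300-token unique suffix is an assumed figure chosen for illustration.

```python
# Rough arithmetic for prefix-reuse savings. The suffix size is an assumed
# illustrative value; only the 3k shared prefix comes from the post.
prefix, suffix = 3000, 1300
full = prefix + suffix          # tokens computed without any reuse
reused = suffix                 # tokens computed once the prefix is cached
saving = 1 - reused / full
print(f"{saving:.0%}")          # 70% per call after the first
```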

Would love feedback from people building agents or multi-step pipelines.