MartinLoop - Controls for Coding Agents

Stop AI coding agents from running wild

5.0 • 2 reviews • 65 followers

AI Coding Agents • Command line tools • AI Infrastructure Tools

MartinLoop helps engineering teams safely scale AI coding agents from experiments into accountable, measurable production workers. AI coding agents are powerful, but they can also run too long, spend too much, change the wrong files, and leave you guessing what happened. MartinLoop gives every agent run a budget, safety checks, rollback, and a clear receipt showing what changed, what passed, what failed, and what it cost. Use it through MCP or directly from your CLI. It sits above the agent and is model-agnostic.

Free

Launch tags: Developer Tools • Artificial Intelligence • GitHub

Hunter

📌

Hey Product Hunt, I built MartinLoop after watching useful coding agents fail in the most expensive way: they keep trying. A loop can look productive while it burns tokens, repeats the same failure class, edits outside scope, or stops with no durable evidence. MartinLoop is the open-source control layer around that loop. It adds hard budget caps, verifier-gated next-attempt admission, safety policy for scope/secrets/verifier commands, rollback evidence, and JSONL run records you can inspect or resume later. The goal is simple: keep the speed of autonomous agents, but make every run accountable. I would love feedback from anyone running Claude Code, Codex, OpenCode, or similar coding-agent workflows in real projects: what receipt would you need before trusting a long-running agent overnight?
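The control loop described above can be sketched in a few lines. This is an illustration only, not MartinLoop's implementation: the function name `run_with_controls`, the record fields, and the stop-reason strings are all assumptions made for the example. The budget is checked before each attempt is admitted, so this simple version can overshoot the cap by at most one attempt's cost.

```python
import json
from typing import Callable, Dict, List


def run_with_controls(
    attempt_fn: Callable[[int], float],   # hypothetical: runs one attempt, returns its cost in USD
    verify_fn: Callable[[int], bool],     # hypothetical: runs the verifier, True on a clean pass
    budget_usd: float,
    max_attempts: int = 10,
) -> Dict[str, object]:
    """Run attempts until the verifier passes, the budget is spent, or attempts run out."""
    spent = 0.0
    records: List[str] = []               # one JSON line per executed attempt
    stop_reason = "max_attempts"
    for attempt in range(1, max_attempts + 1):
        # Budget gate: refuse to admit another attempt once the cap is reached.
        if spent >= budget_usd:
            stop_reason = "budget_exhausted"
            break
        spent += attempt_fn(attempt)
        passed = verify_fn(attempt)
        records.append(json.dumps(
            {"attempt": attempt, "spent_usd": spent, "verifier_passed": passed}
        ))
        # Verifier gate: a clean pass ends the run.
        if passed:
            stop_reason = "verifier_passed"
            break
    return {"stop_reason": stop_reason, "spent_usd": spent, "records": records}


# Each attempt costs $1.00 and the verifier passes on the third try.
result = run_with_controls(lambda n: 1.0, lambda n: n == 3, budget_usd=5.0)
print(result["stop_reason"], result["spent_usd"])  # prints "verifier_passed 3.0"
```

Writing each entry of `records` as one line in a file would give a JSONL trail that can be inspected after the run.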

2mo ago

Hunter

Useful framing. The part I would want before trusting a long-running agent is a concise run receipt: budget spent, verifier failures, scope guard hits, and the exact stop reason. If you surface that per run, it becomes much easier to compare agents without rereading logs.
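As a rough sketch of such a receipt, the four fields named above can be rolled up from per-attempt data. The `AttemptRecord` type, its field names, and the output layout are hypothetical and are not MartinLoop's actual record format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AttemptRecord:
    # Hypothetical per-attempt fields, chosen for illustration.
    cost_usd: float
    verifier_passed: bool
    scope_guard_hits: int


def build_receipt(attempts: List[AttemptRecord], stop_reason: str) -> Dict[str, object]:
    """Summarize a run into the fields an operator would compare across agents."""
    return {
        "attempts": len(attempts),
        "budget_spent_usd": round(sum(a.cost_usd for a in attempts), 2),
        "verifier_failures": sum(1 for a in attempts if not a.verifier_passed),
        "scope_guard_hits": sum(a.scope_guard_hits for a in attempts),
        "stop_reason": stop_reason,
    }


def format_receipt(receipt: Dict[str, object]) -> str:
    """Render the receipt as a single line suitable for a log or a chat message."""
    return ", ".join(f"{key}={value}" for key, value in receipt.items())


run = [AttemptRecord(0.5, False, 1), AttemptRecord(0.75, True, 0)]
print(format_receipt(build_receipt(run, "verifier_passed")))
```

Keeping the receipt to one line per run is a design choice for the example: it makes two agents comparable side by side without opening the full logs.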

2mo ago

The verifier-gated admission rule is the part that separates this from a spend cap with a nicer dashboard. Capping tokens is easy. Refusing the next attempt until something actually changed is the rule that kills the “busy” loops. Where I would push: rollback evidence and a JSONL trail are clean when the only artifact is a file diff, but a real coding run also fires migrations, seeds data, and calls external APIs that git cannot undo. How does MartinLoop treat side effects that escape the repo boundary? Does the safety policy let me declare which actions are irreversible so the gate refuses to even attempt them unattended, or is fencing those off still on the operator?

1mo ago

Hunter

@zimasilevuyo

That’s exactly the right push. Git rollback is only half the story.

The real danger is everything that escapes the repo: migrations, writes to external services, secrets use, emails, payments, queue publishes.

Our view is that those need to be declared up front as action classes, not discovered after the fact. Then the runtime can do three things: block some classes entirely when unattended, require explicit approval for others, and leave a receipt for the ones it does allow so you can reconstruct what happened later.

So no, I would not treat repo rollback as enough. The safety layer has to know which side effects are reversible, which are recoverable, and which are simply too risky to run on autopilot.
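One way to picture declared action classes is a small policy table that the runtime consults before any side effect. The class names, the `Decision` values, and the fail-closed default below are assumptions made for this sketch, not MartinLoop's policy schema.

```python
from enum import Enum
from typing import Dict


class Decision(Enum):
    BLOCK = "block"
    REQUIRE_APPROVAL = "require_approval"
    ALLOW_WITH_RECEIPT = "allow_with_receipt"


# Hypothetical policy: each side-effect class is declared up front.
POLICY: Dict[str, Decision] = {
    "payment": Decision.BLOCK,
    "email": Decision.BLOCK,
    "db_migration": Decision.REQUIRE_APPROVAL,
    "queue_publish": Decision.REQUIRE_APPROVAL,
    "external_api_read": Decision.ALLOW_WITH_RECEIPT,
}


def decide(action_class: str, unattended: bool,
           policy: Dict[str, Decision] = POLICY) -> Decision:
    """Return what the runtime should do with an action of the given class."""
    # Fail closed: a class nobody declared is treated as too risky to run.
    decision = policy.get(action_class, Decision.BLOCK)
    # With no operator present there is nobody to approve, so approval becomes a block.
    if decision is Decision.REQUIRE_APPROVAL and unattended:
        return Decision.BLOCK
    return decision


print(decide("db_migration", unattended=True).value)   # prints "block"
print(decide("db_migration", unattended=False).value)  # prints "require_approval"
```

Defaulting undeclared classes to a block is the conservative choice here: a new kind of side effect has to be classified before an agent can trigger it on autopilot.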

1mo ago

Good work! Are JSONL records capturing rejected paths or only the committed one?

2mo ago

Hunter

@artstavenka1 great question! Both! Every path gets recorded. ledger.jsonl captures all attempts: rejected (blocked before execution), discarded (ran but failed verification or violated policy), and the committed one. The final run record and dossier surface the executed attempts with their outcomes. Nothing is lost: you can reconstruct the full loop history from the ledger.
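To illustrate how a ledger like this could be read back, the sketch below groups JSONL entries by outcome. The field names `attempt`, `status`, and `reason` are guesses for the example; the real ledger.jsonl schema is not shown in this thread.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def group_attempts_by_status(ledger_lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group JSONL ledger entries by their (assumed) 'status' field."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for line in ledger_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        entry = json.loads(line)
        groups[entry["status"]].append(entry)
    return dict(groups)


# Hypothetical ledger contents, one JSON object per line.
sample = [
    '{"attempt": 1, "status": "rejected", "reason": "edit outside scope"}',
    '{"attempt": 2, "status": "discarded", "reason": "verifier failed"}',
    '{"attempt": 3, "status": "committed", "reason": "verifier passed"}',
]
for status, entries in group_attempts_by_status(sample).items():
    print(status, [e["attempt"] for e in entries])
```

With a real file, the same function accepts an open file handle, since iterating over a text file yields one line at a time.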

2mo ago

Hunter

One thing I would genuinely love feedback on from launch-day testers: what proof would you want before trusting a coding agent overnight? Budget receipt, verifier result, rollback path, or a file-level diff trail? MartinLoop is built around making those runs inspectable instead of just fast.

2mo ago

Hunter

Good question. The useful receipt should show both the committed path and the rejected ones. If you only keep the final successful branch, you lose the story of why the run got expensive or risky. The operator should be able to see which paths were blocked, which verifier failed, and why the next attempt was or was not admitted.

1mo ago

Hunter

One lesson from testing coding agents: cost usually spikes after the first failure, not before it. If the agent can't show a receipt for done, the next retry should get harder, not easier: cap the spend, require a verifier check, and stop when the same mistake repeats. If you've seen a failure mode we should test before launch, I'd love that feedback.
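A minimal sketch of "the next retry should get harder" might look like this. The halving factor, the repeat limit, and the idea of labeling each failure with a class string are illustrative assumptions, not MartinLoop's rules.

```python
from typing import List


def next_attempt_budget(remaining_usd: float, failures_so_far: int,
                        shrink: float = 0.5) -> float:
    """Shrink the spend allowed for the next retry after every failure."""
    return remaining_usd * (shrink ** failures_so_far)


def should_stop(failure_classes: List[str], repeat_limit: int = 2) -> bool:
    """Stop once the most recent failures all belong to the same class."""
    if len(failure_classes) < repeat_limit:
        return False
    return len(set(failure_classes[-repeat_limit:])) == 1


print(next_attempt_budget(8.0, failures_so_far=2))   # prints 2.0
print(should_stop(["type_error", "type_error"]))     # prints True
print(should_stop(["type_error", "test_timeout"]))   # prints False
```

The two checks are independent on purpose: the shrinking budget limits how much a retry can cost, while the repeat check ends the run as soon as the agent is failing the same way again.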

2mo ago

Hunter

A small rule that catches a lot of fake progress: if the agent can't explain what changed since the last attempt in one sentence, it probably should not get another retry yet. That sounds strict, but it saves a lot of budget from "busy" loops that only reshuffle the same failure.
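A mechanical proxy for this rule is to fingerprint each attempt and refuse a retry that matches one already seen. The sketch below hashes the diff together with the failure output; the function names and the choice of inputs are assumptions, and a one-sentence explanation from the agent would still need its own check.

```python
import hashlib
from typing import Set


def fingerprint(diff_text: str, failure_output: str) -> str:
    """Hash the diff and the failure output, ignoring trailing whitespace."""
    combined = diff_text + "\n---\n" + failure_output
    normalized = "\n".join(line.rstrip() for line in combined.splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def admit_retry(seen: Set[str], diff_text: str, failure_output: str) -> bool:
    """Admit a retry only if this attempt differs from every earlier one."""
    fp = fingerprint(diff_text, failure_output)
    if fp in seen:
        return False  # same change, same failure: nothing new was learned
    seen.add(fp)
    return True


seen: Set[str] = set()
print(admit_retry(seen, "+ fix import", "ImportError: foo"))  # prints True
print(admit_retry(seen, "+ fix import", "ImportError: foo"))  # prints False
```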

2mo ago

Forum Threads

p/martinloop • 2mo ago

What proof should a coding agent show before another retry?

One pattern we keep seeing: teams set a max budget or max iterations, but the expensive part usually starts earlier, when the agent keeps retrying without new evidence.

A better stop rule seems to be: before another retry, the agent should show what changed, what verifier passed, and what would make the next attempt stop.

Found MartinLoop through a thread on Hacker News where someone was venting about a Claude Code run that estimated $2.40 and ended up burning $65 in retries before they noticed. Felt that pain personally so I clicked through and gave it a try. Spent the last 2 weeks running it as a wrapper around my Codex and Claude Code workflows for a side project. The whole "give the agent a finish line" concept clicks immediately when you've ever watched an agent loop on itself overnight.

The killer features for me are the hard budget caps and the verifier gates. You set a dollar limit before the run, point it at your test suite (like npm test), and the agent stops when it hits either the budget or a clean verifier pass. No more "wake up to a $300 OpenAI bill because the agent kept retrying a typo." The JSONL run records are also surprisingly useful when something does go sideways. You get a clean receipt of what changed, what passed, what failed, and exactly where the spend went.
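For readers wondering what "point it at your test suite" amounts to, a verifier gate can be as simple as running a command and checking its exit code. This is a generic sketch using Python's standard library, not MartinLoop's code; the `run_verifier` name and the timeout default are assumptions.

```python
import subprocess
from typing import List


def run_verifier(command: List[str], timeout_s: float = 600.0) -> bool:
    """Run a verifier command and report a pass only on exit code 0."""
    try:
        result = subprocess.run(
            command,
            capture_output=True,  # keep the output out of the terminal
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hung or missing verifier counts as a failure, never as a pass.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Example: gate on the project's test suite (requires npm on PATH).
    print(run_verifier(["npm", "test"]))
```

Treating timeouts and missing commands as failures keeps the gate conservative: the run only stops as "done" when the verifier demonstrably passed.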

Where it gets rough: setup is CLI-only right now, with no hosted dashboard yet (it's on the waitlist for the Pro tier). Side effects that escape the repo boundary, like DB migrations or external API calls, still need manual fencing through the safety policy. Not a deal breaker, but you do need to think about scope upfront. It's Apache 2.0 licensed and installs with npm install -g martin-loop, so there's almost no friction to try it. For anyone running coding agents in production, or even just for side projects, this fixes a real problem that nobody else is addressing seriously.
