Building AI-powered build-fix on Bedrock — what worked, what burned

For our PH launch next Tuesday, I want to share the technical bet that's

become our differentiator:

SmartFix — when a deploy fails on a broken

requirements.txt / package.json / Dockerfile, an AI reads the build log,

proposes a fix, commits it, and re-runs the build.

A few things I learned that I'd love this community's take on:

1. We're on Claude (Bedrock) for production SmartFix, but we just got into

Anthropic's Claude for Startups program, so we're planning to migrate

to the direct Anthropic API for better prompt caching control. Anyone

here run a Bedrock → direct-API migration? Cache hit rate was the

thing we were most uncertain about.

2. The hardest design choice was "when to give up." If SmartFix loops

forever on a build that's broken in a non-AI-fixable way (private

submodule auth, hardware-pinned binaries), it burns API credits and

the user's deploy quota. We settled on max 2 attempts + a confidence

threshold from the model itself. Curious how others gate this.

3. Single-token shell obfuscation defeats static keyword scanners. Found

this the hard way: a user split "tailscale" into A="tail"; B="scale";

APP="${A}${B}". We fix it now via runtime daemon-log signatures

(wgengine, magicsock — internal package names that can't be hidden

from stdout).
What's your "discovered the hard way" abuse pattern?

Genuine questions — I'm at the stage where every answer compounds before

the launch on Jun 23.
Happy to dig into our SmartFix architecture in replies if useful.

7 views