
Blop AI
Blop turns design feedback into code you can ship
99 followers
Blop makes sure your product actually works before your customers find out it doesn't. You simply describe what your app should do, like "a user can sign up and check out," and Blop automatically tests it every time you make a change, catching bugs before they reach anyone. When your product changes, it quietly fixes its own tests, so you don't need a QA team or engineers spending their week on it. You ship faster, with the confidence that nothing important broke.
This is the 2nd launch from Blop AI.
Blop
Launched this week
Most startups ship fast and have no QA team, so tests get written late, break often, and quietly get ignored.
Blop fixes that. You describe what your app should do in plain English. Blop builds a deep map of your product, tests it like a real user on every deploy, and when something changes and a test breaks, it repairs the test and opens a PR for you to review.
The tests stay as real code in your repo. You own them. Nothing locked inside our tool.




Free



Blop AI
If it auto-fixes its own tests when the product changes, what stops it from "fixing" a test into passing when the underlying behavior actually broke? That's the core tension with self-healing test tools generally: how do you tell the difference between an intentional UI change and a regression if the system's first instinct is to adapt rather than flag?
Blop AI
@ansari_adin Good question, this is the thing we worried about most too. The trick is we never let it touch the part that matters. Blop keeps two things separate: how a test finds stuff on the page (the locators) and what the test expects to be true (the assertions). Self-healing only ever touches the first one. If a button moved or got renamed, it can re-find it. But if the thing you said should happen didn't happen, that's a failure, full stop. It never adapts an assertion to turn a red test green.
So an intentional UI change usually looks like a locator drifting: the button is still there, just in a new spot. A regression looks like the expected result not showing up. We only auto-recover the first kind, and even then any real fix lands as a PR you review and merge yourself. Nothing rewrites your test behind your back. You get the trace and the diff and you make the call.
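To make the locator/assertion split above concrete, here is a minimal sketch in Python. It is not Blop's actual implementation: the `Step`, `Result`, and `run_step` names, the fallback-list strategy, and the toy page model (a dict mapping a locator to the outcome seen after acting on that element) are all assumptions made for illustration. The only point it shows is that recovery may swap the locator but never the expected value.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical page model: locator -> outcome observed after acting on that element.
Page = dict[str, str]


@dataclass(frozen=True)
class Step:
    locator: str          # how the test finds the element (may be recovered)
    fallbacks: list[str]  # hypothetical alternative locators to try on drift
    expected: str         # the assertion; recovery never changes this


@dataclass(frozen=True)
class Result:
    passed: bool
    healed_locator: Optional[str] = None  # set only when a fallback was used


def run_step(page: Page, step: Step) -> Result:
    """Run one step, recovering the locator if needed but never the assertion."""
    locator = step.locator
    healed = None
    if locator not in page:
        # Locator drift: try to re-find the element under another locator.
        healed = next((c for c in step.fallbacks if c in page), None)
        if healed is None:
            return Result(passed=False)  # element is gone entirely
        locator = healed
    # The expected outcome is compared as written, healed or not.
    return Result(passed=page[locator] == step.expected, healed_locator=healed)


step = Step(locator="#buy", fallbacks=["#checkout"], expected="Order confirmed")

renamed = run_step({"#checkout": "Order confirmed"}, step)  # button renamed, flow intact
broken = run_step({"#buy": "Payment error"}, step)          # button found, flow broken
```

In this sketch the renamed button passes with `healed_locator` set to `"#checkout"`, while the broken flow fails even though the original locator still resolves.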
The self-healing part is what I keep thinking about. Writing tests is one problem, but the real grind is maintaining them after every refactor, and that's usually what causes teams to just give up on the test suite entirely. The "deep map" approach of tracking intended behavior rather than implementation details sounds like it could survive code changes better than traditional tests. The concern I'd have is the edge case where Blop quietly repairs a test that was actually catching a real bug. How do you distinguish between a test that broke because the product changed versus a test that broke because something is genuinely wrong?
Blop AI
@maylee_zhang That’s a great point, and honestly it’s something we noticed too. Self-healing can’t just mean “make the test pass again,” because then you risk hiding real failures. There’s also the agent reward-hacking problem: if the goal is just green tests, agents can create shallow tests that don’t really validate the feature. So for us, the key is that repairs should be anchored to expected behavior and remain reviewable. Blop should explain what changed and whether it looks like an intentional product change or a real regression, not silently auto-fix everything. The goal is assisted maintenance with guardrails, not blind auto-fixing :)
How does Blop distinguish between an intentional product change and a bug? I'd be curious to understand what signals it uses before deciding to repair a test automatically.
Blop AI
@craig_bennett1 The real answer is we don't let the runtime make that call at all, because it's the wrong place to make it. The runtime only ever recovers a locator, finding an element that moved. It never changes what a test expects. So a refactor that renames a button gets recovered quietly. A broken checkout flow fails loudly, because the expected outcome didn't happen and we never touch assertions. The "is this a real change or a bug" decision happens one level up, and a human is always in it. When something drifts, Blop opens a PR with the proposed update, the trace, and a diff, and you decide. We deliberately don't auto-merge anything. The signal we lean on is simple: did what you described still happen? If yes, it's structure that moved and we adapt the path. If no, that's a failure and it's yours to look at.
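The decision rule described above ("did what you described still happen?") and the propose-but-never-merge step can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not Blop's code: `Verdict`, `triage`, `propose_locator_update`, the test file path, and the `click`/`assert_text` helpers in the sample test source are all hypothetical. The diff is built with the standard library's `difflib.unified_diff` and is only returned for review, never applied.

```python
import difflib
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    PROPOSE_UPDATE = "propose_update"  # outcome held, structure moved: open a PR
    FAIL = "fail"                      # outcome missing: surface to a human


def triage(outcome_happened: bool, locator_drifted: bool) -> Verdict:
    """Classify a run by whether the described outcome still happened."""
    if not outcome_happened:
        return Verdict.FAIL  # never adapt an assertion to go green
    return Verdict.PROPOSE_UPDATE if locator_drifted else Verdict.PASS


def propose_locator_update(source: str, old: str, new: str,
                           path: str = "tests/test_checkout.py") -> str:
    """Return a unified diff for review; nothing is written or merged here.

    Simplification: a plain string replace, which assumes the old locator
    text does not also appear inside an assertion.
    """
    updated = source.replace(old, new)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical test source using made-up click/assert_text helpers.
test_source = (
    "def test_checkout():\n"
    '    click("#buy")\n'
    '    assert_text("Order confirmed")\n'
)

verdict = triage(outcome_happened=True, locator_drifted=True)
diff_text = propose_locator_update(test_source, '"#buy"', '"#checkout"')
```

Here `diff_text` changes only the `click(...)` line; the `assert_text(...)` line appears as unchanged context, which mirrors the rule that a proposed update may move the path but not the expectation.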
Congrats! How does Blop ensure the generated test code matches your team's existing patterns and naming conventions?
Blop AI
@crystalmei Thank you! That’s actually something we’ve been working on quite a bit.
We added a knowledge base where teams can define their test patterns, naming conventions, helpers, and preferred structure. The agent also updates its memory as you use it, so it can learn from feedback and generated tests over time. The goal is for Blop’s tests to feel like they were written by the team, not dropped in by an external tool. We’re still looking at how we can improve this further.
Do you plan to support testing for mobile, multi-platform, and CLI products as well?
Blop AI
@wojtekszkutnik Hi! We’re currently working on mobile testing in the web app, as well as a Blop CLI that already powers our backend testing infrastructure. Soon, you’ll be able to use Blop directly from the CLI for your own workflows.
We also have an npm package available already: @blopai/cli. We’re actively working on improving and publishing the documentation for it, so it’ll be much easier to get started soon.
The locator-only healing is a clever constraint - it answers the regression-masking concern pretty cleanly. The harder problem I'd push on is spec drift: "user can sign up and check out" sounds stable but the flow itself evolves constantly in early-stage products. When checkout adds a promo code step, or signup adds phone verification, Blop's deep map of expected behavior is now wrong - but locators might still pass. Does Blop have a way to surface when the original description no longer matches what the product actually does, or does that gap just quietly accumulate until a real regression gets missed?