Launched this week

QApilot's CoWork
3x Mobile Automation. Same QE Team.
CoWork turns existing test cases into executable mobile automation with AI planning, human-approved replanning, and real-device execution on iOS, Android, and Flutter.
QApilot's CoWork
Hey everyone, Charan from QApilot here.
Every QA team knows what they want to test. "Log in, add an item to the cart, check the order confirmation." Saying it takes ten seconds. Then the team spends six months turning that sentence into something a machine will run: step definitions, selectors, automation glue, flaky scripts that break on every UI change.
The intent was never the hard part. The execution layer was the tax.
CoWork takes your existing natural-language or BDD test cases, converts them into structured BDD/Gherkin context, builds an execution plan, and runs the test on a real mobile device.
When the app behaves differently, like a changed label, unexpected popup, or interrupted flow, CoWork replans and asks for human approval before moving ahead. When it needs input, like an OTP, it pauses instead of guessing. If it can’t proceed, it fails honestly instead of faking a pass.
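To make the four outcomes above concrete, here is a minimal sketch of such a decision policy. It is not CoWork's implementation: the `Action`, `Observation`, and `decide` names are hypothetical and only illustrate the behavior described (continue, ask for approval on a replan, pause for input, fail honestly).

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    CONTINUE = auto()        # screen matches the plan, keep going
    ASK_APPROVAL = auto()    # a replan exists, but a human must approve it
    WAIT_FOR_INPUT = auto()  # e.g. an OTP is required, so pause
    FAIL = auto()            # cannot proceed; report a failure, never a pass


@dataclass
class Observation:
    """Hypothetical summary of what the runner sees at one step."""
    matches_plan: bool
    needs_user_input: bool = False
    replan_available: bool = False


def decide(obs: Observation) -> Action:
    # Required input (like an OTP) always pauses the run instead of guessing.
    if obs.needs_user_input:
        return Action.WAIT_FOR_INPUT
    if obs.matches_plan:
        return Action.CONTINUE
    # The app drifted from the plan: propose a replan, but do not apply it silently.
    if obs.replan_available:
        return Action.ASK_APPROVAL
    # No safe path forward: fail honestly.
    return Action.FAIL
```

The point of the ordering is that `FAIL` is the default when nothing else applies, so a step can never fall through to a pass.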
Who it’s for: QA leaders, SDETs, mobile engineering teams, and product teams with existing test cases but low execution coverage before releases.
Common use cases: regression testing, release readiness, checkout/login flows, OTP-heavy journeys, app flows that break often, and reducing manual execution without rewriting everything from scratch.
What makes CoWork different is the balance: AI execution where it can move fast, human control where judgment matters.
If you run mobile tests, I’d genuinely love your take. Try it here: https://qapilot.io/product/cowork
Thanks for being here for the launch. I’ll be in the thread all day reading every comment.
-- Charan Tej, Product Guy @QApilot's CoWork
@charan_tej_kammara The "fails honestly instead of faking a pass" line is underrated. Most automation lies to you when the UI shifts under it, and a test that drifts and silently passes is worse than no test. When CoWork hits a screen that no longer matches the plan, does the human-approved replanning re-anchor to the intent of the step or to the old UI path? That distinction is what separates flaky from durable.
QApilot's CoWork
@charan_tej_kammara The honest-fail behavior is a smart choice. In mobile QA, a blocked test is annoying, but a fake pass is much worse because it creates confidence right when the team should slow down.
Curious how CoWork handles apps with heavy feature flags or UI states that change by user type?
QApilot's CoWork
@habibferdous - Thanks, Habib. Completely agree, a fake pass is far more dangerous than a blocked test.
For feature flags and user-type based UI states, we treat them as part of the execution context rather than noise. CoWork runs against the app state it sees, uses the given test intent and user context to decide the next step, and replans only within that boundary.
If the UI variant still supports the same journey, it can continue. If the flag or user state changes the intended behavior, it asks or fails honestly instead of forcing the test through.
PicWish
@charan_tej_kammara does it also support testing cross-app journeys?
@charan_tej_kammara The biggest challenge with mobile automation has never been writing the first version. It’s what happens after the app changes for the tenth or twentieth release.
You mention CoWork can replan execution when screens, labels or flows change, while still asking for human approval before changing intent. How much of that actually happens automatically in production? For teams using this every sprint, what percentage of failures end up needing manual intervention versus successfully recovering on their own?
That number would tell me more than any feature list because maintenance is usually where automation becomes expensive.
QApilot's CoWork
@vidushee_geetam The idea makes sense technically, but I’m interested in the operational impact.
If a team already has a mature manual regression process, what changes after six months of using CoWork? Is the biggest improvement shorter release cycles, higher regression coverage, fewer flaky tests, or simply less engineering time spent maintaining automation?
I’d be interested to know which metric your existing customers see improve first because that’s ultimately what teams will justify the investment with.
QApilot's CoWork
The replanning-on-UI-change part is the real claim here - most mobile suites die exactly when a label moves or an unexpected popup shows up, so an agent that recovers is genuinely useful. When CoWork replans around a changed label, does it persist the adapted step back into the BDD/Gherkin definition so the next run is deterministic, or does it re-infer the path every run (which would make pass/fail non-reproducible across CI runs)? And does execution happen on a hosted device farm or on my own connected devices - that matters for builds behind auth or internal-only distribution.
QApilot's CoWork
@hi_i_am_mimo Wonderful question. We gave that control to the human in the loop, who decides whether to save the adapted steps as the default for all future runs or to roll them back. Once CoWork replans and completes a run, the user can accept the adapted steps and ingest them into the test suite, or reject them and re-run CoWork as needed.
We believe that judgment is better left in the hands of the human.
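As a rough illustration of that accept/reject flow, assuming a stored test is just a named list of step strings (the `StoredTest` and `review_adapted_steps` names are hypothetical, not CoWork's API):

```python
from dataclasses import dataclass, field


@dataclass
class StoredTest:
    """Hypothetical stored test: a name, its current steps, and prior versions."""
    name: str
    steps: list[str]
    history: list[list[str]] = field(default_factory=list)


def review_adapted_steps(test: StoredTest, adapted: list[str], accepted: bool) -> StoredTest:
    """Apply a human decision to the steps that were adapted during a run."""
    if accepted:
        # Keep the previous version so the change can be rolled back later.
        test.history.append(list(test.steps))
        test.steps = list(adapted)
    # On rejection nothing is persisted; the stored steps stay as they were.
    return test
```

In this sketch nothing changes in the suite unless a person explicitly accepts the adapted steps.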
That human-in-the-loop accept/reject is the right default - auto-persisting silently is exactly how suites quietly drift out of sync. One follow-up on the accept path: when I ingest an adapted step, does it get written back as readable Gherkin/BDD text I can diff in a PR, or as an opaque recorded-selector blob? The first stays reviewable by the team; the second tends to rot.
QApilot's CoWork
@shibin_tv Great question, Shibin. We don’t see agentic testing as a replacement for deterministic tests. Those should stay where behavior is fixed and repeatable.
The challenge is mobile execution. The same flow can behave differently across devices, OS versions, permissions, keyboards, overlays, network states, and timing delays.
Traditional tests often fail on these execution quirks even when the user journey is still valid. CoWork tries to handle that in-run: continue when it is safe, ask when judgment is needed, and fail honestly when the product behavior has truly changed.
When testing mobile applications at scale, how does CoWork handle deep state synchronization or state flakiness across parallel test instances? Is it actively maintaining a synchronized virtual state tree across the instances, or relying on aggressive DOM/view re-verification loops to prevent false negatives?
QApilot's CoWork
@juno_dost - Great question. CoWork doesn’t maintain one shared virtual state tree across parallel runs. Each device/run keeps its own execution context and validates against the app model, test intent, and current screen state.
It’s not just repeated view polling either. We use contextual re-verification, checkpoints, and bounded replanning to handle mobile state quirks.
If the journey is still valid, it continues. If the state is polluted or behavior has changed, it surfaces the issue.
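One way to picture per-run isolation with checkpoints and bounded replanning is the sketch below. It is an assumption-laden illustration, not CoWork's internals: `RunContext`, `max_replans`, and the default limit of 2 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunContext:
    """Hypothetical per-device state; nothing here is shared across parallel runs."""
    device_id: str
    max_replans: int = 2      # assumed bound, chosen only for illustration
    replans_used: int = 0
    last_checkpoint: int = 0  # index of the last step verified on screen

    def checkpoint(self, step_index: int) -> None:
        # Record the last step whose result was verified against the screen.
        self.last_checkpoint = step_index

    def try_replan(self) -> bool:
        # Bounded replanning: once the budget is spent, the run must surface
        # the issue instead of retrying indefinitely.
        if self.replans_used >= self.max_replans:
            return False
        self.replans_used += 1
        return True
```

Because each run owns its own `RunContext`, a polluted state on one device cannot consume another device's replan budget or overwrite its checkpoint.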
This feels especially useful for teams that already have test cases written down, but still end up doing too much manual release checking because automation takes too long to maintain.
The human-approved replanning part is the most important detail to me. For QA, I’d rather have the system pause and ask when something changes than confidently fake its way through a broken flow :)
Curious how CoWork handles UI changes over time. Does it learn from approved replans so the same changed label or popup does not need approval every time?
QApilot's CoWork
@andrasczeizel - "I’d rather have the system pause and ask when something changes" is exactly our approach to building an autonomous system with the human at the centre. As for approved replans: once the test execution is done, we again give control to the human in the loop to decide whether to ingest the approved replans into the test suite and use them for further runs. That way the same change wouldn't need approval every time.