Playwright marks a test as passed when no exception is thrown. That's not the same as the right thing happening on screen.
Confidence Gate adds an AI verification layer to Playwright runs. Each step is verified against its screenshot using vision AI and gets one of three states: passed / inconclusive / failed. The run is scored 0–100. Final gate: ship / watch / block.
The score penalises self-healing selectors, behavior overrides, and inconclusive steps. MIT licensed. Self-hostable via Docker Compose.
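
To make the scoring idea concrete, here is a minimal sketch of how per-step states and penalties could roll up into a 0–100 score and a ship / watch / block gate. It is not Confidence Gate's actual implementation: the type names, the penalty weights, and the gate thresholds (90 and 70) are all assumptions chosen for illustration.

```typescript
// Hypothetical types; the real project's data model may differ.
type StepState = "passed" | "inconclusive" | "failed";
type Gate = "ship" | "watch" | "block";

interface StepResult {
  state: StepState;
  selfHealed: boolean;       // a selector was repaired at run time
  behaviorOverride: boolean; // the step's default behavior was overridden
}

// Assumed penalty weights, in score points. Illustrative only.
const PENALTY = {
  failed: 40,
  inconclusive: 10,
  selfHealed: 5,
  behaviorOverride: 5,
};

// Assumed gate thresholds. Illustrative only.
const SHIP_AT = 90;
const WATCH_AT = 70;

// Start from 100 and subtract a penalty for each risky signal,
// then clamp the result to the 0–100 range.
function scoreRun(steps: StepResult[]): number {
  let score = 100;
  for (const step of steps) {
    if (step.state === "failed") score -= PENALTY.failed;
    if (step.state === "inconclusive") score -= PENALTY.inconclusive;
    if (step.selfHealed) score -= PENALTY.selfHealed;
    if (step.behaviorOverride) score -= PENALTY.behaviorOverride;
  }
  return Math.max(0, Math.min(100, score));
}

// Map a score onto the final gate.
function gateFor(score: number): Gate {
  if (score >= SHIP_AT) return "ship";
  if (score >= WATCH_AT) return "watch";
  return "block";
}
```

Under these assumed weights, a run that throws no exception but contains one self-healed selector, one behavior override, and one inconclusive step scores 80 and lands on "watch" rather than "ship", which is the gap between "no exception" and "verified" that the gate is meant to surface.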