The full surface — every backend, every agent-native tool, the audits, the network control, the evidence, the personas, and the safeguards — in one place. Then how it maps to your industry.
One tool surface, four backends — watchr auto-detects what's connected and routes the command.
Launch, tap, type, swipe and observe simulators via simctl + idb.
Drive a real iPhone over USB through the on-device WatchrRunner.
adb-backed control; target a specific device by serial.
A real headed Chromium — navigate, click, type, assert.
Same commands on every backend; device="" auto-detects.
Boot, list, set, verify and set up simulators and devices.
Home, back, app-switch and more via press_button.
watchr is an MCP server — your coding agent does the tapping, waiting and observing.
"Test the checkout on Android." The agent drives; you describe.
perform, observe, run_steps, tap, type, swipe, wait — a full surface.
Taps by element label, not brittle x/y coordinates.
run_steps executes a whole flow in one round trip; observe returns a screenshot plus the UI tree.
Read the UI tree, page HTML, or run arbitrary JS in the page.
Claude Code, Codex, Cursor, Pi, OpenCode, Cline — anything MCP.
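To make "taps by element label, not brittle x/y coordinates" concrete, here is a minimal Python sketch of the general idea. It is not watchr's implementation: the `Node` shape, `find_by_label` and `tap_point` are hypothetical names invented for illustration, and the real UI tree returned by observe may look quite different.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """Hypothetical UI-tree node: a label, a frame (x, y, width, height), children."""
    label: str
    frame: tuple[int, int, int, int]
    children: list["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_by_label(root: Node, label: str) -> Optional[Node]:
    """Return the first node whose label matches exactly, or None."""
    return next((n for n in walk(root) if n.label == label), None)


def tap_point(node: Node) -> tuple[int, int]:
    """Center of the node's frame, i.e. where a tap would land."""
    x, y, w, h = node.frame
    return (x + w // 2, y + h // 2)


# Two made-up layouts: the button sits in a different place on each screen.
phone = Node("root", (0, 0, 390, 844), [Node("Checkout", (20, 760, 350, 48))])
tablet = Node("root", (0, 0, 820, 1180), [Node("Checkout", (600, 1100, 200, 48))])

phone_tap = tap_point(find_by_label(phone, "Checkout"))
tablet_tap = tap_point(find_by_label(tablet, "Checkout"))
```

The same label lookup resolves to a different point on each layout, which is why a label-based step can survive a redesign that would break a hard-coded coordinate.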
Nothing leaves your machine — ideal for proprietary code and regulated teams.
Drives your own simulators, devices and browser locally.
Your code, screens and data never leave your environment.
No stray output to the MCP channel; diagnostics are stderr-only.
Self-hosted by nature — fits a VPN / private-subnet QA setup.
From a one-line prompt to a structured, repeatable run — with audits on every screen.
Describe a flow → watchr builds a suite, runs every case, reports.
Save baselines, compare screens, catch UI drift.
Missing labels, small touch targets, contrast issues.
Slow screens flagged; timing on every action.
Surface common web security issues on each screen.
On-page SEO checks for web flows, built in.
Frustration tracking — rage taps, dead taps, dead ends.
App state, crash logs, console errors, ANRs after any action.
Random taps & swipes to surface crashes under stress.
Make tests deterministic — control the backend and the device's state.
Stub responses; simulate errors, timeouts and offline with mock_route.
Read, set and clear localStorage and cookies — seed logged-in state.
See every request, get a summary, pull response bodies.
Assert status codes, no failed requests, no console errors.
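The route stubbing described above can be pictured as a small lookup table from URL patterns to canned responses. The Python sketch below is an assumption-laden illustration of that technique only: `RouteTable`, `Stub` and the glob-style patterns are hypothetical and say nothing about mock_route's actual parameters.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Stub:
    """A canned response: HTTP status plus an optional body."""
    status: int
    body: str = ""


class RouteTable:
    """Hypothetical route table: glob patterns mapped to stubbed responses."""

    def __init__(self) -> None:
        self._routes: list[tuple[str, Stub]] = []

    def mock(self, pattern: str, stub: Stub) -> None:
        """Register a stub for every URL matching the glob pattern."""
        self._routes.append((pattern, stub))

    def resolve(self, url: str) -> Optional[Stub]:
        """Return the most recently registered matching stub, or None to pass through."""
        for pattern, stub in reversed(self._routes):
            if fnmatchcase(url, pattern):
                return stub
        return None


routes = RouteTable()
# Simulate a declined card and a backend outage without touching a real server.
routes.mock("*/api/payments*", Stub(402, '{"error": "card_declined"}'))
routes.mock("*/api/inventory*", Stub(503))

declined = routes.resolve("https://shop.example/api/payments/charge")
passthrough = routes.resolve("https://shop.example/api/cart")
```

Because unmatched URLs resolve to `None`, only the endpoints under test are stubbed and everything else would reach the real backend.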
Explicit pass/fail checks the agent can act on.
assert_text_visible / verify_text — confirm what's on screen.
assert_url_matches — confirm navigation landed where it should.
assert_request_status, no-failed-requests, no-console-errors.
wait_for_stable / wait_until — sync on real app state, not sleeps.
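"Sync on real app state, not sleeps" usually means polling a condition against a deadline instead of pausing for a fixed time. The Python sketch below shows that general pattern under stated assumptions: this `wait_until` function, its parameters and its defaults are illustrative and are not the signature of watchr's wait_until tool.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have elapsed.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "the screen has finished loading": true on the third check.
calls = {"n": 0}


def ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


became_ready = wait_until(ready, timeout=1.0, interval=0.01)
timed_out = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

The wait returns as soon as the condition holds, so a fast screen costs milliseconds while a slow one still gets the full timeout, and a timeout is reported as an explicit failure rather than a silent pass.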
Proof you can share — auto-captured, no setup.
Per-run .mp4 / .webm video, including whole run_steps runs.
Full-resolution screenshots saved for bug reports and diffs.
Network logs captured per run and queryable, with response bodies.
Markdown + JSON with pass/fail, screenshots and evidence.
Playwright trace zips and HAR captures for deep web debugging.
report_issue writes a structured bug report from a run.
Many users, many devices, at once — locally.
Define users with their own timeouts and behavior; switch with use_persona.
Run the same app as different users, regions and slow-network profiles.
N isolated web sessions × N flows in a single round trip (parallel_run).
Target devices by serial; run across several phones at once.
A headed-Chromium profile that keeps web logins alive across runs.
Track named actions and sessions across a run.
It explores on its own — within guardrails enterprise review asks for.
Turn it loose on a web app and let it map and probe flows.
You review actions; nothing irreversible runs blind.
Point it at staging or QA; it is not a path to mutate production data.
Recovers from modals, session timeouts and unexpected states.
The suite engine collects evidence; you judge pass/fail.
Checks readiness across web/iOS/Android and prints exact fixes.
One agent does what a manual QA pod does — across every platform, on every release. That's the lever: automate the repetitive test cycle and a leaner team ships more.
Local-first, multi-platform QA fits regulated, high-stakes teams, and it covers much of the manual testing headcount they would otherwise need.
Customer financial data never leaves your machines. Drive account opening, login, 2FA, payments, transfers and statements on real iOS, Android and web — then mock declined cards, insufficient funds, rate-limit and outage responses to prove the unhappy paths without touching production. Every run leaves a timestamped audit trail of screenshots, network logs and a pass/fail report.
Verify exactly what the regulator checks: KYC and age-gates, geo/IP restrictions, responsible-gambling banners, deposit and loss limits, cool-off and self-exclusion — across iOS, Android and web. Seed accounts and mock provider responses so you can test limit breaches and exclusion states deterministically. Player data stays in your boundary.
PHI stays local; nothing uploads to a cloud. Test patient onboarding, appointment booking, secure messaging, prescriptions and clinician dashboards on real devices, and run the built-in accessibility audits to check against the accessibility standards healthcare apps are held to. Mock EHR/FHIR endpoints to test error and consent states safely.
Accessibility isn't optional — the a11y audit catches missing labels, small touch targets, low contrast and keyboard traps on every screen, with evidence for your conformance report. Self-hosted and local by default; add security and SEO scans across the whole service. Test forms, eligibility flows and document uploads end-to-end.
Catch a broken checkout before your customers do — search, product page, cart, promo codes, guest and account checkout, and payment across iOS, Android and web. Visual regression guards product and cart pages against silent breakage, performance audits protect conversion, and route mocking simulates out-of-stock, price changes and payment failures.
Your agent ships features faster than QA can keep up. watchr gives that agent hands — generate suites from a sentence, run them across simulators, devices and browsers, and get video plus reports back — right inside Claude Code, Cursor or any MCP client. No test code to write or maintain.