How to Reduce Playwright Test Flakiness (Hands-On Guide)

Flaky tests slow delivery, erode trust, and often signal deeper reliability issues. Large engineering orgs track and mitigate flakes systematically. Google, for example, has written extensively about the causes and mitigation of flaky tests, and has reported that roughly 1.5% of all its test runs produce a flaky result.

1) Use stable, user-facing locators (and let auto-waiting work)

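Favor locators that reflect what users see, such as getByRole and getByLabel, and fall back to getByTestId when no accessible role or label is available. These locators survive markup refactors better than CSS or XPath selectors, and Playwright’s auto-waiting then performs each action only once the element is visible, enabled, and stable (see Playwright’s locators guide).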

Pro tip: Replace waitForTimeout with assertions that wait for the right state (e.g., expect(locator).toBeVisible()); Playwright’s assertions are web-first and auto-retry until the condition is met or the assertion timeout expires.
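As a minimal sketch of both points, the test below uses label- and role-based locators and a web-first assertion in place of a fixed sleep. The URL, field label, button name, and confirmation text are hypothetical placeholders, not taken from a real app.

```typescript
import { test, expect } from '@playwright/test';

test('saves profile changes', async ({ page }) => {
  // Hypothetical URL and labels; substitute your own app's.
  await page.goto('https://example.com/settings');

  // Locate by accessible label and role rather than CSS classes.
  // fill() and click() auto-wait for the element to be actionable.
  await page.getByLabel('Display name').fill('Ada');
  await page.getByRole('button', { name: 'Save' }).click();

  // Instead of: await page.waitForTimeout(2000);
  // This assertion retries until the message is visible or the timeout expires.
  await expect(page.getByText('Profile saved')).toBeVisible();
});
```

The fixed sleep either wastes time when the app is fast or fails when it is slow; the assertion waits only as long as it needs to.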

2) Isolate state per test and authenticate deterministically

Run each test in a fresh browser context (Playwright Test does this by default) to prevent cross-test leakage, and preload login with a saved storageState so you avoid flaky UI logins. The official authentication guide shows a simple “setup” project pattern that logs in once and shares the saved state across tests.
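A sketch of that setup-project pattern might look like the following. The login URL, field labels, and the TEST_USER and TEST_PASSWORD environment variables are assumptions for illustration. The first file logs in once and saves the session:

```typescript
// tests/auth.setup.ts
import { test as setup, expect } from '@playwright/test';

const authFile = 'playwright/.auth/user.json';

setup('authenticate', async ({ page }) => {
  // Hypothetical login page, labels, and environment variables.
  await page.goto('https://example.com/login');
  await page.getByLabel('Email').fill(process.env.TEST_USER ?? '');
  await page.getByLabel('Password').fill(process.env.TEST_PASSWORD ?? '');
  await page.getByRole('button', { name: 'Sign in' }).click();

  // Wait until login has finished before saving cookies and local storage.
  await expect(page.getByRole('button', { name: 'Sign out' })).toBeVisible();
  await page.context().storageState({ path: authFile });
});
```

The config then runs the setup project first and loads the saved state into every test of the dependent project:

```typescript
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Runs files matching *.setup.ts before any dependent project.
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
        // Each test starts in a fresh context that is already signed in.
        storageState: 'playwright/.auth/user.json',
      },
      dependencies: ['setup'],
    },
  ],
});
```

Keep the saved state file out of version control, since it contains session cookies.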

3) Remove network randomness with mocks or HAR replay

External services, rate limits, or variable latency introduce nondeterminism. Intercept requests or replay a recorded HAR so UI flows see consistent responses (Playwright’s mock & HAR docs).
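The two approaches can be sketched as follows, assuming a hypothetical /api/orders endpoint, page URL, and HAR file path. The first test fulfills the request with a fixed payload; the second serves matching requests from a previously recorded HAR.

```typescript
import { test, expect } from '@playwright/test';

test('lists orders from a mocked API', async ({ page }) => {
  // Hypothetical endpoint and payload: every run sees the same response.
  await page.route('**/api/orders', async (route) => {
    await route.fulfill({ json: [{ id: 1, status: 'shipped' }] });
  });

  await page.goto('https://example.com/orders');
  await expect(page.getByText('shipped')).toBeVisible();
});

test('replays recorded traffic from a HAR file', async ({ page }) => {
  // Serve requests matching the URL glob from the HAR file.
  // update: false replays only; set it to true once to record the file.
  await page.routeFromHAR('./hars/orders.har', {
    url: '**/api/**',
    update: false,
  });

  await page.goto('https://example.com/orders');
  await expect(page.getByRole('heading', { name: 'Orders' })).toBeVisible();
});
```

Register routes before calling page.goto so that the first requests from the page are already intercepted.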

4) Tame visuals: disable animations and stabilize rendering

Animations and transitions cause pixel drift and screenshot noise. Disable or fast-forward animations for visual tests—this walkthrough shows a practical approach to stabilizing screenshots in Playwright (guide).
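One way to apply this with Playwright's built-in options is sketched below. The tolerance value is an arbitrary assumption, and reducedMotion only helps if the app's CSS honors the prefers-reduced-motion media query.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      // Stop CSS animations and transitions before capturing.
      // This is already the default for toHaveScreenshot; stated here for clarity.
      animations: 'disabled',
      // Assumed tolerance for anti-aliasing noise; tune for your app.
      maxDiffPixelRatio: 0.01,
    },
  },
  use: {
    // Emulate the user's "reduce motion" preference in every context.
    contextOptions: { reducedMotion: 'reduce' },
  },
});
```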

5) Keep retries low—but capture a trace on the first retry

Retries can de-noise CI, but they should help you debug, not hide, flakes. Enable minimal retries and set trace: 'on-first-retry' to capture DOM snapshots, network, and console logs when a test first flakes (see the Trace Viewer).
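A minimal config for this, assuming a single retry on CI and none locally, could look like:

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // One retry on CI (an assumed value); none locally so flakes stay visible.
  retries: process.env.CI ? 1 : 0,
  use: {
    // Record a trace only when a failed test is retried for the first time.
    trace: 'on-first-retry',
  },
});
```

The recorded trace can then be opened with npx playwright show-trace followed by the path to the trace file, or from the HTML report.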

6) Standardize your CI environment

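Many “it only fails on CI” issues come from environment drift. Follow the official CI guide: run tests in the Playwright Docker image or install browsers with npx playwright install --with-deps, and pin the Playwright version so every run uses the same browser builds. Note that the guide advises against caching browser binaries, because restoring the cache takes about as long as downloading them.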

7) Treat flakes as real bugs and fix the nondeterminism

Flakes usually stem from concurrency, time, shared state, or infrastructure. Quarantine if needed, but prioritize root-cause fixes; Google’s analysis of where flaky tests come from is a useful checklist (see the post).

8) Reproduce locally under stress to mirror CI

CI is slower and more parallelized. Recreate that locally—bump --workers, loop with --repeat-each, and throttle resources—following this step-by-step guide to reproducing Playwright flakes (article).
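The same stress settings can also live in the config so they are easy to switch on. The sketch below assumes a hypothetical STRESS environment variable, and the repeat and worker counts are arbitrary; it mirrors running npx playwright test with --repeat-each=20 --workers=8.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

// STRESS is a hypothetical environment variable used only for this sketch.
const stress = process.env.STRESS === '1';

export default defineConfig({
  // Run every test many times to surface intermittent failures.
  repeatEach: stress ? 20 : 1,
  // Oversubscribe workers to imitate a busy, parallel CI machine.
  workers: stress ? 8 : undefined,
  // No retries under stress, so each flake shows up as a failure.
  retries: stress ? 0 : 1,
});
```

With this in place, a stress run is STRESS=1 npx playwright test, optionally narrowed to the one spec file you suspect.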

Why these steps work

They directly address the dominant causes of flaky E2E tests (timing, state leakage, and environment drift) while leaning on Playwright’s built-ins (locators with auto-wait, contexts, mocking, traces). If you want deeper background on the reliability problem, Martin Fowler’s explainer on non-deterministic tests is a concise framing.

Written by:

Michael Rosenfield

Co-founder
