Pramod Dutta

Fixing Flaky Tests in Playwright: A Step-by-Step Guide with Examples

A comprehensive guide to identifying and fixing flaky tests in Playwright. Learn about auto-waiting, locator best practices, retry strategies, and practical code examples to make your Playwright test suite rock-solid.

Playwright has quickly become the go-to browser automation framework for modern web testing. Its architecture -- with auto-waiting, browser contexts for isolation, and built-in network interception -- was specifically designed to reduce test flakiness. Yet flaky tests in Playwright remain one of the most common complaints in QA communities.

The truth is that Playwright gives you excellent tools for writing reliable tests, but it does not force you to use them correctly. This guide walks you through the most common causes of flaky Playwright tests, shows you exactly how to fix each one, and provides production-ready patterns that you can adopt immediately.

Why Playwright Tests Become Flaky

Despite Playwright's anti-flakiness design, tests still become flaky for several reasons.

The Framework Helps, But It Cannot Think for You

Playwright's auto-waiting mechanism waits for elements to be visible, enabled, and stable before interacting with them. This eliminates many common flakiness causes that plague Selenium tests. However, auto-waiting has limits. It waits for the element itself, not for the application state that the element represents.

For example, Playwright will wait for a button to become clickable, but it will not wait for the API call that the button triggers to complete. If your test clicks the button and then immediately asserts the result of the API call, you have a race condition that auto-waiting cannot prevent.
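One way to close that gap is to start listening for the response before the click, so that even a very fast reply cannot be missed. The sketch below assumes a hypothetical `/api/orders` endpoint. `ResponseLike` and `PageLike` are minimal stand-ins for the parts of Playwright's `Response` and `Page` that the helper touches, included only so the snippet stands on its own; in a real test you would pass the `page` fixture.

```typescript
// Minimal stand-ins for the Playwright types used here (assumption for this sketch).
interface ResponseLike {
  url(): string;
  status(): number;
}
interface PageLike {
  waitForResponse(predicate: (r: ResponseLike) => boolean): Promise<ResponseLike>;
}

// Builds a predicate that matches a 200 response whose URL contains the fragment.
function isOkResponseFrom(urlFragment: string): (r: ResponseLike) => boolean {
  return (r) => r.url().includes(urlFragment) && r.status() === 200;
}

// Registers the wait BEFORE running the trigger, then awaits the response.
async function triggerAndWaitForApi(
  page: PageLike,
  trigger: () => Promise<void>,
  urlFragment: string, // hypothetical endpoint fragment, e.g. '/api/orders'
): Promise<ResponseLike> {
  const responsePromise = page.waitForResponse(isOkResponseFrom(urlFragment));
  await trigger();
  return responsePromise;
}

// Possible usage inside a test (names are illustrative):
//   await triggerAndWaitForApi(
//     page,
//     () => page.getByRole('button', { name: 'Submit Order' }).click(),
//     '/api/orders',
//   );
```

Waiting for the response only proves the request finished. The UI may still need a moment to render, so a retrying assertion on the visible result is still worth adding afterwards.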

The Testing Pyramid Still Applies

Playwright tests are end-to-end tests by nature. They exercise the full stack -- browser, front-end application, API layer, and database. Every layer introduces potential variability. A test that depends on all layers being fast and available will occasionally fail when any single layer is slow or unavailable.

CI Environments Are Different from Your Laptop

Playwright tests that pass locally may fail in CI due to differences in available resources, network configurations, or display settings. CI runners typically have fewer CPU cores, less memory, and no GPU acceleration, all of which affect browser rendering speed and test reliability.

Common Causes and Fixes

Cause 1: Fragile Selectors

The most common cause of flaky Playwright tests is using selectors that are not stable across renders or code changes.

Problem: CSS class selectors that change with builds
// FLAKY: CSS class names may change with build tools (CSS modules, Tailwind)

await page.click('.btn-primary-2xl-variant-a');

CSS class names generated by CSS modules, Tailwind CSS, or styled-components can change between builds. A selector that works today may break tomorrow without any intentional code change.

Problem: XPath selectors that depend on DOM structure
// FLAKY: Breaks if any element is added/removed in the DOM hierarchy

await page.click('/html/body/div[2]/main/div[1]/form/button[3]');

Absolute XPath selectors are brittle because any change to the DOM structure -- even adding a wrapper div for styling -- breaks the selector.

Fix: Use resilient locator strategies

Playwright provides several locator strategies designed for stability. Use them in this order of preference.

// Locators are created synchronously; await the action or assertion, not the locator itself

// BEST: Role-based locators (resilient to implementation changes)
page.getByRole('button', { name: 'Submit Order' });

// GOOD: Test ID locators (explicitly stable)
page.getByTestId('submit-order-button');

// GOOD: Text-based locators (tied to user-visible content)
page.getByText('Submit Order');

// GOOD: Label-based locators (for form inputs)
page.getByLabel('Email Address');

// GOOD: Placeholder-based locators
page.getByPlaceholder('Enter your email');

// OK: CSS selectors with data attributes
page.locator('[data-testid="submit-order"]');

// AVOID: Generated CSS classes
// AVOID: Absolute XPath
// AVOID: Positional selectors (nth-child, etc.)

Best practice: Add data-testid attributes to your components

// Your component
<button data-testid="submit-order">Submit Order</button>

// Your test
await page.getByTestId('submit-order').click();

This creates a stable contract between your test and your component that is unaffected by styling changes, content changes (for non-text selectors), or DOM restructuring.

Cause 2: Not Waiting for Application State

Playwright's auto-waiting handles element-level waits, but application-level state changes require explicit waiting.

Problem: Asserting before data loads
// FLAKY: Navigation completes before the API response arrives

await page.goto('https://app.example.com/dashboard');

// The dashboard is rendered but the data hasn't loaded yet

const revenue = await page.textContent('#total-revenue');

expect(revenue).toBe('$42,500');

Fix: Wait for the specific condition you are asserting
// STABLE: Wait for the network request to complete

// Option 1: Start waiting for the API response before navigating,
// so a fast response cannot slip past the listener
const revenueResponse = page.waitForResponse(
  response => response.url().includes('/api/revenue') && response.status() === 200
);
await page.goto('https://app.example.com/dashboard');
await revenueResponse;

const revenue = await page.textContent('#total-revenue');
expect(revenue).toBe('$42,500');

// Option 2: Use Playwright's built-in assertion retries

await page.goto('https://app.example.com/dashboard');

await expect(page.locator('#total-revenue')).toHaveText('$42,500');

The second approach is preferred because expect(locator).toHaveText() is a retrying assertion. Playwright will keep checking the element's text content until it matches or the timeout expires. This is more resilient than a one-time check.
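To make the idea of a retrying assertion concrete, here is a conceptual model of what "keep checking until it matches or the timeout expires" means. This is a sketch for illustration, not Playwright's implementation; the function name, default timeout, and polling interval are all assumptions.

```typescript
// Conceptual model of a retrying assertion (NOT Playwright's implementation):
// re-read the value until it matches or the timeout elapses.
async function expectEventually<T>(
  read: () => T | Promise<T>,
  expected: T,
  timeoutMs = 5000, // assumed default for this sketch
  intervalMs = 100, // assumed polling interval for this sketch
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  let last = await read();
  while (last !== expected) {
    if (Date.now() >= deadline) {
      throw new Error(`Timed out: expected ${String(expected)}, last saw ${String(last)}`);
    }
    // Pause briefly, then read the value again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    last = await read();
  }
}
```

A one-time check corresponds to calling `read()` once and comparing. The retrying version tolerates the window in which the element exists but still shows a loading value.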

Cause 3: Improper Navigation Handling

Navigation-related flakiness is extremely common, especially in single-page applications where "navigation" may or may not involve an actual page load.

Problem: Clicking a link that triggers client-side routing
// FLAKY: page.waitForNavigation may or may not fire for SPA navigation

await Promise.all([

page.waitForNavigation(),

page.click('a[href="/settings"]'),

]);

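In a single-page application, clicking a link may not trigger a traditional navigation event. The waitForNavigation call may hang until timeout or may resolve immediately without the page content having actually changed. Current Playwright documentation also marks page.waitForNavigation() as deprecated in favor of page.waitForURL().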

Fix: Wait for the destination content instead of the navigation event
// STABLE: Wait for the content that should appear after navigation

await page.click('a[href="/settings"]');

await expect(page.getByRole('heading', { name: 'Settings' })).toBeVisible();

This approach works regardless of whether the navigation is a full page load or a client-side route change.

Fix for actual page navigation:
// STABLE: Use waitForURL for real navigations

await page.click('a[href="/settings"]');

await page.waitForURL('**/settings');

Cause 4: Animation and Transition Interference

Modern web applications use animations extensively. Animations can interfere with Playwright's ability to click elements, read text, or take screenshots.

Problem: Clicking an element during animation
// FLAKY: The modal is animating into view, click may miss

await page.click('#modal-confirm-button');

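Playwright's actionability checks wait for the element to be visible and for its bounding box to stop moving, but that does not mean the animation has finished. A fade-in or color transition leaves the bounding box unchanged, and the application may not attach its click handler until the transition ends, so a click can land before the element is fully interactive.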

Fix: Wait for animations to complete
// Option 1: Rely on Playwright's actionability checks, which wait for a stable bounding box
// (force: false is the default and is shown here only for emphasis)
await page.locator('#modal-confirm-button').click({ force: false });

// Option 2: Disable animations entirely in tests
await page.addStyleTag({
  content: `
    *, *::before, *::after {
      animation-duration: 0s !important;
      animation-delay: 0s !important;
      transition-duration: 0s !important;
      transition-delay: 0s !important;
    }
  `,
});

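// Option 3: Configure in playwright.config.ts

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Reduce motion to avoid animation flakiness.
    // reducedMotion is a browser context option, so it goes under contextOptions.
    contextOptions: { reducedMotion: 'reduce' },
  },
});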

Disabling animations in tests is a widely recommended practice. It eliminates an entire category of flakiness with minimal impact on test coverage.

Cause 5: Viewport and Layout Sensitivity

Tests that depend on specific viewport sizes or responsive behavior are vulnerable to flakiness when the viewport varies between environments.

Problem: Element hidden on smaller viewports
// FLAKY: The sidebar navigation might be collapsed on the CI runner's viewport

await page.click('#sidebar-menu-item-settings');

Fix: Set explicit viewport sizes
// playwright.config.ts

export default defineConfig({

use: {

viewport: { width: 1280, height: 720 },

},

});

// Or per-test:

test('desktop navigation', async ({ page }) => {

await page.setViewportSize({ width: 1280, height: 720 });

// Now the sidebar is guaranteed to be visible

await page.click('#sidebar-menu-item-settings');

});

Cause 6: Network Timing Variability

Tests that depend on real network requests are inherently flaky because network timing varies.

Problem: Test depends on real API responses
// FLAKY: API response time varies, may exceed default timeout

test('loads user profile', async ({ page }) => {

await page.goto('/profile');

await expect(page.getByText('John Doe')).toBeVisible();

});

Fix: Mock network requests for deterministic behavior
// STABLE: Mock API responses for consistent behavior

test('loads user profile', async ({ page }) => {

// Intercept the API call and return a mock response

await page.route('**/api/profile', async route => {

await route.fulfill({

status: 200,

contentType: 'application/json',

body: JSON.stringify({

name: 'John Doe',

email: 'john@example.com',

role: 'Admin'

}),

});

});

await page.goto('/profile');

await expect(page.getByText('John Doe')).toBeVisible();

});

Network mocking makes tests faster and more reliable. The trade-off is that you are not testing the real API integration -- but that should be covered by separate API-level tests, not by every UI test.
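When several tests mock JSON endpoints, the options passed to `route.fulfill()` tend to be repeated. A small helper can build them in one place. The sketch below is illustrative: `jsonResponse` and `JsonFulfillOptions` are hypothetical names, and the returned object only covers the three fields used in the example above.

```typescript
// Shape of the options this sketch passes to route.fulfill().
interface JsonFulfillOptions {
  status: number;
  contentType: string;
  body: string;
}

// Builds fulfill options for a JSON response; the status defaults to 200.
function jsonResponse(payload: unknown, status = 200): JsonFulfillOptions {
  return {
    status,
    contentType: 'application/json',
    body: JSON.stringify(payload),
  };
}

// Possible usage inside a test:
//   await page.route('**/api/profile', route =>
//     route.fulfill(jsonResponse({ name: 'John Doe', role: 'Admin' })));
//
// The same helper makes error paths easy to exercise:
//   route.fulfill(jsonResponse({ error: 'Server error' }, 500));
```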

When you do need real network requests:
// Use increased timeouts and proper waiting
test('loads user profile from real API', async ({ page }) => {
  // Start waiting for the specific API call before navigating,
  // so a fast response is not missed
  const profileResponse = page.waitForResponse(
    resp => resp.url().includes('/api/profile') && resp.status() === 200,
    { timeout: 15000 }
  );
  await page.goto('/profile');
  await profileResponse;

  // Now assert
  await expect(page.getByText('John Doe')).toBeVisible();
});

Cause 7: File Upload and Download Timing

File operations are a common source of flakiness because they depend on filesystem timing.

Problem: Asserting download before it completes
// FLAKY: Download might not be complete when we check the file

test('downloads report', async ({ page }) => {

await page.click('#download-report');

// File might not be on disk yet!

expect(fs.existsSync('/downloads/report.csv')).toBe(true);

});

Fix: Use Playwright's download handling
// STABLE: Wait for the download event
import fs from 'fs';

test('downloads report', async ({ page }) => {

const downloadPromise = page.waitForEvent('download');

await page.click('#download-report');

const download = await downloadPromise;

// Wait for download to complete

const path = await download.path();

expect(path).toBeTruthy();

// Verify file content

const content = fs.readFileSync(path, 'utf-8');

expect(content).toContain('Revenue');

});

Cause 8: Popup and Dialog Handling

Dialogs and popups must be handled before they appear. If you set up a dialog handler after the action that triggers the dialog, you have a race condition.

Problem: Dialog handler set up too late
// FLAKY: The dialog might fire before the handler is registered

test('confirms deletion', async ({ page }) => {

await page.click('#delete-account');

// Too late! With no listener registered, Playwright has already auto-dismissed the dialog

page.on('dialog', dialog => dialog.accept());

});

Fix: Set up the handler before the triggering action
// STABLE: Handler is ready before the dialog appears

test('confirms deletion', async ({ page }) => {

// Set up handler FIRST

page.on('dialog', dialog => dialog.accept());

// Then trigger the action

await page.click('#delete-account');

// Wait for the result

await expect(page.getByText('Account deleted')).toBeVisible();

});

// Alternative: Use once() for a one-time handler

test('confirms deletion', async ({ page }) => {

page.once('dialog', dialog => dialog.accept());

await page.click('#delete-account');

await expect(page.getByText('Account deleted')).toBeVisible();

});
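A handler that blindly accepts every dialog can hide the case where the wrong dialog appears. One option is to record the message while accepting, so the test can assert on it afterwards. In this sketch `createDialogRecorder` is a hypothetical helper, `DialogLike` is a minimal stand-in for the parts of Playwright's `Dialog` it uses, and the dialog text is invented for illustration.

```typescript
// Minimal stand-in for the parts of Playwright's Dialog used here (assumption).
interface DialogLike {
  message(): string;
  accept(): Promise<void>;
}

// Returns a handler that accepts dialogs and records their messages,
// so the test can later assert on what the dialog actually said.
function createDialogRecorder() {
  const messages: string[] = [];
  const handler = (dialog: DialogLike): Promise<void> => {
    messages.push(dialog.message());
    return dialog.accept();
  };
  return { handler, messages };
}

// Possible usage inside a test (the dialog text is hypothetical):
//   const dialogs = createDialogRecorder();
//   page.once('dialog', dialogs.handler);   // register BEFORE the click
//   await page.click('#delete-account');
//   expect(dialogs.messages).toEqual(['Delete your account?']);
```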

Playwright Configuration for Maximum Reliability

Your playwright.config.ts plays a crucial role in test reliability. Here is a configuration optimized for stability.

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({

// Retry failed tests automatically

retries: process.env.CI ? 2 : 0,

// Run tests in parallel, but limit workers in CI

workers: process.env.CI ? 2 : undefined,

fullyParallel: true,

// Fail the build if any test.only() is left in the code

forbidOnly: !!process.env.CI,

// Global timeout for each test

timeout: 30_000,

// Assertion timeout (for expect() retrying assertions)

expect: {

timeout: 10_000,

},

// Reporter configuration

reporter: process.env.CI

? [['html'], ['junit', { outputFile: 'test-results.xml' }]]

: [['html']],

use: {

// Base URL for all tests

baseURL: process.env.BASE_URL || 'http://localhost:3000',

// Consistent viewport

viewport: { width: 1280, height: 720 },

// Reduce motion to eliminate animation flakiness (a browser context option, so set via contextOptions)

contextOptions: { reducedMotion: 'reduce' },

// Capture trace on failure for debugging

trace: 'on-first-retry',

// Capture screenshot on failure

screenshot: 'only-on-failure',

// Capture video on failure

video: 'on-first-retry',

// Navigation timeout

navigationTimeout: 15_000,

// Action timeout (click, fill, etc.)

actionTimeout: 10_000,

},

projects: [

{

name: 'chromium',

use: { ...devices['Desktop Chrome'] },

},

{

name: 'firefox',

use: { ...devices['Desktop Firefox'] },

},

{

name: 'webkit',

use: { ...devices['Desktop Safari'] },

},

],

// Start the dev server before running tests

webServer: {

command: 'npm run start',

url: 'http://localhost:3000',

reuseExistingServer: !process.env.CI,

timeout: 120_000,

},

});

Key Configuration Decisions Explained

  • retries: 2 in CI: Automatically retries failed tests up to 2 times. This provides resilience against environmental flakiness while still surfacing consistently broken tests.
  • workers: 2 in CI: Limits parallelism in CI to reduce resource contention. Too many parallel browsers on a limited CI runner causes out-of-memory errors and timeouts.
  • trace: 'on-first-retry': Captures a trace only when a test fails and is being retried. This provides debugging information without the performance overhead of tracing every test.
  • reducedMotion: 'reduce': Tells the browser to prefer reduced motion, which most web applications respect by disabling animations. This eliminates animation-related flakiness.

Advanced Anti-Flakiness Patterns

Pattern 1: Page Object Model with Built-In Waits

Encapsulate page interactions in Page Objects that include appropriate waits.

// pages/checkout-page.ts

import { Page, Locator, expect } from '@playwright/test';

export class CheckoutPage {

private page: Page;

private couponInput: Locator;

private applyCouponButton: Locator;

private discountLabel: Locator;

private orderTotal: Locator;

private submitButton: Locator;

private confirmationHeading: Locator;

constructor(page: Page) {

this.page = page;

this.couponInput = page.getByLabel('Coupon Code');

this.applyCouponButton = page.getByRole('button', { name: 'Apply Coupon' });

this.discountLabel = page.getByTestId('discount-amount');

this.orderTotal = page.getByTestId('order-total');

this.submitButton = page.getByRole('button', { name: 'Place Order' });

this.confirmationHeading = page.getByRole('heading', { name: 'Order Confirmed' });

}

async goto() {

await this.page.goto('/checkout');

// Wait for the page to be fully loaded (not just navigated)

await expect(this.orderTotal).toBeVisible();

}

async applyCoupon(code: string) {

await this.couponInput.fill(code);

await this.applyCouponButton.click();

// Wait for the discount to be applied (API call completes)

await expect(this.discountLabel).not.toHaveText('$0.00');

}

async getOrderTotal(): Promise<string> {

return await this.orderTotal.textContent() ?? '';

}

async placeOrder() {

await this.submitButton.click();

// Wait for order confirmation (full round-trip to server)

await expect(this.confirmationHeading).toBeVisible({ timeout: 15000 });

}

}

// tests/checkout.spec.ts

import { test, expect } from '@playwright/test';

import { CheckoutPage } from '../pages/checkout-page';

test('apply coupon reduces total', async ({ page }) => {

const checkout = new CheckoutPage(page);

await checkout.goto();

await checkout.applyCoupon('SAVE20');

const total = await checkout.getOrderTotal();

expect(parseFloat(total.replace('$', ''))).toBeLessThan(100);

});
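The `parseFloat(total.replace('$', ''))` call in the test above works for totals under a thousand dollars, but `parseFloat` stops at the first comma, so a total such as `$1,250.00` would be read as `1`. A slightly more defensive parser is sketched below. `parseCurrency` is a hypothetical helper and assumes US-style formatting (comma as thousands separator, dot as decimal point).

```typescript
// Converts a US-formatted currency string such as "$1,250.00" to a number.
// Assumption: commas are thousands separators and the dot is the decimal point.
function parseCurrency(text: string): number {
  // Drop everything except digits, the decimal point, and a minus sign.
  const cleaned = text.replace(/[^0-9.-]/g, '');
  const value = Number.parseFloat(cleaned);
  if (Number.isNaN(value)) {
    throw new Error(`Not a currency amount: "${text}"`);
  }
  return value;
}

// Possible usage in the test above:
//   expect(parseCurrency(await checkout.getOrderTotal())).toBeLessThan(100);
```

Throwing on unparseable input makes a blank or still-loading total fail loudly instead of silently comparing `NaN`.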

Pattern 2: API State Setup

Instead of using the UI to set up test state (slow and flaky), use API calls directly.

// helpers/api.ts

import { APIRequestContext } from '@playwright/test';

export async function createTestUser(request: APIRequestContext) {

const response = await request.post('/api/users', {

data: {

name: 'Test User',

email: `test-${Date.now()}@example.com`,

password: 'SecurePass123!',

},

});

return response.json();

}

export async function seedProductCatalog(request: APIRequestContext) {

await request.post('/api/admin/seed', {

data: { catalog: 'test-products' },

});

}

// tests/shopping.spec.ts

import { test, expect } from '@playwright/test';

import { createTestUser, seedProductCatalog } from '../helpers/api';

test.beforeEach(async ({ request }) => {

await seedProductCatalog(request);

});

test('user can add product to cart', async ({ page, request }) => {

const user = await createTestUser(request);

// Login via API (fast) instead of UI (slow and flaky)

await page.goto('/');

await page.evaluate((token) => {

localStorage.setItem('auth_token', token);

}, user.token);

await page.goto('/products');

await page.getByRole('button', { name: 'Add to Cart' }).first().click();

await expect(page.getByTestId('cart-count')).toHaveText('1');

});
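With `fullyParallel` enabled, two workers could in principle call `createTestUser` within the same millisecond, in which case `Date.now()` would yield the same email twice. If the backend enforces unique emails, a random identifier is a safer choice. The sketch below uses Node's built-in `randomUUID`; `uniqueTestEmail` is a hypothetical helper name.

```typescript
import { randomUUID } from 'node:crypto';

// Generates an address that stays unique even when parallel workers
// create users within the same millisecond.
function uniqueTestEmail(prefix = 'test'): string {
  return `${prefix}-${randomUUID()}@example.com`;
}

// Possible usage in createTestUser:
//   data: { name: 'Test User', email: uniqueTestEmail(), password: 'SecurePass123!' }
```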

Pattern 3: Network Interception for Slow Endpoints

For endpoints that are slow or unreliable, intercept and mock them while letting other requests pass through.

test('dashboard loads with mixed real and mocked data', async ({ page }) => {

// Mock the slow analytics endpoint

await page.route('**/api/analytics/**', async route => {

await route.fulfill({

status: 200,

contentType: 'application/json',

body: JSON.stringify({

visitors: 1500,

pageViews: 4200,

bounceRate: 0.35,

}),

});

});

// Let all other API calls go through to the real server

// (no route set up = passes through)

await page.goto('/dashboard');

await expect(page.getByText('1,500 visitors')).toBeVisible();

});

Pattern 4: Retry Logic for Known Unstable Operations

For operations that are inherently unstable (e.g., third-party widget loading), use explicit retry logic.

import { test, expect, type Page } from '@playwright/test';

async function waitForThirdPartyWidget(page: Page, maxRetries = 3) {

for (let attempt = 1; attempt <= maxRetries; attempt++) {

try {

await page.waitForSelector('#third-party-widget iframe', {

state: 'attached',

timeout: 5000,

});

return; // Success

} catch (error) {

if (attempt === maxRetries) throw error;

console.log(`Widget load attempt ${attempt} failed, retrying...`);

await page.reload();

}

}

}

test('interacts with third-party payment widget', async ({ page }) => {

await page.goto('/checkout/payment');

await waitForThirdPartyWidget(page);

const widgetFrame = page.frameLocator('#third-party-widget iframe');

await widgetFrame.getByLabel('Card Number').fill('4242424242424242');

});

Debugging Flaky Playwright Tests

When you encounter a flaky test, Playwright provides excellent debugging tools.

Using Traces

Traces capture a complete record of what happened during a test run, including screenshots, DOM snapshots, network requests, and console logs.

# Run tests with trace enabled

npx playwright test --trace on

# View the trace

npx playwright show-trace test-results/my-test/trace.zip

The trace viewer shows a timeline of every action, assertion, and network request. You can step through the test and see exactly what the page looked like at each point, which is invaluable for understanding why a flaky test failed.

Using the Playwright Inspector

For interactive debugging, use the Playwright Inspector.

# Run tests with the inspector

npx playwright test --debug

# Or set the PWDEBUG environment variable

PWDEBUG=1 npx playwright test

Analyzing Failure Screenshots

Configure Playwright to capture screenshots on failure (which the recommended config above does). Compare failure screenshots across multiple flaky failures to identify visual patterns.

Integrating DeFlaky with Playwright

DeFlaky integrates with Playwright's JUnit reporter to track test reliability over time. This helps you identify which Playwright tests are flaky and prioritize fixes.

// playwright.config.ts

export default defineConfig({

reporter: [

['html'],

['junit', { outputFile: 'playwright-results.xml' }],

],

// ... other config

});

# After running tests, analyze results with DeFlaky

npx playwright test

deflaky analyze --input playwright-results.xml --format junit

# View flakiness trends on the dashboard

deflaky dashboard --open

DeFlaky tracks each test's pass/fail rate across runs and surfaces tests whose flakiness rate exceeds your configured threshold. For Playwright tests specifically, it can correlate flakiness with browser type, helping you identify tests that are only flaky in specific browsers.
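To illustrate what a per-test flakiness rate and threshold can look like, here is a small sketch. It is not DeFlaky's algorithm, which this guide does not describe; the definition used (share of failing runs, counted only when a test has both passed and failed) and the 5% default threshold are assumptions chosen for the example.

```typescript
// One recorded outcome per run of a single test: true = passed, false = failed.
type RunHistory = boolean[];

// Fraction of failing runs, counted only when the test has BOTH passed and failed.
// A test that always fails is broken rather than flaky, so it scores 0 here.
function flakinessRate(history: RunHistory): number {
  const failures = history.filter((passed) => !passed).length;
  const passes = history.length - failures;
  if (failures === 0 || passes === 0) return 0;
  return failures / history.length;
}

// Flags a test whose flakiness rate exceeds the threshold (assumed default: 5%).
function isFlaky(history: RunHistory, threshold = 0.05): boolean {
  return flakinessRate(history) > threshold;
}
```

Separating "always fails" from "sometimes fails" matters for prioritization: the former is a plain bug, while the latter is the intermittent behavior this guide targets.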

A Reliability Checklist for Playwright Tests

Use this checklist when writing or reviewing Playwright tests.

Selectors

  • [ ] Use role-based or test-id selectors instead of CSS classes
  • [ ] Avoid absolute XPath selectors
  • [ ] Avoid positional selectors (nth-child, nth-of-type)
  • [ ] Add data-testid attributes to interactive elements

Waiting

  • [ ] Use retrying assertions (expect(locator).toHaveText()) instead of one-time checks
  • [ ] Wait for API responses before asserting data-dependent content
  • [ ] Wait for destination content instead of navigation events
  • [ ] Set appropriate timeouts for slow operations

Isolation

  • [ ] Each test creates its own test data
  • [ ] Tests do not depend on execution order
  • [ ] Use test.describe blocks for logical grouping, not for shared state
  • [ ] Clean up test data in afterEach hooks

Network

  • [ ] Mock external API calls when testing UI behavior
  • [ ] Use real API calls only when testing integration
  • [ ] Set appropriate timeouts for network-dependent operations
  • [ ] Handle network errors gracefully in tests

Configuration

  • [ ] Set explicit viewport size
  • [ ] Use reducedMotion to disable animations
  • [ ] Configure retries for CI (2 retries recommended)
  • [ ] Enable trace capture on first retry
  • [ ] Use JUnit reporter for test result tracking

CI/CD

  • [ ] Limit parallel workers based on CI runner resources
  • [ ] Use a consistent browser version (pin with Playwright's browser management)
  • [ ] Start the application server as part of the test configuration
  • [ ] Upload traces and screenshots as CI artifacts for debugging

Common Playwright Flakiness Patterns and Their Fixes

Here is a quick reference table of the most common patterns.

| Symptom | Root Cause | Fix |
|---------|-----------|-----|
| "Element not found" intermittently | Fragile selector | Use role/testid selectors |
| "Timeout waiting for element" | Content loads after assertion | Use retrying assertions |
| Test passes locally, fails in CI | Resource constraints | Limit workers, increase timeouts |
| Different results across browsers | Browser-specific rendering | Test per-browser, use cross-browser locators |
| Clicks have no effect | Element mid-animation | Disable animations with reducedMotion |
| "Navigation timeout" | SPA routing does not trigger page load | Wait for content, not navigation |
| Intermittent assertion failures on text | Dynamic content (timestamps, counters) | Mock dynamic data or use regex matchers |
| "Target closed" errors | Browser context closed prematurely | Check for uncaught errors causing page crashes |
| Screenshot mismatches | Font rendering differences | Use threshold in snapshot comparison |
| File download assertions fail | Download not complete | Use waitForEvent('download') |
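As an illustration of the "use regex matchers" fix for dynamic text, the sketch below asserts on the shape of a label instead of its exact value. The label text, the `last-updated` test id, and the helper name are hypothetical.

```typescript
// Hypothetical label text such as "Last updated: 9:05 AM"; the time changes on every run.
const lastUpdatedPattern = /^Last updated: \d{1,2}:\d{2} (AM|PM)$/;

// Checks the stable shape of the text instead of its exact value.
function hasLastUpdatedShape(text: string): boolean {
  return lastUpdatedPattern.test(text.trim());
}

// In a Playwright test the pattern can be passed straight to a retrying assertion:
//   await expect(page.getByTestId('last-updated')).toHaveText(lastUpdatedPattern);
```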

Conclusion

Flaky Playwright tests are not inevitable. By following the patterns and practices outlined in this guide -- using resilient selectors, proper waiting strategies, network mocking, and optimized configuration -- you can build a Playwright test suite that your team trusts and relies on.

Start by auditing your current tests against the reliability checklist. Fix the most impactful issues first: fragile selectors and missing waits account for the majority of Playwright flakiness. Then progressively adopt the advanced patterns like API state setup and Page Object Models with built-in waits.

Use tools like DeFlaky to track your progress. Measuring your test suite's reliability before and after applying these fixes proves the value of the investment and helps you identify remaining problem areas.

A reliable Playwright test suite is not just less annoying -- it is a competitive advantage. It means faster feedback, more confident deployments, and more time spent building features instead of investigating false failures.
