Playwright vs Cypress: Which Framework Has Fewer Flaky Tests?
Choosing an end-to-end testing framework is one of the most consequential decisions a frontend team makes. The wrong choice costs months of productivity fighting unreliable tests. The right choice gives you a test suite that actually catches bugs without crying wolf. When it comes to playwright vs cypress flaky tests, the architectural differences between these two frameworks have a direct and measurable impact on test reliability.
This is not a general comparison of features. This guide focuses specifically on which framework produces fewer flaky tests, why, and what you can do about it regardless of which framework you use.
Architecture: The Root of All Flakiness
The single biggest factor determining test flakiness is how the framework interacts with the browser. Playwright and Cypress take fundamentally different approaches, and this architectural choice ripples through everything from waiting strategies to network handling to parallel execution.
Cypress: In-Browser Execution
Cypress runs inside the browser alongside your application. It injects itself into the same event loop, giving it direct access to the DOM, network requests, and application state. This architecture has clear advantages for debugging and developer experience, but it comes with constraints:
Playwright: Out-of-Process Control
Playwright controls the browser from outside via the Chrome DevTools Protocol (CDP) or equivalent wire protocols. The test code runs in a Node.js process that sends commands to the browser:
How Architecture Affects Flakiness
Cypress's in-browser model means that when the application's event loop is busy (heavy rendering, long JavaScript execution), test commands can be delayed. The test and the application compete for the same thread. Playwright's out-of-process model means test execution is independent of application performance, reducing an entire category of timing-related flakiness.
Auto-Waiting: The Most Important Reliability Feature
Both frameworks implement automatic waiting, but their approaches differ significantly in scope and reliability.
Cypress Retry-ability
Cypress uses a retry-ability mechanism. When a command fails, Cypress retries it until a timeout is reached. This applies to DOM queries but not to all commands:
// Cypress: cy.get retries automatically
cy.get('[data-testid="user-list"]') // Retries until element exists
.should('have.length', 3); // Retries the assertion
// BUT: Some commands do NOT retry
cy.get('[data-testid="submit"]')
.click(); // Click does NOT retry if it fails
// If the element is covered or animating,
// the click may fail without retry
The gap between retryable and non-retryable commands is a significant source of flakiness in Cypress. Developers assume all commands retry, but actions like click(), type(), and select() do not.
Playwright Auto-Waiting
Playwright auto-waits before performing every action. Before clicking an element, Playwright automatically waits for it to be:
- Attached to the DOM
- Visible
- Stable (not animating)
- Enabled (not disabled)
- Not obscured by another element
// Playwright: click auto-waits for actionability
await page.click('[data-testid="submit"]');
// Automatically waits for the button to be visible,
// stable, enabled, and not obscured
// Assertions also auto-wait with expect
await expect(page.locator('[data-testid="user-list"] li'))
.toHaveCount(3); // Retries until count matches or timeout
This comprehensive auto-waiting is the single biggest reason playwright vs cypress flaky tests comparisons tend to favor Playwright. The actionability checks before every interaction eliminate the most common class of E2E test flakiness: clicking elements that are not ready.
Real-World Impact
Consider a common scenario: a button that is briefly covered by a loading overlay:
// Cypress: This can fail if the overlay is still visible
cy.get('[data-testid="submit"]').click();
// Error: "element is being covered by another element"
// Workaround: Force click (BAD - hides real bugs)
cy.get('[data-testid="submit"]').click({ force: true });
// Better workaround: Wait for overlay to disappear
cy.get('.loading-overlay').should('not.exist');
cy.get('[data-testid="submit"]').click();
// Playwright: Automatically waits for the overlay to clear
await page.click('[data-testid="submit"]');
// No workaround needed - waits for element to not be obscured
Network Handling
Network-related flakiness is the second most common category in E2E tests. How each framework handles network requests directly affects test reliability.
Cypress Network Interception
Cypress provides cy.intercept() for network mocking and monitoring:
// Cypress: Intercept API calls
cy.intercept('GET', '/api/users', {
statusCode: 200,
body: [{ id: 1, name: 'Alice' }],
}).as('getUsers');
cy.visit('/dashboard');
cy.wait('@getUsers'); // Wait for the specific request
cy.get('[data-testid="user-name"]').should('have.text', 'Alice');
A common source of Cypress flakiness is forgetting cy.wait() after navigation. Without it, the assertion may run before the API response is received. The ordering of cy.intercept() relative to the action that triggers the request is also critical. If the intercept is registered after the navigation starts, it may miss the request entirely.
Playwright Network Interception
Playwright offers page.route() for network interception:
// Playwright: Route API calls
await page.route('**/api/users', async (route) => {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify([{ id: 1, name: 'Alice' }]),
});
});
await page.goto('/dashboard');
// Playwright's auto-waiting handles the rest
await expect(page.locator('[data-testid="user-name"]'))
.toHaveText('Alice');
Playwright's advantage here is that route handlers are registered before navigation and apply to all matching requests regardless of timing. Combined with auto-waiting assertions, you rarely need explicit waits for network requests.
Waiting for Network Idle
Both frameworks can wait for network activity to settle, but they implement it differently:
// Cypress: No built-in network idle wait
// Common workaround using aliases:
cy.intercept('GET', '/api/**').as('apiCalls');
cy.visit('/dashboard');
cy.wait('@apiCalls');
// Playwright: Built-in network idle waiting
await page.goto('/dashboard', { waitUntil: 'networkidle' });
// Or wait for specific load state:
await page.waitForLoadState('networkidle');
Cross-Browser Support
Browser compatibility issues are a source of flakiness that depends entirely on framework support.
Cypress Browser Support
Cypress supports Chrome, Edge, Firefox, and Electron. WebKit (Safari) support is experimental. Each browser runs in a separate process, and you can only run one browser per Cypress instance.
Playwright Browser Support
Playwright supports Chromium, Firefox, and WebKit with first-class support for all three. It downloads and manages browser binaries automatically, ensuring consistent versions across environments:
// Playwright: Run the same test across all browsers
// playwright.config.ts
export default defineConfig({
projects: [
{ name: 'chromium', use: { ...devices['Desktop Chrome'] } },
{ name: 'firefox', use: { ...devices['Desktop Firefox'] } },
{ name: 'webkit', use: { ...devices['Desktop Safari'] } },
],
});
When evaluating playwright vs cypress flaky tests across browsers, Playwright's consistent WebKit support means you can reliably test Safari behavior in CI without additional infrastructure.
Parallel Execution and Isolation
Parallel test execution amplifies flakiness. Tests that share state, compete for resources, or depend on execution order will fail when parallelized.
Cypress Parallelism
Cypress parallelism requires Cypress Cloud (paid) or third-party orchestration. Each parallel instance runs a separate Cypress process with its own browser. Test distribution is handled by the orchestration layer:
# Cypress: Parallel execution requires Cypress Cloud
npx cypress run --record --parallel --group "e2e"
Playwright Parallelism
Playwright has built-in parallelism with strong isolation. Each test gets a fresh browser context (equivalent to an incognito window), and workers run in parallel by default:
// playwright.config.ts
export default defineConfig({
workers: process.env.CI ? 4 : undefined,
fullyParallel: true,
use: {
// Each test gets a fresh context - no state leakage
contextOptions: {
storageState: undefined, // Clean storage per test
},
},
});
Playwright's browser context isolation is a powerful anti-flakiness feature. Each test starts with a clean slate: no cookies, no localStorage, no cached requests. This eliminates an entire class of order-dependent failures.
Test Isolation Comparison
// Playwright: Built-in isolation via browser contexts
test('test A', async ({ page }) => {
// Fresh browser context - no state from other tests
await page.goto('/');
await page.evaluate(() => localStorage.setItem('key', 'value'));
});
test('test B', async ({ page }) => {
// Different context - localStorage is empty
const value = await page.evaluate(() => localStorage.getItem('key'));
// value is null, not 'value' from test A
});
// Cypress: State can leak between tests within a spec file
it('test A', () => {
cy.visit('/');
cy.window().then(win => win.localStorage.setItem('key', 'value'));
});
it('test B', () => {
// localStorage from test A is still present unless explicitly cleared
cy.visit('/');
cy.window().then(win => {
expect(win.localStorage.getItem('key')).to.equal('value'); // This passes!
});
});
Real-World Flake Rates
Based on industry data and community reports, here are typical flake rates for mature test suites:
| Metric | Cypress | Playwright |
|--------|---------|------------|
| Typical flake rate (mature suite) | 3-8% | 1-3% |
| Most common flake cause | DOM timing/retryability gaps | Network timing |
| Time to debug flaky test | 30-60 min | 15-30 min |
| Built-in trace/debug tools | Screenshots, videos | Traces, screenshots, videos, HAR |
| Retry mechanism | cypress-plugin-retries / built-in | Built-in with configurable retries |
Playwright's lower flake rate is primarily attributable to its comprehensive auto-waiting, stronger test isolation, and built-in trace viewer that makes debugging faster.
Playwright's Trace Viewer
When a test does fail intermittently, Playwright's trace viewer provides a complete record of what happened:
// Enable tracing for failed tests
export default defineConfig({
use: {
trace: 'on-first-retry', // Only capture traces for retries
},
retries: process.env.CI ? 2 : 0,
});
The trace captures DOM snapshots, network requests, console logs, and action timing for every step, making playwright vs cypress flaky tests debugging significantly faster with Playwright.
Migration Considerations
If you are considering migrating from Cypress to Playwright (or vice versa) to reduce flakiness, consider these factors:
When to Stay with Cypress
- Your team has deep Cypress expertise and a large existing test suite
- You primarily test a single-origin application
- Your flakiness is application-level (not framework-level)
- You are heavily invested in Cypress Cloud for test analytics
When to Move to Playwright
- Multi-tab or cross-origin testing is required
- Flakiness from retryability gaps is a persistent problem
- You need first-class WebKit/Safari testing
- Your team values built-in parallelism without paid services
- You want comprehensive trace-based debugging
Migration Strategy
If you decide to migrate, do not rewrite everything at once. Run both frameworks in parallel:
// package.json
{
"scripts": {
"test:cypress": "cypress run",
"test:playwright": "playwright test",
"test:e2e": "npm run test:playwright && npm run test:cypress"
}
}
Migrate the flakiest tests first. These are the tests that will benefit most from Playwright's auto-waiting and isolation, and they give you the fastest return on migration effort.
Framework-Agnostic Anti-Flakiness Patterns
Regardless of which framework you choose, these patterns reduce flakiness:
cy.wait(5000) or page.waitForTimeout(5000)Automate Flake Detection Regardless of Framework
Whether you choose Playwright, Cypress, or both, automated flake detection catches intermittent failures before they erode your team's confidence in the test suite. DeFlaky integrates with both frameworks and provides cross-run analysis to identify tests whose playwright vs cypress flaky tests behavior indicates framework-level versus application-level issues.
Start detecting flaky tests in your E2E suite:
npx deflaky run
DeFlaky analyzes your test results across multiple runs, identifies patterns in intermittent failures, and provides actionable recommendations for fixing the flakiest tests in your suite. Stop debating frameworks and start measuring actual test reliability with data.