Playwright Test Reporting in Azure Pipelines

A practical guide to integrating Playwright test reporting with Azure Pipelines, covering test runner configuration, JUnit result publishing, HTML report artifacts, trace viewer integration, screenshot and video capture, sharding for parallel execution, and custom reporter development.

Overview

Playwright is the modern alternative to Selenium for browser automation testing. It ships with a built-in test runner, automatic waiting, trace recording, and multiple reporter formats -- all designed for CI/CD integration from the start. Unlike Selenium, where you assemble the test runner, assertion library, reporter, and screenshot capture yourself, Playwright includes all of these out of the box. The Azure Pipelines integration involves configuring the right reporters, publishing results and artifacts, and handling the nuances of headless browser execution on Linux agents.

I switched several projects from Selenium to Playwright specifically because of the CI story. Playwright's JUnit reporter produces clean results for Azure DevOps, the HTML reporter generates self-contained report files for artifact publishing, and the trace viewer lets you debug failures by replaying every network request, DOM snapshot, and console message. Setting up this integration correctly takes about 30 minutes and pays for itself on the first flaky test you debug.

Prerequisites

  • Node.js 18+ installed locally and on pipeline agents
  • Azure DevOps organization with Azure Pipelines enabled
  • Basic familiarity with Playwright test syntax
  • A web application accessible from pipeline agents
  • npm or yarn for dependency management
  • Familiarity with YAML pipeline syntax

Setting Up Playwright with Reporting

Project Initialization

npm init playwright@latest

This scaffolds a Playwright project. After you add your own specs and page objects, the structure looks roughly like this:

playwright-tests/
├── package.json
├── playwright.config.js
├── tests/
│   ├── login.spec.js
│   ├── dashboard.spec.js
│   └── checkout.spec.js
├── pages/
│   └── (page objects)
├── test-results/
│   └── (generated by Playwright)
└── playwright-report/
    └── (HTML report output)

Playwright Configuration for CI

The playwright.config.js file controls test execution and reporting. Configure it for both local development and CI:

// playwright.config.js
var isCI = !!process.env.CI;

module.exports = {
  testDir: "./tests",
  timeout: 30000,
  expect: {
    timeout: 5000,
  },
  fullyParallel: true,
  forbidOnly: isCI,
  retries: isCI ? 2 : 0,
  workers: isCI ? 2 : undefined,
  reporter: isCI
    ? [
        ["junit", { outputFile: "test-results/junit-results.xml" }],
        ["html", { open: "never", outputFolder: "playwright-report" }],
      ]
    : [["html", { open: "on-failure" }]],

  use: {
    baseURL: process.env.BASE_URL || "http://localhost:3000",
    trace: isCI ? "retain-on-failure" : "on-first-retry",
    screenshot: "only-on-failure",
    video: isCI ? "retain-on-failure" : "off",
    headless: true,
    viewport: { width: 1920, height: 1080 },
    actionTimeout: 10000,
    navigationTimeout: 15000,
  },

  projects: [
    {
      name: "chromium",
      use: {
        browserName: "chromium",
      },
    },
    {
      name: "firefox",
      use: {
        browserName: "firefox",
      },
    },
    {
      name: "webkit",
      use: {
        browserName: "webkit",
      },
    },
  ],
};

Key configuration decisions:

  • forbidOnly: isCI: Prevents accidental .only test markers from reaching CI
  • retries: isCI ? 2 : 0: Retries flaky tests in CI, not locally (where you want immediate feedback)
  • trace: "retain-on-failure": Records traces but only keeps them for failed tests, saving disk space
  • screenshot: "only-on-failure": Captures screenshots only when tests fail
  • video: "retain-on-failure": Records video but only saves it for failures
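The decisions listed above all hinge on a single CI flag. As a minimal sketch, the same logic can be pulled into a plain function so it is easy to inspect outside of Playwright. The name buildSettings is a hypothetical helper for illustration, not part of the Playwright API:

```javascript
// Sketch: derive the environment-dependent settings from a plain env object.
// buildSettings is a hypothetical helper, not part of Playwright.
function buildSettings(env) {
  var isCI = !!env.CI;
  return {
    forbidOnly: isCI,
    retries: isCI ? 2 : 0,
    workers: isCI ? 2 : undefined,
    use: {
      baseURL: env.BASE_URL || "http://localhost:3000",
      trace: isCI ? "retain-on-failure" : "on-first-retry",
      screenshot: "only-on-failure",
      video: isCI ? "retain-on-failure" : "off",
    },
  };
}

var ciSettings = buildSettings({ CI: "true", BASE_URL: "https://staging.example.com" });
var localSettings = buildSettings({});
console.log(ciSettings.retries, localSettings.retries); // prints: 2 0
```

Note that !!env.CI treats any non-empty string as true, so CI: "false" would still switch on the CI settings. Leave the variable unset for local runs.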

Reporter Configuration

Playwright supports multiple reporters simultaneously. For Azure Pipelines, use these:

JUnit Reporter: Produces XML results that PublishTestResults@2 consumes. This puts test results in the Azure DevOps Tests tab with pass/fail status, duration, and error details.

HTML Reporter: Generates a standalone report folder (index.html plus a data directory) with screenshots, traces, and a test timeline. Publish the folder as a build artifact so anyone with access to the build can download and view the full report.

List Reporter: Console output during the pipeline run, useful for real-time build monitoring.

// For comprehensive CI reporting
reporter: [
  ["junit", { outputFile: "test-results/junit-results.xml" }],
  ["html", { open: "never", outputFolder: "playwright-report" }],
  ["list"],
],

Writing Tests

// tests/login.spec.js
var test = require("@playwright/test").test;
var expect = require("@playwright/test").expect;

test.describe("Login Page", function () {
  test.beforeEach(function ({}, testInfo) {
    // Add test metadata; annotations are shown in the HTML report
    testInfo.annotations.push({ type: "suite", description: "login" });
  });

  test("should display login form", function ({ page }) {
    return page.goto("/login").then(function () {
      return expect(page.locator('input[name="email"]')).toBeVisible();
    }).then(function () {
      return expect(page.locator('input[name="password"]')).toBeVisible();
    }).then(function () {
      return expect(page.locator('button[type="submit"]')).toBeVisible();
    });
  });

  test("should login with valid credentials", function ({ page }) {
    return page.goto("/login").then(function () {
      return page.fill('input[name="email"]', "user@example.com");
    }).then(function () {
      return page.fill('input[name="password"]', "ValidP@ss123");
    }).then(function () {
      return page.click('button[type="submit"]');
    }).then(function () {
      return page.waitForURL("**/dashboard");
    }).then(function () {
      return expect(page.locator(".welcome-text")).toContainText("Welcome");
    });
  });

  test("should show error for invalid password", function ({ page }) {
    return page.goto("/login").then(function () {
      return page.fill('input[name="email"]', "user@example.com");
    }).then(function () {
      return page.fill('input[name="password"]', "wrongpassword");
    }).then(function () {
      return page.click('button[type="submit"]');
    }).then(function () {
      return expect(page.locator(".alert-danger")).toContainText("Invalid");
    });
  });

  test("should validate required fields", function ({ page }) {
    return page.goto("/login").then(function () {
      return page.click('button[type="submit"]');
    }).then(function () {
      return expect(page.locator(".field-error, .invalid-feedback")).toBeVisible();
    });
  });
});

Azure Pipeline Integration

Basic Pipeline

trigger:
  branches:
    include:
      - main
      - feature/*

pool:
  vmImage: ubuntu-latest

variables:
  BASE_URL: "https://staging.example.com"
  CI: "true"

steps:
  - task: NodeTool@0
    inputs:
      versionSpec: "20.x"
    displayName: Use Node.js 20

  - script: npm ci
    workingDirectory: playwright-tests
    displayName: Install dependencies

  - script: npx playwright install --with-deps chromium
    workingDirectory: playwright-tests
    displayName: Install Playwright browsers

  - script: npx playwright test --project=chromium
    workingDirectory: playwright-tests
    displayName: Run Playwright tests
    env:
      BASE_URL: $(BASE_URL)
    continueOnError: true

  - task: PublishTestResults@2
    inputs:
      testResultsFormat: JUnit
      testResultsFiles: "**/test-results/junit-results.xml"
      testRunTitle: "Playwright - Chromium"
      mergeTestResults: true
    condition: always()
    displayName: Publish test results

  - task: PublishBuildArtifacts@1
    inputs:
      pathToPublish: playwright-tests/playwright-report
      artifactName: playwright-html-report
    condition: always()
    displayName: Publish HTML report

  - task: PublishBuildArtifacts@1
    inputs:
      pathToPublish: playwright-tests/test-results
      artifactName: playwright-traces
    # failed() never matches here: continueOnError turns a test failure into SucceededWithIssues
    condition: always()
    displayName: Publish traces and test output

The npx playwright install --with-deps chromium command installs the Chromium browser binary along with its system dependencies (font libraries, graphics libraries). This is essential on Linux agents where these dependencies are not pre-installed. Install only the browsers you need to save time -- installing all three browsers adds 2-3 minutes to the pipeline.

Multi-Browser Pipeline

trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

variables:
  BASE_URL: "https://staging.example.com"
  CI: "true"

jobs:
  - job: PlaywrightTests
    strategy:
      matrix:
        Chromium:
          PROJECT: chromium
          BROWSER_INSTALL: chromium
        Firefox:
          PROJECT: firefox
          BROWSER_INSTALL: firefox
        WebKit:
          PROJECT: webkit
          BROWSER_INSTALL: webkit

    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: "20.x"

      - script: npm ci
        workingDirectory: playwright-tests
        displayName: Install dependencies

      - script: npx playwright install --with-deps $(BROWSER_INSTALL)
        workingDirectory: playwright-tests
        displayName: Install $(PROJECT) browser

      - script: npx playwright test --project=$(PROJECT)
        workingDirectory: playwright-tests
        displayName: Run tests on $(PROJECT)
        env:
          BASE_URL: $(BASE_URL)
        continueOnError: true

      - task: PublishTestResults@2
        inputs:
          testResultsFormat: JUnit
          testResultsFiles: "**/junit-results.xml"
          testRunTitle: "Playwright - $(PROJECT)"
          mergeTestResults: true
        condition: always()

      - task: PublishBuildArtifacts@1
        inputs:
          pathToPublish: playwright-tests/playwright-report
          artifactName: report-$(PROJECT)
        condition: always()

      - task: PublishBuildArtifacts@1
        inputs:
          pathToPublish: playwright-tests/test-results
          artifactName: traces-$(PROJECT)
        # failed() never matches here: continueOnError turns a test failure into SucceededWithIssues
        condition: always()

Sharded Parallel Execution

For large test suites, Playwright's built-in sharding splits tests across multiple agents:

jobs:
  - job: PlaywrightSharded
    strategy:
      matrix:
        Shard1:
          SHARD: "1/4"
        Shard2:
          SHARD: "2/4"
        Shard3:
          SHARD: "3/4"
        Shard4:
          SHARD: "4/4"

    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: "20.x"

      - script: npm ci
        workingDirectory: playwright-tests

      - script: npx playwright install --with-deps chromium
        workingDirectory: playwright-tests

      - script: npx playwright test --shard=$(SHARD)
        workingDirectory: playwright-tests
        displayName: Run shard $(SHARD)
        continueOnError: true

      - task: PublishTestResults@2
        inputs:
          testResultsFormat: JUnit
          testResultsFiles: "**/junit-results.xml"
          testRunTitle: "Playwright Shard $(SHARD)"
          mergeTestResults: true
        condition: always()

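Sharding splits the suite across agents. Without fullyParallel, Playwright assigns whole test files to shards, so a 100-test-file suite sharded across 4 agents runs roughly 25 files per agent. With fullyParallel: true, individual tests are distributed instead, which tends to balance the shards more evenly. Each shard job publishes its own test run, and the Tests tab of the build shows the results of all runs together.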
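To build an intuition for what --shard=1/4 through --shard=4/4 do, here is a small sketch that cuts a list of test files into contiguous chunks. It is illustrative only: splitIntoShards and shardArgs are hypothetical helpers, and Playwright's actual assignment is internal and may differ.

```javascript
// Illustrative only: Playwright's real shard assignment is internal and may differ.
function splitIntoShards(items, total) {
  var shards = [];
  var base = Math.floor(items.length / total);
  var extra = items.length % total;
  var start = 0;
  for (var i = 0; i < total; i++) {
    // The first "extra" shards take one additional item each
    var size = base + (i < extra ? 1 : 0);
    shards.push(items.slice(start, start + size));
    start += size;
  }
  return shards;
}

// Build the --shard arguments used in the pipeline matrix ("1/4" ... "4/4").
function shardArgs(total) {
  var args = [];
  for (var i = 1; i <= total; i++) args.push(i + "/" + total);
  return args;
}

var files = [];
for (var n = 1; n <= 10; n++) files.push("test-" + n + ".spec.js");
var shards = splitIntoShards(files, 4);
console.log(shards.map(function (s) { return s.length; })); // prints: [ 3, 3, 2, 2 ]
console.log(shardArgs(4)); // prints: [ '1/4', '2/4', '3/4', '4/4' ]
```

The point of the sketch is that every file lands in exactly one shard, so the shards together cover the whole suite without overlap.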

Working with Traces

Playwright traces are the most powerful debugging tool for CI failures. A trace captures:

  • DOM snapshots at every action
  • Network requests and responses
  • Console messages
  • Screenshots before and after each action
  • Source code location

Viewing Traces

Download the trace artifact from the pipeline and open it locally:

npx playwright show-trace trace.zip

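Or open it in trace.playwright.dev, a web-based viewer that loads the trace in your browser. The trace viewer shows a timeline of every action the test performed, with before/after screenshots and the DOM state at each point. In the published artifact, each failed test has its own folder under test-results containing a trace.zip.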

Trace Configuration Strategies

// Always record traces (large files, slower execution)
trace: "on",

// Record traces only for failed tests (recommended for CI)
trace: "retain-on-failure",

// Record traces on first retry only
trace: "on-first-retry",

// Never record traces (fastest, no debugging data)
trace: "off",

For CI, retain-on-failure is usually the best trade-off. Traces are recorded for every test but only kept when a test fails. Recording adds some overhead (roughly 5-10% in my experience, so measure it on your own suite), and the debugging value when something fails is enormous.
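The four modes differ in two ways: whether a trace is recorded at all, and whether it is kept afterwards. The sketch below models that as a small decision function. It is a simplified model of the documented behaviour, not Playwright's implementation, and traceDecision is a hypothetical name:

```javascript
// Simplified model of the documented trace modes; not Playwright's implementation.
// attempt.retry is 0 for the first run, 1 for the first retry, and so on.
function traceDecision(mode, attempt) {
  var failed = attempt.status !== "passed" && attempt.status !== "skipped";
  switch (mode) {
    case "on":
      return { recorded: true, kept: true };
    case "off":
      return { recorded: false, kept: false };
    case "retain-on-failure":
      // Always recorded, discarded again when the attempt passes
      return { recorded: true, kept: failed };
    case "on-first-retry":
      // Only the first retry is traced, so retries must be enabled
      var isFirstRetry = attempt.retry === 1;
      return { recorded: isFirstRetry, kept: isFirstRetry };
    default:
      throw new Error("Unknown trace mode: " + mode);
  }
}

console.log(traceDecision("retain-on-failure", { status: "passed", retry: 0 }));
// prints: { recorded: true, kept: false }
console.log(traceDecision("on-first-retry", { status: "failed", retry: 0 }));
// prints: { recorded: false, kept: false }
```

The second call shows why on-first-retry produces nothing when retries is 0: the first attempt is never traced.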

Custom Trace Attachments

Attach custom data to the test result for debugging. Attachments appear in the HTML report next to the trace:

test("should process payment", function ({ page }, testInfo) {
  return page.goto("/checkout").then(function () {
    // Attach custom data to the test result
    return testInfo.attach("checkout-state", {
      body: JSON.stringify({ cart: "items", total: 99.99 }),
      contentType: "application/json",
    });
  }).then(function () {
    return page.click("#pay-now");
  }).then(function () {
    return page.waitForURL("**/confirmation");
  });
});

Custom Reporter for Azure DevOps

Build a custom reporter that creates Azure DevOps work items for test failures:

// reporters/azure-devops-reporter.js
var https = require("https");
var url = require("url");

function AzureDevOpsReporter(options) {
  this.options = options || {};
  this.org = process.env.AZURE_ORG || this.options.org;
  this.project = process.env.AZURE_PROJECT || this.options.project;
  this.pat = process.env.AZURE_DEVOPS_PAT;
  this.failures = [];
}

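AzureDevOpsReporter.prototype.onTestEnd = function (test, result) {
  var failed = result.status === "failed" || result.status === "timedOut";
  // onTestEnd fires once per attempt. Record only the final attempt so that
  // tests which pass on retry (flaky) do not produce bugs or duplicates.
  var isFinalAttempt = result.retry === test.retries;
  if (failed && isFinalAttempt) {
    this.failures.push({
      title: test.title,
      suite: test.parent.title,
      file: test.location.file,
      line: test.location.line,
      error: result.errors.map(function (e) { return e.message; }).join("\n"),
      duration: result.duration,
      retries: result.retry,
      browser: test.parent.project().name,
    });
  }
};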

AzureDevOpsReporter.prototype.onEnd = function (result) {
  var self = this;
  if (self.failures.length === 0) {
    console.log("[AzureDevOps Reporter] All tests passed, no bugs to file.");
    return;
  }

  console.log("[AzureDevOps Reporter] " + self.failures.length + " failure(s) detected.");

  if (!self.pat) {
    console.log("[AzureDevOps Reporter] No PAT configured, skipping bug creation.");
    self.failures.forEach(function (f) {
      console.log("  FAILED: " + f.suite + " > " + f.title + " (" + f.browser + ")");
    });
    return;
  }

  // Create bugs for persistent failures (failed after all retries)
  var chain = Promise.resolve();
  self.failures.forEach(function (failure) {
    chain = chain.then(function () {
      return self.createBug(failure);
    });
  });

  return chain;
};

AzureDevOpsReporter.prototype.createBug = function (failure) {
  var self = this;
  var title = "[Playwright] " + failure.suite + " - " + failure.title + " (" + failure.browser + ")";
  var repro = "<p><strong>Test:</strong> " + failure.file + ":" + failure.line + "</p>" +
    "<p><strong>Browser:</strong> " + failure.browser + "</p>" +
    "<p><strong>Error:</strong></p><pre>" + failure.error + "</pre>" +
    "<p><strong>Duration:</strong> " + failure.duration + "ms</p>" +
    "<p><strong>Retries:</strong> " + failure.retries + "</p>";

  var patchDoc = [
    { op: "add", path: "/fields/System.Title", value: title },
    { op: "add", path: "/fields/Microsoft.VSTS.TCM.ReproSteps", value: repro },
    { op: "add", path: "/fields/System.Tags", value: "playwright;automated-bug" },
    { op: "add", path: "/fields/Microsoft.VSTS.Common.Priority", value: 2 },
  ];

  var apiUrl = "https://dev.azure.com/" + self.org + "/" + self.project +
    "/_apis/wit/workitems/$Bug?api-version=7.1";
  var parsed = url.parse(apiUrl);

  return new Promise(function (resolve, reject) {
    var options = {
      hostname: parsed.hostname,
      path: parsed.path,
      method: "POST",
      headers: {
        "Content-Type": "application/json-patch+json",
        Authorization: "Basic " + Buffer.from(":" + self.pat).toString("base64"),
      },
    };

    var req = https.request(options, function (res) {
      var data = "";
      res.on("data", function (chunk) { data += chunk; });
      res.on("end", function () {
        if (res.statusCode >= 200 && res.statusCode < 300) {
          var bug = JSON.parse(data);
          console.log("[AzureDevOps Reporter] Created bug #" + bug.id + ": " + title);
          resolve(bug);
        } else {
          console.error("[AzureDevOps Reporter] Failed to create bug: " + res.statusCode);
          resolve(null);
        }
      });
    });

    req.on("error", function (err) {
      console.error("[AzureDevOps Reporter] " + err.message);
      resolve(null);
    });

    req.write(JSON.stringify(patchDoc));
    req.end();
  });
};

module.exports = AzureDevOpsReporter;

Register the custom reporter in playwright.config.js:

reporter: [
  ["junit", { outputFile: "test-results/junit-results.xml" }],
  ["html", { open: "never" }],
  ["./reporters/azure-devops-reporter.js", { org: "my-org", project: "my-project" }],
],
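One caveat with the reporter above: it places failure.error directly inside HTML. Playwright error messages can contain terminal color codes and characters such as < and >, which may render badly in the Repro Steps field. A minimal sketch of cleaning the message first follows. The helper names stripAnsi, escapeHtml, and formatErrorForReproSteps are hypothetical and not part of any library:

```javascript
// Remove ANSI color sequences such as "\u001b[31m" from terminal-oriented messages.
function stripAnsi(text) {
  return String(text).replace(/\u001b\[[0-9;]*m/g, "");
}

// Escape the characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Produce a block that is safe to embed in the work item's HTML field.
function formatErrorForReproSteps(message) {
  return "<pre>" + escapeHtml(stripAnsi(message)) + "</pre>";
}

console.log(formatErrorForReproSteps("\u001b[31mExpected <div> & got <span>\u001b[39m"));
// prints: <pre>Expected &lt;div&gt; &amp; got &lt;span&gt;</pre>
```

In createBug, the "<pre>" + failure.error + "</pre>" fragment could then be replaced with formatErrorForReproSteps(failure.error).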

Complete Working Example

A complete pipeline configuration combining all features -- multi-browser testing, sharding, artifact publishing, and trace collection:

trigger:
  branches:
    include:
      - main
  paths:
    include:
      - src/**
      - playwright-tests/**

pool:
  vmImage: ubuntu-latest

variables:
  BASE_URL: "https://staging.example.com"
  CI: "true"
  PLAYWRIGHT_JUNIT_OUTPUT_NAME: "junit-results.xml"

stages:
  - stage: Test
    displayName: Playwright Tests
    jobs:
      - job: Chromium
        displayName: Chromium Tests
        strategy:
          matrix:
            Shard1:
              SHARD: "1/2"
              SHARD_NAME: "1-of-2"
            Shard2:
              SHARD: "2/2"
              SHARD_NAME: "2-of-2"
        steps:
          - task: NodeTool@0
            inputs:
              versionSpec: "20.x"

          - script: npm ci
            workingDirectory: playwright-tests
            displayName: Install dependencies

          - script: npx playwright install --with-deps chromium
            workingDirectory: playwright-tests
            displayName: Install Chromium

          - script: npx playwright test --project=chromium --shard=$(SHARD)
            workingDirectory: playwright-tests
            displayName: Run Chromium shard $(SHARD)
            env:
              BASE_URL: $(BASE_URL)
            continueOnError: true

          - task: PublishTestResults@2
            inputs:
              testResultsFormat: JUnit
              testResultsFiles: "**/junit-results.xml"
              testRunTitle: "Chromium Shard $(SHARD)"
              mergeTestResults: true
            condition: always()

          # Artifact names cannot contain "/", so use SHARD_NAME instead of SHARD
          - task: PublishBuildArtifacts@1
            inputs:
              pathToPublish: playwright-tests/playwright-report
              artifactName: report-chromium-shard-$(SHARD_NAME)
            condition: always()

          # failed() never matches here: continueOnError turns a test failure into SucceededWithIssues
          - task: PublishBuildArtifacts@1
            inputs:
              pathToPublish: playwright-tests/test-results
              artifactName: traces-chromium-shard-$(SHARD_NAME)
            condition: always()

      - job: Firefox
        displayName: Firefox Tests
        steps:
          - task: NodeTool@0
            inputs:
              versionSpec: "20.x"

          - script: npm ci
            workingDirectory: playwright-tests

          - script: npx playwright install --with-deps firefox
            workingDirectory: playwright-tests
            displayName: Install Firefox

          - script: npx playwright test --project=firefox
            workingDirectory: playwright-tests
            displayName: Run Firefox tests
            env:
              BASE_URL: $(BASE_URL)
            continueOnError: true

          - task: PublishTestResults@2
            inputs:
              testResultsFormat: JUnit
              testResultsFiles: "**/junit-results.xml"
              testRunTitle: "Playwright - Firefox"
              mergeTestResults: true
            condition: always()

          - task: PublishBuildArtifacts@1
            inputs:
              pathToPublish: playwright-tests/playwright-report
              artifactName: report-firefox
            condition: always()

Common Issues and Troubleshooting

Browser Installation Fails on Pipeline Agent

Error: Host system is missing dependencies to run browsers.
Missing: libatk-bridge2.0-0, libgtk-3-0, libgbm1

Use npx playwright install --with-deps instead of npx playwright install. The --with-deps flag installs system-level dependencies (GTK, GBM, ATK) that browsers need. On Microsoft-hosted ubuntu-latest agents, these dependencies are not pre-installed. If you use a custom Docker image for your pipeline agent, add the Playwright system dependencies to your Dockerfile.

JUnit Results File Not Found

warning: No test result files matching '**/junit-results.xml' were found.

The JUnit reporter writes its output file at the end of a test run. If the test step fails before the run starts (e.g., configuration error), no file may be produced. By default PublishTestResults@2 only logs the warning above, because failTaskOnMissingResultsFile defaults to false; set it to true if a missing results file should fail the pipeline. Also verify the outputFile path in your Playwright config matches the glob in the publish task.
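When chasing this warning, it can help to check the results file from a script step before the publish task runs. The following is a rough sketch that counts test cases with simple pattern matching; summarizeJUnit and summarizeJUnitFile are hypothetical helpers, and a real XML parser would be the better choice for anything beyond a sanity check:

```javascript
var fs = require("fs");

// Count test cases and failure/error elements in JUnit XML text.
// Pattern matching is good enough for a sanity check, not for full parsing.
function summarizeJUnit(xml) {
  var total = (xml.match(/<testcase\b/g) || []).length;
  var failed = (xml.match(/<(failure|error)\b/g) || []).length;
  return { total: total, failed: failed };
}

// Returns null when the file is missing so the caller can decide how to react.
function summarizeJUnitFile(filePath) {
  if (!fs.existsSync(filePath)) return null;
  return summarizeJUnit(fs.readFileSync(filePath, "utf8"));
}

var sample =
  '<testsuites><testsuite name="login.spec.js">' +
  '<testcase name="a"/>' +
  '<testcase name="b"><failure message="boom"/></testcase>' +
  "</testsuite></testsuites>";
console.log(summarizeJUnit(sample)); // prints: { total: 2, failed: 1 }
console.log(summarizeJUnitFile("does-not-exist.xml")); // prints: null
```

A null result points at a run that never started or at a path mismatch, while total: 0 points at a filter that matched no tests.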

Tests Timeout in CI But Pass Locally

Pipeline agents are typically slower than developer machines, especially for browser operations. Increase timeouts in your CI configuration: set timeout: 60000 for the global test timeout, actionTimeout: 15000 for individual actions, and navigationTimeout: 30000 for page loads. Also check if the BASE_URL is reachable from the pipeline agent -- network latency to the staging server adds to every navigation.
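As a sketch, the timeout values suggested above can be kept next to the local defaults from the earlier configuration and selected by the CI flag. The function name timeoutSettings is a hypothetical helper; the returned keys are the regular Playwright config options used throughout this article:

```javascript
// Sketch: pick longer timeouts on CI agents, keep the tighter ones locally.
// timeoutSettings is a hypothetical helper, not part of Playwright.
function timeoutSettings(isCI) {
  return {
    timeout: isCI ? 60000 : 30000, // per-test timeout in milliseconds
    expect: { timeout: 5000 },
    use: {
      actionTimeout: isCI ? 15000 : 10000,
      navigationTimeout: isCI ? 30000 : 15000,
    },
  };
}

var ciTimeouts = timeoutSettings(true);
console.log(ciTimeouts.timeout, ciTimeouts.use.navigationTimeout); // prints: 60000 30000
```

The result can be merged into the exported config object, for example by copying timeout and expect to the top level and the two use values into the existing use block.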

HTML Report Shows No Screenshots

Screenshots are only captured based on the screenshot configuration option. With "only-on-failure", passing tests have no screenshots. If you need screenshots for all tests (e.g., for visual review), use "on" -- but be aware this significantly increases artifact size. For a 100-test suite, screenshots can add 200-500 MB to the artifacts.

Traces Are Too Large to Download

Playwright traces include DOM snapshots, network HAR data, and screenshots for every action. A single failing test trace can be 10-50 MB. With "retain-on-failure", only failed test traces are saved, but a suite with many failures can produce gigabytes of trace data. Playwright has no setting that caps trace size. To shrink traces, use the object form of the trace option to turn off parts you do not need (for example, trace: { mode: "retain-on-failure", sources: false }), or use "on-first-retry" to only record traces on the first retry attempt.
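Before changing the trace settings, it is worth measuring how much trace data a run actually produces. The sketch below walks the output folder and adds up every trace.zip it finds. It assumes the default test-results folder used in this article; findTraces and totalMegabytes are hypothetical helpers:

```javascript
var fs = require("fs");
var path = require("path");

// Recursively collect every trace.zip under a directory together with its size.
function findTraces(dir) {
  var found = [];
  if (!fs.existsSync(dir)) return found;
  fs.readdirSync(dir, { withFileTypes: true }).forEach(function (entry) {
    var full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found = found.concat(findTraces(full));
    } else if (entry.name === "trace.zip") {
      found.push({ file: full, bytes: fs.statSync(full).size });
    }
  });
  return found;
}

// Sum the sizes and convert bytes to megabytes.
function totalMegabytes(traces) {
  var bytes = traces.reduce(function (sum, t) { return sum + t.bytes; }, 0);
  return bytes / (1024 * 1024);
}

var traces = findTraces("test-results");
console.log(traces.length + " trace(s), " + totalMegabytes(traces).toFixed(1) + " MB");
```

Run as a script step after the tests, this prints one summary line to the pipeline log, which makes it easy to spot runs where trace volume spikes.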

Best Practices

  • Install only the browsers you need. npx playwright install downloads all three browsers (400+ MB). Use npx playwright install --with-deps chromium to install only Chromium and its system dependencies, saving 2-3 minutes of pipeline time. Install additional browsers only when you specifically test against them.

  • Use retain-on-failure for traces in CI. This gives you debugging data for failures without the storage cost of recording every test. Traces for passing tests are deleted automatically.

  • Publish the HTML report as a build artifact on every run. The report folder includes test results, screenshots, and traces. Once downloaded, it can be shared with people who lack Azure DevOps access, making it useful for stakeholder reviews and cross-team debugging. Open it with npx playwright show-report followed by the folder path so the embedded traces load correctly.

  • Set retries: 2 in CI to handle flaky tests. Playwright's retry mechanism re-runs failed tests up to the specified count. A test that passes on retry is marked as "flaky" in the report, giving you visibility into test reliability without breaking the build.

  • Use forbidOnly: true in CI. This fails the pipeline if any test has .only() applied, preventing accidental partial test runs from reaching the main branch.

  • Shard large test suites across multiple agents. Sharding is Playwright's built-in mechanism for splitting a run across machines. Unlike worker parallelism (which runs tests in parallel within one agent), sharding distributes the suite across multiple agents, reducing total execution time roughly in proportion to the number of shards, minus the fixed per-agent setup cost.

  • Pin Playwright version in package.json. Playwright browser binaries are tied to the Playwright npm package version. If one developer updates the package without reinstalling browsers, tests fail. Pin the version and update deliberately.

  • Use the --project flag to run specific browser tests. Run Chromium tests on PR builds for speed, and run the full matrix (Chromium + Firefox + WebKit) on merge to main.
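The "flaky" label mentioned in the retries practice above follows from the sequence of attempts. A simplified model, assuming every test is expected to pass (no test.fail() markers), can be sketched as follows. classifyOutcome is a hypothetical helper, not a Playwright API:

```javascript
// Simplified model of how a sequence of attempts maps to a reported outcome.
// statuses holds one entry per attempt, e.g. ["failed", "passed"].
function classifyOutcome(statuses) {
  var allSkipped = statuses.every(function (s) { return s === "skipped"; });
  if (statuses.length === 0 || allSkipped) return "skipped";
  var last = statuses[statuses.length - 1];
  if (last !== "passed") return "unexpected";
  // Passed in the end: flaky if it needed more than one attempt
  return statuses.length > 1 ? "flaky" : "expected";
}

console.log(classifyOutcome(["passed"])); // prints: expected
console.log(classifyOutcome(["failed", "passed"])); // prints: flaky
console.log(classifyOutcome(["failed", "failed", "timedOut"])); // prints: unexpected
```

With retries: 2, the third example is the only one that fails the run; the flaky case passes but stays visible in the report.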
