Matrix Builds and Parallel Job Strategies

A practical guide to Azure DevOps matrix builds and parallel job strategies, covering cross-platform testing, multi-version matrices, dynamic generation, fan-out/fan-in patterns, and concurrency optimization.

Overview

Matrix builds let you define a set of dimensions — like operating system, runtime version, or database engine — and Azure Pipelines automatically expands every combination into parallel jobs. Instead of copying and pasting the same job definition six times with minor variations, you declare the axes and let the pipeline engine do the multiplication. This article covers matrix syntax, parallel job strategies, fan-out/fan-in patterns, and the concurrency controls you need to keep your pipeline fast without burning through your entire agent pool.

Prerequisites

  • An Azure DevOps organization with at least one project and a YAML pipeline
  • Basic understanding of YAML pipeline structure (triggers, stages, jobs, steps)
  • Familiarity with Azure DevOps agent pools (Microsoft-hosted or self-hosted)
  • Understanding of how pipeline jobs consume parallel job slots
  • Node.js v18+ installed if running the Node.js examples locally
  • At least 2 parallel job slots in your Azure DevOps organization (free tier includes 1 Microsoft-hosted)

What Matrix Builds Are and When to Use Them

A matrix build is a job-level strategy that takes a set of named variable combinations and expands them into separate parallel jobs. Each combination runs as its own job on its own agent, with the matrix variables injected as pipeline variables accessible in every step.

Think of it as a grid. If you define two dimensions — operating system (Windows, Linux, macOS) and Node.js version (18, 20, 22) — the matrix produces 9 jobs, one for each cell in the 3x3 grid. Each job gets both variables, so you can reference $(nodeVersion) and $(imageName) in your steps.

Use matrix builds when:

  • Cross-platform testing — You need to verify your code works on multiple operating systems
  • Multi-version testing — You support multiple runtime versions (Node.js, Python, .NET) and need CI coverage for each
  • Multi-database testing — Your application abstracts the database layer and you test against PostgreSQL, MySQL, and SQLite
  • Multi-browser testing — End-to-end tests need to run in Chrome, Firefox, and Edge
  • Any combinatorial test scenario — Where the test logic is identical but the environment differs

Do not use matrix builds when each "variant" requires substantially different steps. If your Windows build uses MSBuild and your Linux build uses Make, a matrix adds complexity without simplification. Use explicit separate jobs instead.


Basic Matrix Syntax

The matrix is defined under strategy.matrix on a job. Each key under matrix is a named combination, and each combination defines one or more variables.

jobs:
  - job: Test
    strategy:
      matrix:
        linux_node18:
          imageName: 'ubuntu-latest'
          nodeVersion: '18'
        linux_node20:
          imageName: 'ubuntu-latest'
          nodeVersion: '20'
        linux_node22:
          imageName: 'ubuntu-latest'
          nodeVersion: '22'
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: |
          node --version
          npm ci
          npm test
        displayName: 'Install and test'

Each key (linux_node18, linux_node20, linux_node22) becomes a separate job. The job name in the Azure DevOps UI shows as Test linux_node18, Test linux_node20, etc. The variables you define (imageName, nodeVersion) are available as macro syntax $(variableName) or as environment variables.
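
Each matrix variable is also exported to the script environment with its name upper-cased, so nodeVersion is available as NODEVERSION inside a bash step. A minimal sketch for a Linux leg (the echo text is only illustrative):

steps:
  - script: |
      echo "Via macro expansion:      $(nodeVersion)"
      echo "Via environment variable: $NODEVERSION"
    displayName: 'Read the matrix variable two ways'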

Naming Matters

The matrix key name is appended to the job name in the UI and in dependency references. Keep them descriptive but short. Avoid spaces — use underscores or camelCase. Azure DevOps strips characters that are invalid in job names.


How Azure DevOps Expands the Matrix

When the pipeline engine encounters a matrix strategy, it creates one copy of the job for each entry in the matrix. The expansion happens at pipeline compile time, not at runtime, which means:

  1. Each matrix leg is a fully independent job with its own agent
  2. Each leg has its own job ID for dependency and condition references
  3. Variables defined in the matrix override variables defined at the job level
  4. All legs share the same step definitions — you cannot give one matrix leg its own steps at compile time; use runtime conditions on steps to skip work on specific legs

In the pipeline run UI, you see each matrix leg as an expandable entry under the parent job. Each one has its own log, its own status, and its own duration.

If one leg fails, the other legs continue by default. The overall job status shows as "partially succeeded" if at least one leg passed and at least one failed.
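
Downstream jobs treat the matrix as a single dependency: a job that declares dependsOn against the matrix job waits for every expanded leg to finish before it starts. A minimal sketch (the Report job and its script are illustrative):

jobs:
  - job: Test
    strategy:
      matrix:
        linux_node18:
          imageName: 'ubuntu-latest'
          nodeVersion: '18'
        linux_node20:
          imageName: 'ubuntu-latest'
          nodeVersion: '20'
    pool:
      vmImage: $(imageName)
    steps:
      - script: npm ci && npm test
        displayName: 'Test'

  - job: Report
    dependsOn: Test   # waits for every leg of Test
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - script: echo "All matrix legs finished"
        displayName: 'Summarize'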


Controlling Concurrency with maxParallel

By default, all matrix legs run simultaneously. If you have a 3x3 matrix producing 9 jobs and your organization has 10 parallel job slots, all 9 will start at once. That is fast, but it might starve other pipelines in your organization.

The maxParallel property limits how many legs run concurrently:

jobs:
  - job: Test
    strategy:
      maxParallel: 3
      matrix:
        linux_node18:
          imageName: 'ubuntu-latest'
          nodeVersion: '18'
        linux_node20:
          imageName: 'ubuntu-latest'
          nodeVersion: '20'
        linux_node22:
          imageName: 'ubuntu-latest'
          nodeVersion: '22'
        windows_node18:
          imageName: 'windows-latest'
          nodeVersion: '18'
        windows_node20:
          imageName: 'windows-latest'
          nodeVersion: '20'
        windows_node22:
          imageName: 'windows-latest'
          nodeVersion: '22'
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: npm ci && npm test
        displayName: 'Test'

With maxParallel: 3, only 3 of the 6 legs run at a time. When one finishes, the next queued leg starts. This is essential when:

  • You have a limited number of parallel job slots
  • You are running against shared infrastructure (like a shared database server) that cannot handle every leg's test suite hitting it at once
  • Self-hosted agents have a fixed pool size and you do not want matrix jobs queuing behind each other for hours

Set maxParallel to 1 if you need serial execution of matrix legs — useful for ordered deployments across regions where you want to validate one region before proceeding.
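
A serial region-by-region rollout might look like the following sketch (the region values are placeholders). Legs start in the order they are listed, so put the canary region first:

jobs:
  - job: Rollout
    strategy:
      maxParallel: 1   # one region at a time
      matrix:
        canary:
          regionName: 'eastus'
        second:
          regionName: 'westus'
        third:
          regionName: 'westeurope'
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - script: echo "Deploying to $(regionName)..."
        displayName: 'Deploy $(regionName)'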


Fine-Tuning Combinations with Include and Exclude

Sometimes a full cross-product generates combinations you do not need. You cannot use include and exclude keywords directly in Azure DevOps the way GitHub Actions supports them. Instead, you control your matrix by explicitly listing only the combinations you want.

This is actually an advantage. Explicit listing is clearer than implicit exclusion rules. Your pipeline file is the single source of truth for what runs:

jobs:
  - job: Test
    strategy:
      matrix:
        # We support Node 18 on all platforms
        linux_node18:
          imageName: 'ubuntu-latest'
          nodeVersion: '18'
        windows_node18:
          imageName: 'windows-latest'
          nodeVersion: '18'
        mac_node18:
          imageName: 'macos-latest'
          nodeVersion: '18'
        # But Node 22 only needs Linux testing — it is our dev target
        linux_node22:
          imageName: 'ubuntu-latest'
          nodeVersion: '22'
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: npm ci && npm test
        displayName: 'Test'

This gives you 4 legs instead of the 9 a full 3x3 cross-product would produce, saving agent time on combinations that do not add value.

If you need to conditionally skip steps within a specific matrix leg, use the condition property on individual steps:

steps:
  - script: npm run test:integration
    displayName: 'Integration tests'
    condition: eq(variables['imageName'], 'ubuntu-latest')

This runs integration tests only on Linux legs, regardless of which Node version is paired with it.
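
Conditions can also combine several matrix variables. For example, you might publish coverage from a single leg so duplicate reports are not uploaded (a sketch; the coverage script name is illustrative):

steps:
  - script: npm run coverage
    displayName: 'Publish coverage from one leg only'
    condition: and(succeeded(), eq(variables['imageName'], 'ubuntu-latest'), eq(variables['nodeVersion'], '18'))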


Dynamic Matrix Generation with Each Expressions

For pipelines that pull matrix values from parameters or templates, you can use the ${{ each }} expression to generate matrix entries dynamically. This is powerful when you want the matrix to be configurable at trigger time or reusable across repositories.

parameters:
  - name: nodeVersions
    type: object
    default:
      - '18'
      - '20'
      - '22'
  - name: operatingSystems
    type: object
    default:
      - name: linux
        vmImage: 'ubuntu-latest'
      - name: windows
        vmImage: 'windows-latest'

jobs:
  - job: Test
    strategy:
      matrix:
        ${{ each os in parameters.operatingSystems }}:
          ${{ each nodeVer in parameters.nodeVersions }}:
            ${{ os.name }}_node${{ nodeVer }}:
              imageName: ${{ os.vmImage }}
              nodeVersion: ${{ nodeVer }}
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: npm ci && npm test
        displayName: 'Test on $(imageName) with Node $(nodeVersion)'

At compile time, this expands into 6 matrix entries (2 OSes x 3 versions). The advantage is that you can override the parameters when triggering manually or from a template, without editing the pipeline file.
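
With the default parameter values, the expansion above is equivalent to writing these six entries by hand:

matrix:
  linux_node18:
    imageName: 'ubuntu-latest'
    nodeVersion: '18'
  linux_node20:
    imageName: 'ubuntu-latest'
    nodeVersion: '20'
  linux_node22:
    imageName: 'ubuntu-latest'
    nodeVersion: '22'
  windows_node18:
    imageName: 'windows-latest'
    nodeVersion: '18'
  windows_node20:
    imageName: 'windows-latest'
    nodeVersion: '20'
  windows_node22:
    imageName: 'windows-latest'
    nodeVersion: '22'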

A more practical use is reading versions from a parameter so your nightly build tests against a broader matrix than your PR build:

parameters:
  - name: fullMatrix
    type: boolean
    default: false

jobs:
  - job: Test
    strategy:
      matrix:
        linux_node20:
          imageName: 'ubuntu-latest'
          nodeVersion: '20'
        ${{ if eq(parameters.fullMatrix, true) }}:
          linux_node18:
            imageName: 'ubuntu-latest'
            nodeVersion: '18'
          linux_node22:
            imageName: 'ubuntu-latest'
            nodeVersion: '22'
          windows_node20:
            imageName: 'windows-latest'
            nodeVersion: '20'
          mac_node20:
            imageName: 'macos-latest'
            nodeVersion: '20'
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: npm ci && npm test

PR builds test only Linux + Node 20; the full grid runs nightly or on demand with fullMatrix: true. This saves you hundreds of agent minutes per week.
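
One wrinkle: scheduled runs always use a parameter's default value, so a nightly cron trigger cannot set fullMatrix: true on its own. A sketch of the nightly schedule (the cron time is arbitrary):

schedules:
  - cron: '0 3 * * *'
    displayName: 'Nightly full matrix'
    branches:
      include:
        - main
    always: true

Pair the schedule with a compile-time check on Build.Reason, such as ${{ if or(eq(parameters.fullMatrix, true), eq(variables['Build.Reason'], 'Schedule')) }}, so the extra legs also expand for scheduled runs. The complete example at the end of this article uses exactly that pattern.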


Using Matrix Variables in Job Steps

Matrix variables behave exactly like any other job-scoped pipeline variable. They are accessible via macro syntax $(variableName), via runtime expressions such as variables['variableName'] in step conditions, and as environment variables in scripts.

jobs:
  - job: Test
    strategy:
      matrix:
        pg15:
          dbEngine: 'postgres'
          dbVersion: '15'
          dbPort: '5432'
        pg16:
          dbEngine: 'postgres'
          dbVersion: '16'
          dbPort: '5432'
        mongo7:
          dbEngine: 'mongo'
          dbVersion: '7'
          dbPort: '27017'
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - script: |
          echo "Testing against $(dbEngine) version $(dbVersion) on port $(dbPort)"
          docker run -d --name testdb -p $(dbPort):$(dbPort) $(dbEngine):$(dbVersion)
        displayName: 'Start $(dbEngine) $(dbVersion)'

      - script: |
          export DB_ENGINE=$(dbEngine)
          export DB_PORT=$(dbPort)
          npm ci
          npm run test:database
        displayName: 'Run database tests'
        env:
          DB_ENGINE: $(dbEngine)
          DB_PORT: $(dbPort)

Note the env mapping on the script step. While $(variableName) works for inline expansion, explicitly mapping variables via env is more reliable for scripts that read environment variables programmatically.


Matrix for Cross-Platform Testing

Cross-platform testing is the most common matrix use case. Here is a realistic example for a Node.js CLI tool that needs to work on all three major platforms:

trigger:
  branches:
    include:
      - main
      - 'release/*'

jobs:
  - job: CrossPlatformTest
    strategy:
      maxParallel: 4
      matrix:
        linux:
          imageName: 'ubuntu-22.04'
          testSuite: 'full'
        windows:
          imageName: 'windows-2022'
          testSuite: 'full'
        macos_intel:
          imageName: 'macos-12'
          testSuite: 'full'
        macos_arm:
          imageName: 'macos-14'
          testSuite: 'core'
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: '20'
      - script: npm ci
        displayName: 'Install dependencies'
      - script: npm run test:$(testSuite)
        displayName: 'Run $(testSuite) test suite'
      - task: PublishTestResults@2
        inputs:
          testResultsFormat: 'JUnit'
          testResultsFiles: '**/test-results.xml'
        condition: always()

Notice the testSuite variable. The macOS ARM runner (macos-14) runs a reduced core suite because some native module tests are flaky on ARM. This is the kind of pragmatic decision matrix builds let you encode directly in the pipeline.


Matrix for Multi-Version Testing

If you maintain a library or framework that promises compatibility across multiple Node.js versions, matrix builds are non-negotiable:

jobs:
  - job: MultiVersionTest
    strategy:
      matrix:
        node18:
          nodeVersion: '18'
        node20:
          nodeVersion: '20'
        node22:
          nodeVersion: '22'
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: |
          node --version
          npm --version
        displayName: 'Print versions'
      - script: npm ci
        displayName: 'Install'
      - script: npm test
        displayName: 'Test'
      - script: npm run build
        displayName: 'Build'

This is simple, but it catches real bugs. I have personally seen Object.hasOwn() calls break on Node 14, structuredClone fail on Node 16, and fetch being unavailable before Node 18. The matrix catches these before your users do.


Matrix for Multi-Database Testing

Applications that support multiple database backends need to test against each one. Here is a pattern using Docker service containers in the matrix:

jobs:
  - job: DatabaseTest
    strategy:
      matrix:
        postgresql:
          dbImage: 'postgres:16'
          dbPort: '5432'
          dbType: 'pg'
          dbEnv: '-e POSTGRES_PASSWORD=testpass -e POSTGRES_DB=testdb'
        mysql:
          dbImage: 'mysql:8'
          dbPort: '3306'
          dbType: 'mysql'
          dbEnv: '-e MYSQL_ROOT_PASSWORD=testpass -e MYSQL_DATABASE=testdb'
        mongodb:
          dbImage: 'mongo:7'
          dbPort: '27017'
          dbType: 'mongo'
          dbEnv: ''
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - script: |
          docker run -d \
            --name testdb \
            -p $(dbPort):$(dbPort) \
            $(dbEnv) \
            $(dbImage)
          sleep 10
        displayName: 'Start $(dbType) container'
      - task: NodeTool@0
        inputs:
          versionSpec: '20'
      - script: |
          npm ci
          DB_TYPE=$(dbType) DB_PORT=$(dbPort) npm run test:integration
        displayName: 'Run integration tests against $(dbType)'
      - script: docker stop testdb && docker rm testdb
        displayName: 'Cleanup'
        condition: always()

The test runner reads DB_TYPE and DB_PORT to connect to the right database. Your data access layer tests verify that queries produce identical results regardless of backend.
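
The fixed sleep 10 after docker run works most of the time but is fragile on slow agents. A more robust variant polls the published port until the database accepts connections, using the same matrix variables (a sketch; it relies on bash's built-in /dev/tcp):

steps:
  - script: |
      docker run -d --name testdb -p $(dbPort):$(dbPort) $(dbEnv) $(dbImage)
      # Poll for up to 60 seconds until the port accepts connections
      for attempt in {1..30}; do
        if timeout 1 bash -c "</dev/tcp/localhost/$(dbPort)" 2>/dev/null; then
          echo "$(dbType) is ready"
          exit 0
        fi
        sleep 2
      done
      echo "Timed out waiting for $(dbType)"
      exit 1
    displayName: 'Start $(dbType) and wait for readiness'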


Combining Matrix with Templates

Templates make matrix strategies reusable across repositories. Define the job template once, and every repo can use it with its own parameters:

File: templates/test-matrix.yml

parameters:
  - name: nodeVersions
    type: object
    default: ['20']
  - name: operatingSystems
    type: object
    default:
      - name: linux
        vmImage: 'ubuntu-latest'
  - name: maxParallel
    type: number
    default: 4
  - name: testCommand
    type: string
    default: 'npm test'

jobs:
  - job: MatrixTest
    strategy:
      maxParallel: ${{ parameters.maxParallel }}
      matrix:
        ${{ each os in parameters.operatingSystems }}:
          ${{ each ver in parameters.nodeVersions }}:
            ${{ os.name }}_node${{ ver }}:
              imageName: ${{ os.vmImage }}
              nodeVersion: ${{ ver }}
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: npm ci
        displayName: 'Install'
      - script: ${{ parameters.testCommand }}
        displayName: 'Test'
      - task: PublishTestResults@2
        inputs:
          testResultsFormat: 'JUnit'
          testResultsFiles: '**/junit.xml'
        condition: always()

Usage in a pipeline:

stages:
  - stage: Test
    jobs:
      - template: templates/test-matrix.yml
        parameters:
          nodeVersions: ['18', '20', '22']
          operatingSystems:
            - name: linux
              vmImage: 'ubuntu-latest'
            - name: windows
              vmImage: 'windows-latest'
          maxParallel: 4
          testCommand: 'npm run test:ci'

Every team in your organization uses the same test template. When you need to add Node 24 support, update the default in one place.


Parallel Jobs Without Matrix

Not every parallel scenario needs a matrix. When jobs do fundamentally different work but can run simultaneously, use explicit parallel jobs:

stages:
  - stage: QualityGates
    jobs:
      - job: UnitTests
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: npm ci && npm test
            displayName: 'Unit tests'

      - job: Lint
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: npm ci && npm run lint
            displayName: 'Lint'

      - job: SecurityScan
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: npm ci && npm audit --audit-level=high
            displayName: 'Security audit'

      - job: TypeCheck
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: npm ci && npx tsc --noEmit
            displayName: 'Type check'

All four jobs run in parallel because they have no dependencies on each other. A pipeline that ran these sequentially might take 12 minutes. In parallel, it takes as long as the slowest job — typically 4-5 minutes.


Fan-Out/Fan-In Patterns

Fan-out/fan-in is a powerful pattern where you parallelize work in the middle of your pipeline and then converge back to a single job for aggregation or final deployment.

Fan-Out: Parallel Regional Deployment

stages:
  - stage: Build
    jobs:
      - job: BuildArtifact
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: npm ci && npm run build
          - publish: $(System.DefaultWorkingDirectory)/dist
            artifact: app-dist

  - stage: DeployRegions
    dependsOn: Build
    jobs:
      - job: DeployEastUS
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - download: current
            artifact: app-dist
          - script: |
              echo "Deploying to East US..."
              # az webapp deploy --resource-group rg-east --name app-eastus
            displayName: 'Deploy East US'

      - job: DeployWestUS
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - download: current
            artifact: app-dist
          - script: |
              echo "Deploying to West US..."
              # az webapp deploy --resource-group rg-west --name app-westus
            displayName: 'Deploy West US'

      - job: DeployEUWest
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - download: current
            artifact: app-dist
          - script: |
              echo "Deploying to EU West..."
              # az webapp deploy --resource-group rg-eu --name app-euwest
            displayName: 'Deploy EU West'

  - stage: Validate
    dependsOn: DeployRegions
    jobs:
      - job: HealthCheck
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: |
              echo "Running health checks across all regions..."
              curl -f https://app-eastus.azurewebsites.net/health || exit 1
              curl -f https://app-westus.azurewebsites.net/health || exit 1
              curl -f https://app-euwest.azurewebsites.net/health || exit 1
            displayName: 'Health check all regions'

The Build stage runs once. DeployRegions fans out into 3 parallel jobs. Validate fans back in and waits for all 3 deployments to finish before running health checks.

Fan-In with Matrix

You can combine fan-out with matrix for the deployment stage too:

  - stage: DeployRegions
    dependsOn: Build
    jobs:
      - job: Deploy
        strategy:
          maxParallel: 2
          matrix:
            eastus:
              region: 'eastus'
              appName: 'app-eastus'
            westus:
              region: 'westus'
              appName: 'app-westus'
            euwest:
              region: 'euwest'
              appName: 'app-euwest'
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - download: current
            artifact: app-dist
          - script: |
              echo "Deploying to $(region)..."
              # az webapp deploy --name $(appName) --src-path $(Pipeline.Workspace)/app-dist
            displayName: 'Deploy to $(region)'

With maxParallel: 2, you deploy to 2 regions at a time, keeping one region live while the others update. This is a basic rolling deployment strategy.


Optimizing Pipeline Duration with Parallelism

Here are concrete strategies I use to cut pipeline duration:

1. Split test suites across parallel jobs:

jobs:
  - job: Test
    strategy:
      matrix:
        unit:
          testGlob: 'test/unit/**/*.test.js'
        integration:
          testGlob: 'test/integration/**/*.test.js'
        e2e:
          testGlob: 'test/e2e/**/*.test.js'
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - script: npm ci
      - script: npx jest --testPathPattern="$(testGlob)"
        displayName: 'Run tests matching $(testGlob)'

2. Cache dependencies per matrix leg:

steps:
  - task: Cache@2
    inputs:
      key: 'npm | "$(Agent.OS)" | "$(nodeVersion)" | package-lock.json'
      path: '$(Pipeline.Workspace)/.npm'
    displayName: 'Cache npm'
  - script: npm ci --cache $(Pipeline.Workspace)/.npm
    displayName: 'Install (cached)'

The cache key includes Agent.OS and nodeVersion from the matrix, so each leg caches independently. The Linux + Node 20 leg does not pollute the cache for Windows + Node 18.

3. Use parallel strategy for test sharding:

Azure Pipelines also supports a parallel strategy (distinct from matrix) that creates N identical jobs:

jobs:
  - job: ShardedTests
    strategy:
      parallel: 4
    pool:
      vmImage: 'ubuntu-latest'
    steps:
      - script: npm ci
      - script: |
          npx jest --shard=$(System.JobPositionInPhase)/$(System.TotalJobsInPhase)
        displayName: 'Run test shard $(System.JobPositionInPhase) of $(System.TotalJobsInPhase)'

Jest's --shard flag splits tests across workers. If you have 400 tests and 4 parallel jobs, each job runs roughly 100 tests. This is faster than running all 400 on one agent.


Cost Considerations

Parallel jobs are not free. Every matrix leg consumes one parallel job slot for the duration of its execution. Here is what that means in practice:

  • Microsoft-hosted agents: Free tier gets 1 parallel job with 1800 minutes/month. Each additional parallel job is approximately $40/month. A 9-leg matrix on every PR commit adds up fast.
  • Self-hosted agents: No per-minute cost, but you need hardware. Each parallel leg needs an available agent. If you have 4 self-hosted agents and a 9-leg matrix, 5 legs queue until agents free up.
  • macOS agents: Microsoft-hosted macOS capacity is much more limited than Linux, so macOS legs queue longer and are typically the slowest part of a hosted matrix. A matrix that includes macOS should limit those legs to nightly builds, not PR triggers.

Practical rules I follow:

  1. PR builds use a minimal matrix (1-2 legs). Full matrix runs nightly or on release branches.
  2. macOS legs only run on nightly schedules or manually triggered runs.
  3. maxParallel is always set to leave at least 2 slots free for other pipelines.
  4. Self-hosted agent pools have dedicated agents for matrix jobs, separate from deployment agents.

Complete Working Example

Here is a production-ready pipeline that tests a Node.js application across 3 operating systems and 3 Node.js versions, limits concurrency, and then deploys to 3 regions with a final health check.

trigger:
  branches:
    include:
      - main
  paths:
    exclude:
      - '*.md'
      - 'docs/**'

pr:
  branches:
    include:
      - main

parameters:
  - name: fullMatrix
    displayName: 'Run full test matrix'
    type: boolean
    default: false
  - name: deployAfterTest
    displayName: 'Deploy after tests pass'
    type: boolean
    default: false

variables:
  - name: npmCacheFolder
    value: $(Pipeline.Workspace)/.npm

stages:
  # ============================================================
  # Stage 1: Build
  # ============================================================
  - stage: Build
    displayName: 'Build'
    jobs:
      - job: BuildApp
        displayName: 'Build application'
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: NodeTool@0
            inputs:
              versionSpec: '20'
          - task: Cache@2
            inputs:
              key: 'npm | "$(Agent.OS)" | package-lock.json'
              path: $(npmCacheFolder)
            displayName: 'Cache npm packages'
          - script: npm ci --cache $(npmCacheFolder)
            displayName: 'Install dependencies'
          - script: npm run build
            displayName: 'Build'
          - publish: $(System.DefaultWorkingDirectory)/dist
            artifact: app-dist
            displayName: 'Publish build artifact'

  # ============================================================
  # Stage 2: Test Matrix
  # ============================================================
  - stage: Test
    displayName: 'Test'
    dependsOn: Build
    jobs:
      - job: TestMatrix
        displayName: 'Test'
        strategy:
          maxParallel: 4
          matrix:
            # Always run: Linux + Node 20 (primary target)
            linux_node20:
              imageName: 'ubuntu-latest'
              nodeVersion: '20'
              osLabel: 'Linux'

            # Always run: Windows + Node 20
            windows_node20:
              imageName: 'windows-latest'
              nodeVersion: '20'
              osLabel: 'Windows'

            # Full matrix only: remaining combinations
            ${{ if or(eq(parameters.fullMatrix, true), eq(variables['Build.Reason'], 'Schedule')) }}:
              linux_node18:
                imageName: 'ubuntu-latest'
                nodeVersion: '18'
                osLabel: 'Linux'
              linux_node22:
                imageName: 'ubuntu-latest'
                nodeVersion: '22'
                osLabel: 'Linux'
              windows_node18:
                imageName: 'windows-latest'
                nodeVersion: '18'
                osLabel: 'Windows'
              windows_node22:
                imageName: 'windows-latest'
                nodeVersion: '22'
                osLabel: 'Windows'
              mac_node18:
                imageName: 'macos-latest'
                nodeVersion: '18'
                osLabel: 'macOS'
              mac_node20:
                imageName: 'macos-latest'
                nodeVersion: '20'
                osLabel: 'macOS'
              mac_node22:
                imageName: 'macos-latest'
                nodeVersion: '22'
                osLabel: 'macOS'

        pool:
          vmImage: $(imageName)
        steps:
          - task: NodeTool@0
            inputs:
              versionSpec: $(nodeVersion)
            displayName: 'Use Node.js $(nodeVersion)'

          - task: Cache@2
            inputs:
              key: 'npm | "$(Agent.OS)" | "$(nodeVersion)" | package-lock.json'
              path: $(npmCacheFolder)
            displayName: 'Cache npm packages'

          - script: npm ci --cache $(npmCacheFolder)
            displayName: 'Install dependencies'

          - script: |
              echo "========================================="
              echo "Testing on $(osLabel) with Node.js $(nodeVersion)"
              echo "========================================="
              node --version
              npm --version
            displayName: 'Print environment info'

          - script: npm run lint
            displayName: 'Lint'

          - script: npm test -- --ci --reporters=default --reporters=jest-junit
            displayName: 'Run tests'
            env:
              JEST_JUNIT_OUTPUT_DIR: $(System.DefaultWorkingDirectory)/test-results
              JEST_JUNIT_OUTPUT_NAME: junit-$(osLabel)-node$(nodeVersion).xml

          - task: PublishTestResults@2
            inputs:
              testResultsFormat: 'JUnit'
              testResultsFiles: '**/junit-*.xml'
              testRunTitle: '$(osLabel) - Node $(nodeVersion)'
              mergeTestResults: false
            condition: always()
            displayName: 'Publish test results'

  # ============================================================
  # Stage 3: Deploy (fan-out to 3 regions)
  # ============================================================
  - stage: Deploy
    displayName: 'Deploy to regions'
    dependsOn: Test
    condition: |
      and(
        succeeded(),
        eq(variables['Build.SourceBranch'], 'refs/heads/main'),
        or(
          eq('${{ parameters.deployAfterTest }}', 'true'),
          eq(variables['Build.Reason'], 'Schedule')
        )
      )
    jobs:
      - job: DeployRegion
        displayName: 'Deploy'
        strategy:
          maxParallel: 2
          matrix:
            eastus:
              region: 'East US'
              regionCode: 'eastus'
              appName: 'myapp-eastus'
              healthUrl: 'https://myapp-eastus.azurewebsites.net/health'
            westus:
              region: 'West US'
              regionCode: 'westus'
              appName: 'myapp-westus'
              healthUrl: 'https://myapp-westus.azurewebsites.net/health'
            euwest:
              region: 'EU West'
              regionCode: 'euwest'
              appName: 'myapp-euwest'
              healthUrl: 'https://myapp-euwest.azurewebsites.net/health'
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - download: current
            artifact: app-dist
            displayName: 'Download build artifact'

          - script: |
              echo "Deploying to $(region) ($(regionCode))..."
              echo "Target: $(appName)"
              # In production, replace with:
              # az webapp deploy \
              #   --resource-group rg-$(regionCode) \
              #   --name $(appName) \
              #   --src-path $(Pipeline.Workspace)/app-dist
            displayName: 'Deploy to $(region)'

          - script: |
              echo "Waiting 30 seconds for deployment to stabilize..."
              sleep 30
              echo "Checking health at $(healthUrl)..."
              curl -sf $(healthUrl) || exit 1
              echo "$(region) is healthy."
            displayName: 'Post-deploy health check ($(region))'

  # ============================================================
  # Stage 4: Final Validation (fan-in)
  # ============================================================
  - stage: Validate
    displayName: 'Final validation'
    dependsOn: Deploy
    condition: succeeded()
    jobs:
      - job: GlobalHealthCheck
        displayName: 'Global health check'
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: |
              echo "Running global validation across all regions..."
              FAILED=0

              for URL in \
                "https://myapp-eastus.azurewebsites.net/health" \
                "https://myapp-westus.azurewebsites.net/health" \
                "https://myapp-euwest.azurewebsites.net/health"; do
                echo "Checking $URL..."
                if curl -sf "$URL" > /dev/null 2>&1; then
                  echo "  OK"
                else
                  echo "  FAILED"
                  FAILED=1
                fi
              done

              if [ "$FAILED" -eq 1 ]; then
                echo "One or more regions failed health check!"
                exit 1
              fi

              echo "All regions healthy. Deployment complete."
            displayName: 'Verify all regions'

          - script: |
              echo "Sending deployment notification..."
              # curl -X POST "$SLACK_WEBHOOK" \
              #   -H 'Content-Type: application/json' \
              #   -d '{"text":"Deployment to all regions complete."}'
            displayName: 'Notify team'
            condition: always()

This pipeline demonstrates:

  1. Conditional matrix expansion — PR builds test 2 legs; scheduled/manual builds test 9 legs
  2. maxParallel: 4 — Limits test concurrency to leave agent pool capacity for other pipelines
  3. Fan-out deployment — 3 regions deploy in parallel with maxParallel: 2 for rolling safety
  4. Fan-in validation — A single job waits for all deployments and validates the global state
  5. Cache per matrix leg — Each OS + Node version combination has its own npm cache
  6. Per-leg test results — JUnit files are named with the OS and Node version for clear reporting

Common Issues and Troubleshooting

1. Matrix Variable Not Expanding

Error:

##[error]Pool not found: $(imageName)

Cause: You used $(imageName) in the pool section but the matrix variable name does not match. Matrix variables are case-sensitive.

Fix: Double-check the variable name in your matrix definition matches exactly what you reference. imagename is not the same as imageName.

# Wrong - case mismatch
matrix:
  linux:
    ImageName: 'ubuntu-latest'  # Capital I
pool:
  vmImage: $(imageName)  # Lowercase i — will not resolve

# Right
matrix:
  linux:
    imageName: 'ubuntu-latest'
pool:
  vmImage: $(imageName)

2. maxParallel Ignored on Free Tier

Symptom: You set maxParallel: 3 but jobs run one at a time.

Cause: Your Azure DevOps organization only has 1 parallel job slot (the free tier default). maxParallel cannot exceed what your organization actually has.

Fix: Purchase additional parallel job slots in Organization Settings > Billing, or use self-hosted agents.

Organization Settings > Billing > Paid parallel jobs
  Microsoft-hosted: Change from 0 to desired number
  Self-hosted: Change from 0 to desired number

3. Matrix Expansion Produces Invalid YAML

Error:

##[error]The pipeline is not valid. /azure-pipelines.yml (Line: 14, Col: 13):
Unexpected value 'linux node18'

Cause: Matrix key names cannot contain spaces. The YAML parser treats the space as a key-value separator.

Fix: Use underscores, hyphens, or camelCase for matrix key names:

# Wrong
matrix:
  linux node18:     # Space in key
    imageName: 'ubuntu-latest'

# Right
matrix:
  linux_node18:     # Underscore
    imageName: 'ubuntu-latest'

4. Each Expression Not Generating Matrix Entries

Error:

##[error]The pipeline is not valid. Job TestMatrix: Matrix 'strategy' does not
contain any configurations.

Cause: The ${{ each }} expression produced zero entries, usually because the parameter was empty or the nesting was incorrect.

Fix: Verify your parameter defaults are not empty arrays, and check the nesting level of your ${{ each }} block:

# Wrong — parameters.versions is an object, not iterable
parameters:
  - name: versions
    type: string
    default: '18,20,22'

# Right — use type: object for iterable lists
parameters:
  - name: versions
    type: object
    default:
      - '18'
      - '20'
      - '22'

5. Dependent Stage Cannot Reference Matrix Job Output

Error:

##[error]Unrecognized value: 'stageDependencies.Test.TestMatrix.outputs'

Cause: Matrix jobs produce output variables per leg, and the syntax to reference them requires the matrix leg name. You cannot reference the parent matrix job's outputs generically.

Fix: Reference the specific leg's output using the format dependencies.JobName.outputs['legName.stepName.variableName']:

# In a dependent job:
variables:
  testResult: $[ dependencies.TestMatrix.outputs['linux_node20.runTests.testStatus'] ]
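
For that reference to resolve, the leg must publish the variable from a named step with isOutput=true. A minimal sketch of the producing step (the step and variable names match the reference above):

# Inside the TestMatrix job's steps:
- script: echo "##vso[task.setvariable variable=testStatus;isOutput=true]passed"
  name: runTests   # referenced as '<legName>.runTests.testStatus'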

6. Pipeline Times Out with Large Matrix

Error:

##[error]The job running on agent Hosted Agent ran longer than the maximum time
of 360 minutes.

Cause: A slow or hung leg hit the maximum job runtime for hosted agents. A large matrix with maxParallel: 1 or limited agent capacity makes the overall run far longer, because legs queue behind each other waiting for an agent.

Fix: Increase maxParallel, reduce matrix size, or set timeoutInMinutes at the job level:

jobs:
  - job: Test
    timeoutInMinutes: 30
    strategy:
      maxParallel: 4
      matrix:
        # ...

Best Practices

  • Start minimal, expand on schedule. Use a 1-2 leg matrix for PR builds and the full matrix for nightly or release builds. Developers should not wait 20 minutes for a 9-leg matrix on every push.

  • Always set maxParallel. Without it, a large matrix consumes every available agent slot and blocks other pipelines in your organization. Leave headroom for deployments and other teams.

  • Name matrix keys descriptively. linux_node20_pg16 is immediately clear. config1 tells you nothing. These names appear in the Azure DevOps UI, in test result dashboards, and in failure notifications.

  • Cache per matrix leg. Include OS and version identifiers in your cache key. A cache built on Linux with Node 18 will not work on Windows with Node 22. Shared caches lead to mysterious build failures.

  • Publish test results with unique run titles. Use the matrix variables in testRunTitle so you can immediately see which leg failed without drilling into logs. Aggregated test results without leg identification are useless.

  • Use conditions to skip expensive steps on certain legs. Integration tests might only need to run on Linux. Documentation builds might only need the latest Node version. Do not waste macOS agent minutes on tasks that are platform-independent.

  • Put the most common failure scenario first in the matrix. If 80% of your bugs are caught on Linux + Node 20, make sure that combination runs first. If maxParallel is lower than your total legs, the first entries in the matrix run first.

  • Monitor agent pool utilization. In Organization Settings > Agent Pools, check queue times. If matrix jobs routinely queue for more than 2 minutes, you either need more parallel job slots or a smaller matrix.

  • Keep matrix variables focused. Each matrix entry should only define variables that actually differ between legs. Do not duplicate shared configuration in every entry — put shared values in the variables section of the job, as in the sketch after this list.

  • Test your matrix expansion locally. Use the Azure DevOps REST API or the az pipelines run CLI to preview what the matrix expands into before committing a complex ${{ each }} template.
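
A minimal sketch of the "keep matrix variables focused" point: shared values are declared once on the job, and each matrix entry carries only what actually differs (the variable names are illustrative):

jobs:
  - job: Test
    variables:
      npmCacheFolder: $(Pipeline.Workspace)/.npm   # shared by every leg
      testCommand: 'npm run test:ci'               # shared by every leg
    strategy:
      matrix:
        linux_node20:
          imageName: 'ubuntu-latest'
          nodeVersion: '20'
        windows_node20:
          imageName: 'windows-latest'
          nodeVersion: '20'
    pool:
      vmImage: $(imageName)
    steps:
      - task: NodeTool@0
        inputs:
          versionSpec: $(nodeVersion)
      - script: npm ci --cache $(npmCacheFolder) && $(testCommand)
        displayName: 'Install and test'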

