Kubernetes Deployment from Azure DevOps

A comprehensive guide to deploying applications to Kubernetes clusters from Azure DevOps Pipelines, covering AKS integration, Helm chart releases, manifest management, canary deployments, and rollback strategies with complete working examples.

Overview

Deploying to Kubernetes from Azure DevOps is one of the most common integration patterns I encounter. Azure Kubernetes Service gives you a managed cluster, Azure Container Registry holds your images, and Azure Pipelines orchestrates the build-push-deploy cycle. The challenge is not getting a basic deployment working — that takes fifteen minutes. The challenge is building a pipeline that handles multiple environments, manages secrets properly, supports rollbacks, and does not leave you debugging a broken production cluster at 2 AM because someone pushed a bad image tag.

Prerequisites

  • Azure Kubernetes Service (AKS) cluster or any Kubernetes cluster with API access
  • Azure Container Registry (ACR) or other container registry
  • Azure DevOps project with Pipelines enabled
  • Service Connection to your AKS cluster (Kubernetes type) in Azure DevOps
  • Docker installed for local image building
  • kubectl and helm CLI tools for local testing
  • Basic familiarity with Kubernetes manifests and Helm chart structure

Connecting Azure DevOps to Kubernetes

AKS Service Connection

The simplest path is using Azure DevOps' native AKS integration. Navigate to Project Settings > Service connections > New service connection > Kubernetes and select "Azure Subscription." Pick your AKS cluster and Azure DevOps creates a service account with the necessary RBAC bindings.

For non-AKS clusters or tighter security, use the "KubeConfig" option and paste your kubeconfig file. Store the kubeconfig as a secure file in the Pipeline Library rather than inline — this lets you reference it across pipelines and rotate credentials in one place.
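
A minimal sketch of consuming that secure file in a pipeline, assuming it was uploaded under the name kubeconfig-prod:

# Download the kubeconfig secure file and point kubectl at it
steps:
  - task: DownloadSecureFile@1
    name: kubeconfig
    displayName: "Download kubeconfig"
    inputs:
      secureFile: "kubeconfig-prod" # assumed secure-file name in the Pipeline Library

  - script: kubectl --kubeconfig $(kubeconfig.secureFilePath) get nodes
    displayName: "Smoke-test cluster access"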

ACR Integration with AKS

Attach your Azure Container Registry directly to AKS so the cluster can pull images without separate credentials:

# Attach ACR to AKS
az aks update \
  --resource-group rg-myapp \
  --name aks-myapp \
  --attach-acr myappregistry

# Verify the attachment
az aks check-acr \
  --resource-group rg-myapp \
  --name aks-myapp \
  --acr myappregistry.azurecr.io

This creates a role assignment granting the AKS managed identity AcrPull access to the registry. No image pull secrets needed in your manifests.

Basic Build and Deploy Pipeline

The fundamental pipeline builds a Docker image, pushes it to ACR, and deploys to AKS:

# azure-pipelines-k8s.yml
trigger:
  branches:
    include:
      - main
  paths:
    include:
      - src/**
      - k8s/**
      - Dockerfile

pool:
  vmImage: "ubuntu-latest"

variables:
  acrName: "myappregistry"
  acrLoginServer: "myappregistry.azurecr.io"
  imageName: "myapp-api"
  k8sNamespace: "production"
  k8sServiceConnection: "aks-production"

stages:
  - stage: Build
    displayName: "Build & Push Image"
    jobs:
      - job: BuildImage
        steps:
          - task: Docker@2
            displayName: "Build and push image"
            inputs:
              containerRegistry: $(acrName)
              repository: $(imageName)
              command: buildAndPush
              Dockerfile: "**/Dockerfile"
              tags: |
                $(Build.BuildId)
                latest

  - stage: Deploy
    displayName: "Deploy to AKS"
    dependsOn: Build
    jobs:
      - deployment: DeployToAKS
        displayName: "Deploy to Kubernetes"
        environment: "production.$(k8sNamespace)"
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self # deployment jobs do not check out the repo by default; needed for the k8s manifests

                - task: KubernetesManifest@1
                  displayName: "Deploy manifests"
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: $(k8sServiceConnection)
                    namespace: $(k8sNamespace)
                    manifests: |
                      k8s/deployment.yaml
                      k8s/service.yaml
                    containers: |
                      $(acrLoginServer)/$(imageName):$(Build.BuildId)

The KubernetesManifest@1 task automatically patches the image tag in your deployment manifest. It finds container image references matching your repository name and replaces the tag with the one you specify.

Kubernetes Manifests

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp-api
  template:
    metadata:
      labels:
        app: myapp-api
    spec:
      containers:
        - name: myapp-api
          image: myappregistry.azurecr.io/myapp-api:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: NODE_ENV
              value: production
            - name: PORT
              value: "3000"
            - name: DB_CONNECTION
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: db-connection
---
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-api
spec:
  type: ClusterIP
  selector:
    app: myapp-api
  ports:
    - port: 80
      targetPort: 3000
      protocol: TCP

Helm Chart Deployments

For anything beyond trivial applications, Helm charts are the standard. They handle templating, versioning, rollbacks, and dependency management.

Pipeline with Helm

# azure-pipelines-helm.yml
trigger:
  branches:
    include:
      - main

pool:
  vmImage: "ubuntu-latest"

variables:
  acrName: "myappregistry"
  acrLoginServer: "myappregistry.azurecr.io"
  imageName: "myapp-api"
  chartPath: "charts/myapp-api"
  releaseName: "myapp-api"

stages:
  - stage: Build
    jobs:
      - job: BuildAndPush
        steps:
          - task: Docker@2
            displayName: "Build and push"
            inputs:
              containerRegistry: $(acrName)
              repository: $(imageName)
              command: buildAndPush
              Dockerfile: "**/Dockerfile"
              tags: $(Build.BuildId)

          - task: HelmInstaller@0
            displayName: "Install Helm"
            inputs:
              helmVersion: "3.13.0"

          - script: |
              helm package $(chartPath) \
                --version 1.0.$(Build.BuildId) \
                --app-version $(Build.BuildId) \
                --destination $(Build.ArtifactStagingDirectory)
            displayName: "Package Helm chart"

          - task: PublishPipelineArtifact@1
            inputs:
              targetPath: "$(Build.ArtifactStagingDirectory)"
              artifact: "helm-chart"

  - stage: DeployDev
    displayName: "Deploy to Dev"
    dependsOn: Build
    jobs:
      - deployment: HelmDeployDev
        environment: "dev.myapp"
        strategy:
          runOnce:
            deploy:
              steps:
                - task: HelmInstaller@0
                  inputs:
                    helmVersion: "3.13.0"

                - task: HelmDeploy@1
                  displayName: "Helm upgrade (dev)"
                  inputs:
                    connectionType: "Kubernetes Service Connection"
                    kubernetesServiceConnection: "aks-dev"
                    namespace: "dev"
                    command: upgrade
                    chartType: FilePath
                    chartPath: "$(Pipeline.Workspace)/helm-chart/*.tgz"
                    releaseName: "$(releaseName)"
                    overrideValues: |
                      image.repository=$(acrLoginServer)/$(imageName)
                      image.tag=$(Build.BuildId)
                      replicaCount=1
                      resources.requests.cpu=50m
                      resources.requests.memory=64Mi
                    install: true
                    waitForExecution: true
                    arguments: "--timeout 300s"

  - stage: DeployProd
    displayName: "Deploy to Production"
    dependsOn: DeployDev
    jobs:
      - deployment: HelmDeployProd
        environment: "production.myapp"
        strategy:
          runOnce:
            deploy:
              steps:
                - task: HelmInstaller@0
                  inputs:
                    helmVersion: "3.13.0"

                - task: HelmDeploy@1
                  displayName: "Helm upgrade (production)"
                  inputs:
                    connectionType: "Kubernetes Service Connection"
                    kubernetesServiceConnection: "aks-production"
                    namespace: "production"
                    command: upgrade
                    chartType: FilePath
                    chartPath: "$(Pipeline.Workspace)/helm-chart/*.tgz"
                    releaseName: "$(releaseName)"
                    overrideValues: |
                      image.repository=$(acrLoginServer)/$(imageName)
                      image.tag=$(Build.BuildId)
                      replicaCount=3
                      resources.requests.cpu=200m
                      resources.requests.memory=256Mi
                      ingress.enabled=true
                      ingress.host=api.myapp.com
                    install: true
                    waitForExecution: true
                    arguments: "--timeout 600s"

Helm Chart Structure

charts/myapp-api/
  Chart.yaml
  values.yaml
  templates/
    deployment.yaml
    service.yaml
    ingress.yaml
    hpa.yaml
    _helpers.tpl

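A minimal Chart.yaml for this layout; the version fields here are placeholders because the pipeline overrides them at package time:

# charts/myapp-api/Chart.yaml
apiVersion: v2
name: myapp-api
description: Helm chart for the myapp API service
type: application
version: 0.1.0 # overridden by --version in the pipeline
appVersion: "1.0" # overridden by --app-version in the pipeline
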
Example values.yaml:

# charts/myapp-api/values.yaml
replicaCount: 2

image:
  repository: myappregistry.azurecr.io/myapp-api
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80
  targetPort: 3000

ingress:
  enabled: false
  host: ""
  tls: false

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

autoscaling:
  enabled: false
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70

env:
  NODE_ENV: production
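
A trimmed sketch of what templates/deployment.yaml might look like, showing how these values flow into the rendered manifest. The myapp-api.fullname helper is assumed to be defined in _helpers.tpl:

# charts/myapp-api/templates/deployment.yaml (abridged sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp-api.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ include "myapp-api.fullname" . }}
  template:
    metadata:
      labels:
        app: {{ include "myapp-api.fullname" . }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: {{ .Values.service.targetPort }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          env:
            {{- range $key, $value := .Values.env }}
            - name: {{ $key }}
              value: {{ $value | quote }}
            {{- end }}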

Canary Deployments

The KubernetesManifest@1 task supports canary deployments natively. It creates a canary deployment alongside the stable deployment, routes a percentage of traffic to it, and promotes or rejects based on the outcome:

# azure-pipelines-canary.yml
stages:
  - stage: DeployCanary
    displayName: "Canary Deployment"
    jobs:
      - deployment: Canary
        environment: "production.myapp"
        strategy:
          canary:
            increments: [10, 20]
            preDeploy:
              steps:
                - script: echo "Preparing canary deployment..."
            deploy:
              steps:
                - checkout: self # deployment jobs do not check out the repo by default; needed for the k8s manifests

                - task: KubernetesManifest@1
                  displayName: "Deploy canary"
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: "aks-production"
                    namespace: production
                    strategy: canary
                    percentage: $(strategy.increment)
                    manifests: |
                      k8s/deployment.yaml
                      k8s/service.yaml
                    containers: |
                      $(acrLoginServer)/$(imageName):$(Build.BuildId)
            postRouteTraffic:
              steps:
                - script: |
                    echo "Running smoke tests against canary..."
                    sleep 30
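                    # NOTE: this assumes an agent that can resolve cluster-internal DNS and a
                    # "myapp-api-canary" Service; the default (non-SMI) canary strategy creates a
                    # -canary Deployment but no Service, so an in-cluster check may be needed instead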
                    curl -f http://myapp-api-canary.production.svc.cluster.local/health || exit 1
                  displayName: "Canary health check"
            on:
              success:
                steps:
                  - task: KubernetesManifest@1
                    displayName: "Promote canary"
                    inputs:
                      action: promote
                      kubernetesServiceConnection: "aks-production"
                      namespace: production
                      strategy: canary
                      manifests: |
                        k8s/deployment.yaml
              failure:
                steps:
                  - task: KubernetesManifest@1
                    displayName: "Reject canary"
                    inputs:
                      action: reject
                      kubernetesServiceConnection: "aks-production"
                      namespace: production
                      strategy: canary
                      manifests: |
                        k8s/deployment.yaml

Secret Management

Never put secrets in Kubernetes manifests or pipeline variables in plain text. Use Azure Key Vault with the CSI driver:

# k8s/secret-provider.yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: myapp-secrets
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "true"
    userAssignedIdentityID: "<managed-identity-client-id>"
    keyvaultName: "kv-myapp-prod"
    objects: |
      array:
        - |
          objectName: db-connection-string
          objectType: secret
        - |
          objectName: api-key
          objectType: secret
    tenantId: "<tenant-id>"
  secretObjects:
    - secretName: myapp-secrets
      type: Opaque
      data:
        - objectName: db-connection-string
          key: db-connection
        - objectName: api-key
          key: api-key

In your pipeline, install the Key Vault secrets before deploying:

- script: |
    kubectl apply -f k8s/secret-provider.yaml -n $(k8sNamespace)
  displayName: "Apply Secret Provider"

Complete Working Example: Multi-Service Deployment Pipeline

This example deploys a complete application stack — API, worker, and migration job — to Kubernetes with proper ordering and health verification.

// scripts/verify-deployment.js
// Checks Kubernetes deployment health after pipeline deploy
var exec = require("child_process").execSync;

var namespace = process.argv[2] || "production";
var deploymentName = process.argv[3] || "myapp-api";
var timeout = parseInt(process.argv[4], 10) || 120;

function getDeploymentStatus(ns, name) {
    try {
        var output = exec(
            "kubectl get deployment " + name + " -n " + ns + " -o json",
            { encoding: "utf8", timeout: 10000 }
        );
        return JSON.parse(output);
    } catch (err) {
        console.error("Failed to get deployment status: " + err.message);
        return null;
    }
}

function checkRolloutStatus(ns, name) {
    try {
        var output = exec(
            "kubectl rollout status deployment/" + name + " -n " + ns + " --timeout=" + timeout + "s",
            { encoding: "utf8", timeout: (timeout + 10) * 1000 }
        );
        console.log(output);
        return true;
    } catch (err) {
        console.error("Rollout failed: " + err.message);
        return false;
    }
}

function getPodStatuses(ns, name) {
    try {
        var output = exec(
            "kubectl get pods -n " + ns + " -l app=" + name + " -o json",
            { encoding: "utf8", timeout: 10000 }
        );
        var pods = JSON.parse(output);

        pods.items.forEach(function (pod) {
            var phase = pod.status.phase;
            var ready = pod.status.conditions
                ? pod.status.conditions.filter(function (c) { return c.type === "Ready" && c.status === "True"; }).length > 0
                : false;
            var restarts = 0;
            if (pod.status.containerStatuses) {
                pod.status.containerStatuses.forEach(function (cs) {
                    restarts += cs.restartCount;
                });
            }
            console.log("  Pod " + pod.metadata.name + ": phase=" + phase + " ready=" + ready + " restarts=" + restarts);
        });

        return pods;
    } catch (err) {
        console.error("Failed to get pod statuses: " + err.message);
        return null;
    }
}

function verifyHealthEndpoint(ns, name, port) {
    try {
        // Port-forward briefly to check health
        var curlResult = exec(
            "kubectl exec -n " + ns + " deploy/" + name +
            " -- wget -qO- http://localhost:" + port + "/health 2>/dev/null || " +
            "kubectl exec -n " + ns + " deploy/" + name +
            " -- curl -sf http://localhost:" + port + "/health 2>/dev/null",
            { encoding: "utf8", timeout: 15000 }
        );
        console.log("Health check response: " + curlResult.trim());
        return true;
    } catch (err) {
        console.error("Health check failed: " + err.message);
        return false;
    }
}

console.log("Verifying deployment: " + deploymentName + " in namespace: " + namespace);
console.log("Timeout: " + timeout + "s\n");

// Step 1: Check rollout status
console.log("--- Checking rollout status ---");
var rolloutOk = checkRolloutStatus(namespace, deploymentName);
if (!rolloutOk) {
    console.error("\nROLLOUT FAILED. Checking pod details...");
    getPodStatuses(namespace, deploymentName);
    process.exit(1);
}

// Step 2: Verify deployment object
console.log("\n--- Checking deployment details ---");
var deployment = getDeploymentStatus(namespace, deploymentName);
if (deployment) {
    var status = deployment.status;
    console.log("  Desired replicas: " + status.replicas);
    console.log("  Ready replicas: " + (status.readyReplicas || 0));
    console.log("  Updated replicas: " + (status.updatedReplicas || 0));
    console.log("  Available replicas: " + (status.availableReplicas || 0));

    if ((status.readyReplicas || 0) < status.replicas) {
        console.error("\nNot all replicas are ready!");
        process.exit(1);
    }
}

// Step 3: Check individual pods
console.log("\n--- Pod statuses ---");
getPodStatuses(namespace, deploymentName);

// Step 4: Health endpoint
console.log("\n--- Health endpoint check ---");
var healthy = verifyHealthEndpoint(namespace, deploymentName, 3000);

if (healthy) {
    console.log("\nDeployment verification PASSED");
    process.exit(0);
} else {
    console.error("\nDeployment verification FAILED — health endpoint not responding");
    process.exit(1);
}

The full multi-service pipeline:

# azure-pipelines-multi-service.yml
trigger:
  branches:
    include:
      - main

pool:
  vmImage: "ubuntu-latest"

variables:
  acrLoginServer: "myappregistry.azurecr.io"
  tag: "$(Build.BuildId)"

stages:
  - stage: Build
    displayName: "Build All Services"
    jobs:
      - job: BuildAPI
        displayName: "Build API"
        steps:
          - task: Docker@2
            inputs:
              containerRegistry: myappregistry
              repository: myapp-api
              command: buildAndPush
              Dockerfile: services/api/Dockerfile
              tags: $(tag)

      - job: BuildWorker
        displayName: "Build Worker"
        steps:
          - task: Docker@2
            inputs:
              containerRegistry: myappregistry
              repository: myapp-worker
              command: buildAndPush
              Dockerfile: services/worker/Dockerfile
              tags: $(tag)

  - stage: Migrate
    displayName: "Run Migrations"
    dependsOn: Build
    jobs:
      - deployment: RunMigrations
        environment: "production.myapp"
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self # deployment jobs do not check out the repo by default; needed for the migration manifest

                - task: Kubernetes@1
                  displayName: "Run migration job"
                  inputs:
                    connectionType: "Kubernetes Service Connection"
                    kubernetesServiceConnection: "aks-production"
                    namespace: production
                    command: apply
                    arguments: "-f k8s/migration-job.yaml"

                - task: Kubernetes@1
                  displayName: "Wait for migration"
                  inputs:
                    connectionType: "Kubernetes Service Connection"
                    kubernetesServiceConnection: "aks-production"
                    namespace: production
                    command: wait
                    arguments: "--for=condition=complete job/myapp-migration --timeout=300s"

  - stage: DeployServices
    displayName: "Deploy Services"
    dependsOn: Migrate
    jobs:
      - deployment: DeployAPI
        displayName: "Deploy API"
        environment: "production.myapp"
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self # deployment jobs do not check out the repo by default; needed for manifests and scripts

                - task: KubernetesManifest@1
                  displayName: "Deploy API"
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: "aks-production"
                    namespace: production
                    manifests: |
                      k8s/api-deployment.yaml
                      k8s/api-service.yaml
                    containers: |
                      $(acrLoginServer)/myapp-api:$(tag)

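                # verify-deployment.js shells out to kubectl, so these verification steps assume the
                # agent has cluster credentials (e.g. from az aks get-credentials or a kubectl login
                # step) in addition to the service connection used by the KubernetesManifest task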
                - script: node scripts/verify-deployment.js production myapp-api 180
                  displayName: "Verify API deployment"

      - deployment: DeployWorker
        displayName: "Deploy Worker"
        environment: "production.myapp"
        dependsOn: DeployAPI
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self # deployment jobs do not check out the repo by default; needed for manifests and scripts

                - task: KubernetesManifest@1
                  displayName: "Deploy Worker"
                  inputs:
                    action: deploy
                    kubernetesServiceConnection: "aks-production"
                    namespace: production
                    manifests: |
                      k8s/worker-deployment.yaml
                    containers: |
                      $(acrLoginServer)/myapp-worker:$(tag)

                - script: node scripts/verify-deployment.js production myapp-worker 120
                  displayName: "Verify Worker deployment"

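The pipeline above applies k8s/migration-job.yaml and waits on job/myapp-migration, but the manifest itself is not shown. A hypothetical sketch consistent with that wait step; the image and migration command are assumptions:

# k8s/migration-job.yaml (hypothetical example)
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-migration
spec:
  backoffLimit: 2
  activeDeadlineSeconds: 300
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myappregistry.azurecr.io/myapp-api:latest # pin to the build tag in a real pipeline
          command: ["npm", "run", "migrate"] # assumed migration entry point
          env:
            - name: DB_CONNECTION
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: db-connection

Because most Job fields are immutable, re-running the stage with the same job name will fail on apply. Delete the previous job first (kubectl delete job myapp-migration --ignore-not-found) or template a unique name per run.
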
Common Issues and Troubleshooting

ImagePullBackOff after deployment

Warning  Failed   4m   kubelet  Failed to pull image "myappregistry.azurecr.io/myapp-api:123":
  unauthorized: authentication required

The AKS cluster cannot pull from ACR. Run az aks check-acr to verify the attachment. If you recently recreated the cluster or changed managed identities, the role assignment may be stale. Re-attach with az aks update --attach-acr. For non-AKS clusters, verify your image pull secret is in the correct namespace and referenced in the deployment spec.

Deployment stuck in "Progressing" state

deployment "myapp-api" exceeded its progress deadline

The pods are failing to become ready within the progressDeadlineSeconds (default 600 seconds). Check pod logs with kubectl logs -n production -l app=myapp-api --previous to see why containers crash. Common causes: missing environment variables, failed health checks, insufficient resource limits causing OOM kills.

Helm release stuck in "pending-upgrade" state

Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

A previous Helm operation was interrupted. Fix it with helm rollback <release> 0 -n <namespace> to roll back to the previous revision (revision 0 means "the one before the current release"), then retry the pipeline. In your pipeline, add --force to the Helm upgrade for resilience, but understand that --force deletes and recreates resources rather than patching them.

Service connection authentication expires

##[error] The Kubernetes cluster service connection does not have access

Azure DevOps Kubernetes service connections using Azure AD tokens expire. The service connection auto-refreshes when using managed identity, but token-based connections require manual renewal. Prefer "Azure Subscription" service connection type over kubeconfig-based connections for AKS clusters — they handle token refresh automatically.

Pipeline cannot find kubectl or helm

##[error] Unable to locate executable file: 'kubectl'

Microsoft-hosted agents include kubectl but the version may not match your cluster. Use KubectlInstaller@0 and HelmInstaller@0 tasks to install specific versions at the start of your pipeline. Pin these versions so a cluster upgrade does not break your pipeline.
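
A minimal sketch of pinning both tools at the start of a job; the versions shown are illustrative:

steps:
  - task: KubectlInstaller@0
    displayName: "Install kubectl"
    inputs:
      kubectlVersion: "1.28.9" # pin to a version compatible with your cluster

  - task: HelmInstaller@0
    displayName: "Install Helm"
    inputs:
      helmVersion: "3.13.0"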

Best Practices

  • Tag images with build IDs, never "latest." The :latest tag is ambiguous — Kubernetes may cache it and skip pulling the new image. Use $(Build.BuildId) or a git SHA as the tag. This also makes rollbacks trivial since each build has a unique, traceable tag.

  • Run migrations before application deployment. Database schema changes must complete before new application code tries to use them. Use a Kubernetes Job for migrations and kubectl wait --for=condition=complete to block the pipeline until it finishes.

  • Set resource requests and limits on every container. Without resource limits, a single misbehaving pod can starve the entire node. Without requests, the scheduler cannot make intelligent placement decisions. Define these in your Helm values and override per environment.

  • Use readiness probes to prevent traffic to unhealthy pods. Kubernetes sends traffic to pods as soon as they start without a readiness probe. Set a readiness probe that checks your application's health endpoint. The deployment will not mark pods as available until the probe passes.

  • Store Kubernetes manifests in the same repository as application code. Keeping manifests next to the Dockerfile and source ensures changes to the application and its deployment configuration are reviewed and versioned together.

  • Implement rollback automation. Add a manually triggered pipeline that runs helm rollback or kubectl rollout undo for emergency rollbacks; a sketch follows this list. Do not rely on re-running a previous pipeline — the images may have been cleaned up.

  • Use namespaces to isolate environments. Deploy dev, staging, and production to separate namespaces (or separate clusters for production). Namespace-level RBAC prevents a dev pipeline from accidentally deploying to production.

  • Monitor deployments with Azure Monitor for containers. Enable Container Insights on your AKS cluster. It captures pod logs, metrics, and Kubernetes events that are essential for debugging failed deployments after the pipeline has finished.
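
A sketch of such a manually triggered rollback pipeline, using an Azure CLI step to fetch cluster credentials and run helm rollback. The ARM service connection name azure-prod is an assumption; the cluster and resource group names follow the earlier examples:

# azure-pipelines-rollback.yml (manually triggered emergency rollback)
trigger: none

parameters:
  - name: releaseName
    type: string
    default: "myapp-api"
  - name: revision
    type: string
    default: "0" # 0 = roll back to the previous revision

pool:
  vmImage: "ubuntu-latest"

steps:
  - task: HelmInstaller@0
    inputs:
      helmVersion: "3.13.0"

  - task: AzureCLI@2
    displayName: "Helm rollback"
    inputs:
      azureSubscription: "azure-prod" # assumed ARM service connection
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: |
        az aks get-credentials --resource-group rg-myapp --name aks-myapp --overwrite-existing
        helm rollback ${{ parameters.releaseName }} ${{ parameters.revision }} \
          --namespace production --wait --timeout 300s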
