Terraform in Azure DevOps Pipelines
A comprehensive guide to integrating Terraform with Azure DevOps Pipelines for infrastructure as code deployments, including remote state management, plan approvals, multi-environment workflows, and drift detection strategies.
Overview
Terraform and Azure DevOps Pipelines are a natural pairing for infrastructure as code. Terraform handles the declarative infrastructure definitions, and Azure Pipelines provides the execution environment with approval gates, variable groups, and audit trails that enterprises need. I have deployed this combination across dozens of projects, and the patterns described here come from real production setups where mistakes cost real money: accidentally destroying a database, deploying to the wrong subscription, or running apply without reviewing the plan first.
Prerequisites
- Azure DevOps project with Pipelines enabled
- Terraform 1.0 or later installed on pipeline agents (or use the Terraform installer task)
- Azure subscription with a Service Principal for Terraform authentication
- Azure Storage Account for Terraform remote state backend
- Azure DevOps Service Connection configured for your Azure subscription
- Basic familiarity with Terraform HCL syntax and Azure resource types
Setting Up the Foundation
Service Principal for Terraform
Terraform needs credentials to manage Azure resources. Create a dedicated Service Principal with only the permissions Terraform needs:
# Create the Service Principal
az ad sp create-for-rbac \
--name "terraform-pipeline-sp" \
--role Contributor \
--scopes /subscriptions/YOUR_SUBSCRIPTION_ID
# Output:
# {
# "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
# "displayName": "terraform-pipeline-sp",
# "password": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
# "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# }
Store these values as pipeline variables: ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_SUBSCRIPTION_ID, and ARM_TENANT_ID. Mark ARM_CLIENT_SECRET as a secret variable.
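Because every Terraform step depends on these four variables, it helps to fail fast when one is missing rather than let Terraform surface a cryptic authentication error. A minimal bash guard, sketched here under the assumption that you use the standard azurerm environment variable names (check_arm_env is my own helper name, not part of Terraform or the Azure CLI):

```shell
#!/usr/bin/env bash
# Fail fast if any ARM_* variable the azurerm provider expects is unset.
# check_arm_env is a hypothetical helper, not part of any official tooling.
check_arm_env() {
  local missing=0
  local v
  for v in ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_SUBSCRIPTION_ID ARM_TENANT_ID; do
    # ${!v} is bash indirect expansion: the value of the variable named by $v.
    if [ -z "${!v:-}" ]; then
      echo "Missing required variable: $v" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Call check_arm_env at the top of each script step; a nonzero exit fails the step immediately with a readable message.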
Remote State Backend
Never store Terraform state locally in a pipeline. Pipeline agents are ephemeral, so a local state file is lost between runs, and concurrent executions can corrupt it. Use Azure Storage:
# Create storage for Terraform state
az group create --name rg-terraform-state --location eastus
# Storage account names must be 3-24 lowercase letters and numbers
az storage account create \
--name tfstateyourorg \
--resource-group rg-terraform-state \
--sku Standard_LRS \
--encryption-services blob \
--min-tls-version TLS1_2
az storage container create \
--name tfstate \
--account-name tfstateyourorg
Enable blob versioning on the storage account. This gives you free state file history without manual backups:
az storage account blob-service-properties update \
--account-name tfstateyourorg \
--resource-group rg-terraform-state \
--enable-versioning true
Lock down the storage account to only the pipeline Service Principal and your infrastructure team. State files contain sensitive information — resource IDs, connection strings, IP addresses.
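One way to grant the pipeline Service Principal data-plane access is a scoped role assignment. A sketch; the role name is Azure's built-in "Storage Blob Data Contributor", but the appId placeholder and account names are assumptions you should substitute:

```shell
# Grant the Service Principal blob access on the state storage account
STORAGE_ID=$(az storage account show \
--name tfstateyourorg \
--resource-group rg-terraform-state \
--query id -o tsv)
az role assignment create \
--assignee "APP_ID_OF_terraform-pipeline-sp" \
--role "Storage Blob Data Contributor" \
--scope "$STORAGE_ID"
```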
Terraform Backend Configuration
# backend.tf
terraform {
backend "azurerm" {
resource_group_name = "rg-terraform-state"
storage_account_name = "tfstateyourorg"
container_name = "tfstate"
key = "myproject/dev.tfstate"
}
}
Use different state file keys per environment. The key path acts as your state file namespace.
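Sketching that convention as a tiny helper keeps key construction in one place (make_backend_key is a made-up name; the project/environment naming follows this guide):

```shell
#!/usr/bin/env bash
# Build the backend state key for a given environment.
# Convention from this guide: <project>/<environment>.tfstate
make_backend_key() {
  local project="$1" environment="$2"
  echo "${project}/${environment}.tfstate"
}

# Pass the key at init time rather than hard-coding it in backend.tf:
# terraform init -input=false -backend-config="key=$(make_backend_key myproject dev)"
```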
Pipeline Structure
Basic Plan and Apply Pipeline
The fundamental pattern is a two-stage pipeline: plan, then apply with an approval gate.
# azure-pipelines-terraform.yml
trigger:
branches:
include:
- main
paths:
include:
- infrastructure/**
pool:
vmImage: "ubuntu-latest"
variables:
- group: terraform-credentials
- name: terraformVersion
value: "1.6.5"
- name: workingDirectory
value: "$(System.DefaultWorkingDirectory)/infrastructure"
stages:
- stage: Plan
displayName: "Terraform Plan"
jobs:
- job: TerraformPlan
displayName: "Plan Infrastructure Changes"
steps:
- task: TerraformInstaller@1
displayName: "Install Terraform $(terraformVersion)"
inputs:
terraformVersion: $(terraformVersion)
- script: terraform init -input=false
displayName: "Terraform Init"
workingDirectory: $(workingDirectory)
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: terraform validate
displayName: "Terraform Validate"
workingDirectory: $(workingDirectory)
- script: |
terraform plan \
-input=false \
-out=tfplan \
-detailed-exitcode \
-var-file=environments/dev.tfvars
displayName: "Terraform Plan"
workingDirectory: $(workingDirectory)
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: terraform show -no-color tfplan > tfplan.txt
displayName: "Save Plan Output"
workingDirectory: $(workingDirectory)
- task: PublishPipelineArtifact@1
displayName: "Publish Plan Artifact"
inputs:
targetPath: $(workingDirectory)
artifact: "terraform-plan"
- stage: Apply
displayName: "Terraform Apply"
dependsOn: Plan
condition: succeeded()
jobs:
- deployment: TerraformApply
displayName: "Apply Infrastructure Changes"
environment: "production-infrastructure"
strategy:
runOnce:
deploy:
steps:
- task: TerraformInstaller@1
displayName: "Install Terraform $(terraformVersion)"
inputs:
terraformVersion: $(terraformVersion)
- script: terraform init -input=false
displayName: "Terraform Init"
workingDirectory: $(Pipeline.Workspace)/terraform-plan
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: terraform apply -input=false -auto-approve tfplan
displayName: "Terraform Apply"
workingDirectory: $(Pipeline.Workspace)/terraform-plan
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
The deployment job type with an environment enables approval gates. Configure approvals in Pipelines > Environments > production-infrastructure > Approvals and checks.
The -detailed-exitcode Flag
Use -detailed-exitcode on terraform plan. It returns exit code 0 for no changes, 1 for errors, and 2 for changes detected. This lets you conditionally skip the apply stage when there are no changes:
- script: |
terraform plan \
-input=false \
-out=tfplan \
-detailed-exitcode \
-var-file=environments/dev.tfvars
PLAN_EXIT=$?
echo "##vso[task.setvariable variable=planExitCode;isOutput=true]$PLAN_EXIT"
if [ $PLAN_EXIT -eq 1 ]; then exit 1; fi
exit 0
name: planStep
displayName: "Terraform Plan"
workingDirectory: $(workingDirectory)
Then condition the apply stage:
- stage: Apply
condition: and(succeeded(), eq(dependencies.Plan.outputs['TerraformPlan.planStep.planExitCode'], '2'))
Multi-Environment Workflows
Real projects deploy to dev, staging, and production. Use variable files and pipeline parameters:
# azure-pipelines-multi-env.yml
trigger:
branches:
include:
- main
- develop
parameters:
- name: environment
displayName: "Target Environment"
type: string
default: dev
values:
- dev
- staging
- production
variables:
- group: "terraform-${{ parameters.environment }}"
- name: terraformVersion
value: "1.6.5"
- name: backendKey
value: "myproject/${{ parameters.environment }}.tfstate"
- name: workingDirectory
value: "$(System.DefaultWorkingDirectory)/infrastructure"
stages:
- stage: Plan
displayName: "Plan (${{ parameters.environment }})"
jobs:
- job: TerraformPlan
steps:
- task: TerraformInstaller@1
inputs:
terraformVersion: $(terraformVersion)
- script: |
terraform init \
-input=false \
-backend-config="key=$(backendKey)"
displayName: "Terraform Init"
workingDirectory: $(workingDirectory)
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: |
terraform plan \
-input=false \
-out=tfplan \
-var-file=environments/${{ parameters.environment }}.tfvars
displayName: "Terraform Plan (${{ parameters.environment }})"
workingDirectory: $(workingDirectory)
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- task: PublishPipelineArtifact@1
inputs:
targetPath: $(workingDirectory)
artifact: "terraform-plan-${{ parameters.environment }}"
- stage: Apply
displayName: "Apply (${{ parameters.environment }})"
dependsOn: Plan
jobs:
- deployment: TerraformApply
environment: "${{ parameters.environment }}-infrastructure"
strategy:
runOnce:
deploy:
steps:
- task: TerraformInstaller@1
inputs:
terraformVersion: $(terraformVersion)
- script: |
terraform init \
-input=false \
-backend-config="key=$(backendKey)"
displayName: "Terraform Init"
workingDirectory: "$(Pipeline.Workspace)/terraform-plan-${{ parameters.environment }}"
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: terraform apply -input=false -auto-approve tfplan
displayName: "Terraform Apply (${{ parameters.environment }})"
workingDirectory: "$(Pipeline.Workspace)/terraform-plan-${{ parameters.environment }}"
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
Create separate variable groups (terraform-dev, terraform-staging, terraform-production) with environment-specific Service Principal credentials. Each environment's Service Principal should only have access to its own subscription or resource group.
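Creating these groups can be scripted with the azure-devops CLI extension. A sketch under the assumption that the extension is installed and you have run az devops configure; group and variable names follow this guide's conventions, and all values are placeholders:

```shell
az pipelines variable-group create \
--name terraform-dev \
--variables ARM_CLIENT_ID=PLACEHOLDER ARM_SUBSCRIPTION_ID=PLACEHOLDER ARM_TENANT_ID=PLACEHOLDER
# Add the client secret separately so it is stored as a secret variable
az pipelines variable-group variable create \
--group-id GROUP_ID \
--name ARM_CLIENT_SECRET \
--value PLACEHOLDER \
--secret true
```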
Drift Detection
Infrastructure drift — manual changes made outside Terraform — is a constant problem. Run scheduled drift detection pipelines:
# azure-pipelines-drift-detection.yml
trigger: none
schedules:
- cron: "0 6 * * *"
displayName: "Daily drift detection at 6 AM"
branches:
include:
- main
always: true
pool:
vmImage: "ubuntu-latest"
variables:
- group: terraform-credentials
jobs:
- job: DetectDrift
displayName: "Check for Infrastructure Drift"
steps:
- task: TerraformInstaller@1
inputs:
terraformVersion: "1.6.5"
- script: terraform init -input=false
displayName: "Terraform Init"
workingDirectory: infrastructure
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: |
terraform plan \
-input=false \
-detailed-exitcode \
-var-file=environments/production.tfvars \
-out=driftplan 2>&1 | tee plan-output.txt
PLAN_EXIT=${PIPESTATUS[0]}
echo "Plan exit code: $PLAN_EXIT"
if [ $PLAN_EXIT -eq 2 ]; then
echo "##vso[task.logissue type=warning]Infrastructure drift detected!"
echo "##vso[task.setvariable variable=driftDetected]true"
terraform show -no-color driftplan > drift-report.txt
elif [ $PLAN_EXIT -eq 0 ]; then
echo "No drift detected."
echo "##vso[task.setvariable variable=driftDetected]false"
else
echo "##vso[task.logissue type=error]Terraform plan failed"
exit 1
fi
displayName: "Detect Drift"
workingDirectory: infrastructure
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: |
if [ "$(driftDetected)" = "true" ]; then
echo "Sending drift notification..."
cat drift-report.txt
# Post to Azure DevOps work item or send webhook
curl -s -X POST "$(DRIFT_WEBHOOK_URL)" \
-H "Content-Type: application/json" \
-d '{
"text": "Infrastructure drift detected in production. Review the pipeline output for details.",
"pipeline": "$(Build.BuildNumber)",
"url": "$(System.TeamFoundationCollectionUri)$(System.TeamProject)/_build/results?buildId=$(Build.BuildId)"
}'
fi
displayName: "Notify on Drift"
workingDirectory: infrastructure
condition: eq(variables.driftDetected, 'true')
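The exit-code branching in the Detect Drift step can be factored into a small helper if you reuse it across pipelines. A sketch (classify_plan_exit is my own name; the code-to-meaning mapping is Terraform's documented -detailed-exitcode contract):

```shell
#!/usr/bin/env bash
# Map terraform plan -detailed-exitcode results to a label:
#   0 = no changes, 2 = changes detected (drift), anything else = error.
classify_plan_exit() {
  case "$1" in
    0) echo "clean" ;;
    2) echo "drift" ;;
    *) echo "error"; return 1 ;;
  esac
}
```

Usage inside the pipeline script would look like RESULT=$(classify_plan_exit "$PLAN_EXIT") followed by branching on the label.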
Complete Working Example: Full Terraform Pipeline with Validation
This example shows a production-grade pipeline with linting, security scanning, cost estimation, plan review, and gated apply.
// scripts/parse-terraform-plan.js
// Parses terraform plan output and creates a summary for PR comments
var fs = require("fs");
var path = require("path");
var planFile = process.argv[2] || "tfplan.txt";
function parsePlanOutput(filePath) {
var content = fs.readFileSync(filePath, "utf8");
var lines = content.split("\n");
var summary = {
additions: 0,
changes: 0,
destructions: 0,
resources: [],
hasDestructiveChanges: false
};
var resourcePattern = /^\s*# (.+?) (will be|must be) (created|updated|destroyed|replaced)/;
var summaryPattern = /Plan: (\d+) to add, (\d+) to change, (\d+) to destroy/;
lines.forEach(function (line) {
var resourceMatch = line.match(resourcePattern);
if (resourceMatch) {
var resource = {
name: resourceMatch[1],
action: resourceMatch[3]
};
summary.resources.push(resource);
if (resource.action === "destroyed" || resource.action === "replaced") {
summary.hasDestructiveChanges = true;
}
}
var summaryMatch = line.match(summaryPattern);
if (summaryMatch) {
summary.additions = parseInt(summaryMatch[1], 10);
summary.changes = parseInt(summaryMatch[2], 10);
summary.destructions = parseInt(summaryMatch[3], 10);
}
});
return summary;
}
function formatMarkdown(summary) {
var md = "## Terraform Plan Summary\n\n";
if (summary.hasDestructiveChanges) {
md += "> **WARNING:** This plan includes destructive changes!\n\n";
}
md += "| Action | Count |\n";
md += "|--------|-------|\n";
md += "| Add | " + summary.additions + " |\n";
md += "| Change | " + summary.changes + " |\n";
md += "| Destroy | " + summary.destructions + " |\n\n";
if (summary.resources.length > 0) {
md += "### Resources Affected\n\n";
summary.resources.forEach(function (resource) {
var icon;
switch (resource.action) {
case "created": icon = "+"; break;
case "updated": icon = "~"; break;
case "destroyed": icon = "-"; break;
case "replaced": icon = "!"; break;
default: icon = "?";
}
md += "- `" + icon + "` " + resource.name + " (" + resource.action + ")\n";
});
}
return md;
}
try {
var summary = parsePlanOutput(planFile);
var markdown = formatMarkdown(summary);
console.log(markdown);
// Write summary for pipeline consumption
var outputFile = path.join(path.dirname(planFile), "plan-summary.md");
fs.writeFileSync(outputFile, markdown);
console.log("\nSummary written to " + outputFile);
// Set pipeline variables
if (summary.hasDestructiveChanges) {
console.log("##vso[task.setvariable variable=hasDestructiveChanges;isOutput=true]true");
console.log("##vso[task.logissue type=warning]Plan contains destructive changes — extra approval required");
}
process.exit(0);
} catch (err) {
console.error("Failed to parse plan: " + err.message);
process.exit(1);
}
The full pipeline YAML that uses this script:
# azure-pipelines-terraform-full.yml
trigger:
branches:
include:
- main
paths:
include:
- infrastructure/**
pr:
branches:
include:
- main
paths:
include:
- infrastructure/**
pool:
vmImage: "ubuntu-latest"
variables:
- group: terraform-production
- name: terraformVersion
value: "1.6.5"
- name: workDir
value: "$(System.DefaultWorkingDirectory)/infrastructure"
- name: isPR
value: ${{ eq(variables['Build.Reason'], 'PullRequest') }}
stages:
- stage: Validate
displayName: "Validate & Lint"
jobs:
- job: Validate
steps:
- task: TerraformInstaller@1
inputs:
terraformVersion: $(terraformVersion)
- script: terraform fmt -check -recursive -diff
displayName: "Check Formatting"
workingDirectory: $(workDir)
- script: terraform init -input=false -backend=false
displayName: "Init (no backend)"
workingDirectory: $(workDir)
- script: terraform validate
displayName: "Validate Configuration"
workingDirectory: $(workDir)
- script: |
# Install tflint
curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash
tflint --init
tflint --format compact
displayName: "TFLint"
workingDirectory: $(workDir)
- stage: SecurityScan
displayName: "Security Scan"
dependsOn: Validate
jobs:
- job: Checkov
steps:
- script: |
pip install checkov
checkov -d $(workDir) \
--output cli \
--output junitxml \
--output-file-path . \
--soft-fail
displayName: "Checkov Security Scan"
- task: PublishTestResults@2
inputs:
testResultsFormat: "JUnit"
testResultsFiles: "**/results_junitxml.xml"
testRunTitle: "Checkov Security Scan"
condition: always()
- stage: Plan
displayName: "Terraform Plan"
dependsOn: SecurityScan
jobs:
- job: TerraformPlan
steps:
- task: TerraformInstaller@1
inputs:
terraformVersion: $(terraformVersion)
- script: terraform init -input=false
displayName: "Terraform Init"
workingDirectory: $(workDir)
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: |
terraform plan \
-input=false \
-out=tfplan \
-detailed-exitcode \
-var-file=environments/production.tfvars 2>&1 | tee plan-output.txt
PLAN_EXIT=${PIPESTATUS[0]}
echo "##vso[task.setvariable variable=planExitCode;isOutput=true]$PLAN_EXIT"
if [ $PLAN_EXIT -eq 1 ]; then exit 1; fi
exit 0
name: plan
displayName: "Terraform Plan"
workingDirectory: $(workDir)
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: terraform show -no-color tfplan > tfplan.txt
displayName: "Export Plan Text"
workingDirectory: $(workDir)
- script: node $(System.DefaultWorkingDirectory)/scripts/parse-terraform-plan.js $(workDir)/tfplan.txt
name: parsePlan
displayName: "Parse Plan Summary"
- task: PublishPipelineArtifact@1
inputs:
targetPath: $(workDir)
artifact: "terraform-plan"
- stage: Apply
displayName: "Terraform Apply"
dependsOn: Plan
condition: |
and(
succeeded(),
eq(variables.isPR, false),
eq(dependencies.Plan.outputs['TerraformPlan.plan.planExitCode'], '2')
)
jobs:
- deployment: TerraformApply
environment: "production-infrastructure"
strategy:
runOnce:
deploy:
steps:
- task: TerraformInstaller@1
inputs:
terraformVersion: $(terraformVersion)
- script: terraform init -input=false
displayName: "Terraform Init"
workingDirectory: "$(Pipeline.Workspace)/terraform-plan"
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: terraform apply -input=false -auto-approve tfplan
displayName: "Terraform Apply"
workingDirectory: "$(Pipeline.Workspace)/terraform-plan"
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- script: |
terraform output -json > outputs.json
cat outputs.json
displayName: "Capture Outputs"
workingDirectory: "$(Pipeline.Workspace)/terraform-plan"
env:
ARM_CLIENT_ID: $(ARM_CLIENT_ID)
ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
ARM_TENANT_ID: $(ARM_TENANT_ID)
- task: PublishPipelineArtifact@1
inputs:
targetPath: "$(Pipeline.Workspace)/terraform-plan/outputs.json"
artifact: "terraform-outputs"
Terraform Module Structure
Organize your infrastructure code for the pipeline:
infrastructure/
main.tf
variables.tf
outputs.tf
backend.tf
providers.tf
environments/
dev.tfvars
staging.tfvars
production.tfvars
modules/
networking/
main.tf
variables.tf
outputs.tf
compute/
main.tf
variables.tf
outputs.tf
database/
main.tf
variables.tf
outputs.tf
Example main.tf:
# main.tf
module "networking" {
source = "./modules/networking"
environment = var.environment
vnet_address_space = var.vnet_address_space
subnet_prefixes = var.subnet_prefixes
location = var.location
resource_group_name = azurerm_resource_group.main.name
}
module "compute" {
source = "./modules/compute"
environment = var.environment
subnet_id = module.networking.app_subnet_id
vm_size = var.vm_size
instance_count = var.instance_count
resource_group_name = azurerm_resource_group.main.name
location = var.location
}
module "database" {
source = "./modules/database"
environment = var.environment
subnet_id = module.networking.db_subnet_id
sku_name = var.db_sku_name
resource_group_name = azurerm_resource_group.main.name
location = var.location
}
resource "azurerm_resource_group" "main" {
name = "rg-${var.project_name}-${var.environment}"
location = var.location
tags = {
Environment = var.environment
ManagedBy = "terraform"
Project = var.project_name
}
}
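The variables referenced in main.tf need declarations in variables.tf. A partial sketch; the types are implied by usage, but the defaults shown are illustrative assumptions:

```hcl
# variables.tf (excerpt)
variable "project_name" {
  type = string
}

variable "environment" {
  type = string
}

variable "location" {
  type    = string
  default = "eastus"
}

variable "vm_size" {
  type    = string
  default = "Standard_B2s"
}

variable "instance_count" {
  type    = number
  default = 2
}
```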
State Locking and Workspaces
Azure Storage backend supports state locking via blob leases. This prevents two pipeline runs from corrupting state by writing simultaneously. If a pipeline fails mid-apply and leaves a stale lock, you can break it:
# Check for stale lock
az storage blob lease show \
--account-name tfstateyourorg \
--container-name tfstate \
--blob-name myproject/production.tfstate
# Break stale lock (use with caution)
az storage blob lease break \
--account-name tfstateyourorg \
--container-name tfstate \
--blob-name myproject/production.tfstate
For Terraform workspaces in pipelines, pass the workspace as a parameter:
- script: |
terraform workspace select ${{ parameters.environment }} || \
terraform workspace new ${{ parameters.environment }}
displayName: "Select Workspace"
I generally recommend separate state files over workspaces for production use. Workspaces share the same backend configuration, which means a single misconfigured terraform destroy can target the wrong workspace. Separate state files with separate Service Principals provide stronger isolation.
Common Issues and Troubleshooting
State lock timeout during pipeline runs
Error: Error acquiring the state lock
Lock Info:
ID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Path: myproject/production.tfstate
Operation: OperationTypeApply
Created: 2024-01-15 14:30:00.000000 +0000 UTC
This happens when a previous pipeline run failed without releasing the lock. Check whether another run is actually in progress before breaking the lock, and use az storage blob lease break only after confirming no concurrent operations. Set timeoutInMinutes on the apply job so stuck operations are cancelled rather than holding the lock indefinitely, and consider -lock-timeout=5m on plan and apply so brief contention is waited out instead of failing immediately.
Provider authentication fails in apply stage but works in plan stage
Error: building AzureRM Client: Authenticating using a Service Principal with a Client Secret
When using pipeline artifacts, the environment variables must be set again in the apply stage. A common mistake is only setting ARM_* variables in the plan stage and forgetting them in the apply stage. Every step that runs a Terraform command needs these variables.
Terraform init fails with "backend configuration changed"
Error: Backend configuration changed
This happens when you modify backend.tf between runs. Terraform detects the mismatch and refuses to initialize. Add -reconfigure to terraform init to force the new configuration, or -migrate-state if you need to preserve existing state. In a pipeline, -reconfigure is usually the right choice since the plan file was created with the current backend.
Plan artifact is too large for pipeline artifacts
##[error]Upload '/infrastructure' failed. Total file count: 12847, Total file size: 856 MB.
The .terraform directory with provider binaries is huge. Exclude it from the artifact or restructure to re-run terraform init in the apply stage instead of carrying the full directory:
- task: PublishPipelineArtifact@1
inputs:
targetPath: $(workDir)/tfplan
artifact: "terraform-plan-file"
Then in the apply stage, check out the code again and run terraform init before applying.
Resource already exists error after failed apply
Error: A resource with the ID "/subscriptions/.../resourceGroups/rg-myapp" already exists
If a previous apply partially succeeded, some resources exist but are not in state. Import them: terraform import azurerm_resource_group.main /subscriptions/.../resourceGroups/rg-myapp. In a pipeline, add a recovery step that attempts import for known resources before retrying the apply.
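Such a recovery step is safer when the import is idempotent, checking state before importing. A sketch (import_if_missing is my own helper name; it assumes an initialized Terraform working directory):

```shell
#!/usr/bin/env bash
# Import a resource into state only if its address is not already there.
import_if_missing() {
  local address="$1" azure_id="$2"
  if terraform state show "$address" >/dev/null 2>&1; then
    echo "$address already in state, skipping import"
  else
    terraform import "$address" "$azure_id"
  fi
}

# Example (substitute your own subscription ID and resource names):
# import_if_missing azurerm_resource_group.main \
#   "/subscriptions/SUB_ID/resourceGroups/rg-myapp"
```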
Best Practices
- Never run terraform apply without a saved plan file. Always use -out=tfplan during plan and then apply tfplan. This guarantees the apply executes exactly what was reviewed, not whatever the current state happens to be.
- Pin your Terraform and provider versions. Use the required_version constraint in the terraform block and version constraints on providers. Different versions can produce different plans, and you do not want the pipeline's Terraform version to drift from what developers use locally.
- Use separate Service Principals per environment. The dev SP should not have access to production resources. If a pipeline bug points the wrong variable file at the wrong environment, the SP permissions limit the blast radius.
- Store sensitive outputs in Azure Key Vault, not pipeline variables. After terraform apply, write connection strings and credentials to Key Vault rather than passing them through pipeline variables. This keeps secrets out of build logs and makes them accessible to applications at runtime.
- Tag every resource with ManagedBy = terraform. This makes it obvious which resources are under Terraform management and which are not. It also helps audit for drift: if someone modifies a tagged resource manually, you know it needs reconciliation.
- Run terraform fmt -check in CI. Consistent formatting prevents noisy diffs and merge conflicts. Fail the pipeline if formatting is wrong rather than auto-fixing, so developers learn to format before pushing.
- Implement cost estimation before apply. Tools like Infracost integrate with Terraform plan output and post estimated monthly costs as PR comments. This catches accidentally expensive changes, like someone changing a VM SKU from B2s to E64s, before they reach production.
- Keep state file access audited. Enable Azure Storage diagnostic logging for the state file container. Every read and write is logged, giving you a full audit trail of who accessed or modified state and when.
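The version-pinning practice can be expressed in providers.tf. A sketch with illustrative version numbers; pin to whatever your team has actually tested:

```hcl
# providers.tf
terraform {
  required_version = ">= 1.6.0, < 2.0.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.80"
    }
  }
}

provider "azurerm" {
  features {}
}
```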