Connecting Azure DevOps to Azure Monitor
Integrate Azure DevOps with Azure Monitor for deployment annotations, release gates, and automated health verification
Azure Monitor and Azure DevOps are two pillars of a healthy deployment pipeline. When you connect them properly, you get deployment annotations that correlate releases with telemetry, release gates that prevent bad code from reaching production, and automated work item creation when alerts fire. This article walks through the full integration, from Application Insights instrumentation to building a deployment health checker in Node.js.
Prerequisites
Before diving in, make sure you have the following in place:
- An Azure subscription with an Application Insights resource provisioned
- An Azure DevOps organization with at least one project and pipeline
- Node.js 18+ installed locally
- The Azure CLI installed and authenticated (az login)
- A service connection in Azure DevOps linked to your Azure subscription
- Basic familiarity with YAML pipelines and Azure Monitor concepts
Azure Monitor Overview
Azure Monitor is the unified observability platform in Azure. It collects metrics, logs, and traces from virtually every Azure resource and routes them into Log Analytics workspaces, Application Insights instances, and alert rules. For DevOps integration, the pieces that matter most are:
- Application Insights — captures request telemetry, dependency tracking, exceptions, and custom events from your running application
- Log Analytics — stores structured log data that you query with Kusto Query Language (KQL)
- Alerts — fires notifications or triggers actions when metrics cross thresholds
- Workbooks — interactive dashboards that combine metrics, logs, and Azure Resource Graph data
The integration with Azure DevOps happens at multiple points: pipelines push deployment annotations into Application Insights, release gates query Azure Monitor for health signals, and alert actions create work items in Azure Boards.
Application Insights for Node.js
The first step is instrumenting your Node.js application. The applicationinsights SDK auto-collects HTTP requests, dependencies, exceptions, and performance counters with minimal configuration.
// telemetry.js
var appInsights = require("applicationinsights");
function initializeTelemetry() {
appInsights.setup(process.env.APPINSIGHTS_INSTRUMENTATIONKEY)
.setAutoCollectRequests(true)
.setAutoCollectPerformance(true)
.setAutoCollectExceptions(true)
.setAutoCollectDependencies(true)
.setAutoCollectConsole(true, true)
.setUseDiskRetryCaching(true)
.setSendLiveMetrics(true)
.start();
var client = appInsights.defaultClient;
// Tag every telemetry item with the deployment version
client.addTelemetryProcessor(function (envelope) {
envelope.tags["ai.application.ver"] = process.env.APP_VERSION || "unknown";
return true;
});
return client;
}
module.exports = {
initializeTelemetry: initializeTelemetry,
getClient: function () {
return appInsights.defaultClient;
}
};
// app.js
var express = require("express");
var telemetry = require("./telemetry");
var client = telemetry.initializeTelemetry();
var app = express();
app.get("/health", function (req, res) {
client.trackEvent({
name: "HealthCheckRequested",
properties: { source: "pipeline" }
});
res.json({ status: "healthy", version: process.env.APP_VERSION });
});
app.get("/", function (req, res) {
res.send("Running version " + (process.env.APP_VERSION || "local"));
});
var port = process.env.PORT || 3000;
app.listen(port, function () {
console.log("Server listening on port " + port);
});
The setSendLiveMetrics(true) call enables the Live Metrics stream, which is invaluable during deployments. You can watch request rates, failure rates, and dependency latency in real time as traffic shifts to the new version.
Deployment Annotations from Pipelines
Deployment annotations are markers on Application Insights charts that show exactly when a deployment occurred. They let you visually correlate changes in error rates or latency with specific releases. Azure DevOps pipelines can create these annotations via the Azure Monitor REST API.
Here is a pipeline task that creates an annotation after a successful deployment:
# azure-pipelines.yml (annotation step)
- task: AzureCLI@2
displayName: 'Create Deployment Annotation'
inputs:
azureSubscription: 'MyAzureServiceConnection'
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
ANNOTATION_ID=$(uuidgen)
DEPLOY_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S.0000000Z")
ANNOTATION_BODY=$(cat <<EOF
{
"Id": "$ANNOTATION_ID",
"AnnotationName": "Release $(Build.BuildNumber)",
"EventTime": "$DEPLOY_TIME",
"Category": "Deployment",
"Properties": "{\"ReleaseName\":\"$(Build.BuildNumber)\",\"Environment\":\"$(Environment)\",\"BuildId\":\"$(Build.BuildId)\",\"CommitId\":\"$(Build.SourceVersion)\"}"
}
EOF
)
az rest --method put \
--uri "https://management.azure.com/subscriptions/$(SubscriptionId)/resourceGroups/$(ResourceGroup)/providers/microsoft.insights/components/$(AppInsightsName)/Annotations?api-version=2015-05-01" \
--body "$ANNOTATION_BODY"
You can also create annotations programmatically from Node.js, which is useful for custom deployment scripts:
// create-annotation.js
var https = require("https");
var exec = require("child_process").execSync;
function createDeploymentAnnotation(options) {
// Get an access token via Azure CLI
var tokenResult = exec("az account get-access-token --query accessToken -o tsv");
var accessToken = tokenResult.toString().trim();
var annotationId = require("crypto").randomUUID();
var deployTime = new Date().toISOString();
var properties = JSON.stringify({
ReleaseName: options.releaseName,
Environment: options.environment,
CommitId: options.commitId
});
var body = JSON.stringify({
Id: annotationId,
AnnotationName: options.releaseName,
EventTime: deployTime,
Category: "Deployment",
Properties: properties
});
var resourcePath = "/subscriptions/" + options.subscriptionId +
"/resourceGroups/" + options.resourceGroup +
"/providers/microsoft.insights/components/" + options.appInsightsName +
"/Annotations?api-version=2015-05-01";
var requestOptions = {
hostname: "management.azure.com",
path: resourcePath,
method: "PUT",
headers: {
"Authorization": "Bearer " + accessToken,
"Content-Type": "application/json",
"Content-Length": Buffer.byteLength(body)
}
};
return new Promise(function (resolve, reject) {
var req = https.request(requestOptions, function (res) {
var data = "";
res.on("data", function (chunk) { data += chunk; });
res.on("end", function () {
if (res.statusCode >= 200 && res.statusCode < 300) {
console.log("Annotation created: " + annotationId);
resolve(JSON.parse(data));
} else {
reject(new Error("Annotation failed: " + res.statusCode + " " + data));
}
});
});
req.on("error", reject);
req.write(body);
req.end();
});
}
module.exports = { createDeploymentAnnotation: createDeploymentAnnotation };
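For example, a custom deployment script can call the module with values supplied by the pipeline. The sketch below assumes the Azure resource identifiers are passed in as environment variables; only the BUILD_* names are built-in pipeline variables, the rest are placeholders:
// deploy-annotate.js (usage sketch; AZURE_SUBSCRIPTION_ID, RESOURCE_GROUP, APP_INSIGHTS_NAME and ENVIRONMENT are assumed variables)
var annotations = require("./create-annotation");
annotations.createDeploymentAnnotation({
  subscriptionId: process.env.AZURE_SUBSCRIPTION_ID,
  resourceGroup: process.env.RESOURCE_GROUP,
  appInsightsName: process.env.APP_INSIGHTS_NAME,
  releaseName: "Release " + (process.env.BUILD_BUILDNUMBER || "local"),
  environment: process.env.ENVIRONMENT || "staging",
  commitId: process.env.BUILD_SOURCEVERSION
}).catch(function (err) {
  console.error(err.message);
  process.exit(1); // fail the pipeline step if the annotation cannot be created
});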
Release Gates with Azure Monitor
Release gates are automated checks that run before or after a deployment stage. Azure Monitor gates query metrics or log data and only allow the pipeline to proceed if the results meet your criteria. This is one of the most powerful integrations between DevOps and monitoring.
In classic release pipelines, you configure gates through the UI. In YAML pipelines, the closest equivalent is a validation stage between deployments: either an agentless (server) job that uses the InvokeRestAPI task, or an agent job that queries Azure Monitor with the Azure CLI and fails the stage when the results do not meet your criteria.
Here is a gate that checks the error rate in Application Insights before promoting to production:
# Gate: Check error rate before production promotion
- stage: ValidateStaging
dependsOn: DeployStaging
jobs:
- job: CheckHealth
pool:
vmImage: 'ubuntu-latest'
steps:
- task: AzureCLI@2
displayName: 'Query Error Rate'
inputs:
azureSubscription: 'MyAzureServiceConnection'
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
ERROR_RATE=$(az monitor app-insights query \
--app $(AppInsightsAppId) \
--analytics-query "requests | where timestamp > ago(10m) | summarize errorRate = todouble(countif(success == false)) / count() * 100 | project errorRate" \
--query "tables[0].rows[0][0]" -o tsv)
echo "Current error rate: ${ERROR_RATE}%"
THRESHOLD=5
if (( $(echo "$ERROR_RATE > $THRESHOLD" | bc -l) )); then
echo "##vso[task.logissue type=error]Error rate ${ERROR_RATE}% exceeds threshold ${THRESHOLD}%"
echo "##vso[task.complete result=Failed;]"
else
echo "Error rate within acceptable range"
fi
Monitoring Deployment Health
Beyond simple error rate checks, you should monitor multiple health signals after each deployment. Response time degradation, increased dependency failures, and memory leaks are all indicators that a deployment has introduced problems.
// deployment-health.js
var https = require("https");
var exec = require("child_process").execSync;
function queryAppInsights(appId, query) {
var tokenResult = exec("az account get-access-token --resource https://api.applicationinsights.io --query accessToken -o tsv");
var accessToken = tokenResult.toString().trim();
var encodedQuery = encodeURIComponent(query);
var path = "/v1/apps/" + appId + "/query?query=" + encodedQuery;
return new Promise(function (resolve, reject) {
var options = {
hostname: "api.applicationinsights.io",
path: path,
method: "GET",
headers: { "Authorization": "Bearer " + accessToken }
};
var req = https.request(options, function (res) {
var data = "";
res.on("data", function (chunk) { data += chunk; });
res.on("end", function () {
if (res.statusCode === 200) {
resolve(JSON.parse(data));
} else {
reject(new Error("Query failed: " + res.statusCode));
}
});
});
req.on("error", reject);
req.end();
});
}
function checkDeploymentHealth(appId) {
var checks = [
{
name: "Error Rate",
query: "requests | where timestamp > ago(10m) | summarize errorRate = todouble(countif(success == false)) / count() * 100",
threshold: 5,
compare: "lessThan"
},
{
name: "Avg Response Time",
query: "requests | where timestamp > ago(10m) | summarize avg(duration)",
threshold: 2000,
compare: "lessThan"
},
{
name: "Dependency Failure Rate",
query: "dependencies | where timestamp > ago(10m) | summarize failRate = todouble(countif(success == false)) / count() * 100",
threshold: 10,
compare: "lessThan"
},
{
name: "Exception Count",
query: "exceptions | where timestamp > ago(10m) | count",
threshold: 50,
compare: "lessThan"
}
];
var results = [];
return checks.reduce(function (chain, check) {
return chain.then(function () {
return queryAppInsights(appId, check.query).then(function (result) {
var value = result.tables[0].rows[0][0];
var passed = check.compare === "lessThan" ? value < check.threshold : value > check.threshold;
results.push({
name: check.name,
value: value,
threshold: check.threshold,
passed: passed
});
console.log("[" + (passed ? "PASS" : "FAIL") + "] " + check.name + ": " + value + " (threshold: " + check.threshold + ")");
});
});
}, Promise.resolve()).then(function () {
var allPassed = results.every(function (r) { return r.passed; });
return { healthy: allPassed, checks: results };
});
}
module.exports = { checkDeploymentHealth: checkDeploymentHealth, queryAppInsights: queryAppInsights };
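A small wrapper script lets a pipeline step fail the stage when any check fails. This is a minimal sketch, assuming the Application Insights application ID is exposed to the step as the APP_INSIGHTS_APP_ID environment variable:
// run-health-check.js (usage sketch)
var health = require("./deployment-health");
health.checkDeploymentHealth(process.env.APP_INSIGHTS_APP_ID)
  .then(function (report) {
    if (!report.healthy) {
      console.error("One or more deployment health checks failed");
      process.exit(1); // non-zero exit fails the pipeline job
    }
    console.log("All deployment health checks passed");
  })
  .catch(function (err) {
    console.error("Health check error: " + err.message);
    process.exit(1);
  });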
Creating Work Items from Alerts
When Azure Monitor fires an alert, you can automatically create a work item in Azure Boards. This closes the feedback loop between operations and development. Configure this through Azure Monitor action groups.
First, create a service hook or use the Azure DevOps REST API from an Azure Function triggered by an alert:
// alert-to-workitem.js
var https = require("https");
function createWorkItemFromAlert(alertPayload, devopsConfig) {
var alertName = alertPayload.data.essentials.alertRule;
var severity = alertPayload.data.essentials.severity;
var description = alertPayload.data.essentials.description || "No description provided";
var firedTime = alertPayload.data.essentials.firedDateTime;
var severityMap = {
"Sev0": "1 - Critical",
"Sev1": "2 - High",
"Sev2": "3 - Medium",
"Sev3": "4 - Low"
};
var patchDocument = JSON.stringify([
{ op: "add", path: "/fields/System.Title", value: "[Alert] " + alertName },
{ op: "add", path: "/fields/System.Description", value: description + "\n\nFired at: " + firedTime },
{ op: "add", path: "/fields/Microsoft.VSTS.Common.Priority", value: severityMap[severity] || "3 - Medium" },
{ op: "add", path: "/fields/System.Tags", value: "auto-created;azure-monitor;alert" },
{ op: "add", path: "/fields/Microsoft.VSTS.Common.Severity", value: severity }
]);
var path = "/" + devopsConfig.project + "/_apis/wit/workitems/$Bug?api-version=7.1";
var options = {
hostname: "dev.azure.com",
path: "/" + devopsConfig.organization + path,
method: "POST",
headers: {
"Authorization": "Basic " + Buffer.from(":" + devopsConfig.pat).toString("base64"),
"Content-Type": "application/json-patch+json",
"Content-Length": Buffer.byteLength(patchDocument)
}
};
return new Promise(function (resolve, reject) {
var req = https.request(options, function (res) {
var data = "";
res.on("data", function (chunk) { data += chunk; });
res.on("end", function () {
if (res.statusCode === 200) {
var workItem = JSON.parse(data);
console.log("Created work item #" + workItem.id + ": " + alertName);
resolve(workItem);
} else {
reject(new Error("Failed to create work item: " + res.statusCode + " " + data));
}
});
});
req.on("error", reject);
req.write(patchDocument);
req.end();
});
}
module.exports = { createWorkItemFromAlert: createWorkItemFromAlert };
To wire this up, create an Azure Function with an HTTP trigger and add it as an action in your Azure Monitor action group. When an alert fires, the function receives the alert payload and creates a bug in Azure Boards.
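A minimal sketch of that function, using the classic (function.json-based) Node.js programming model. The DEVOPS_ORG, DEVOPS_PROJECT, and DEVOPS_PAT app settings are assumptions, and the action group must be configured to send the common alert schema, since the parser above reads data.essentials:
// AlertToWorkItem/index.js (HTTP-triggered Azure Function, sketch)
var handler = require("../alert-to-workitem");
module.exports = function (context, req) {
  var devopsConfig = {
    organization: process.env.DEVOPS_ORG,   // assumed app settings
    project: process.env.DEVOPS_PROJECT,
    pat: process.env.DEVOPS_PAT
  };
  // The common alert schema payload arrives as the request body
  return handler.createWorkItemFromAlert(req.body, devopsConfig)
    .then(function (workItem) {
      context.res = { status: 200, body: { workItemId: workItem.id } };
    })
    .catch(function (err) {
      context.log.error(err.message);
      context.res = { status: 500, body: err.message };
    });
};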
Log Analytics Queries for Pipeline Data
Azure DevOps can send audit logs and pipeline telemetry to a Log Analytics workspace. Once connected, you can write KQL queries that join pipeline data with application telemetry to answer questions like "which deployment caused this spike in errors?"
// KQL: Correlate deployments with error spikes
let deployments = customEvents
| where name == "DeploymentCompleted"
| project deployTime = timestamp, releaseName = tostring(customDimensions.ReleaseName);
let errorSpikes = requests
| where success == false
| summarize errorCount = count() by bin(timestamp, 5m)
| where errorCount > 50;
deployments
| extend joinTime = bin(deployTime, 5m)
| join kind=inner (errorSpikes) on $left.joinTime == $right.timestamp
| project deployTime, releaseName, errorCount
| order by deployTime desc
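The deployments branch of this query assumes your pipeline emits a DeploymentCompleted custom event with a ReleaseName dimension; the SDK does not send that automatically. A minimal sketch of emitting it from a post-deployment script (the connection string setting is the conventional name, the ENVIRONMENT and DEPLOY_DURATION_SECONDS variables are assumptions):
// track-deployment-event.js (sketch): emit the DeploymentCompleted event the query above relies on
var appInsights = require("applicationinsights");
appInsights.setup(process.env.APPLICATIONINSIGHTS_CONNECTION_STRING).start();
var client = appInsights.defaultClient;
client.trackEvent({
  name: "DeploymentCompleted",
  properties: {
    ReleaseName: process.env.BUILD_BUILDNUMBER,
    Environment: process.env.ENVIRONMENT || "staging"    // assumed pipeline variable
  },
  measurements: {
    deploymentDurationSeconds: Number(process.env.DEPLOY_DURATION_SECONDS || 0)  // assumed variable
  }
});
// Flush before the short-lived process exits so the event is not lost
client.flush();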
Another useful query finds deployments that caused response time regressions:
// KQL: Detect response time regression after deployment
let baseline = requests
| where timestamp between(ago(2h) .. ago(1h))
| summarize baselineP95 = percentile(duration, 95);
let current = requests
| where timestamp > ago(1h)
| summarize currentP95 = percentile(duration, 95);
baseline | extend currentP95 = toscalar(current | project currentP95)
| extend regressionPercent = round((currentP95 - baselineP95) / baselineP95 * 100, 2)
| where regressionPercent > 20
| project baselineP95, currentP95, regressionPercent
Custom Dashboards Combining DevOps and Monitoring Data
Azure Dashboards and Workbooks can pull data from both Azure Monitor and Azure DevOps to give you a unified view of deployment and operational health. Workbooks are particularly powerful because they support parameterized queries and dynamic layouts.
A useful dashboard layout includes:
- Deployment timeline — annotations from Application Insights showing when each release went out
- Error rate over time — line chart with deployment markers overlaid
- Response time percentiles — P50, P95, P99 with deployment correlation
- Active alerts — current firing alerts tied to the deployed application
- Pipeline success rate — percentage of pipelines that completed without failures
You can create these programmatically using ARM templates or the Azure CLI, but the Workbooks visual editor in the Azure portal is the fastest way to iterate on layout and queries.
Availability Tests Triggered by Deployments
Availability tests in Application Insights (standard tests, or the older URL ping and multi-step web tests) verify that your application is reachable after deployment. You can also run a custom availability check from your pipeline to get immediate feedback:
// availability-test.js
var http = require("http");
var https = require("https");
var url = require("url");
function runAvailabilityTest(endpoints, options) {
var timeout = (options && options.timeout) || 10000;
var retries = (options && options.retries) || 3;
function testEndpoint(endpoint, attempt) {
return new Promise(function (resolve, reject) {
var parsed = url.parse(endpoint.url);
var protocol = parsed.protocol === "https:" ? https : http;
var startTime = Date.now();
var req = protocol.get(endpoint.url, { timeout: timeout }, function (res) {
var duration = Date.now() - startTime;
var data = "";
res.on("data", function (chunk) { data += chunk; });
res.on("end", function () {
var passed = res.statusCode === (endpoint.expectedStatus || 200);
if (endpoint.expectedBody) {
passed = passed && data.indexOf(endpoint.expectedBody) !== -1;
}
resolve({
name: endpoint.name,
url: endpoint.url,
statusCode: res.statusCode,
duration: duration,
passed: passed,
attempt: attempt
});
});
});
req.on("error", function (err) {
if (attempt < retries) {
console.log("Retrying " + endpoint.name + " (attempt " + (attempt + 1) + ")");
setTimeout(function () {
resolve(testEndpoint(endpoint, attempt + 1));
}, 2000);
} else {
resolve({
name: endpoint.name,
url: endpoint.url,
error: err.message,
passed: false,
attempt: attempt
});
}
});
req.on("timeout", function () {
req.destroy();
});
});
}
return Promise.all(endpoints.map(function (ep) {
return testEndpoint(ep, 1);
}));
}
// Usage
var endpoints = [
{ name: "Homepage", url: "https://myapp.azurewebsites.net/", expectedStatus: 200 },
{ name: "Health Check", url: "https://myapp.azurewebsites.net/health", expectedStatus: 200, expectedBody: "healthy" },
{ name: "API Status", url: "https://myapp.azurewebsites.net/api/status", expectedStatus: 200 }
];
runAvailabilityTest(endpoints).then(function (results) {
results.forEach(function (r) {
console.log("[" + (r.passed ? "PASS" : "FAIL") + "] " + r.name + " - " + r.duration + "ms");
});
var allPassed = results.every(function (r) { return r.passed; });
process.exit(allPassed ? 0 : 1);
});
Performance Regression Detection
Comparing pre-deployment and post-deployment performance baselines is critical. The following script queries Application Insights for P95 response times and compares them against a baseline window:
// regression-detector.js
var healthModule = require("./deployment-health");
function detectRegression(appId, options) {
var baselineMinutes = (options && options.baselineMinutes) || 60;
var currentMinutes = (options && options.currentMinutes) || 15;
var regressionThreshold = (options && options.regressionThreshold) || 20;
var baselineQuery = "requests | where timestamp between(ago(" + (baselineMinutes + currentMinutes) + "m) .. ago(" + currentMinutes + "m)) | summarize p95 = percentile(duration, 95), p50 = percentile(duration, 50), avgDuration = avg(duration), requestCount = count()";
var currentQuery = "requests | where timestamp > ago(" + currentMinutes + "m) | summarize p95 = percentile(duration, 95), p50 = percentile(duration, 50), avgDuration = avg(duration), requestCount = count()";
return Promise.all([
healthModule.queryAppInsights(appId, baselineQuery),
healthModule.queryAppInsights(appId, currentQuery)
]).then(function (results) {
var baseline = results[0].tables[0].rows[0];
var current = results[1].tables[0].rows[0];
var baselineP95 = baseline[0];
var currentP95 = current[0];
var regressionPercent = ((currentP95 - baselineP95) / baselineP95) * 100;
var report = {
baseline: {
p95: baselineP95,
p50: baseline[1],
avg: baseline[2],
requestCount: baseline[3]
},
current: {
p95: currentP95,
p50: current[1],
avg: current[2],
requestCount: current[3]
},
regressionPercent: Math.round(regressionPercent * 100) / 100,
regressed: regressionPercent > regressionThreshold
};
if (report.regressed) {
console.log("REGRESSION DETECTED: P95 increased by " + report.regressionPercent + "%");
console.log(" Baseline P95: " + report.baseline.p95 + "ms");
console.log(" Current P95: " + report.current.p95 + "ms");
} else {
console.log("No regression detected. P95 change: " + report.regressionPercent + "%");
}
return report;
});
}
module.exports = { detectRegression: detectRegression };
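Like the health checker, this can run as a pipeline script step after the warm-up period. A short usage sketch, again assuming APP_INSIGHTS_APP_ID is available in the environment:
// run-regression-check.js (usage sketch)
var detector = require("./regression-detector");
detector.detectRegression(process.env.APP_INSIGHTS_APP_ID, {
  baselineMinutes: 60,      // the hour before the post-deployment window
  currentMinutes: 15,       // the post-deployment window
  regressionThreshold: 20   // fail if P95 grows by more than 20%
}).then(function (report) {
  process.exit(report.regressed ? 1 : 0);
}).catch(function (err) {
  console.error(err.message);
  process.exit(1);
});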
Complete Working Example
Here is a full Azure Pipeline that deploys a Node.js app to Azure App Service, creates a deployment annotation, runs availability tests, checks deployment health, and uses the results as a gate for production promotion.
# azure-pipelines.yml
trigger:
branches:
include:
- main
variables:
azureSubscription: 'MyAzureServiceConnection'
appName: 'my-nodejs-app'
resourceGroup: 'rg-myapp-prod'
appInsightsName: 'ai-myapp-prod'
appInsightsAppId: '$(APP_INSIGHTS_APP_ID)'
subscriptionId: '$(AZURE_SUBSCRIPTION_ID)'
stages:
- stage: Build
displayName: 'Build and Test'
jobs:
- job: BuildJob
pool:
vmImage: 'ubuntu-latest'
steps:
- task: NodeTool@0
inputs:
versionSpec: '20.x'
- script: |
npm ci
npm test
displayName: 'Install and Test'
- task: ArchiveFiles@2
inputs:
rootFolderOrFile: '$(System.DefaultWorkingDirectory)'
includeRootFolder: false
archiveType: 'zip'
archiveFile: '$(Build.ArtifactStagingDirectory)/app.zip'
- publish: $(Build.ArtifactStagingDirectory)/app.zip
artifact: drop
- stage: DeployStaging
displayName: 'Deploy to Staging'
dependsOn: Build
jobs:
- deployment: DeployStaging
pool:
vmImage: 'ubuntu-latest'
environment: 'staging'
strategy:
runOnce:
deploy:
steps:
- download: current
artifact: drop
- task: AzureWebApp@1
inputs:
azureSubscription: $(azureSubscription)
appType: 'webAppLinux'
appName: '$(appName)-staging'
package: '$(Pipeline.Workspace)/drop/app.zip'
appSettings: '-APP_VERSION $(Build.BuildNumber) -APPINSIGHTS_INSTRUMENTATIONKEY $(AI_IKEY)'
- stage: ValidateStaging
displayName: 'Validate Staging Health'
dependsOn: DeployStaging
jobs:
- job: HealthGate
pool:
vmImage: 'ubuntu-latest'
steps:
- task: NodeTool@0
inputs:
versionSpec: '20.x'
- script: |
echo "Waiting 60 seconds for application to stabilize..."
sleep 60
displayName: 'Warm-up Period'
- script: |
npm install
node availability-test.js
displayName: 'Run Availability Tests'
- task: AzureCLI@2
displayName: 'Check Error Rate Gate'
inputs:
azureSubscription: $(azureSubscription)
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
ERROR_RATE=$(az monitor app-insights query \
--app $(appInsightsAppId) \
--analytics-query "requests | where timestamp > ago(10m) | summarize errorRate = todouble(countif(success == false)) / count() * 100 | project errorRate" \
--query "tables[0].rows[0][0]" -o tsv)
echo "Staging error rate: ${ERROR_RATE}%"
if (( $(echo "$ERROR_RATE > 5" | bc -l) )); then
echo "##vso[task.logissue type=error]Error rate too high for production promotion"
exit 1
fi
- task: AzureCLI@2
displayName: 'Check Response Time Gate'
inputs:
azureSubscription: $(azureSubscription)
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
P95=$(az monitor app-insights query \
--app $(appInsightsAppId) \
--analytics-query "requests | where timestamp > ago(10m) | summarize percentile(duration, 95)" \
--query "tables[0].rows[0][0]" -o tsv)
echo "Staging P95 response time: ${P95}ms"
if (( $(echo "$P95 > 3000" | bc -l) )); then
echo "##vso[task.logissue type=error]P95 response time exceeds 3000ms threshold"
exit 1
fi
- stage: DeployProduction
displayName: 'Deploy to Production'
dependsOn: ValidateStaging
jobs:
- deployment: DeployProd
pool:
vmImage: 'ubuntu-latest'
environment: 'production'
strategy:
runOnce:
deploy:
steps:
- download: current
artifact: drop
- task: AzureWebApp@1
inputs:
azureSubscription: $(azureSubscription)
appType: 'webAppLinux'
appName: $(appName)
package: '$(Pipeline.Workspace)/drop/app.zip'
appSettings: '-APP_VERSION $(Build.BuildNumber) -APPINSIGHTS_INSTRUMENTATIONKEY $(AI_IKEY)'
- task: AzureCLI@2
displayName: 'Create Deployment Annotation'
inputs:
azureSubscription: $(azureSubscription)
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
ANNOTATION_ID=$(uuidgen)
DEPLOY_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S.0000000Z")
az rest --method put \
--uri "https://management.azure.com/subscriptions/$(subscriptionId)/resourceGroups/$(resourceGroup)/providers/microsoft.insights/components/$(appInsightsName)/Annotations?api-version=2015-05-01" \
--body "{\"Id\":\"$ANNOTATION_ID\",\"AnnotationName\":\"Release $(Build.BuildNumber)\",\"EventTime\":\"$DEPLOY_TIME\",\"Category\":\"Deployment\",\"Properties\":\"{\\\"ReleaseName\\\":\\\"$(Build.BuildNumber)\\\",\\\"Environment\\\":\\\"production\\\",\\\"CommitId\\\":\\\"$(Build.SourceVersion)\\\"}\"}"
- task: AzureCLI@2
displayName: 'Post-Deploy Health Check'
inputs:
azureSubscription: $(azureSubscription)
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
echo "Waiting 120 seconds for production stabilization..."
sleep 120
ERROR_RATE=$(az monitor app-insights query \
--app $(appInsightsAppId) \
--analytics-query "requests | where timestamp > ago(5m) | summarize errorRate = todouble(countif(success == false)) / count() * 100 | project errorRate" \
--query "tables[0].rows[0][0]" -o tsv)
echo "Production error rate: ${ERROR_RATE}%"
if (( $(echo "$ERROR_RATE > 2" | bc -l) )); then
echo "##vso[task.logissue type=warning]Elevated error rate in production: ${ERROR_RATE}%"
fi
This pipeline follows a progressive delivery pattern: build, deploy to staging, validate staging health with automated gates, then promote to production with annotation and post-deployment monitoring.
Common Issues and Troubleshooting
1. Deployment annotations not appearing on charts
The most common cause is a mismatch between the Application Insights resource ID and the one you are targeting in the REST API call. Double-check your subscription ID, resource group, and App Insights resource name. Also verify that the service connection has Contributor access to the Application Insights resource. Annotations use a legacy API version (2015-05-01), and the response format is not intuitive — a successful creation returns a 200 with an array, not a single object.
2. Release gates timing out with no data
If your gate queries Application Insights for data from the last 10 minutes but your staging environment receives little traffic, the query returns null or zero rows. This causes the gate evaluation to fail or produce misleading results. Solve this by running a synthetic load test or at minimum a set of health check requests before evaluating gates. The warm-up period in the pipeline example above exists for this reason.
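A simple way to generate that traffic is a short warm-up script run before the gate step. This is a sketch; the staging URL and request count are example values:
// warm-up.js (sketch): send a burst of synthetic requests so gate queries have data to evaluate
var https = require("https");
function hit(targetUrl) {
  return new Promise(function (resolve) {
    https.get(targetUrl, function (res) {
      res.resume(); // drain the response body
      resolve(res.statusCode);
    }).on("error", function () {
      resolve(0); // count failures as status 0 rather than rejecting
    });
  });
}
function warmUp(targetUrl, count) {
  var requests = [];
  for (var i = 0; i < count; i++) {
    requests.push(hit(targetUrl));
  }
  return Promise.all(requests);
}
warmUp("https://myapp-staging.azurewebsites.net/health", 50).then(function (codes) {
  var failures = codes.filter(function (c) { return c === 0 || c >= 500; }).length;
  console.log("Sent " + codes.length + " warm-up requests, " + failures + " failed");
});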
3. Application Insights SDK not sending telemetry
The most frequent cause is a missing or incorrect instrumentation key. Check that APPINSIGHTS_INSTRUMENTATIONKEY is set in your App Service configuration. If you are using connection strings (the newer approach), make sure you call appInsights.setup() with the connection string instead. Also check that start() is actually called — forgetting to call .start() after .setup() is a surprisingly common mistake.
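For reference, a connection-string setup looks like this (APPLICATIONINSIGHTS_CONNECTION_STRING is the setting name App Service and the SDK conventionally use):
// telemetry setup with a connection string instead of an instrumentation key
var appInsights = require("applicationinsights");
appInsights
  .setup(process.env.APPLICATIONINSIGHTS_CONNECTION_STRING)
  .start();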
4. Azure CLI az monitor app-insights query returning authentication errors
The Azure CLI command requires that the service principal associated with your Azure DevOps service connection has at least Reader access to the Application Insights resource. Additionally, some organizations restrict API access to Application Insights through Azure AD conditional access policies. If you see 403 errors, check the IAM roles on both the resource and the resource group.
5. Work item creation from alerts producing duplicates
When an alert fires repeatedly (e.g., due to a flapping condition), the action group triggers your function multiple times, creating duplicate work items. Implement deduplication by querying Azure Boards for existing work items with the same alert name before creating a new one. Use the System.Tags field to mark auto-created items and check for open items with matching tags.
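A deduplication sketch using the Azure Boards WIQL API, with the same authentication as the work item creation code above (sanitize the alert name before embedding it in the query string):
// find-existing-alert-bug.js (sketch): check for an open, auto-created bug with the same alert name
var https = require("https");
function findExistingWorkItem(alertName, devopsConfig) {
  var wiql = JSON.stringify({
    query: "SELECT [System.Id] FROM WorkItems " +
           "WHERE [System.TeamProject] = @project " +
           "AND [System.WorkItemType] = 'Bug' " +
           "AND [System.State] <> 'Closed' " +
           "AND [System.Tags] CONTAINS 'auto-created' " +
           "AND [System.Title] CONTAINS '" + alertName + "'"
  });
  var options = {
    hostname: "dev.azure.com",
    path: "/" + devopsConfig.organization + "/" + devopsConfig.project + "/_apis/wit/wiql?api-version=7.1",
    method: "POST",
    headers: {
      "Authorization": "Basic " + Buffer.from(":" + devopsConfig.pat).toString("base64"),
      "Content-Type": "application/json",
      "Content-Length": Buffer.byteLength(wiql)
    }
  };
  return new Promise(function (resolve, reject) {
    var req = https.request(options, function (res) {
      var data = "";
      res.on("data", function (chunk) { data += chunk; });
      res.on("end", function () {
        if (res.statusCode === 200) {
          resolve(JSON.parse(data).workItems.length > 0); // true when a duplicate already exists
        } else {
          reject(new Error("WIQL query failed: " + res.statusCode));
        }
      });
    });
    req.on("error", reject);
    req.write(wiql);
    req.end();
  });
}
module.exports = { findExistingWorkItem: findExistingWorkItem };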
Best Practices
Set a warm-up period after deployment. Never query Application Insights for health metrics immediately after deploying. Wait at least 60-120 seconds for the application to stabilize and for telemetry to flow.
Use connection strings instead of instrumentation keys. Microsoft is moving toward connection strings for Application Insights. They support regional ingestion endpoints and are more resilient than bare instrumentation keys.
Create annotations in every environment, not just production. Staging annotations help you correlate test failures with specific deployments during the validation phase.
Set realistic thresholds for release gates. An error rate threshold of 0% will cause false failures. Real applications have some baseline error rate from bots, health probes, and transient network issues. Measure your baseline and set thresholds accordingly.
Combine multiple health signals in gates. Error rate alone is not sufficient. Check response time percentiles, dependency success rates, and exception counts together. A single metric can hide problems that surface in another.
Dedup alerts-to-work-items by alert rule name. Always check for existing open work items before creating new ones. Flapping alerts will flood your backlog with duplicates otherwise.
Use deployment slots with monitoring. Azure App Service deployment slots let you deploy, warm up, and validate before swapping traffic. Combine this with Application Insights to monitor the slot independently before the swap.
Tag telemetry with the build version. The telemetry processor shown earlier tags every event with ai.application.ver. This makes it trivial to filter Application Insights data by deployment version and compare behavior across releases.
Store pipeline-generated metrics in Log Analytics. Send custom events from your pipeline (deployment duration, test results, artifact size) to Log Analytics so you can query them alongside application telemetry in unified dashboards.
Review alert thresholds quarterly. As your application evolves, baseline performance changes. Thresholds that made sense six months ago might be too loose or too tight today. Schedule regular reviews of your monitoring configuration.