Testing Infrastructure as Code
Test infrastructure code with unit tests, integration tests, policy checks, and compliance scanning for Terraform and CDK
Infrastructure as Code brought software engineering discipline to infrastructure provisioning, but most teams skip the part that makes software engineering actually work: testing. Treating Terraform modules and CDK stacks as untested scripts is how you end up with a security group that allows 0.0.0.0/0 on port 22 in production. This article covers the full testing pyramid for IaC, from static analysis through integration testing, with practical examples you can implement today.
Prerequisites
- Node.js 18+ installed
- Terraform 1.6+ installed (for terraform test support; the mock_provider examples need 1.7+)
- AWS CDK v2 installed (npm install -g aws-cdk)
- Basic familiarity with Terraform HCL and AWS CDK constructs
- Go 1.21+ installed (for Terratest examples)
- An AWS account with credentials configured for integration tests
Why Test Infrastructure Code
Infrastructure code has a unique failure profile. When application code breaks, you get an error message and a stack trace. When infrastructure code breaks, you get a $47,000 AWS bill, a data breach, or a four-hour outage at 3 AM. The blast radius is enormous.
There are three categories of defects in IaC that testing catches:
Correctness defects — the infrastructure does not do what you intended. Your VPC has no internet gateway, your Lambda function has the wrong runtime, your RDS instance is in a single AZ.
Security defects — the infrastructure is configured in a way that exposes attack surface. Public S3 buckets, overly permissive IAM policies, unencrypted EBS volumes.
Compliance defects — the infrastructure violates organizational or regulatory policies. Wrong region, missing tags, instance types outside the approved list, no encryption at rest.
Testing IaC is not optional. It is the mechanism that turns your infrastructure definitions from "scripts we hope work" into "validated specifications we can reason about."
The Testing Pyramid for Infrastructure as Code
The traditional testing pyramid applies to IaC, but the layers look different:
          /   E2E Tests   \        ← Deploy and validate in real environment
         /   Integration   \       ← Terratest, deploy modules in isolation
        /  Contract Tests   \      ← Module interface validation
       /     Unit Tests      \     ← CDK assertions, terraform test
      /  Policy & Compliance  \    ← OPA, Sentinel, Checkov
     /     Static Analysis     \   ← tflint, cdk-nag, tfsec
The bottom layers run in seconds with no cloud credentials. The top layers require real infrastructure and take minutes. A healthy IaC pipeline runs all of them, but invests most heavily in the bottom three.
Static Analysis
Static analysis catches structural errors and security misconfigurations without executing anything. These tools parse your IaC files and check them against rule databases.
tflint for Terraform
tflint catches issues that terraform validate misses — deprecated syntax, invalid instance types, naming convention violations.
# Install tflint
curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash
# Initialize with AWS ruleset
cat > .tflint.hcl <<EOF
plugin "aws" {
enabled = true
version = "0.30.0"
source = "github.com/terraform-linters/tflint-ruleset-aws"
}
rule "terraform_naming_convention" {
enabled = true
format = "snake_case"
}
rule "terraform_documented_variables" {
enabled = true
}
EOF
tflint --init
tflint --recursive
Checkov for Multi-Framework Scanning
Checkov is the Swiss Army knife of IaC security scanning. It supports Terraform, CloudFormation, Kubernetes manifests, Dockerfiles, and CDK.
pip install checkov
# Scan Terraform directory
checkov -d ./modules/networking --framework terraform
# Scan with custom policy directory
checkov -d ./modules/networking --external-checks-dir ./policies
# Output as JUnit XML for CI integration
checkov -d . --output junitxml > checkov-results.xml
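To gate a pipeline on specific findings rather than on Checkov's exit code alone, the JSON report can be post-processed. The sketch below assumes the report came from checkov -d . --output json and that it exposes a results.failed_checks array whose items carry check_id, resource, and file_path; verify those field names against your Checkov version. The function name summarizeCheckov, the check ID, and the sample data are illustrative, not real scanner output.

```javascript
// Sketch: group Checkov failures by check ID.
// Assumption: the report is { results: { failed_checks: [...] } }, or an
// array of such objects when several frameworks are scanned in one run.
function summarizeCheckov(report) {
  var reports = Array.isArray(report) ? report : [report];
  var byCheck = {};
  reports.forEach(function(r) {
    var failed = (r.results && r.results.failed_checks) || [];
    failed.forEach(function(f) {
      if (!byCheck[f.check_id]) {
        byCheck[f.check_id] = [];
      }
      byCheck[f.check_id].push(f.resource + " (" + f.file_path + ")");
    });
  });
  return byCheck;
}

// Hypothetical sample; CKV_EXAMPLE_1 is a placeholder, not a real check ID
var sampleReport = {
  check_type: "terraform",
  results: {
    failed_checks: [
      { check_id: "CKV_EXAMPLE_1", resource: "aws_security_group.bastion", file_path: "/main.tf" },
      { check_id: "CKV_EXAMPLE_1", resource: "aws_security_group.web", file_path: "/web.tf" }
    ]
  }
};

console.log(summarizeCheckov(sampleReport));
```

Grouping by check ID makes it easier to see whether one noisy rule or many distinct problems are blocking the build.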
cdk-nag for CDK Stacks
cdk-nag checks your CDK constructs against rule packs such as AWS Solutions, which encodes AWS best practices. It runs during synthesis, catching issues before anything touches CloudFormation.
var cdk = require("aws-cdk-lib");
var nag = require("cdk-nag");
var MyStack = require("./lib/my-stack");
var app = new cdk.App();
var stack = new MyStack(app, "MyStack");
// Apply AWS Solutions checks
cdk.Aspects.of(app).add(new nag.AwsSolutionsChecks({ verbose: true }));
// Suppress specific rules when justified
nag.NagSuppressions.addStackSuppressions(stack, [
{
id: "AwsSolutions-S1",
reason: "Access logging bucket does not need its own access logs"
}
]);
app.synth();
Unit Testing CDK with Assertions
The aws-cdk-lib/assertions module lets you write unit tests that inspect the synthesized CloudFormation template without deploying anything. These tests run in milliseconds.
var cdk = require("aws-cdk-lib");
var assertions = require("aws-cdk-lib/assertions");
var assert = require("assert");
// The stack under test
var NetworkStack = require("../lib/network-stack");
function testVpcConfiguration() {
var app = new cdk.App();
var stack = new NetworkStack(app, "TestNetworkStack", {
environment: "production",
cidrBlock: "10.0.0.0/16"
});
var template = assertions.Template.fromStack(stack);
// Assert VPC exists with correct CIDR
template.hasResourceProperties("AWS::EC2::VPC", {
CidrBlock: "10.0.0.0/16",
EnableDnsHostnames: true,
EnableDnsSupport: true
});
// Assert 6 subnets in total (3 private + 3 public); this checks the count only, not the split
template.resourceCountIs("AWS::EC2::Subnet", 6);
// Assert NAT Gateway exists in production
template.resourceCountIs("AWS::EC2::NatGateway", 3);
// Assert flow logs are enabled
template.hasResourceProperties("AWS::EC2::FlowLog", {
ResourceType: "VPC",
TrafficType: "ALL"
});
console.log("VPC configuration tests passed");
}
function testSecurityGroupRules() {
var app = new cdk.App();
var stack = new NetworkStack(app, "TestNetworkStack", {
environment: "production",
cidrBlock: "10.0.0.0/16"
});
var template = assertions.Template.fromStack(stack);
// Assert no security group allows unrestricted SSH
var securityGroups = template.findResources("AWS::EC2::SecurityGroup");
Object.keys(securityGroups).forEach(function(key) {
var sg = securityGroups[key];
var ingress = sg.Properties.SecurityGroupIngress || [];
ingress.forEach(function(rule) {
// Match any port range that includes 22, not only rules that start or end on it
if (rule.FromPort <= 22 && rule.ToPort >= 22) {
assert.notStrictEqual(
rule.CidrIp,
"0.0.0.0/0",
"SSH must not be open to the world: " + key
);
}
});
});
console.log("Security group tests passed");
}
function testTaggingPolicy() {
var app = new cdk.App();
var stack = new NetworkStack(app, "TestNetworkStack", {
environment: "production",
cidrBlock: "10.0.0.0/16"
});
var template = assertions.Template.fromStack(stack);
// Assert all taggable resources have required tags
var allResources = template.toJSON().Resources;
var taggableTypes = [
"AWS::EC2::VPC",
"AWS::EC2::Subnet",
"AWS::EC2::SecurityGroup"
];
Object.keys(allResources).forEach(function(key) {
var resource = allResources[key];
if (taggableTypes.indexOf(resource.Type) !== -1) {
var tags = resource.Properties.Tags || [];
var tagNames = tags.map(function(t) { return t.Key; });
assert.ok(
tagNames.indexOf("Environment") !== -1,
"Missing Environment tag on " + key
);
assert.ok(
tagNames.indexOf("ManagedBy") !== -1,
"Missing ManagedBy tag on " + key
);
}
});
console.log("Tagging policy tests passed");
}
testVpcConfiguration();
testSecurityGroupRules();
testTaggingPolicy();
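The loop in testSecurityGroupRules only inspects ingress rules declared inline on a security group. A slightly broader check, sketched here against the plain object returned by template.toJSON(), also covers standalone AWS::EC2::SecurityGroupIngress resources, all-protocol rules, and IPv6. The helper names exposesSsh and findOpenSshRules and the sample template are hypothetical, and the sketch assumes literal values (it does not resolve CloudFormation intrinsics such as Ref).

```javascript
// Returns true when a single ingress rule exposes port 22 to any address.
function exposesSsh(rule) {
  var open = rule.CidrIp === "0.0.0.0/0" || rule.CidrIpv6 === "::/0";
  if (!open) {
    return false;
  }
  // IpProtocol "-1" means all protocols and all ports
  if (String(rule.IpProtocol) === "-1") {
    return true;
  }
  return Number(rule.FromPort) <= 22 && Number(rule.ToPort) >= 22;
}

// Returns the logical IDs of resources that contain an offending rule.
function findOpenSshRules(templateJson) {
  var offenders = [];
  var resources = templateJson.Resources || {};
  Object.keys(resources).forEach(function(key) {
    var res = resources[key];
    var props = res.Properties || {};
    var rules = [];
    if (res.Type === "AWS::EC2::SecurityGroup") {
      rules = props.SecurityGroupIngress || [];
    } else if (res.Type === "AWS::EC2::SecurityGroupIngress") {
      rules = [props];
    }
    if (rules.some(exposesSsh)) {
      offenders.push(key);
    }
  });
  return offenders;
}

// Hypothetical template fragment for illustration
var sampleTemplate = {
  Resources: {
    BastionSg: {
      Type: "AWS::EC2::SecurityGroup",
      Properties: {
        SecurityGroupIngress: [
          { IpProtocol: "tcp", FromPort: 0, ToPort: 1024, CidrIp: "0.0.0.0/0" }
        ]
      }
    },
    WebSg: {
      Type: "AWS::EC2::SecurityGroup",
      Properties: {
        SecurityGroupIngress: [
          { IpProtocol: "tcp", FromPort: 443, ToPort: 443, CidrIp: "0.0.0.0/0" }
        ]
      }
    },
    AllIpv6Rule: {
      Type: "AWS::EC2::SecurityGroupIngress",
      Properties: { IpProtocol: "-1", CidrIpv6: "::/0" }
    }
  }
};

console.log(findOpenSshRules(sampleTemplate));
```

In a real test you would pass assertions.Template.fromStack(stack).toJSON() and assert that the returned list is empty.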
Unit Testing Terraform with terraform test
Terraform 1.6 introduced native testing with terraform test. Test files use the .tftest.hcl extension. With provider mocks, which arrived in Terraform 1.7, they run against your modules without deploying real infrastructure.
# tests/networking.tftest.hcl
# Mock the AWS provider to avoid real API calls
mock_provider "aws" {}
variables {
environment = "production"
vpc_cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
enable_nat = true
enable_flowlog = true
}
run "vpc_creates_correct_cidr" {
command = plan
assert {
condition = aws_vpc.main.cidr_block == "10.0.0.0/16"
error_message = "VPC CIDR block does not match expected value"
}
}
run "production_enables_nat_gateway" {
command = plan
assert {
condition = length(aws_nat_gateway.main) == 3
error_message = "Production should have 3 NAT gateways, one per AZ"
}
}
run "staging_disables_nat_gateway" {
command = plan
variables {
environment = "staging"
enable_nat = false
}
assert {
condition = length(aws_nat_gateway.main) == 0
error_message = "Staging should not have NAT gateways"
}
}
run "flow_logs_enabled" {
command = plan
assert {
condition = aws_flow_log.main[0].traffic_type == "ALL"
error_message = "Flow logs should capture all traffic"
}
}
run "private_subnets_tagged_correctly" {
command = plan
assert {
condition = alltrue([
for subnet in aws_subnet.private :
subnet.tags["Tier"] == "private"
])
error_message = "All private subnets must have Tier=private tag"
}
}
Run the tests with:
terraform test -verbose
Integration Testing with Terratest
Terratest is a Go library that deploys real infrastructure, validates it, and tears it down. These tests are slow (5-30 minutes) and cost money, but they catch issues that unit tests cannot — IAM permission errors, API quota limits, cross-service integration failures.
// test/networking_test.go
package test
import (
"fmt"
"testing"
"time"
"github.com/gruntwork-io/terratest/modules/aws"
"github.com/gruntwork-io/terratest/modules/retry"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestNetworkingModule(t *testing.T) {
t.Parallel()
// Use a unique name to avoid collisions
uniqueId := fmt.Sprintf("test-%d", time.Now().Unix())
awsRegion := "us-east-1"
terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
TerraformDir: "../modules/networking",
Vars: map[string]interface{}{
"environment": "test",
"vpc_cidr": "10.99.0.0/16",
"name_prefix": uniqueId,
"enable_nat": true,
},
EnvVars: map[string]string{
"AWS_DEFAULT_REGION": awsRegion,
},
})
// Destroy infrastructure after test completes
defer terraform.Destroy(t, terraformOptions)
// Deploy the infrastructure
terraform.InitAndApply(t, terraformOptions)
// Validate outputs exist
vpcId := terraform.Output(t, terraformOptions, "vpc_id")
// Stop immediately if the output is missing; every later check depends on it
require.NotEmpty(t, vpcId)
privateSubnetIds := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
assert.Equal(t, 3, len(privateSubnetIds))
// Validate VPC properties via AWS API
vpc := aws.GetVpcById(t, vpcId, awsRegion)
assert.Equal(t, "10.99.0.0/16", vpc.CidrBlock)
// Validate subnets have internet access through NAT
for _, subnetId := range privateSubnetIds {
routeTable := aws.GetRouteTableForSubnet(t, subnetId, awsRegion)
hasNatRoute := false
for _, route := range routeTable.Routes {
if route.NatGatewayId != nil {
hasNatRoute = true
break
}
}
assert.True(t, hasNatRoute,
"Private subnet %s should route through NAT gateway", subnetId)
}
// Validate DNS resolution works inside the VPC
retry.DoWithRetry(t, "Check DNS resolution", 5, 10*time.Second,
func() (string, error) {
// Verify VPC DNS attributes are enabled
dnsSupport := aws.GetDnsSupport(t, vpcId, awsRegion)
if !dnsSupport {
return "", fmt.Errorf("DNS support not enabled")
}
return "DNS OK", nil
},
)
}
func TestNetworkingModuleWithoutNat(t *testing.T) {
t.Parallel()
uniqueId := fmt.Sprintf("test-%d", time.Now().Unix())
terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
TerraformDir: "../modules/networking",
Vars: map[string]interface{}{
"environment": "dev",
"vpc_cidr": "10.98.0.0/16",
"name_prefix": uniqueId,
"enable_nat": false,
},
})
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
// Verify no NAT gateways were created
natGatewayIds := terraform.OutputList(t, terraformOptions, "nat_gateway_ids")
assert.Equal(t, 0, len(natGatewayIds))
}
Contract Testing for Modules
Contract tests validate that a module's interface — its inputs and outputs — behaves according to its documented specification. This prevents breaking changes when modules are shared across teams.
// test/module-contract.test.js
var assert = require("assert");
var fs = require("fs");
var path = require("path");
var child_process = require("child_process");
function loadTerraformOutputs(moduleDir) {
var result = child_process.execSync(
"terraform output -json",
{ cwd: moduleDir, encoding: "utf-8" }
);
return JSON.parse(result);
}
function loadTerraformVariables(moduleDir) {
var variablesFile = path.join(moduleDir, "variables.tf");
var content = fs.readFileSync(variablesFile, "utf-8");
// Parse variable blocks (simplified parser)
var variables = [];
var regex = /variable\s+"(\w+)"\s*\{/g;
var match;
while ((match = regex.exec(content)) !== null) {
variables.push(match[1]);
}
return variables;
}
function testNetworkModuleContract() {
var moduleDir = path.resolve(__dirname, "../modules/networking");
// Contract: required input variables
var requiredInputs = [
"environment",
"vpc_cidr",
"name_prefix",
"azs",
"enable_nat",
"enable_flowlog"
];
var actualVariables = loadTerraformVariables(moduleDir);
requiredInputs.forEach(function(input) {
assert.ok(
actualVariables.indexOf(input) !== -1,
"Module must accept variable: " + input
);
});
// Contract: required output values
var requiredOutputs = [
"vpc_id",
"vpc_cidr",
"private_subnet_ids",
"public_subnet_ids",
"nat_gateway_ids"
];
var outputsFile = path.join(moduleDir, "outputs.tf");
var outputContent = fs.readFileSync(outputsFile, "utf-8");
requiredOutputs.forEach(function(output) {
assert.ok(
outputContent.indexOf('output "' + output + '"') !== -1,
"Module must export output: " + output
);
});
console.log("Contract tests passed: networking module");
}
testNetworkModuleContract();
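The contract test above checks that outputs are declared, but not what shape they have. A complementary check can run against the object returned by loadTerraformOutputs, since terraform output -json wraps each output as an object with a value field. The sketch below is one way to do that; checkOutputContract, the contract format ("string" or "list"), and the sample values are assumptions made for illustration.

```javascript
// Compare applied module outputs with an expected shape per output name.
// contract maps output name -> "string" | "list" (hypothetical format).
function checkOutputContract(outputs, contract) {
  var problems = [];
  Object.keys(contract).forEach(function(name) {
    var entry = outputs[name];
    if (!entry) {
      problems.push("missing output: " + name);
      return;
    }
    var actual = Array.isArray(entry.value) ? "list" : typeof entry.value;
    if (actual !== contract[name]) {
      problems.push(name + ": expected " + contract[name] + ", got " + actual);
    }
  });
  return problems;
}

// Hypothetical result of `terraform output -json`
var sampleOutputs = {
  vpc_id: { sensitive: false, type: "string", value: "vpc-0abc" },
  private_subnet_ids: { sensitive: false, value: ["subnet-1", "subnet-2", "subnet-3"] },
  nat_gateway_ids: { sensitive: false, type: "string", value: "nat-1" }
};

var sampleContract = {
  vpc_id: "string",
  private_subnet_ids: "list",
  public_subnet_ids: "list",
  nat_gateway_ids: "list"
};

console.log(checkOutputContract(sampleOutputs, sampleContract));
```

Returning a list of problems instead of throwing on the first one lets a single run report every contract break to the module's consumers.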
Policy Testing with OPA and Sentinel
Policy-as-code tools let you write organizational rules that infrastructure must satisfy. Open Policy Agent (OPA) uses Rego, while HashiCorp Sentinel is Terraform Enterprise/Cloud specific.
OPA with Terraform Plan
# policies/terraform/networking.rego
package terraform.networking

# rego.v1 turns on the `contains` and `if` keywords, which OPA 1.0+ requires
import rego.v1

import input as tfplan

# Deny public SSH access
deny contains msg if {
	resource := tfplan.resource_changes[_]
	resource.type == "aws_security_group_rule"
	resource.change.after.type == "ingress"
	resource.change.after.from_port <= 22
	resource.change.after.to_port >= 22
	resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
	msg := sprintf("Security group rule %s allows SSH from 0.0.0.0/0", [resource.address])
}

# Require encryption on EBS volumes
deny contains msg if {
	resource := tfplan.resource_changes[_]
	resource.type == "aws_ebs_volume"
	not resource.change.after.encrypted
	msg := sprintf("EBS volume %s must be encrypted", [resource.address])
}

# Enforce approved instance types
approved_types := {
	"t3.micro", "t3.small", "t3.medium", "t3.large",
	"m5.large", "m5.xlarge", "m5.2xlarge",
	"r5.large", "r5.xlarge",
}

deny contains msg if {
	resource := tfplan.resource_changes[_]
	resource.type == "aws_instance"
	instance_type := resource.change.after.instance_type
	not approved_types[instance_type]
	msg := sprintf(
		"Instance type %s is not approved. Resource: %s",
		[instance_type, resource.address],
	)
}

# Require specific tags on all resources that expose a tags attribute
required_tags := {"Environment", "Team", "CostCenter"}

deny contains msg if {
	resource := tfplan.resource_changes[_]
	tags := resource.change.after.tags
	tag := required_tags[_]
	not tags[tag]
	msg := sprintf("Resource %s missing required tag: %s", [resource.address, tag])
}
Run OPA against a Terraform plan:
# Generate plan JSON
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
# Evaluate policy
opa eval \
--data policies/terraform/ \
--input tfplan.json \
"data.terraform.networking.deny" \
--format pretty
Automating OPA Checks in Node.js
// scripts/check-policy.js
var child_process = require("child_process");
var path = require("path");
function runOpaEval(policyDir, planFile, policyPackage) {
// Pass arguments as an array so paths containing spaces are not split by a shell
var args = [
"eval",
"--data", policyDir,
"--input", planFile,
"--format", "json",
"data." + policyPackage + ".deny"
];
var result = child_process.execFileSync("opa", args, { encoding: "utf-8" });
var parsed = JSON.parse(result);
var violations = parsed.result[0].expressions[0].value;
return violations;
}
function checkPolicies(planFile) {
var policyDir = path.resolve(__dirname, "../policies/terraform");
var violations = runOpaEval(policyDir, planFile, "terraform.networking");
if (violations.length > 0) {
console.error("Policy violations found:");
violations.forEach(function(v) {
console.error(" - " + v);
});
process.exit(1);
}
console.log("All policies passed");
}
var planFile = process.argv[2];
if (!planFile) {
console.error("Usage: node check-policy.js <plan.json>");
process.exit(1);
}
checkPolicies(planFile);
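The script above reads parsed.result[0].expressions[0].value directly, so a misspelled package path surfaces as an opaque TypeError because OPA returns no result for an undefined query. A more defensive parser might look like the sketch below. The function name extractViolations is hypothetical, and it assumes deny evaluates to a set of message strings, which opa eval --format json serializes as an array.

```javascript
// Extract deny messages from parsed `opa eval --format json` output.
// Fails loudly when the query produced no result, instead of crashing on
// an undefined index or silently reporting success.
function extractViolations(opaOutput) {
  var results = opaOutput.result;
  if (!Array.isArray(results) || results.length === 0) {
    throw new Error("OPA query returned no result; check the package path");
  }
  var value = results[0].expressions[0].value;
  if (!Array.isArray(value)) {
    throw new Error("Expected deny to evaluate to a set of messages");
  }
  // Sort a copy for stable, diff-friendly output
  return value.slice().sort();
}

// Hypothetical OPA output for illustration
var sampleOpaOutput = {
  result: [
    {
      expressions: [
        {
          value: [
            "Security group rule aws_security_group_rule.ssh allows SSH from 0.0.0.0/0",
            "EBS volume aws_ebs_volume.data must be encrypted"
          ]
        }
      ]
    }
  ]
};

extractViolations(sampleOpaOutput).forEach(function(v) {
  console.log(" - " + v);
});
```

Failing when the result is missing keeps the gate closed by default: a renamed policy package breaks the build rather than waving every plan through.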
Snapshot Testing for CloudFormation
Snapshot testing captures a known-good CloudFormation template and alerts you when the synthesized output changes. This catches unintended drift caused by CDK version upgrades or dependency changes.
// test/snapshot.test.js
var cdk = require("aws-cdk-lib");
var assertions = require("aws-cdk-lib/assertions");
var fs = require("fs");
var path = require("path");
var assert = require("assert");
var NetworkStack = require("../lib/network-stack");
function updateSnapshot(snapshotPath, template) {
fs.writeFileSync(snapshotPath, JSON.stringify(template, null, 2));
console.log("Snapshot updated: " + snapshotPath);
}
function testSnapshot() {
var app = new cdk.App();
var stack = new NetworkStack(app, "SnapshotStack", {
environment: "production",
cidrBlock: "10.0.0.0/16"
});
var template = assertions.Template.fromStack(stack);
var templateJson = template.toJSON();
var snapshotPath = path.join(__dirname, "__snapshots__", "network-stack.json");
var shouldUpdate = process.env.UPDATE_SNAPSHOTS === "true";
if (shouldUpdate || !fs.existsSync(snapshotPath)) {
fs.mkdirSync(path.dirname(snapshotPath), { recursive: true });
updateSnapshot(snapshotPath, templateJson);
return;
}
var savedSnapshot = JSON.parse(fs.readFileSync(snapshotPath, "utf-8"));
// Compare resource counts
var currentResources = Object.keys(templateJson.Resources);
var snapshotResources = Object.keys(savedSnapshot.Resources);
assert.deepStrictEqual(
currentResources.sort(),
snapshotResources.sort(),
"Resource list has changed. Run with UPDATE_SNAPSHOTS=true to update."
);
// Compare resource types
currentResources.forEach(function(key) {
assert.strictEqual(
templateJson.Resources[key].Type,
savedSnapshot.Resources[key].Type,
"Resource type changed for " + key
);
});
console.log("Snapshot test passed");
}
testSnapshot();
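If you extend the snapshot to compare whole resources rather than just IDs and types, CDK bookkeeping will cause churn on every upgrade. One option is to normalize the template before writing or comparing it. The sketch below removes AWS::CDK::Metadata resources and the aws:cdk:path metadata key; normalizeTemplate and the sample template are hypothetical, and you may need to strip other volatile fields (asset hashes, for example) in your own stacks.

```javascript
// Return a deep copy of a synthesized template with CDK bookkeeping removed.
function normalizeTemplate(template) {
  var copy = JSON.parse(JSON.stringify(template));
  var resources = copy.Resources || {};
  Object.keys(resources).forEach(function(key) {
    var res = resources[key];
    // Version-reporting resource that CDK adds to every stack by default
    if (res.Type === "AWS::CDK::Metadata") {
      delete resources[key];
      return;
    }
    if (res.Metadata) {
      delete res.Metadata["aws:cdk:path"];
      if (Object.keys(res.Metadata).length === 0) {
        delete res.Metadata;
      }
    }
  });
  return copy;
}

// Hypothetical synthesized template for illustration
var sampleSynth = {
  Resources: {
    Vpc: {
      Type: "AWS::EC2::VPC",
      Properties: { CidrBlock: "10.0.0.0/16" },
      Metadata: { "aws:cdk:path": "SnapshotStack/Vpc/Resource" }
    },
    CDKMetadata: {
      Type: "AWS::CDK::Metadata",
      Properties: { Analytics: "placeholder" }
    }
  }
};

console.log(JSON.stringify(normalizeTemplate(sampleSynth), null, 2));
```

Apply the same normalization to both the saved snapshot and the fresh template so the comparison stays symmetric.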
Compliance Testing
Compliance testing validates that infrastructure meets regulatory or organizational standards. Unlike policy tests that check individual resources, compliance tests verify cross-cutting concerns across your entire deployment.
// test/compliance.test.js
var assert = require("assert");
var child_process = require("child_process");
function getTerraformState(dir) {
var result = child_process.execSync(
"terraform show -json",
{ cwd: dir, encoding: "utf-8" }
);
return JSON.parse(result);
}
function checkEncryptionAtRest(state) {
var violations = [];
var resources = state.values.root_module.resources || [];
resources.forEach(function(resource) {
switch (resource.type) {
case "aws_s3_bucket":
// S3 default encryption is enforced at bucket level since Jan 2023
break;
case "aws_db_instance":
if (!resource.values.storage_encrypted) {
violations.push("RDS instance " + resource.address + " is not encrypted");
}
break;
case "aws_ebs_volume":
if (!resource.values.encrypted) {
violations.push("EBS volume " + resource.address + " is not encrypted");
}
break;
case "aws_dynamodb_table":
var sse = resource.values.server_side_encryption;
if (!sse || !sse[0] || !sse[0].enabled) {
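// DynamoDB always encrypts at rest; this block only switches from the AWS owned key to a KMS key in your account
violations.push("DynamoDB table " + resource.address + " uses the default AWS owned key, not a KMS key in your account");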
}
break;
}
});
return violations;
}
function checkNetworkSegmentation(state) {
var violations = [];
var resources = state.values.root_module.resources || [];
// Databases must be in private subnets
var dbSubnetGroups = resources.filter(function(r) {
return r.type === "aws_db_subnet_group";
});
dbSubnetGroups.forEach(function(group) {
var subnetIds = group.values.subnet_ids || [];
subnetIds.forEach(function(subnetId) {
var subnet = resources.find(function(r) {
return r.type === "aws_subnet" && r.values.id === subnetId;
});
if (subnet && subnet.values.map_public_ip_on_launch) {
violations.push(
"DB subnet group " + group.address +
" includes public subnet " + subnetId
);
}
});
});
return violations;
}
function runComplianceChecks(dir) {
var state = getTerraformState(dir);
var allViolations = [];
var encryptionViolations = checkEncryptionAtRest(state);
allViolations = allViolations.concat(encryptionViolations);
var networkViolations = checkNetworkSegmentation(state);
allViolations = allViolations.concat(networkViolations);
if (allViolations.length > 0) {
console.error("Compliance violations:");
allViolations.forEach(function(v) {
console.error(" FAIL: " + v);
});
process.exit(1);
}
console.log("All compliance checks passed");
}
runComplianceChecks(process.argv[2] || ".");
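The checks above read only state.values.root_module.resources, so resources created inside module blocks are skipped. In terraform show -json output, nested modules appear under child_modules, each with its own resources and child_modules. A small recursive helper, sketched below with the hypothetical name collectResources and made-up sample data, flattens the tree so the same checks can see every resource.

```javascript
// Flatten resources from a module and all of its nested child modules.
function collectResources(module) {
  var found = (module.resources || []).slice();
  (module.child_modules || []).forEach(function(child) {
    found = found.concat(collectResources(child));
  });
  return found;
}

// Hypothetical `terraform show -json` fragment for illustration
var sampleState = {
  values: {
    root_module: {
      resources: [
        { address: "aws_vpc.main", type: "aws_vpc", values: {} }
      ],
      child_modules: [
        {
          address: "module.database",
          resources: [
            {
              address: "module.database.aws_db_instance.primary",
              type: "aws_db_instance",
              values: { storage_encrypted: false }
            }
          ],
          child_modules: [
            {
              address: "module.database.module.backup",
              resources: [
                {
                  address: "module.database.module.backup.aws_ebs_volume.scratch",
                  type: "aws_ebs_volume",
                  values: { encrypted: true }
                }
              ]
            }
          ]
        }
      ]
    }
  }
};

var allResources = collectResources(sampleState.values.root_module);
console.log(allResources.map(function(r) { return r.address; }));
```

Inside checkEncryptionAtRest and checkNetworkSegmentation, the resources variable could then be built with collectResources(state.values.root_module).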
Cost Estimation in Tests
Deploying infrastructure without understanding the cost is as reckless as deploying without testing. Infracost integrates into your test pipeline to catch cost surprises before they hit your bill.
// scripts/cost-check.js
var child_process = require("child_process");
var path = require("path");
function getInfracostBreakdown(terraformDir) {
var cmd = [
"infracost", "breakdown",
"--path", terraformDir,
"--format", "json",
"--no-color"
].join(" ");
var result = child_process.execSync(cmd, { encoding: "utf-8" });
return JSON.parse(result);
}
function checkCostThresholds(breakdown, maxMonthlyCost, maxHourlyCost) {
var totalMonthlyCost = parseFloat(breakdown.totalMonthlyCost);
var totalHourlyCost = parseFloat(breakdown.totalHourlyCost);
var violations = [];
if (totalMonthlyCost > maxMonthlyCost) {
violations.push(
"Monthly cost $" + totalMonthlyCost.toFixed(2) +
" exceeds threshold $" + maxMonthlyCost.toFixed(2)
);
}
if (totalHourlyCost > maxHourlyCost) {
violations.push(
"Hourly cost $" + totalHourlyCost.toFixed(2) +
" exceeds threshold $" + maxHourlyCost.toFixed(2)
);
}
// Check for expensive individual resources
var projects = breakdown.projects || [];
projects.forEach(function(project) {
var resources = project.breakdown.resources || [];
resources.forEach(function(resource) {
var resourceMonthlyCost = parseFloat(resource.monthlyCost || 0);
if (resourceMonthlyCost > 500) {
violations.push(
"Resource " + resource.name + " costs $" +
resourceMonthlyCost.toFixed(2) + "/month — needs review"
);
}
});
});
return violations;
}
var terraformDir = process.argv[2] || ".";
var maxMonthly = parseFloat(process.argv[3]) || 1000;
var maxHourly = parseFloat(process.argv[4]) || 2;
var breakdown = getInfracostBreakdown(terraformDir);
var violations = checkCostThresholds(breakdown, maxMonthly, maxHourly);
if (violations.length > 0) {
console.error("Cost policy violations:");
violations.forEach(function(v) {
console.error(" - " + v);
});
process.exit(1);
}
console.log("Cost check passed: $" +
parseFloat(breakdown.totalMonthlyCost).toFixed(2) + "/month");
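Absolute thresholds catch runaway totals, but reviewers often care more about how much a change adds. Given two breakdown objects, one from the base branch and one from the PR, the relative increase can be computed as in the sketch below. It relies only on the totalMonthlyCost field already used above; the names costIncrease and exceedsGrowthLimit and the sample figures are hypothetical.

```javascript
// Compare two Infracost breakdowns and report the monthly cost change.
function costIncrease(baseline, current) {
  var before = parseFloat(baseline.totalMonthlyCost) || 0;
  var after = parseFloat(current.totalMonthlyCost) || 0;
  var delta = after - before;
  return {
    before: before,
    after: after,
    delta: delta,
    // Percentage is undefined when there was no cost before, so use null
    percent: before > 0 ? (delta / before) * 100 : null
  };
}

// True when the change grows monthly cost by more than maxPercent.
function exceedsGrowthLimit(baseline, current, maxPercent) {
  var change = costIncrease(baseline, current);
  if (change.percent === null) {
    return change.delta > 0;
  }
  return change.percent > maxPercent;
}

// Hypothetical figures for illustration
var baseBreakdown = { totalMonthlyCost: "100.00" };
var prBreakdown = { totalMonthlyCost: "125.00" };

console.log(costIncrease(baseBreakdown, prBreakdown));
console.log(exceedsGrowthLimit(baseBreakdown, prBreakdown, 20));
```

Treating a brand-new cost as a violation (the null branch) is a deliberate, conservative choice here; relax it if new stacks are expected to start from zero.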
CI/CD Pipeline Integration
All these tests need to run automatically. Here is a GitHub Actions workflow that implements the full IaC testing pyramid:
# .github/workflows/iac-tests.yml
name: Infrastructure Tests
on:
pull_request:
paths:
- 'infrastructure/**'
- 'modules/**'
- 'policies/**'
env:
TF_VERSION: "1.7.0"
AWS_REGION: "us-east-1"
jobs:
static-analysis:
name: Static Analysis
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run tflint
uses: terraform-linters/setup-tflint@v4
with:
tflint_version: latest
- run: |
tflint --init
tflint --recursive --format compact
- name: Run Checkov
uses: bridgecrewio/checkov-action@master
with:
directory: infrastructure/
framework: terraform
output_format: junitxml
output_file_path: checkov-results.xml
- name: Upload Checkov results
uses: actions/upload-artifact@v4
if: always()
with:
name: checkov-results
path: checkov-results.xml
unit-tests:
name: Unit Tests
runs-on: ubuntu-latest
needs: static-analysis
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
with:
terraform_version: ${{ env.TF_VERSION }}
- name: Terraform Unit Tests
run: |
cd modules/networking
terraform init
terraform test -verbose
- uses: actions/setup-node@v4
with:
node-version: '20'
- name: CDK Unit Tests
run: |
cd infrastructure/cdk
npm ci
npm test
policy-check:
name: Policy Evaluation
runs-on: ubuntu-latest
needs: unit-tests
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
with:
terraform_version: ${{ env.TF_VERSION }}
- name: Generate plan
run: |
cd infrastructure
terraform init -backend=false
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
- name: OPA policy check
uses: open-policy-agent/setup-opa@v2
- run: |
opa eval \
--data policies/ \
--input infrastructure/tfplan.json \
--format json \
"data.terraform.networking.deny" | \
node -e "
var input = '';
process.stdin.on('data', function(d){ input += d; });
process.stdin.on('end', function(){
var result = JSON.parse(input);
var violations = result.result[0].expressions[0].value;
if (violations.length > 0) {
violations.forEach(function(v){ console.error(v); });
process.exit(1);
}
console.log('All policies passed');
});
"
- name: Cost estimation
uses: infracost/actions/setup@v3
with:
api-key: ${{ secrets.INFRACOST_API_KEY }}
- run: |
infracost breakdown --path infrastructure/ --format json > cost.json
node scripts/cost-check.js infrastructure/ 2000 5
integration-tests:
name: Integration Tests
runs-on: ubuntu-latest
needs: [unit-tests, policy-check]
if: contains(github.event.pull_request.labels.*.name, 'run-integration')
permissions:
id-token: write
contents: read
steps:
- uses: actions/checkout@v4
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_TEST_ROLE_ARN }}
aws-region: ${{ env.AWS_REGION }}
- uses: actions/setup-go@v5
with:
go-version: '1.21'
- name: Run Terratest
run: |
cd test
go test -v -timeout 30m ./...
env:
AWS_DEFAULT_REGION: ${{ env.AWS_REGION }}
Complete Working Example
Here is a complete test suite for a Terraform networking module. This ties together unit tests, contract tests, and policy tests into a single runnable project.
Module Structure
modules/
  networking/
    main.tf
    variables.tf
    outputs.tf
    tests/
      networking.tftest.hcl
test/
  contract.test.js
  policy.test.js
  networking_test.go
policies/
  terraform/
    networking.rego
The Module Under Test
# modules/networking/main.tf
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(var.common_tags, {
Name = "${var.name_prefix}-vpc"
})
}
resource "aws_subnet" "private" {
count = length(var.azs)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)
availability_zone = var.azs[count.index]
tags = merge(var.common_tags, {
Name = "${var.name_prefix}-private-${var.azs[count.index]}"
Tier = "private"
})
}
resource "aws_subnet" "public" {
count = length(var.azs)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index + length(var.azs))
availability_zone = var.azs[count.index]
map_public_ip_on_launch = true
tags = merge(var.common_tags, {
Name = "${var.name_prefix}-public-${var.azs[count.index]}"
Tier = "public"
})
}
resource "aws_nat_gateway" "main" {
count = var.enable_nat ? length(var.azs) : 0
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
tags = merge(var.common_tags, {
Name = "${var.name_prefix}-nat-${var.azs[count.index]}"
})
}
resource "aws_eip" "nat" {
count = var.enable_nat ? length(var.azs) : 0
domain = "vpc"
tags = merge(var.common_tags, {
Name = "${var.name_prefix}-eip-${var.azs[count.index]}"
})
}
resource "aws_flow_log" "main" {
count = var.enable_flowlog ? 1 : 0
vpc_id = aws_vpc.main.id
traffic_type = "ALL"
log_destination_type = "cloud-watch-logs"
log_destination = aws_cloudwatch_log_group.flow_log[0].arn
iam_role_arn = aws_iam_role.flow_log[0].arn
}
Test Runner Script
// test/run-tests.js
var child_process = require("child_process");
var path = require("path");
var results = {
passed: 0,
failed: 0,
errors: []
};
function runTest(name, command, cwd) {
console.log("\n=== " + name + " ===");
try {
child_process.execSync(command, {
cwd: cwd || process.cwd(),
stdio: "inherit",
encoding: "utf-8"
});
results.passed++;
console.log("PASS: " + name);
} catch (err) {
results.failed++;
results.errors.push(name + ": " + (err.message || "unknown error"));
console.error("FAIL: " + name);
}
}
// Layer 1: Static Analysis
runTest(
"tflint",
"tflint --recursive",
path.resolve(__dirname, "../modules")
);
runTest(
"checkov",
"checkov -d . --framework terraform --quiet",
path.resolve(__dirname, "../modules/networking")
);
// Layer 2: Unit Tests
runTest(
"terraform test",
"terraform test -verbose",
path.resolve(__dirname, "../modules/networking")
);
// Layer 3: Contract Tests
runTest(
"contract tests",
"node " + path.resolve(__dirname, "contract.test.js")
);
// Layer 4: Policy Tests
runTest(
"OPA policy evaluation",
"node " + path.resolve(__dirname, "policy.test.js")
);
// Summary
console.log("\n=== Test Summary ===");
console.log("Passed: " + results.passed);
console.log("Failed: " + results.failed);
if (results.errors.length > 0) {
console.error("\nFailures:");
results.errors.forEach(function(e) {
console.error(" - " + e);
});
process.exit(1);
}
console.log("\nAll tests passed");
Common Issues and Troubleshooting
1. Terratest tests leave orphaned resources after failure
When the go test timeout fires or the CI job is cancelled, the process is killed and defer terraform.Destroy() never runs. (A panic or failed assertion inside the test function still runs deferred calls.) Use a scheduled cleanup job that scans for resources tagged Environment=test that are older than 4 hours and deletes them; the AWS Resource Groups Tag Editor helps find these. Also set TF_CLI_ARGS_apply="-lock-timeout=5m" to handle lock contention from parallel tests.
2. CDK snapshot tests break on every CDK version upgrade
CDK generates logical IDs and metadata that change between versions. Filter out aws:cdk:path and CDKMetadata resources from your snapshots. Compare only the resource types, properties, and dependency graph — not the entire template. Some teams skip snapshot testing entirely and rely on property-based assertions instead.
3. terraform test mock provider does not simulate all behaviors
The mock_provider in terraform test fills computed attributes with generated placeholders: 0 for numbers, false for booleans, a random string for strings, and empty collections. If your module logic depends on realistic computed values such as ARNs or IDs, the assertions will fail. Use override_resource blocks to supply realistic values for the computed attributes that matter to your test logic.
4. OPA policies pass locally but fail in CI
This usually happens because the Terraform plan JSON structure differs between Terraform versions. Pin your Terraform version in CI to match local development. Also verify that terraform show -json output format matches what your Rego policies expect — the schema changed between Terraform 0.12, 0.13, and 1.x.
5. Checkov false positives block the pipeline
Checkov is aggressive by default. Use .checkov.yaml to skip rules that do not apply to your environment. Document every suppression with a reason. Consider running Checkov in "soft-fail" mode during initial adoption and gradually making it a hard gate as you fix existing violations.
6. Integration tests are too slow for PR workflows
Run integration tests only on merge to main, or trigger them with a PR label (run-integration). Use Terraform workspaces or unique naming prefixes to enable parallel test execution. Cache Terraform provider binaries in CI to save 30-60 seconds per run.
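The cleanup job suggested in item 1 comes down to a selection rule: tagged Environment=test and older than a cutoff. That rule can be kept as a pure function and unit tested separately from the AWS calls that list and delete resources. The sketch below shows one possible shape; findStaleTestResources and the resource descriptor fields (arn, tags, createdAt) are hypothetical and would need mapping from whatever your inventory source returns.

```javascript
// Select test resources older than maxAgeHours, relative to `now`.
// Each resource is assumed to look like:
//   { arn: string, tags: { key: value }, createdAt: ISO-8601 string }
function findStaleTestResources(resources, now, maxAgeHours) {
  var cutoff = now.getTime() - maxAgeHours * 60 * 60 * 1000;
  return resources.filter(function(r) {
    var isTest = Boolean(r.tags) && r.tags.Environment === "test";
    return isTest && new Date(r.createdAt).getTime() < cutoff;
  });
}

// Hypothetical inventory for illustration
var sampleInventory = [
  { arn: "arn:example:vpc/old-test", tags: { Environment: "test" }, createdAt: "2024-01-01T06:00:00Z" },
  { arn: "arn:example:vpc/new-test", tags: { Environment: "test" }, createdAt: "2024-01-01T10:00:00Z" },
  { arn: "arn:example:vpc/old-prod", tags: { Environment: "production" }, createdAt: "2023-06-01T00:00:00Z" }
];

var stale = findStaleTestResources(sampleInventory, new Date("2024-01-01T12:00:00Z"), 4);
console.log(stale.map(function(r) { return r.arn; }));
```

Passing now as a parameter keeps the function deterministic, so the cutoff logic can be tested without mocking the clock.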
Best Practices
Test at the lowest possible level first. Static analysis and unit tests run in seconds with zero cloud cost. Push as much validation as possible into these layers before reaching for integration tests.
Use unique naming in all test infrastructure. Append timestamps or random strings to resource names. Parallel test runs will collide on hardcoded names, producing flaky failures that are painful to debug.
Tag all test resources consistently. Use Environment=test and CreatedBy=ci tags on everything. This makes cleanup trivial and prevents accidental deletion of real resources.
Run integration tests in an isolated AWS account. Never run Terratest against your production or staging account. Use AWS Organizations to create a dedicated test account with restricted permissions and budget alerts.
Pin tool versions in CI. Terraform, tflint, Checkov, OPA, and Infracost all release frequently. Version drift between local development and CI causes false failures. Use lockfiles, version constraints, and pinned action versions.
Treat policy violations as build failures. Once a policy is adopted, make it a hard gate in the pipeline. "Soft fail" modes create technical debt because nobody goes back to fix warnings.
Test module upgrades in isolation before rolling them out. When you update a shared module version, run the full test suite against the new version in a feature branch before merging. Breaking changes in modules cascade across every consumer.
Keep test infrastructure costs visible. Run Infracost on every PR and post the cost diff as a PR comment. Engineers make better decisions when cost impact is visible at review time, not 30 days later on the AWS bill.
Version your policy rules alongside your infrastructure code. Store Rego policies, Checkov configurations, and Sentinel rules in the same repository as your Terraform modules. Policy and infrastructure should evolve together.
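On the unique-naming practice: timestamps with one-second resolution can still collide when two parallel tests start in the same second, so a random suffix is the safer default. A minimal sketch using Node's built-in crypto module follows; the helper name uniquePrefix and the 8-character suffix length are arbitrary choices for illustration.

```javascript
var crypto = require("crypto");

// Build a resource name prefix with a random 8-character hex suffix.
function uniquePrefix(base) {
  return base + "-" + crypto.randomBytes(4).toString("hex");
}

var firstPrefix = uniquePrefix("test");
var secondPrefix = uniquePrefix("test");
console.log(firstPrefix, secondPrefix);
```

Keep the prefix short, since many AWS resource names have tight length limits.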
References
- Terraform Test Documentation — Official docs for the native terraform test framework
- Terratest by Gruntwork — Go library for integration testing Terraform, Packer, and Kubernetes
- AWS CDK Assertions Module — Unit testing CDK constructs
- Open Policy Agent — Policy-as-code for Terraform plans
- Checkov — Static analysis for IaC security scanning
- cdk-nag — CDK best practice validation
- tflint — Terraform linter with pluggable rules
- Infracost — Cloud cost estimation for Terraform