AWS CDK Patterns for Common Architectures

Build common AWS architectures with CDK patterns including serverless APIs, static sites, event processing, and multi-environment deployments

AWS CDK lets you define cloud infrastructure in real programming languages instead of YAML or JSON templates. The real power shows up when you start composing reusable patterns that encode your organization's best practices into constructs you can share across teams. This article walks through the most common architectural patterns I deploy repeatedly in production, with working Node.js code you can adapt immediately.

Prerequisites

  • Node.js 18+ installed
  • AWS CLI configured with credentials
  • AWS CDK CLI installed (npm install -g aws-cdk)
  • Basic familiarity with AWS services (Lambda, API Gateway, DynamoDB, S3)
  • A working understanding of CloudFormation concepts

Install the CDK libraries you will need:

npm install aws-cdk-lib constructs

Understanding L3 Construct Patterns

CDK organizes constructs into three levels. L1 constructs are direct CloudFormation resource mappings (prefixed with Cfn). L2 constructs add sensible defaults and convenience methods. L3 constructs, also called patterns, combine multiple resources into higher-level abstractions that represent entire architectural patterns.

The distinction matters because L3 constructs are where CDK becomes genuinely powerful. Instead of wiring together an API Gateway, Lambda function, IAM roles, and log groups individually, an L3 construct handles all of that in a single declaration. AWS ships some L3 constructs in aws-cdk-lib, but the @aws-solutions-constructs library provides dozens more battle-tested patterns.

var cdk = require("aws-cdk-lib");
var apigateway = require("aws-cdk-lib/aws-apigateway");
var lambda = require("aws-cdk-lib/aws-lambda");
var constructs = require("constructs");

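// These snippets assume they run inside a Stack constructor, so `this` is the stack.

// L1 - Raw CloudFormation. You manage every property yourself.
// roleArn is assumed to hold the ARN of an existing Lambda execution role.
var cfnFunction = new lambda.CfnFunction(this, "CfnFunc", {
  runtime: "nodejs18.x",
  handler: "index.handler",
  code: { zipFile: "exports.handler = async () => ({ statusCode: 200 })" },
  role: roleArn
});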

// L2 - Sensible defaults. IAM role, log group, and runtime config handled for you.
var fn = new lambda.Function(this, "L2Func", {
  runtime: lambda.Runtime.NODEJS_18_X,
  handler: "index.handler",
  code: lambda.Code.fromInline("exports.handler = async () => ({ statusCode: 200 })")
});

// L3 - Full pattern. API Gateway + Lambda integration + invoke permission + deployment stage wired together.
var api = new apigateway.LambdaRestApi(this, "L3Api", {
  handler: fn,
  proxy: true
});

The takeaway: always start with the highest-level construct that fits your use case and drop down to L2 or L1 only when you need finer control.

API + Lambda + DynamoDB Pattern

This is the most common serverless pattern. An API Gateway receives HTTP requests, routes them to Lambda functions, and those functions read from and write to DynamoDB. CDK makes wiring this up trivial.

var cdk = require("aws-cdk-lib");
var apigateway = require("aws-cdk-lib/aws-apigateway");
var lambda = require("aws-cdk-lib/aws-lambda");
var dynamodb = require("aws-cdk-lib/aws-dynamodb");
var constructs = require("constructs");

class ApiLambdaDynamoStack extends cdk.Stack {
  constructor(scope, id, props) {
    // aws-cdk-lib exports ES classes, which cannot be invoked with .call(),
    // so subclass with `extends` and call super().
    super(scope, id, props);

    // DynamoDB table with on-demand billing
    var table = new dynamodb.Table(this, "ItemsTable", {
      partitionKey: { name: "pk", type: dynamodb.AttributeType.STRING },
      sortKey: { name: "sk", type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      pointInTimeRecovery: true
    });

    // Add a GSI for querying by status
    table.addGlobalSecondaryIndex({
      indexName: "StatusIndex",
      partitionKey: { name: "status", type: dynamodb.AttributeType.STRING },
      sortKey: { name: "createdAt", type: dynamodb.AttributeType.STRING },
      projectionType: dynamodb.ProjectionType.ALL
    });

    // Lambda function with table name injected via environment
    var handler = new lambda.Function(this, "ApiHandler", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda/api"),
      environment: {
        TABLE_NAME: table.tableName,
        NODE_OPTIONS: "--enable-source-maps"
      },
      memorySize: 256,
      timeout: cdk.Duration.seconds(30),
      tracing: lambda.Tracing.ACTIVE
    });

    // Grant the Lambda read/write access to the table
    table.grantReadWriteData(handler);

    // REST API with stage-level throttling and CORS preflight
    var api = new apigateway.RestApi(this, "ItemsApi", {
      restApiName: "Items Service",
      deployOptions: {
        stageName: "v1",
        throttlingRateLimit: 100,
        throttlingBurstLimit: 50
      },
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS
      }
    });

    var items = api.root.addResource("items");
    var integration = new apigateway.LambdaIntegration(handler);

    items.addMethod("GET", integration);
    items.addMethod("POST", integration);

    var singleItem = items.addResource("{id}");
    singleItem.addMethod("GET", integration);
    singleItem.addMethod("PUT", integration);
    singleItem.addMethod("DELETE", integration);

    new cdk.CfnOutput(this, "ApiUrl", { value: api.url });
  }
}

The table.grantReadWriteData(handler) call is doing heavy lifting here. It generates an IAM policy limited to the DynamoDB read and write actions the function needs, scoped to that table and its indexes rather than to every table in the account. You never hand-write a policy document, and you never reach for a blanket * resource.
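The stack points at lambda/api but the handler itself is not shown. As a rough sketch of what it might look like with the proxy integration used above: createHandler, createMemoryStore, and the store interface are hypothetical names introduced here for illustration, and the in-memory store stands in for DynamoDB calls against process.env.TABLE_NAME so the routing can be tried locally.

```javascript
// Hypothetical sketch of lambda/api/index.js. `store` is an assumed abstraction;
// a real implementation would wrap DynamoDB calls using process.env.TABLE_NAME.
function createHandler(store) {
  function respond(statusCode, body) {
    return {
      statusCode: statusCode,
      headers: { "Content-Type": "application/json" },
      body: body === undefined ? "" : JSON.stringify(body)
    };
  }

  return async function handler(event) {
    // With a proxy integration, API Gateway passes the matched resource
    // template (for example "/items/{id}") and the HTTP method in the event.
    var id = event.pathParameters && event.pathParameters.id;
    var route = event.httpMethod + " " + event.resource;

    switch (route) {
      case "GET /items":
        return respond(200, await store.list());
      case "POST /items":
        return respond(201, await store.put(JSON.parse(event.body || "{}")));
      case "GET /items/{id}": {
        var item = await store.get(id);
        return item ? respond(200, item) : respond(404, { message: "Not found" });
      }
      case "PUT /items/{id}": {
        var update = Object.assign({}, JSON.parse(event.body || "{}"), { id: id });
        return respond(200, await store.put(update));
      }
      case "DELETE /items/{id}":
        await store.remove(id);
        return respond(204);
      default:
        return respond(405, { message: "Unsupported route" });
    }
  };
}

// In-memory stand-in for the table, for local experimentation only.
function createMemoryStore() {
  var items = {};
  return {
    list: async function () {
      return Object.keys(items).map(function (key) { return items[key]; });
    },
    get: async function (id) { return items[id] || null; },
    put: async function (item) { items[item.id] = item; return item; },
    remove: async function (id) { delete items[id]; }
  };
}
```

Injecting the store keeps the routing logic testable without AWS credentials; in the deployed function you would export createHandler(...) with a DynamoDB-backed store as exports.handler.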

Static Website with CloudFront

Hosting a static site on S3 behind CloudFront is another pattern I deploy constantly. CDK handles the certificate validation, DNS records, and origin access identity automatically.

var cdk = require("aws-cdk-lib");
var s3 = require("aws-cdk-lib/aws-s3");
var cloudfront = require("aws-cdk-lib/aws-cloudfront");
var origins = require("aws-cdk-lib/aws-cloudfront-origins");
var s3deploy = require("aws-cdk-lib/aws-s3-deployment");
var acm = require("aws-cdk-lib/aws-certificatemanager");
var route53 = require("aws-cdk-lib/aws-route53");
var targets = require("aws-cdk-lib/aws-route53-targets");

class StaticSiteStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    var domainName = props.domainName;

    // S3 bucket - no public access, CloudFront handles that
    var siteBucket = new s3.Bucket(this, "SiteBucket", {
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
      encryption: s3.BucketEncryption.S3_MANAGED
    });

    // Look up the hosted zone (requires the stack's env account and region to be set)
    var zone = route53.HostedZone.fromLookup(this, "Zone", {
      domainName: domainName
    });

    // TLS certificate (must be in us-east-1 for CloudFront, so deploy this stack there)
    var certificate = new acm.Certificate(this, "SiteCert", {
      domainName: domainName,
      subjectAlternativeNames: ["www." + domainName],
      validation: acm.CertificateValidation.fromDns(zone)
    });

    // CloudFront distribution
    var distribution = new cloudfront.Distribution(this, "SiteDistribution", {
      defaultBehavior: {
        origin: new origins.S3Origin(siteBucket),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
        cachePolicy: cloudfront.CachePolicy.CACHING_OPTIMIZED
      },
      domainNames: [domainName, "www." + domainName],
      certificate: certificate,
      defaultRootObject: "index.html",
      errorResponses: [
        // S3 answers 403 (not 404) for a missing key when the origin identity
        // cannot list the bucket, so map both codes to the SPA entry point.
        {
          httpStatus: 403,
          responseHttpStatus: 200,
          responsePagePath: "/index.html",
          ttl: cdk.Duration.minutes(5)
        },
        {
          httpStatus: 404,
          responseHttpStatus: 200,
          responsePagePath: "/index.html",
          ttl: cdk.Duration.minutes(5)
        }
      ]
    });

    // DNS records pointing to CloudFront (apex and www)
    new route53.ARecord(this, "SiteAlias", {
      zone: zone,
      recordName: domainName,
      target: route53.RecordTarget.fromAlias(
        new targets.CloudFrontTarget(distribution)
      )
    });

    new route53.ARecord(this, "WwwAlias", {
      zone: zone,
      recordName: "www." + domainName,
      target: route53.RecordTarget.fromAlias(
        new targets.CloudFrontTarget(distribution)
      )
    });

    // Deploy site contents to S3 and invalidate CloudFront cache
    new s3deploy.BucketDeployment(this, "DeploySite", {
      sources: [s3deploy.Source.asset("./site-contents")],
      destinationBucket: siteBucket,
      distribution: distribution,
      distributionPaths: ["/*"]
    });
  }
}

Notice the error response configuration. For single-page applications, you want CloudFront to answer missing-object errors with index.html and a 200 status so client-side routing can handle the path. Both 403 and 404 are mapped because S3 returns 403 for a missing key when the origin identity has no permission to list the bucket. Without this, refreshing a deep link returns a CloudFront error page.

VPC with Bastion Host Pattern

When you need private resources accessible only through a jump box, CDK simplifies the VPC setup considerably.

var cdk = require("aws-cdk-lib");
var ec2 = require("aws-cdk-lib/aws-ec2");

class VpcBastionStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    var vpc = new ec2.Vpc(this, "AppVpc", {
      maxAzs: 2,
      natGateways: 1,
      subnetConfiguration: [
        { name: "Public", subnetType: ec2.SubnetType.PUBLIC, cidrMask: 24 },
        { name: "Private", subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS, cidrMask: 24 },
        { name: "Isolated", subnetType: ec2.SubnetType.PRIVATE_ISOLATED, cidrMask: 24 }
      ]
    });

    // Bastion host in public subnet with SSM access (no SSH keys needed)
    var bastion = new ec2.BastionHostLinux(this, "BastionHost", {
      vpc: vpc,
      subnetSelection: { subnetType: ec2.SubnetType.PUBLIC },
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MICRO)
    });

    // Outbound PostgreSQL access from the bastion to addresses inside the VPC.
    // The bastion's default security group already allows all outbound traffic,
    // so this mainly documents intent; the database's own security group must
    // still allow inbound 5432 from the bastion.
    bastion.connections.allowTo(
      ec2.Peer.ipv4(vpc.vpcCidrBlock),
      ec2.Port.tcp(5432),
      "Allow PostgreSQL access from bastion"
    );

    // Output descriptions must be literal strings, so the instance ID token
    // cannot be concatenated into the description.
    new cdk.CfnOutput(this, "BastionInstanceId", {
      value: bastion.instanceId,
      description: "Bastion instance ID. Use: aws ssm start-session --target <instance-id>"
    });
  }
}

The BastionHostLinux construct uses AWS Systems Manager Session Manager instead of SSH keys. This eliminates the need to manage key pairs and opens no inbound ports. You connect via aws ssm start-session and get a fully audited shell session.

Event-Driven Processing Pattern

Decoupling producers from consumers with SQS and SNS is fundamental to resilient architectures. Here is a pattern where an API publishes events to SNS, which fans out to multiple SQS queues for different processors.

var cdk = require("aws-cdk-lib");
var sns = require("aws-cdk-lib/aws-sns");
var sqs = require("aws-cdk-lib/aws-sqs");
var subscriptions = require("aws-cdk-lib/aws-sns-subscriptions");
var lambdaEventSources = require("aws-cdk-lib/aws-lambda-event-sources");
var lambda = require("aws-cdk-lib/aws-lambda");

class EventProcessingStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    // Central event topic
    var orderTopic = new sns.Topic(this, "OrderEvents", {
      topicName: "order-events"
    });

    // Dead letter queue for failed processing
    var dlq = new sqs.Queue(this, "OrderDLQ", {
      retentionPeriod: cdk.Duration.days(14)
    });

    // Inventory processing queue
    var inventoryQueue = new sqs.Queue(this, "InventoryQueue", {
      visibilityTimeout: cdk.Duration.seconds(300),
      deadLetterQueue: { queue: dlq, maxReceiveCount: 3 }
    });

    // Notification processing queue
    var notificationQueue = new sqs.Queue(this, "NotificationQueue", {
      visibilityTimeout: cdk.Duration.seconds(60),
      deadLetterQueue: { queue: dlq, maxReceiveCount: 3 }
    });

    // Subscribe queues to the topic with filters
    orderTopic.addSubscription(
      new subscriptions.SqsSubscription(inventoryQueue, {
        filterPolicy: {
          eventType: sns.SubscriptionFilter.stringFilter({
            allowlist: ["ORDER_PLACED", "ORDER_CANCELLED"]
          })
        }
      })
    );

    orderTopic.addSubscription(
      new subscriptions.SqsSubscription(notificationQueue)
    );

    // Lambda processors
    var inventoryProcessor = new lambda.Function(this, "InventoryProcessor", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "inventory.handler",
      code: lambda.Code.fromAsset("lambda/processors"),
      timeout: cdk.Duration.minutes(5),
      reservedConcurrentExecutions: 10
    });

    var notificationProcessor = new lambda.Function(this, "NotificationProcessor", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "notification.handler",
      code: lambda.Code.fromAsset("lambda/processors"),
      timeout: cdk.Duration.seconds(30)
    });

    // Wire Lambda to SQS with batch processing
    inventoryProcessor.addEventSource(
      new lambdaEventSources.SqsEventSource(inventoryQueue, {
        batchSize: 10,
        maxBatchingWindow: cdk.Duration.seconds(30),
        reportBatchItemFailures: true
      })
    );

    notificationProcessor.addEventSource(
      new lambdaEventSources.SqsEventSource(notificationQueue, {
        batchSize: 1
      })
    );
  }
}

Two things to highlight. First, reportBatchItemFailures: true enables partial batch failure reporting. Without this, a single failed message causes the entire batch to retry, leading to duplicate processing. Second, reservedConcurrentExecutions: 10 on the inventory processor prevents it from consuming all available Lambda concurrency in your account during traffic spikes.
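Enabling reportBatchItemFailures only helps if the handler actually reports which messages failed. A minimal sketch of what inventory.handler might look like under that setting follows; processOrderEvent is a hypothetical placeholder for the real business logic, and the double JSON.parse assumes the default SNS-to-SQS envelope (raw message delivery not enabled).

```javascript
// Hypothetical placeholder for the real inventory logic.
async function processOrderEvent(orderEvent) {
  if (!orderEvent.orderId) {
    throw new Error("orderId is required");
  }
}

async function handler(event) {
  var batchItemFailures = [];

  for (var i = 0; i < event.Records.length; i++) {
    var record = event.Records[i];
    try {
      // SNS wraps the published payload in an envelope; the original message
      // is the JSON string stored in the envelope's Message field.
      var envelope = JSON.parse(record.body);
      var orderEvent = JSON.parse(envelope.Message);
      await processOrderEvent(orderEvent);
    } catch (err) {
      console.error("Failed to process message", record.messageId, err.message);
      // Only the message IDs listed here are returned to the queue for retry.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }

  return { batchItemFailures: batchItemFailures };
}

// In the Lambda source file, export it with: exports.handler = handler;
```

Returning an empty batchItemFailures array marks the whole batch as successful, while throwing from the handler still fails the entire batch, so per-message errors are caught inside the loop.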

Scheduled Task Pattern

Running periodic tasks with EventBridge rules and Lambda is cleaner than managing cron jobs on EC2.

var cdk = require("aws-cdk-lib");
var lambda = require("aws-cdk-lib/aws-lambda");
var events = require("aws-cdk-lib/aws-events");
var eventsTargets = require("aws-cdk-lib/aws-events-targets");

class ScheduledTaskStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    var cleanupFunction = new lambda.Function(this, "CleanupFunction", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "cleanup.handler",
      code: lambda.Code.fromAsset("lambda/scheduled"),
      timeout: cdk.Duration.minutes(15),
      memorySize: 512,
      environment: {
        RETENTION_DAYS: "90"
      }
    });

    // Run every day at 3 AM UTC
    var rule = new events.Rule(this, "DailyCleanupRule", {
      schedule: events.Schedule.cron({
        minute: "0",
        hour: "3",
        month: "*",
        weekDay: "*",
        year: "*"
      })
    });

    rule.addTarget(new eventsTargets.LambdaFunction(cleanupFunction, {
      retryAttempts: 2,
      maxEventAge: cdk.Duration.hours(1)
    }));

    // Also allow manual triggering via a separate rule
    var manualRule = new events.Rule(this, "ManualTriggerRule", {
      eventPattern: {
        source: ["custom.cleanup"],
        detailType: ["ManualTrigger"]
      }
    });

    manualRule.addTarget(new eventsTargets.LambdaFunction(cleanupFunction));
  }
}

The manual trigger rule lets operators invoke the cleanup on demand via the AWS CLI: aws events put-events --entries '[{"Source":"custom.cleanup","DetailType":"ManualTrigger","Detail":"{}"}]'. This is useful during incident response when you cannot wait for the next scheduled run.
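Because both rules target the same function, the handler may want to know which one fired and how far back to clean. The sketch below shows one way cleanup.handler could do that; retentionCutoff and triggerType are hypothetical helper names, and the actual deletion step is omitted because the article does not describe the data being cleaned up.

```javascript
var MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the cutoff date: records older than this are eligible for cleanup.
function retentionCutoff(now, retentionDays) {
  var days = parseInt(retentionDays, 10);
  if (!Number.isFinite(days) || days <= 0) {
    throw new Error("RETENTION_DAYS must be a positive integer, got: " + retentionDays);
  }
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// Scheduled rules deliver events with source "aws.events"; the manual rule
// in the stack above matches source "custom.cleanup".
function triggerType(event) {
  return event.source === "custom.cleanup" ? "manual" : "scheduled";
}

async function handler(event) {
  var cutoff = retentionCutoff(new Date(), process.env.RETENTION_DAYS || "90");
  var trigger = triggerType(event);
  console.log("Cleanup (" + trigger + ") for records older than " + cutoff.toISOString());
  // The deletion logic would run here.
  return { trigger: trigger, cutoff: cutoff.toISOString() };
}

// In the Lambda source file, export it with: exports.handler = handler;
```

Validating RETENTION_DAYS up front means a misconfigured environment variable fails loudly instead of silently deleting with a nonsensical cutoff.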

Multi-Environment Deployment

Deploying the same stack across dev, staging, and production requires parameterization. CDK handles this through stack props, not CloudFormation parameters.

var cdk = require("aws-cdk-lib");

// bin/app.js - Entry point
// ApiStack is assumed to be your own stack class, defined elsewhere in the project.
var app = new cdk.App();

var environments = {
  dev: {
    account: "111111111111",
    region: "us-east-1",
    domainName: "dev.example.com",
    instanceSize: "small",
    enableAlarms: false
  },
  staging: {
    account: "222222222222",
    region: "us-east-1",
    domainName: "staging.example.com",
    instanceSize: "medium",
    enableAlarms: true
  },
  prod: {
    account: "333333333333",
    region: "us-east-1",
    domainName: "example.com",
    instanceSize: "large",
    enableAlarms: true
  }
};

var envName = app.node.tryGetContext("env") || "dev";
var config = environments[envName];

new ApiStack(app, "ApiStack-" + envName, {
  env: { account: config.account, region: config.region },
  config: config,
  envName: envName
});

Deploy to a specific environment with:

cdk deploy -c env=staging

Inside the stack, use the config to vary resource sizes, alarm thresholds, and removal policies:

var removalPolicy = props.envName === "prod"
  ? cdk.RemovalPolicy.RETAIN
  : cdk.RemovalPolicy.DESTROY;

var table = new dynamodb.Table(this, "DataTable", {
  partitionKey: { name: "pk", type: dynamodb.AttributeType.STRING },
  removalPolicy: removalPolicy,
  pointInTimeRecovery: props.envName === "prod"
});

Avoid CloudFormation parameters in CDK apps. They are resolved at deploy time, so CDK cannot branch on their values during synthesis (for example, to pick a removal policy or skip an alarm). Use context values or stack props instead.
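One gap in the entry point earlier in this section: it indexes the environments map with whatever was passed to -c env, so a typo such as -c env=stagin leaves config undefined and fails later with an unhelpful TypeError. A small guard can surface that earlier. resolveEnvironment is a hypothetical helper written for this article, not part of the CDK API.

```javascript
// Hypothetical helper: look up an environment by name and fail fast on typos.
function resolveEnvironment(environments, requestedName) {
  var name = requestedName || "dev";
  // hasOwnProperty avoids accepting inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(environments, name)) {
    throw new Error(
      "Unknown environment '" + name + "'. Expected one of: " +
      Object.keys(environments).join(", ")
    );
  }
  return { name: name, config: environments[name] };
}

// Usage in bin/app.js, assuming the `environments` map shown earlier:
//   var resolved = resolveEnvironment(environments, app.node.tryGetContext("env"));
//   var envName = resolved.name;
//   var config = resolved.config;
```

Failing during synthesis keeps a mistyped environment name from ever reaching a deployment step.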

Cross-Stack References

When your infrastructure grows beyond a single stack, you need to share resources between stacks. CDK handles this through CloudFormation exports under the hood.

var cdk = require("aws-cdk-lib");
var ec2 = require("aws-cdk-lib/aws-ec2");
var lambda = require("aws-cdk-lib/aws-lambda");

// Network stack exposes the VPC as a property
class NetworkStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    this.vpc = new ec2.Vpc(this, "SharedVpc", {
      maxAzs: 2,
      natGateways: 1
    });
  }
}

// Application stack consumes the VPC
class ApplicationStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    var fn = new lambda.Function(this, "AppFunction", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda/app"),
      vpc: props.vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }
    });
  }
}

// Wire them together in the app entry point.
// envConfig is assumed to be an { account, region } object shared by both stacks.
var app = new cdk.App();
var networkStack = new NetworkStack(app, "NetworkStack", { env: envConfig });
new ApplicationStack(app, "ApplicationStack", {
  env: envConfig,
  vpc: networkStack.vpc
});

Be cautious with cross-stack references. Once another stack imports a CloudFormation export, you cannot change or remove that exported value, or delete the exporting stack, until every importing stack drops its reference. For frequently changing resources, use SSM Parameter Store lookups instead of direct cross-stack references.

Custom Construct Libraries

The real return on CDK investment comes from building reusable construct libraries that encode your organization's standards.

var cdk = require("aws-cdk-lib");
var lambda = require("aws-cdk-lib/aws-lambda");
var logs = require("aws-cdk-lib/aws-logs");
var cloudwatch = require("aws-cdk-lib/aws-cloudwatch");
var constructs = require("constructs");

class StandardLambda extends constructs.Construct {
  constructor(scope, id, props) {
    super(scope, id);

    var defaults = {
      runtime: lambda.Runtime.NODEJS_18_X,
      memorySize: 256,
      timeout: cdk.Duration.seconds(30),
      tracing: lambda.Tracing.ACTIVE,
      logRetention: logs.RetentionDays.TWO_WEEKS,
      environment: {
        NODE_OPTIONS: "--enable-source-maps",
        LOG_LEVEL: props.logLevel || "info"
      }
    };

    // Merge user-provided props with defaults (environment is merged key by key)
    var mergedProps = Object.assign({}, defaults, props);
    mergedProps.environment = Object.assign(
      {},
      defaults.environment,
      props.environment || {}
    );

    this.fn = new lambda.Function(this, "Function", mergedProps);

    // Every Lambda gets an error alarm (metricErrors is a count, not a rate)
    this.errorAlarm = new cloudwatch.Alarm(this, "ErrorAlarm", {
      metric: this.fn.metricErrors({ period: cdk.Duration.minutes(5) }),
      threshold: props.errorThreshold || 5,
      evaluationPeriods: 1,
      alarmDescription: "Lambda " + id + " error count exceeded threshold",
      treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING
    });

    // And a duration alarm at 80% of timeout
    var timeoutMs = (props.timeout || cdk.Duration.seconds(30)).toMilliseconds();
    this.durationAlarm = new cloudwatch.Alarm(this, "DurationAlarm", {
      metric: this.fn.metricDuration({ period: cdk.Duration.minutes(5), statistic: "p99" }),
      threshold: timeoutMs * 0.8,
      evaluationPeriods: 3,
      alarmDescription: "Lambda " + id + " approaching timeout"
    });
  }
}

// Usage
var myFunc = new StandardLambda(this, "OrderProcessor", {
  handler: "orders.handler",
  code: lambda.Code.fromAsset("lambda/orders"),
  errorThreshold: 2,
  environment: { ORDER_TABLE: table.tableName }
});

table.grantReadWriteData(myFunc.fn);

Publish these constructs as private npm packages. Every team in your organization gets production-grade observability baked in without thinking about it.
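The defaults-merging step is the part of a shared construct most worth unit-testing, since a mistake there silently changes every function that uses it. One option is to pull it into a pure function; mergeLambdaProps below is a hypothetical extraction of the same Object.assign logic used in StandardLambda, shown as a sketch rather than a required refactor.

```javascript
// Hypothetical extraction of the merge logic from StandardLambda.
// Top-level props override defaults; environment variables are merged
// key by key so callers add to the defaults instead of replacing them.
function mergeLambdaProps(defaults, props) {
  var merged = Object.assign({}, defaults, props);
  merged.environment = Object.assign(
    {},
    defaults.environment,
    props.environment || {}
  );
  return merged;
}

var exampleDefaults = {
  memorySize: 256,
  environment: { NODE_OPTIONS: "--enable-source-maps", LOG_LEVEL: "info" }
};

var exampleMerged = mergeLambdaProps(exampleDefaults, {
  memorySize: 512,
  environment: { ORDER_TABLE: "orders" }
});

// exampleMerged keeps both default variables and gains ORDER_TABLE.
console.log(exampleMerged);
```

Because the function has no CDK dependency, it can be tested in milliseconds without synthesizing a stack.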

CDK Aspects for Compliance

Aspects let you apply cross-cutting concerns across your entire CDK app. They visit every construct in the tree and can modify resources or flag violations.

var cdk = require("aws-cdk-lib");
var s3 = require("aws-cdk-lib/aws-s3");

function ComplianceAspect() {}

ComplianceAspect.prototype.visit = function(node) {
  // Ensure all S3 buckets have encryption
  if (node instanceof s3.CfnBucket) {
    if (!node.bucketEncryption) {
      cdk.Annotations.of(node).addError(
        "S3 bucket must have encryption enabled. Add encryption configuration."
      );
    }
  }

  // Ensure all buckets block public access
  if (node instanceof s3.CfnBucket) {
    if (!node.publicAccessBlockConfiguration) {
      cdk.Annotations.of(node).addWarning(
        "S3 bucket should have public access block configured."
      );
    }
  }

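  // Tag all taggable resources. Tags are set on the resource's TagManager
  // directly because cdk.Tags.of(node).add() registers another aspect, and
  // aspects added while aspects are already running may not be applied.
  if (cdk.TagManager.isTaggable(node)) {
    node.tags.setTag("ManagedBy", "CDK");
    node.tags.setTag("Team", "platform");
  }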
};

// Apply to the entire app
cdk.Aspects.of(app).add(new ComplianceAspect());

Aspects run during synthesis. Using addError causes cdk synth to fail, which blocks deployment in CI/CD pipelines. Use addWarning for recommendations that should not block deployment.
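If the visitor idea is unfamiliar, it may help to see it stripped of CDK entirely. The following is a simplified model using plain objects, not CDK's implementation: applyAspect, the tree shape, and the encrypted flag are all invented for illustration. It walks a tree parent-first and collects findings, which is roughly what happens when an aspect reports errors during synthesis.

```javascript
// Simplified model of aspect traversal; not CDK's actual implementation.
function applyAspect(node, aspect, findings) {
  aspect.visit(node, findings);
  (node.children || []).forEach(function (child) {
    applyAspect(child, aspect, findings);
  });
  return findings;
}

// An "aspect" is any object with a visit method.
var encryptionAspect = {
  visit: function (node, findings) {
    if (node.type === "Bucket" && !node.encrypted) {
      findings.push({
        path: node.path,
        level: "error",
        message: "S3 bucket must have encryption enabled."
      });
    }
  }
};

var tree = {
  path: "App",
  children: [
    {
      path: "App/Stack",
      children: [
        { path: "App/Stack/AssetsBucket", type: "Bucket", encrypted: true },
        { path: "App/Stack/LogsBucket", type: "Bucket", encrypted: false }
      ]
    }
  ]
};

var findings = applyAspect(tree, encryptionAspect, []);
console.log(findings);
```

In this model a non-empty list of error-level findings plays the role that addError plays in CDK: it is the signal a pipeline would use to stop before deployment.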

Escape Hatches for Unsupported Features

When CDK's L2 constructs do not expose a property you need, escape hatches let you reach down to the L1 (CloudFormation) level.

var lambda = require("aws-cdk-lib/aws-lambda");

var fn = new lambda.Function(this, "MyFunction", {
  runtime: lambda.Runtime.NODEJS_18_X,
  handler: "index.handler",
  code: lambda.Code.fromAsset("lambda/app")
});

// Access the underlying CloudFormation resource
var cfnFunction = fn.node.defaultChild;

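// Set properties not exposed by the L2 construct.
// SnapStart is used here only to illustrate the mechanism: Lambda accepts it
// only on runtimes that support SnapStart (Java at launch), so substitute a
// property that applies to your runtime.
cfnFunction.addPropertyOverride("SnapStart", {
  ApplyOn: "PublishedVersions"
});

// addOverride reaches any part of the resource definition, not only Properties
cfnFunction.addOverride("DependsOn", ["SomeOtherResource"]);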

Escape hatches are a pragmatic solution. AWS adds new features faster than CDK's L2 constructs can keep up. Rather than waiting for an L2 update, drop down to L1 for the specific property you need and keep everything else at L2.

CDK Nag for Security Scanning

cdk-nag runs rule packs against your CDK app during synthesis, catching security issues before deployment.

npm install cdk-nag

var cdk = require("aws-cdk-lib");
var cdkNag = require("cdk-nag");

var app = new cdk.App();
var stack = new MyStack(app, "MyStack");

// Apply AWS Solutions security checks
cdk.Aspects.of(app).add(new cdkNag.AwsSolutionsChecks({ verbose: true }));

// Suppress specific rules when you have a valid reason
cdkNag.NagSuppressions.addStackSuppressions(stack, [
  {
    id: "AwsSolutions-IAM4",
    reason: "Using AWS managed policies for Lambda basic execution role is acceptable"
  }
]);

// Suppress at the resource level for more granular control
cdkNag.NagSuppressions.addResourceSuppressions(
  myLambda,
  [
    {
      id: "AwsSolutions-L1",
      reason: "Node 18 is our current standard, will upgrade to 20 next quarter"
    }
  ],
  true // Apply to children
);

I run cdk-nag in CI alongside unit tests. It catches issues like overly permissive IAM policies, unencrypted storage, and missing logging configuration. The suppression mechanism ensures you can document intentional exceptions rather than silently ignoring them.

Complete Working Example: Microservice Architecture

Here is a full CDK app that deploys a microservice with API Gateway, Lambda, DynamoDB, SQS for async processing, and CloudWatch alarms. This is a production-ready pattern I have deployed variations of dozens of times.

var cdk = require("aws-cdk-lib");
var apigateway = require("aws-cdk-lib/aws-apigateway");
var lambda = require("aws-cdk-lib/aws-lambda");
var dynamodb = require("aws-cdk-lib/aws-dynamodb");
var sqs = require("aws-cdk-lib/aws-sqs");
var cloudwatch = require("aws-cdk-lib/aws-cloudwatch");
var cloudwatchActions = require("aws-cdk-lib/aws-cloudwatch-actions");
var sns = require("aws-cdk-lib/aws-sns");
var snsSubscriptions = require("aws-cdk-lib/aws-sns-subscriptions");
var lambdaEventSources = require("aws-cdk-lib/aws-lambda-event-sources");
var logs = require("aws-cdk-lib/aws-logs");
var constructs = require("constructs");

class MicroserviceStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    var config = props.config || {};
    var envName = props.envName || "dev";

    // ---- Data Layer ----

    var ordersTable = new dynamodb.Table(this, "OrdersTable", {
      tableName: envName + "-orders",
      partitionKey: { name: "orderId", type: dynamodb.AttributeType.STRING },
      sortKey: { name: "createdAt", type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: envName === "prod"
        ? cdk.RemovalPolicy.RETAIN
        : cdk.RemovalPolicy.DESTROY,
      pointInTimeRecovery: envName === "prod",
      stream: dynamodb.StreamViewType.NEW_AND_OLD_IMAGES
    });

    ordersTable.addGlobalSecondaryIndex({
      indexName: "CustomerIndex",
      partitionKey: { name: "customerId", type: dynamodb.AttributeType.STRING },
      sortKey: { name: "createdAt", type: dynamodb.AttributeType.STRING },
      projectionType: dynamodb.ProjectionType.ALL
    });

    // ---- Async Processing Layer ----

    var processingDlq = new sqs.Queue(this, "ProcessingDLQ", {
      queueName: envName + "-order-processing-dlq",
      retentionPeriod: cdk.Duration.days(14)
    });

    var processingQueue = new sqs.Queue(this, "ProcessingQueue", {
      queueName: envName + "-order-processing",
      visibilityTimeout: cdk.Duration.seconds(300),
      deadLetterQueue: {
        queue: processingDlq,
        maxReceiveCount: 3
      }
    });

    // ---- Lambda Functions ----

    var commonEnv = {
      TABLE_NAME: ordersTable.tableName,
      QUEUE_URL: processingQueue.queueUrl,
      NODE_OPTIONS: "--enable-source-maps",
      ENVIRONMENT: envName
    };

    var apiHandler = new lambda.Function(this, "ApiHandler", {
      functionName: envName + "-orders-api",
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "api.handler",
      code: lambda.Code.fromAsset("lambda/orders"),
      memorySize: 256,
      timeout: cdk.Duration.seconds(30),
      tracing: lambda.Tracing.ACTIVE,
      logRetention: logs.RetentionDays.TWO_WEEKS,
      environment: commonEnv
    });

    var processorHandler = new lambda.Function(this, "ProcessorHandler", {
      functionName: envName + "-orders-processor",
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "processor.handler",
      code: lambda.Code.fromAsset("lambda/orders"),
      memorySize: 512,
      timeout: cdk.Duration.minutes(5),
      tracing: lambda.Tracing.ACTIVE,
      logRetention: logs.RetentionDays.TWO_WEEKS,
      reservedConcurrentExecutions: 10,
      environment: commonEnv
    });

    // ---- Permissions ----

    ordersTable.grantReadWriteData(apiHandler);
    ordersTable.grantReadWriteData(processorHandler);
    processingQueue.grantSendMessages(apiHandler);
    processingQueue.grantConsumeMessages(processorHandler);

    // ---- Event Sources ----

    processorHandler.addEventSource(
      new lambdaEventSources.SqsEventSource(processingQueue, {
        batchSize: 10,
        maxBatchingWindow: cdk.Duration.seconds(30),
        reportBatchItemFailures: true
      })
    );

    // ---- API Gateway ----

    var api = new apigateway.RestApi(this, "OrdersApi", {
      restApiName: envName + "-orders-api",
      // Stage logging needs an account-level CloudWatch Logs role for API Gateway;
      // this asks CDK to create one if the account does not already have it.
      cloudWatchRole: true,
      deployOptions: {
        stageName: "v1",
        throttlingRateLimit: envName === "prod" ? 1000 : 100,
        throttlingBurstLimit: envName === "prod" ? 500 : 50,
        metricsEnabled: true,
        loggingLevel: apigateway.MethodLoggingLevel.INFO,
        dataTraceEnabled: envName !== "prod",
        tracingEnabled: true
      },
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS,
        allowHeaders: ["Content-Type", "Authorization", "X-Api-Key"]
      }
    });

    var ordersResource = api.root.addResource("orders");
    var lambdaIntegration = new apigateway.LambdaIntegration(apiHandler, {
      proxy: true
    });

    ordersResource.addMethod("GET", lambdaIntegration);
    ordersResource.addMethod("POST", lambdaIntegration);

    var singleOrder = ordersResource.addResource("{orderId}");
    singleOrder.addMethod("GET", lambdaIntegration);
    singleOrder.addMethod("PUT", lambdaIntegration);

    // ---- Monitoring & Alarms ----

    var alarmTopic = new sns.Topic(this, "AlarmTopic", {
      topicName: envName + "-order-alarms"
    });

    if (config.alertEmail) {
      alarmTopic.addSubscription(
        new snsSubscriptions.EmailSubscription(config.alertEmail)
      );
    }

    var alarmAction = new cloudwatchActions.SnsAction(alarmTopic);

    // API error count alarm
    var apiErrorAlarm = new cloudwatch.Alarm(this, "ApiErrorAlarm", {
      alarmName: envName + "-orders-api-errors",
      metric: apiHandler.metricErrors({
        period: cdk.Duration.minutes(5),
        statistic: "Sum"
      }),
      threshold: envName === "prod" ? 10 : 50,
      evaluationPeriods: 2,
      treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
      alarmDescription: "Order API Lambda error count is elevated"
    });
    apiErrorAlarm.addAlarmAction(alarmAction);

    // API latency alarm (p99)
    var latencyAlarm = new cloudwatch.Alarm(this, "ApiLatencyAlarm", {
      alarmName: envName + "-orders-api-latency",
      metric: apiHandler.metricDuration({
        period: cdk.Duration.minutes(5),
        statistic: "p99"
      }),
      threshold: 5000,
      evaluationPeriods: 3,
      treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
      alarmDescription: "Order API p99 latency exceeding 5 seconds"
    });
    latencyAlarm.addAlarmAction(alarmAction);

    // DLQ depth alarm - messages that failed processing
    var dlqAlarm = new cloudwatch.Alarm(this, "DlqDepthAlarm", {
      alarmName: envName + "-orders-dlq-depth",
      metric: processingDlq.metricApproximateNumberOfMessagesVisible({
        period: cdk.Duration.minutes(5)
      }),
      threshold: 1,
      evaluationPeriods: 1,
      treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
      alarmDescription: "Messages are appearing in the dead letter queue"
    });
    dlqAlarm.addAlarmAction(alarmAction);

    // Processing queue age alarm
    var queueAgeAlarm = new cloudwatch.Alarm(this, "QueueAgeAlarm", {
      alarmName: envName + "-orders-queue-age",
      metric: processingQueue.metricApproximateAgeOfOldestMessage({
        period: cdk.Duration.minutes(5)
      }),
      threshold: 300,
      evaluationPeriods: 2,
      treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
      alarmDescription: "Messages are sitting in the processing queue too long"
    });
    queueAgeAlarm.addAlarmAction(alarmAction);

    // ---- Dashboard ----

    var dashboard = new cloudwatch.Dashboard(this, "ServiceDashboard", {
      dashboardName: envName + "-orders-dashboard"
    });

    dashboard.addWidgets(
      new cloudwatch.GraphWidget({
        title: "API Invocations & Errors",
        left: [
          apiHandler.metricInvocations({ period: cdk.Duration.minutes(5) }),
          apiHandler.metricErrors({ period: cdk.Duration.minutes(5) })
        ],
        width: 12
      }),
      new cloudwatch.GraphWidget({
        title: "API Latency",
        left: [
          apiHandler.metricDuration({ period: cdk.Duration.minutes(5), statistic: "avg" }),
          apiHandler.metricDuration({ period: cdk.Duration.minutes(5), statistic: "p99" })
        ],
        width: 12
      }),
      new cloudwatch.GraphWidget({
        title: "Queue Metrics",
        left: [
          processingQueue.metricApproximateNumberOfMessagesVisible({ period: cdk.Duration.minutes(5) }),
          processingQueue.metricApproximateAgeOfOldestMessage({ period: cdk.Duration.minutes(5) })
        ],
        width: 12
      }),
      new cloudwatch.GraphWidget({
        title: "DLQ Messages",
        left: [
          processingDlq.metricApproximateNumberOfMessagesVisible({ period: cdk.Duration.minutes(5) })
        ],
        width: 12
      })
    );

    // ---- Outputs ----

    new cdk.CfnOutput(this, "ApiEndpoint", {
      value: api.url,
      description: "Orders API endpoint URL"
    });

    new cdk.CfnOutput(this, "TableName", {
      value: ordersTable.tableName,
      description: "DynamoDB table name"
    });

    new cdk.CfnOutput(this, "DashboardUrl", {
      value: "https://" + this.region + ".console.aws.amazon.com/cloudwatch/home#dashboards:name=" + envName + "-orders-dashboard",
      description: "CloudWatch dashboard URL"
    });
  }
}

// ---- App Entry Point ----

var app = new cdk.App();
var envName = app.node.tryGetContext("env") || "dev";

var envConfigs = {
  dev: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION || "us-east-1"
  },
  prod: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: "us-east-1"
  }
};

new MicroserviceStack(app, "OrderService-" + envName, {
  env: envConfigs[envName] || envConfigs.dev,
  envName: envName,
  config: {
    alertEmail: app.node.tryGetContext("alertEmail")
  }
});

app.synth();

Deploy it with:

cdk deploy -c env=dev -c alertEmail=you@example.com

This single stack gives you a REST API, an async processing pipeline, a dead letter queue for poison messages, four CloudWatch alarms covering errors, latency, DLQ depth, and queue age, plus a pre-built dashboard. Every resource is properly scoped with IAM least-privilege permissions.
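One thing to watch in the entry point above: envConfigs[envName] || envConfigs.dev quietly falls back to the dev settings when the env context value is misspelled, while the stack name still carries the misspelled value. Below is a minimal sketch of a stricter lookup, assuming you would rather fail during synthesis. resolveEnvConfig is a hypothetical helper, not part of CDK.

```javascript
// Hypothetical helper (not a CDK API): resolve an environment config
// and fail loudly instead of silently falling back to dev.
function resolveEnvConfig(envConfigs, envName) {
  if (!Object.prototype.hasOwnProperty.call(envConfigs, envName)) {
    throw new Error(
      "Unknown env '" + envName + "'. Expected one of: " +
      Object.keys(envConfigs).join(", ")
    );
  }
  return envConfigs[envName];
}

// Same shape as the envConfigs object in the entry point
var envConfigs = {
  dev: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION || "us-east-1"
  },
  prod: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: "us-east-1"
  }
};

var prodEnv = resolveEnvConfig(envConfigs, "prod");
console.log(prodEnv.region); // us-east-1
```

With this in place, a typo such as -c env=prd stops the run with an error instead of producing a stack named OrderService-prd that uses dev settings.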

Common Issues and Troubleshooting

Cross-stack reference locked resources. If you try to modify a resource that another stack imports via Fn::ImportValue, CloudFormation refuses the update. The fix is to first remove the import from the consuming stack, deploy it, then modify the exporting stack. For frequently changing resources, use SSM Parameter Store lookups with StringParameter.valueFromLookup() instead. The value is resolved at synthesis time and does not create CloudFormation exports.

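Lambda bundling failures in CI. CDK's NodejsFunction bundles with esbuild, which ships as a platform-specific native binary. If node_modules is installed on one platform and reused on another (for example, a cache built on a macOS laptop restored into a Linux CI container), esbuild fails to run. If esbuild is not installed locally at all, CDK falls back to Docker-based bundling, which fails when Docker is not available in CI. Pin the esbuild version in your package.json, install dependencies inside the CI environment itself, and make sure any native Node.js modules are built for your target Lambda architecture (x86_64 or arm64).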
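To make the pinning advice enforceable, a small CI guard can flag version ranges that are not exact. This is a sketch only: findUnpinned is a hypothetical helper, and the version numbers in the sample package.json object are placeholders rather than recommendations.

```javascript
// Matches exact versions such as 1.2.3 or 1.2.3-beta.1, but not ^1.2.3 or ~1.2.3
var EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

// Hypothetical helper: return the names whose declared range is not an exact pin.
function findUnpinned(pkg, names) {
  var all = Object.assign({}, pkg.dependencies, pkg.devDependencies);
  return names.filter(function (name) {
    var range = all[name];
    return range !== undefined && !EXACT_VERSION.test(range);
  });
}

// Illustrative package.json contents (placeholder versions)
var pkg = {
  dependencies: { "aws-cdk-lib": "^2.100.0", constructs: "10.3.0" },
  devDependencies: { esbuild: "0.19.5" }
};

var unpinned = findUnpinned(pkg, ["aws-cdk-lib", "constructs", "esbuild"]);
console.log(unpinned); // [ 'aws-cdk-lib' ]
```

In a real pipeline you would load the object with require("./package.json") and exit non-zero when the returned array is not empty.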

cdk diff shows changes when nothing changed. This happens with constructs that use asset hashes, like Code.fromAsset(). If your build process regenerates files inside the asset directory (build output, logs, files with embedded timestamps or build IDs), the asset hash changes even though your source did not. Use the exclude option on Code.fromAsset() (or a .dockerignore file for Docker image assets) to filter out build artifacts, test files, and anything else that should not affect the hash.

Circular dependency between stacks. CDK detects this during synthesis when Stack A exports something Stack B imports and Stack B exports something Stack A imports. The solution is to introduce a third stack that owns the shared resources, or restructure so the dependency flows in one direction. You can also use Lazy.string() to defer value resolution and break the cycle in some cases.

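Token resolution errors with string concatenation. CDK tokens (like bucket.bucketName) do not hold their real values during synthesis. They are placeholder strings that CDK swaps for CloudFormation references when it renders the template. Code like "arn:aws:s3:::" + bucket.bucketName + "/*" works because the placeholder survives concatenation, but functions that inspect string contents (like startsWith() or split()) only see the placeholder and return misleading results. Use cdk.Fn.join() or cdk.Arn.format() for ARN construction, and cdk.Token.isUnresolved() to check whether a value is a token before inspecting it.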

Deployment hangs on CloudFront distribution. CloudFront distribution creation and updates can take 15-30 minutes. This is normal AWS behavior, not a CDK bug. CDK waits for the CloudFormation stack to stabilize, and if the operation exceeds the CloudFormation timeout, the deployment rolls back. Set your CI job timeout with this in mind so the pipeline does not kill a deployment that is still making progress.

Best Practices

  • Use RemovalPolicy.RETAIN for production data stores. DynamoDB tables, S3 buckets with data, and RDS instances should never be deleted when the stack is destroyed. Set DESTROY only in dev environments.

  • Always set Lambda timeout higher than your expected execution time, but add a duration alarm at 80% of the timeout. This gives you early warning before functions start timing out in production.

  • Put dead letter queues on every SQS queue. Without a DLQ, a poison message is retried until its retention period expires, wasting invocations and, on FIFO queues, blocking the messages behind it. With a DLQ and an alarm on its depth, you catch processing failures immediately without losing messages.

  • Tag everything. Use cdk.Tags.of(stack).add() at the stack level to apply consistent tags. This makes cost allocation, access control, and resource cleanup dramatically easier.

  • Test your constructs with CDK assertions. The aws-cdk-lib/assertions module lets you verify that your constructs generate the expected CloudFormation. Test that IAM policies are correctly scoped, that encryption is enabled, and that alarms exist for critical metrics.

  • Pin your CDK library versions. All aws-cdk-lib and constructs versions should be pinned in package.json. CDK releases weekly, and minor version bumps can change synthesized output, causing unexpected diffs in production deployments.

  • Use cdk.context.json for cached lookups. When CDK looks up VPCs, hosted zones, or AMIs, it caches the results in cdk.context.json. Commit this file to source control so that deployments are deterministic and do not require AWS API access during synthesis.

  • Separate stateful and stateless resources into different stacks. Put databases, S3 buckets, and queues in a stateful stack with RETAIN policies. Put Lambda functions, API Gateways, and processing logic in a stateless stack that can be torn down and recreated freely.
