Infrastructure As Code

Terraform Fundamentals for Application Developers

Learn Terraform basics for deploying Node.js application infrastructure with VPC, compute, database, and storage resources

Overview

Terraform is an open-source infrastructure-as-code tool by HashiCorp that lets you define cloud resources in declarative configuration files, version them in Git, and deploy them repeatably across environments. If you have ever manually clicked through the AWS console to set up a VPC, an EC2 instance, and an RDS database, you already understand the problem Terraform solves: infrastructure drift, undocumented changes, and the inability to reproduce environments reliably. Every application developer shipping code to the cloud should understand Terraform at a fundamental level, because the line between "application code" and "infrastructure code" has effectively disappeared.

Prerequisites

  • Basic understanding of cloud computing concepts (VPCs, subnets, security groups)
  • An AWS account with programmatic access configured
  • Terraform CLI installed (version 1.5+)
  • Node.js 18+ installed locally
  • AWS CLI configured with credentials (aws configure)
  • A text editor with HCL syntax support (VS Code with the HashiCorp Terraform extension is ideal)

What Terraform Is and Why Developers Should Learn It

Terraform is a declarative infrastructure provisioning tool. You describe what you want your infrastructure to look like, and Terraform figures out how to get there. This is fundamentally different from imperative scripting where you write step-by-step instructions.

Here is the core idea: you write .tf files that declare resources, run terraform apply, and Terraform creates, modifies, or destroys cloud resources to match your configuration. It maintains a state file that tracks what currently exists, so it knows the difference between what you have and what you want.

Why should application developers care? Three reasons:

  1. You own your deployments. In most modern teams, developers are responsible for deploying their own services. Terraform lets you define and manage that infrastructure alongside your application code.
  2. Reproducibility. You can spin up identical staging, QA, and production environments from the same configuration. No more "it works in staging but not in production" caused by infrastructure differences.
  3. Code review for infrastructure. Infrastructure changes go through pull requests just like application code. Your team reviews VPC changes the same way they review API changes.

Terraform is not a configuration management tool like Ansible or Chef. It does not install software on servers or manage running processes. It provisions infrastructure — the servers, networks, databases, and load balancers that your application runs on.

HCL Syntax Basics

Terraform uses HashiCorp Configuration Language (HCL), a declarative language designed specifically for infrastructure configuration. It is not JSON, YAML, or a general-purpose programming language. HCL strikes a balance between human readability and machine parseability.

Here is the basic structure:

# This is a comment

resource "aws_instance" "web_server" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  tags = {
    Name        = "web-server"
    Environment = "production"
  }
}

Key syntax elements:

  • Blocks are defined with a type, optional labels, and curly braces: resource "type" "name" { ... }
  • Arguments use key = value syntax inside blocks
  • Strings are double-quoted. There are no single-quoted strings in HCL
  • Numbers and booleans are unquoted: count = 3, enabled = true
  • Lists use square brackets: subnets = ["subnet-abc", "subnet-def"]
  • Maps use curly braces: tags = { Name = "example" }
  • String interpolation uses ${} syntax: "Hello, ${var.name}"
  • Multi-line strings use heredoc syntax: <<-EOF ... EOF
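
To make the last two items concrete, here is a small sketch combining interpolation and a heredoc in an EC2 user data script (the AMI ID and variable names are illustrative):

resource "aws_instance" "example" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  # Interpolation works inside heredocs too
  user_data = <<-EOF
    #!/bin/bash
    echo "Starting ${var.project_name} in ${var.environment}" > /tmp/boot.log
  EOF

  tags = {
    Name = "${var.project_name}-example"
  }
}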

HCL also supports expressions and built-in functions:

locals {
  common_tags = {
    Project     = "my-nodejs-app"
    ManagedBy   = "terraform"
    Environment = var.environment
  }

  subnet_count = length(var.availability_zones)
  app_name     = lower(replace(var.project_name, " ", "-"))
}

Providers and Resources

Providers are plugins that Terraform uses to interact with cloud platforms, SaaS services, and other APIs. The AWS provider, for example, knows how to talk to every AWS service. You declare which providers (and versions) you need in a terraform block and configure them in provider blocks:

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      ManagedBy = "terraform"
      Project   = var.project_name
    }
  }
}

The ~> 5.0 version constraint means "any version >= 5.0 and < 6.0". This is important. Pin your provider versions. I have seen production deployments break because a provider auto-upgraded and changed behavior.
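
A few other constraint forms you will see in required_providers blocks (the specific version numbers are arbitrary examples):

version = ">= 5.0"         # 5.0 or newer, no upper bound
version = "~> 5.31"        # >= 5.31 and < 6.0
version = "~> 5.31.0"      # >= 5.31.0 and < 5.32.0, patch releases only
version = ">= 5.0, < 5.40" # an explicit range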

Resources are the core building blocks. Each resource block declares a single infrastructure object:

resource "aws_s3_bucket" "app_assets" {
  bucket = "my-nodejs-app-assets-${var.environment}"
}

resource "aws_s3_bucket_versioning" "app_assets" {
  bucket = aws_s3_bucket.app_assets.id

  versioning_configuration {
    status = "Enabled"
  }
}

Notice how the second resource references the first with aws_s3_bucket.app_assets.id. This is how you create dependencies between resources. Terraform automatically determines the correct creation order from these references.

Variables and Outputs

Variables are how you parameterize your Terraform configurations. Terraform has three related kinds of named values: input variables, output values, and local values.

Input variables accept values from outside the configuration:

variable "aws_region" {
  description = "AWS region for all resources"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Deployment environment"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}

variable "app_port" {
  description = "Port the Node.js application listens on"
  type        = number
  default     = 3000
}

variable "enable_monitoring" {
  description = "Whether to enable CloudWatch detailed monitoring"
  type        = bool
  default     = false
}

variable "allowed_cidr_blocks" {
  description = "CIDR blocks allowed to access the ALB"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

variable "extra_tags" {
  description = "Additional tags to apply to all resources"
  type        = map(string)
  default     = {}
}

Output values expose information about your infrastructure after apply:

output "alb_dns_name" {
  description = "DNS name of the Application Load Balancer"
  value       = aws_lb.app.dns_name
}

output "database_endpoint" {
  description = "RDS instance endpoint"
  value       = aws_db_instance.app.endpoint
  sensitive   = true
}

output "s3_bucket_arn" {
  description = "ARN of the S3 assets bucket"
  value       = aws_s3_bucket.app_assets.arn
}

Outputs marked sensitive = true are redacted in CLI output but still stored in the state file.

Local values are computed values used within the configuration:

locals {
  name_prefix = "${var.project_name}-${var.environment}"
  common_tags = merge(var.extra_tags, {
    Environment = var.environment
    Project     = var.project_name
  })
}

Data Sources

Data sources let you fetch information about existing infrastructure that Terraform does not manage. This is extremely useful when you need to reference resources created outside your configuration:

data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}

data "aws_caller_identity" "current" {}

You reference data sources with data.<type>.<name>.<attribute>:

resource "aws_instance" "web" {
  ami               = data.aws_ami.amazon_linux.id
  availability_zone = data.aws_availability_zones.available.names[0]
}

Data sources are read-only. They query the cloud provider API at plan time and return current state. They do not create, modify, or destroy anything.
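
The aws_caller_identity data source above comes in handy for building globally unique names; a minimal sketch (the bucket's purpose and naming scheme are assumptions):

resource "aws_s3_bucket" "artifacts" {
  # The account ID keeps the bucket name globally unique across AWS accounts
  bucket = "${var.project_name}-artifacts-${data.aws_caller_identity.current.account_id}"
}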

The Terraform Workflow: init, plan, apply, destroy

Terraform has a clear four-command workflow that you will use constantly.

terraform init

This is always the first command you run. It downloads provider plugins, initializes the backend (where state is stored), and downloads any modules you reference:

terraform init

You need to re-run init when you add a new provider, change the backend configuration, or add a module. In CI/CD pipelines, always run init before anything else.

terraform plan

Plan shows you what Terraform will do without actually doing it. This is your safety net:

terraform plan -out=tfplan

The -out flag saves the plan to a file so you can apply exactly what was planned. The output uses + for creates, - for destroys, and ~ for modifications:

# aws_instance.web will be created
+ resource "aws_instance" "web" {
    + ami           = "ami-0c55b159cbfafe1f0"
    + instance_type = "t3.micro"
    + tags          = {
        + "Name" = "web-server"
      }
  }

Plan: 1 to add, 0 to change, 0 to destroy.

Always review the plan. Always. I have seen engineers blindly apply changes that destroyed production databases because they did not read the plan output.

terraform apply

Apply executes the planned changes. If you saved a plan file, use it:

terraform apply tfplan

Without a plan file, Terraform generates a new plan and asks for confirmation:

terraform apply

In CI/CD, applying a saved plan file already skips the prompt; use -auto-approve only when you are not applying a saved plan, and only after the plan output has been reviewed:

terraform apply -auto-approve

terraform destroy

Destroy tears down everything managed by the configuration. Use it for development environments; never run it against production without extreme caution:

terraform destroy

Additional useful commands:

  • terraform fmt — formats your .tf files to canonical style
  • terraform validate — checks configuration syntax without accessing the provider API
  • terraform state list — lists all resources in the state file
  • terraform state show <resource> — shows details of a specific resource
  • terraform output — displays output values

terraform.tfvars and Variable Files

You set variable values through several mechanisms, in order of precedence (highest to lowest):

  1. Command-line flags: -var="environment=production" and -var-file=..., evaluated in the order given
  2. *.auto.tfvars (and *.auto.tfvars.json) files, auto-loaded in alphabetical order
  3. terraform.tfvars file (auto-loaded)
  4. Environment variables: TF_VAR_environment=production
  5. Default values in variable declarations

The most common approach is using .tfvars files per environment:

# environments/dev.tfvars
environment         = "dev"
aws_region          = "us-east-1"
app_port            = 3000
enable_monitoring   = false
instance_type       = "t3.micro"
db_instance_class   = "db.t3.micro"
min_capacity        = 1
max_capacity        = 2

# environments/production.tfvars
environment         = "production"
aws_region          = "us-east-1"
app_port            = 3000
enable_monitoring   = true
instance_type       = "t3.medium"
db_instance_class   = "db.r6g.large"
min_capacity        = 3
max_capacity        = 10

Apply with a specific variable file:

terraform plan -var-file=environments/production.tfvars
terraform apply -var-file=environments/production.tfvars

Never commit terraform.tfvars files that contain secrets. Add them to .gitignore and pass sensitive values through environment variables or a secrets manager.
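
One common pattern is to keep secrets out of .tfvars entirely and read them from a secrets store at plan time. A sketch using SSM Parameter Store (the parameter path is an assumption):

data "aws_ssm_parameter" "db_password" {
  name            = "/my-nodejs-app/production/db-password"
  with_decryption = true
}

resource "aws_db_instance" "app" {
  # ... other arguments ...
  password = data.aws_ssm_parameter.db_password.value
}

Keep in mind that the resolved value still ends up in the state file, which is one more reason to encrypt remote state and restrict who can read it.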

Count and for_each

These meta-arguments let you create multiple instances of a resource from a single block.

count creates a fixed number of resources:

variable "availability_zones" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${local.name_prefix}-private-${count.index}"
  }
}

Reference count resources with an index: aws_subnet.private[0].id or use a splat expression: aws_subnet.private[*].id.

for_each creates resources from a map or set. It is more flexible and safer than count because removing an item from the middle of a list does not cause cascading recreations:

variable "s3_buckets" {
  type = map(object({
    versioning     = bool
    lifecycle_days = number
  }))
  default = {
    assets = {
      versioning     = true
      lifecycle_days = 90
    }
    logs = {
      versioning     = false
      lifecycle_days = 30
    }
    backups = {
      versioning     = true
      lifecycle_days = 365
    }
  }
}

resource "aws_s3_bucket" "buckets" {
  for_each = var.s3_buckets
  bucket   = "${local.name_prefix}-${each.key}"

  tags = {
    Name    = each.key
    Purpose = each.key
  }
}

Reference for_each resources with the key: aws_s3_bucket.buckets["assets"].arn.

My rule of thumb: use for_each for anything that might change. Use count only for simple "create N copies" scenarios or conditional creation (count = 0 or 1).
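
Conditional creation with count typically looks like this, tying back to the enable_monitoring variable declared earlier (the alarm settings are illustrative):

resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  count = var.enable_monitoring ? 1 : 0

  alarm_name          = "${local.name_prefix}-cpu-high"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = 60
  statistic           = "Average"
  threshold           = 80
}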

Lifecycle Rules

Lifecycle blocks control how Terraform handles resource changes:

resource "aws_db_instance" "app" {
  identifier     = "${local.name_prefix}-db"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = var.db_instance_class

  lifecycle {
    prevent_destroy = true
  }
}

The three lifecycle arguments you will actually use:

  • prevent_destroy — Terraform refuses to destroy this resource. Use this on databases and anything that holds data you cannot afford to lose.
  • create_before_destroy — Creates the replacement resource before destroying the old one. Essential for zero-downtime deployments of load balancers and DNS records (see the sketch at the end of this section).
  • ignore_changes — Tells Terraform to ignore changes to specific attributes. Useful when an external process modifies a resource (like auto-scaling changing desired count):

resource "aws_ecs_service" "app" {
  name            = "${local.name_prefix}-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.min_capacity

  lifecycle {
    ignore_changes = [desired_count]
  }
}

This is critical for ECS services with auto-scaling. Without ignore_changes, every terraform apply would reset the desired count back to the configured value, overriding the auto-scaler.
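
create_before_destroy pairs naturally with name_prefix, so the replacement resource can briefly coexist with the one it replaces. A minimal sketch (the security group itself is hypothetical):

resource "aws_security_group" "web" {
  name_prefix = "${local.name_prefix}-web-"
  vpc_id      = aws_vpc.main.id

  lifecycle {
    create_before_destroy = true
  }
}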

Provisioners (and Why to Avoid Them)

Provisioners execute scripts on a local or remote machine as part of resource creation or destruction. Here is an example:

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"

  provisioner "remote-exec" {
    inline = [
      "sudo yum update -y",
      "sudo yum install -y nodejs",
      "sudo npm install -g pm2"
    ]

    connection {
      type        = "ssh"
      user        = "ec2-user"
      private_key = file("~/.ssh/id_rsa")
      host        = self.public_ip
    }
  }
}

Do not use provisioners. I am being direct about this. Provisioners are a last resort for good reasons:

  1. They break the declarative model. Terraform cannot track what a script does, so it cannot detect drift or roll back changes.
  2. They only run on creation (by default). If the script fails, you get a tainted resource.
  3. They create tight coupling between infrastructure provisioning and configuration management.

Instead, use:

  • User data scripts for EC2 initialization
  • Custom AMIs built with Packer for pre-configured images
  • Container images (Docker) for application deployment
  • ECS/EKS for managed container orchestration
  • AWS Systems Manager for ongoing configuration management

The HashiCorp documentation itself says provisioners are "a last resort." Take that seriously.
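
For the EC2 case above, the user data alternative looks like this. The install commands simply mirror the provisioner example and are illustrative; in practice you would bake them into an AMI or a container image:

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"

  # Runs once at first boot via cloud-init; no SSH connection or provisioner required
  user_data = <<-EOF
    #!/bin/bash
    yum update -y
    yum install -y nodejs
    npm install -g pm2
  EOF
}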

Resource Dependencies

Terraform automatically infers dependencies from resource references. When resource B references an attribute of resource A, Terraform knows to create A first:

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr
}

# Terraform creates the VPC first because this references aws_vpc.main.id
resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = cidrsubnet(var.vpc_cidr, 8, 0)
}

Occasionally you need explicit dependencies for resources that do not reference each other but still have an ordering requirement:

resource "aws_iam_role_policy" "ecs_task" {
  role   = aws_iam_role.ecs_task.id
  policy = data.aws_iam_policy_document.ecs_task.json
}

resource "aws_ecs_service" "app" {
  # ... configuration ...

  depends_on = [aws_iam_role_policy.ecs_task]
}

Use depends_on sparingly. If you find yourself adding it frequently, your resource references probably need restructuring.

terraform fmt and validate

These two commands should be part of every developer's workflow and every CI pipeline.

terraform fmt formats your code to the canonical HCL style. Run it before every commit:

terraform fmt -recursive

The -recursive flag formats all .tf files in subdirectories as well. In CI, use -check to fail the build if files are not formatted:

terraform fmt -check -recursive

terraform validate checks your configuration for syntax errors and internal consistency without contacting any cloud provider APIs:

terraform validate

This catches missing required arguments, invalid references, type mismatches, and other structural problems. It is fast and should run on every pull request.

A minimal CI check script for a Node.js project with Terraform:

// scripts/validate-terraform.js
var execSync = require("child_process").execSync;
var path = require("path");

var terraformDir = path.join(__dirname, "..", "terraform");

function runCommand(cmd) {
  try {
    var output = execSync(cmd, {
      cwd: terraformDir,
      encoding: "utf-8",
      stdio: "pipe"
    });
    console.log("PASS: " + cmd);
    return true;
  } catch (err) {
    console.error("FAIL: " + cmd);
    console.error(err.stderr || err.stdout);
    return false;
  }
}

var results = [
  runCommand("terraform fmt -check -recursive"),
  runCommand("terraform init -backend=false"),
  runCommand("terraform validate")
];

var allPassed = results.every(function(r) { return r === true; });

if (!allPassed) {
  console.error("Terraform validation failed");
  process.exit(1);
}

console.log("All Terraform checks passed");

Complete Working Example: Node.js Express on ECS with RDS and S3

This is a complete Terraform configuration that deploys infrastructure for a Node.js Express application. It creates a VPC, security groups, an Application Load Balancer, an ECS Fargate service, an RDS PostgreSQL database, and an S3 bucket for static assets.

Project Structure

terraform/
  main.tf          # Provider configuration and locals
  vpc.tf           # VPC, subnets, internet gateway, NAT gateway
  security.tf      # Security groups
  alb.tf           # Application Load Balancer
  ecs.tf           # ECS cluster, task definition, service
  rds.tf           # RDS PostgreSQL instance
  s3.tf            # S3 bucket for assets
  variables.tf     # Input variable declarations
  outputs.tf       # Output values
  environments/
    dev.tfvars
    production.tfvars

variables.tf

variable "project_name" {
  description = "Project name used for resource naming"
  type        = string
  default     = "nodejs-app"
}

variable "environment" {
  description = "Deployment environment"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "Availability zones"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b"]
}

variable "app_port" {
  description = "Port the Node.js application listens on"
  type        = number
  default     = 3000
}

variable "app_image" {
  description = "Docker image for the application"
  type        = string
}

variable "app_cpu" {
  description = "CPU units for the ECS task (1024 = 1 vCPU)"
  type        = number
  default     = 256
}

variable "app_memory" {
  description = "Memory in MB for the ECS task"
  type        = number
  default     = 512
}

variable "min_capacity" {
  description = "Minimum number of ECS tasks"
  type        = number
  default     = 1
}

variable "max_capacity" {
  description = "Maximum number of ECS tasks"
  type        = number
  default     = 4
}

variable "db_instance_class" {
  description = "RDS instance class"
  type        = string
  default     = "db.t3.micro"
}

variable "db_name" {
  description = "Database name"
  type        = string
  default     = "appdb"
}

variable "db_username" {
  description = "Database master username"
  type        = string
  sensitive   = true
}

variable "db_password" {
  description = "Database master password"
  type        = string
  sensitive   = true
}

main.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "nodejs-app/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Project     = var.project_name
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}

locals {
  name_prefix = "${var.project_name}-${var.environment}"
}

vpc.tf

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "${local.name_prefix}-vpc"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${local.name_prefix}-igw"
  }
}

resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${local.name_prefix}-public-${count.index}"
    Tier = "public"
  }
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 10)
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${local.name_prefix}-private-${count.index}"
    Tier = "private"
  }
}

resource "aws_eip" "nat" {
  domain = "vpc"

  tags = {
    Name = "${local.name_prefix}-nat-eip"
  }
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${local.name_prefix}-nat"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${local.name_prefix}-public-rt"
  }
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }

  tags = {
    Name = "${local.name_prefix}-private-rt"
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}

security.tf

resource "aws_security_group" "alb" {
  name        = "${local.name_prefix}-alb-sg"
  description = "Security group for Application Load Balancer"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol    = "tcp"
    from_port   = 80
    to_port     = 80
    cidr_blocks = ["0.0.0.0/0"]
    description = "Allow HTTP"
  }

  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = ["0.0.0.0/0"]
    description = "Allow HTTPS"
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
    description = "Allow all outbound"
  }

  tags = {
    Name = "${local.name_prefix}-alb-sg"
  }
}

resource "aws_security_group" "ecs_tasks" {
  name        = "${local.name_prefix}-ecs-sg"
  description = "Security group for ECS tasks"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = var.app_port
    to_port         = var.app_port
    security_groups = [aws_security_group.alb.id]
    description     = "Allow traffic from ALB"
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
    description = "Allow all outbound"
  }

  tags = {
    Name = "${local.name_prefix}-ecs-sg"
  }
}

resource "aws_security_group" "rds" {
  name        = "${local.name_prefix}-rds-sg"
  description = "Security group for RDS"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = 5432
    to_port         = 5432
    security_groups = [aws_security_group.ecs_tasks.id]
    description     = "Allow PostgreSQL from ECS tasks"
  }

  tags = {
    Name = "${local.name_prefix}-rds-sg"
  }
}

alb.tf

resource "aws_lb" "app" {
  name               = "${local.name_prefix}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id

  tags = {
    Name = "${local.name_prefix}-alb"
  }
}

resource "aws_lb_target_group" "app" {
  name        = "${local.name_prefix}-tg"
  port        = var.app_port
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"

  health_check {
    enabled             = true
    healthy_threshold   = 3
    unhealthy_threshold = 3
    interval            = 30
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
  }

  tags = {
    Name = "${local.name_prefix}-tg"
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.app.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

ecs.tf

resource "aws_ecs_cluster" "main" {
  name = "${local.name_prefix}-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }

  tags = {
    Name = "${local.name_prefix}-cluster"
  }
}

resource "aws_iam_role" "ecs_task_execution" {
  name = "${local.name_prefix}-ecs-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "ecs_task_execution" {
  role       = aws_iam_role.ecs_task_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

resource "aws_cloudwatch_log_group" "app" {
  name              = "/ecs/${local.name_prefix}"
  retention_in_days = 30

  tags = {
    Name = "${local.name_prefix}-logs"
  }
}

resource "aws_ecs_task_definition" "app" {
  family                   = "${local.name_prefix}-task"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.app_cpu
  memory                   = var.app_memory
  execution_role_arn       = aws_iam_role.ecs_task_execution.arn

  container_definitions = jsonencode([
    {
      name      = "app"
      image     = var.app_image
      essential = true

      portMappings = [
        {
          containerPort = var.app_port
          hostPort      = var.app_port
          protocol      = "tcp"
        }
      ]

      environment = [
        {
          name  = "NODE_ENV"
          value = var.environment
        },
        {
          name  = "PORT"
          value = tostring(var.app_port)
        },
        {
          name  = "DATABASE_URL"
          value = "postgresql://${var.db_username}:${var.db_password}@${aws_db_instance.app.endpoint}/${var.db_name}"
        },
        {
          name  = "S3_BUCKET"
          value = aws_s3_bucket.app_assets.id
        }
      ]

      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = aws_cloudwatch_log_group.app.name
          "awslogs-region"        = var.aws_region
          "awslogs-stream-prefix" = "ecs"
        }
      }
    }
  ])
}

resource "aws_ecs_service" "app" {
  name            = "${local.name_prefix}-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.min_capacity
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = aws_subnet.private[*].id
    security_groups  = [aws_security_group.ecs_tasks.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "app"
    container_port   = var.app_port
  }

  lifecycle {
    ignore_changes = [desired_count]
  }

  depends_on = [aws_lb_listener.http]
}

rds.tf

resource "aws_db_subnet_group" "app" {
  name       = "${local.name_prefix}-db-subnet"
  subnet_ids = aws_subnet.private[*].id

  tags = {
    Name = "${local.name_prefix}-db-subnet"
  }
}

resource "aws_db_instance" "app" {
  identifier             = "${local.name_prefix}-db"
  engine                 = "postgres"
  engine_version         = "15.4"
  instance_class         = var.db_instance_class
  allocated_storage      = 20
  max_allocated_storage  = 100
  storage_encrypted      = true
  db_name                = var.db_name
  username               = var.db_username
  password               = var.db_password
  db_subnet_group_name   = aws_db_subnet_group.app.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  skip_final_snapshot    = var.environment != "production"
  final_snapshot_identifier = var.environment == "production" ? "${local.name_prefix}-final-snapshot" : null
  backup_retention_period   = var.environment == "production" ? 7 : 1
  multi_az                  = var.environment == "production"

  lifecycle {
    prevent_destroy = false
  }

  tags = {
    Name = "${local.name_prefix}-db"
  }
}

s3.tf

resource "aws_s3_bucket" "app_assets" {
  bucket = "${local.name_prefix}-assets"

  tags = {
    Name = "${local.name_prefix}-assets"
  }
}

resource "aws_s3_bucket_versioning" "app_assets" {
  bucket = aws_s3_bucket.app_assets.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "app_assets" {
  bucket = aws_s3_bucket.app_assets.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "app_assets" {
  bucket = aws_s3_bucket.app_assets.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

outputs.tf

output "alb_dns_name" {
  description = "DNS name of the Application Load Balancer"
  value       = aws_lb.app.dns_name
}

output "database_endpoint" {
  description = "RDS PostgreSQL endpoint"
  value       = aws_db_instance.app.endpoint
  sensitive   = true
}

output "s3_bucket_name" {
  description = "Name of the S3 assets bucket"
  value       = aws_s3_bucket.app_assets.id
}

output "ecs_cluster_name" {
  description = "Name of the ECS cluster"
  value       = aws_ecs_cluster.main.name
}

output "ecs_service_name" {
  description = "Name of the ECS service"
  value       = aws_ecs_service.app.name
}

output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

Deploying the Infrastructure

With the configuration in place, deploy with:

cd terraform

# Initialize providers and backend
terraform init

# Review the plan for dev environment
terraform plan -var-file=environments/dev.tfvars -out=tfplan

# Apply the plan
terraform apply tfplan

# View the outputs
terraform output

Your Node.js application needs a /health endpoint for the ALB health check:

var express = require("express");
var app = express();
var port = process.env.PORT || 3000;

app.get("/health", function(req, res) {
  res.status(200).json({ status: "healthy", timestamp: new Date().toISOString() });
});

app.get("/", function(req, res) {
  res.json({ message: "Hello from ECS", environment: process.env.NODE_ENV });
});

app.listen(port, function() {
  console.log("Server running on port " + port);
});

Common Issues and Troubleshooting

1. State Lock Timeout

Error: Error acquiring the state lock

Lock Info:
  ID:        a1b2c3d4-e5f6-7890-abcd-ef1234567890
  Path:      s3://my-terraform-state-bucket/nodejs-app/terraform.tfstate
  Operation: OperationTypeApply
  Who:       user@hostname
  Version:   1.5.7
  Created:   2024-01-15 10:30:00.000000 +0000 UTC

This happens when a previous Terraform operation crashed without releasing the lock, or when a colleague is running apply at the same time. If you are certain no other operation is running, force-unlock with the lock ID:

terraform force-unlock a1b2c3d4-e5f6-7890-abcd-ef1234567890

Never force-unlock if someone else might be running an operation. You will corrupt your state.

2. Resource Already Exists

Error: creating EC2 VPC: InvalidVpcID.Duplicate: The vpc 'vpc-0abc123def456789' already exists.

This happens when a resource exists in AWS but not in your Terraform state. Common after manual console changes or a failed apply. Import the existing resource into state:

terraform import aws_vpc.main vpc-0abc123def456789

Then run terraform plan to see if the existing resource matches your configuration. Adjust the config or the resource until the plan shows no changes.
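
On Terraform 1.5 and newer you can also express imports declaratively with an import block, which lets the change go through plan and code review instead of a one-off CLI command:

import {
  to = aws_vpc.main
  id = "vpc-0abc123def456789"
}

Run terraform plan to preview the import and terraform apply to record it in state; the import block can then be removed.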

3. Provider Authentication Failure

Error: No valid credential sources found

  with provider["registry.terraform.io/hashicorp/aws"],
  on main.tf line 14, in provider "aws":
  14:   region = var.aws_region

Terraform cannot find AWS credentials. Verify your credentials are configured:

aws sts get-caller-identity

If that fails, run aws configure or set environment variables:

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"

4. Cycle Dependency Error

Error: Cycle: aws_security_group.app, aws_security_group.db

  on security.tf line 1:
   1: resource "aws_security_group" "app" {

This happens when two security groups reference each other in their ingress/egress rules. Break the cycle by using aws_security_group_rule as a separate resource instead of inline rules:

resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = aws_vpc.main.id
}

resource "aws_security_group" "db" {
  name   = "db-sg"
  vpc_id = aws_vpc.main.id
}

resource "aws_security_group_rule" "app_to_db" {
  type                     = "egress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.app.id
  source_security_group_id = aws_security_group.db.id
}

resource "aws_security_group_rule" "db_from_app" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.db.id
  source_security_group_id = aws_security_group.app.id
}

5. S3 Backend Initialization Failure

Error: Failed to get existing workspaces: S3 bucket does not exist.

  The referenced S3 bucket must have been previously created.

The S3 bucket and DynamoDB table for Terraform state must exist before you run terraform init. Create them manually or with a separate bootstrap configuration:

aws s3api create-bucket \
  --bucket my-terraform-state-bucket \
  --region us-east-1

aws dynamodb create-table \
  --table-name terraform-locks \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region us-east-1

Best Practices

  • Always use remote state with locking. Local state files get lost, overwritten, and cannot be shared across a team. Use S3 + DynamoDB (or Terraform Cloud) from day one, even for personal projects. The cost is negligible and the protection is invaluable.

  • Pin provider and Terraform versions. Use required_version and version constraints in your required_providers block. An unexpected provider upgrade during a production deployment is a bad time to discover breaking changes.

  • Never store secrets in Terraform files or state. Use AWS Secrets Manager, SSM Parameter Store, or HashiCorp Vault for sensitive values. Pass them to Terraform through environment variables, not .tfvars files committed to Git.

  • Use terraform plan with -out in CI/CD pipelines. The two-step plan-then-apply workflow ensures you apply exactly what was reviewed. Without -out, the infrastructure could change between plan and apply.

  • Structure your code by resource type, not by lifecycle. Separate files like vpc.tf, ecs.tf, and rds.tf are easier to navigate than a single monolithic main.tf. Terraform merges all .tf files in a directory anyway, so the file structure is purely organizational.

  • Use for_each over count for collections that may change. With count, removing item 2 from a list of 5 causes items 3, 4, and 5 to be destroyed and recreated. With for_each, only the removed item is affected.

  • Tag everything. Use default_tags in the AWS provider block for tags that apply to all resources (project, environment, managed-by). Add resource-specific tags in individual resource blocks. Tags are essential for cost allocation, security audits, and operational visibility.

  • Run terraform fmt and terraform validate in pre-commit hooks. Catch formatting inconsistencies and syntax errors before they reach your CI pipeline. The pre-commit framework has ready-made hooks for both commands.

  • Use workspaces or directory-based separation for environments. I prefer directory-based separation with shared modules because workspaces share the same backend configuration, which makes permissions management harder. Either approach beats duplicating entire configurations per environment.

  • Protect production databases with prevent_destroy. Add lifecycle { prevent_destroy = true } to any resource that holds data you cannot regenerate. This forces an explicit configuration change before Terraform will destroy it.
