
Docker Multi-Stage Builds for Node.js Applications

Production-ready guide to Docker multi-stage builds for Node.js covering image optimization, layer caching, native modules, health checks, non-root users, Docker Compose development workflows, and CI/CD integration.

Overview

Docker multi-stage builds let you use multiple FROM statements in a single Dockerfile, each starting a new build stage that can selectively copy artifacts from previous stages. For Node.js applications, this is the single most effective technique for reducing production image sizes — often by 80% or more — while keeping your build process reproducible and your dependency installation clean. If you are shipping Node.js containers to production and not using multi-stage builds, you are almost certainly deploying images that are far larger and less secure than they need to be.

Prerequisites

  • Working knowledge of Docker (building images, running containers)
  • Familiarity with Node.js and npm/yarn
  • Docker Engine 17.05+ (multi-stage build support)
  • Basic understanding of Linux package management (apt-get, apk)

Why Docker Images Get Bloated

Before we fix the problem, let's understand it. A naive Dockerfile for a Node.js application typically looks like this:

FROM node:20
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]

This image will be somewhere around 1.1 GB. Here is why:

  1. The base image itself. node:20 is built on Debian Bookworm and includes build tools, Python, gcc, make, and hundreds of packages you will never use at runtime. The base image alone is roughly 950 MB.
  2. devDependencies. npm install pulls in everything — test frameworks, linters, TypeScript compiler, bundlers. These have no business in a production container.
  3. Source files you don't need. Your .git directory, test files, documentation, local config files — all copied in by COPY . ..
  4. Build artifacts from native modules. Compiling native addons (bcrypt, sharp, better-sqlite3) leaves behind object files, headers, and toolchains in intermediate layers.
  5. Layer bloat. Every RUN instruction creates a layer. Even if you delete files in a later RUN, the previous layer still contains them. The image size only grows.
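A quick illustration of that last point, using a hypothetical downloaded archive:

# Each RUN is its own layer; deleting the file later hides it but does not reclaim the space
RUN wget https://example.com/dataset.tar.gz
RUN rm dataset.tar.gz

# Doing both in a single RUN keeps the archive out of the image entirely
RUN wget https://example.com/dataset.tar.gz && \
    tar -xzf dataset.tar.gz && \
    rm dataset.tar.gz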

The result is an image that takes minutes to push, minutes to pull, consumes excessive disk on your nodes, and exposes a massive attack surface. Let's fix all of this.


Single-Stage vs Multi-Stage Builds

Single-Stage (The Problem)

FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
RUN npm prune --production
EXPOSE 3000
CMD ["node", "dist/server.js"]

Image size: ~1.1 GB

Even with npm prune --production, the base image still contains the full Debian toolchain. The build tools, headers, and intermediate files are baked into earlier layers. You cannot escape them without multi-stage.

Multi-Stage (The Solution)

# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Stage 2: Build
FROM node:20-alpine AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 3: Production
FROM node:20-alpine AS production
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
COPY package*.json ./
RUN npm prune --production
EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]

Image size: ~180 MB

That is an 84% reduction. The production stage contains only the Alpine base image, production dependencies, and compiled output. No build tools, no devDependencies, no source code.
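To verify this yourself, build the Dockerfile (the last stage is the default target) and check the reported size:

# Build the multi-stage Dockerfile and inspect the resulting image
docker build -t myapp:multistage .
docker images myapp:multistage --format "{{.Repository}}:{{.Tag}}  {{.Size}}"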


Anatomy of a Multi-Stage Dockerfile

Each FROM instruction begins a new stage. Stages can be named with AS and referenced by name in COPY --from= instructions. Only the final stage contributes to the output image. Everything else is discarded.
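A useful corollary: docker build --target <name> stops at any named stage, which makes it easy to inspect or debug an intermediate stage:

# Build only the 'build' stage and open a shell inside it
docker build --target build -t myapp:build-debug .
docker run --rm -it myapp:build-debug sh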

Stage 1: Dependency Installation

The first stage focuses exclusively on installing dependencies. By copying only package.json and package-lock.json before running npm ci, you maximize Docker's layer cache. Dependencies only reinstall when the lockfile changes.

FROM node:20-alpine AS deps
WORKDIR /app

# Copy only dependency manifests first for cache efficiency
COPY package.json package-lock.json ./

# ci is deterministic and faster than install
RUN npm ci

# If you have native modules that need build tools:
# RUN apk add --no-cache python3 make g++ && \
#     npm ci && \
#     apk del python3 make g++

The key insight: separate dependency installation from code copying. When you change application code but not dependencies, Docker uses the cached layer and skips npm ci entirely. On a project with 800+ dependencies, this saves 45-90 seconds per build.

Stage 2: Build and Compile

The second stage copies in the installed node_modules from stage 1 and your source code, then runs whatever build step your project needs — TypeScript compilation, Webpack/Vite bundling, asset processing.

FROM node:20-alpine AS build
WORKDIR /app

COPY --from=deps /app/node_modules ./node_modules
COPY . .

# TypeScript compilation, bundling, whatever your project needs
RUN npm run build

# At this point we have:
# - /app/dist (compiled output)
# - /app/node_modules (all deps including devDeps)
# - /app/src (source code)
# Only dist/ moves to the next stage.

Stage 3: Production Image

The final stage starts from a clean base image and copies only what the application needs at runtime.

FROM node:20-alpine AS production

# Security: don't run as root
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

WORKDIR /app

# Copy compiled output
COPY --from=build /app/dist ./dist

# Copy dependency manifests and install production-only deps
COPY package.json package-lock.json ./
RUN npm ci --only=production && \
    npm cache clean --force

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Set ownership and switch user
RUN chown -R appuser:appgroup /app
USER appuser

EXPOSE 3000

CMD ["node", "dist/server.js"]

Notice that in this version, we run npm ci --only=production in the final stage instead of copying node_modules from an earlier stage and pruning. This is intentional — it produces a cleaner result because only production dependencies are ever installed, and no devDependency artifacts leak into intermediate layers.


Choosing Base Images

This decision matters more than most engineers realize. Here are your options for Node.js, from largest to smallest:

Base Image                    Size       Use Case
node:20 (Debian)              ~950 MB    Never use in production
node:20-slim                  ~200 MB    When you need Debian but minimal
node:20-alpine                ~140 MB    Best general-purpose choice
gcr.io/distroless/nodejs20    ~120 MB    Maximum security, no shell

My recommendation

Use node:20-alpine for most applications. It is small, well-supported, and includes a shell for debugging. The musl libc it uses instead of glibc causes occasional compatibility issues with native modules, but these are rare and well-documented.

Use node:20-slim if you depend on native modules that refuse to compile against musl (some older versions of sharp, canvas, or anything linking against system libraries that assume glibc).

Use distroless only if your security posture demands it. The lack of a shell means you cannot docker exec into the container for debugging, which is a real operational cost.

# For build stages, image size matters less; install build tools as needed
FROM node:20-alpine AS build

# For production, use the smallest viable image
FROM node:20-alpine AS production
# OR for maximum security:
# FROM gcr.io/distroless/nodejs20-debian12 AS production
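One caveat if you choose distroless: those Node.js images already set node as the ENTRYPOINT, so the CMD should list only the script, not the node binary:

# With a distroless Node.js base, node is already the entrypoint:
CMD ["dist/server.js"]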

.dockerignore Best Practices

A proper .dockerignore is not optional. Without one, COPY . . sends everything to the Docker daemon, including your .git directory (which can be hundreds of megabytes), node_modules (which you're installing fresh anyway), and any secrets in local config files.

# .dockerignore

# Dependencies (installed fresh in container)
node_modules

# Source control
.git
.gitignore

# IDE and editor files
.vscode
.idea
*.swp
*.swo

# Environment and secrets
.env
.env.*
!.env.example

# Test and development
coverage/
__tests__/
*.test.js
*.test.ts
*.spec.js
*.spec.ts
jest.config.*
.eslintrc*
.prettierrc*

# Docker files (not needed inside the image)
Dockerfile*
docker-compose*
.dockerignore

# Documentation
README.md
CHANGELOG.md
LICENSE
docs/

# OS files
.DS_Store
Thumbs.db

The build context size difference is substantial. On a typical project:

  • Without .dockerignore: 450 MB sent to daemon (includes .git, node_modules)
  • With .dockerignore: 2-5 MB sent to daemon

This alone can cut 10-20 seconds off every build.
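If you want to confirm what is actually being sent, BuildKit reports the context transfer in its progress output:

# The "transferring context" step shows how much the daemon receives
docker build --progress=plain -t myapp . 2>&1 | grep "transferring context"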


Layer Caching Strategies

Docker caches each layer and reuses it when the inputs haven't changed. The order of instructions in your Dockerfile directly determines cache efficiency. The rule is simple: put things that change least frequently at the top.

FROM node:20-alpine AS production
WORKDIR /app

# 1. System dependencies (changes almost never)
RUN apk add --no-cache tini

# 2. Dependency manifests (changes weekly)
COPY package.json package-lock.json ./

# 3. Install dependencies (cached unless lockfile changes)
RUN npm ci --only=production

# 4. Application code (changes on every build)
COPY --from=build /app/dist ./dist

# Use tini as init process
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]

With this ordering, a code-only change rebuilds only step 4. The dependency installation in step 3 is cached. This is the difference between a 90-second build and a 5-second build.

BuildKit Cache Mounts

Docker BuildKit (enabled by default in modern Docker) supports cache mounts that persist across builds. This is extremely useful for npm's cache:

# syntax=docker/dockerfile:1
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./

# Mount npm cache - persists across builds
RUN --mount=type=cache,target=/root/.npm \
    npm ci

This keeps the npm download cache between builds. Even when package-lock.json changes, packages that haven't changed are read from cache instead of re-downloaded.
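BuildKit has been the default builder since Docker Engine 23.0; on older engines, opt in explicitly for a build:

# Enable BuildKit for a single build on an older Docker engine
DOCKER_BUILDKIT=1 docker build -t myapp .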


Handling Native Modules

Native modules like bcrypt, sharp, and better-sqlite3 require C/C++ compilation. This is where multi-stage builds really prove their value — you can install build tools in an early stage and leave them behind.

# Stage 1: Build native dependencies
FROM node:20-alpine AS native-deps
WORKDIR /app

# Install build toolchain
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    vips-dev  # Required for sharp

COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: Production
FROM node:20-alpine AS production
WORKDIR /app

# Install only runtime libraries (not dev headers)
RUN apk add --no-cache vips

COPY --from=native-deps /app/node_modules ./node_modules
COPY package.json package-lock.json ./
RUN npm prune --production

# Compiled output comes from a separate build stage (defined as in the earlier examples)
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]

The critical distinction: vips-dev (development headers, ~50 MB) is installed in the build stage. Only vips (runtime library, ~8 MB) is installed in the production stage. The compiled .node binary files in node_modules link against the runtime library and work without the development headers.
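If you are unsure whether a compiled addon can find its runtime libraries, run ldd against the .node binary inside the production image (the path below is illustrative; adjust it to wherever your module's binary lives):

# Any "not found" line means a runtime library is missing from the image
docker run --rm myapp:latest ldd node_modules/sharp/build/Release/sharp.node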


Production-Only Dependencies

There are two approaches to ensuring only production dependencies end up in your final image. Both work; I prefer the second.

Approach 1: Install all, then prune

COPY --from=deps /app/node_modules ./node_modules
COPY package.json package-lock.json ./
RUN npm prune --production

Approach 2: Fresh production install in final stage

COPY package.json package-lock.json ./
RUN npm ci --only=production && npm cache clean --force

Approach 2 is cleaner because devDependencies are never present in the final stage at all, not even in intermediate layers. Approach 1 can leave ghost files if a devDependency installed a binary in node_modules/.bin that prune doesn't fully clean up.
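A note on the flag itself: --only=production still works but is deprecated; on npm 8+ the current spelling is --omit=dev, so newer Dockerfiles may prefer:

# Equivalent production-only install with the non-deprecated flag
RUN npm ci --omit=dev && npm cache clean --force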


Health Checks

Always include a health check. Without one, your orchestrator (Docker Swarm, Kubernetes, ECS) has no way to know if your application is actually serving traffic.

# Option 1: wget (available in Alpine)
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Option 2: curl (must be installed separately on Alpine)
# RUN apk add --no-cache curl
# HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
#     CMD curl -f http://localhost:3000/health || exit 1

# Option 3: Node.js script (no extra dependencies, most reliable)
# HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
#     CMD node -e "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"

And in your Express application:

app.get('/health', (req, res) => {
    // Check actual application health, not just "process is running"
    const healthy = {
        uptime: process.uptime(),
        memory: process.memoryUsage(),
        timestamp: Date.now()
    };
    res.status(200).json(healthy);
});
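If the service depends on a database, the endpoint should verify that connection rather than only report process stats. A sketch assuming the app connects through mongoose (swap in whichever clients you actually hold):

const mongoose = require('mongoose'); // assumes the app already connects at startup

app.get('/health', (req, res) => {
    // readyState 1 === connected; anything else means the database is unreachable
    const dbConnected = mongoose.connection.readyState === 1;
    res.status(dbConnected ? 200 : 503).json({
        uptime: process.uptime(),
        dbConnected,
        timestamp: Date.now()
    });
});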

Non-Root User Setup

Running containers as root is a security anti-pattern. If an attacker exploits a vulnerability in your Node.js application, they get root access to the container — and potentially to mounted volumes or the host via container escapes.

FROM node:20-alpine AS production

# Create a non-root user
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

WORKDIR /app

COPY --from=build --chown=appuser:appgroup /app/dist ./dist
COPY --chown=appuser:appgroup package.json package-lock.json ./

RUN npm ci --only=production && npm cache clean --force

# Switch to non-root user AFTER installing dependencies
USER appuser

EXPOSE 3000
CMD ["node", "dist/server.js"]

The official Node.js Alpine images include a node user (UID 1000) that you can use directly instead of creating your own:

USER node

Just ensure your WORKDIR and copied files are owned by the node user. The --chown flag on COPY handles this cleanly.
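A minimal sketch of the production stage using the built-in node user instead of creating one:

FROM node:20-alpine AS production
WORKDIR /app

# --chown hands ownership to the built-in node user (UID 1000) during the copy
COPY --from=build --chown=node:node /app/dist ./dist
COPY --chown=node:node package.json package-lock.json ./

RUN npm ci --only=production && npm cache clean --force

USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]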


Handling Environment Variables and Secrets

Never bake secrets into images. This is Docker 101, but I still see it constantly.

# NEVER DO THIS
ENV DATABASE_URL=postgres://user:password@host:5432/db
ENV API_KEY=sk-1234567890

Instead, pass environment variables at runtime:

docker run -e DATABASE_URL=postgres://... -e API_KEY=sk-... myapp:latest

For build-time secrets (like private npm registry tokens), use BuildKit's secret mounts:

# syntax=docker/dockerfile:1
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./

# Secret is mounted only during this RUN and never stored in a layer
RUN --mount=type=secret,id=npmrc,target=/app/.npmrc \
    npm ci

Build with:

docker build --secret id=npmrc,src=$HOME/.npmrc -t myapp .

The .npmrc file is available during npm ci but is not persisted in any image layer. This is the correct way to handle private registry authentication.


Docker Compose for Local Development

Multi-stage builds shine in development when paired with Docker Compose. You can target a specific stage for local development that includes dev tools, while your CI pipeline builds the production stage.

# docker-compose.yml
version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: development  # Build only up to the 'development' stage
    ports:
      - "3000:3000"
      - "9229:9229"  # Node.js debugger
    environment:
      - NODE_ENV=development
      - DATABASE_URL=mongodb://mongo:27017/myapp
      - REDIS_URL=redis://redis:6379
    volumes:
      # Mount source code for hot reload
      - ./src:/app/src
      - ./package.json:/app/package.json
      # Anonymous volume to prevent overwriting container's node_modules
      - /app/node_modules
    depends_on:
      mongo:
        condition: service_healthy
      redis:
        condition: service_started
    command: npx nodemon --inspect=0.0.0.0:9229 src/server.ts

  mongo:
    image: mongo:7
    ports:
      - "27017:27017"
    volumes:
      - mongo_data:/data/db
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongosh --quiet
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  mongo_data:

And the corresponding Dockerfile with a development stage:

# Stage 1: Dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: Development (used by docker-compose)
FROM node:20-alpine AS development
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Dev tools available: nodemon, ts-node, debugger
EXPOSE 3000 9229
CMD ["npx", "nodemon", "--inspect=0.0.0.0:9229", "src/server.ts"]

# Stage 3: Build
FROM node:20-alpine AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Stage 4: Production
FROM node:20-alpine AS production
RUN apk add --no-cache tini
WORKDIR /app

COPY package.json package-lock.json ./
RUN npm ci --only=production && npm cache clean --force
COPY --from=build /app/dist ./dist

RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup && \
    chown -R appuser:appgroup /app

USER appuser

HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

EXPOSE 3000
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]

To run locally, docker compose up builds and starts the development stage, with hot reload provided by the volume mounts. To produce the production image, run docker build --target production -t myapp:latest ., which builds the final stage.


CI/CD Integration

GitHub Actions

# .github/workflows/build.yml
name: Build and Push

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          target: production
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

The cache-from and cache-to directives use GitHub Actions' built-in cache storage to persist Docker layer caches between workflow runs. This reduces build times from minutes to seconds for dependency-only or code-only changes.

Azure DevOps

# azure-pipelines.yml
trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'ubuntu-latest'

variables:
  dockerRegistryServiceConnection: 'my-acr-connection'
  imageRepository: 'myapp'
  containerRegistry: 'myregistry.azurecr.io'
  tag: '$(Build.BuildId)'

steps:
  - task: Docker@2
    displayName: 'Build production image'
    inputs:
      command: build
      repository: $(imageRepository)
      dockerfile: Dockerfile
      containerRegistry: $(dockerRegistryServiceConnection)
      arguments: '--target production'
      tags: |
        $(tag)
        latest

  - task: Docker@2
    displayName: 'Push to ACR'
    inputs:
      command: push
      repository: $(imageRepository)
      containerRegistry: $(dockerRegistryServiceConnection)
      tags: |
        $(tag)
        latest

Complete Working Example

Here is a realistic, production-ready setup for an Express.js application with TypeScript, native dependencies (sharp for image processing, bcrypt for password hashing), MongoDB, and Redis.

Project Structure

myapp/
├── src/
│   ├── server.ts
│   ├── routes/
│   ├── middleware/
│   └── services/
├── package.json
├── tsconfig.json
├── Dockerfile
├── docker-compose.yml
├── docker-compose.prod.yml
└── .dockerignore

Production Dockerfile

# syntax=docker/dockerfile:1

# ============================================
# Stage 1: Install ALL dependencies
# ============================================
FROM node:20-alpine AS deps

# Native module build requirements
RUN apk add --no-cache python3 make g++ vips-dev

WORKDIR /app

COPY package.json package-lock.json ./

RUN --mount=type=cache,target=/root/.npm \
    npm ci

# ============================================
# Stage 2: Build TypeScript
# ============================================
FROM node:20-alpine AS build

WORKDIR /app

COPY --from=deps /app/node_modules ./node_modules
COPY . .

RUN npm run build
# Output: /app/dist/

# ============================================
# Stage 3: Production dependencies only
# ============================================
FROM node:20-alpine AS prod-deps

RUN apk add --no-cache python3 make g++ vips-dev

WORKDIR /app

COPY package.json package-lock.json ./

RUN --mount=type=cache,target=/root/.npm \
    npm ci --only=production && \
    npm cache clean --force

# ============================================
# Stage 4: Production runtime
# ============================================
FROM node:20-alpine AS production

# Tini: proper init process for handling signals
RUN apk add --no-cache tini vips

# Non-root user
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

WORKDIR /app

# Copy production dependencies from stage 3
COPY --from=prod-deps --chown=appuser:appgroup /app/node_modules ./node_modules

# Copy compiled application from stage 2
COPY --from=build --chown=appuser:appgroup /app/dist ./dist

# Copy package.json for runtime metadata
COPY --chown=appuser:appgroup package.json ./

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=15s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

USER appuser

EXPOSE 3000

ENV NODE_ENV=production

ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]

Docker Compose for Local Development

# docker-compose.yml
version: '3.8'

services:
  app:
    build:
      context: .
      target: deps  # This Dockerfile has no dev stage; stop at deps and bind-mount the source below
    ports:
      - "3000:3000"
      - "9229:9229"
    environment:
      NODE_ENV: development
      DATABASE_URL: mongodb://mongo:27017/myapp
      REDIS_URL: redis://redis:6379
    volumes:
      - ./src:/app/src
      - ./tsconfig.json:/app/tsconfig.json
      - /app/node_modules
    command: npx nodemon --inspect=0.0.0.0:9229 --exec ts-node src/server.ts
    depends_on:
      mongo:
        condition: service_healthy
      redis:
        condition: service_started

  mongo:
    image: mongo:7
    ports:
      - "27017:27017"
    volumes:
      - mongo_data:/data/db
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongosh --quiet
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 10s

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  mongo_data:
  redis_data:

Production Docker Compose

# docker-compose.prod.yml
version: '3.8'

services:
  app:
    build:
      context: .
      target: production
    ports:
      - "3000:3000"
    environment:
      NODE_ENV: production
      DATABASE_URL: ${DATABASE_URL}
      REDIS_URL: ${REDIS_URL}
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

Before/After Image Sizes

Measured on the example application above:

Variant                         Image Size   Build Time
single-stage (node:20)          1.14 GB      87s
multi-stage (node:20-alpine)    174 MB       52s
multi-stage (distroless)        128 MB       55s

The multi-stage Alpine build is 85% smaller than the naive single-stage build. The distroless variant squeezes out another 26% but sacrifices shell access.


Common Issues & Troubleshooting

1. Alpine / musl Compatibility Errors

Error: Error loading shared library libstdc++.so.6:
No such file or directory (needed by /app/node_modules/bcrypt/lib/binding/napi-v3/bcrypt_lib.node)

Cause: Native module compiled against glibc (in a non-Alpine stage or prebuilt binary) but running on Alpine (musl).

Fix: Ensure native modules are compiled in the same Alpine environment they will run in. If using a separate prod-deps stage, it must also be Alpine:

# Both stages must use Alpine
FROM node:20-alpine AS prod-deps
RUN apk add --no-cache python3 make g++
# ...
FROM node:20-alpine AS production

2. Permission Denied on node_modules

Error: EACCES: permission denied, open '/app/node_modules/.package-lock.json'

Cause: Dependencies installed as root, but application runs as non-root user.

Fix: Install dependencies before switching to the non-root user, or use --chown on the COPY directive:

COPY --from=prod-deps --chown=appuser:appgroup /app/node_modules ./node_modules

3. Missing Runtime Libraries for Native Modules

Error: libvips.so.42: cannot open shared object file: No such file or directory

Cause: Build tools (dev headers) were installed in the build stage but the runtime library was not installed in the production stage.

Fix: Install the runtime package (not the -dev variant) in the production stage:

# Build stage: vips-dev (headers + library, ~50 MB)
FROM node:20-alpine AS deps
RUN apk add --no-cache vips-dev

# Production stage: vips (runtime library only, ~8 MB)
FROM node:20-alpine AS production
RUN apk add --no-cache vips

4. Docker Build Context Too Large / Slow

Sending build context to Docker daemon  458.3MB

Cause: Missing or incomplete .dockerignore. The .git directory and node_modules are being sent to the daemon.

Fix: Create a proper .dockerignore (see section above). The build context should be under 10 MB for most Node.js projects.

5. Layer Cache Invalidation on Every Build

Step 4/10 : COPY . .
 ---> No cache
Step 5/10 : RUN npm ci
 ---> Running in abc123...

Cause: COPY . . is placed before npm ci. Any source code change invalidates the COPY layer, which invalidates the npm ci layer.

Fix: Copy dependency manifests and install before copying source code:

COPY package.json package-lock.json ./
RUN npm ci
# THEN copy source
COPY . .

6. SIGTERM Not Reaching Node.js Process

# Container takes 10s to stop instead of shutting down gracefully
$ time docker stop myapp
myapp
real    0m10.03s

Cause: Node.js is running as PID 1 and does not handle SIGTERM by default when it is the init process. Docker sends SIGTERM, waits 10 seconds, then sends SIGKILL.

Fix: Use tini as the init process, and handle SIGTERM in your application:

RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]
// In your application
process.on('SIGTERM', () => {
    console.log('SIGTERM received, shutting down gracefully');
    server.close(() => {
        process.exit(0);
    });
});

Best Practices

  • Pin your base image digests in production. Use node:20-alpine@sha256:abc123... instead of node:20-alpine. Tags are mutable — someone can push a new image to the same tag. Digests are immutable.

  • Use npm ci instead of npm install. npm ci is deterministic (uses the lockfile exactly), faster (skips dependency resolution), and fails if the lockfile is out of sync with package.json. There is no reason to use npm install in a Dockerfile.

  • Always include a .dockerignore file. Without one, you are sending your .git directory, node_modules, test files, and potentially secrets to the build daemon. This slows builds and can leak sensitive data.

  • Run as a non-root user. This is a fundamental container security practice. If your application is compromised, the attacker's access is limited to the appuser account rather than root.

  • Use tini or dumb-init as PID 1. Node.js does not properly handle signals when running as PID 1. Tini forwards signals correctly and reaps zombie processes.

  • Set NODE_ENV=production in your production stage. Express.js and many npm packages change behavior based on this variable — enabling view caching, disabling verbose error pages, and optimizing performance.

  • Clean up package manager caches. Add npm cache clean --force after npm ci in your production stage. The npm cache serves no purpose inside a container and wastes 50-100 MB.

  • Use COPY --chown instead of separate RUN chown commands. Each RUN chown creates a new layer that duplicates the file data. COPY --chown sets ownership during the copy without an extra layer.

  • Set explicit memory limits. Node.js sizes its default heap from the memory it detects on the machine (historically around 1.5 GB of old space on 64-bit builds), not from the container's limit. In a container capped at 512 MB, that mismatch causes OOM kills. Set --max-old-space-size appropriately:

CMD ["node", "--max-old-space-size=384", "dist/server.js"]
  • Scan your images for vulnerabilities. Run docker scout cves myapp:latest or integrate Trivy/Snyk into your CI pipeline. Smaller images have fewer packages and fewer vulnerabilities.
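If you prefer an open-source scanner, Trivy covers the same ground and slots easily into CI:

# Fail the pipeline when high or critical CVEs are found in the built image
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:latest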
