
Docker Volume Management and Data Persistence

Complete guide to Docker volume management for Node.js applications, covering named volumes, bind mounts, backup strategies, and data persistence patterns for databases.


Containers are ephemeral. When a container stops, everything inside its writable layer disappears. That is the entire point — reproducible, disposable environments. But your database data, uploaded files, and application logs need to survive container restarts. Docker volumes solve this problem, and understanding the three storage mechanisms (volumes, bind mounts, tmpfs) is essential for building reliable containerized Node.js applications.

Prerequisites

  • Docker Desktop v4.0+ or Docker Engine on Linux
  • Docker Compose v2
  • Basic understanding of Docker containers and images
  • Familiarity with PostgreSQL or MongoDB (for database examples)

Types of Docker Storage

Docker provides three mechanisms for persisting data outside a container's writable layer.

Named Volumes

Named volumes are Docker's recommended storage mechanism. Docker manages their lifecycle, location, and permissions.

# Create a named volume
docker volume create myapp-data

# Inspect it
docker volume inspect myapp-data
# [
#   {
#     "CreatedAt": "2026-02-13T10:00:00Z",
#     "Driver": "local",
#     "Labels": {},
#     "Mountpoint": "/var/lib/docker/volumes/myapp-data/_data",
#     "Name": "myapp-data",
#     "Options": {},
#     "Scope": "local"
#   }
# ]

Mount a named volume into a container:

docker run -d \
  --name postgres \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16-alpine

Named volumes have several advantages:

  • Docker manages the storage location
  • They work identically across Linux, macOS, and Windows
  • They have near-native filesystem performance (even on Docker Desktop)
  • They can be shared between containers
  • They survive docker compose down (but not docker compose down -v)
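Because `docker volume inspect` emits JSON, scripts can read a volume's host mountpoint programmatically. A minimal sketch — the inspect output is hard-coded here to mirror the example above; in practice you would capture it with `child_process.execSync('docker volume inspect myapp-data')`:

```javascript
// Parse the JSON printed by `docker volume inspect <name>`.
// Sample output hard-coded for illustration.
var inspectOutput = JSON.stringify([
  {
    CreatedAt: '2026-02-13T10:00:00Z',
    Driver: 'local',
    Labels: {},
    Mountpoint: '/var/lib/docker/volumes/myapp-data/_data',
    Name: 'myapp-data',
    Options: {},
    Scope: 'local'
  }
]);

function getMountpoint(json) {
  var volumes = JSON.parse(json); // inspect always returns an array
  return volumes.length ? volumes[0].Mountpoint : null;
}

console.log(getMountpoint(inspectOutput));
// /var/lib/docker/volumes/myapp-data/_data
```

Note that on Docker Desktop the mountpoint lives inside the Docker VM, not on your macOS/Windows filesystem.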

Bind Mounts

Bind mounts map a specific host directory into the container.

docker run -d \
  --name api \
  -v /home/user/project/src:/app/src \
  myapp:latest

Bind mounts give you direct access to host files, making them essential for development. Edit a file on your host, and it instantly changes inside the container. But performance suffers on macOS and Windows because of the VM translation layer.

# docker-compose.yml
services:
  api:
    volumes:
      - ./src:/app/src          # Bind mount - relative path
      - /home/user/data:/data   # Bind mount - absolute path
      - pgdata:/var/lib/pg      # Named volume

In Compose, paths starting with . or / create bind mounts. A plain name creates a named volume reference.
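That rule is mechanical enough to capture in a helper — a sketch for sanity-checking entries in your own tooling, not part of Compose itself (it ignores Windows drive-letter paths and the long mount syntax):

```javascript
// Classify a Compose short-syntax volume entry the way Compose does:
// a source starting with '.' or '/' is a bind mount; a plain name
// references a named volume declared under the top-level `volumes:` key.
function classifyVolumeEntry(entry) {
  var source = entry.split(':')[0];
  if (source.startsWith('.') || source.startsWith('/')) {
    return 'bind';
  }
  return 'named';
}

console.log(classifyVolumeEntry('./src:/app/src'));        // bind
console.log(classifyVolumeEntry('/home/user/data:/data')); // bind
console.log(classifyVolumeEntry('pgdata:/var/lib/pg'));    // named
```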

tmpfs Mounts

tmpfs mounts exist only in memory. They never touch the disk.

docker run -d \
  --name api \
  --tmpfs /app/tmp:rw,size=100m \
  myapp:latest

Use tmpfs for temporary files, session data, or anything that should not persist and should be fast. It disappears when the container stops.

services:
  api:
    tmpfs:
      - /app/tmp:size=100m
      - /tmp:size=50m

Volume Lifecycle Management

# List all volumes
docker volume ls
# DRIVER    VOLUME NAME
# local     myapp_pgdata
# local     myapp_redisdata
# local     abc123def456  # anonymous volume (hash name)

# Inspect a volume
docker volume inspect myapp_pgdata

# Remove a specific volume
docker volume rm myapp_pgdata

# Remove unused anonymous volumes (Docker 23+; named volumes are kept)
docker volume prune

# Remove ALL unused volumes including named ones
docker volume prune --all

# WARNING: prune deletes data permanently. There is no undo.
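Anonymous volumes (covered next) appear in `docker volume ls` as 64-character hex names. A small filter can flag them before you prune — a sketch that assumes you feed it the names column, e.g. from `docker volume ls --format '{{.Name}}'`:

```javascript
// Anonymous volumes get a random 64-hex-character name from Docker;
// named volumes keep the name you (or Compose) gave them.
function isAnonymousVolume(name) {
  return /^[0-9a-f]{64}$/.test(name);
}

var names = [
  'myapp_pgdata',
  'myapp_redisdata',
  'a'.repeat(64) // stand-in for a real 64-hex hash name
];

var anonymous = names.filter(isAnonymousVolume);
console.log('Anonymous volumes found:', anonymous.length);
```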

Anonymous Volumes

Anonymous volumes are created without a name. Docker assigns a random hash.

docker run -v /data myapp  # Anonymous volume at /data

These are hard to manage and easy to lose. The most common use case is the node_modules trick:

volumes:
  - .:/app              # Bind mount project
  - /app/node_modules   # Anonymous volume prevents host override

This creates an anonymous volume at /app/node_modules that masks the host's node_modules directory. The container uses its own Linux-compiled modules while your host bind mount provides the source code.

The node_modules Volume Strategy

This pattern deserves special attention because it trips up every Node.js developer using Docker for the first time.

The problem: you bind mount your entire project into the container for live reloading. But node_modules on your macOS or Windows host contains native modules compiled for the wrong platform. When the bind mount overlays the container's filesystem, it replaces the container's correctly-compiled node_modules with your host's broken ones.

# Dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install          # Installs Linux-native modules
COPY . .                 # Would be overridden by bind mount anyway
CMD ["node", "app.js"]

# docker-compose.yml
services:
  api:
    build: .
    volumes:
      - .:/app                # Bind mount everything
      - /app/node_modules     # Except node_modules (anonymous volume)
The anonymous volume at /app/node_modules preserves the container's node_modules from the npm install during build. The bind mount overlays everything else.

The drawback: the anonymous volume outlives image rebuilds, so when you add a new dependency a plain rebuild is not enough — you must also renew the volume:

# After adding a dependency to package.json
docker compose up --build --renew-anon-volumes

# Or exec into the running container and install
docker compose exec api npm install new-package

Sharing Data Between Containers

Named volumes can be mounted by multiple containers simultaneously.

services:
  api:
    volumes:
      - uploads:/app/uploads  # Read/write

  image-processor:
    volumes:
      - uploads:/data/input:ro  # Read-only

volumes:
  uploads:

The :ro flag makes the mount read-only inside that container. The API writes uploads, and the image processor reads them.

For one-time data sharing between containers:

# Copy files from a host directory into a volume
docker run --rm -v mydata:/backup -v /host/path:/source alpine \
  cp -r /source/. /backup/

# Copy files from a volume to the host
docker run --rm -v mydata:/data -v $(pwd):/backup alpine \
  tar czf /backup/data-backup.tar.gz -C /data .

Volume Permissions and Ownership

Permission issues are the most common Docker volume headache. By default, files created by a container are owned by the user running the process inside the container.

# Check who owns files in a volume
docker run --rm -v pgdata:/data alpine ls -la /data
# drwx------    2 70       70            4096 Feb 13 10:00 .
# PostgreSQL runs as UID 70 inside the container

When your Node.js app writes files that need to be readable by other containers or the host:

FROM node:20-alpine

# Create app user with specific UID/GID
RUN addgroup -g 1000 -S appgroup && \
    adduser -u 1000 -S appuser -G appgroup

WORKDIR /app
RUN chown appuser:appgroup /app

COPY --chown=appuser:appgroup package*.json ./
RUN npm install
COPY --chown=appuser:appgroup . .

USER appuser
CMD ["node", "app.js"]

If your host user has UID 1000 (common on Linux), files created by the container will be owned by your host user. On macOS and Windows with Docker Desktop, ownership mapping is handled automatically.

For volumes that need specific directories (and permissions) created at startup, run an init script before the app starts:

// scripts/init-dirs.js
var fs = require('fs');

var dirs = [
  '/app/uploads',
  '/app/uploads/images',
  '/app/uploads/documents',
  '/app/logs'
];

dirs.forEach(function (dir) {
  if (!fs.existsSync(dir)) {
    fs.mkdirSync(dir, { recursive: true, mode: 0o755 });
    console.log('Created directory:', dir);
  }
});

Database Data Persistence

PostgreSQL

services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: myapp
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./db/init.sql:/docker-entrypoint-initdb.d/01-init.sql

volumes:
  pgdata:

The pgdata volume persists all database data. The init script in docker-entrypoint-initdb.d/ runs only when the volume is empty (first start).

To reset the database completely:

docker compose down -v  # Remove volumes
docker compose up -d    # Fresh start, init scripts run again

MongoDB

services:
  mongo:
    image: mongo:7
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: secret
      MONGO_INITDB_DATABASE: myapp
    volumes:
      - mongodata:/data/db
      - ./db/mongo-init.js:/docker-entrypoint-initdb.d/01-init.js

volumes:
  mongodata:

MongoDB stores data in /data/db. The init script pattern is the same — JavaScript files in docker-entrypoint-initdb.d/ run on first start.

Backup and Restore Strategies

Backup a Named Volume

# Backup PostgreSQL data volume to a tar file
docker run --rm \
  -v pgdata:/source:ro \
  -v $(pwd)/backups:/backup \
  alpine tar czf /backup/pgdata-$(date +%Y%m%d).tar.gz -C /source .

Database-Level Backup (Recommended)

Volume-level backups copy raw files, which can be inconsistent if the database is running. Use database-native tools instead.

# PostgreSQL dump (-T disables TTY allocation so the redirected output stays clean)
docker compose exec -T postgres pg_dump -U appuser myapp > backups/myapp-$(date +%Y%m%d).sql

# PostgreSQL dump (compressed custom format)
docker compose exec -T postgres pg_dump -U appuser -Fc myapp > backups/myapp-$(date +%Y%m%d).dump

# MongoDB dump
docker compose exec -T mongo mongodump --uri="mongodb://admin:secret@localhost:27017/myapp" --archive=/tmp/backup.gz --gzip
docker compose cp mongo:/tmp/backup.gz backups/mongo-$(date +%Y%m%d).gz

Restore

# PostgreSQL restore from SQL
docker compose exec -T postgres psql -U appuser myapp < backups/myapp-20260213.sql

# PostgreSQL restore from custom format
docker compose exec -T postgres pg_restore -U appuser -d myapp < backups/myapp-20260213.dump

# MongoDB restore
docker compose cp backups/mongo-20260213.gz mongo:/tmp/backup.gz
docker compose exec mongo mongorestore --uri="mongodb://admin:secret@localhost:27017" --archive=/tmp/backup.gz --gzip

Automated Backup Script

// scripts/backup.js
var execSync = require('child_process').execSync;
var fs = require('fs');
var path = require('path');

var timestamp = new Date().toISOString().replace(/[:.]/g, '-').slice(0, 19);
var backupDir = path.join(__dirname, '..', 'backups');
var filename = 'myapp-' + timestamp + '.sql';

try {
  fs.mkdirSync(backupDir, { recursive: true });

  var cmd = 'docker compose exec -T postgres pg_dump -U appuser myapp > ' +
    path.join(backupDir, filename);

  execSync(cmd, { stdio: 'inherit' });
  console.log('Backup created: ' + filename);

  // Keep only the last 7 backups
  var files = fs.readdirSync(backupDir)
    .filter(function (f) { return f.endsWith('.sql'); })
    .sort()
    .reverse();

  files.slice(7).forEach(function (old) {
    fs.unlinkSync(path.join(backupDir, old));
    console.log('Removed old backup: ' + old);
  });
} catch (err) {
  console.error('Backup failed:', err.message);
  process.exit(1);
}
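The keep-the-last-seven logic in the script generalizes to a small pure function, which is easier to test in isolation than the script itself — a sketch using example filenames:

```javascript
// Given backup filenames with sortable timestamps (myapp-YYYY-MM-DD...),
// return the files that should be deleted, keeping the newest `keep`.
function backupsToDelete(filenames, keep) {
  return filenames
    .filter(function (f) { return f.endsWith('.sql'); })
    .sort()        // ISO-style timestamps sort lexicographically
    .reverse()     // newest first
    .slice(keep);  // everything past the newest `keep` entries
}

var files = [
  'myapp-2026-02-10.sql', 'myapp-2026-02-11.sql', 'myapp-2026-02-12.sql',
  'myapp-2026-02-13.sql', 'notes.txt'
];
console.log(backupsToDelete(files, 3));
// [ 'myapp-2026-02-10.sql' ]
```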

Docker Compose Volume Configuration

volumes:
  # Simple named volume
  pgdata:

  # Named volume with driver options
  app-logs:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /var/log/myapp

  # External volume (created outside Compose)
  shared-data:
    external: true

  # Volume with labels
  uploads:
    labels:
      com.myapp.description: "User uploaded files"
      com.myapp.backup: "daily"

External volumes are not created or destroyed by docker compose up/down. You manage them separately:

docker volume create shared-data
docker compose up -d
# ... later
docker compose down  # shared-data is NOT removed

Performance Characteristics

Performance varies dramatically by platform and storage type.

Storage Type    Linux       macOS (VirtioFS)   macOS (gRPC-FUSE)   Windows (WSL2)
Named volume    Native      Near-native        Near-native         Near-native
Bind mount      Native      2-3x slower        5-10x slower        2-4x slower
tmpfs           RAM speed   RAM speed          RAM speed           RAM speed

For Node.js applications on macOS/Windows:

volumes:
  # Bind mount only source code (for live reload)
  - ./src:/app/src
  - ./routes:/app/routes
  - ./models:/app/models
  - ./views:/app/views

  # Named volumes for heavy I/O paths
  - node_modules:/app/node_modules
  - logs:/app/logs
  - uploads:/app/uploads

Using named volumes for node_modules instead of anonymous volumes gives you better visibility and control:

volumes:
  node_modules:  # Named, shows up in docker volume ls

Complete Working Example

# docker-compose.yml
# (the legacy `version` key is omitted; Compose v2 ignores it)

services:
  api:
    build:
      context: .
      target: development
    volumes:
      - ./src:/app/src
      - ./routes:/app/routes
      - ./models:/app/models
      - ./views:/app/views
      - ./app.js:/app/app.js
      - ./package.json:/app/package.json
      - node_modules:/app/node_modules
      - uploads:/app/uploads
      - logs:/app/logs
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://appuser:secret@postgres:5432/myapp
    depends_on:
      postgres:
        condition: service_healthy

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: myapp
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./db/schema.sql:/docker-entrypoint-initdb.d/01-schema.sql
      - ./db/seed.sql:/docker-entrypoint-initdb.d/02-seed.sql
    ports:
      - "127.0.0.1:5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U appuser -d myapp"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  pgdata:
    labels:
      backup: "daily"
  node_modules:
  uploads:
  logs:
// scripts/backup-volumes.js
var execSync = require('child_process').execSync;
var fs = require('fs');
var path = require('path');

var backupDir = path.join(__dirname, '..', 'backups');
var timestamp = new Date().toISOString().slice(0, 10);

if (!fs.existsSync(backupDir)) {
  fs.mkdirSync(backupDir, { recursive: true });
}

// Database backup (consistent, application-level)
console.log('Backing up database...');
var dbFile = path.join(backupDir, 'db-' + timestamp + '.sql');
execSync(
  'docker compose exec -T postgres pg_dump -U appuser myapp > ' + dbFile,
  { stdio: 'inherit' }
);
console.log('Database backup: ' + dbFile);

// Uploads volume backup (file-level). Compose prefixes volume names with
// the project name, so the 'uploads' volume becomes 'myapp_uploads' here.
console.log('Backing up uploads...');
var uploadsFile = path.join(backupDir, 'uploads-' + timestamp + '.tar.gz');
execSync(
  'docker run --rm -v myapp_uploads:/source:ro -v ' + backupDir + ':/backup alpine ' +
  'tar czf /backup/uploads-' + timestamp + '.tar.gz -C /source .',
  { stdio: 'inherit' }
);
console.log('Uploads backup: ' + uploadsFile);
console.log('Uploads backup: ' + uploadsFile);

// Cleanup old backups (keep 7 days)
var cutoff = Date.now() - (7 * 24 * 60 * 60 * 1000);
fs.readdirSync(backupDir).forEach(function(file) {
  var filePath = path.join(backupDir, file);
  var stat = fs.statSync(filePath);
  if (stat.mtimeMs < cutoff) {
    fs.unlinkSync(filePath);
    console.log('Removed old backup: ' + file);
  }
});

console.log('Backup complete.');
// scripts/restore-db.js
var execSync = require('child_process').execSync;
var fs = require('fs');
var path = require('path');

var backupFile = process.argv[2];
if (!backupFile) {
  console.error('Usage: node scripts/restore-db.js <backup-file>');
  process.exit(1);
}

if (!fs.existsSync(backupFile)) {
  console.error('File not found: ' + backupFile);
  process.exit(1);
}

console.log('Restoring from: ' + backupFile);
console.log('WARNING: This will overwrite the current database.');

try {
  // Drop and recreate (WITH (FORCE) terminates open connections; PostgreSQL 13+)
  execSync(
    'docker compose exec -T postgres psql -U appuser -d postgres -c "DROP DATABASE IF EXISTS myapp WITH (FORCE);"',
    { stdio: 'inherit' }
  );
  execSync(
    'docker compose exec -T postgres psql -U appuser -d postgres -c "CREATE DATABASE myapp OWNER appuser;"',
    { stdio: 'inherit' }
  );

  // Restore
  execSync(
    'docker compose exec -T postgres psql -U appuser -d myapp < ' + backupFile,
    { stdio: 'inherit' }
  );

  console.log('Restore complete.');
} catch (err) {
  console.error('Restore failed:', err.message);
  process.exit(1);
}

Common Issues and Troubleshooting

1. Permission Denied on Volume Mount

Error: EACCES: permission denied, open '/app/uploads/image.jpg'

The container process does not have write permission to the volume. Check the UID:

docker compose exec api id
# uid=1000(node) gid=1000(node)

docker compose exec api ls -la /app/uploads
# drwxr-xr-x 2 root root 4096 Feb 13 10:00 .

Fix by setting ownership in the Dockerfile or using an init script:

RUN mkdir -p /app/uploads && chown node:node /app/uploads
USER node

2. Volume Data Not Persisting

docker compose down
docker compose up -d
# Database is empty!

You probably ran docker compose down -v, which removes volumes. Without -v, named volumes persist. Check your scripts and shell aliases for a stray -v flag.

3. Init Scripts Not Running After Schema Change

# Changed init.sql but tables are unchanged

PostgreSQL init scripts only run when the data directory is empty. You must destroy the volume:

docker compose down -v
docker compose up -d

4. Disk Space Exhaustion from Volumes

Error: No space left on device

Docker volumes accumulate silently:

docker system df
# TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
# Volumes         47        3         12.8GB    11.2GB (87%)

# Remove unused volumes
docker volume prune

# Nuclear option: remove everything
docker system prune --volumes

5. Bind Mount Shows Empty Directory

# Volume is empty inside container
docker compose exec api ls /app/src
# (nothing)

The host path does not exist or is misspelled. Docker silently creates empty directories for missing bind mount sources. Double-check the path in docker-compose.yml and ensure it exists on the host.

Best Practices

  • Use named volumes for any data that must survive container restarts. Anonymous volumes are easy to lose during cleanup operations.
  • Use database-native backup tools, not volume-level copies. pg_dump ensures a consistent snapshot. Copying raw volume files while the database is running risks corruption.
  • Keep bind mounts targeted in development. Mount ./src:/app/src instead of .:/app to minimize filesystem overhead on macOS and Windows.
  • Label your volumes for management. Labels like backup: daily make it easy to script automated operations.
  • Never use -v with docker compose down unless you intend to destroy data. Make plain docker compose down (without -v) your default habit, and reserve the -v flag for deliberate resets.
  • Use .dockerignore to keep bind mount context small. Exclude node_modules, .git, and build artifacts.
  • Set explicit UID/GID in Dockerfiles. Match your host user's UID (typically 1000 on Linux) to avoid permission headaches.
  • Document the volume strategy for your project. Include volume names, backup procedures, and reset instructions in your README.
