Graceful Shutdown in Node.js Applications
A practical guide to implementing graceful shutdown in Node.js applications covering signal handling, connection draining, database cleanup, and container orchestration.
Overview
Every Node.js application that runs in production will eventually need to stop. Deployments, scaling events, host maintenance, container orchestration — these all trigger shutdowns. The question is whether your application handles that shutdown cleanly or whether it drops requests on the floor, leaves database transactions half-committed, and abandons messages in your queue.
Graceful shutdown is the practice of intercepting termination signals, finishing in-flight work, releasing resources in the correct order, and then exiting with a clean status code. It sounds simple, but getting it right requires understanding Unix signals, the Node.js event loop, HTTP keep-alive behavior, and the shutdown semantics of every external system your application touches.
I have seen production incidents caused by applications that did not handle shutdown properly — orphaned database locks that blocked subsequent deployments, lost queue messages that required manual replay, and health check failures that cascaded across an entire cluster. This article covers how to avoid all of that.
Prerequisites
- Node.js v16 or later (v20+ recommended for production)
- Familiarity with Express.js and basic HTTP server concepts
- Understanding of the Node.js event loop
- Basic knowledge of Unix process signals
- Optional: experience with Docker, Kubernetes, or PM2
Why Graceful Shutdown Matters
When a Node.js process receives a termination signal and exits immediately, several things can go wrong:
In-flight HTTP requests get dropped. A client that submitted a form or initiated a payment receives a connection reset error. If the request partially completed a multi-step operation, your data is now in an inconsistent state.
Database connections are severed without proper cleanup. Connection pools do not return connections to the server cleanly. Prepared statements and temporary tables may linger. Transactions that were in progress are rolled back by the database server, but only after the connection timeout expires — which can take minutes and hold locks the entire time.
Message queue consumers lose messages. If your application pulled a message from RabbitMQ or SQS but had not yet acknowledged it, the message returns to the queue after the visibility timeout. That might be acceptable for idempotent operations, but for anything else, you risk duplicate processing or lost work.
Background workers and child processes are orphaned. A child process spawned by your application may continue running after the parent exits, consuming resources and potentially conflicting with the next instance.
File handles and network sockets are leaked. The operating system cleans these up eventually, but "eventually" is not a production-grade strategy.
Unix Signals
Before writing any shutdown code, you need to understand the signals that trigger shutdown.
SIGTERM (Signal 15)
This is the standard termination signal. It is what Kubernetes sends to your pod during a rolling update, what Docker sends during docker stop, and what most process managers use to request a clean shutdown. Your application should treat SIGTERM as a polite request to wrap up and exit.
SIGINT (Signal 2)
This is what you get when you press Ctrl+C in a terminal. During development, this is the primary shutdown signal. In production, it rarely appears unless someone is attached to the process interactively.
SIGKILL (Signal 9)
This signal cannot be caught, blocked, or ignored. The operating system terminates the process immediately. Kubernetes sends SIGKILL after the terminationGracePeriodSeconds expires (default 30 seconds). You cannot handle this signal — your only defense is to finish your cleanup before it arrives.
SIGHUP (Signal 1)
Historically used to signal that a terminal connection was lost. Some applications repurpose it to trigger a configuration reload. If no handler is registered, the default action on Unix-like platforms is to terminate the process. If your application uses SIGHUP for anything, be explicit about it.
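If you do repurpose SIGHUP, register the handler explicitly. A minimal sketch, where reloadConfig is a placeholder for your own reload logic:
// Explicit SIGHUP handler for configuration reload
process.on('SIGHUP', function () {
  console.log('Received SIGHUP — reloading configuration');
  reloadConfig(); // placeholder — your application's reload logic
});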
// Basic signal handler registration — process is a global, no require needed
process.on('SIGTERM', function () {
console.log('Received SIGTERM — starting graceful shutdown');
shutdown('SIGTERM');
});
process.on('SIGINT', function () {
console.log('Received SIGINT — starting graceful shutdown');
shutdown('SIGINT');
});
Important: On Windows, SIGTERM and SIGHUP are not real signals. Node.js emulates SIGINT from Ctrl+C, but SIGTERM behavior varies. If you are developing on Windows and deploying to Linux containers, test your signal handling in the target environment.
server.close() Behavior
The http.Server.close() method is the foundation of graceful HTTP shutdown. When you call it, two things happen:
- The server stops accepting new connections.
- The callback fires when all existing connections have been closed.
var http = require('http');
var server = http.createServer(function (req, res) {
res.writeHead(200);
res.end('OK');
});
server.listen(3000, function () {
console.log('Server listening on port 3000');
});
// Later, during shutdown:
server.close(function () {
console.log('All connections closed — server shut down');
});
There is a catch. HTTP keep-alive connections remain open even after server.close() is called. If a client holds a keep-alive connection and does not send another request, server.close() will wait indefinitely for that connection to close. This is the single most common reason graceful shutdowns hang.
Draining HTTP Connections
To handle keep-alive connections properly, you need to track active connections and destroy idle ones when shutdown begins.
var http = require('http');
var connections = new Set();
var isShuttingDown = false;
var server = http.createServer(function (req, res) {
  if (isShuttingDown) {
    // Tell the client to drop this keep-alive connection
    res.writeHead(503, { 'Connection': 'close' });
    res.end('Service is shutting down');
    return;
  }
  // Normal request handling
  res.writeHead(200);
  res.end('OK');
});
server.on('connection', function (socket) {
connections.add(socket);
socket.on('close', function () {
connections.delete(socket);
});
});
function drainConnections() {
isShuttingDown = true;
// Close idle connections immediately
connections.forEach(function (socket) {
// If the socket has no pending requests, destroy it
if (!socket._httpMessage) {
socket.destroy();
}
});
}
The key insight here is the socket._httpMessage property. When a socket is actively processing a request, _httpMessage references the ServerResponse object. When the socket is idle (keep-alive, waiting for the next request), _httpMessage is null. We destroy idle sockets immediately and let active ones finish.
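Keep in mind that _httpMessage is an undocumented internal property and may change between Node.js releases. On Node.js v18.2.0 or later, http.Server exposes supported methods that replace it — a minimal sketch:
// Node.js v18.2.0+ provides supported draining methods on http.Server
function drainConnectionsModern(server) {
  isShuttingDown = true;
  // Destroys keep-alive sockets that are not processing a request
  server.closeIdleConnections();
  // Near the shutdown deadline, drop everything that remains
  setTimeout(function () {
    server.closeAllConnections();
  }, 10000).unref();
}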
Database Connection Cleanup
Database connections must be closed after the HTTP server stops accepting requests but before the process exits. The ordering matters — you do not want to close the database pool while requests are still being processed.
var pg = require('pg');
var pool = new pg.Pool({
connectionString: process.env.DATABASE_URL,
max: 20
});
function closeDatabasePool() {
return new Promise(function (resolve, reject) {
console.log('Closing database connection pool...');
pool.end(function (err) {
if (err) {
console.error('Error closing database pool:', err.message);
reject(err);
} else {
console.log('Database pool closed');
resolve();
}
});
});
}
For MongoDB with Mongoose (the callback form below applies to Mongoose 6 and earlier):
var mongoose = require('mongoose');
function closeMongoConnection() {
return new Promise(function (resolve, reject) {
console.log('Closing MongoDB connection...');
mongoose.connection.close(false, function (err) {
if (err) {
console.error('Error closing MongoDB:', err.message);
reject(err);
} else {
console.log('MongoDB connection closed');
resolve();
}
});
});
}
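Mongoose 7 removed callback support; there, close() returns a promise, so the wrapper shrinks to:
function closeMongoConnection() {
  console.log('Closing MongoDB connection...');
  return mongoose.connection.close().then(function () {
    console.log('MongoDB connection closed');
  });
}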
For Redis (the callback form below is the node-redis v3 API):
var redis = require('redis');
var redisClient = redis.createClient({ url: process.env.REDIS_URL });
function closeRedisConnection() {
return new Promise(function (resolve, reject) {
console.log('Closing Redis connection...');
redisClient.quit(function (err) {
if (err) {
console.error('Error closing Redis:', err.message);
reject(err);
} else {
console.log('Redis connection closed');
resolve();
}
});
});
}
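With node-redis v4 (used in the complete example later in this article), quit() returns a promise and no wrapper callback is needed:
function closeRedisConnection() {
  console.log('Closing Redis connection...');
  return redisClient.quit().then(function () {
    console.log('Redis connection closed');
  });
}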
Worker Thread and Child Process Cleanup
If your application spawns worker threads or child processes, you need to signal them to stop and wait for them to exit.
var childProcess = require('child_process');
var workers = [];
function spawnWorker(script) {
var worker = childProcess.fork(script);
workers.push(worker);
worker.on('exit', function (code) {
var index = workers.indexOf(worker);
if (index > -1) {
workers.splice(index, 1);
}
console.log('Worker ' + worker.pid + ' exited with code ' + code);
});
return worker;
}
function shutdownWorkers() {
return new Promise(function (resolve) {
if (workers.length === 0) {
resolve();
return;
}
var remaining = workers.length;
workers.forEach(function (worker) {
worker.on('exit', function () {
remaining--;
if (remaining === 0) {
resolve();
}
});
// Send shutdown message to worker
worker.send({ type: 'shutdown' });
});
    // Force kill after timeout; unref so the timer cannot keep the process alive
    setTimeout(function () {
      workers.forEach(function (worker) {
        if (!worker.killed) {
          console.warn('Force killing worker ' + worker.pid);
          worker.kill('SIGKILL');
        }
      });
    }, 5000).unref();
});
}
In the worker process, handle the shutdown message:
process.on('message', function (msg) {
if (msg && msg.type === 'shutdown') {
console.log('Worker received shutdown signal — finishing current task');
// Complete current work, then exit
finishCurrentTask(function () {
process.exit(0);
});
}
});
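Putting it together, a worker built around a task loop can drain itself once a shutdown flag flips. A sketch, where fetchTask and handleTask are placeholders for your own task source:
// worker.js — drain the task loop, then exit cleanly
var stopping = false;

process.on('message', function (msg) {
  if (msg && msg.type === 'shutdown') {
    console.log('Worker received shutdown message — draining');
    stopping = true;
  }
});

function processNextTask() {
  if (stopping) {
    console.log('Worker drained — exiting');
    process.exit(0);
  }
  // fetchTask/handleTask are placeholders for your own task source
  fetchTask(function (task) {
    if (!task) {
      setTimeout(processNextTask, 500); // idle poll
      return;
    }
    handleTask(task, processNextTask);
  });
}

processNextTask();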
Message Queue Acknowledgment Handling
Message queue consumers need special attention during shutdown. The goal is to stop consuming new messages, finish processing messages already in flight, and acknowledge them before disconnecting.
var amqp = require('amqplib/callback_api');
var channel = null;
var consumerTag = null;
function startConsumer(ch, queue, handler) {
channel = ch;
ch.consume(queue, function (msg) {
handler(msg, function (err) {
if (err) {
ch.nack(msg, false, true); // Requeue on failure
} else {
ch.ack(msg);
}
});
}, {}, function (err, ok) {
consumerTag = ok.consumerTag;
});
}
function stopConsumer() {
return new Promise(function (resolve) {
if (!channel || !consumerTag) {
resolve();
return;
}
console.log('Cancelling queue consumer...');
channel.cancel(consumerTag, function (err) {
if (err) {
console.error('Error cancelling consumer:', err.message);
}
console.log('Queue consumer cancelled — no new messages will be received');
resolve();
});
});
}
The critical point: cancel the consumer first to stop the flow of new messages, then let in-flight message handlers finish and acknowledge. Only after all acknowledgments are sent should you close the connection.
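One way to know when it is safe to disconnect is to count in-flight handlers. A sketch that wraps the handler passed to startConsumer (connection is assumed to be the amqplib connection from your setup code):
var inFlight = 0;

function counted(handler) {
  return function (msg, done) {
    inFlight++;
    handler(msg, function (err) {
      done(err);      // ack/nack is sent here
      inFlight--;
    });
  };
}

function waitForInFlightMessages() {
  return new Promise(function (resolve) {
    var timer = setInterval(function () {
      if (inFlight === 0) {
        clearInterval(timer);
        resolve();
      }
    }, 100);
  });
}

// Shutdown sequence: stop the flow, wait for acks, then disconnect
// stopConsumer()
//   .then(waitForInFlightMessages)
//   .then(function () { connection.close(); });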
Health Check Integration During Shutdown
When your application starts shutting down, it should immediately begin failing health checks. This tells load balancers and service meshes to stop routing traffic to this instance.
var isShuttingDown = false;
app.get('/health', function (req, res) {
if (isShuttingDown) {
res.status(503).json({
status: 'shutting_down',
message: 'Service is draining connections'
});
return;
}
res.status(200).json({
status: 'healthy',
uptime: process.uptime()
});
});
app.get('/ready', function (req, res) {
if (isShuttingDown) {
res.status(503).json({ ready: false });
return;
}
// Check dependencies
checkDatabaseConnection(function (dbOk) {
if (dbOk) {
res.status(200).json({ ready: true });
} else {
res.status(503).json({ ready: false });
}
});
});
In Kubernetes, the readiness probe will detect the 503 and remove the pod from the service endpoints. This is often faster than waiting for the pod to fully terminate.
Timeout-Based Forced Exit
Never rely on all cleanup completing. Network partitions, hung database connections, and stuck I/O can prevent your graceful shutdown from finishing. Always set a hard timeout.
var SHUTDOWN_TIMEOUT = 30000; // 30 seconds
function shutdown(signal) {
console.log('Shutdown initiated by ' + signal);
// Set a hard deadline
var forceExitTimer = setTimeout(function () {
console.error('Graceful shutdown timed out — forcing exit');
process.exit(1);
}, SHUTDOWN_TIMEOUT);
// Ensure the timer does not prevent the process from exiting
forceExitTimer.unref();
performGracefulShutdown()
.then(function () {
console.log('Graceful shutdown completed');
process.exit(0);
})
.catch(function (err) {
console.error('Error during shutdown:', err.message);
process.exit(1);
});
}
The forceExitTimer.unref() call is critical. Without it, the timer itself keeps the event loop alive, which means the process cannot exit naturally even if all other cleanup finishes. With unref(), the timer does not prevent the process from exiting if everything else has resolved.
Kubernetes and Container Shutdown
Kubernetes follows a specific termination lifecycle:
- The pod is marked as Terminating.
- The pod is removed from service endpoints (no new traffic).
- The preStop hook runs (if configured).
- SIGTERM is sent to the container process.
- Kubernetes waits for terminationGracePeriodSeconds (default 30 seconds).
- SIGKILL is sent if the process has not exited.
There is a race condition between steps 2 and 4. The service endpoint update is asynchronous — traffic may still arrive after SIGTERM is sent. A preStop hook with a short sleep can help:
# Kubernetes pod spec
spec:
terminationGracePeriodSeconds: 45
containers:
- name: app
image: myapp:latest
lifecycle:
preStop:
exec:
command: ["sh", "-c", "sleep 5"]
readinessProbe:
httpGet:
path: /ready
port: 3000
periodSeconds: 5
failureThreshold: 1
The 5-second sleep in preStop gives the endpoint controller time to propagate the removal. After the sleep, SIGTERM fires, and your application begins its graceful shutdown. Make sure terminationGracePeriodSeconds is larger than the preStop sleep plus your application shutdown timeout — in the spec above, 45 seconds covers the 5-second sleep and a 30-second shutdown timeout with 10 seconds to spare.
For Docker specifically:
# Use exec form so the Node process receives signals directly
CMD ["node", "app.js"]
# Do NOT use shell form — signals go to the shell, not your app
# BAD: CMD node app.js
The STOPSIGNAL directive defaults to SIGTERM, which is correct for most applications. If you use docker stop, Docker sends SIGTERM, waits 10 seconds, then sends SIGKILL. Override the timeout with docker stop -t 30.
PM2 Graceful Shutdown Integration
PM2 has built-in support for graceful shutdown. When you run pm2 reload or pm2 stop, PM2 sends a shutdown message to the process before sending SIGINT.
// PM2 graceful shutdown
process.on('message', function (msg) {
if (msg === 'shutdown') {
console.log('PM2 shutdown message received');
shutdown('PM2');
}
});
// Also handle direct signals for non-PM2 environments
process.on('SIGTERM', function () {
shutdown('SIGTERM');
});
process.on('SIGINT', function () {
shutdown('SIGINT');
});
In your PM2 ecosystem file:
// ecosystem.config.js
module.exports = {
apps: [{
name: 'myapp',
script: 'app.js',
kill_timeout: 30000,
listen_timeout: 10000,
shutdown_with_message: true,
wait_ready: true
}]
};
Set kill_timeout to match your shutdown timeout. Set wait_ready to true and call process.send('ready') from your application after startup completes. This ensures PM2 does not route traffic to an instance that has not finished initializing.
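The ready signal is one line in your startup callback (mirrored in the complete example later in this article):
server.listen(PORT, function () {
  console.log('Server listening on port ' + PORT);
  // process.send exists only when running under PM2 (or another IPC parent)
  if (process.send) {
    process.send('ready');
  }
});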
Testing Shutdown Behavior
Testing graceful shutdown is often overlooked, but it is straightforward to automate.
var assert = require('assert');
var http = require('http');
var childProcess = require('child_process');
function testGracefulShutdown(callback) {
// Start the application as a child process
var app = childProcess.spawn('node', ['app.js'], {
env: Object.assign({}, process.env, { PORT: '9999' }),
stdio: 'pipe'
});
var output = '';
app.stdout.on('data', function (data) {
output += data.toString();
});
app.stderr.on('data', function (data) {
output += data.toString();
});
// Wait for the server to start
setTimeout(function () {
// Send an in-flight request
var req = http.get('http://localhost:9999/slow-endpoint', function (res) {
var body = '';
res.on('data', function (chunk) { body += chunk; });
res.on('end', function () {
assert.strictEqual(res.statusCode, 200);
console.log('In-flight request completed successfully');
});
});
// Send SIGTERM while the request is in flight
setTimeout(function () {
app.kill('SIGTERM');
}, 100);
app.on('exit', function (code) {
assert.strictEqual(code, 0, 'Process should exit with code 0');
assert.ok(output.indexOf('Graceful shutdown completed') > -1,
'Should log completion message');
console.log('Graceful shutdown test passed');
callback(null);
});
}, 2000);
}
Test the following scenarios:
- SIGTERM during idle (no active requests)
- SIGTERM with in-flight requests
- SIGTERM with a long-running request that exceeds the timeout
- SIGTERM with active database transactions
- Double SIGTERM (ensure the second signal does not restart shutdown)
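The double-signal case can reuse the spawn pattern above. A sketch that assumes the 'Shutdown already in progress' log line from the complete example later in this article:
function testDoubleSigterm(callback) {
  var app = childProcess.spawn('node', ['app.js'], {
    env: Object.assign({}, process.env, { PORT: '9999' }),
    stdio: 'pipe'
  });
  var output = '';
  app.stdout.on('data', function (data) { output += data.toString(); });
  setTimeout(function () {
    app.kill('SIGTERM');
    app.kill('SIGTERM'); // second signal must not restart the shutdown
  }, 2000);
  app.on('exit', function (code) {
    assert.strictEqual(code, 0, 'Process should still exit cleanly');
    assert.ok(output.indexOf('Shutdown already in progress') > -1,
      'Second SIGTERM should be logged as ignored');
    callback(null);
  });
}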
Shutdown Ordering: Reverse Dependency Order
The order in which you shut down components matters. The principle is simple: shut down in reverse dependency order. Components that depend on other components should stop first.
A typical shutdown sequence:
- Stop accepting new work — fail health checks, stop queue consumers.
- Close the HTTP server — stop accepting new connections, drain existing ones.
- Wait for in-flight requests — let active request handlers finish.
- Stop background jobs — cancel timers, stop cron jobs.
- Shut down worker threads and child processes — signal and wait.
- Flush buffers — log aggregators, metrics collectors, write-ahead buffers.
- Close message queue connections — after all messages are acknowledged.
- Close cache connections — Redis, Memcached.
- Close database connections — the last external resource to release.
- Exit the process — process.exit(0).
If you close the database before the HTTP server finishes draining, any in-flight request that touches the database will fail. If you close the queue connection before messages are acknowledged, those messages will be redelivered.
Complete Working Example
Here is a production-ready Express.js application with PostgreSQL, Redis, and a background worker that handles graceful shutdown properly.
var express = require('express');
var http = require('http');
var pg = require('pg');
var redis = require('redis');
var childProcess = require('child_process');
// ─── Configuration ───────────────────────────────────────────
var PORT = process.env.PORT || 3000;
var SHUTDOWN_TIMEOUT = parseInt(process.env.SHUTDOWN_TIMEOUT, 10) || 30000;
// ─── State ───────────────────────────────────────────────────
var isShuttingDown = false;
var connections = new Set();
var backgroundWorker = null;
// ─── Database Setup ──────────────────────────────────────────
var pgPool = new pg.Pool({
connectionString: process.env.DATABASE_URL,
max: 20,
idleTimeoutMillis: 30000
});
pgPool.on('error', function (err) {
console.error('Unexpected database pool error:', err.message);
});
// ─── Redis Setup ─────────────────────────────────────────────
var redisClient = redis.createClient({ url: process.env.REDIS_URL });
redisClient.on('error', function (err) {
console.error('Redis error:', err.message);
});
redisClient.connect().catch(function (err) {
console.error('Redis connection failed:', err.message);
});
// ─── Express App ─────────────────────────────────────────────
var app = express();
app.use(express.json());
// Health and readiness checks
app.get('/health', function (req, res) {
if (isShuttingDown) {
return res.status(503).json({ status: 'shutting_down' });
}
res.json({ status: 'healthy', uptime: process.uptime() });
});
app.get('/ready', function (req, res) {
if (isShuttingDown) {
return res.status(503).json({ ready: false });
}
pgPool.query('SELECT 1', function (err) {
if (err) {
return res.status(503).json({ ready: false, reason: 'database' });
}
res.json({ ready: true });
});
});
// Middleware: reject requests during shutdown
app.use(function (req, res, next) {
if (isShuttingDown) {
res.set('Connection', 'close');
return res.status(503).json({ error: 'Service is shutting down' });
}
next();
});
// Sample routes
app.get('/users/:id', function (req, res) {
var userId = req.params.id;
// Try cache first
redisClient.get('user:' + userId)
.then(function (cached) {
if (cached) {
return res.json(JSON.parse(cached));
}
pgPool.query('SELECT * FROM users WHERE id = $1', [userId], function (err, result) {
if (err) {
console.error('Database query error:', err.message);
return res.status(500).json({ error: 'Internal server error' });
}
if (result.rows.length === 0) {
return res.status(404).json({ error: 'User not found' });
}
var user = result.rows[0];
// Cache for 5 minutes
redisClient.setEx('user:' + userId, 300, JSON.stringify(user))
.catch(function (cacheErr) {
console.warn('Cache write failed:', cacheErr.message);
});
res.json(user);
});
})
.catch(function (err) {
console.warn('Cache read failed:', err.message);
// Fall through to database
pgPool.query('SELECT * FROM users WHERE id = $1', [userId], function (err, result) {
if (err) {
return res.status(500).json({ error: 'Internal server error' });
}
res.json(result.rows[0] || {});
});
});
});
// ─── HTTP Server ─────────────────────────────────────────────
var server = http.createServer(app);
server.on('connection', function (socket) {
connections.add(socket);
socket.on('close', function () {
connections.delete(socket);
});
});
// ─── Background Worker ──────────────────────────────────────
function startBackgroundWorker() {
backgroundWorker = childProcess.fork('./worker.js');
backgroundWorker.on('exit', function (code) {
console.log('Background worker exited with code ' + code);
backgroundWorker = null;
// Restart worker if we are not shutting down
if (!isShuttingDown) {
console.log('Restarting background worker...');
setTimeout(startBackgroundWorker, 1000);
}
});
backgroundWorker.on('error', function (err) {
console.error('Background worker error:', err.message);
});
console.log('Background worker started (PID: ' + backgroundWorker.pid + ')');
}
// ─── Shutdown Logic ──────────────────────────────────────────
function shutdown(signal) {
if (isShuttingDown) {
console.log('Shutdown already in progress — ignoring ' + signal);
return;
}
isShuttingDown = true;
console.log('[shutdown] Signal: ' + signal + ' — beginning graceful shutdown');
// Hard deadline: force exit if cleanup takes too long
var forceExitTimer = setTimeout(function () {
console.error('[shutdown] Timed out after ' + SHUTDOWN_TIMEOUT + 'ms — forcing exit');
process.exit(1);
}, SHUTDOWN_TIMEOUT);
forceExitTimer.unref();
// Step 1: Stop accepting new connections and drain existing ones
console.log('[shutdown] Step 1: Closing HTTP server');
server.close(function () {
console.log('[shutdown] HTTP server closed');
});
// Destroy idle keep-alive connections
connections.forEach(function (socket) {
if (!socket._httpMessage) {
socket.destroy();
} else {
// For active connections, close after response finishes
socket._httpMessage.on('finish', function () {
socket.destroy();
});
}
});
// Step 2: Stop background worker
var workerPromise;
if (backgroundWorker) {
console.log('[shutdown] Step 2: Stopping background worker');
workerPromise = new Promise(function (resolve) {
backgroundWorker.on('exit', function () {
console.log('[shutdown] Background worker stopped');
resolve();
});
backgroundWorker.send({ type: 'shutdown' });
      // Force kill after 10 seconds; unref so the timer cannot hold the process open
      setTimeout(function () {
        if (backgroundWorker && !backgroundWorker.killed) {
          console.warn('[shutdown] Force killing background worker');
          backgroundWorker.kill('SIGKILL');
        }
      }, 10000).unref();
});
} else {
workerPromise = Promise.resolve();
}
// Step 3: Wait for worker, then close external connections
workerPromise
.then(function () {
console.log('[shutdown] Step 3: Closing Redis connection');
return redisClient.quit().catch(function (err) {
console.warn('[shutdown] Redis close error:', err.message);
});
})
.then(function () {
console.log('[shutdown] Step 4: Closing database pool');
return pgPool.end().catch(function (err) {
console.warn('[shutdown] Database pool close error:', err.message);
});
})
.then(function () {
console.log('[shutdown] Graceful shutdown completed');
process.exit(0);
})
.catch(function (err) {
console.error('[shutdown] Error during shutdown:', err.message);
process.exit(1);
});
}
// ─── Signal Handlers ─────────────────────────────────────────
process.on('SIGTERM', function () { shutdown('SIGTERM'); });
process.on('SIGINT', function () { shutdown('SIGINT'); });
// PM2 support
process.on('message', function (msg) {
if (msg === 'shutdown') {
shutdown('PM2');
}
});
// Catch unhandled errors — shut down on fatal errors
process.on('uncaughtException', function (err) {
console.error('Uncaught exception:', err);
shutdown('uncaughtException');
});
process.on('unhandledRejection', function (reason) {
console.error('Unhandled rejection:', reason);
// Log but do not shut down for unhandled rejections
// unless they indicate a critical failure
});
// ─── Start Server ────────────────────────────────────────────
server.listen(PORT, function () {
console.log('Server listening on port ' + PORT);
startBackgroundWorker();
// Tell PM2 we are ready
if (process.send) {
process.send('ready');
}
});
This example demonstrates the complete pattern: connection tracking, health check integration, reverse-order cleanup, worker process management, double-signal prevention, forced timeout, and PM2 compatibility.
Common Issues and Troubleshooting
1. Shutdown hangs indefinitely
Cause: Keep-alive connections are held open by clients (browsers, load balancers, or monitoring tools). server.close() waits for all connections to close.
Fix: Track connections and destroy idle sockets when shutdown begins, as shown in the connection draining section. Always set a forced exit timeout.
2. Exit code is non-zero after successful cleanup
Cause: An unhandled promise rejection or a lingering event listener throws after process.exit(0) is called, but before the process fully terminates. Or server.close() fires its callback with an error.
Fix: Always pass an error handler to server.close(). Ensure your unhandledRejection handler does not call process.exit(1) during normal shutdown.
3. SIGTERM not received in Docker container
Cause: The Dockerfile uses shell form for CMD (e.g., CMD node app.js), which wraps the process in /bin/sh. The shell receives SIGTERM but does not forward it to the Node.js process.
Fix: Use exec form: CMD ["node", "app.js"]. Alternatively, use tini or dumb-init as a PID 1 init process that properly forwards signals.
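A minimal sketch of the tini approach, assuming a Debian-based Node image (the Debian package installs the binary at /usr/bin/tini):
# Dockerfile — tini runs as PID 1 and forwards signals to the node process
FROM node:20-slim
RUN apt-get update && apt-get install -y tini && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "app.js"]
Alternatively, docker run --init injects a bundled init process without changing the image.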
4. Database connection pool errors after shutdown
Cause: A database query initiated just before shutdown fires its callback after the pool has been closed.
Fix: Check the isShuttingDown flag in middleware to reject new requests early. Ensure the HTTP server is fully drained before closing the database pool. Add error handling for pool queries that accounts for the pool being shut down.
5. PM2 kills the process before shutdown completes
Cause: The kill_timeout in PM2 ecosystem config is shorter than your shutdown takes.
Fix: Set kill_timeout to a value larger than your maximum expected shutdown time. Monitor your shutdown duration in logs and adjust accordingly.
Best Practices
Always set a forced exit timeout. No matter how well you design your shutdown sequence, something can hang. A hard timeout with process.exit(1) is your safety net. Set it to a few seconds less than your container orchestrator's terminationGracePeriodSeconds.

Use a shutdown flag to reject new work immediately. The moment SIGTERM arrives, stop accepting new requests, stop consuming queue messages, and fail health checks. This minimizes the amount of in-flight work you need to drain.
Shut down in reverse dependency order. HTTP server first (stop new requests), then background workers, then caches, then databases. If component A depends on component B, shut down A before B.
Log every step of the shutdown process. When a shutdown hangs or exits uncleanly, you need to know exactly which step failed. Include timestamps and the initiating signal in your log messages.
Prevent double shutdown. Guard your shutdown function with a flag. A second SIGTERM (or SIGINT after SIGTERM) should not restart the shutdown process. Log that it was ignored.
Call timer.unref() on forced exit timers. Without unref(), the timer keeps the event loop alive, which prevents the process from exiting naturally even after all cleanup completes.

Test shutdown under load. Do not assume your shutdown works because it works when idle. Test it with active requests, open database transactions, and queued messages. Automate these tests in your CI pipeline.
Keep shutdown fast. Aim for under 10 seconds in normal conditions. Long shutdowns slow down deployments and increase the window where fewer instances are serving traffic. If your shutdown consistently takes more than a few seconds, investigate what is blocking it.
References
- Node.js Process Signal Events — Official documentation on signal handling.
- Node.js HTTP Server.close() — Server shutdown behavior.
- Kubernetes Pod Termination — Termination lifecycle and grace periods.
- PM2 Graceful Shutdown — PM2-specific shutdown patterns.
- Docker STOPSIGNAL — Signal configuration in Docker containers.
- Stoppable — A library that adds graceful shutdown to Node.js HTTP servers, handling connection tracking automatically.