
Process Management with PM2

A practical guide to managing Node.js processes in production with PM2, covering cluster mode, log management, zero-downtime deployments, and monitoring.

Running node app.js in a terminal is fine for development. It is not fine for production. The moment your SSH session drops, your terminal closes, or your process throws an unhandled exception, your application dies and stays dead. PM2 solves this problem and a dozen others. It is the de facto process manager for Node.js applications in production, handling process supervision, clustering, log management, zero-downtime deployments, and monitoring in a single tool.

Prerequisites

  • Node.js v14 or later installed
  • Basic understanding of Node.js and Express
  • A Linux or macOS server (PM2 works on Windows but production deployments are typically Linux)
  • Familiarity with the command line

Installing and Configuring PM2

Install PM2 globally on your server:

npm install -g pm2

Verify the installation:

pm2 --version
# 5.3.1

Start an application:

pm2 start app.js --name "my-api"

That single command daemonizes your process, restarts it if it crashes, and starts collecting logs. You can check what is running:

pm2 list

Output:

┌─────┬──────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id  │ name     │ namespace   │ version │ mode    │ pid      │ uptime │ ↺    │ status    │ cpu      │ mem      │ user     │ watching │
├─────┼──────────┼─────────────┼─────────┼─────────┼──────────┼────────┼──────┼───────────┼──────────┼──────────┼──────────┼──────────┤
│ 0   │ my-api   │ default     │ 1.0.0   │ fork    │ 12345    │ 5s     │ 0    │ online    │ 0.1%     │ 45.2mb   │ deploy   │ disabled │
└─────┴──────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘

Basic commands you will use constantly:

pm2 stop my-api        # Stop the process
pm2 restart my-api     # Restart the process
pm2 delete my-api      # Remove from PM2's process list
pm2 logs my-api        # Stream logs in real-time
pm2 show my-api        # Detailed process information

The ecosystem.config.js File

Running PM2 with command-line flags gets tedious fast. The ecosystem file is PM2's configuration file, and every production deployment should have one. It defines your applications, their settings, and environment-specific variables in a single, version-controlled file.

// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: "my-api",
      script: "./app.js",
      instances: "max",
      exec_mode: "cluster",
      watch: false,
      max_memory_restart: "500M",
      env: {
        NODE_ENV: "development",
        PORT: 3000
      },
      env_production: {
        NODE_ENV: "production",
        PORT: 8080
      }
    }
  ]
};

Start with a specific environment:

pm2 start ecosystem.config.js --env production

The env block defines defaults. The env_production block merges into env when you pass --env production. You can define as many environments as you need: env_staging, env_test, whatever your pipeline requires.
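The merge behaves like a shallow object merge: keys in env_production override keys in env, and anything defined only in env survives. A minimal sketch (using the config values shown above; this illustrates the behavior, not PM2's internal code):

```javascript
// Sketch of the effective environment for `--env production`:
// env_production is merged over the base env block (shallow merge).
var env = { NODE_ENV: "development", PORT: 3000 };
var envProduction = { NODE_ENV: "production", PORT: 8080 };

var effective = Object.assign({}, env, envProduction);

console.log(effective.NODE_ENV); // "production"
console.log(effective.PORT);     // 8080
```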

Cluster Mode vs Fork Mode

Fork mode is the default. PM2 spawns your application as a child process. One process, one core. This is appropriate for worker scripts, cron jobs, and applications that are not CPU-bound.

Cluster mode uses Node's built-in cluster module to spawn multiple instances of your application across CPU cores. Every instance shares the same port. PM2 handles the load balancing.

// Fork mode (default)
{
  name: "worker",
  script: "./worker.js",
  exec_mode: "fork",
  instances: 1
}

// Cluster mode
{
  name: "api",
  script: "./app.js",
  exec_mode: "cluster",
  instances: "max"  // One instance per CPU core
}

Setting instances to "max" uses all available cores. You can also set a specific number or use negative values. Setting instances to -1 uses all cores minus one, which is useful if you want to leave a core free for the operating system or other processes.

Check how many instances are running:

pm2 list
┌─────┬──────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┐
│ id  │ name     │ namespace   │ version │ mode    │ pid      │ uptime │ ↺    │ status    │ cpu      │ mem      │
├─────┼──────────┼─────────────┼─────────┼─────────┼──────────┼────────┼──────┼───────────┼──────────┼──────────┤
│ 0   │ api      │ default     │ 1.0.0   │ cluster │ 14201    │ 2m     │ 0    │ online    │ 0.1%     │ 52.1mb   │
│ 1   │ api      │ default     │ 1.0.0   │ cluster │ 14202    │ 2m     │ 0    │ online    │ 0.1%     │ 51.8mb   │
│ 2   │ api      │ default     │ 1.0.0   │ cluster │ 14203    │ 2m     │ 0    │ online    │ 0.2%     │ 53.4mb   │
│ 3   │ api      │ default     │ 1.0.0   │ cluster │ 14204    │ 2m     │ 0    │ online    │ 0.1%     │ 50.9mb   │
└─────┴──────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┘

There is one critical caveat: cluster mode requires your application to be stateless. If you store sessions in memory, cache data in a module-level variable, or maintain any in-process state, cluster mode will break things, because each instance has its own memory space. Use Redis or another external store for shared state.
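To see why, remember that each cluster instance is a separate process with its own copy of every variable. A contrived single-process simulation, with two factory calls standing in for two worker processes:

```javascript
// Each cluster instance has its own memory, so a module-level counter
// diverges across instances. Two factory calls simulate two workers:
function makeInstance() {
  var hits = 0; // "module-level" state, private to this instance
  return {
    handleRequest: function () { return ++hits; }
  };
}

var workerA = makeInstance();
var workerB = makeInstance();

workerA.handleRequest();
workerA.handleRequest();
workerB.handleRequest();

console.log(workerA.handleRequest()); // 3 -- workerA has seen 3 requests
console.log(workerB.handleRequest()); // 2 -- workerB disagrees
```

With a shared store such as Redis, both workers would read and increment the same key instead, and the counts would agree regardless of which instance served the request.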

Log Management and Rotation

PM2 captures stdout and stderr from each process and writes them to log files. By default, logs go to ~/.pm2/logs/.

pm2 logs                    # Stream all logs
pm2 logs my-api             # Stream logs for one app
pm2 logs --lines 200        # Show last 200 lines
pm2 flush                   # Clear all log files

In production, log files grow indefinitely unless you handle rotation. Install the log rotation module:

pm2 install pm2-logrotate

Configure it:

pm2 set pm2-logrotate:max_size 50M       # Rotate when file reaches 50MB
pm2 set pm2-logrotate:retain 30          # Keep 30 rotated files
pm2 set pm2-logrotate:compress true      # Gzip old logs
pm2 set pm2-logrotate:dateFormat YYYY-MM-DD_HH-mm-ss
pm2 set pm2-logrotate:rotateInterval '0 0 * * *'  # Rotate daily at midnight

You can also configure log paths per application in the ecosystem file:

{
  name: "my-api",
  script: "./app.js",
  error_file: "/var/log/my-api/error.log",
  out_file: "/var/log/my-api/output.log",
  log_date_format: "YYYY-MM-DD HH:mm:ss Z",
  merge_logs: true  // Merge cluster mode logs into single files
}

The merge_logs option is important for cluster mode. Without it, each instance writes to a separate log file, which makes log analysis painful.

Monitoring with pm2 monit

For real-time monitoring in the terminal:

pm2 monit

This opens a dashboard showing CPU usage, memory consumption, loop delay, and logs for each process. It is useful for quick debugging sessions on the server but not a replacement for proper monitoring infrastructure.

For a quick snapshot without the interactive dashboard:

pm2 status        # Same as pm2 list
pm2 show my-api   # Detailed info including metadata, restart count, uptime

The pm2 show output includes restart count, uptime, unstable restarts, created at timestamp, script path, and environment variables. This is often the first thing you check when diagnosing production issues.

Zero-Downtime Deployments with pm2 reload

The difference between restart and reload is critical for production:

  • pm2 restart my-api — kills all processes, then starts new ones. There is downtime.
  • pm2 reload my-api — restarts processes one at a time, waiting for each new instance to be ready before killing the old one. Zero downtime.

Reload only works in cluster mode. If you are running in fork mode, reload behaves like restart.

pm2 reload my-api
# Or reload everything in the ecosystem file
pm2 reload ecosystem.config.js --env production

For reload to work correctly, your application needs to signal when it is ready. Without this, PM2 waits for a default timeout and then assumes the process is ready. You can make this explicit:

// app.js
var express = require("express");
var app = express();

app.get("/", function(req, res) {
  res.json({ status: "ok" });
});

var server = app.listen(process.env.PORT || 3000, function() {
  console.log("Server listening on port " + server.address().port);

  // Tell PM2 this instance is ready to receive traffic
  if (process.send) {
    process.send("ready");
  }
});

Then in your ecosystem config:

{
  name: "my-api",
  script: "./app.js",
  exec_mode: "cluster",
  instances: 4,
  wait_ready: true,        // Wait for process.send('ready')
  listen_timeout: 10000    // Timeout after 10 seconds if ready signal not received
}

Startup Scripts for System Reboot

Your server will reboot. Power failures, kernel updates, hardware maintenance — it happens. PM2 can generate a startup script that automatically restarts your processes when the system boots:

pm2 startup

PM2 detects your init system (systemd, upstart, launchd) and outputs a command you need to run with sudo:

[PM2] Init System found: systemd
[PM2] To setup the Startup Script, copy/paste the following command:
sudo env PATH=$PATH:/usr/bin /usr/lib/node_modules/pm2/bin/pm2 startup systemd -u deploy --hp /home/deploy

Run that command. Then save the current process list:

pm2 save

Now if the server reboots, PM2 will automatically restore every process that was running when you last ran pm2 save. Get in the habit of running pm2 save after every deployment.

To remove the startup script:

pm2 unstartup systemd

Environment Variable Management

The ecosystem file supports per-environment configuration. This is one of its most useful features:

module.exports = {
  apps: [
    {
      name: "my-api",
      script: "./app.js",
      env: {
        NODE_ENV: "development",
        PORT: 3000,
        DB_HOST: "localhost",
        DB_NAME: "myapp_dev",
        LOG_LEVEL: "debug"
      },
      env_staging: {
        NODE_ENV: "staging",
        PORT: 8080,
        DB_HOST: "staging-db.internal",
        DB_NAME: "myapp_staging",
        LOG_LEVEL: "info"
      },
      env_production: {
        NODE_ENV: "production",
        PORT: 8080,
        DB_HOST: "prod-db.internal",
        DB_NAME: "myapp_prod",
        LOG_LEVEL: "warn"
      }
    }
  ]
};

Start with the desired environment:

pm2 start ecosystem.config.js --env staging
pm2 start ecosystem.config.js --env production

An important gotcha: environment variables are cached when the process first starts. If you change env_production in your ecosystem file and run pm2 restart, the old environment variables persist. You need to delete and re-start the process, or use --update-env:

pm2 restart my-api --update-env

For secrets like API keys and database passwords, do not put them in the ecosystem file. That file should be committed to version control. Use a .env file loaded at runtime, or inject secrets through your deployment pipeline, or use your platform's secrets management.
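The usual runtime loader for a .env file is the dotenv package. The idea is simple enough to sketch by hand; this minimal parser is illustrative only, not dotenv's actual implementation:

```javascript
// Minimal .env parser (illustrative only -- in practice use the dotenv
// package). Skips comments and blank lines, splits on the first "=".
function parseEnv(text) {
  var result = {};
  text.split(/\r?\n/).forEach(function (line) {
    line = line.trim();
    if (!line || line.charAt(0) === "#") return;
    var idx = line.indexOf("=");
    if (idx === -1) return;
    result[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  });
  return result;
}

var parsed = parseEnv("# secrets\nDB_PASSWORD=s3cret\nAPI_KEY=abc123\n");
console.log(parsed.DB_PASSWORD); // "s3cret"
```

Whichever loader you use, the point stands: the .env file itself stays out of version control, while the ecosystem file stays in.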

Watch Mode for Development

During development, you want processes to restart when files change:

{
  name: "my-api",
  script: "./app.js",
  watch: true,
  watch_delay: 1000,
  ignore_watch: [
    "node_modules",
    "logs",
    "tmp",
    ".git",
    "*.log"
  ]
}

The watch_delay prevents rapid-fire restarts when multiple files change at once (common with editors that do atomic saves). The ignore_watch array is essential — without it, PM2 will restart on every node_modules change, log write, or temp file creation.

Do not use watch mode in production. It adds overhead and can cause unexpected restarts. Use pm2 reload for production deployments.

PM2 Plus and Keymetrics for Remote Monitoring

PM2 Plus (formerly Keymetrics) is a paid monitoring service that integrates directly with PM2. It provides a web dashboard for remote monitoring, alerting, and diagnostics.

Link your PM2 instance:

pm2 plus

This walks you through authentication and links your server. Once connected, you get:

  • Real-time CPU, memory, and event loop metrics per process
  • Exception tracking with stack traces
  • HTTP transaction tracing and slow route detection
  • Custom metric publishing from your application
  • Alerting via email, Slack, or webhooks
  • Remote actions (restart, reload) from the dashboard

You can publish custom metrics from your application:

var pmx = require("@pm2/io");

var requestCounter = pmx.counter({
  name: "Active Requests"
});

var responseTime = pmx.histogram({
  name: "Response Time",
  measurement: "mean"
});

app.use(function(req, res, next) {
  requestCounter.inc();
  var start = Date.now();

  res.on("finish", function() {
    requestCounter.dec();
    responseTime.update(Date.now() - start);
  });

  next();
});

PM2 Plus is not required. Many teams use PM2 with Prometheus, Grafana, Datadog, or similar tools instead. But if you want a quick, PM2-native monitoring solution, it works well.

Managing Multiple Applications

A real production environment rarely has just one process. You might have an API server, a background worker, and a scheduled task runner. The ecosystem file handles this cleanly:

module.exports = {
  apps: [
    {
      name: "api",
      script: "./api/server.js",
      instances: "max",
      exec_mode: "cluster"
    },
    {
      name: "worker",
      script: "./workers/queue-processor.js",
      instances: 2,
      exec_mode: "fork"
    },
    {
      name: "scheduler",
      script: "./jobs/scheduler.js",
      instances: 1,
      exec_mode: "fork",
      cron_restart: "0 */6 * * *"
    }
  ]
};

Manage them individually or together:

pm2 start ecosystem.config.js          # Start all
pm2 stop all                            # Stop all
pm2 restart api                         # Restart just the API
pm2 scale api 8                         # Scale API to 8 instances
pm2 delete worker                       # Remove worker from process list

The pm2 scale command is particularly useful. You can scale cluster mode applications up or down without restarting:

pm2 scale api +2    # Add 2 more instances
pm2 scale api 4     # Set to exactly 4 instances

Graceful Start/Stop Lifecycle Hooks

When PM2 stops or restarts a process, it sends a SIGINT signal. Your application should listen for this and shut down cleanly — close database connections, finish in-flight requests, flush buffers:

var express = require("express");
var mongoose = require("mongoose");
var app = express();

var server = app.listen(process.env.PORT || 3000, function() {
  console.log("Server started");
  if (process.send) {
    process.send("ready");
  }
});

process.on("SIGINT", function() {
  console.log("SIGINT received. Shutting down gracefully...");

  server.close(function() {
    console.log("HTTP server closed");

    mongoose.connection.close(false, function() {
      console.log("MongoDB connection closed");
      process.exit(0);
    });
  });

  // Force close after 10 seconds
  setTimeout(function() {
    console.error("Forced shutdown after timeout");
    process.exit(1);
  }, 10000);
});

Configure the shutdown timeout in the ecosystem file:

{
  name: "my-api",
  script: "./app.js",
  kill_timeout: 10000,        // Wait 10 seconds before SIGKILL
  shutdown_with_message: true  // Send 'shutdown' message instead of signal
}

If you set shutdown_with_message: true, PM2 sends a message instead of a signal:

process.on("message", function(msg) {
  if (msg === "shutdown") {
    // Clean shutdown logic here
    gracefulShutdown();
  }
});

Memory Limit Auto-Restart

Memory leaks happen. Even small ones compound over days and weeks. PM2 can automatically restart a process when it exceeds a memory threshold:

{
  name: "my-api",
  script: "./app.js",
  max_memory_restart: "500M"
}

When the process exceeds 500MB of memory, PM2 restarts it. In cluster mode, this happens one instance at a time, so you maintain availability. This is not a fix for memory leaks — you still need to find and fix them — but it is a safety net that keeps your application running while you investigate.

You can also set this from the command line:

pm2 start app.js --max-memory-restart 500M

Accepted units are K (kilobytes), M (megabytes), and G (gigabytes).
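A sketch of how a threshold like "500M" maps to bytes, assuming binary units (illustrative helper, not PM2's internal parser):

```javascript
// Illustrative helper (not PM2's code): convert a max_memory_restart
// value such as "500M" into a byte count, assuming binary units.
function memoryLimitToBytes(value) {
  var match = /^(\d+)([KMG])$/.exec(value);
  if (!match) throw new Error("Expected a number followed by K, M, or G");
  var units = { K: 1024, M: 1024 * 1024, G: 1024 * 1024 * 1024 };
  return parseInt(match[1], 10) * units[match[2]];
}

console.log(memoryLimitToBytes("500M")); // 524288000
```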

Complete Working Example

Here is a full ecosystem configuration managing multiple Node.js services with environment-specific settings, log rotation, and a deployment workflow.

Project Structure

my-project/
  api/
    server.js
  workers/
    queue-processor.js
  jobs/
    cleanup.js
  ecosystem.config.js
  package.json

ecosystem.config.js

module.exports = {
  apps: [
    // API Server - clustered for load balancing
    {
      name: "api",
      script: "./api/server.js",
      instances: "max",
      exec_mode: "cluster",
      wait_ready: true,
      listen_timeout: 10000,
      kill_timeout: 5000,
      max_memory_restart: "512M",
      error_file: "/var/log/my-project/api-error.log",
      out_file: "/var/log/my-project/api-output.log",
      log_date_format: "YYYY-MM-DD HH:mm:ss Z",
      merge_logs: true,
      env: {
        NODE_ENV: "development",
        PORT: 3000,
        DB_HOST: "localhost",
        DB_PORT: 5432,
        DB_NAME: "myapp_dev",
        REDIS_URL: "redis://localhost:6379",
        LOG_LEVEL: "debug"
      },
      env_staging: {
        NODE_ENV: "staging",
        PORT: 8080,
        DB_HOST: "staging-db.internal",
        DB_PORT: 5432,
        DB_NAME: "myapp_staging",
        REDIS_URL: "redis://staging-redis.internal:6379",
        LOG_LEVEL: "info"
      },
      env_production: {
        NODE_ENV: "production",
        PORT: 8080,
        DB_HOST: "prod-db.internal",
        DB_PORT: 5432,
        DB_NAME: "myapp_prod",
        REDIS_URL: "redis://prod-redis.internal:6379",
        LOG_LEVEL: "warn"
      }
    },

    // Background Worker - processes jobs from Redis queue
    {
      name: "worker",
      script: "./workers/queue-processor.js",
      instances: 2,
      exec_mode: "fork",
      kill_timeout: 30000,
      max_memory_restart: "256M",
      error_file: "/var/log/my-project/worker-error.log",
      out_file: "/var/log/my-project/worker-output.log",
      log_date_format: "YYYY-MM-DD HH:mm:ss Z",
      env: {
        NODE_ENV: "development",
        REDIS_URL: "redis://localhost:6379",
        CONCURRENCY: 5,
        LOG_LEVEL: "debug"
      },
      env_production: {
        NODE_ENV: "production",
        REDIS_URL: "redis://prod-redis.internal:6379",
        CONCURRENCY: 20,
        LOG_LEVEL: "warn"
      }
    },

    // Scheduled Cleanup Job - runs every 6 hours
    {
      name: "cleanup",
      script: "./jobs/cleanup.js",
      instances: 1,
      exec_mode: "fork",
      cron_restart: "0 */6 * * *",
      autorestart: false,
      max_memory_restart: "128M",
      error_file: "/var/log/my-project/cleanup-error.log",
      out_file: "/var/log/my-project/cleanup-output.log",
      log_date_format: "YYYY-MM-DD HH:mm:ss Z",
      env: {
        NODE_ENV: "development",
        DB_HOST: "localhost",
        DB_NAME: "myapp_dev",
        RETENTION_DAYS: 90
      },
      env_production: {
        NODE_ENV: "production",
        DB_HOST: "prod-db.internal",
        DB_NAME: "myapp_prod",
        RETENTION_DAYS: 365
      }
    }
  ],

  // Deployment configuration
  deploy: {
    production: {
      user: "deploy",
      host: ["prod-1.example.com", "prod-2.example.com"],
      ref: "origin/main",
      repo: "[email protected]:myorg/my-project.git",
      path: "/var/www/my-project",
      "pre-deploy-local": "echo 'Deploying to production...'",
      "post-deploy": "npm ci --production && pm2 reload ecosystem.config.js --env production && pm2 save",
      "pre-setup": "mkdir -p /var/log/my-project"
    },
    staging: {
      user: "deploy",
      host: "staging.example.com",
      ref: "origin/develop",
      repo: "[email protected]:myorg/my-project.git",
      path: "/var/www/my-project",
      "post-deploy": "npm ci && pm2 reload ecosystem.config.js --env staging && pm2 save"
    }
  }
};

API Server (api/server.js)

var express = require("express");
var app = express();

app.use(express.json());

app.get("/health", function(req, res) {
  res.json({ status: "ok", uptime: process.uptime(), pid: process.pid });
});

app.get("/api/users", function(req, res) {
  // Your route logic here
  res.json({ users: [] });
});

var port = process.env.PORT || 3000;
var server = app.listen(port, function() {
  console.log("[API] Process " + process.pid + " listening on port " + port);

  // Signal PM2 that this instance is ready for traffic
  if (process.send) {
    process.send("ready");
  }
});

// Graceful shutdown
process.on("SIGINT", function() {
  console.log("[API] Process " + process.pid + " received SIGINT. Shutting down...");
  server.close(function() {
    console.log("[API] Process " + process.pid + " closed all connections");
    process.exit(0);
  });

  setTimeout(function() {
    console.error("[API] Process " + process.pid + " forced shutdown");
    process.exit(1);
  }, 5000);
});

Background Worker (workers/queue-processor.js)

var Queue = require("bull");

var concurrency = parseInt(process.env.CONCURRENCY, 10) || 5;
var emailQueue = new Queue("email", process.env.REDIS_URL);

emailQueue.process(concurrency, function(job) {
  console.log("[Worker] Processing job " + job.id + " on PID " + process.pid);
  return sendEmail(job.data);
});

function sendEmail(data) {
  // Email sending logic
  return new Promise(function(resolve) {
    setTimeout(function() {
      console.log("[Worker] Email sent to " + data.to);
      resolve({ sent: true });
    }, 1000);
  });
}

emailQueue.on("completed", function(job) {
  console.log("[Worker] Job " + job.id + " completed");
});

emailQueue.on("failed", function(job, err) {
  console.error("[Worker] Job " + job.id + " failed: " + err.message);
});

// Graceful shutdown - finish current jobs before exiting
process.on("SIGINT", function() {
  console.log("[Worker] PID " + process.pid + " shutting down. Waiting for active jobs...");
  emailQueue.close().then(function() {
    console.log("[Worker] Queue closed. Exiting.");
    process.exit(0);
  });
});

console.log("[Worker] PID " + process.pid + " started with concurrency " + concurrency);

Deployment Workflow

# First-time setup on the server
pm2 deploy ecosystem.config.js production setup

# Deploy to production
pm2 deploy ecosystem.config.js production

# Deploy to staging
pm2 deploy ecosystem.config.js staging

# Rollback one version
pm2 deploy ecosystem.config.js production revert 1

# Execute remote command
pm2 deploy ecosystem.config.js production exec "pm2 list"

After deploying, verify everything is running:

ssh [email protected] "pm2 list && pm2 logs --lines 20"

Common Issues and Troubleshooting

1. Process Keeps Restarting in a Loop

PM2        | App [my-api:0] starting in -fork mode-
PM2        | App [my-api:0] online
PM2        | App [my-api:0] exited with code [1] via signal [SIGINT]
PM2        | App [my-api:0] starting in -fork mode-

This happens when your application crashes on startup. PM2 restarts it, it crashes again, PM2 restarts it, and so on. Check the error log:

pm2 logs my-api --err --lines 50

Common causes: missing environment variables, database connection failures, port already in use. Fix the underlying error. If you need to stop the restart loop while debugging, set autorestart: false temporarily or use pm2 stop my-api.

PM2 has built-in protection against restart loops. After 15 unstable restarts (restarts within 1 second of starting), it will stop restarting the process. You can configure this:

{
  name: "my-api",
  script: "./app.js",
  min_uptime: "5s",         // Process must run at least 5s to be considered stable
  max_restarts: 10,          // Stop trying after 10 unstable restarts
  restart_delay: 4000        // Wait 4 seconds between restart attempts
}
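The stability check itself is simple to reason about. A sketch of how a supervisor might classify restarts given min_uptime and max_restarts (illustrative, not PM2's implementation):

```javascript
// Illustrative: a restart is "unstable" when the process ran for less
// than min_uptime; the supervisor gives up after max_restarts
// consecutive unstable restarts. A stable run resets the counter.
function shouldKeepRestarting(uptimesMs, minUptimeMs, maxRestarts) {
  var consecutiveUnstable = 0;
  for (var i = 0; i < uptimesMs.length; i++) {
    if (uptimesMs[i] < minUptimeMs) {
      consecutiveUnstable++;
      if (consecutiveUnstable > maxRestarts) return false;
    } else {
      consecutiveUnstable = 0;
    }
  }
  return true;
}

// Three quick crashes with max_restarts = 2: give up.
console.log(shouldKeepRestarting([300, 400, 200], 5000, 2)); // false
// A stable run in between resets the counter.
console.log(shouldKeepRestarting([300, 400, 9000, 200], 5000, 2)); // true
```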

2. Environment Variables Not Updating After Deploy

pm2 restart my-api
pm2 show my-api
# Still shows old DB_HOST value

PM2 caches environment variables. Use the --update-env flag:

pm2 restart my-api --update-env

Or delete and re-start:

pm2 delete my-api
pm2 start ecosystem.config.js --env production
pm2 save

3. Port Already in Use After Restart

Error: listen EADDRINUSE: address already in use :::8080
    at Server.setupListenHandle [as _setupListenHandle] (net.js:1331:16)

This happens when PM2 starts a new process before the old one finishes releasing the port. It is most common in fork mode without proper graceful shutdown handling. Solutions:

  • Switch to cluster mode (PM2 manages port sharing)
  • Implement the SIGINT handler shown above so the old process closes the port before exiting
  • Increase kill_timeout to give the old process more time to clean up

4. Cluster Mode Not Load Balancing Evenly

You check pm2 monit and see one instance handling 80% of requests while others sit idle. This is usually caused by long-running requests or WebSocket connections that pin to a single worker. PM2 uses Node's default round-robin load balancing.

Check if your load balancer or reverse proxy is the issue:

pm2 show api

Look at the "restart count" and "uptime" for each instance. If one instance keeps restarting, it will not receive traffic during startup, causing an imbalance. Also verify your Nginx upstream configuration uses proper load balancing:

upstream api {
    least_conn;
    server 127.0.0.1:8080;
}

5. Startup Script Not Working After Node Version Update

pm2 startup
# Generates script pointing to old Node path

After updating Node.js (via nvm, n, or package manager), the startup script points to the old binary path. Regenerate it:

pm2 unstartup systemd
pm2 startup systemd
# Run the generated sudo command
pm2 save

Best Practices

  • Always use an ecosystem.config.js file. Command-line flags are for quick tests. The ecosystem file is version-controlled, documented, and reproducible. Every production application should have one.

  • Use cluster mode for HTTP servers and fork mode for everything else. Background workers, cron jobs, and queue processors should run in fork mode. Only cluster stateless HTTP servers.

  • Implement graceful shutdown handlers in every application. Listen for SIGINT, close connections, finish in-flight work, then exit. Without this, restarts and deployments will drop requests and corrupt data.

  • Set memory limits on every process. Even if you do not have memory leaks today, set max_memory_restart as a safety net. A process consuming all available RAM will take down everything else on the server.

  • Run pm2 save after every change to the process list. Scaling, adding processes, changing configuration — always follow up with pm2 save so the startup script restores the correct state.

  • Keep secrets out of the ecosystem file. Use a .env file, your platform's secrets management, or environment variables injected by your CI/CD pipeline. The ecosystem file should be committed to version control, and secrets should not be.

  • Configure log rotation from day one. Do not wait until your disk fills up. Install pm2-logrotate immediately after setting up PM2. Unrotated logs are the number one cause of disk space issues on long-running servers.

  • Use pm2 reload instead of pm2 restart in production. The difference is zero downtime vs. guaranteed downtime. There is no reason to use restart for cluster mode applications in production.

  • Monitor restart counts. A process that restarts frequently has an underlying issue. Use pm2 show <name> to check restart counts and investigate any process with a high count relative to its uptime.

  • Pin your PM2 version. PM2 updates can change behavior. Install a specific version in production (npm install -g [email protected]) and test updates in staging before rolling them out.
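The restart-count check above can be scripted: pm2 jlist prints the process list as JSON, and each entry carries a restart counter. The sketch below runs against a hand-built sample; the field name pm2_env.restart_time follows pm2's JSON output but should be verified against your PM2 version:

```javascript
// Flag processes whose restart count looks suspicious. Input has the
// shape of `pm2 jlist` output; restart_time is pm2's restart counter.
function flagFrequentRestarters(processes, threshold) {
  return processes
    .filter(function (p) { return p.pm2_env.restart_time > threshold; })
    .map(function (p) { return p.name; });
}

// Sample shaped like `pm2 jlist` output (hand-built for illustration):
var sample = [
  { name: "api",    pm2_env: { restart_time: 2 } },
  { name: "worker", pm2_env: { restart_time: 47 } }
];

console.log(flagFrequentRestarters(sample, 10)); // ["worker"]
```

A script like this slots easily into a cron job or health check that alerts when any process crosses the threshold.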
