Container Resource Limits and Requests
Master container resource management for Node.js applications in Docker and Kubernetes, covering memory limits, CPU allocation, OOMKilled debugging, and right-sizing strategies.
Running Node.js in containers without resource limits is like driving without a speedometer — everything works until it does not. An unconstrained Node.js process can consume all available memory on a node, starving other containers and potentially crashing the host. Resource limits and requests are how you tell the container runtime exactly how much CPU and memory your application needs and how much it is allowed to use. Getting these numbers right is the difference between stable production workloads and 3 AM pages.
Prerequisites
- Docker Desktop v4.0+ or Docker Engine
- kubectl and a Kubernetes cluster
- Node.js 18+
- Understanding of V8 memory management
- Basic familiarity with Docker and Kubernetes resource concepts
Docker Memory Limits
Docker uses Linux cgroups to enforce memory limits on containers.
# Run with a 256MB memory limit
docker run --memory=256m --memory-swap=256m node:20-alpine node -e "
console.log('Memory limit:', process.constrainedMemory());
console.log('Heap stats:', JSON.stringify(require('v8').getHeapStatistics(), null, 2));
"
The flags:
- --memory=256m: Maximum memory the container can use (includes heap, stack, buffers)
- --memory-swap=256m: Total memory plus swap limit. Setting it equal to --memory disables swap
# Output:
# Memory limit: 268435456 (256MB in bytes)
# Heap stats: {
# "total_heap_size": 5242880,
# "heap_size_limit": 135266304, // ~129MB - V8 auto-adjusts
# ...
# }
Node.js 18+ reads the cgroup memory limit and adjusts V8's heap limit automatically. Before Node 18, V8 would use its default heap limit (about 1.5GB on 64-bit systems) regardless of the container limit, often leading to OOMKills.
Memory Limit Flags
# Soft limit (memory reservation) - used for scheduling, not enforcement
docker run --memory-reservation=128m myapp
# Hard limit - container is killed if exceeded
docker run --memory=256m myapp
# Disable swap
docker run --memory=256m --memory-swap=256m myapp
# Allow 128MB swap in addition to 256MB memory
docker run --memory=256m --memory-swap=384m myapp
In production, always set --memory-swap equal to --memory to disable swap. Swap inside containers causes unpredictable latency spikes.
Docker CPU Limits
CPU limits are more nuanced than memory because CPU is a compressible resource — the container is throttled, not killed.
# Limit to 0.5 CPU cores
docker run --cpus=0.5 myapp
# Limit to specific CPU cores
docker run --cpuset-cpus="0,1" myapp
# Relative CPU weight (default: 1024)
docker run --cpu-shares=512 myapp
The difference matters:
- --cpus=0.5: Hard limit. The container gets at most 50% of one core's time. Even if the host has idle CPUs, the container is throttled.
- --cpu-shares=512: Relative weight. With no contention, the container can use all available CPU. When other containers compete, shares determine proportional allocation.
For Node.js applications, --cpus is more predictable. A single-threaded Node.js app rarely benefits from more than 1 CPU core (the event loop runs on one core).
# Good for a typical Express.js API
docker run --cpus=1 --memory=256m myapp
# Good for a worker processing background jobs
docker run --cpus=0.5 --memory=512m myworker
Node.js Memory Management in Containers
V8's garbage collector needs headroom. If you set a 256MB container limit, V8 will not use all 256MB for heap — the process also needs memory for the stack, native code, buffers, and libuv.
Aligning --max-old-space-size with Container Limits
# Container has 512MB limit
docker run --memory=512m myapp node --max-old-space-size=384 app.js
The rule of thumb: set --max-old-space-size to 75% of the container memory limit. This leaves room for:
- V8 new space (semi-space) and code space
- Native (C++) memory allocations
- Buffer allocations (outside V8 heap)
- Stack space
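The 75% rule reduces to a one-line helper; a minimal sketch (the function name is illustrative):

```javascript
// Recommended --max-old-space-size (in MB) from a container limit, per the 75% rule
function recommendedOldSpaceMB(containerLimitMB) {
  if (!Number.isFinite(containerLimitMB) || containerLimitMB <= 0) {
    throw new RangeError('containerLimitMB must be a positive number');
  }
  return Math.floor(containerLimitMB * 0.75);
}

console.log(recommendedOldSpaceMB(512)); // 384, matching the example above
console.log(recommendedOldSpaceMB(256)); // 192
```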
// Check effective memory limits at runtime
var v8 = require('v8');
var os = require('os');
function getMemoryInfo() {
var heapStats = v8.getHeapStatistics();
var memUsage = process.memoryUsage();
return {
heapSizeLimit: Math.round(heapStats.heap_size_limit / 1024 / 1024) + 'MB',
heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024) + 'MB',
rss: Math.round(memUsage.rss / 1024 / 1024) + 'MB',
external: Math.round(memUsage.external / 1024 / 1024) + 'MB',
arrayBuffers: Math.round(memUsage.arrayBuffers / 1024 / 1024) + 'MB',
containerLimit: process.constrainedMemory()
? Math.round(process.constrainedMemory() / 1024 / 1024) + 'MB'
: 'unlimited'
};
}
console.log(getMemoryInfo());
// {
// heapSizeLimit: '256MB',
// heapUsed: '12MB',
// rss: '45MB',
// external: '1MB',
// arrayBuffers: '0MB',
// containerLimit: '512MB'
// }
Garbage Collection in Constrained Environments
Node.js runs GC more aggressively in memory-constrained containers. This can increase latency. Monitor GC with the --trace-gc flag:
docker run --memory=256m myapp node --trace-gc app.js
# Output:
# [12345:0x1234567] 100 ms: Scavenge 4.2 (5.0) -> 3.1 (6.0) MB, 1.2 / 0.0 ms
# [12345:0x1234567] 250 ms: Mark-sweep 8.1 (10.0) -> 5.2 (10.0) MB, 3.5 / 0.0 ms
For production, expose GC metrics via a health endpoint:
var gcStats = { count: 0, totalDuration: 0, lastDuration: 0 };
try {
var perf_hooks = require('perf_hooks');
var observer = new perf_hooks.PerformanceObserver(function(list) {
var entries = list.getEntries();
entries.forEach(function(entry) {
gcStats.count++;
gcStats.totalDuration += entry.duration;
gcStats.lastDuration = entry.duration;
});
});
observer.observe({ entryTypes: ['gc'] });
} catch (e) {
// Performance observer not available
}
app.get('/metrics/gc', function(req, res) {
res.json({
gcCount: gcStats.count,
totalGcDuration: Math.round(gcStats.totalDuration) + 'ms',
lastGcDuration: Math.round(gcStats.lastDuration) + 'ms',
avgGcDuration: gcStats.count > 0
? Math.round(gcStats.totalDuration / gcStats.count) + 'ms'
: '0ms'
});
});
Kubernetes Resource Requests and Limits
Kubernetes separates resource configuration into requests and limits.
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
Requests are what the scheduler uses to place pods. A node must have at least 128Mi free memory and 100m CPU to schedule this pod. Requests guarantee minimum resources.
Limits are the maximum the container can use. Exceeding memory limits causes OOMKill. Exceeding CPU limits causes throttling.
CPU units: 1000m = 1 core. 100m = 10% of one core. 0.1 and 100m are equivalent.
Memory units: Mi (mebibytes), Gi (gibibytes). Use Mi not MB — they are slightly different.
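The gap between the two units is easy to quantify in bytes:

```javascript
// Mi is mebibytes (2^20 bytes); MB is megabytes (10^6 bytes)
var MiB = 1024 * 1024; // 1,048,576 bytes
var MB = 1000 * 1000;  // 1,000,000 bytes

console.log(256 * MiB);            // 268435456 bytes for a 256Mi limit
console.log(256 * MB);             // 256000000 bytes for 256MB
console.log(256 * MiB - 256 * MB); // 12435456 bytes, roughly 12MB of slack
```

A process sized for "256MB" actually has about 12MB more headroom under a 256Mi limit; the mismatch bites in the other direction when a tool reports MB and the limit is Mi.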
QoS Classes
Kubernetes assigns QoS classes based on how you configure requests and limits.
Guaranteed — requests equal limits for all resources:
resources:
requests:
memory: "256Mi"
cpu: "500m"
limits:
memory: "256Mi"
cpu: "500m"
Guaranteed pods are the last to be evicted under memory pressure. Use this for critical workloads.
Burstable — requests are set but lower than limits:
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
Burstable pods can use more than requested when capacity is available. Most workloads should be Burstable.
BestEffort — no requests or limits set:
resources: {}
BestEffort pods are the first to be evicted. Never use this in production.
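The three classes follow mechanically from the resources block. A simplified sketch of the classification (real QoS is decided per pod across all containers, and requests default to limits when only limits are set):

```javascript
// Classify a single container's resources the way Kubernetes assigns QoS
function qosClass(resources) {
  var req = (resources && resources.requests) || {};
  var lim = (resources && resources.limits) || {};
  var keys = ['cpu', 'memory'];
  var hasAny = keys.some(function(k) { return req[k] || lim[k]; });
  if (!hasAny) return 'BestEffort';
  var guaranteed = keys.every(function(k) {
    // requests default to limits when omitted, so "limits only" is Guaranteed
    return lim[k] && (!req[k] || req[k] === lim[k]);
  });
  return guaranteed ? 'Guaranteed' : 'Burstable';
}

console.log(qosClass({ requests: { cpu: '500m', memory: '256Mi' },
                       limits:   { cpu: '500m', memory: '256Mi' } })); // Guaranteed
console.log(qosClass({ requests: { cpu: '100m', memory: '128Mi' },
                       limits:   { cpu: '500m', memory: '256Mi' } })); // Burstable
console.log(qosClass({})); // BestEffort
```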
OOMKilled: Debugging and Prevention
OOMKilled means the container exceeded its memory limit and was terminated by the kernel.
kubectl describe pod api-abc123
# Last State: Terminated
# Reason: OOMKilled
# Exit Code: 137
Exit code 137 = 128 + 9 (SIGKILL).
Finding the Root Cause
# Check actual memory usage before crash
kubectl top pod api-abc123
# NAME CPU(cores) MEMORY(bytes)
# api-abc123 50m 245Mi # Close to 256Mi limit
# Check container events
kubectl get events --field-selector involvedObject.name=api-abc123
# REASON MESSAGE
# OOMKilling Memory cgroup out of memory: Killed process 1 (node)
Common Causes and Fixes
- V8 heap exceeds container limit. Set --max-old-space-size:
containers:
- name: api
command: ["node", "--max-old-space-size=192", "app.js"]
resources:
limits:
memory: "256Mi"
- Buffer accumulation. Large file uploads or response buffering consumes memory outside V8 heap:
// Bad: buffering entire file in memory
app.post('/upload', function(req, res) {
var chunks = [];
req.on('data', function(chunk) { chunks.push(chunk); });
req.on('end', function() {
var buffer = Buffer.concat(chunks); // Could be hundreds of MB
});
});
// Good: stream to disk
var fs = require('fs');
app.post('/upload', function(req, res) {
var writeStream = fs.createWriteStream('/tmp/upload-' + Date.now());
req.pipe(writeStream);
writeStream.on('finish', function() {
res.json({ status: 'uploaded' });
});
});
- Memory leak. Global caches, event listeners, or closures that never get cleaned up:
// Memory leak: unbounded cache
var cache = {};
app.get('/api/data/:id', function(req, res) {
cache[req.params.id] = fetchData(req.params.id); // Never evicted
res.json(cache[req.params.id]);
});
// Fixed: bounded cache with LRU
var LRU = require('lru-cache');
var cache = new LRU({ max: 1000, ttl: 1000 * 60 * 5 });
CPU Throttling Detection
CPU throttling is harder to detect than OOMKill because the container is not killed — it just runs slower.
# Check throttling via cgroup stats
kubectl exec api-abc123 -- cat /sys/fs/cgroup/cpu.stat
# usage_usec 12345678
# user_usec 10000000
# system_usec 2345678
# nr_periods 5000
# nr_throttled 1200 # 24% of periods were throttled
# throttled_usec 3600000 # 3.6 seconds total throttled time
A high nr_throttled / nr_periods ratio means your CPU limit is too low.
// Monitor event loop lag as a proxy for CPU throttling
var lastCheck = Date.now();
setInterval(function() {
var now = Date.now();
var lag = now - lastCheck - 1000; // Expected interval is 1000ms
lastCheck = now;
if (lag > 100) {
console.warn('Event loop lag: ' + lag + 'ms (possible CPU throttling)');
}
}, 1000);
Event loop lag above 50-100ms in a container with CPU limits usually indicates throttling.
Monitoring Container Resource Usage
Docker Stats
docker stats
# CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O
# abc123 api 2.5% 145MiB / 256MiB 56% 12kB / 8kB
# def456 worker 15.3% 380MiB / 512MiB 74% 1.2MB / 500kB
kubectl top
kubectl top pods
# NAME CPU(cores) MEMORY(bytes)
# api-1 45m 142Mi
# api-2 52m 148Mi
# worker-1 120m 380Mi
kubectl top nodes
# NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
# node-1 450m 22% 2048Mi 52%
Application-Level Monitoring
// metrics/resources.js
var os = require('os');
var v8 = require('v8');
function getResourceMetrics() {
var memUsage = process.memoryUsage();
var heapStats = v8.getHeapStatistics();
var cpus = os.cpus();
return {
memory: {
rss: Math.round(memUsage.rss / 1024 / 1024),
heapTotal: Math.round(memUsage.heapTotal / 1024 / 1024),
heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024),
external: Math.round(memUsage.external / 1024 / 1024),
heapLimit: Math.round(heapStats.heap_size_limit / 1024 / 1024),
containerLimit: process.constrainedMemory()
? Math.round(process.constrainedMemory() / 1024 / 1024)
: null
},
cpu: {
cores: cpus.length,
loadAvg: os.loadavg(),
uptime: Math.round(process.uptime())
}
};
}
// Expose via endpoint
app.get('/metrics/resources', function(req, res) {
res.json(getResourceMetrics());
});
// Log every 60 seconds
setInterval(function() {
var metrics = getResourceMetrics();
var heapPercent = Math.round(
(metrics.memory.heapUsed / metrics.memory.heapLimit) * 100
);
if (heapPercent > 80) {
console.warn('High memory usage: ' + heapPercent + '% of heap limit');
}
console.log('Resources: heap=' + metrics.memory.heapUsed + 'MB/' +
metrics.memory.heapLimit + 'MB, rss=' + metrics.memory.rss + 'MB');
}, 60000);
Right-Sizing Containers for Node.js
Express.js API Server
A typical Express.js API handling JSON requests:
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
Node.js is single-threaded, so one CPU core is the upper bound for a single process. The API server spends most time waiting on I/O (database, Redis, HTTP), so CPU usage is typically low. Memory depends on payload sizes and connection pools.
Background Worker
A worker processing jobs from a queue:
resources:
requests:
memory: "256Mi"
cpu: "200m"
limits:
memory: "512Mi"
cpu: "1000m"
Workers often do more CPU-intensive work (parsing, transforming, generating). Give them more CPU and memory headroom. If using Worker Threads, increase CPU limits proportionally.
Memory-Intensive Processing
A service processing large files or datasets:
containers:
- name: processor
command: ["node", "--max-old-space-size=768", "processor.js"]
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
Set --max-old-space-size to 75% of the memory limit.
Horizontal Pod Autoscaler
Scale based on resource usage:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Pods
value: 1
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 30
policies:
- type: Pods
value: 2
periodSeconds: 60
The HPA scales up when average CPU exceeds 70% or memory exceeds 80%. Scale-down has a 5-minute stabilization window to prevent flapping.
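The scaling decision follows the documented HPA formula, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), clamped to the replica bounds. A small sketch:

```javascript
// The core HPA scaling rule, clamped to minReplicas..maxReplicas
function desiredReplicas(current, currentUtilization, targetUtilization, min, max) {
  var raw = Math.ceil(current * (currentUtilization / targetUtilization));
  return Math.min(max, Math.max(min, raw));
}

// 3 replicas at 90% average CPU against a 70% target, bounds 2..10
console.log(desiredReplicas(3, 90, 70, 2, 10)); // 4
// Load drops to 30%: the min bound holds the floor
console.log(desiredReplicas(3, 30, 70, 2, 10)); // 2
```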
Complete Working Example
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
spec:
replicas: 3
selector:
matchLabels:
app: api
template:
metadata:
labels:
app: api
spec:
containers:
- name: api
image: myapp:1.0.0
command: ["node", "--max-old-space-size=192", "app.js"]
ports:
- containerPort: 3000
env:
- name: NODE_ENV
value: "production"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: api-secrets
key: database-url
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
startupProbe:
httpGet:
path: /health
port: 3000
failureThreshold: 30
periodSeconds: 5
livenessProbe:
httpGet:
path: /health
port: 3000
periodSeconds: 20
timeoutSeconds: 5
readinessProbe:
httpGet:
path: /health/ready
port: 3000
periodSeconds: 10
timeoutSeconds: 5
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: api-pdb
spec:
minAvailable: 1
selector:
matchLabels:
app: api
// monitoring/resource-monitor.js
var v8 = require('v8');
var os = require('os');
function ResourceMonitor(options) {
this.interval = (options && options.interval) || 30000;
this.warnThreshold = (options && options.warnThreshold) || 80;
this.critThreshold = (options && options.critThreshold) || 90;
this.timer = null;
}
ResourceMonitor.prototype.start = function() {
var self = this;
this.timer = setInterval(function() {
self.check();
}, this.interval);
};
ResourceMonitor.prototype.stop = function() {
if (this.timer) clearInterval(this.timer);
};
ResourceMonitor.prototype.check = function() {
var memUsage = process.memoryUsage();
var heapStats = v8.getHeapStatistics();
var containerLimit = process.constrainedMemory();
var heapPercent = Math.round(
(memUsage.heapUsed / heapStats.heap_size_limit) * 100
);
var rssPercent = containerLimit
? Math.round((memUsage.rss / containerLimit) * 100)
: null;
if (heapPercent >= this.critThreshold) {
console.error('CRITICAL: Heap usage at ' + heapPercent + '%' +
' (' + Math.round(memUsage.heapUsed / 1024 / 1024) + 'MB' +
' / ' + Math.round(heapStats.heap_size_limit / 1024 / 1024) + 'MB)');
} else if (heapPercent >= this.warnThreshold) {
console.warn('WARNING: Heap usage at ' + heapPercent + '%' +
' (' + Math.round(memUsage.heapUsed / 1024 / 1024) + 'MB' +
' / ' + Math.round(heapStats.heap_size_limit / 1024 / 1024) + 'MB)');
}
if (rssPercent && rssPercent >= this.critThreshold) {
console.error('CRITICAL: Container memory at ' + rssPercent + '%' +
' (' + Math.round(memUsage.rss / 1024 / 1024) + 'MB' +
' / ' + Math.round(containerLimit / 1024 / 1024) + 'MB)');
}
return {
heapPercent: heapPercent,
rssPercent: rssPercent,
heapUsedMB: Math.round(memUsage.heapUsed / 1024 / 1024),
rssMB: Math.round(memUsage.rss / 1024 / 1024)
};
};
module.exports = ResourceMonitor;
// Usage in app.js
var ResourceMonitor = require('./monitoring/resource-monitor');
var monitor = new ResourceMonitor({
interval: 30000,
warnThreshold: 75,
critThreshold: 90
});
monitor.start();
process.on('SIGTERM', function() {
monitor.stop();
// ... rest of shutdown
});
Common Issues and Troubleshooting
1. OOMKilled Despite Low Heap Usage
kubectl logs api-abc123 --previous
# Last log: heap=85MB/192MB rss=248MB
# But container limit is 256Mi
RSS exceeds the container limit even though heap is fine. The difference is native memory: Buffers, C++ allocations, shared libraries. Increase the memory limit or reduce Buffer usage.
# Check what is using memory
kubectl exec api-abc123 -- node -e "
var m = process.memoryUsage();
console.log('Heap:', Math.round(m.heapUsed/1024/1024), 'MB');
console.log('External:', Math.round(m.external/1024/1024), 'MB');
console.log('ArrayBuffers:', Math.round(m.arrayBuffers/1024/1024), 'MB');
console.log('RSS:', Math.round(m.rss/1024/1024), 'MB');
console.log('Gap (native):', Math.round((m.rss - m.heapTotal - m.external)/1024/1024), 'MB');
"
2. CPU Throttling Causing Latency
# Response times jumped from 50ms to 500ms
# kubectl top shows CPU at limit
The container's CPU limit is too low. Increase it or scale horizontally:
resources:
limits:
cpu: "1000m" # Was 500m
Or check for CPU-intensive operations that should be offloaded to worker threads:
var Worker = require('worker_threads').Worker;
app.post('/api/process', function(req, res) {
var worker = new Worker('./heavy-computation.js', {
workerData: req.body
});
worker.on('message', function(result) {
res.json(result);
});
worker.on('error', function(err) {
res.status(500).json({ error: err.message });
});
});
3. Pods Pending Due to Insufficient Resources
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# api-4 0/1 Pending 0 5m
kubectl describe pod api-4
# Events:
# Warning FailedScheduling 0/3 nodes are available:
# 3 Insufficient memory.
Your requests exceed available node capacity. Options:
- Reduce memory requests (if pods use less than requested)
- Add more nodes
- Use Cluster Autoscaler
4. Node.js Ignoring Container Memory Limit
# On older Node.js versions
node -e "console.log(require('v8').getHeapStatistics().heap_size_limit)"
# 1518338048 (1.4GB - V8 default, ignoring 256MB container limit)
Node.js versions before 18 do not read cgroup limits by default. Either upgrade Node.js or set --max-old-space-size explicitly:
command: ["node", "--max-old-space-size=192", "app.js"]
Best Practices
- Always set both requests and limits. Pods without resource specs get BestEffort QoS and are first to be evicted under pressure.
- Set --max-old-space-size to 75% of the memory limit. This leaves headroom for native memory, buffers, and overhead.
- Use Burstable QoS for most workloads. Set requests lower than limits to allow bursting when capacity is available.
- Monitor actual usage before right-sizing. Run kubectl top and collect application metrics for a week before adjusting limits. Do not guess.
- Disable swap in containers. Set --memory-swap equal to --memory in Docker. Swap causes unpredictable latency.
- Use HPA for CPU-bound workloads. Horizontal scaling is usually better than vertical scaling for Node.js because of the single-threaded event loop.
- Set PodDisruptionBudgets. Ensure at least one pod remains available during node maintenance or scaling events.
- Profile GC behavior under load. Use --trace-gc in staging to understand GC patterns before setting production memory limits.
- Track event loop lag as a CPU throttling indicator. Lag above 100ms in a resource-limited container usually means CPU throttling.