Container Resource Limits and Requests
Master container resource management for Node.js applications in Docker and Kubernetes, covering memory limits, CPU allocation, OOMKilled debugging, and right-sizing strategies.
Running Node.js in containers without resource limits is like driving without a speedometer — everything works until it does not. An unconstrained Node.js process can consume all available memory on a node, starving other containers and potentially crashing the host. Resource limits and requests are how you tell the container runtime exactly how much CPU and memory your application needs and how much it is allowed to use. Getting these numbers right is the difference between stable production workloads and 3 AM pages.
Prerequisites
- Docker Desktop v4.0+ or Docker Engine
- kubectl and a Kubernetes cluster
- Node.js 18+
- Understanding of V8 memory management
- Basic familiarity with Docker and Kubernetes resource concepts
Docker Memory Limits
Docker uses Linux cgroups to enforce memory limits on containers.
# Run with a 256MB memory limit
docker run --memory=256m --memory-swap=256m node:20-alpine node -e "
console.log('Memory limit:', process.constrainedMemory());
console.log('Heap stats:', JSON.stringify(require('v8').getHeapStatistics(), null, 2));
"
The flags:
- --memory=256m: Maximum memory the container can use (includes heap, stack, buffers)
- --memory-swap=256m: Total memory plus swap limit. Setting it equal to --memory disables swap
# Output:
# Memory limit: 268435456 (256MB in bytes)
# Heap stats: {
# "total_heap_size": 5242880,
# "heap_size_limit": 135266304, // ~129MB - V8 auto-adjusts
# ...
# }
Node.js 18+ reads the cgroup memory limit and adjusts V8's heap limit automatically. Before Node 18, V8 would use its default heap limit (about 1.5GB on 64-bit systems) regardless of the container limit, often leading to OOMKills.
Memory Limit Flags
# Soft limit (memory reservation) - used for scheduling, not enforcement
docker run --memory-reservation=128m myapp
# Hard limit - container is killed if exceeded
docker run --memory=256m myapp
# Disable swap
docker run --memory=256m --memory-swap=256m myapp
# Allow 128MB swap in addition to 256MB memory
docker run --memory=256m --memory-swap=384m myapp
In production, always set --memory-swap equal to --memory to disable swap. Swap inside containers causes unpredictable latency spikes.
Docker CPU Limits
CPU limits are more nuanced than memory because CPU is a compressible resource — the container is throttled, not killed.
# Limit to 0.5 CPU cores
docker run --cpus=0.5 myapp
# Limit to specific CPU cores
docker run --cpuset-cpus="0,1" myapp
# Relative CPU weight (default: 1024)
docker run --cpu-shares=512 myapp
The difference matters:
- --cpus=0.5: Hard limit. The container gets at most 50% of one core's time. Even if the host has idle CPUs, the container is throttled.
- --cpu-shares=512: Relative weight. With no contention, the container can use all available CPU. When other containers compete, shares determine proportional allocation.
For Node.js applications, --cpus is more predictable. A single-threaded Node.js app rarely benefits from more than 1 CPU core (the event loop runs on one core).
# Good for a typical Express.js API
docker run --cpus=1 --memory=256m myapp
# Good for a worker processing background jobs
docker run --cpus=0.5 --memory=512m myworker
Node.js Memory Management in Containers
V8's garbage collector needs headroom. If you set a 256MB container limit, V8 will not use all 256MB for heap — the process also needs memory for the stack, native code, buffers, and libuv.
Aligning --max-old-space-size with Container Limits
# Container has 512MB limit
docker run --memory=512m myapp node --max-old-space-size=384 app.js
The rule of thumb: set --max-old-space-size to 75% of the container memory limit. This leaves room for:
- V8 new space (semi-space) and code space
- Native (C++) memory allocations
- Buffer allocations (outside V8 heap)
- Stack space
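The 75% rule reduces to a one-line helper; a minimal sketch (the function name is illustrative):

```javascript
// Recommended --max-old-space-size (in MB) from a container limit, per the 75% rule
function recommendedOldSpaceMB(containerLimitMB) {
  if (!Number.isFinite(containerLimitMB) || containerLimitMB <= 0) {
    throw new RangeError('containerLimitMB must be a positive number');
  }
  return Math.floor(containerLimitMB * 0.75);
}

console.log(recommendedOldSpaceMB(512)); // 384, matching the example above
console.log(recommendedOldSpaceMB(256)); // 192
```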
// Check effective memory limits at runtime
var v8 = require('v8');
var os = require('os');
function getMemoryInfo() {
var heapStats = v8.getHeapStatistics();
var memUsage = process.memoryUsage();
return {
heapSizeLimit: Math.round(heapStats.heap_size_limit / 1024 / 1024) + 'MB',
heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024) + 'MB',
rss: Math.round(memUsage.rss / 1024 / 1024) + 'MB',
external: Math.round(memUsage.external / 1024 / 1024) + 'MB',
arrayBuffers: Math.round(memUsage.arrayBuffers / 1024 / 1024) + 'MB',
containerLimit: process.constrainedMemory()
? Math.round(process.constrainedMemory() / 1024 / 1024) + 'MB'
: 'unlimited'
};
}
console.log(getMemoryInfo());
// {
// heapSizeLimit: '256MB',
// heapUsed: '12MB',
// rss: '45MB',
// external: '1MB',
// arrayBuffers: '0MB',
// containerLimit: '512MB'
// }
Garbage Collection in Constrained Environments
Node.js runs GC more aggressively in memory-constrained containers. This can increase latency. Monitor GC with the --trace-gc flag:
docker run --memory=256m myapp node --trace-gc app.js
# Output:
# [12345:0x1234567] 100 ms: Scavenge 4.2 (5.0) -> 3.1 (6.0) MB, 1.2 / 0.0 ms
# [12345:0x1234567] 250 ms: Mark-sweep 8.1 (10.0) -> 5.2 (10.0) MB, 3.5 / 0.0 ms
For production, expose GC metrics via a health endpoint:
var gcStats = { count: 0, totalDuration: 0, lastDuration: 0 };
try {
var perf_hooks = require('perf_hooks');
var observer = new perf_hooks.PerformanceObserver(function(list) {
var entries = list.getEntries();
entries.forEach(function(entry) {
gcStats.count++;
gcStats.totalDuration += entry.duration;
gcStats.lastDuration = entry.duration;
});
});
observer.observe({ entryTypes: ['gc'] });
} catch (e) {
// Performance observer not available
}
app.get('/metrics/gc', function(req, res) {
res.json({
gcCount: gcStats.count,
totalGcDuration: Math.round(gcStats.totalDuration) + 'ms',
lastGcDuration: Math.round(gcStats.lastDuration) + 'ms',
avgGcDuration: gcStats.count > 0
? Math.round(gcStats.totalDuration / gcStats.count) + 'ms'
: '0ms'
});
});
Kubernetes Resource Requests and Limits
Kubernetes separates resource configuration into requests and limits.
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
Requests are what the scheduler uses to place pods. A node must have at least 128Mi free memory and 100m CPU to schedule this pod. Requests guarantee minimum resources.
Limits are the maximum the container can use. Exceeding memory limits causes OOMKill. Exceeding CPU limits causes throttling.
CPU units: 1000m = 1 core. 100m = 10% of one core. 0.1 and 100m are equivalent.
Memory units: Mi (mebibytes), Gi (gibibytes). Use Mi not MB — they are slightly different.
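The gap between the two units is easy to quantify in bytes:

```javascript
// Mi is mebibytes (2^20 bytes); MB is megabytes (10^6 bytes)
var MiB = 1024 * 1024; // 1,048,576 bytes
var MB = 1000 * 1000;  // 1,000,000 bytes

console.log(256 * MiB);            // 268435456 bytes for a 256Mi limit
console.log(256 * MB);             // 256000000 bytes for 256MB
console.log(256 * MiB - 256 * MB); // 12435456 bytes, roughly 12MB of slack
```

A process sized for "256MB" actually has about 12MB more headroom under a 256Mi limit; the mismatch bites in the other direction when a tool reports MB and the limit is Mi.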
QoS Classes
Kubernetes assigns QoS classes based on how you configure requests and limits.
Guaranteed — requests equal limits for all resources:
resources:
requests:
memory: "256Mi"
cpu: "500m"
limits:
memory: "256Mi"
cpu: "500m"
Guaranteed pods are the last to be evicted under memory pressure. Use this for critical workloads.
Burstable — requests are set but lower than limits:
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
Burstable pods can use more than requested when capacity is available. Most workloads should be Burstable.
BestEffort — no requests or limits set:
resources: {}
BestEffort pods are the first to be evicted. Never use this in production.
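The three classes follow mechanically from the resources block. A simplified sketch of the classification (real QoS is decided per pod across all containers, and requests default to limits when only limits are set):

```javascript
// Classify a single container's resources the way Kubernetes assigns QoS
function qosClass(resources) {
  var req = (resources && resources.requests) || {};
  var lim = (resources && resources.limits) || {};
  var keys = ['cpu', 'memory'];
  var hasAny = keys.some(function(k) { return req[k] || lim[k]; });
  if (!hasAny) return 'BestEffort';
  var guaranteed = keys.every(function(k) {
    // requests default to limits when omitted, so "limits only" is Guaranteed
    return lim[k] && (!req[k] || req[k] === lim[k]);
  });
  return guaranteed ? 'Guaranteed' : 'Burstable';
}

console.log(qosClass({ requests: { cpu: '500m', memory: '256Mi' },
                       limits:   { cpu: '500m', memory: '256Mi' } })); // Guaranteed
console.log(qosClass({ requests: { cpu: '100m', memory: '128Mi' },
                       limits:   { cpu: '500m', memory: '256Mi' } })); // Burstable
console.log(qosClass({})); // BestEffort
```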
OOMKilled: Debugging and Prevention
OOMKilled means the container exceeded its memory limit and was terminated by the kernel.
kubectl describe pod api-abc123
# Last State: Terminated
# Reason: OOMKilled
# Exit Code: 137
Exit code 137 = 128 + 9 (SIGKILL).
Finding the Root Cause
# Check actual memory usage before crash
kubectl top pod api-abc123
# NAME CPU(cores) MEMORY(bytes)
# api-abc123 50m 245Mi # Close to 256Mi limit
# Check container events
kubectl get events --field-selector involvedObject.name=api-abc123
# REASON MESSAGE
# OOMKilling Memory cgroup out of memory: Killed process 1 (node)
Common Causes and Fixes
- V8 heap exceeds container limit. Set --max-old-space-size:
containers:
- name: api
command: ["node", "--max-old-space-size=192", "app.js"]
resources:
limits:
memory: "256Mi"
- Buffer accumulation. Large file uploads or response buffering consumes memory outside V8 heap:
// Bad: buffering entire file in memory
app.post('/upload', function(req, res) {
var chunks = [];
req.on('data', function(chunk) { chunks.push(chunk); });
req.on('end', function() {
var buffer = Buffer.concat(chunks); // Could be hundreds of MB
});
});
// Good: stream to disk
var fs = require('fs');
app.post('/upload', function(req, res) {
var writeStream = fs.createWriteStream('/tmp/upload-' + Date.now());
req.pipe(writeStream);
writeStream.on('finish', function() {
res.json({ status: 'uploaded' });
});
});
- Memory leak. Global caches, event listeners, or closures that never get cleaned up:
// Memory leak: unbounded cache
var cache = {};
app.get('/api/data/:id', function(req, res) {
cache[req.params.id] = fetchData(req.params.id); // Never evicted
res.json(cache[req.params.id]);
});
// Fixed: bounded cache with LRU
var LRU = require('lru-cache');
var cache = new LRU({ max: 1000, ttl: 1000 * 60 * 5 });
CPU Throttling Detection
CPU throttling is harder to detect than OOMKill because the container is not killed — it just runs slower.
# Check throttling via cgroup stats
kubectl exec api-abc123 -- cat /sys/fs/cgroup/cpu.stat
# usage_usec 12345678
# user_usec 10000000
# system_usec 2345678
# nr_periods 5000
# nr_throttled 1200 # 24% of periods were throttled
# throttled_usec 3600000 # 3.6 seconds total throttled time
A high nr_throttled / nr_periods ratio means your CPU limit is too low.
// Monitor event loop lag as a proxy for CPU throttling
var lastCheck = Date.now();
setInterval(function() {
var now = Date.now();
var lag = now - lastCheck - 1000; // Expected interval is 1000ms
lastCheck = now;
if (lag > 100) {
console.warn('Event loop lag: ' + lag + 'ms (possible CPU throttling)');
}
}, 1000);
Event loop lag above 50-100ms in a container with CPU limits usually indicates throttling.
Monitoring Container Resource Usage
Docker Stats
docker stats
# CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O
# abc123 api 2.5% 145MiB / 256MiB 56% 12kB / 8kB
# def456 worker 15.3% 380MiB / 512MiB 74% 1.2MB / 500kB
kubectl top
kubectl top pods
# NAME CPU(cores) MEMORY(bytes)
# api-1 45m 142Mi
# api-2 52m 148Mi
# worker-1 120m 380Mi
kubectl top nodes
# NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
# node-1 450m 22% 2048Mi 52%
Application-Level Monitoring
// metrics/resources.js
var os = require('os');
var v8 = require('v8');
function getResourceMetrics() {
var memUsage = process.memoryUsage();
var heapStats = v8.getHeapStatistics();
var cpus = os.cpus();
return {
memory: {
rss: Math.round(memUsage.rss / 1024 / 1024),
heapTotal: Math.round(memUsage.heapTotal / 1024 / 1024),
heapUsed: Math.round(memUsage.heapUsed / 1024 / 1024),
external: Math.round(memUsage.external / 1024 / 1024),
heapLimit: Math.round(heapStats.heap_size_limit / 1024 / 1024),
containerLimit: process.constrainedMemory()
? Math.round(process.constrainedMemory() / 1024 / 1024)
: null
},
cpu: {
cores: cpus.length,
loadAvg: os.loadavg(),
uptime: Math.round(process.uptime())
}
};
}
// Expose via endpoint
app.get('/metrics/resources', function(req, res) {
res.json(getResourceMetrics());
});
// Log every 60 seconds
setInterval(function() {
var metrics = getResourceMetrics();
var heapPercent = Math.round(
(metrics.memory.heapUsed / metrics.memory.heapLimit) * 100
);
if (heapPercent > 80) {
console.warn('High memory usage: ' + heapPercent + '% of heap limit');
}
console.log('Resources: heap=' + metrics.memory.heapUsed + 'MB/' +
metrics.memory.heapLimit + 'MB, rss=' + metrics.memory.rss + 'MB');
}, 60000);
Right-Sizing Containers for Node.js
Express.js API Server
A typical Express.js API handling JSON requests:
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
Node.js is single-threaded, so one CPU core is the upper bound for a single process. The API server spends most time waiting on I/O (database, Redis, HTTP), so CPU usage is typically low. Memory depends on payload sizes and connection pools.
Background Worker
A worker processing jobs from a queue:
resources:
requests:
memory: "256Mi"
cpu: "200m"
limits:
memory: "512Mi"
cpu: "1000m"
Workers often do more CPU-intensive work (parsing, transforming, generating). Give them more CPU and memory headroom. If using Worker Threads, increase CPU limits proportionally.
Memory-Intensive Processing
A service processing large files or datasets:
containers:
- name: processor
command: ["node", "--max-old-space-size=768", "processor.js"]
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
Set --max-old-space-size to 75% of the memory limit.
Horizontal Pod Autoscaler
Scale based on resource usage:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Pods
value: 1
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 30
policies:
- type: Pods
value: 2
periodSeconds: 60
The HPA scales up when average CPU exceeds 70% or memory exceeds 80%. Scale-down has a 5-minute stabilization window to prevent flapping.
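The scaling decision follows the documented HPA formula, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), clamped to the replica bounds. A small sketch:

```javascript
// The core HPA scaling rule, clamped to minReplicas..maxReplicas
function desiredReplicas(current, currentUtilization, targetUtilization, min, max) {
  var raw = Math.ceil(current * (currentUtilization / targetUtilization));
  return Math.min(max, Math.max(min, raw));
}

// 3 replicas at 90% average CPU against a 70% target, bounds 2..10
console.log(desiredReplicas(3, 90, 70, 2, 10)); // 4
// Load drops to 30%: the min bound holds the floor
console.log(desiredReplicas(3, 30, 70, 2, 10)); // 2
```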
Complete Working Example
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
spec:
replicas: 3
selector:
matchLabels:
app: api
template:
metadata:
labels:
app: api
spec:
containers:
- name: api
image: myapp:1.0.0
command: ["node", "--max-old-space-size=192", "app.js"]
ports:
- containerPort: 3000
env:
- name: NODE_ENV
value: "production"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: api-secrets
key: database-url
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
startupProbe:
httpGet:
path: /health
port: 3000
failureThreshold: 30
periodSeconds: 5
livenessProbe:
httpGet:
path: /health
port: 3000
periodSeconds: 20
timeoutSeconds: 5
readinessProbe:
httpGet:
path: /health/ready
port: 3000
periodSeconds: 10
timeoutSeconds: 5
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: api-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: api-pdb
spec:
minAvailable: 1
selector:
matchLabels:
app: api
// monitoring/resource-monitor.js
var v8 = require('v8');
var os = require('os');
function ResourceMonitor(options) {
this.interval = (options && options.interval) || 30000;
this.warnThreshold = (options && options.warnThreshold) || 80;
this.critThreshold = (options && options.critThreshold) || 90;
this.timer = null;
}
ResourceMonitor.prototype.start = function() {
var self = this;
this.timer = setInterval(function() {
self.check();
}, this.interval);
};
ResourceMonitor.prototype.stop = function() {
if (this.timer) clearInterval(this.timer);
};
ResourceMonitor.prototype.check = function() {
var memUsage = process.memoryUsage();
var heapStats = v8.getHeapStatistics();
var containerLimit = process.constrainedMemory();
var heapPercent = Math.round(
(memUsage.heapUsed / heapStats.heap_size_limit) * 100
);
var rssPercent = containerLimit
? Math.round((memUsage.rss / containerLimit) * 100)
: null;
if (heapPercent >= this.critThreshold) {
console.error('CRITICAL: Heap usage at ' + heapPercent + '%' +
' (' + Math.round(memUsage.heapUsed / 1024 / 1024) + 'MB' +
' / ' + Math.round(heapStats.heap_size_limit / 1024 / 1024) + 'MB)');
} else if (heapPercent >= this.warnThreshold) {
console.warn('WARNING: Heap usage at ' + heapPercent + '%' +
' (' + Math.round(memUsage.heapUsed / 1024 / 1024) + 'MB' +
' / ' + Math.round(heapStats.heap_size_limit / 1024 / 1024) + 'MB)');
}
if (rssPercent && rssPercent >= this.critThreshold) {
console.error('CRITICAL: Container memory at ' + rssPercent + '%' +
' (' + Math.round(memUsage.rss / 1024 / 1024) + 'MB' +
' / ' + Math.round(containerLimit / 1024 / 1024) + 'MB)');
}
return {
heapPercent: heapPercent,
rssPercent: rssPercent,
heapUsedMB: Math.round(memUsage.heapUsed / 1024 / 1024),
rssMB: Math.round(memUsage.rss / 1024 / 1024)
};
};
module.exports = ResourceMonitor;
// Usage in app.js
var ResourceMonitor = require('./monitoring/resource-monitor');
var monitor = new ResourceMonitor({
interval: 30000,
warnThreshold: 75,
critThreshold: 90
});
monitor.start();
process.on('SIGTERM', function() {
monitor.stop();
// ... rest of shutdown
});
Common Issues and Troubleshooting
1. OOMKilled Despite Low Heap Usage
kubectl logs api-abc123 --previous
# Last log: heap=85MB/192MB rss=248MB
# But container limit is 256Mi
RSS exceeds the container limit even though heap is fine. The difference is native memory: Buffers, C++ allocations, shared libraries. Increase the memory limit or reduce Buffer usage.
# Check what is using memory
kubectl exec api-abc123 -- node -e "
var m = process.memoryUsage();
console.log('Heap:', Math.round(m.heapUsed/1024/1024), 'MB');
console.log('External:', Math.round(m.external/1024/1024), 'MB');
console.log('ArrayBuffers:', Math.round(m.arrayBuffers/1024/1024), 'MB');
console.log('RSS:', Math.round(m.rss/1024/1024), 'MB');
console.log('Gap (native):', Math.round((m.rss - m.heapTotal - m.external)/1024/1024), 'MB');
"
2. CPU Throttling Causing Latency
# Response times jumped from 50ms to 500ms
# kubectl top shows CPU at limit
The container's CPU limit is too low. Increase it or scale horizontally:
resources:
limits:
cpu: "1000m" # Was 500m
Or check for CPU-intensive operations that should be offloaded to worker threads:
var Worker = require('worker_threads').Worker;
app.post('/api/process', function(req, res) {
var worker = new Worker('./heavy-computation.js', {
workerData: req.body
});
worker.on('message', function(result) {
res.json(result);
});
worker.on('error', function(err) {
res.status(500).json({ error: err.message });
});
});
3. Pods Pending Due to Insufficient Resources
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# api-4 0/1 Pending 0 5m
kubectl describe pod api-4
# Events:
# Warning FailedScheduling 0/3 nodes are available:
# 3 Insufficient memory.
Your requests exceed available node capacity. Options:
- Reduce memory requests (if pods use less than requested)
- Add more nodes
- Use Cluster Autoscaler
4. Node.js Ignoring Container Memory Limit
# On older Node.js versions
node -e "console.log(require('v8').getHeapStatistics().heap_size_limit)"
# 1518338048 (1.4GB - V8 default, ignoring 256MB container limit)
Node.js versions before 18 do not read cgroup limits by default. Either upgrade Node.js or set --max-old-space-size explicitly:
command: ["node", "--max-old-space-size=192", "app.js"]
Best Practices
- Always set both requests and limits. Pods without resource specs get BestEffort QoS and are first to be evicted under pressure.
- Set --max-old-space-size to 75% of the memory limit. This leaves headroom for native memory, buffers, and overhead.
- Use Burstable QoS for most workloads. Set requests lower than limits to allow bursting when capacity is available.
- Monitor actual usage before right-sizing. Run kubectl top and collect application metrics for a week before adjusting limits. Do not guess.
- Disable swap in containers. Set --memory-swap equal to --memory in Docker. Swap causes unpredictable latency.
- Use HPA for CPU-bound workloads. Horizontal scaling is usually better than vertical scaling for Node.js because of the single-threaded event loop.
- Set PodDisruptionBudgets. Ensure at least one pod remains available during node maintenance or scaling events.
- Profile GC behavior under load. Use --trace-gc in staging to understand GC patterns before setting production memory limits.
- Track event loop lag as a CPU throttling indicator. Lag above 100ms in a resource-limited container usually means CPU throttling.