Serverless Cost Modeling and Optimization
Model and optimize serverless costs with Lambda pricing analysis, memory tuning, and cost comparison tools built in Node.js
Serverless computing promises you only pay for what you use, but without deliberate cost modeling, monthly bills can spiral past what a dedicated server would cost. This article breaks down every component of serverless pricing on AWS, builds a practical cost calculator in Node.js, and walks through optimization strategies that can cut your bill by 40-60%. If you are running production serverless workloads and have not modeled your costs in a spreadsheet or script, you are almost certainly overpaying.
Prerequisites
- Working knowledge of AWS Lambda, API Gateway, DynamoDB, and S3
- Node.js v16+ installed locally
- An AWS account with billing access
- Familiarity with CloudWatch metrics
- Basic understanding of request/response patterns in web applications
Lambda Pricing Model
Lambda pricing has three dimensions that compound in ways most teams underestimate: requests, duration, and memory. Understanding each one individually is the starting point for any cost model.
Requests
AWS charges $0.20 per 1 million requests after a free tier of 1 million requests per month. This sounds cheap until you do the math: a moderately trafficked API handling 100 requests per second generates 259.2 million requests per month, which is $51.64 in request charges alone (after the free tier) before you even consider execution time.
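That arithmetic is easy to verify in a few lines:

```javascript
// Request charges for a 100 req/s API, using the US-East-1 rate quoted above
var REQUEST_PRICE_PER_MILLION = 0.20;
var FREE_REQUESTS = 1000000;

var monthlyRequests = 100 * 86400 * 30; // 259,200,000
var billable = Math.max(0, monthlyRequests - FREE_REQUESTS);
var requestCharges = (billable / 1000000) * REQUEST_PRICE_PER_MILLION;

console.log('$' + requestCharges.toFixed(2)); // $51.64
```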
Duration
Duration is billed in 1ms increments at a rate determined by the memory you allocate. The base rate for 128MB is $0.0000000021 per millisecond; a function allocated 1024MB costs $0.0000000167 per millisecond. The critical insight is that while doubling memory doubles the per-millisecond rate, it also roughly doubles the CPU share AWS allocates, which often halves execution time for CPU-bound work — so total cost can stay flat, or even drop, while latency improves.
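To make that tradeoff concrete, here is a small sketch. The halved duration is a hypothetical, typical only of CPU-bound handlers:

```javascript
// Hypothetical CPU-bound handler whose duration halves when memory doubles
var GB_SECOND_PRICE = 0.0000166667;

function durationCost(memoryMB, durationMs) {
  return (memoryMB / 1024) * (durationMs / 1000) * GB_SECOND_PRICE;
}

var at512 = durationCost(512, 400);   // 0.5 GB for 0.4 s
var at1024 = durationCost(1024, 200); // 1.0 GB for 0.2 s

console.log(at512 === at1024); // true — same GB-seconds, half the latency
```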
Memory
Memory allocation ranges from 128MB to 10,240MB. CPU scales with memory: AWS documents that at 1,769MB a function receives the equivalent of one full vCPU, and below that you get a proportional fraction. This is why a 256MB function often takes more than twice as long as a 512MB function for CPU-bound work.
// lambda-pricing.js — Core pricing constants for US-East-1
var LAMBDA_PRICING = {
requestCostPerMillion: 0.20,
freeRequestsPerMonth: 1000000,
// Price per GB-second
gbSecondPrice: 0.0000166667,
freeGbSecondsPerMonth: 400000,
// Provisioned concurrency
provisionedConcurrencyPrice: 0.0000041667, // per GB-second of warm capacity
provisionedDurationPrice: 0.0000097222 // per GB-second (reduced duration rate)
};
function calculateLambdaCost(config) {
var monthlyRequests = config.requestsPerSecond * 86400 * 30;
var billableRequests = Math.max(0, monthlyRequests - LAMBDA_PRICING.freeRequestsPerMonth);
var requestCost = (billableRequests / 1000000) * LAMBDA_PRICING.requestCostPerMillion;
var memoryGB = config.memoryMB / 1024;
var durationSeconds = config.avgDurationMs / 1000;
var totalGbSeconds = monthlyRequests * memoryGB * durationSeconds;
var billableGbSeconds = Math.max(0, totalGbSeconds - LAMBDA_PRICING.freeGbSecondsPerMonth);
var computeCost = billableGbSeconds * LAMBDA_PRICING.gbSecondPrice;
return {
monthlyRequests: monthlyRequests,
requestCost: requestCost,
computeCost: computeCost,
totalCost: requestCost + computeCost
};
}
// Example: 50 req/s, 512MB, 200ms average duration
var result = calculateLambdaCost({
requestsPerSecond: 50,
memoryMB: 512,
avgDurationMs: 200
});
console.log('Monthly Lambda cost:', '$' + result.totalCost.toFixed(2));
// Monthly Lambda cost: $235.05
API Gateway Pricing
API Gateway is the hidden cost multiplier in serverless architectures. REST APIs cost $3.50 per million requests. HTTP APIs cost $1.00 per million requests. WebSocket APIs charge $1.00 per million connection minutes plus $1.00 per million messages.
If you are using REST API when HTTP API would suffice, you are paying 3.5x more than necessary for that layer. HTTP APIs support JWT authorizers, Lambda integrations, and most common patterns. The main features you lose are request validation, WAF integration, and usage plans. For most internal APIs, HTTP API is the correct choice.
// api-gateway-cost.js
var API_GATEWAY_PRICING = {
restApi: {
perMillionRequests: 3.50,
cachingPerHour: { // keyed by cache size in GB; rate is per hour
'0.5': 0.020,
'1.6': 0.038,
'6.1': 0.200,
'13.5': 0.250,
'28.4': 0.500,
'58.2': 1.000,
'118': 1.900,
'237': 3.800
}
},
httpApi: {
perMillionRequests: 1.00,
freeRequestsPerMonth: 1000000 // first 12 months
}
};
function calculateApiGatewayCost(type, monthlyRequests, cacheGB) {
var pricing = API_GATEWAY_PRICING[type];
var requestCost = (monthlyRequests / 1000000) * pricing.perMillionRequests;
var cacheCost = 0;
if (type === 'restApi' && cacheGB) {
var hourlyRate = pricing.cachingPerHour[cacheGB] || 0;
cacheCost = hourlyRate * 730; // hours per month
}
return {
requestCost: requestCost,
cacheCost: cacheCost,
totalCost: requestCost + cacheCost
};
}
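As a usage sketch, plugging the 100 req/s example from the Lambda section (roughly 259.2M requests per month) into the rates above shows how the 3.5x gap compounds:

```javascript
// REST vs HTTP API at 259.2M requests/month, using the rates above
var monthlyRequests = 259200000;
var restCost = (monthlyRequests / 1000000) * 3.50;
var httpCost = (monthlyRequests / 1000000) * 1.00;

console.log('REST API: $' + restCost.toFixed(2)); // $907.20
console.log('HTTP API: $' + httpCost.toFixed(2)); // $259.20
```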
DynamoDB Capacity Modes and Cost
DynamoDB offers two capacity modes and your choice here can swing costs by an order of magnitude. On-demand mode charges $1.25 per million write request units (WRU) and $0.25 per million read request units (RRU). Provisioned mode charges $0.00065 per WCU per hour and $0.00013 per RCU per hour. Storage costs $0.25 per GB per month in either mode.
The crossover point where provisioned becomes cheaper than on-demand is roughly 14.4% utilization of provisioned capacity. If your table handles steady traffic that uses at least 14.4% of its provisioned throughput, provisioned mode saves money. If your traffic is bursty with long idle periods, on-demand wins.
// dynamodb-cost.js
var DYNAMODB_PRICING = {
onDemand: {
writePerMillion: 1.25,
readPerMillion: 0.25
},
provisioned: {
wcuPerHour: 0.00065,
rcuPerHour: 0.00013,
hoursPerMonth: 730
},
storagePerGB: 0.25,
// Reserved capacity (1-year commitment)
reserved1Year: {
writeUpfront: 150.00, // per 100 WCU
writeHourly: 0.000128, // per WCU
readUpfront: 30.00, // per 100 RCU
readHourly: 0.0000256 // per RCU
}
};
function calculateDynamoDBCost(config) {
var storageCost = config.storageGB * DYNAMODB_PRICING.storagePerGB;
if (config.mode === 'on-demand') {
var writeCost = (config.monthlyWrites / 1000000) * DYNAMODB_PRICING.onDemand.writePerMillion;
var readCost = (config.monthlyReads / 1000000) * DYNAMODB_PRICING.onDemand.readPerMillion;
return {
mode: 'on-demand',
writeCost: writeCost,
readCost: readCost,
storageCost: storageCost,
totalCost: writeCost + readCost + storageCost
};
}
// Provisioned mode
var writeHourlyCost = config.provisionedWCU * DYNAMODB_PRICING.provisioned.wcuPerHour;
var readHourlyCost = config.provisionedRCU * DYNAMODB_PRICING.provisioned.rcuPerHour;
var monthlyCost = (writeHourlyCost + readHourlyCost) * DYNAMODB_PRICING.provisioned.hoursPerMonth;
return {
mode: 'provisioned',
capacityCost: monthlyCost,
storageCost: storageCost,
totalCost: monthlyCost + storageCost
};
}
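The 14.4% crossover quoted above falls straight out of the unit prices: one WCU sustains 3,600 writes per hour, so provisioned capacity at full utilization prices a million writes at roughly $0.18 versus $1.25 on-demand.

```javascript
// Deriving the on-demand vs provisioned crossover from the unit prices above
var ON_DEMAND_WRITE_PER_MILLION = 1.25;
var WCU_PER_HOUR = 0.00065;

// One WCU does 3,600 writes/hour, so 1M writes need 1e6/3600 WCU-hours
var provisionedPerMillion = (1000000 / 3600) * WCU_PER_HOUR;

// Utilization below which on-demand becomes the cheaper mode
var crossover = provisionedPerMillion / ON_DEMAND_WRITE_PER_MILLION;
console.log((crossover * 100).toFixed(1) + '%'); // 14.4%
```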
S3 Request Pricing
S3 pricing trips people up because GET and PUT requests have wildly different costs. PUT, COPY, POST, and LIST requests cost $0.005 per 1,000 requests. GET and SELECT requests cost $0.0004 per 1,000 requests. Storage ranges from $0.023 per GB for Standard to $0.004 per GB for Glacier Instant Retrieval.
For serverless applications that serve static assets through S3 + CloudFront, the CloudFront request pricing ($0.0100 per 10,000 HTTPS requests) is often more significant than the S3 origin costs because CloudFront caching dramatically reduces origin fetches.
// s3-cost.js
var S3_PRICING = {
storage: {
standard: 0.023, // per GB/month
intelligentTiering: 0.023,
standardIA: 0.0125,
glacierInstant: 0.004
},
requests: {
putCopyPostList: 0.005, // per 1,000
getSelect: 0.0004 // per 1,000
},
dataTransferOut: {
first10TB: 0.09, // per GB
next40TB: 0.085,
next100TB: 0.07,
over150TB: 0.05
}
};
function calculateS3Cost(config) {
var storageCost = config.storageGB * S3_PRICING.storage[config.storageClass || 'standard'];
var putCost = (config.monthlyPuts / 1000) * S3_PRICING.requests.putCopyPostList;
var getCost = (config.monthlyGets / 1000) * S3_PRICING.requests.getSelect;
var transferCost = calculateDataTransferCost(config.monthlyTransferGB);
return {
storageCost: storageCost,
putCost: putCost,
getCost: getCost,
transferCost: transferCost,
totalCost: storageCost + putCost + getCost + transferCost
};
}
function calculateDataTransferCost(gb) {
if (gb <= 1) return 0; // First 1 GB free
var cost = 0;
var remaining = gb - 1;
if (remaining > 0) {
var tier1 = Math.min(remaining, 10239);
cost += tier1 * S3_PRICING.dataTransferOut.first10TB;
remaining -= tier1;
}
if (remaining > 0) {
var tier2 = Math.min(remaining, 40960);
cost += tier2 * S3_PRICING.dataTransferOut.next40TB;
remaining -= tier2;
}
if (remaining > 0) {
var tier3 = Math.min(remaining, 102400);
cost += tier3 * S3_PRICING.dataTransferOut.next100TB;
remaining -= tier3;
}
if (remaining > 0) {
cost += remaining * S3_PRICING.dataTransferOut.over150TB;
}
return cost;
}
Data Transfer Costs
Data transfer is the stealth tax on every cloud architecture. Inbound data transfer is free. Outbound to the internet follows tiered pricing starting at $0.09/GB. But the killer is cross-region data transfer at $0.02/GB and cross-AZ data transfer at $0.01/GB. If your Lambda in us-east-1 calls a DynamoDB table in us-west-2, every request incurs data transfer charges both ways.
Within the same region, traffic between Lambda and most AWS services (DynamoDB, S3, SQS, SNS) uses VPC endpoints or AWS internal networking with no data transfer charge. But the moment you add a NAT Gateway to your VPC-connected Lambda, you pay $0.045/GB for data processed through it — plus the $0.045/hour for the NAT Gateway itself. A VPC-connected Lambda making external API calls through a NAT Gateway can cost more in NAT Gateway fees than in Lambda execution.
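A rough sketch of that last claim, with assumed (not measured) traffic and payload figures:

```javascript
// Illustrative comparison: NAT Gateway fees vs Lambda compute for a
// VPC-connected function calling an external API (assumed figures)
var monthlyRequests = 10 * 86400 * 30; // 10 req/s
var payloadKB = 50;                    // assumed data moved per call

// NAT Gateway: $0.045/hour fixed plus $0.045 per GB processed
var natFixedCost = 0.045 * 730;
var natDataGB = (monthlyRequests * payloadKB) / 1048576;
var natDataCost = natDataGB * 0.045;

// Lambda compute: 512MB, 300ms average (mostly waiting on the remote API)
var lambdaComputeCost = monthlyRequests * 0.5 * 0.3 * 0.0000166667;

console.log('NAT Gateway: $' + (natFixedCost + natDataCost).toFixed(2));
console.log('Lambda compute: $' + lambdaComputeCost.toFixed(2));
// With these assumptions the NAT Gateway bill exceeds the Lambda compute bill
```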
Building a Cost Calculator in Node.js
Now we combine all these pricing components into a unified cost modeling tool. This is the kind of tool every team running serverless in production should build and maintain.
// cost-calculator.js
var PRICING = {
lambda: {
requestCostPerMillion: 0.20,
gbSecondPrice: 0.0000166667,
provisionedGbSecondPrice: 0.0000041667,
provisionedDurationGbSecondPrice: 0.0000097222
},
apiGateway: {
httpApiPerMillion: 1.00,
restApiPerMillion: 3.50
},
dynamodb: {
onDemandWritePerMillion: 1.25,
onDemandReadPerMillion: 0.25,
storagePerGB: 0.25
},
s3: {
storagePerGB: 0.023,
putPer1000: 0.005,
getPer1000: 0.0004,
transferPerGB: 0.09
},
cloudwatch: {
logsIngestionPerGB: 0.50,
logsStoragePerGB: 0.03,
metricsPerMonth: 0.30,
alarmsPerMonth: 0.10
}
};
function ServerlessCostModel(config) {
this.config = config;
this.breakdown = {};
}
ServerlessCostModel.prototype.calculate = function() {
var c = this.config;
var monthlyRequests = c.requestsPerSecond * 86400 * 30;
// Lambda costs
var memGB = c.lambda.memoryMB / 1024;
var durationSec = c.lambda.avgDurationMs / 1000;
var gbSeconds = monthlyRequests * memGB * durationSec;
var lambdaRequestCost = (monthlyRequests / 1000000) * PRICING.lambda.requestCostPerMillion;
var lambdaComputeCost = gbSeconds * PRICING.lambda.gbSecondPrice;
this.breakdown.lambda = {
requests: lambdaRequestCost,
compute: lambdaComputeCost,
total: lambdaRequestCost + lambdaComputeCost
};
// API Gateway costs
var apiType = c.apiGateway.type || 'httpApi';
var apiRate = PRICING.apiGateway[apiType + 'PerMillion'];
this.breakdown.apiGateway = {
total: (monthlyRequests / 1000000) * apiRate
};
// DynamoDB costs (assuming each request = 1 read + 0.3 writes)
var reads = monthlyRequests * (c.dynamodb.readRatio || 1.0);
var writes = monthlyRequests * (c.dynamodb.writeRatio || 0.3);
var dbReadCost = (reads / 1000000) * PRICING.dynamodb.onDemandReadPerMillion;
var dbWriteCost = (writes / 1000000) * PRICING.dynamodb.onDemandWritePerMillion;
var dbStorageCost = (c.dynamodb.storageGB || 1) * PRICING.dynamodb.storagePerGB;
this.breakdown.dynamodb = {
reads: dbReadCost,
writes: dbWriteCost,
storage: dbStorageCost,
total: dbReadCost + dbWriteCost + dbStorageCost
};
// CloudWatch costs
var avgLogSizeKB = c.cloudwatch.avgLogSizeKB || 1;
var logGB = (monthlyRequests * avgLogSizeKB) / 1048576;
var logIngestionCost = logGB * PRICING.cloudwatch.logsIngestionPerGB;
var logStorageCost = logGB * PRICING.cloudwatch.logsStoragePerGB;
var metricsCost = (c.cloudwatch.customMetrics || 5) * PRICING.cloudwatch.metricsPerMonth;
var alarmsCost = (c.cloudwatch.alarms || 3) * PRICING.cloudwatch.alarmsPerMonth;
this.breakdown.cloudwatch = {
logIngestion: logIngestionCost,
logStorage: logStorageCost,
metrics: metricsCost,
alarms: alarmsCost,
total: logIngestionCost + logStorageCost + metricsCost + alarmsCost
};
// Total
var totalMonthlyCost = 0;
var self = this;
Object.keys(this.breakdown).forEach(function(service) {
totalMonthlyCost += self.breakdown[service].total;
});
this.breakdown.total = {
monthly: totalMonthlyCost,
annual: totalMonthlyCost * 12,
perRequest: totalMonthlyCost / monthlyRequests
};
return this.breakdown;
};
ServerlessCostModel.prototype.printReport = function() {
var b = this.breakdown;
console.log('\n=== Serverless Cost Report ===\n');
console.log('Lambda: $' + b.lambda.total.toFixed(2));
console.log('API Gateway: $' + b.apiGateway.total.toFixed(2));
console.log('DynamoDB: $' + b.dynamodb.total.toFixed(2));
console.log('CloudWatch: $' + b.cloudwatch.total.toFixed(2));
console.log('─────────────────────────────');
console.log('Monthly Total: $' + b.total.monthly.toFixed(2));
console.log('Annual Total: $' + b.total.annual.toFixed(2));
console.log('Cost/Request: $' + b.total.perRequest.toFixed(8));
};
// Usage
var model = new ServerlessCostModel({
requestsPerSecond: 100,
lambda: { memoryMB: 512, avgDurationMs: 150 },
apiGateway: { type: 'httpApi' },
dynamodb: { readRatio: 1.0, writeRatio: 0.3, storageGB: 10 },
cloudwatch: { avgLogSizeKB: 1, customMetrics: 5, alarms: 3 }
});
model.calculate();
model.printReport();
Running this produces output like:
=== Serverless Cost Report ===
Lambda: $375.84
API Gateway: $259.20
DynamoDB: $164.50
CloudWatch: $132.81
─────────────────────────────
Monthly Total: $932.35
Annual Total: $11188.23
Cost/Request: $0.00000360
Memory-to-Cost Optimization
The most impactful optimization for Lambda is right-sizing memory. More memory means more CPU, which often means shorter execution time. The sweet spot minimizes the product of memory and duration. Here is a tool to find it empirically.
// memory-optimizer.js
var AWS = require('aws-sdk');
var lambda = new AWS.Lambda();
function MemoryOptimizer(functionName) {
this.functionName = functionName;
this.results = [];
}
MemoryOptimizer.prototype.runBenchmark = function(memoryValues, payload, iterations, callback) {
var self = this;
var index = 0;
function testNextMemory() {
if (index >= memoryValues.length) {
return callback(null, self.results);
}
var memoryMB = memoryValues[index];
index++;
// Update function memory
lambda.updateFunctionConfiguration({
FunctionName: self.functionName,
MemorySize: memoryMB
}, function(err) {
if (err) return callback(err);
// Wait for update to propagate
setTimeout(function() {
self._runIterations(memoryMB, payload, iterations, function(err, stats) {
if (err) return callback(err);
self.results.push(stats);
testNextMemory();
});
}, 5000);
});
}
testNextMemory();
};
MemoryOptimizer.prototype._runIterations = function(memoryMB, payload, iterations, callback) {
var self = this;
var durations = [];
var completed = 0;
var failed = false;
for (var i = 0; i < iterations; i++) {
lambda.invoke({
FunctionName: self.functionName,
Payload: JSON.stringify(payload),
LogType: 'Tail'
}, function(err, data) {
if (failed) return; // ensure callback fires at most once
if (err) { failed = true; return callback(err); }
// Parse billed duration from the tail of the execution log
var logResult = Buffer.from(data.LogResult, 'base64').toString();
var match = logResult.match(/Billed Duration: (\d+) ms/);
if (match) {
durations.push(parseInt(match[1], 10));
}
completed++;
if (completed === iterations) {
if (durations.length === 0) {
return callback(new Error('No billed durations parsed from logs'));
}
var avgDuration = durations.reduce(function(a, b) { return a + b; }, 0) / durations.length;
var memGB = memoryMB / 1024;
var gbSeconds = memGB * (avgDuration / 1000);
// Per-invocation cost: compute (GB-seconds) plus request charge ($0.20/1M)
var costPerInvocation = gbSeconds * 0.0000166667 + 0.0000002;
callback(null, {
memoryMB: memoryMB,
avgDurationMs: Math.round(avgDuration),
minDurationMs: Math.min.apply(null, durations),
maxDurationMs: Math.max.apply(null, durations),
gbSeconds: gbSeconds,
costPerInvocation: costPerInvocation,
costPer1MInvocations: costPerInvocation * 1000000
});
}
});
}
};
MemoryOptimizer.prototype.findOptimal = function() {
if (this.results.length === 0) return null;
var sorted = this.results.slice().sort(function(a, b) {
return a.costPerInvocation - b.costPerInvocation;
});
return {
cheapest: sorted[0],
fastest: this.results.slice().sort(function(a, b) {
return a.avgDurationMs - b.avgDurationMs;
})[0],
all: sorted
};
};
// Usage
var optimizer = new MemoryOptimizer('my-api-function');
var memoryValues = [128, 256, 512, 768, 1024, 1536, 2048, 3008];
optimizer.runBenchmark(memoryValues, { test: true }, 10, function(err, results) {
if (err) {
console.error('Benchmark failed:', err);
return;
}
var optimal = optimizer.findOptimal();
console.log('Cheapest config:', optimal.cheapest.memoryMB + 'MB at $' +
optimal.cheapest.costPer1MInvocations.toFixed(2) + ' per 1M invocations');
console.log('Fastest config:', optimal.fastest.memoryMB + 'MB at ' +
optimal.fastest.avgDurationMs + 'ms avg');
});
In my experience, most Node.js Lambda functions find their cost sweet spot between 512MB and 1024MB. Below 512MB, the CPU throttling extends duration disproportionately. Above 1024MB, you are paying for memory you rarely use unless the function processes large payloads.
Provisioned Concurrency Cost Tradeoffs
Provisioned concurrency eliminates cold starts by keeping function instances warm. You pay $0.0000041667 per GB-second for the warm capacity itself, whether or not it serves traffic. In exchange, duration on provisioned instances is billed at a reduced rate of $0.0000097222 per GB-second instead of the standard $0.0000166667; the request charge stays at $0.20 per million. Provisioned concurrency therefore only makes financial sense when two conditions are met: you have consistent baseline traffic to keep the warm pool busy, and cold starts materially impact user experience.
// provisioned-concurrency-analysis.js
function analyzeProvisionedConcurrency(config) {
var hoursPerMonth = 730;
var memGB = config.memoryMB / 1024;
var durationSec = config.avgDurationMs / 1000;
// Cost of keeping N instances warm for the whole month
var warmGbSeconds = config.concurrentInstances * memGB * hoursPerMonth * 3600;
var warmCost = warmGbSeconds * 0.0000041667;
// Requests are billed at the standard $0.20 per million either way
var requestCost = (config.monthlyRequests / 1000000) * 0.20;
// Duration on provisioned instances is billed at the reduced GB-second rate
var gbSeconds = config.monthlyRequests * memGB * durationSec;
var provisionedComputeCost = gbSeconds * 0.0000097222;
var standardComputeCost = gbSeconds * 0.0000166667;
var provisionedTotal = warmCost + provisionedComputeCost + requestCost;
var standardTotal = standardComputeCost + requestCost;
// Break-even: monthly requests where the duration discount covers the warm pool
var perRequestSavings = memGB * durationSec * (0.0000166667 - 0.0000097222);
return {
provisionedCost: provisionedTotal,
standardCost: standardTotal,
difference: provisionedTotal - standardTotal,
provisionedCheaper: provisionedTotal < standardTotal,
breakEvenRequests: Math.ceil(warmCost / perRequestSavings)
};
}
var analysis = analyzeProvisionedConcurrency({
memoryMB: 1024,
concurrentInstances: 10,
monthlyRequests: 50000000,
avgDurationMs: 100
});
console.log('Provisioned cost: $' + analysis.provisionedCost.toFixed(2));
console.log('Standard cost: $' + analysis.standardCost.toFixed(2));
console.log('Provisioned is ' + (analysis.provisionedCheaper ? 'cheaper' : 'more expensive'));
The general rule: provisioned concurrency is worth it when you need sub-100ms P99 latency and have sustained traffic above 10 requests per second per provisioned instance. Below that utilization, you are paying for idle compute, which defeats the purpose of serverless.
Comparing Serverless vs Container vs VM Costs at Scale
This is the question every team eventually faces. Serverless is cheapest at low scale and bursty workloads. Containers (ECS/Fargate) win at steady medium scale. EC2 reserved instances win at high steady scale. Here is a comparison model.
// infrastructure-comparison.js
function compareInfrastructureCosts(config) {
var monthlyRequests = config.requestsPerSecond * 86400 * 30;
var requestsPerInstance = config.requestsPerInstancePerSecond * 86400 * 30;
// Serverless (Lambda + API Gateway)
var memGB = config.lambda.memoryMB / 1024;
var durationSec = config.lambda.avgDurationMs / 1000;
var lambdaCompute = monthlyRequests * memGB * durationSec * 0.0000166667;
var lambdaRequests = (monthlyRequests / 1000000) * 0.20;
var apiGateway = (monthlyRequests / 1000000) * 1.00;
var serverlessCost = lambdaCompute + lambdaRequests + apiGateway;
// Fargate (ECS)
var instancesNeeded = Math.ceil(config.requestsPerSecond / config.requestsPerInstancePerSecond);
var fargateVCPUCost = instancesNeeded * config.fargate.vcpu * 0.04048 * 730;
var fargateMemCost = instancesNeeded * config.fargate.memoryGB * 0.004445 * 730;
var albCost = 16.20 + (monthlyRequests / 1000000) * 0.80; // ALB fixed + LCU
var fargateCost = fargateVCPUCost + fargateMemCost + albCost;
// EC2 Reserved (1-year, no upfront)
var ec2InstancesNeeded = Math.ceil(config.requestsPerSecond / config.requestsPerInstancePerSecond);
var ec2MonthlyCost = ec2InstancesNeeded * config.ec2.reservedMonthlyPrice;
var ec2AlbCost = 16.20 + (monthlyRequests / 1000000) * 0.80;
var ec2Cost = ec2MonthlyCost + ec2AlbCost;
return {
serverless: { monthly: serverlessCost, annual: serverlessCost * 12 },
fargate: { monthly: fargateCost, annual: fargateCost * 12, instances: instancesNeeded },
ec2Reserved: { monthly: ec2Cost, annual: ec2Cost * 12, instances: ec2InstancesNeeded },
cheapest: serverlessCost <= fargateCost && serverlessCost <= ec2Cost ? 'serverless' :
fargateCost <= ec2Cost ? 'fargate' : 'ec2Reserved'
};
}
// Low traffic: 10 req/s
console.log('--- 10 req/s ---');
var low = compareInfrastructureCosts({
requestsPerSecond: 10,
requestsPerInstancePerSecond: 500,
lambda: { memoryMB: 512, avgDurationMs: 100 },
fargate: { vcpu: 0.25, memoryGB: 0.5 },
ec2: { reservedMonthlyPrice: 25.55 } // t3.small 1yr reserved
});
console.log('Serverless: $' + low.serverless.monthly.toFixed(2) + '/mo');
console.log('Fargate: $' + low.fargate.monthly.toFixed(2) + '/mo');
console.log('EC2: $' + low.ec2Reserved.monthly.toFixed(2) + '/mo');
console.log('Winner: ' + low.cheapest);
// High traffic: 1000 req/s
console.log('\n--- 1000 req/s ---');
var high = compareInfrastructureCosts({
requestsPerSecond: 1000,
requestsPerInstancePerSecond: 500,
lambda: { memoryMB: 512, avgDurationMs: 100 },
fargate: { vcpu: 1, memoryGB: 2 },
ec2: { reservedMonthlyPrice: 25.55 }
});
console.log('Serverless: $' + high.serverless.monthly.toFixed(2) + '/mo');
console.log('Fargate: $' + high.fargate.monthly.toFixed(2) + '/mo');
console.log('EC2: $' + high.ec2Reserved.monthly.toFixed(2) + '/mo');
console.log('Winner: ' + high.cheapest);
The crossover point varies by workload, but in my experience, serverless stops being cost-competitive around 200-300 sustained requests per second for typical API workloads. Below that, serverless wins on both cost and operational overhead. Above that, you should model the costs carefully — the operational simplicity of serverless may still justify the premium.
Cost Monitoring with CloudWatch and Budgets
Setting up cost monitoring is non-negotiable. Without alerts, a misconfigured Lambda retry loop can burn through hundreds of dollars in hours. Note that the AWS/Billing EstimatedCharges metric is published only in us-east-1 and requires billing alerts to be enabled in the account's billing preferences.
// cost-monitoring-setup.js
var AWS = require('aws-sdk');
var cloudwatch = new AWS.CloudWatch();
var budgets = new AWS.Budgets();
function setupCostAlarms(config) {
// Create billing alarm
var alarmParams = {
AlarmName: config.appName + '-monthly-cost-alarm',
AlarmDescription: 'Alert when estimated charges exceed $' + config.monthlyThreshold,
ActionsEnabled: true,
AlarmActions: [config.snsTopicArn],
MetricName: 'EstimatedCharges',
Namespace: 'AWS/Billing',
Statistic: 'Maximum',
Dimensions: [{ Name: 'Currency', Value: 'USD' }],
Period: 21600, // 6 hours
EvaluationPeriods: 1,
Threshold: config.monthlyThreshold,
ComparisonOperator: 'GreaterThanThreshold',
TreatMissingData: 'missing'
};
cloudwatch.putMetricAlarm(alarmParams, function(err) {
if (err) console.error('Failed to create billing alarm:', err.message);
else console.log('Billing alarm created: $' + config.monthlyThreshold + ' threshold');
});
// Create Lambda-specific invocation alarm (spike detection)
var invocationAlarmParams = {
AlarmName: config.appName + '-lambda-spike-alarm',
AlarmDescription: 'Alert on Lambda invocation spike',
ActionsEnabled: true,
AlarmActions: [config.snsTopicArn],
MetricName: 'Invocations',
Namespace: 'AWS/Lambda',
Statistic: 'Sum',
Dimensions: [{ Name: 'FunctionName', Value: config.functionName }],
Period: 300, // 5 minutes
EvaluationPeriods: 2,
Threshold: config.maxInvocationsPer5Min,
ComparisonOperator: 'GreaterThanThreshold'
};
cloudwatch.putMetricAlarm(invocationAlarmParams, function(err) {
if (err) console.error('Failed to create invocation alarm:', err.message);
else console.log('Invocation spike alarm created');
});
}
function setupBudget(config) {
var params = {
AccountId: config.accountId,
Budget: {
BudgetName: config.appName + '-monthly-budget',
BudgetLimit: { Amount: String(config.monthlyBudget), Unit: 'USD' },
BudgetType: 'COST',
TimeUnit: 'MONTHLY',
CostFilters: {
'TagKeyValue': ['user:Application$' + config.appName]
}
},
NotificationsWithSubscribers: [
{
Notification: {
NotificationType: 'ACTUAL',
ComparisonOperator: 'GREATER_THAN',
Threshold: 80,
ThresholdType: 'PERCENTAGE'
},
Subscribers: [{
SubscriptionType: 'EMAIL',
Address: config.alertEmail
}]
},
{
Notification: {
NotificationType: 'FORECASTED',
ComparisonOperator: 'GREATER_THAN',
Threshold: 100,
ThresholdType: 'PERCENTAGE'
},
Subscribers: [{
SubscriptionType: 'EMAIL',
Address: config.alertEmail
}]
}
]
};
budgets.createBudget(params, function(err) {
if (err) console.error('Failed to create budget:', err.message);
else console.log('Budget created: $' + config.monthlyBudget + '/month');
});
}
// Setup
setupCostAlarms({
appName: 'my-api',
monthlyThreshold: 500,
snsTopicArn: 'arn:aws:sns:us-east-1:123456789:billing-alerts',
functionName: 'my-api-handler',
maxInvocationsPer5Min: 50000
});
setupBudget({
appName: 'my-api',
accountId: '123456789012',
monthlyBudget: 600,
alertEmail: '[email protected]'
});
Right-Sizing Lambda Memory
AWS Lambda Power Tuning is an open-source tool, but you can build a simpler version specific to your needs. The key insight is to test with production-like payloads, not synthetic benchmarks.
// right-size.js — Analyze CloudWatch logs to recommend memory
var AWS = require('aws-sdk');
var cloudwatchlogs = new AWS.CloudWatchLogs();
function analyzeMemoryUsage(functionName, hoursBack, callback) {
var endTime = Date.now();
var startTime = endTime - (hoursBack * 3600 * 1000);
var logGroupName = '/aws/lambda/' + functionName;
var params = {
logGroupName: logGroupName,
startTime: startTime,
endTime: endTime,
filterPattern: 'REPORT',
limit: 1000
};
cloudwatchlogs.filterLogEvents(params, function(err, data) {
if (err) return callback(err);
var memoryUsed = [];
var durations = [];
var allocatedMemory = 0;
data.events.forEach(function(event) {
var memMatch = event.message.match(/Max Memory Used: (\d+) MB/);
var durMatch = event.message.match(/Billed Duration: (\d+) ms/);
var allocMatch = event.message.match(/Memory Size: (\d+) MB/);
if (memMatch) memoryUsed.push(parseInt(memMatch[1], 10));
if (durMatch) durations.push(parseInt(durMatch[1], 10));
if (allocMatch) allocatedMemory = parseInt(allocMatch[1], 10);
});
if (memoryUsed.length === 0) {
return callback(new Error('No REPORT logs found'));
}
memoryUsed.sort(function(a, b) { return a - b; });
durations.sort(function(a, b) { return a - b; });
var p50Memory = memoryUsed[Math.floor(memoryUsed.length * 0.5)];
var p95Memory = memoryUsed[Math.floor(memoryUsed.length * 0.95)];
var p99Memory = memoryUsed[Math.floor(memoryUsed.length * 0.99)];
var maxMemory = memoryUsed[memoryUsed.length - 1];
// Recommend memory with 20% headroom above P99
var recommendedMemory = Math.ceil(p99Memory * 1.2);
// Round up to a tidy 64MB boundary (Lambda itself accepts 1MB increments)
recommendedMemory = Math.ceil(recommendedMemory / 64) * 64;
var result = {
functionName: functionName,
allocatedMemory: allocatedMemory,
sampleCount: memoryUsed.length,
memoryUsage: {
p50: p50Memory,
p95: p95Memory,
p99: p99Memory,
max: maxMemory
},
durationStats: {
p50: durations[Math.floor(durations.length * 0.5)],
p95: durations[Math.floor(durations.length * 0.95)],
p99: durations[Math.floor(durations.length * 0.99)]
},
recommendedMemory: recommendedMemory,
currentUtilization: ((maxMemory / allocatedMemory) * 100).toFixed(1) + '%',
potentialSavings: null
};
if (recommendedMemory < allocatedMemory) {
// Note: this assumes duration stays constant at lower memory; CPU-bound
// functions may run longer after downsizing, eroding these savings
var currentGbSecond = (allocatedMemory / 1024) * (result.durationStats.p50 / 1000);
var newGbSecond = (recommendedMemory / 1024) * (result.durationStats.p50 / 1000);
result.potentialSavings = (((currentGbSecond - newGbSecond) / currentGbSecond) * 100).toFixed(1) + '%';
}
callback(null, result);
});
}
// Usage
analyzeMemoryUsage('my-api-handler', 24, function(err, result) {
if (err) {
console.error('Analysis failed:', err.message);
return;
}
console.log('\nMemory Analysis for:', result.functionName);
console.log('Allocated:', result.allocatedMemory + 'MB');
console.log('P95 Used:', result.memoryUsage.p95 + 'MB');
console.log('P99 Used:', result.memoryUsage.p99 + 'MB');
console.log('Max Used:', result.memoryUsage.max + 'MB');
console.log('Utilization:', result.currentUtilization);
console.log('Recommended:', result.recommendedMemory + 'MB');
if (result.potentialSavings) {
console.log('Potential Savings:', result.potentialSavings);
}
});
Cost Allocation Tags
Tags are the foundation of cost attribution. Without them, you cannot answer the question "which feature costs us the most?" Tag every resource consistently.
// tagging-strategy.js
var AWS = require('aws-sdk');
var lambda = new AWS.Lambda();
var resourcegroupstagging = new AWS.ResourceGroupsTaggingAPI();
var REQUIRED_TAGS = {
Application: 'my-api',
Environment: 'production',
Team: 'backend',
CostCenter: 'engineering-001'
};
function tagLambdaFunction(functionName, additionalTags, callback) {
lambda.getFunction({ FunctionName: functionName }, function(err, data) {
if (err) return callback(err);
var tags = Object.assign({}, REQUIRED_TAGS, additionalTags || {});
lambda.tagResource({
Resource: data.Configuration.FunctionArn,
Tags: tags
}, function(err) {
if (err) return callback(err);
console.log('Tagged ' + functionName + ' with', Object.keys(tags).length, 'tags');
callback(null);
});
});
}
function auditMissingTags(callback) {
var params = {
// No TagFilters here: fetch all resources of these types so resources
// missing the required tags entirely are not silently excluded
ResourceTypeFilters: [
'lambda:function',
'dynamodb:table',
's3:bucket',
'apigateway:restapi'
]
};
resourcegroupstagging.getResources(params, function(err, data) {
if (err) return callback(err);
var untagged = [];
data.ResourceTagMappingList.forEach(function(resource) {
var tagKeys = resource.Tags.map(function(t) { return t.Key; });
var missingTags = Object.keys(REQUIRED_TAGS).filter(function(key) {
return tagKeys.indexOf(key) === -1;
});
if (missingTags.length > 0) {
untagged.push({
arn: resource.ResourceARN,
missingTags: missingTags
});
}
});
callback(null, untagged);
});
}
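The filtering at the heart of the audit is a plain set difference; extracting it makes it testable without AWS credentials (the resource below is hypothetical):

```javascript
// Pure tag-diff logic, runnable without any AWS calls
var REQUIRED_TAG_KEYS = ['Application', 'Environment', 'Team', 'CostCenter'];

function findMissingTags(resourceTags) {
  var present = resourceTags.map(function(t) { return t.Key; });
  return REQUIRED_TAG_KEYS.filter(function(key) {
    return present.indexOf(key) === -1;
  });
}

// Hypothetical resource that carries only two of the four required tags
var missing = findMissingTags([
  { Key: 'Application', Value: 'my-api' },
  { Key: 'Environment', Value: 'production' }
]);
console.log(missing); // [ 'Team', 'CostCenter' ]
```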
Reserved Capacity Planning
For DynamoDB and other services with reserved pricing, planning requires analyzing historical usage patterns. Reserve capacity only for your baseline — let on-demand or auto-scaling handle peaks.
// reserved-capacity-planner.js
function planReservedCapacity(usageHistory) {
// usageHistory: array of { timestamp, readUnits, writeUnits }
// Sort by usage to find percentiles
var readValues = usageHistory.map(function(h) { return h.readUnits; }).sort(function(a, b) { return a - b; });
var writeValues = usageHistory.map(function(h) { return h.writeUnits; }).sort(function(a, b) { return a - b; });
var p10Read = readValues[Math.floor(readValues.length * 0.10)];
var p50Read = readValues[Math.floor(readValues.length * 0.50)];
var p10Write = writeValues[Math.floor(writeValues.length * 0.10)];
var p50Write = writeValues[Math.floor(writeValues.length * 0.50)];
// Reserve at P10 (baseline) — covers minimum sustained usage
// Auto-scale from P10 to P95 — handles normal variation
// On-demand burst above P95 — handles spikes
var reservedReadRCU = Math.floor(p10Read / 100) * 100; // round to 100 RCU blocks
var reservedWriteWCU = Math.floor(p10Write / 100) * 100;
// Cost comparison (read side only; illustrative us-east-1 provisioned-capacity rates).
// "On-demand" here means provisioned capacity purchased without a reservation.
var hoursPerMonth = 730;
var onDemandReadCost = p50Read * hoursPerMonth * 0.00013;
var reservedReadCost = (reservedReadRCU / 100) * (30 / 12) + // upfront fee amortized over 12 months
reservedReadRCU * hoursPerMonth * 0.0000256 + // discounted reserved hourly rate
Math.max(0, p50Read - reservedReadRCU) * hoursPerMonth * 0.00013; // overflow billed at the standard rate
return {
recommendation: {
reserveReadRCU: reservedReadRCU,
reserveWriteWCU: reservedWriteWCU,
autoScaleReadMax: Math.ceil(readValues[Math.floor(readValues.length * 0.95)]),
autoScaleWriteMax: Math.ceil(writeValues[Math.floor(writeValues.length * 0.95)])
},
savings: {
onDemandMonthlyCost: onDemandReadCost,
reservedMonthlyCost: reservedReadCost,
monthlySavings: onDemandReadCost - reservedReadCost,
savingsPercent: (((onDemandReadCost - reservedReadCost) / onDemandReadCost) * 100).toFixed(1)
}
};
}
Complete Working Example
Here is the full cost modeling tool that ties everything together. It accepts a configuration file, runs the cost model, and produces optimization recommendations.
// serverless-cost-tool.js
var fs = require('fs');
var PRICING = {
lambda: {
requestCostPerMillion: 0.20,
gbSecondPrice: 0.0000166667,
freeRequests: 1000000,
freeGbSeconds: 400000
},
apiGateway: { httpApi: 1.00, restApi: 3.50 },
dynamodb: {
onDemandRead: 0.25,
onDemandWrite: 1.25,
storage: 0.25
},
s3: { storage: 0.023, getPer1k: 0.0004, putPer1k: 0.005, transferPerGB: 0.09 },
cloudwatch: { logIngestion: 0.50, logStorage: 0.03 },
natGateway: { perHour: 0.045, perGB: 0.045 }
};
function CostModelingTool(configPath) {
var raw = fs.readFileSync(configPath, 'utf8');
this.config = JSON.parse(raw);
this.results = {};
this.recommendations = [];
}
CostModelingTool.prototype.run = function() {
this._calculateLambda();
this._calculateApiGateway();
this._calculateDynamoDB();
this._calculateS3();
this._calculateCloudWatch();
this._calculateNetworking();
this._generateRecommendations();
return this;
};
CostModelingTool.prototype._calculateLambda = function() {
var c = this.config.lambda;
var monthly = c.requestsPerSecond * 86400 * 30;
var memGB = c.memoryMB / 1024;
var durSec = c.avgDurationMs / 1000;
var billableReqs = Math.max(0, monthly - PRICING.lambda.freeRequests);
var reqCost = (billableReqs / 1000000) * PRICING.lambda.requestCostPerMillion;
var gbSec = monthly * memGB * durSec;
var billableGbSec = Math.max(0, gbSec - PRICING.lambda.freeGbSeconds);
var computeCost = billableGbSec * PRICING.lambda.gbSecondPrice;
this.results.lambda = {
monthlyRequests: monthly,
gbSeconds: gbSec,
requestCost: reqCost,
computeCost: computeCost,
total: reqCost + computeCost
};
};
CostModelingTool.prototype._calculateApiGateway = function() {
var type = this.config.apiGateway.type;
var rate = PRICING.apiGateway[type];
var monthly = this.results.lambda.monthlyRequests;
this.results.apiGateway = {
type: type,
total: (monthly / 1000000) * rate
};
};
CostModelingTool.prototype._calculateDynamoDB = function() {
var c = this.config.dynamodb;
var monthly = this.results.lambda.monthlyRequests;
var reads = monthly * c.readRatio;
var writes = monthly * c.writeRatio;
this.results.dynamodb = {
readCost: (reads / 1000000) * PRICING.dynamodb.onDemandRead,
writeCost: (writes / 1000000) * PRICING.dynamodb.onDemandWrite,
storageCost: c.storageGB * PRICING.dynamodb.storage,
total: 0
};
this.results.dynamodb.total = this.results.dynamodb.readCost +
this.results.dynamodb.writeCost + this.results.dynamodb.storageCost;
};
CostModelingTool.prototype._calculateS3 = function() {
var c = this.config.s3;
if (!c) { this.results.s3 = { total: 0 }; return; }
var storageCost = c.storageGB * PRICING.s3.storage;
var getCost = (c.monthlyGets / 1000) * PRICING.s3.getPer1k;
var putCost = (c.monthlyPuts / 1000) * PRICING.s3.putPer1k;
var transferCost = c.transferGB * PRICING.s3.transferPerGB;
this.results.s3 = {
storageCost: storageCost,
requestCost: getCost + putCost,
transferCost: transferCost,
total: storageCost + getCost + putCost + transferCost
};
};
CostModelingTool.prototype._calculateCloudWatch = function() {
var monthly = this.results.lambda.monthlyRequests;
var logKB = (this.config.cloudwatch || {}).avgLogSizeKB || 1;
var logGB = (monthly * logKB) / 1048576;
this.results.cloudwatch = {
ingestionCost: logGB * PRICING.cloudwatch.logIngestion,
storageCost: logGB * PRICING.cloudwatch.logStorage,
total: logGB * (PRICING.cloudwatch.logIngestion + PRICING.cloudwatch.logStorage)
};
};
CostModelingTool.prototype._calculateNetworking = function() {
var c = this.config.networking;
if (!c || !c.natGateway) { this.results.networking = { total: 0 }; return; }
var natHourlyCost = PRICING.natGateway.perHour * 730;
var natDataCost = c.monthlyDataGB * PRICING.natGateway.perGB;
this.results.networking = {
natGatewayCost: natHourlyCost,
dataProcessingCost: natDataCost,
total: natHourlyCost + natDataCost
};
};
CostModelingTool.prototype._generateRecommendations = function() {
var r = this.results;
// Check API Gateway type
if (this.config.apiGateway.type === 'restApi') {
var savings = r.apiGateway.total * (1 - PRICING.apiGateway.httpApi / PRICING.apiGateway.restApi);
this.recommendations.push({
service: 'API Gateway',
action: 'Switch from REST API to HTTP API',
monthlySavings: savings,
effort: 'Medium'
});
}
// Check Lambda memory utilization
if (this.config.lambda.memoryMB > 512 && this.config.lambda.avgDurationMs < 100) {
this.recommendations.push({
service: 'Lambda',
action: 'Reduce memory allocation — low duration suggests over-provisioning',
monthlySavings: r.lambda.computeCost * 0.3,
effort: 'Low'
});
}
// Check CloudWatch log costs
if (r.cloudwatch.total > 50) {
this.recommendations.push({
service: 'CloudWatch',
action: 'Reduce log verbosity or set log retention policy',
monthlySavings: r.cloudwatch.total * 0.5,
effort: 'Low'
});
}
// Check NAT Gateway costs
if (r.networking.total > 100) {
this.recommendations.push({
service: 'Networking',
action: 'Use VPC endpoints instead of NAT Gateway for AWS service access',
monthlySavings: r.networking.total * 0.8,
effort: 'Medium'
});
}
// Check DynamoDB mode
if (r.dynamodb.total > 200) {
this.recommendations.push({
service: 'DynamoDB',
action: 'Evaluate provisioned capacity with auto-scaling for steady workloads',
monthlySavings: r.dynamodb.total * 0.4,
effort: 'Medium'
});
}
};
CostModelingTool.prototype.printReport = function() {
var r = this.results;
var total = 0;
console.log('\n╔══════════════════════════════════════════╗');
console.log('║ Serverless Cost Model Report ║');
console.log('╠══════════════════════════════════════════╣');
var services = ['lambda', 'apiGateway', 'dynamodb', 's3', 'cloudwatch', 'networking'];
var labels = {
lambda: 'Lambda',
apiGateway: 'API Gateway',
dynamodb: 'DynamoDB',
s3: 'S3',
cloudwatch: 'CloudWatch',
networking: 'Networking'
};
var self = this;
services.forEach(function(svc) {
if (self.results[svc]) {
var cost = self.results[svc].total;
total += cost;
var label = (labels[svc] + ':').padEnd(16);
console.log('║ ' + label + '$' + cost.toFixed(2).padStart(10) + ' ║');
}
});
console.log('╠══════════════════════════════════════════╣');
console.log('║ MONTHLY TOTAL: $' + total.toFixed(2).padStart(10) + ' ║');
console.log('║ ANNUAL TOTAL: $' + (total * 12).toFixed(2).padStart(10) + ' ║');
console.log('╚══════════════════════════════════════════╝');
if (this.recommendations.length > 0) {
console.log('\n=== Optimization Recommendations ===\n');
var totalSavings = 0;
this.recommendations.forEach(function(rec, i) {
console.log((i + 1) + '. [' + rec.service + '] ' + rec.action);
console.log(' Estimated savings: $' + rec.monthlySavings.toFixed(2) + '/month');
console.log(' Effort: ' + rec.effort);
totalSavings += rec.monthlySavings;
});
console.log('\nTotal potential savings: $' + totalSavings.toFixed(2) + '/month');
}
};
// Create a sample config and run
var sampleConfig = {
lambda: {
requestsPerSecond: 100,
memoryMB: 1024,
avgDurationMs: 200
},
apiGateway: { type: 'restApi' },
dynamodb: {
readRatio: 1.0,
writeRatio: 0.3,
storageGB: 50
},
s3: {
storageGB: 100,
monthlyGets: 10000000,
monthlyPuts: 500000,
transferGB: 200
},
cloudwatch: { avgLogSizeKB: 2 },
networking: {
natGateway: true,
monthlyDataGB: 100
}
};
// Save sample config
fs.writeFileSync('cost-config.json', JSON.stringify(sampleConfig, null, 2));
// Run the tool
var tool = new CostModelingTool('cost-config.json');
tool.run();
tool.printReport();
This tool reads a JSON config, calculates costs across all serverless components, and produces actionable recommendations. In production, you would extend it to pull real usage data from CloudWatch and Cost Explorer APIs rather than using static estimates.
Common Issues and Troubleshooting
1. Unexpected Lambda Timeout Costs
REPORT RequestId: abc-123 Duration: 30000.00 ms Billed Duration: 30000 ms
Memory Size: 1024 MB Max Memory Used: 45 MB
Task timed out after 30.00 seconds
Lambda charges for the full duration even when a function times out. A 1024MB function timing out at 30 seconds costs $0.0005 per timeout. If a downstream service is down and causing cascading timeouts at 100 requests per second, that is $0.05 per second or $180 per hour in wasted compute. Set aggressive timeouts (3-5 seconds for API functions) and implement circuit breakers.
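That arithmetic generalizes into a quick waste estimator, using the GB-second rate from the pricing table earlier in the article:

```javascript
// Estimate compute cost wasted by timeouts: GB-seconds burned per failed invocation.
var GB_SECOND_PRICE = 0.0000166667; // Lambda x86 rate used throughout this article

function timeoutWasteCost(memoryMB, timeoutSec, failuresPerSecond) {
  var gbSecondsPerTimeout = (memoryMB / 1024) * timeoutSec;
  var costPerTimeout = gbSecondsPerTimeout * GB_SECOND_PRICE;
  return {
    costPerTimeout: costPerTimeout,
    costPerHour: costPerTimeout * failuresPerSecond * 3600
  };
}

var waste = timeoutWasteCost(1024, 30, 100);
console.log(waste.costPerTimeout.toFixed(4)); // 0.0005
console.log(waste.costPerHour.toFixed(0));    // 180
```

Run this against your own timeout setting before shipping: if the hourly figure is alarming, so is your timeout.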
2. DynamoDB On-Demand Throttling During Spikes
ProvisionedThroughputExceededException: The level of configured provisioned
throughput for the table was exceeded. Consider increasing your provisioning
level with the UpdateTable API.
Despite the name, on-demand mode can throttle when traffic more than doubles its previous peak within about 30 minutes. On-demand tables keep capacity sized to the previous peak and can absorb up to twice that instantly; beyond 2x, scaling takes time. If you know about upcoming traffic spikes (marketing campaigns, launches), pre-warm by gradually increasing traffic over 30 minutes.
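The 2x-of-previous-peak rule lends itself to a pre-launch check. A minimal sketch under that assumption (the helper name and the halving heuristic for the pre-warm target are mine, not AWS guidance):

```javascript
// On-demand tables instantly absorb up to ~2x the previous peak; beyond that,
// throttling is likely until the table adapts. Flag spikes that need pre-warming.
function spikeRisk(previousPeakRps, expectedSpikeRps) {
  var instantHeadroom = previousPeakRps * 2;
  return {
    safe: expectedSpikeRps <= instantHeadroom,
    instantHeadroom: instantHeadroom,
    // Traffic level to ramp up to (over ~30 min) before the event, so the
    // new peak becomes the baseline and the expected spike fits within 2x of it.
    preWarmTarget: Math.ceil(expectedSpikeRps / 2)
  };
}

console.log(spikeRisk(1000, 1800)); // safe: true
console.log(spikeRisk(1000, 5000)); // safe: false, preWarmTarget: 2500
```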
3. CloudWatch Logs Ingestion Cost Surprise
AWS Cost Explorer showing $400+ for CloudWatch Logs
This is almost always caused by verbose logging in high-traffic Lambda functions. A console.log(JSON.stringify(event)) in a function handling 100 requests per second with 2KB events generates 518 GB of logs per month at a cost of $259 for ingestion alone. Fix by setting the log level to WARN in production, using structured logging with sampling, and setting log retention to 7 or 14 days instead of the default "Never expire."
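The $259 figure comes straight out of a back-of-envelope calculation worth keeping around as a function (using decimal GB, matching the worked example above):

```javascript
// Monthly CloudWatch Logs ingestion cost for a chatty Lambda.
var INGESTION_PER_GB = 0.50; // us-east-1 log ingestion rate

function logIngestionCost(requestsPerSecond, avgLogKB) {
  var monthlyEvents = requestsPerSecond * 86400 * 30;
  var monthlyGB = (monthlyEvents * avgLogKB) / 1e6; // decimal GB
  return { monthlyGB: monthlyGB, monthlyCost: monthlyGB * INGESTION_PER_GB };
}

var c = logIngestionCost(100, 2);
console.log(c.monthlyGB.toFixed(1));   // 518.4
console.log(c.monthlyCost.toFixed(2)); // 259.20
```

Run it with your real CloudWatch `IncomingBytes` numbers before deciding whether sampling is worth the engineering effort.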
4. NAT Gateway Data Processing Charges
AWS Bill Line Item: $328.50 - NatGateway-Hours and NatGateway-Bytes
VPC-connected Lambda functions routing traffic through a NAT Gateway pay $0.045/GB processed. If your Lambda calls external APIs or S3 (without a VPC endpoint), every byte transits the NAT Gateway. The fix is to add VPC endpoints for S3, DynamoDB, and other AWS services you call (free for gateway endpoints, $0.01/hour for interface endpoints). For external API calls, evaluate whether the Lambda truly needs VPC access — removing it from the VPC eliminates NAT Gateway costs entirely.
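For S3 and DynamoDB traffic specifically, the comparison against a free gateway endpoint is simple enough to sketch (rates from the pricing table above; the helper and its parameters are illustrative):

```javascript
// Monthly savings from moving S3/DynamoDB traffic off a NAT Gateway onto a
// free gateway VPC endpoint.
var NAT = { perHour: 0.045, perGB: 0.045 };
var HOURS_PER_MONTH = 730;

function natVsGatewayEndpoint(monthlyGB, natStillNeededForOtherTraffic) {
  var natDataCost = monthlyGB * NAT.perGB;
  // If nothing else routes through the NAT Gateway, its hourly charge goes away too.
  var natHourlyCost = natStillNeededForOtherTraffic ? 0 : NAT.perHour * HOURS_PER_MONTH;
  return { monthlySavings: natDataCost + natHourlyCost };
}

console.log(natVsGatewayEndpoint(1000, true).monthlySavings.toFixed(2));  // 45.00
console.log(natVsGatewayEndpoint(1000, false).monthlySavings.toFixed(2)); // 77.85
```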
5. API Gateway 429 Rate Limit with Cost Implications
{
"message": "Too Many Requests",
"statusCode": 429
}
API Gateway has a default account-level throttle of 10,000 requests per second. When you hit this, requests fail but you are still charged for the 4XX responses. More importantly, retried requests double your costs. Configure client-side exponential backoff and set realistic throttle limits at the stage or method level to fail fast rather than accumulating charges from retry storms.
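Client-side, the retry schedule matters as much as the retry itself. A minimal full-jitter backoff calculator (the base and cap values are illustrative; the injectable `random` parameter exists only to make the schedule testable):

```javascript
// Full-jitter exponential backoff: delay is random in [0, min(cap, base * 2^attempt)).
// The jitter spreads retries out so a throttled burst does not re-synchronize
// and hammer the API again in lockstep.
function backoffDelayMs(attempt, baseMs, capMs, random) {
  var rand = random || Math.random;
  var ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

// With a fixed "random" of 1, this prints the ceiling per attempt:
for (var attempt = 0; attempt < 5; attempt++) {
  console.log(backoffDelayMs(attempt, 100, 2000, function() { return 1; }));
}
// 100, 200, 400, 800, 1600
```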
Best Practices
Right-size Lambda memory using data, not guesses. Run the power-tuning tool monthly. A function allocated 3008MB that only uses 200MB is burning money on every invocation. The optimal memory configuration minimizes cost per request, not cost per GB-second.
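The cost-per-request framing can be sketched directly. This example assumes CPU-bound work whose duration halves when memory doubles, as described in the Memory section earlier:

```javascript
// Cost per request = compute (GB-seconds) + per-request fee.
var GB_SECOND_PRICE = 0.0000166667;
var REQUEST_PRICE = 0.20 / 1e6;

function costPerRequest(memoryMB, durationMs) {
  var gbSeconds = (memoryMB / 1024) * (durationMs / 1000);
  return gbSeconds * GB_SECOND_PRICE + REQUEST_PRICE;
}

// When doubling memory halves duration, GB-seconds (and cost) stay flat —
// but the function responds twice as fast, so latency improves for free.
console.log(costPerRequest(512, 400).toExponential(3));
console.log(costPerRequest(1024, 200).toExponential(3)); // same value
```

This is why tuning on cost per GB-second misleads: the right target is the memory setting that minimizes this per-request number.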
Use HTTP API instead of REST API unless you specifically need REST API features. The 3.5x price difference adds up fast. HTTP APIs support Lambda proxy integration, JWT authorizers, and CORS — which covers 80% of use cases.
Set CloudWatch log retention policies on every log group. The default is "Never expire," meaning logs accumulate and storage costs grow every month forever. Set production logs to 14-30 days, staging to 7 days, and dev to 3 days. Archive important logs to S3 for long-term storage at a fraction of the cost.
Tag every resource with Application, Environment, Team, and CostCenter tags. Enable cost allocation tags in the Billing console. Without tags, Cost Explorer shows you total spend but cannot tell you which feature or team is responsible. This makes optimization impossible at scale.
Set up billing alarms before you need them. Create alarms at 50%, 80%, and 100% of your expected monthly budget. Add a separate alarm for Lambda invocation spikes. A runaway retry loop or DDoS can burn through your budget in hours if nobody is watching.
Prefer VPC endpoints over NAT Gateways for AWS service access. S3 and DynamoDB gateway endpoints are free. Interface endpoints cost $0.01/hour but eliminate the $0.045/GB NAT Gateway data processing charge. For a function processing 1TB/month through a NAT Gateway, switching to VPC endpoints saves over $40/month.
Batch DynamoDB operations to reduce request costs. A BatchWriteItem of 25 items costs the same as 25 individual PutItem calls in WRU terms, but reduces Lambda execution time by eliminating 24 round trips. Lower duration means lower Lambda compute cost.
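The duration-driven savings from batching can be estimated with another small helper. The 5ms-per-round-trip latency is an illustrative assumption; measure your own:

```javascript
// Lambda compute saved by replacing N sequential PutItem calls with one
// BatchWriteItem. WCU charges are identical either way; the savings come
// entirely from shorter billed duration.
var GB_SECOND_PRICE = 0.0000166667;

function batchingSavings(memoryMB, items, msPerRoundTrip, invocationsPerMonth) {
  var savedMs = (items - 1) * msPerRoundTrip; // one round trip instead of `items`
  var savedGbSeconds = (memoryMB / 1024) * (savedMs / 1000) * invocationsPerMonth;
  return savedGbSeconds * GB_SECOND_PRICE;
}

// 1024MB function writing 25 items per invocation, 10M invocations/month:
console.log(batchingSavings(1024, 25, 5, 10000000).toFixed(2)); // 20.00
```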
Review your architecture at cost breakpoints. Run the serverless vs container comparison quarterly. When your monthly serverless bill consistently exceeds what 2-3 Fargate tasks would cost, it is time to consider migrating high-traffic, steady-state endpoints to containers while keeping bursty and low-traffic functions serverless.
Cache aggressively at every layer. API Gateway response caching ($0.02/hour for 0.5GB) can reduce Lambda invocations by 80% for read-heavy APIs. DynamoDB DAX reduces read costs. CloudFront reduces S3 origin requests. Every cache hit is a Lambda invocation you did not pay for.
References
- AWS Lambda Pricing — Official pricing page with current rates and free tier details
- API Gateway Pricing — REST API, HTTP API, and WebSocket API pricing
- DynamoDB Pricing — On-demand vs provisioned capacity pricing
- AWS Pricing Calculator — Official multi-service cost estimation tool
- Lambda Power Tuning — Open-source memory optimization tool
- AWS Cost Explorer API — Programmatic access to cost data
- Well-Architected Framework: Cost Optimization — AWS best practices for cost management
- AWS Compute Optimizer — ML-based recommendations for right-sizing