API Authentication Patterns for LLM Services
Secure API authentication patterns for LLM services including key management, rotation, vault integration, and multi-tenant strategies in Node.js.
Overview
Every production application that calls an LLM API has the same critical vulnerability: API keys. A leaked OpenAI key can rack up thousands of dollars in charges in minutes. A compromised Anthropic key can expose your entire prompt history. This article covers the authentication patterns I use in production to manage, rotate, audit, and secure API keys for LLM services across single-tenant and multi-tenant Node.js applications.
Prerequisites
- Node.js v18 or later installed
- Basic understanding of Express.js middleware
- Familiarity with environment variables and .env files
- An account with at least one LLM provider (OpenAI, Anthropic, or similar)
- Working knowledge of HTTP headers and API request patterns
API Key Management Fundamentals
The first rule is absolute: never hardcode API keys. I have seen this in production codebases at companies that should know better. It gets committed, pushed to GitHub, scraped by bots within minutes, and the bill starts climbing.
// NEVER do this. Ever.
var openaiKey = "sk-proj-abc123def456...";
// This is the minimum acceptable pattern
var openaiKey = process.env.OPENAI_API_KEY;
But process.env alone is not a strategy. It is a starting point. You need a layered approach:
- Development: .env files loaded via dotenv, excluded from version control
- CI/CD: Pipeline-level secrets injected at build or deploy time
- Production: Cloud-native secret management with encryption at rest
Here is the baseline setup every project should have:
var dotenv = require("dotenv");
var path = require("path");
// Load environment-specific .env file
var envFile = process.env.NODE_ENV === "production"
? ".env.production"
: ".env.development";
dotenv.config({ path: path.join(__dirname, envFile) });
// Validate that required keys exist at startup
var requiredKeys = [
"OPENAI_API_KEY",
"ANTHROPIC_API_KEY"
];
requiredKeys.forEach(function (key) {
if (!process.env[key]) {
console.error("FATAL: Missing required environment variable: " + key);
process.exit(1);
}
});
Fail fast. If a key is missing, do not let the process start. You do not want to discover the problem when a user hits the endpoint at 2 AM.
Environment Variable Patterns for Different Deployment Targets
Every deployment target handles environment variables differently, and you need to understand the mechanics.
Local development uses .env files:
# .env.development
OPENAI_API_KEY=sk-proj-dev-key-here
ANTHROPIC_API_KEY=sk-ant-dev-key-here
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
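Whichever naming scheme you use, add the patterns to .gitignore before creating the files so a key can never be committed by accident:

# .gitignore
.env
.env.*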
DigitalOcean App Platform uses app-level environment variables defined in the app spec:
# .do/app.yaml
envs:
  - key: OPENAI_API_KEY
    scope: RUN_TIME
    type: SECRET
  - key: ANTHROPIC_API_KEY
    scope: RUN_TIME
    type: SECRET
  - key: LLM_PROVIDER
    scope: RUN_AND_BUILD_TIME
    value: openai
The type: SECRET designation is critical. It tells App Platform to encrypt the value and mask it in logs. Without it, your key shows up in plain text in the dashboard and deploy logs.
Docker containers receive variables at runtime:
docker run -d \
-e OPENAI_API_KEY="$(cat /run/secrets/openai_key)" \
-e ANTHROPIC_API_KEY="$(cat /run/secrets/anthropic_key)" \
my-llm-app:latest
Kubernetes uses secrets objects:
apiVersion: v1
kind: Secret
metadata:
  name: llm-api-keys
type: Opaque
data:
  OPENAI_API_KEY: c2stcHJvai1hYmMxMjM= # base64 encoded
  ANTHROPIC_API_KEY: c2stYW50LWFiYzEyMw==
Note that Kubernetes base64 encoding is not encryption. It is encoding. You still need to restrict access to the Secret objects via RBAC.
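Secrets can also be mounted into the pod as files rather than environment variables, which avoids leaking values through accidental environment dumps. A minimal sketch, assuming the Secret above is mounted as a volume at /etc/llm-secrets:

var fs = require("fs");

function readMountedSecret(name) {
  // Each key in the Secret becomes a file named after the key
  return fs.readFileSync("/etc/llm-secrets/" + name, "utf8").trim();
}

var openaiKey = readMountedSecret("OPENAI_API_KEY");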
Secret Management with Cloud Vaults
For production systems, environment variables are the minimum. Cloud vaults add encryption at rest, access logging, automatic rotation, and fine-grained IAM policies.
AWS Secrets Manager
var AWS = require("aws-sdk");
var secretsManager = new AWS.SecretsManager({
region: process.env.AWS_REGION || "us-east-1"
});
function getSecret(secretName, callback) {
secretsManager.getSecretValue({ SecretId: secretName }, function (err, data) {
if (err) {
console.error("Failed to retrieve secret:", secretName, err.code);
return callback(err, null);
}
if (data.SecretString) {
try {
var parsed = JSON.parse(data.SecretString);
return callback(null, parsed);
} catch (e) {
return callback(null, data.SecretString);
}
}
callback(new Error("Secret binary not supported"), null);
});
}
// Usage at startup
getSecret("prod/llm-api-keys", function (err, secrets) {
if (err) {
console.error("FATAL: Cannot load secrets from vault");
process.exit(1);
}
process.env.OPENAI_API_KEY = secrets.OPENAI_API_KEY;
process.env.ANTHROPIC_API_KEY = secrets.ANTHROPIC_API_KEY;
// Now start the server
require("./server");
});
Azure Key Vault
var identity = require("@azure/identity");
var secretClient = require("@azure/keyvault-secrets");
var credential = new identity.DefaultAzureCredential();
var vaultUrl = "https://my-llm-vault.vault.azure.net";
var client = new secretClient.SecretClient(vaultUrl, credential);
function loadAzureSecrets(callback) {
  var keys = ["openai-api-key", "anthropic-api-key"];
  var loaded = 0;
  var finished = false;
  var secrets = {};
  keys.forEach(function (keyName) {
    client.getSecret(keyName).then(function (secret) {
      if (finished) return;
      secrets[keyName] = secret.value;
      loaded++;
      if (loaded === keys.length) {
        finished = true;
        callback(null, secrets);
      }
    }).catch(function (err) {
      // Guard so the callback fires at most once even if several lookups fail
      if (finished) return;
      finished = true;
      callback(err, null);
    });
  });
}
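Startup wiring mirrors the AWS example: fail hard if the vault is unreachable, copy the values into process.env, then start the server.

loadAzureSecrets(function (err, secrets) {
  if (err) {
    console.error("FATAL: Cannot load secrets from Key Vault:", err.message);
    process.exit(1);
  }
  process.env.OPENAI_API_KEY = secrets["openai-api-key"];
  process.env.ANTHROPIC_API_KEY = secrets["anthropic-api-key"];
  require("./server");
});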
The point of vault integration is not just security theater. It gives you an audit trail. Every time a secret is read, rotated, or accessed, you get a log entry. When something goes wrong, you can trace exactly when and where a key was used.
Rotating API Keys Without Downtime
Key rotation is where most teams fail. They know they should rotate keys but they do not have a mechanism that avoids downtime. The pattern I use is a dual-key window:
var LLMKeyRotator = function (options) {
this.provider = options.provider;
this.primaryKey = options.primaryKey;
this.secondaryKey = options.secondaryKey || null;
this.activeKey = "primary";
this.failureCount = 0;
this.failureThreshold = options.failureThreshold || 3;
};
LLMKeyRotator.prototype.getKey = function () {
if (this.activeKey === "primary") {
return this.primaryKey;
}
return this.secondaryKey;
};
LLMKeyRotator.prototype.reportFailure = function (statusCode) {
// Only count auth failures, not rate limits or server errors
if (statusCode === 401 || statusCode === 403) {
this.failureCount++;
console.warn(
"[KeyRotator] Auth failure #" + this.failureCount +
" for " + this.provider +
" using " + this.activeKey + " key"
);
if (this.failureCount >= this.failureThreshold && this.secondaryKey) {
this.activeKey = this.activeKey === "primary" ? "secondary" : "primary";
this.failureCount = 0;
console.warn(
"[KeyRotator] Switched " + this.provider +
" to " + this.activeKey + " key"
);
}
}
};
LLMKeyRotator.prototype.reportSuccess = function () {
this.failureCount = 0;
};
LLMKeyRotator.prototype.updateKeys = function (primary, secondary) {
this.primaryKey = primary;
this.secondaryKey = secondary || this.secondaryKey;
this.activeKey = "primary";
this.failureCount = 0;
};
The rotation workflow:
- Generate a new key in the provider dashboard
- Set it as the secondary key via vault or environment update
- Revoke the old primary key
- The rotator automatically falls back to the secondary
- Promote the secondary to primary, generate a new secondary
This gives you zero-downtime rotation every time.
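Here is the rotator in use during a rotation; the key values are hypothetical stand-ins for whatever your vault refresh returns:

var rotator = new LLMKeyRotator({
  provider: "openai",
  primaryKey: process.env.OPENAI_API_KEY,
  secondaryKey: process.env.OPENAI_API_KEY_SECONDARY
});

// Feed auth status codes back after each call; three 401s trips the switch
rotator.reportFailure(401);
rotator.reportFailure(401);
rotator.reportFailure(401); // rotator now serves the secondary key

// After promoting the secondary in the provider dashboard and vault,
// install the promoted key as primary and a fresh key as secondary
var freshSecondaryKey = "sk-proj-fresh-key"; // hypothetical
rotator.updateKeys(rotator.secondaryKey, freshSecondaryKey);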
Implementing a Key Manager Class
In production, you are calling multiple LLM providers. You need a centralized manager that handles all of them.
var EventEmitter = require("events");
var util = require("util");
function LLMKeyManager(options) {
  EventEmitter.call(this);
  options = options || {};
  this.providers = {};
  this.usageLog = [];
  this.usageLogMax = options.usageLogMax || 10000;
}
util.inherits(LLMKeyManager, EventEmitter);
LLMKeyManager.prototype.registerProvider = function (name, config) {
this.providers[name] = {
name: name,
rotator: new LLMKeyRotator({
provider: name,
primaryKey: config.primaryKey,
secondaryKey: config.secondaryKey,
failureThreshold: config.failureThreshold || 3
}),
rateLimit: {
requestsPerMinute: config.requestsPerMinute || 60,
tokensPerMinute: config.tokensPerMinute || 100000,
currentRequests: 0,
currentTokens: 0,
windowStart: Date.now()
},
totalRequests: 0,
totalTokens: 0,
totalCost: 0
};
console.log("[KeyManager] Registered provider: " + name);
};
LLMKeyManager.prototype.getKey = function (providerName) {
var provider = this.providers[providerName];
if (!provider) {
throw new Error("Unknown LLM provider: " + providerName);
}
return provider.rotator.getKey();
};
LLMKeyManager.prototype.checkRateLimit = function (providerName) {
var provider = this.providers[providerName];
if (!provider) return false;
var now = Date.now();
var elapsed = now - provider.rateLimit.windowStart;
// Reset window every minute
if (elapsed > 60000) {
provider.rateLimit.currentRequests = 0;
provider.rateLimit.currentTokens = 0;
provider.rateLimit.windowStart = now;
}
return provider.rateLimit.currentRequests < provider.rateLimit.requestsPerMinute;
};
LLMKeyManager.prototype.recordUsage = function (providerName, details) {
var provider = this.providers[providerName];
if (!provider) return;
provider.rateLimit.currentRequests++;
provider.rateLimit.currentTokens += (details.tokens || 0);
provider.totalRequests++;
provider.totalTokens += (details.tokens || 0);
provider.totalCost += (details.cost || 0);
var logEntry = {
provider: providerName,
timestamp: new Date().toISOString(),
tokens: details.tokens || 0,
cost: details.cost || 0,
model: details.model || "unknown",
userId: details.userId || "system",
endpoint: details.endpoint || "unknown",
statusCode: details.statusCode || 200
};
this.usageLog.push(logEntry);
// Trim log if it gets too large
if (this.usageLog.length > this.usageLogMax) {
this.usageLog = this.usageLog.slice(-Math.floor(this.usageLogMax / 2));
}
if (details.statusCode === 401 || details.statusCode === 403) {
provider.rotator.reportFailure(details.statusCode);
this.emit("authFailure", logEntry);
} else {
provider.rotator.reportSuccess();
}
this.emit("usage", logEntry);
};
LLMKeyManager.prototype.getStats = function (providerName) {
var provider = this.providers[providerName];
if (!provider) return null;
return {
name: provider.name,
totalRequests: provider.totalRequests,
totalTokens: provider.totalTokens,
totalCost: provider.totalCost.toFixed(4),
currentRateUsage: {
requests: provider.rateLimit.currentRequests,
limit: provider.rateLimit.requestsPerMinute,
windowResetMs: 60000 - (Date.now() - provider.rateLimit.windowStart)
}
};
};
module.exports = LLMKeyManager;
OAuth Flows for LLM APIs
Most LLM providers use simple API key auth, but some enterprise scenarios require OAuth 2.0. Azure OpenAI Service, for instance, supports both API key and Azure Active Directory (Entra ID) token authentication.
var https = require("https");
var querystring = require("querystring");
function AzureOAuthProvider(config) {
this.tenantId = config.tenantId;
this.clientId = config.clientId;
this.clientSecret = config.clientSecret;
this.scope = config.scope || "https://cognitiveservices.azure.com/.default";
this.cachedToken = null;
this.tokenExpiry = 0;
}
AzureOAuthProvider.prototype.getToken = function (callback) {
var self = this;
// Return cached token if still valid (with 5 minute buffer)
if (self.cachedToken && Date.now() < self.tokenExpiry - 300000) {
return callback(null, self.cachedToken);
}
var postData = querystring.stringify({
grant_type: "client_credentials",
client_id: self.clientId,
client_secret: self.clientSecret,
scope: self.scope
});
var options = {
hostname: "login.microsoftonline.com",
path: "/" + self.tenantId + "/oauth2/v2.0/token",
method: "POST",
headers: {
"Content-Type": "application/x-www-form-urlencoded",
"Content-Length": Buffer.byteLength(postData)
}
};
var req = https.request(options, function (res) {
var body = "";
res.on("data", function (chunk) { body += chunk; });
res.on("end", function () {
try {
var data = JSON.parse(body);
if (data.access_token) {
self.cachedToken = data.access_token;
self.tokenExpiry = Date.now() + (data.expires_in * 1000);
callback(null, data.access_token);
} else {
callback(new Error("OAuth token error: " + (data.error_description || "Unknown")), null);
}
} catch (e) {
callback(e, null);
}
});
});
req.on("error", function (err) { callback(err, null); });
req.write(postData);
req.end();
};
The advantage of OAuth over static API keys is that tokens expire automatically. Even if a token leaks, it is useless after the expiry window (typically one hour). The tradeoff is added complexity and a dependency on the identity provider.
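Using the token is ordinary bearer auth. A sketch, assuming the AzureOAuthProvider above and an Azure OpenAI endpoint of your own:

var oauth = new AzureOAuthProvider({
  tenantId: process.env.AZURE_TENANT_ID,
  clientId: process.env.AZURE_CLIENT_ID,
  clientSecret: process.env.AZURE_CLIENT_SECRET
});

oauth.getToken(function (err, token) {
  if (err) return console.error("Token acquisition failed:", err.message);
  // Azure OpenAI accepts a bearer token in place of the api-key header
  var headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + token
  };
  // ... issue the HTTPS request to your Azure OpenAI deployment with these headers ...
});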
Proxy Patterns to Hide API Keys from Client-Side Code
Never send LLM API keys to the browser. Period. The proxy pattern is the standard approach: your server holds the keys and proxies requests to the LLM provider.
var express = require("express");
var https = require("https");
var router = express.Router();
// Middleware: authenticate the user, not the LLM.
// isValidUserToken and extractUserId stand in for your app's own
// session or JWT validation helpers.
router.use(function (req, res, next) {
var userToken = req.headers["authorization"];
if (!userToken || !isValidUserToken(userToken)) {
return res.status(401).json({ error: "Unauthorized" });
}
req.userId = extractUserId(userToken);
next();
});
// Proxy endpoint
router.post("/api/llm/completions", function (req, res) {
var provider = req.body.provider || "openai";
var apiKey = keyManager.getKey(provider);
if (!keyManager.checkRateLimit(provider)) {
return res.status(429).json({ error: "Rate limit exceeded" });
}
var payload = JSON.stringify({
model: req.body.model || "gpt-4o-mini",
messages: sanitizeMessages(req.body.messages),
max_tokens: Math.min(req.body.max_tokens || 1000, 4000),
temperature: req.body.temperature || 0.7
});
// This example forwards to OpenAI only; branch on `provider` to target
// other hostnames and payload shapes.
var options = {
hostname: "api.openai.com",
path: "/v1/chat/completions",
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": "Bearer " + apiKey,
"Content-Length": Buffer.byteLength(payload)
}
};
var proxyReq = https.request(options, function (proxyRes) {
var body = "";
proxyRes.on("data", function (chunk) { body += chunk; });
proxyRes.on("end", function () {
keyManager.recordUsage(provider, {
tokens: estimateTokens(body),
userId: req.userId,
model: req.body.model,
statusCode: proxyRes.statusCode,
endpoint: "/v1/chat/completions"
});
try {
  res.status(proxyRes.statusCode).json(JSON.parse(body));
} catch (e) {
  res.status(502).json({ error: "Invalid response from LLM provider" });
}
});
});
proxyReq.on("error", function (err) {
console.error("[Proxy] Request failed:", err.message);
res.status(502).json({ error: "LLM service unavailable" });
});
proxyReq.write(payload);
proxyReq.end();
});
function sanitizeMessages(messages) {
if (!Array.isArray(messages)) return [];
return messages.map(function (msg) {
return {
role: msg.role === "system" || msg.role === "assistant" ? msg.role : "user",
content: String(msg.content).substring(0, 32000)
};
});
}
function estimateTokens(responseBody) {
try {
var data = JSON.parse(responseBody);
return (data.usage && data.usage.total_tokens) || 0;
} catch (e) {
return 0;
}
}
The proxy does several important things beyond hiding the key: it sanitizes input, enforces token limits, tracks usage per user, and gives you a single point to swap providers without changing any client code.
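On the client side, the only credential in sight is the user's own session token. A minimal browser sketch (getUserSessionToken is a placeholder for your app's auth helper):

// Browser code: no LLM key anywhere
var sessionToken = getUserSessionToken();
fetch("/api/llm/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": sessionToken
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello" }]
  })
}).then(function (res) {
  return res.json();
}).then(function (data) {
  console.log(data);
});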
Auditing and Logging API Key Usage
Every LLM API call costs money. Without auditing, you are flying blind. I log every request with enough detail to answer three questions: who made the call, how much did it cost, and is anything anomalous.
var fs = require("fs");
var path = require("path");
function UsageAuditor(options) {
this.logDir = options.logDir || path.join(__dirname, "logs");
this.alertThreshold = options.alertThreshold || 100; // dollars
this.alertCallback = options.alertCallback || null;
this.dailyTotals = {};
if (!fs.existsSync(this.logDir)) {
fs.mkdirSync(this.logDir, { recursive: true });
}
}
UsageAuditor.prototype.log = function (entry) {
var date = new Date().toISOString().split("T")[0];
var logFile = path.join(this.logDir, "llm-usage-" + date + ".jsonl");
var line = JSON.stringify(entry) + "\n";
fs.appendFile(logFile, line, function (err) {
if (err) console.error("[Auditor] Write failed:", err.message);
});
// Track daily cost
if (!this.dailyTotals[date]) {
this.dailyTotals[date] = { cost: 0, requests: 0 };
}
this.dailyTotals[date].cost += (entry.cost || 0);
this.dailyTotals[date].requests++;
// Alert on anomalous spending
if (this.dailyTotals[date].cost > this.alertThreshold && this.alertCallback) {
this.alertCallback({
  date: date,
  threshold: this.alertThreshold,
  totalCost: this.dailyTotals[date].cost,
  totalRequests: this.dailyTotals[date].requests,
  latestEntry: entry
});
}
};
UsageAuditor.prototype.getDailySummary = function (date) {
return this.dailyTotals[date] || { cost: 0, requests: 0 };
};
Wire the auditor into the key manager's event system:
var auditor = new UsageAuditor({
logDir: "/var/log/llm-usage",
alertThreshold: 50,
alertCallback: function (alert) {
console.error("[ALERT] Daily LLM spend exceeded $" + alert.alertThreshold +
" - Current: $" + alert.totalCost.toFixed(2));
// Send to Slack, PagerDuty, email, etc.
}
});
keyManager.on("usage", function (entry) {
auditor.log(entry);
});
keyManager.on("authFailure", function (entry) {
console.error("[SECURITY] Auth failure for " + entry.provider +
" from user " + entry.userId);
auditor.log(Object.assign({}, entry, { type: "AUTH_FAILURE" }));
});
Per-User API Key Management in Multi-Tenant Apps
In multi-tenant applications, you might let users bring their own API keys. This keeps your costs at zero but creates new security responsibilities.
var crypto = require("crypto");
var ENCRYPTION_KEY = process.env.USER_KEY_ENCRYPTION_SECRET; // 32-byte key, hex-encoded (64 hex chars)
var IV_LENGTH = 16;
function encryptUserKey(apiKey) {
var iv = crypto.randomBytes(IV_LENGTH);
var cipher = crypto.createCipheriv(
"aes-256-cbc",
Buffer.from(ENCRYPTION_KEY, "hex"),
iv
);
var encrypted = cipher.update(apiKey, "utf8", "hex");
encrypted += cipher.final("hex");
return iv.toString("hex") + ":" + encrypted;
}
function decryptUserKey(encryptedKey) {
var parts = encryptedKey.split(":");
var iv = Buffer.from(parts[0], "hex");
var decipher = crypto.createDecipheriv(
"aes-256-cbc",
Buffer.from(ENCRYPTION_KEY, "hex"),
iv
);
var decrypted = decipher.update(parts[1], "hex", "utf8");
decrypted += decipher.final("utf8");
return decrypted;
}
// Express route to save a user's API key
router.post("/api/settings/llm-key", function (req, res) {
var apiKey = req.body.apiKey;
// Validate key format before storing
if (!apiKey || !apiKey.startsWith("sk-")) {
return res.status(400).json({ error: "Invalid API key format" });
}
var encrypted = encryptUserKey(apiKey);
// Store encrypted key in database
db.collection("user_settings").updateOne(
{ userId: req.userId },
{ $set: { encryptedLlmKey: encrypted, llmKeyUpdatedAt: new Date() } },
{ upsert: true },
function (err) {
if (err) return res.status(500).json({ error: "Failed to save key" });
res.json({ message: "API key saved securely" });
}
);
});
Never store user API keys in plain text. Always encrypt at rest. And never, ever log the decrypted value.
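Decryption should happen just-in-time, inside the request that needs the key. A sketch using the user_settings collection from above:

function withUserKey(userId, callback) {
  db.collection("user_settings").findOne({ userId: userId }, function (err, doc) {
    if (err || !doc || !doc.encryptedLlmKey) {
      return callback(err || new Error("No API key on file"));
    }
    // Decrypt, hand the key to the caller, and let it fall out of scope.
    // Never log it and never attach it to req or res.
    callback(null, decryptUserKey(doc.encryptedLlmKey));
  });
}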
Rate Limiting Tied to Authentication
Rate limiting for LLM endpoints needs to work at two levels: per-user limits to prevent abuse, and per-provider limits to stay within your API quota.
function RateLimiter(options) {
this.windows = {};
this.maxRequests = options.maxRequests || 20;
this.windowMs = options.windowMs || 60000;
this.cleanupInterval = setInterval(this.cleanup.bind(this), 300000);
this.cleanupInterval.unref(); // don't keep the process alive just for cleanup
}
RateLimiter.prototype.check = function (key) {
var now = Date.now();
if (!this.windows[key]) {
this.windows[key] = { count: 1, start: now };
return { allowed: true, remaining: this.maxRequests - 1 };
}
var window = this.windows[key];
if (now - window.start > this.windowMs) {
window.count = 1;
window.start = now;
return { allowed: true, remaining: this.maxRequests - 1 };
}
if (window.count >= this.maxRequests) {
var retryAfter = Math.ceil((window.start + this.windowMs - now) / 1000);
return { allowed: false, remaining: 0, retryAfterSeconds: retryAfter };
}
window.count++;
return { allowed: true, remaining: this.maxRequests - window.count };
};
RateLimiter.prototype.cleanup = function () {
var now = Date.now();
var self = this;
Object.keys(this.windows).forEach(function (key) {
if (now - self.windows[key].start > self.windowMs * 2) {
delete self.windows[key];
}
});
};
// Middleware usage
var userLimiter = new RateLimiter({ maxRequests: 20, windowMs: 60000 });
router.use("/api/llm", function (req, res, next) {
var result = userLimiter.check(req.userId);
res.setHeader("X-RateLimit-Remaining", result.remaining);
if (!result.allowed) {
res.setHeader("Retry-After", result.retryAfterSeconds);
return res.status(429).json({
error: "Rate limit exceeded",
retryAfterSeconds: result.retryAfterSeconds
});
}
next();
});
Securing API Keys in CI/CD Pipelines
CI/CD is a common leak vector. Build logs, test output, and artifact storage can all expose keys.
GitHub Actions:
name: Deploy
on:
  push:
    branches: [master]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: npm test
      - name: Deploy to production
        env:
          DO_ACCESS_TOKEN: ${{ secrets.DIGITALOCEAN_TOKEN }}
        run: doctl apps create-deployment ${{ secrets.APP_ID }}
Critical rules for CI/CD:
- Use the platform's secret storage, never environment variable files checked into the repo
- Mask secrets in logs. GitHub Actions does this automatically for repository secrets
- Use separate keys for CI/CD. If a CI key leaks, revoke it without affecting production
- Scope CI keys to minimal permissions. A test key does not need production-tier rate limits
Add a pre-commit hook that scans for key patterns:
#!/bin/bash
# .git/hooks/pre-commit
if git diff --cached --diff-filter=ACM | grep -qE '(sk-[a-zA-Z0-9]{20,}|sk-proj-[a-zA-Z0-9]{20,}|sk-ant-[a-zA-Z0-9]{20,})'; then
echo "ERROR: Possible API key detected in staged changes."
echo "Remove the key and use environment variables instead."
exit 1
fi
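Hooks only run if the file is executable (chmod +x .git/hooks/pre-commit), and they are not cloned with the repository, so back this up with secret scanning in CI as well.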
Detecting and Responding to Compromised Keys
You need a plan for when (not if) a key gets compromised. Here is the response checklist I follow:
- Revoke immediately. Do not assess impact first. Revoke, then investigate.
- Rotate to secondary key via the key manager (this is why dual-key patterns exist).
- Audit usage logs for the compromised key to find unauthorized requests.
- Check provider dashboards for unexpected charges or usage spikes.
- Generate new keys and deploy via vault or environment update.
- Post-mortem: how did it leak, and what process change prevents recurrence.
Automated detection in your key manager:
LLMKeyManager.prototype.detectAnomaly = function (providerName) {
var provider = this.providers[providerName];
if (!provider) return null;
var recentUsage = this.usageLog.filter(function (entry) {
return entry.provider === providerName &&
(Date.now() - new Date(entry.timestamp).getTime()) < 3600000;
});
var uniqueUsers = {};
var totalCost = 0;
recentUsage.forEach(function (entry) {
uniqueUsers[entry.userId] = true;
totalCost += entry.cost;
});
var anomalies = [];
if (recentUsage.length > 500) {
anomalies.push("High request volume: " + recentUsage.length + " in last hour");
}
if (totalCost > 25) {
anomalies.push("High cost: $" + totalCost.toFixed(2) + " in last hour");
}
if (Object.keys(uniqueUsers).length > 100) {
anomalies.push("Unusual user count: " + Object.keys(uniqueUsers).length);
}
return anomalies.length > 0 ? anomalies : null;
};
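Run the check on a timer so anomalies surface without anyone watching a dashboard:

// Assumes a keyManager instance registered as in the earlier sections
setInterval(function () {
  Object.keys(keyManager.providers).forEach(function (name) {
    var anomalies = keyManager.detectAnomaly(name);
    if (anomalies) {
      console.error("[SECURITY] " + name + ": " + anomalies.join("; "));
      // Escalate: page on-call, disable the key, etc.
    }
  });
}, 300000); // every 5 minutes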
Complete Working Example
Here is a full, production-ready key manager module that ties everything together. Save this as llm-key-manager.js:
var EventEmitter = require("events");
var util = require("util");
var fs = require("fs");
var path = require("path");
// --- Key Rotator ---
function KeyRotator(options) {
this.provider = options.provider;
this.primaryKey = options.primaryKey;
this.secondaryKey = options.secondaryKey || null;
this.activeKey = "primary";
this.failureCount = 0;
this.failureThreshold = options.failureThreshold || 3;
}
KeyRotator.prototype.getKey = function () {
return this.activeKey === "primary" ? this.primaryKey : this.secondaryKey;
};
KeyRotator.prototype.reportFailure = function (statusCode) {
if (statusCode === 401 || statusCode === 403) {
this.failureCount++;
if (this.failureCount >= this.failureThreshold && this.secondaryKey) {
this.activeKey = this.activeKey === "primary" ? "secondary" : "primary";
this.failureCount = 0;
return true; // switched
}
}
return false;
};
KeyRotator.prototype.reportSuccess = function () {
this.failureCount = 0;
};
// --- Usage Auditor ---
function UsageAuditor(options) {
this.logDir = options.logDir || path.join(process.cwd(), "logs");
this.alertThreshold = options.alertThreshold || 100;
this.dailyTotals = {};
try {
if (!fs.existsSync(this.logDir)) {
fs.mkdirSync(this.logDir, { recursive: true });
}
} catch (e) {
console.warn("[Auditor] Could not create log dir:", e.message);
}
}
UsageAuditor.prototype.log = function (entry) {
var date = new Date().toISOString().split("T")[0];
var logFile = path.join(this.logDir, "llm-usage-" + date + ".jsonl");
fs.appendFile(logFile, JSON.stringify(entry) + "\n", function (err) {
if (err) console.error("[Auditor] Write failed:", err.message);
});
if (!this.dailyTotals[date]) {
this.dailyTotals[date] = { cost: 0, requests: 0, tokens: 0 };
}
this.dailyTotals[date].cost += (entry.cost || 0);
this.dailyTotals[date].requests++;
this.dailyTotals[date].tokens += (entry.tokens || 0);
return this.dailyTotals[date];
};
// --- Rate Limiter ---
function RateLimiter(options) {
this.limits = {};
this.defaultMax = options.defaultMax || 20;
this.windowMs = options.windowMs || 60000;
}
RateLimiter.prototype.check = function (key, customMax) {
var max = customMax || this.defaultMax;
var now = Date.now();
if (!this.limits[key] || now - this.limits[key].start > this.windowMs) {
this.limits[key] = { count: 1, start: now };
return { allowed: true, remaining: max - 1 };
}
if (this.limits[key].count >= max) {
var retryAfter = Math.ceil(
(this.limits[key].start + this.windowMs - now) / 1000
);
return { allowed: false, remaining: 0, retryAfterSeconds: retryAfter };
}
this.limits[key].count++;
return { allowed: true, remaining: max - this.limits[key].count };
};
// --- Main Key Manager ---
function LLMKeyManager(options) {
EventEmitter.call(this);
options = options || {};
this.providers = {};
this.auditor = new UsageAuditor({
logDir: options.logDir,
alertThreshold: options.alertThreshold
});
this.rateLimiter = new RateLimiter({
defaultMax: options.defaultRateLimit || 60,
windowMs: options.rateLimitWindowMs || 60000
});
this.usageLog = [];
this.usageLogMax = options.usageLogMax || 5000;
}
util.inherits(LLMKeyManager, EventEmitter);
LLMKeyManager.prototype.registerProvider = function (name, config) {
this.providers[name] = {
name: name,
rotator: new KeyRotator({
provider: name,
primaryKey: config.primaryKey,
secondaryKey: config.secondaryKey,
failureThreshold: config.failureThreshold || 3
}),
rateLimitPerMinute: config.rateLimitPerMinute || 60,
totalRequests: 0,
totalTokens: 0,
totalCost: 0,
errors: 0
};
console.log("[LLMKeyManager] Registered: " + name);
return this;
};
LLMKeyManager.prototype.getKey = function (providerName) {
var provider = this.providers[providerName];
if (!provider) {
throw new Error("[LLMKeyManager] Unknown provider: " + providerName);
}
return provider.rotator.getKey();
};
LLMKeyManager.prototype.checkRateLimit = function (providerName, userId) {
var provider = this.providers[providerName];
if (!provider) return { allowed: false, remaining: 0 };
// Check provider-level limit
var providerCheck = this.rateLimiter.check(
"provider:" + providerName,
provider.rateLimitPerMinute
);
if (!providerCheck.allowed) return providerCheck;
// Check user-level limit if userId provided
if (userId) {
return this.rateLimiter.check("user:" + userId);
}
return providerCheck;
};
LLMKeyManager.prototype.recordUsage = function (providerName, details) {
var provider = this.providers[providerName];
if (!provider) return;
provider.totalRequests++;
provider.totalTokens += (details.tokens || 0);
provider.totalCost += (details.cost || 0);
var entry = {
provider: providerName,
timestamp: new Date().toISOString(),
tokens: details.tokens || 0,
cost: details.cost || 0,
model: details.model || "unknown",
userId: details.userId || "system",
statusCode: details.statusCode || 200,
latencyMs: details.latencyMs || 0
};
// Handle auth failures
if (details.statusCode === 401 || details.statusCode === 403) {
provider.errors++;
var switched = provider.rotator.reportFailure(details.statusCode);
entry.authFailure = true;
entry.keySwitched = switched;
this.emit("authFailure", entry);
} else {
provider.rotator.reportSuccess();
}
// Log to auditor
var daily = this.auditor.log(entry);
// Track in memory
this.usageLog.push(entry);
if (this.usageLog.length > this.usageLogMax) {
this.usageLog = this.usageLog.slice(-Math.floor(this.usageLogMax / 2));
}
// Emit events
this.emit("usage", entry);
if (daily.cost > this.auditor.alertThreshold) {
this.emit("costAlert", {
provider: providerName,
dailyCost: daily.cost,
dailyRequests: daily.requests,
threshold: this.auditor.alertThreshold
});
}
};
LLMKeyManager.prototype.getStats = function (providerName) {
if (providerName) {
var provider = this.providers[providerName];
if (!provider) return null;
return {
name: provider.name,
totalRequests: provider.totalRequests,
totalTokens: provider.totalTokens,
totalCost: "$" + provider.totalCost.toFixed(4),
errors: provider.errors
};
}
var self = this;
var allStats = {};
Object.keys(this.providers).forEach(function (name) {
allStats[name] = self.getStats(name);
});
return allStats;
};
LLMKeyManager.prototype.detectAnomalies = function () {
var self = this;
var oneHourAgo = Date.now() - 3600000;
var anomalies = [];
var recentEntries = this.usageLog.filter(function (entry) {
return new Date(entry.timestamp).getTime() > oneHourAgo;
});
Object.keys(this.providers).forEach(function (name) {
var providerEntries = recentEntries.filter(function (e) {
return e.provider === name;
});
var totalCost = 0;
var authFailures = 0;
providerEntries.forEach(function (e) {
totalCost += e.cost;
if (e.authFailure) authFailures++;
});
if (providerEntries.length > 500) {
anomalies.push({
provider: name,
type: "HIGH_VOLUME",
message: providerEntries.length + " requests in last hour"
});
}
if (totalCost > 25) {
anomalies.push({
provider: name,
type: "HIGH_COST",
message: "$" + totalCost.toFixed(2) + " in last hour"
});
}
if (authFailures > 5) {
anomalies.push({
provider: name,
type: "AUTH_FAILURES",
message: authFailures + " auth failures in last hour"
});
}
});
return anomalies;
};
// --- Initialization helper ---
LLMKeyManager.createFromEnv = function (options) {
var manager = new LLMKeyManager(options);
if (process.env.OPENAI_API_KEY) {
manager.registerProvider("openai", {
primaryKey: process.env.OPENAI_API_KEY,
secondaryKey: process.env.OPENAI_API_KEY_SECONDARY,
rateLimitPerMinute: parseInt(process.env.OPENAI_RPM, 10) || 60
});
}
if (process.env.ANTHROPIC_API_KEY) {
manager.registerProvider("anthropic", {
primaryKey: process.env.ANTHROPIC_API_KEY,
secondaryKey: process.env.ANTHROPIC_API_KEY_SECONDARY,
rateLimitPerMinute: parseInt(process.env.ANTHROPIC_RPM, 10) || 60
});
}
return manager;
};
module.exports = LLMKeyManager;
Usage in an Express app:
var express = require("express");
var LLMKeyManager = require("./llm-key-manager");
var app = express();
app.use(express.json());
var keyManager = LLMKeyManager.createFromEnv({
logDir: "./logs/llm",
alertThreshold: 50,
defaultRateLimit: 20
});
keyManager.on("authFailure", function (entry) {
console.error("[SECURITY] Auth failure:", entry.provider, entry.userId);
});
keyManager.on("costAlert", function (alert) {
console.error("[COST ALERT] " + alert.provider + ": $" +
alert.dailyCost.toFixed(2) + " today (threshold: $" +
alert.threshold + ")");
});
app.post("/api/chat", function (req, res) {
var provider = req.body.provider || "openai";
var userId = req.headers["x-user-id"] || "anonymous";
var rateCheck = keyManager.checkRateLimit(provider, userId);
if (!rateCheck.allowed) {
return res.status(429).json({
error: "Rate limit exceeded",
retryAfterSeconds: rateCheck.retryAfterSeconds
});
}
var apiKey = keyManager.getKey(provider);
var startTime = Date.now();
// ... make the actual LLM API call with apiKey ...
// On response:
keyManager.recordUsage(provider, {
tokens: 1500,
cost: 0.003,
model: "gpt-4o-mini",
userId: userId,
statusCode: 200,
latencyMs: Date.now() - startTime
});
res.json({ message: "Response from LLM" });
});
app.get("/api/llm/stats", function (req, res) {
res.json({
stats: keyManager.getStats(),
anomalies: keyManager.detectAnomalies()
});
});
app.listen(process.env.PORT || 8080, function () {
console.log("Server running with LLM key management enabled");
});
Running it:
OPENAI_API_KEY=sk-proj-abc123 ANTHROPIC_API_KEY=sk-ant-def456 node server.js
Expected output:
[LLMKeyManager] Registered: openai
[LLMKeyManager] Registered: anthropic
Server running with LLM key management enabled
Common Issues and Troubleshooting
1. "Error: Unknown LLM provider: openAI"
Error: [LLMKeyManager] Unknown provider: openAI
at LLMKeyManager.getKey (llm-key-manager.js:142)
Provider names are case-sensitive. If you registered "openai" but your request body sends "openAI", it fails. Normalize provider names to lowercase:
var provider = (req.body.provider || "openai").toLowerCase().trim();
2. "Error: error:06065064:digital envelope routines:EVP_DecryptFinal_ex:bad decrypt"
Error: error:06065064:digital envelope routines:EVP_DecryptFinal_ex:bad decrypt
at Decipheriv.final (node:internal/crypto/cipher:196:29)
This happens when the USER_KEY_ENCRYPTION_SECRET environment variable changed between encryption and decryption. If you rotate your encryption key, you need to re-encrypt all stored user keys with the new key first. Never rotate the encryption key independently.
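A migration sketch for that re-encryption pass, assuming the callback-style MongoDB driver used earlier and both secrets available during the migration window:

var crypto = require("crypto");

function encryptWith(secretHex, plain) {
  var iv = crypto.randomBytes(16);
  var cipher = crypto.createCipheriv("aes-256-cbc", Buffer.from(secretHex, "hex"), iv);
  return iv.toString("hex") + ":" + cipher.update(plain, "utf8", "hex") + cipher.final("hex");
}

function decryptWith(secretHex, stored) {
  var parts = stored.split(":");
  var decipher = crypto.createDecipheriv(
    "aes-256-cbc",
    Buffer.from(secretHex, "hex"),
    Buffer.from(parts[0], "hex")
  );
  return decipher.update(parts[1], "hex", "utf8") + decipher.final("utf8");
}

function reencryptAllUserKeys(oldSecret, newSecret, done) {
  var cursor = db.collection("user_settings").find({ encryptedLlmKey: { $exists: true } });
  cursor.forEach(function (doc) {
    // Decrypt with the old secret, re-encrypt with the new one
    var plain = decryptWith(oldSecret, doc.encryptedLlmKey);
    db.collection("user_settings").updateOne(
      { _id: doc._id },
      { $set: { encryptedLlmKey: encryptWith(newSecret, plain) } }
    );
  }, done);
}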
3. "429 Too Many Requests" from the LLM Provider Despite Low Traffic
{
"error": {
"message": "Rate limit reached for gpt-4o-mini in organization org-abc123 on tokens per min (TPM): Limit 200000, Used 198542, Requested 4500.",
"type": "tokens",
"code": "rate_limit_exceeded"
}
}
Your rate limiter counts requests, but the provider counts tokens. A single request with a large context window can consume most of your token budget. Add token-based rate limiting alongside request-based limits:
if (provider.rateLimit.currentTokens + estimatedTokens > provider.rateLimit.tokensPerMinute) {
return res.status(429).json({ error: "Token rate limit exceeded" });
}
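For the estimate itself, a rough heuristic is about four characters per token for English text; a sketch:

function estimatePromptTokens(messages, maxOutputTokens) {
  var chars = 0;
  (messages || []).forEach(function (msg) {
    chars += String(msg.content || "").length;
  });
  // ~4 characters per token is a rule of thumb, not an exact count
  return Math.ceil(chars / 4) + (maxOutputTokens || 0);
}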
4. "FATAL: Missing required environment variable: OPENAI_API_KEY" in Docker
FATAL: Missing required environment variable: OPENAI_API_KEY
npm ERR! code ELIFECYCLE
npm ERR! errno 1
The .env file exists on your machine but was not copied into the Docker image (and should not be). Pass environment variables at runtime with -e flags or a Docker Compose environment block. If using Docker Compose:
services:
  app:
    build: .
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
Then run with: OPENAI_API_KEY=sk-proj-xxx docker compose up
5. OAuth Token Expired Mid-Request
{
"error": {
"code": "ExpiredAuthenticationToken",
"message": "The access token has expired or is not yet valid."
}
}
This occurs when a long-running request exceeds the token TTL. Cache tokens with a 5-minute buffer before expiry, and implement retry logic that refreshes the token on 401:
// makeRequest is a placeholder for your HTTPS call; it yields (err, response)
function callWithRetry(provider, payload, callback, retried) {
provider.getToken(function (err, token) {
if (err) return callback(err);
makeRequest(token, payload, function (err, response) {
if (response && response.statusCode === 401 && !retried) {
provider.cachedToken = null; // Force refresh
return callWithRetry(provider, payload, callback, true);
}
callback(err, response);
});
});
}
Best Practices
Fail at startup, not at request time. Validate all required API keys exist before the server starts accepting traffic. A missing key at 3 AM is worse than a failed deployment.
Use separate keys per environment. Development, staging, and production should each have their own API keys. This limits blast radius when a dev key leaks and makes it easy to track usage by environment.
Implement dual-key rotation. Always have a secondary key ready. Rotation should be a routine operation, not an emergency procedure. Aim to rotate every 90 days at minimum.
Log everything, expose nothing. Log provider name, token count, cost, user ID, and timestamp for every LLM call. Never log the actual API key, prompt content, or response content in production.
Rate limit at multiple levels. Enforce per-user, per-provider, and global rate limits. The LLM provider's rate limit is your last line of defense, not your first.
Encrypt user-supplied keys at rest. If your application stores API keys for users, use AES-256-CBC or better with a separate encryption key stored in a vault. Never store raw keys in your database.
Scan commits for leaked keys. Use pre-commit hooks, GitHub secret scanning, or tools like truffleHog to catch keys before they hit your repository. Treat any committed key as compromised immediately.
Set spending alerts. Every LLM provider dashboard has billing alerts. Set them aggressively low. Also implement application-level cost tracking that alerts independently. Belt and suspenders.
Scope API keys to minimum permissions. If your provider supports key-level permissions (Azure OpenAI does, OpenAI is adding it), restrict keys to only the models and operations you actually need.
Plan for compromise. Have a runbook for key revocation, rotation, and impact assessment. Practice it before you need it. The first time you respond to a compromised key should not be the first time you think about the process.
References
- OpenAI API Authentication - Official API key documentation
- Anthropic API Authentication - Claude API key setup and usage
- Azure Key Vault Node.js SDK - Secret management for Azure
- AWS Secrets Manager Developer Guide - AWS vault documentation
- OWASP API Security Top 10 - Industry standard API security guidance
- Node.js Crypto Module - Built-in encryption capabilities for key storage
- GitHub Secret Scanning - Automated detection of committed secrets