AI-Powered Customer Support Automation
Build AI-powered customer support with chatbots, RAG knowledge retrieval, intent classification, and human escalation in Node.js.
Overview
Customer support is one of the highest-impact applications of large language models in production today. By combining intent classification, retrieval-augmented generation, sentiment analysis, and intelligent escalation, you can build a support system that handles the majority of inbound requests while routing the difficult cases to human agents with full context. This article walks through the complete architecture and implementation of an AI-powered support platform in Node.js, from message intake to resolution tracking.
Prerequisites
- Node.js v18 or later installed
- Basic understanding of Express.js and WebSocket
- An OpenAI API key (or equivalent LLM provider)
- A PostgreSQL database for ticket and knowledge storage
- Familiarity with vector embeddings and similarity search concepts
- Working knowledge of REST APIs and middleware patterns
The Case for AI in Customer Support
24/7 Availability and Consistency
The most immediate win from AI-powered support is coverage. Human agents work in shifts. They get tired. They have bad days. An AI support layer operates around the clock with consistent quality. When a customer contacts you at 3 AM about a billing issue, the AI can resolve it immediately rather than queuing it for the morning shift.
This is not about replacing your support team. It is about making them dramatically more effective. The AI handles password resets, order status checks, FAQ-style questions, and basic troubleshooting. Your human agents focus on complex escalations, relationship management, and the cases that actually require judgment.
Cost Reduction Without Quality Sacrifice
A well-implemented AI support system typically resolves 40-60% of incoming requests without human intervention. At scale, that translates to significant cost savings. But the cost argument is secondary. The real value is speed. Average first-response time drops from minutes or hours to under two seconds. Resolution time for common issues drops to seconds instead of minutes.
The key word here is "well-implemented." A poorly built chatbot that frustrates customers and loops them through unhelpful responses is worse than no automation at all. Everything in this article is designed to avoid that outcome.
Designing a Support Chatbot Architecture
The architecture has five core components that work together:
- Message Intake Layer - Receives messages from multiple channels (web chat, email, Slack, API)
- Intent Classifier - Determines what the customer is asking for
- Knowledge Retrieval Engine - Finds relevant information using RAG
- Response Generator - Produces contextual, helpful responses
- Escalation Manager - Routes to human agents when confidence is low
Customer Message
       |
       v
[Message Intake] --> [PII Detection] --> [Intent Classifier]
                                                |
                                    +-----------+-----------+
                                    |                       |
                            [High Confidence]       [Low Confidence]
                                    |                       |
                             [RAG Retrieval]        [Human Escalation]
                                    |                       |
                             [Response Gen]         [Ticket Creation]
                                    |                       |
                            [Sentiment Check]       [Agent Assignment]
                                    |
                             [Send Response]
                                    |
                          [Feedback Collection]
This flow ensures that every message is classified, sensitive data is handled appropriately, and the system knows when to defer to a human.
Implementing Intent Classification
Intent classification is the first decision point after a message arrives. You need to know what the customer wants before you can help them. Here is a robust implementation that uses an LLM for flexible intent detection:
var axios = require("axios");
var INTENT_CATEGORIES = [
"billing_inquiry",
"order_status",
"technical_support",
"account_management",
"product_information",
"complaint",
"refund_request",
"feature_request",
"general_question",
"greeting",
"unknown"
];
function classifyIntent(message, conversationHistory) {
var systemPrompt = "You are a customer support intent classifier. " +
"Analyze the customer message and return a JSON object with: " +
"intent (one of: " + INTENT_CATEGORIES.join(", ") + "), " +
"confidence (0.0 to 1.0), " +
"entities (extracted key details like order numbers, product names, account IDs), " +
"urgency (low, medium, high, critical). " +
"Return ONLY valid JSON, no explanation.";
var messages = [{ role: "system", content: systemPrompt }];
if (conversationHistory && conversationHistory.length > 0) {
conversationHistory.forEach(function (msg) {
messages.push({ role: msg.role, content: msg.content });
});
}
messages.push({ role: "user", content: message });
return axios.post("https://api.openai.com/v1/chat/completions", {
model: "gpt-4o-mini",
messages: messages,
temperature: 0.1,
max_tokens: 200,
response_format: { type: "json_object" }
}, {
headers: {
"Authorization": "Bearer " + process.env.OPENAI_API_KEY,
"Content-Type": "application/json"
}
}).then(function (response) {
var result = JSON.parse(response.data.choices[0].message.content);
return {
intent: result.intent || "unknown",
confidence: result.confidence || 0.0,
entities: result.entities || {},
urgency: result.urgency || "medium"
};
}).catch(function (err) {
console.error("Intent classification failed:", err.message);
return {
intent: "unknown",
confidence: 0.0,
entities: {},
urgency: "medium"
};
});
}
module.exports = { classifyIntent: classifyIntent, INTENT_CATEGORIES: INTENT_CATEGORIES };
Using gpt-4o-mini with a low temperature gives you fast, highly consistent classification. The response_format option constrains the model to emit valid JSON, so the parse step rarely fails. Notice the fallback in the catch block: if classification fails for any reason, the system defaults to an "unknown" intent with zero confidence, which triggers human escalation downstream.
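Even with structured output, the model can occasionally invent an intent outside the allowed list or return an out-of-range confidence. A defensive normalizer (a hypothetical helper, not part of the module above; the category list is repeated here to keep the sketch self-contained) degrades anything malformed to the safe "unknown" default:

```javascript
var INTENT_CATEGORIES = [
  "billing_inquiry", "order_status", "technical_support", "account_management",
  "product_information", "complaint", "refund_request", "feature_request",
  "general_question", "greeting", "unknown"
];

// Coerce a raw model response into a safe classification result.
// Unlisted intents become "unknown", confidence is clamped to [0, 1],
// and an unrecognized urgency falls back to "medium".
function normalizeClassification(raw) {
  var intent = raw && INTENT_CATEGORIES.indexOf(raw.intent) !== -1
    ? raw.intent
    : "unknown";
  var confidence = raw && typeof raw.confidence === "number"
    ? Math.min(1, Math.max(0, raw.confidence))
    : 0;
  var urgency = raw && ["low", "medium", "high", "critical"].indexOf(raw.urgency) !== -1
    ? raw.urgency
    : "medium";
  return {
    intent: intent,
    confidence: confidence,
    entities: (raw && raw.entities) || {},
    urgency: urgency
  };
}

module.exports = { normalizeClassification: normalizeClassification };
```

Running the parsed LLM output through this guard before the rest of the pipeline sees it means downstream code never has to re-check the shape of a classification.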
Knowledge Base Integration with RAG
Retrieval-Augmented Generation is the core pattern that makes your AI support system actually useful. Instead of relying solely on the LLM's training data, you feed it your actual documentation, FAQs, and past resolutions. Here is how to implement it with vector embeddings:
var axios = require("axios");
var { Pool } = require("pg");
var pool = new Pool({
connectionString: process.env.POSTGRES_CONNECTION_STRING
});
function generateEmbedding(text) {
return axios.post("https://api.openai.com/v1/embeddings", {
model: "text-embedding-3-small",
input: text
}, {
headers: {
"Authorization": "Bearer " + process.env.OPENAI_API_KEY,
"Content-Type": "application/json"
}
}).then(function (response) {
return response.data.data[0].embedding;
});
}
function indexKnowledgeArticle(article) {
return generateEmbedding(article.title + " " + article.content)
.then(function (embedding) {
var embeddingStr = "[" + embedding.join(",") + "]";
return pool.query(
"INSERT INTO knowledge_base (title, content, category, embedding, metadata) " +
"VALUES ($1, $2, $3, $4::vector, $5) " +
"ON CONFLICT (title) DO UPDATE SET content = $2, embedding = $4::vector",
[article.title, article.content, article.category, embeddingStr, JSON.stringify(article.metadata || {})]
);
});
}
function searchKnowledge(query, limit) {
limit = limit || 5;
return generateEmbedding(query)
.then(function (queryEmbedding) {
var embeddingStr = "[" + queryEmbedding.join(",") + "]";
return pool.query(
"SELECT title, content, category, " +
"1 - (embedding <=> $1::vector) AS similarity " +
"FROM knowledge_base " +
"WHERE 1 - (embedding <=> $1::vector) > 0.3 " +
"ORDER BY embedding <=> $1::vector " +
"LIMIT $2",
[embeddingStr, limit]
);
})
.then(function (result) {
return result.rows;
});
}
module.exports = {
indexKnowledgeArticle: indexKnowledgeArticle,
searchKnowledge: searchKnowledge,
generateEmbedding: generateEmbedding
};
The similarity threshold of 0.3 is important. Set it too high and you miss relevant articles. Set it too low and you retrieve garbage. Start at 0.3 and tune based on your data. The pgvector extension handles the cosine distance calculation (<=> operator) efficiently with proper indexing.
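The queries above assume a `knowledge_base` table with a pgvector column and a unique title (the `ON CONFLICT (title)` upsert requires that constraint). A migration sketch, assuming pgvector is installed and the 1536-dimension `text-embedding-3-small` model; the `lists = 100` setting is an illustrative starting point, not a tuned value:

```javascript
// Schema for the knowledge base used by indexKnowledgeArticle/searchKnowledge.
// vector(1536) matches text-embedding-3-small; the ivfflat index is built
// with vector_cosine_ops, which is what the <=> operator in the queries uses.
var MIGRATION_SQL =
  "CREATE EXTENSION IF NOT EXISTS vector;\n" +
  "CREATE TABLE IF NOT EXISTS knowledge_base (\n" +
  "  id SERIAL PRIMARY KEY,\n" +
  "  title TEXT UNIQUE NOT NULL,\n" +
  "  content TEXT NOT NULL,\n" +
  "  category TEXT,\n" +
  "  embedding vector(1536),\n" +
  "  metadata JSONB DEFAULT '{}'\n" +
  ");\n" +
  "CREATE INDEX IF NOT EXISTS knowledge_base_embedding_idx\n" +
  "  ON knowledge_base USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);";

// Run the migration against an existing pg Pool.
function migrate(pool) {
  return pool.query(MIGRATION_SQL);
}

module.exports = { migrate: migrate, MIGRATION_SQL: MIGRATION_SQL };
```

Note that an ivfflat index is most effective once the table has data in it; for large knowledge bases, build or rebuild the index after the initial import.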
Populating the Knowledge Base
Before the system can retrieve answers, you need to populate it with your existing documentation:
var fs = require("fs");
var path = require("path");
var knowledge = require("./knowledge");
function importFAQs(faqDirectory) {
var files = fs.readdirSync(faqDirectory);
var promises = [];
files.forEach(function (file) {
if (path.extname(file) !== ".md") return;
var content = fs.readFileSync(path.join(faqDirectory, file), "utf8");
var lines = content.split("\n");
var title = lines[0].replace(/^#+\s*/, "");
var body = lines.slice(1).join("\n").trim();
promises.push(knowledge.indexKnowledgeArticle({
title: title,
content: body,
category: "faq",
metadata: { source: file, importedAt: new Date().toISOString() }
}));
});
return Promise.all(promises).then(function (results) {
console.log("Imported " + results.length + " FAQ articles");
return results;
});
}
module.exports = { importFAQs: importFAQs };
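One refinement worth considering: very long articles embed poorly as a single vector, because the embedding averages over many topics. A simple heading-based splitter (a hypothetical helper, sketched here for markdown files with `## ` section headings) lets each section be indexed separately:

```javascript
// Split a markdown document into sections at level-2 headings ("## ").
// Each section becomes its own knowledge article, so retrieval can match
// a specific topic instead of a diluted whole-document embedding.
function splitByHeadings(markdown, fallbackTitle) {
  var lines = markdown.split("\n");
  var sections = [];
  var current = { title: fallbackTitle, body: [] };
  lines.forEach(function (line) {
    if (line.indexOf("## ") === 0) {
      // Close out the previous section if it has any content.
      if (current.body.join("").trim().length > 0) sections.push(current);
      current = { title: line.replace(/^#+\s*/, ""), body: [] };
    } else {
      current.body.push(line);
    }
  });
  if (current.body.join("").trim().length > 0) sections.push(current);
  return sections.map(function (s) {
    return { title: s.title, content: s.body.join("\n").trim() };
  });
}

module.exports = { splitByHeadings: splitByHeadings };
```

Each returned section can then be passed to indexKnowledgeArticle in place of the whole file.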
Generating Contextual Responses
With intent classification and knowledge retrieval in place, the response generator combines them to produce helpful answers:
var axios = require("axios");
var knowledge = require("./knowledge");
function generateResponse(message, intent, conversationHistory, customerContext) {
return knowledge.searchKnowledge(message, 3)
.then(function (articles) {
var knowledgeContext = "";
if (articles.length > 0) {
knowledgeContext = "\n\nRelevant knowledge base articles:\n";
articles.forEach(function (article, index) {
knowledgeContext += "\n--- Article " + (index + 1) +
" (similarity: " + article.similarity.toFixed(2) + ") ---\n" +
"Title: " + article.title + "\n" +
article.content + "\n";
});
}
var systemPrompt = "You are a helpful customer support agent for " +
(process.env.COMPANY_NAME || "our company") + ". " +
"Answer the customer's question using the provided knowledge base articles. " +
"If the knowledge base does not contain the answer, say so honestly and offer to " +
"connect them with a human agent. Be concise, friendly, and professional. " +
"Never make up information about products, pricing, or policies. " +
"Customer intent: " + intent.intent + "\n" +
"Customer urgency: " + intent.urgency;
if (customerContext) {
systemPrompt += "\nCustomer context: " +
"Account type: " + (customerContext.accountType || "unknown") + ", " +
"Customer since: " + (customerContext.customerSince || "unknown") + ", " +
"Open tickets: " + (customerContext.openTickets || 0);
}
systemPrompt += knowledgeContext;
var messages = [{ role: "system", content: systemPrompt }];
if (conversationHistory && conversationHistory.length > 0) {
var recent = conversationHistory.slice(-10);
recent.forEach(function (msg) {
messages.push({ role: msg.role, content: msg.content });
});
}
messages.push({ role: "user", content: message });
return axios.post("https://api.openai.com/v1/chat/completions", {
model: "gpt-4o",
messages: messages,
temperature: 0.3,
max_tokens: 500
}, {
headers: {
"Authorization": "Bearer " + process.env.OPENAI_API_KEY,
"Content-Type": "application/json"
}
});
})
.then(function (response) {
return {
content: response.data.choices[0].message.content,
usage: response.data.usage
};
});
}
module.exports = { generateResponse: generateResponse };
Notice the conversation history is capped at the last 10 messages. This keeps token usage predictable and avoids the context window filling up in long conversations. The temperature of 0.3 gives the model enough creativity to sound natural while keeping it grounded in the retrieved knowledge.
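A fixed cap of 10 messages is a reasonable default, but message lengths vary wildly. If you want a tighter bound, trim by an approximate token budget instead. This sketch uses the rough heuristic of about 4 characters per token for English text; it is a budgeting approximation, not a real tokenizer:

```javascript
// Approximate token count: ~4 characters per token for English text.
// This is a heuristic for budgeting only, not an exact count.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages that fit within maxTokens, preserving order.
// Walks the history backwards so the newest messages are retained first.
function trimHistoryToBudget(history, maxTokens) {
  var kept = [];
  var used = 0;
  for (var i = history.length - 1; i >= 0; i--) {
    var cost = estimateTokens(history[i].content);
    if (used + cost > maxTokens) break;
    used += cost;
    kept.unshift(history[i]);
  }
  return kept;
}

module.exports = { estimateTokens: estimateTokens, trimHistoryToBudget: trimHistoryToBudget };
```

Swapping this in for the `slice(-10)` call makes the cost ceiling explicit regardless of how verbose individual messages are.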
Implementing Escalation to Human Agents
Smart escalation is what separates a good AI support system from a frustrating one. The system needs to know when it cannot help and hand off gracefully:
var ESCALATION_THRESHOLDS = {
confidenceMinimum: 0.6,
maxAITurns: 5,
sentimentEscalation: -0.5,
criticalIntents: ["complaint", "refund_request"],
escalationKeywords: ["speak to a human", "real person", "manager", "supervisor", "escalate"]
};
function shouldEscalate(intent, conversation, sentiment) {
// Low confidence in intent classification
if (intent.confidence < ESCALATION_THRESHOLDS.confidenceMinimum) {
return { escalate: true, reason: "low_confidence", detail: "Intent confidence: " + intent.confidence };
}
// Customer explicitly requests human agent
var lastMessage = conversation[conversation.length - 1];
if (lastMessage && lastMessage.role === "user") {
var lower = lastMessage.content.toLowerCase();
var keywordMatch = ESCALATION_THRESHOLDS.escalationKeywords.some(function (keyword) {
return lower.indexOf(keyword) !== -1;
});
if (keywordMatch) {
return { escalate: true, reason: "customer_request", detail: "Customer requested human agent" };
}
}
// Too many AI turns without resolution
var aiTurns = conversation.filter(function (msg) {
return msg.role === "assistant";
}).length;
if (aiTurns >= ESCALATION_THRESHOLDS.maxAITurns) {
return { escalate: true, reason: "max_turns", detail: "AI turns: " + aiTurns };
}
// Negative sentiment detected
if (sentiment && sentiment.score < ESCALATION_THRESHOLDS.sentimentEscalation) {
return { escalate: true, reason: "negative_sentiment", detail: "Sentiment: " + sentiment.score };
}
// Critical intent types always get human review
if (ESCALATION_THRESHOLDS.criticalIntents.indexOf(intent.intent) !== -1 &&
intent.urgency === "critical") {
return { escalate: true, reason: "critical_intent", detail: intent.intent + " with critical urgency" };
}
return { escalate: false };
}
function escalateToHuman(conversationId, reason, conversation, intent) {
var summary = summarizeConversation(conversation);
return {
conversationId: conversationId,
escalationReason: reason,
intent: intent,
summary: summary,
conversationHistory: conversation,
escalatedAt: new Date().toISOString(),
priority: calculateEscalationPriority(reason, intent)
};
}
function summarizeConversation(conversation) {
var customerMessages = conversation.filter(function (msg) {
return msg.role === "user";
}).map(function (msg) {
return msg.content;
});
return customerMessages.join(" | ").substring(0, 500);
}
function calculateEscalationPriority(reason, intent) {
if (reason.reason === "negative_sentiment" || intent.urgency === "critical") return "high";
if (reason.reason === "customer_request") return "medium";
return "normal";
}
module.exports = {
shouldEscalate: shouldEscalate,
escalateToHuman: escalateToHuman
};
The multi-factor escalation logic is critical; a single threshold is never enough. A conversation is escalated when any one of these conditions triggers: low classification confidence, an explicit request for a human, too many back-and-forth turns, negative sentiment, or a critical issue type. The conversation summary gives the human agent immediate context so the customer does not have to repeat themselves.
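On the agent side, escalations need an ordering so high-priority handoffs are picked up first. A minimal in-memory queue keyed on the priority values returned by calculateEscalationPriority is sketched below; a production system would back this with a database or message broker rather than process memory:

```javascript
// Priority ordering matching calculateEscalationPriority's output values.
var PRIORITY_ORDER = { high: 0, medium: 1, normal: 2 };

function createEscalationQueue() {
  var items = [];
  return {
    // Insert keeping the queue sorted: high before medium before normal.
    // Array.prototype.sort is stable, so escalations within the same
    // priority level stay in FIFO order.
    enqueue: function (escalationData) {
      items.push(escalationData);
      items.sort(function (a, b) {
        return PRIORITY_ORDER[a.priority] - PRIORITY_ORDER[b.priority];
      });
    },
    // Hand the next escalation to an available agent.
    dequeue: function () {
      return items.shift() || null;
    },
    size: function () {
      return items.length;
    }
  };
}

module.exports = { createEscalationQueue: createEscalationQueue };
```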
Ticket Creation and Categorization
Every customer interaction should produce a structured ticket, whether resolved by AI or escalated to a human:
var { Pool } = require("pg");
var axios = require("axios");
var pool = new Pool({ connectionString: process.env.POSTGRES_CONNECTION_STRING });
function createTicket(conversation, intent, resolution) {
var ticket = {
conversationId: conversation.id,
customerId: conversation.customerId,
subject: generateSubject(intent, conversation),
category: intent.intent,
priority: mapUrgencyToPriority(intent.urgency),
status: resolution.resolvedByAI ? "resolved" : "open",
resolvedByAI: resolution.resolvedByAI,
aiConfidence: intent.confidence,
tags: extractTags(intent),
createdAt: new Date().toISOString()
};
return pool.query(
"INSERT INTO support_tickets " +
"(conversation_id, customer_id, subject, category, priority, status, " +
"resolved_by_ai, ai_confidence, tags, created_at) " +
"VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10) RETURNING id",
[ticket.conversationId, ticket.customerId, ticket.subject, ticket.category,
ticket.priority, ticket.status, ticket.resolvedByAI, ticket.aiConfidence,
JSON.stringify(ticket.tags), ticket.createdAt]
).then(function (result) {
ticket.id = result.rows[0].id;
return ticket;
});
}
function generateSubject(intent, conversation) {
var entitySummary = Object.keys(intent.entities).map(function (key) {
return key + ": " + intent.entities[key];
}).join(", ");
var subjects = {
billing_inquiry: "Billing question" + (entitySummary ? " - " + entitySummary : ""),
order_status: "Order status inquiry" + (entitySummary ? " - " + entitySummary : ""),
technical_support: "Technical support request",
refund_request: "Refund request" + (entitySummary ? " - " + entitySummary : ""),
complaint: "Customer complaint",
account_management: "Account management request"
};
return subjects[intent.intent] || "Support request - " + intent.intent;
}
function mapUrgencyToPriority(urgency) {
var map = { critical: 1, high: 2, medium: 3, low: 4 };
return map[urgency] || 3;
}
function extractTags(intent) {
var tags = [intent.intent];
if (intent.urgency === "critical" || intent.urgency === "high") {
tags.push("priority");
}
Object.keys(intent.entities).forEach(function (key) {
tags.push(key);
});
return tags;
}
module.exports = { createTicket: createTicket };
Sentiment Analysis for Prioritization
Sentiment analysis runs on every incoming message to detect frustrated or upset customers before they explicitly complain:
var axios = require("axios");
function analyzeSentiment(message) {
return axios.post("https://api.openai.com/v1/chat/completions", {
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: "Analyze the sentiment of this customer support message. " +
"Return JSON with: score (-1.0 to 1.0 where -1 is very negative, 0 is neutral, " +
"1 is very positive), emotion (frustrated, angry, confused, neutral, satisfied, happy), " +
"escalation_risk (low, medium, high). Return ONLY valid JSON."
},
{ role: "user", content: message }
],
temperature: 0.1,
max_tokens: 100,
response_format: { type: "json_object" }
}, {
headers: {
"Authorization": "Bearer " + process.env.OPENAI_API_KEY,
"Content-Type": "application/json"
}
}).then(function (response) {
return JSON.parse(response.data.choices[0].message.content);
}).catch(function () {
return { score: 0, emotion: "neutral", escalation_risk: "low" };
});
}
module.exports = { analyzeSentiment: analyzeSentiment };
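Per-message sentiment is noisy; one curt message does not mean an angry customer. A rolling average over the last few messages gives a steadier signal to feed into shouldEscalate. This is a hypothetical helper, with the window size of 3 chosen for illustration:

```javascript
// Track a rolling average of sentiment scores over the last `windowSize`
// messages. A single negative message is dampened; a sustained negative
// trend pushes the average below the escalation threshold.
function createSentimentTracker(windowSize) {
  var scores = [];
  return {
    record: function (score) {
      scores.push(score);
      if (scores.length > windowSize) scores.shift();
    },
    average: function () {
      if (scores.length === 0) return 0;
      var sum = scores.reduce(function (acc, s) { return acc + s; }, 0);
      return sum / scores.length;
    }
  };
}

module.exports = { createSentimentTracker: createSentimentTracker };
```

Passing `{ score: tracker.average() }` instead of the raw per-message score into shouldEscalate trades a little responsiveness for far fewer spurious escalations.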
Multi-Channel Support
A production support system needs to handle messages from multiple channels. Here is a channel adapter pattern that normalizes messages into a common format:
function createChannelAdapter(channelType) {
var adapters = {
webchat: {
normalize: function (raw) {
return {
channel: "webchat",
customerId: raw.sessionId,
content: raw.message,
timestamp: raw.timestamp || new Date().toISOString(),
metadata: { userAgent: raw.userAgent, page: raw.currentPage }
};
},
send: function (response, context) {
// WebSocket send handled by the chat server
return { type: "websocket", payload: response };
}
},
email: {
normalize: function (raw) {
return {
channel: "email",
customerId: raw.from,
content: raw.subject + "\n\n" + raw.body,
timestamp: raw.receivedAt || new Date().toISOString(),
metadata: { subject: raw.subject, threadId: raw.threadId }
};
},
send: function (response, context) {
return {
type: "email",
to: context.customerId,
subject: "Re: " + (context.metadata.subject || "Support Request"),
body: response.content,
threadId: context.metadata.threadId
};
}
},
slack: {
normalize: function (raw) {
return {
channel: "slack",
customerId: raw.user,
content: raw.text,
timestamp: raw.ts,
metadata: { channelId: raw.channel, threadTs: raw.thread_ts }
};
},
send: function (response, context) {
return {
type: "slack",
channel: context.metadata.channelId,
text: response.content,
thread_ts: context.metadata.threadTs
};
}
}
};
return adapters[channelType] || adapters.webchat;
}
module.exports = { createChannelAdapter: createChannelAdapter };
This pattern lets you add new channels without touching the core support logic. Each adapter knows how to normalize incoming messages into the standard format and how to format outgoing responses for its specific channel.
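As an illustration of extending the pattern, here is a hypothetical SMS adapter following the same normalize/send contract. The field names (`From`, `Body`, `MessageSid`) mirror Twilio's webhook conventions, but treat the exact payload shape as an assumption to verify against your provider:

```javascript
// A hypothetical SMS adapter with the same normalize/send contract
// as the webchat, email, and slack adapters above.
var smsAdapter = {
  normalize: function (raw) {
    return {
      channel: "sms",
      customerId: raw.From,
      content: raw.Body,
      timestamp: new Date().toISOString(),
      metadata: { messageSid: raw.MessageSid }
    };
  },
  send: function (response, context) {
    // SMS has tight length limits; truncate long responses and rely on
    // other channels for anything that needs more room.
    var text = response.content.length > 300
      ? response.content.substring(0, 297) + "..."
      : response.content;
    return { type: "sms", to: context.customerId, text: text };
  }
};

module.exports = { smsAdapter: smsAdapter };
```

Registering this object under a `sms` key in the `adapters` map is the only change the core pipeline needs.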
Response Template Management
For consistency, you often want AI to customize pre-approved templates rather than generate entirely freeform text:
var templates = {
order_status: {
found: "Hi {customerName}! Your order #{orderId} is currently {status}. " +
"{additionalDetail} Is there anything else I can help with?",
not_found: "I was not able to locate order #{orderId} in our system. " +
"Could you double-check the order number? It should be in your confirmation email."
},
refund_request: {
eligible: "I have initiated a refund of {amount} for order #{orderId}. " +
"You should see it back on your {paymentMethod} within 5-7 business days.",
review_needed: "I understand you would like a refund for order #{orderId}. " +
"Given the circumstances, I am connecting you with a specialist who can review this. " +
"They will have full context of our conversation."
},
greeting: {
default: "Hello! Welcome to {companyName} support. How can I help you today?"
}
};
function renderTemplate(templateKey, variant, variables) {
var template = templates[templateKey];
if (!template || !template[variant]) return null;
var rendered = template[variant];
Object.keys(variables).forEach(function (key) {
rendered = rendered.replace(new RegExp("\\{" + key + "\\}", "g"), variables[key]);
});
return rendered;
}
function customizeTemplate(template, customerContext, tone) {
// Use AI to adjust tone while keeping core message intact
var axios = require("axios");
return axios.post("https://api.openai.com/v1/chat/completions", {
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: "Adjust the following support response to match a " + tone +
" tone. Keep all factual details identical. Only adjust phrasing and warmth. " +
"Return only the adjusted message, no explanation."
},
{ role: "user", content: template }
],
temperature: 0.4,
max_tokens: 300
}, {
headers: {
"Authorization": "Bearer " + process.env.OPENAI_API_KEY,
"Content-Type": "application/json"
}
}).then(function (response) {
return response.data.choices[0].message.content;
});
}
module.exports = { renderTemplate: renderTemplate, customizeTemplate: customizeTemplate };
Handling Sensitive Information
PII detection and secure handling is non-negotiable in customer support. You need to strip sensitive data before it hits your logs or the LLM:
var PII_PATTERNS = [
{ name: "credit_card", pattern: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g, mask: "****-****-****-####" },
{ name: "ssn", pattern: /\b\d{3}[-]?\d{2}[-]?\d{4}\b/g, mask: "***-**-****" },
{ name: "email", pattern: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g, mask: "[EMAIL]" },
{ name: "phone", pattern: /\b(?:\+?1[-.]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g, mask: "[PHONE]" }
];
function detectAndMaskPII(text) {
var detected = [];
var masked = text;
PII_PATTERNS.forEach(function (pii) {
var matches = text.match(pii.pattern);
if (matches) {
matches.forEach(function (match) {
detected.push({ type: pii.name, value: match });
if (pii.name === "credit_card") {
var last4 = match.replace(/[\s-]/g, "").slice(-4);
masked = masked.replace(match, pii.mask.replace("####", last4));
} else {
masked = masked.replace(match, pii.mask);
}
});
}
});
return {
original: text,
masked: masked,
piiDetected: detected,
hasPII: detected.length > 0
};
}
function sanitizeForLLM(message) {
var result = detectAndMaskPII(message);
return result.masked;
}
function sanitizeConversationHistory(history) {
return history.map(function (msg) {
return {
role: msg.role,
content: sanitizeForLLM(msg.content)
};
});
}
module.exports = {
detectAndMaskPII: detectAndMaskPII,
sanitizeForLLM: sanitizeForLLM,
sanitizeConversationHistory: sanitizeConversationHistory
};
Always run PII detection before sending customer messages to the LLM provider. The original data stays in your secure database; only masked content reaches external APIs. This is especially important for compliance with GDPR, CCPA, and PCI-DSS.
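One caveat on the credit-card pattern: any 16-digit string matches, including order numbers and tracking IDs. Real card numbers satisfy the Luhn checksum, so validating a regex match before masking it (sketched below) cuts down false positives considerably:

```javascript
// Luhn checksum: valid card numbers pass; most random digit strings fail.
// Use this to confirm a credit_card regex match before masking it.
function passesLuhn(candidate) {
  var digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false;
  var sum = 0;
  var doubleNext = false;
  // Walk right to left, doubling every second digit and subtracting 9
  // from any doubled digit above 9.
  for (var i = digits.length - 1; i >= 0; i--) {
    var d = parseInt(digits.charAt(i), 10);
    if (doubleNext) {
      d = d * 2;
      if (d > 9) d = d - 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

module.exports = { passesLuhn: passesLuhn };
```

In detectAndMaskPII, a `credit_card` match that fails the Luhn check can be left unmasked (or flagged at lower confidence) instead of being treated as a card number.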
Feedback Collection
You cannot improve what you do not measure. Collect feedback on every AI response:
var { Pool } = require("pg");
var pool = new Pool({ connectionString: process.env.POSTGRES_CONNECTION_STRING });
function collectFeedback(ticketId, messageId, rating, comment) {
return pool.query(
"INSERT INTO support_feedback (ticket_id, message_id, rating, comment, created_at) " +
"VALUES ($1, $2, $3, $4, $5)",
[ticketId, messageId, rating, comment || null, new Date().toISOString()]
).then(function () {
// If negative feedback, flag for review
if (rating <= 2) {
return pool.query(
"UPDATE support_tickets SET needs_review = true, " +
"review_reason = 'negative_feedback' WHERE id = $1",
[ticketId]
);
}
});
}
function getFeedbackMetrics(startDate, endDate) {
return pool.query(
"SELECT " +
" COUNT(*) as total_responses, " +
" AVG(rating) as avg_rating, " +
" COUNT(CASE WHEN rating >= 4 THEN 1 END) as positive_count, " +
" COUNT(CASE WHEN rating <= 2 THEN 1 END) as negative_count, " +
" ROUND(COUNT(CASE WHEN rating >= 4 THEN 1 END)::numeric / COUNT(*)::numeric * 100, 1) as satisfaction_rate " +
"FROM support_feedback " +
"WHERE created_at BETWEEN $1 AND $2",
[startDate, endDate]
).then(function (result) {
return result.rows[0];
});
}
module.exports = { collectFeedback: collectFeedback, getFeedbackMetrics: getFeedbackMetrics };
Measuring Support Quality
Track three key metrics that tell you whether the system is working: resolution rate, customer satisfaction, and first response time.
var { Pool } = require("pg");
var { getFeedbackMetrics } = require("./feedback");
var pool = new Pool({ connectionString: process.env.POSTGRES_CONNECTION_STRING });
function getSupportMetrics(startDate, endDate) {
var queries = [
// Resolution rate
pool.query(
"SELECT " +
" COUNT(*) as total_tickets, " +
" COUNT(CASE WHEN resolved_by_ai = true THEN 1 END) as ai_resolved, " +
" ROUND(COUNT(CASE WHEN resolved_by_ai = true THEN 1 END)::numeric / " +
" NULLIF(COUNT(*), 0)::numeric * 100, 1) as ai_resolution_rate " +
"FROM support_tickets WHERE created_at BETWEEN $1 AND $2",
[startDate, endDate]
),
// Average first response time
pool.query(
"SELECT AVG(EXTRACT(EPOCH FROM (first_response_at - created_at))) as avg_first_response_seconds " +
"FROM support_tickets WHERE created_at BETWEEN $1 AND $2 AND first_response_at IS NOT NULL",
[startDate, endDate]
),
// Customer satisfaction from feedback
getFeedbackMetrics(startDate, endDate)
];
return Promise.all(queries).then(function (results) {
return {
resolution: results[0].rows[0],
responseTime: results[1].rows[0],
satisfaction: results[2]
};
});
}
module.exports = { getSupportMetrics: getSupportMetrics };
Training with Past Support Tickets
Historical tickets are gold for improving your system. Use them to build a better knowledge base and fine-tune classification:
function importHistoricalTickets(tickets) {
var knowledge = require("./knowledge");
var processed = 0;
var errors = 0;
function processTicket(index) {
if (index >= tickets.length) {
return Promise.resolve({ processed: processed, errors: errors });
}
var ticket = tickets[index];
// Only import resolved tickets with positive outcomes
if (ticket.status !== "resolved" || !ticket.resolution) {
return processTicket(index + 1);
}
var article = {
title: ticket.subject,
content: "**Customer Issue:** " + ticket.description + "\n\n" +
"**Resolution:** " + ticket.resolution + "\n\n" +
"**Category:** " + ticket.category,
category: "historical_resolution",
metadata: {
originalTicketId: ticket.id,
resolvedAt: ticket.resolvedAt,
satisfactionRating: ticket.rating
}
};
return knowledge.indexKnowledgeArticle(article)
.then(function () {
processed++;
return processTicket(index + 1);
})
.catch(function (err) {
console.error("Failed to import ticket " + ticket.id + ":", err.message);
errors++;
return processTicket(index + 1);
});
}
return processTicket(0);
}
module.exports = { importHistoricalTickets: importHistoricalTickets };
Processing tickets sequentially avoids overwhelming the embedding API with concurrent requests. For large imports, batch them in groups of 50-100 with delays between batches.
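That batching advice can be sketched as a small helper: split the tickets into fixed-size chunks and pause between chunks. The 50-per-batch and one-second values are illustrative defaults, not provider limits:

```javascript
// Split an array into fixed-size chunks.
function chunk(items, size) {
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Process batches sequentially with a delay between them.
// handler(batch) is called once per batch and should return a Promise.
function processInBatches(items, batchSize, delayMs, handler) {
  var batches = chunk(items, batchSize);
  return batches.reduce(function (prev, batch) {
    return prev.then(function () {
      return handler(batch);
    }).then(function () {
      return new Promise(function (resolve) {
        setTimeout(resolve, delayMs);
      });
    });
  }, Promise.resolve());
}

module.exports = { chunk: chunk, processInBatches: processInBatches };
```

For example, `processInBatches(tickets, 50, 1000, importBatch)` would import 50 tickets at a time with a one-second gap between batches, where `importBatch` is whatever per-batch import function you supply.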
Complete Working Example
Here is the full AI support chatbot that ties everything together. It uses Express.js for the API, WebSocket for real-time chat, and all the components described above:
var express = require("express");
var http = require("http");
var WebSocket = require("ws");
var { Pool } = require("pg");
var { v4: uuidv4 } = require("uuid");
// Import support modules
var intentClassifier = require("./intent-classifier");
var knowledge = require("./knowledge");
var responseGen = require("./response-generator");
var escalation = require("./escalation");
var sentiment = require("./sentiment");
var tickets = require("./tickets");
var pii = require("./pii");
var feedback = require("./feedback");
var channels = require("./channels");
var app = express();
var server = http.createServer(app);
var wss = new WebSocket.Server({ server: server, path: "/support/ws" });
var pool = new Pool({ connectionString: process.env.POSTGRES_CONNECTION_STRING });
app.use(express.json());
// In-memory conversation store (use Redis in production)
var conversations = {};
// WebSocket chat handler
wss.on("connection", function (ws, req) {
var conversationId = uuidv4();
var adapter = channels.createChannelAdapter("webchat");
conversations[conversationId] = {
id: conversationId,
history: [],
customerId: null,
startedAt: new Date().toISOString()
};
ws.send(JSON.stringify({
type: "connected",
conversationId: conversationId,
message: "Hello! Welcome to support. How can I help you today?"
}));
ws.on("message", function (data) {
var parsed;
try {
parsed = JSON.parse(data);
} catch (e) {
ws.send(JSON.stringify({ type: "error", message: "Invalid message format" }));
return;
}
var conversation = conversations[conversationId];
if (!conversation) {
ws.send(JSON.stringify({ type: "error", message: "Conversation not found" }));
return;
}
// Normalize the message
var normalized = adapter.normalize({
sessionId: conversationId,
message: parsed.message,
timestamp: new Date().toISOString()
});
// Detect and mask PII
var sanitized = pii.detectAndMaskPII(normalized.content);
if (sanitized.hasPII) {
console.log("PII detected in conversation " + conversationId +
": " + sanitized.piiDetected.map(function (p) { return p.type; }).join(", "));
}
// Add to conversation history
conversation.history.push({
role: "user",
content: normalized.content,
sanitizedContent: sanitized.masked,
timestamp: normalized.timestamp
});
// Send typing indicator
ws.send(JSON.stringify({ type: "typing", status: true }));
// Process the message through the pipeline
var sanitizedHistory = pii.sanitizeConversationHistory(conversation.history);
Promise.all([
intentClassifier.classifyIntent(sanitized.masked, sanitizedHistory),
sentiment.analyzeSentiment(sanitized.masked)
]).then(function (results) {
var intent = results[0];
var sentimentResult = results[1];
// Check for escalation
var escalationCheck = escalation.shouldEscalate(
intent, conversation.history, sentimentResult
);
if (escalationCheck.escalate) {
var escalationData = escalation.escalateToHuman(
conversationId, escalationCheck, conversation.history, intent
);
// Create ticket for human agent
return tickets.createTicket(conversation, intent, { resolvedByAI: false })
.then(function (ticket) {
ws.send(JSON.stringify({
type: "escalation",
message: "I am connecting you with a support specialist who can better assist you. " +
"They will have the full context of our conversation. Your ticket number is #" + ticket.id + ".",
ticketId: ticket.id
}));
// Stop typing indicator
ws.send(JSON.stringify({ type: "typing", status: false }));
});
}
// Generate AI response
return responseGen.generateResponse(
sanitized.masked, intent, sanitizedHistory, null
).then(function (response) {
// Add response to history
conversation.history.push({
role: "assistant",
content: response.content,
timestamp: new Date().toISOString(),
intent: intent,
sentiment: sentimentResult
});
// Send the response
ws.send(JSON.stringify({
type: "message",
content: response.content,
messageId: uuidv4(),
intent: intent.intent,
confidence: intent.confidence
}));
// Stop typing indicator
ws.send(JSON.stringify({ type: "typing", status: false }));
});
}).catch(function (err) {
console.error("Error processing message:", err);
ws.send(JSON.stringify({
type: "error",
message: "I apologize, but I encountered an error. Let me connect you with a human agent."
}));
ws.send(JSON.stringify({ type: "typing", status: false }));
});
});
ws.on("close", function () {
var conversation = conversations[conversationId];
if (conversation && conversation.history.length > 1) {
// Create a ticket for tracking even if resolved
// Classify the customer's final message; the last history entry is
// usually the assistant's reply, so pick the last user turn
var lastUserMsg = conversation.history.filter(function (m) {
return m.role === "user";
}).pop();
intentClassifier.classifyIntent(lastUserMsg.content, conversation.history)
.then(function (intent) {
return tickets.createTicket(conversation, intent, { resolvedByAI: true });
}).catch(function (err) {
console.error("Failed to create closing ticket:", err.message);
});
}
delete conversations[conversationId];
});
});
// REST API endpoints for email and other async channels
app.post("/support/api/message", function (req, res) {
var channel = req.body.channel || "email";
var adapter = channels.createChannelAdapter(channel);
var normalized = adapter.normalize(req.body);
var sanitized = pii.detectAndMaskPII(normalized.content);
var conversationId = req.body.conversationId || uuidv4();
if (!conversations[conversationId]) {
conversations[conversationId] = {
id: conversationId,
history: [],
customerId: normalized.customerId,
startedAt: new Date().toISOString()
};
}
var conversation = conversations[conversationId];
conversation.history.push({
role: "user",
content: normalized.content,
sanitizedContent: sanitized.masked,
timestamp: normalized.timestamp
});
var sanitizedHistory = pii.sanitizeConversationHistory(conversation.history);
Promise.all([
intentClassifier.classifyIntent(sanitized.masked, sanitizedHistory),
sentiment.analyzeSentiment(sanitized.masked)
]).then(function (results) {
var intent = results[0];
var sentimentResult = results[1];
var escalationCheck = escalation.shouldEscalate(intent, conversation.history, sentimentResult);
if (escalationCheck.escalate) {
escalation.escalateToHuman(conversationId, escalationCheck, conversation.history, intent);
return tickets.createTicket(conversation, intent, { resolvedByAI: false })
.then(function (ticket) {
res.json({
conversationId: conversationId,
escalated: true,
ticketId: ticket.id,
message: "Your request has been escalated to a support specialist. Ticket #" + ticket.id
});
});
}
return responseGen.generateResponse(sanitized.masked, intent, sanitizedHistory, null)
.then(function (response) {
conversation.history.push({
role: "assistant",
content: response.content,
timestamp: new Date().toISOString()
});
res.json({
conversationId: conversationId,
message: response.content,
intent: intent.intent,
confidence: intent.confidence
});
});
}).catch(function (err) {
console.error("API message error:", err);
res.status(500).json({ error: "Failed to process message" });
});
});
// Feedback endpoint
app.post("/support/api/feedback", function (req, res) {
feedback.collectFeedback(req.body.ticketId, req.body.messageId, req.body.rating, req.body.comment)
.then(function () {
res.json({ success: true });
})
.catch(function (err) {
console.error("Feedback error:", err);
res.status(500).json({ error: "Failed to save feedback" });
});
});
// Metrics endpoint (admin only)
app.get("/support/api/metrics", function (req, res) {
var startDate = req.query.start || new Date(Date.now() - 30 * 86400000).toISOString();
var endDate = req.query.end || new Date().toISOString();
feedback.getFeedbackMetrics(startDate, endDate)
.then(function (metrics) {
res.json(metrics);
})
.catch(function (err) {
console.error("Metrics error:", err);
res.status(500).json({ error: "Failed to load metrics" });
});
});
var PORT = process.env.PORT || 3000;
server.listen(PORT, function () {
console.log("AI Support Server running on port " + PORT);
});
Common Issues and Troubleshooting
1. Vector Dimension Mismatch
ERROR: expected 1536 dimensions, not 3072
This happens when you switch embedding models mid-deployment. The text-embedding-3-small model produces 1536-dimension vectors while text-embedding-3-large produces 3072. Your pgvector column dimensions must match the model. If you need to change models, you must recreate the column and re-embed all existing content:
ALTER TABLE knowledge_base DROP COLUMN embedding;
ALTER TABLE knowledge_base ADD COLUMN embedding vector(1536);
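Recreating the column leaves every embedding NULL, so you also need a backfill pass. A minimal sketch, assuming a `db.query(sql, params)` helper (e.g. node-postgres) and an `embed(texts)` function that wraps your embedding provider; both names are placeholders:

```javascript
// Re-embed every knowledge base article in batches after a model change.
function reembedAll(db, embed, batchSize) {
  batchSize = batchSize || 100;
  function nextBatch(offset, updated) {
    return db.query(
      "SELECT id, content FROM knowledge_base ORDER BY id LIMIT $1 OFFSET $2",
      [batchSize, offset]
    ).then(function (result) {
      if (result.rows.length === 0) return updated; // done: return total count
      return embed(result.rows.map(function (r) { return r.content; }))
        .then(function (vectors) {
          return Promise.all(result.rows.map(function (row, i) {
            // pgvector accepts the '[0.1,0.2,...]' string form,
            // which JSON.stringify produces for a plain array
            return db.query(
              "UPDATE knowledge_base SET embedding = $1 WHERE id = $2",
              [JSON.stringify(vectors[i]), row.id]
            );
          }));
        })
        .then(function () {
          return nextBatch(offset + batchSize, updated + result.rows.length);
        });
    });
  }
  return nextBatch(0, 0);
}
```

Run it once after the ALTER TABLE, and keep traffic away from the RAG path until it finishes, since half-embedded tables produce inconsistent retrieval.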
2. Rate Limiting on Classification
Error: 429 Too Many Requests - Rate limit reached for gpt-4o-mini
When traffic spikes, you will hit the OpenAI rate limit. Implement a queue with exponential backoff:
var requestQueue = [];
var processing = false;
function queueClassification(message, history) {
return new Promise(function (resolve, reject) {
requestQueue.push({ message: message, history: history, resolve: resolve, reject: reject });
if (!processing) processQueue();
});
}
function processQueue() {
if (requestQueue.length === 0) {
processing = false;
return;
}
processing = true;
var item = requestQueue.shift();
intentClassifier.classifyIntent(item.message, item.history)
.then(function (result) {
item.resolve(result);
setTimeout(processQueue, 100);
})
.catch(function (err) {
if (err.response && err.response.status === 429) {
var retryAfter = parseInt(err.response.headers["retry-after"] || "2", 10);
requestQueue.unshift(item);
// Wait out the rate limit before touching the queue again. Do not
// also schedule the normal 100ms tick here, or the retried item
// would be picked up again long before the limit resets.
setTimeout(processQueue, retryAfter * 1000);
return;
}
item.reject(err);
setTimeout(processQueue, 100);
});
}
3. WebSocket Connection Drops
WebSocket connection to 'wss://example.com/support/ws' failed: Connection closed before receiving a handshake response
This is common behind load balancers or proxies that do not support WebSocket upgrades. Configure your proxy to pass the Upgrade header:
# Nginx configuration
location /support/ws {
proxy_pass http://localhost:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 86400;
}
Also implement reconnection logic on the client side:
var reconnectDelay = 1000;
var maxReconnectDelay = 30000;
function connectWebSocket() {
var ws = new WebSocket("wss://example.com/support/ws");
ws.onopen = function () {
reconnectDelay = 1000;
};
ws.onclose = function () {
setTimeout(function () {
reconnectDelay = Math.min(reconnectDelay * 2, maxReconnectDelay);
connectWebSocket();
}, reconnectDelay);
};
}
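Reconnection handles drops after the fact; you can also detect half-open connections proactively on the server. A sketch of a liveness sweep, assuming the `ws` package (each socket exposes `ping()`, a `pong` event, and `terminate()`); the interval and wiring are illustrative:

```javascript
// Each sweep terminates sockets that never answered the previous ping,
// then marks the survivors stale and pings them again.
function createHeartbeat(clients) {
  return function sweep() {
    clients.forEach(function (sock) {
      if (sock.isAlive === false) {
        sock.terminate(); // no pong since the last sweep: drop the connection
        return;
      }
      sock.isAlive = false; // the pong handler flips this back to true
      sock.ping();
    });
  };
}
// Wiring (illustrative):
// wss.on("connection", function (ws) {
//   ws.isAlive = true;
//   ws.on("pong", function () { ws.isAlive = true; });
// });
// setInterval(createHeartbeat(wss.clients), 30000);
```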
4. PII Regex False Positives
PII detected: credit_card in message "My order number is 1234-5678-9012-3456"
Order numbers, tracking numbers, and other identifiers can match credit card regex patterns. Refine your detection by adding context awareness:
function isLikelyPII(match, type, fullMessage) {
if (type === "credit_card") {
var contextWords = ["order", "tracking", "reference", "confirmation", "invoice"];
var lower = fullMessage.toLowerCase();
var hasNonPIIContext = contextWords.some(function (word) {
var wordIndex = lower.indexOf(word);
var matchIndex = lower.indexOf(match.toLowerCase());
return wordIndex !== -1 && Math.abs(wordIndex - matchIndex) < 50;
});
if (hasNonPIIContext) return false;
}
return true;
}
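Context words help, but a checksum is an even cheaper first filter: real card numbers satisfy the Luhn check, while most order and tracking numbers do not. Running it before the context heuristic removes many false positives. A self-contained sketch:

```javascript
// Luhn checksum: doubling every second digit from the right,
// a valid card number sums to a multiple of 10.
function passesLuhn(candidate) {
  var digits = candidate.replace(/\D/g, "");
  if (digits.length < 13 || digits.length > 19) return false; // card length range
  var sum = 0;
  var doubled = false;
  for (var i = digits.length - 1; i >= 0; i--) {
    var d = digits.charCodeAt(i) - 48;
    if (doubled) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubled = !doubled;
  }
  return sum % 10 === 0;
}
```

The order number from the error above, 1234-5678-9012-3456, fails the check, so it would never reach the context heuristic at all.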
5. LLM Hallucinating Product Information
Customer: "What is the price of your Pro plan?"
AI: "Our Pro plan is $49/month and includes unlimited API calls."
(Actual price: $79/month with 10,000 API calls)
This is the most dangerous failure mode. The LLM fills in gaps with plausible but incorrect information. The fix is twofold: always require RAG retrieval before answering factual questions, and add a verification step:
function verifyFactualClaim(response, knowledgeArticles) {
if (knowledgeArticles.length === 0) {
return {
verified: false,
reason: "No knowledge base articles found to verify this response"
};
}
var highestSimilarity = Math.max.apply(null, knowledgeArticles.map(function (a) {
return a.similarity;
}));
if (highestSimilarity < 0.5) {
return {
verified: false,
reason: "Low relevance match (similarity: " + highestSimilarity.toFixed(2) + ")"
};
}
return { verified: true };
}
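To act on that verification result, one option is to gate the outgoing response and fall back to an honest deferral when the check fails, rather than sending an unverified answer. A sketch using the same article shape (`{ similarity }`) as above; the 0.5 threshold and the fallback copy are illustrative:

```javascript
// Send the generated answer only when retrieval supports it;
// otherwise defer to a human instead of guessing.
function gateResponse(response, knowledgeArticles, minSimilarity) {
  minSimilarity = minSimilarity || 0.5;
  var best = knowledgeArticles.reduce(function (max, a) {
    return a.similarity > max ? a.similarity : max;
  }, 0);
  if (knowledgeArticles.length === 0 || best < minSimilarity) {
    return {
      content: "I want to be sure I give you accurate information on this. " +
               "Let me connect you with a specialist who can confirm the details.",
      verified: false
    };
  }
  return { content: response.content, verified: true };
}
```

Wired into the pipeline, an unverified result should also trigger the normal escalation path so the customer is not left with a non-answer.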
Best Practices
Always provide an escape hatch. Every AI response should include an implicit or explicit way to reach a human agent. Customers who feel trapped by a bot become hostile customers.
Log everything, but mask PII first. Full conversation logs are essential for debugging and improvement, but they must go through PII detection before storage. Audit your logging pipeline regularly.
Set response length limits. A 2000-word AI response to a simple question is a poor customer experience. Cap the max_tokens parameter based on the intent type. Billing inquiries rarely need more than 150 tokens. Technical support might need 400.
Implement circuit breakers for external APIs. If the LLM provider goes down, your support system should not hang indefinitely. Use timeouts, fallback responses, and queue incoming messages for processing when the service recovers.
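The circuit-breaker advice reduces to a small wrapper: after a run of consecutive failures it stops calling the provider and returns a canned fallback until a cooldown passes. A minimal sketch; the threshold, cooldown, and fallback text are illustrative:

```javascript
// Wrap any promise-returning call: after `threshold` consecutive failures
// the breaker opens and answers with `fallback` until `cooldownMs` elapses.
function createBreaker(fn, threshold, cooldownMs, fallback) {
  var failures = 0;
  var openedAt = 0;
  return function () {
    var args = arguments;
    if (failures >= threshold && Date.now() - openedAt < cooldownMs) {
      return Promise.resolve(fallback); // open: skip the failing dependency
    }
    return fn.apply(null, args).then(function (result) {
      failures = 0; // a success closes the breaker
      return result;
    }).catch(function (err) {
      failures++;
      if (failures >= threshold) openedAt = Date.now();
      throw err;
    });
  };
}
```

Wrap the LLM call site (e.g. `responseGen.generateResponse`) rather than the whole pipeline, so intent classification and ticketing keep working while the provider is down.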
Version your knowledge base. Track when articles are added, modified, or removed. When a customer reports an incorrect answer, you need to know which version of the knowledge base generated it.
Test with real customer messages, not synthetic data. Sanitize a sample of actual support conversations and use them as your test suite. Synthetic test cases miss the creative ways customers phrase their problems.
Monitor confidence score distributions. If your average classification confidence drops over time, it means your intent categories no longer match how customers are actually reaching out. Add new categories or adjust existing ones.
Rate-limit per conversation, not just globally. A single frustrated customer rapidly sending messages should not consume all your LLM quota. Cap individual conversations at a reasonable message rate.
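Per-conversation limiting only needs a sliding window keyed by conversation id. A sketch; the limits shown in the usage are illustrative defaults:

```javascript
// Allow at most `maxMessages` per conversation within a sliding `windowMs`.
function createConversationLimiter(maxMessages, windowMs) {
  var windows = {}; // conversationId -> timestamps of recent messages
  return function allow(conversationId, now) {
    now = now || Date.now();
    var stamps = (windows[conversationId] || []).filter(function (t) {
      return now - t < windowMs; // drop timestamps that left the window
    });
    if (stamps.length >= maxMessages) {
      windows[conversationId] = stamps;
      return false; // over the per-conversation cap
    }
    stamps.push(now);
    windows[conversationId] = stamps;
    return true;
  };
}
// Usage (illustrative): var allowMessage = createConversationLimiter(10, 60000);
// if (!allowMessage(conversationId)) { /* send a "please slow down" reply */ }
```

Check the limiter before the LLM pipeline runs, so a rapid-fire customer gets a polite throttle message instead of consuming classification and generation quota.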
Never auto-close tickets without confirmation. When the AI believes an issue is resolved, ask the customer explicitly. Silently closing a ticket when the customer is still confused creates a terrible experience.
Keep the AI personality consistent but configurable. Store tone and personality parameters separately from the core logic. Different products or customer segments may need different communication styles.