AI-Powered Content Writing Tools
Build AI content writing tools with auto-completion, style checking, tone adjustment, and SEO optimization in Node.js.
Overview
AI-powered writing tools have moved from novelty to necessity for anyone producing content at scale. Whether you are building an internal documentation system, a blog platform, or a client-facing content tool, integrating LLMs into the writing workflow gives writers real-time feedback, auto-completion, tone control, and SEO guidance without leaving their editor. This article walks through building a production-grade writing assistant API in Node.js, covering architecture decisions, feature implementation, and the pitfalls I have hit shipping these systems in production.
Prerequisites
- Node.js v18 or later installed
- An OpenAI API key (or any OpenAI-compatible API provider)
- Basic understanding of Express.js and REST APIs
- Familiarity with how LLMs work at a high level (prompts, tokens, completions)
- A text editor or IDE for testing (we will build a backend API, not a full frontend)
Building vs. Buying: Why Custom Tools Win
Off-the-shelf writing assistants like Grammarly, Jasper, and Copy.ai cover the basics. But the moment you need custom terminology enforcement, brand voice compliance, domain-specific knowledge, or integration into an existing CMS, you hit their limits fast.
Building your own gives you three critical advantages:
Customization — You control the prompts, the models, and the behavior. Want your tool to enforce a specific style guide? Write a system prompt. Want it to reject passive voice in API documentation but allow it in marketing copy? Add a configuration layer.
Integration — Your writing tool lives inside your existing stack. It reads from your CMS, checks against your glossary database, and writes suggestions back into your editorial workflow. No copy-pasting between tabs.
Cost control — Commercial tools charge per seat. When you own the API layer, you pay per token. For a team of 50 writers producing moderate volume, a custom solution often costs a third of a per-seat SaaS product. You also avoid vendor lock-in.
The trade-off is engineering time. But the core building blocks are surprisingly straightforward, as you will see below.
Designing a Writing Assistant Architecture
A well-designed writing assistant separates concerns into three layers:
┌─────────────────────────────────────────┐
│             Frontend Editor             │
│ (ContentEditable / CodeMirror / Quill)  │
└──────────────┬──────────────────────────┘
               │ HTTP / WebSocket
┌──────────────▼──────────────────────────┐
│        API Gateway (Express.js)         │
│  - Rate limiting                        │
│  - Authentication                       │
│  - Request routing                      │
│  - Response streaming                   │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│            AI Service Layer             │
│  - Prompt construction                  │
│  - Model selection                      │
│  - Token management                     │
│  - Response parsing                     │
│  - Caching                              │
└─────────────────────────────────────────┘
The editor sends text fragments to the API. The API constructs prompts, calls the LLM, parses the response, and returns structured suggestions. This separation means you can swap models, add caching, or change the frontend without touching the other layers.
For real-time features like auto-completion, use Server-Sent Events (SSE) or WebSockets to stream tokens back to the editor as they arrive. For batch operations like SEO analysis, a standard request-response pattern works fine.
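Here is a minimal sketch of the streaming path, assuming the same Express app and openai client configured in the examples below; the endpoint name and payload shape are illustrative:
app.post("/api/complete-stream", function(req, res) {
  // SSE headers: keep the connection open and push tokens as events
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "Complete the user's text naturally. Return only the continuation." },
      { role: "user", content: (req.body.textBefore || "").slice(-500) }
    ],
    max_tokens: 60,
    stream: true
  }).then(async function(stream) {
    for await (var chunk of stream) {
      var delta = chunk.choices[0] && chunk.choices[0].delta;
      var token = (delta && delta.content) || "";
      if (token) {
        res.write("data: " + JSON.stringify({ token: token }) + "\n\n");
      }
    }
    res.write("data: [DONE]\n\n");
    res.end();
  }).catch(function(err) {
    res.write("data: " + JSON.stringify({ error: err.message }) + "\n\n");
    res.end();
  });
});
On the client, an EventSource (or a fetch reader) appends each token as it arrives, so the writer sees text within the first hundred milliseconds instead of waiting for the full response.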
Implementing Auto-Completion and Sentence Continuation
Auto-completion is the feature users interact with most. The key is low latency — if it takes more than 300ms, writers will outpace the suggestions and the feature becomes annoying rather than helpful.
var OpenAI = require("openai");
var openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
function buildCompletionPrompt(textBefore, textAfter) {
return [
{
role: "system",
content: "You are a writing assistant. Complete the user's sentence naturally. " +
"Return ONLY the completion text, nothing else. Keep completions under 40 words. " +
"Match the existing tone and style."
},
{
role: "user",
content: "Text before cursor:\n" + textBefore.slice(-500) +
"\n\nText after cursor:\n" + (textAfter || "").slice(0, 200) +
"\n\nContinue writing from where the text before the cursor ends:"
}
];
}
function getAutoCompletion(textBefore, textAfter, callback) {
var messages = buildCompletionPrompt(textBefore, textAfter);
openai.chat.completions.create({
model: "gpt-4o-mini",
messages: messages,
max_tokens: 60,
temperature: 0.7,
stream: false
}).then(function(response) {
var completion = response.choices[0].message.content.trim();
callback(null, { completion: completion });
}).catch(function(err) {
callback(err);
});
}
I use gpt-4o-mini here deliberately. For auto-completion, speed matters more than deep reasoning. The smaller model responds in 100-200ms versus 500ms+ for the full model. The quality difference for sentence continuation is negligible.
Notice I only send the last 500 characters before the cursor and 200 characters after. Sending the entire document for every keystroke is wasteful and slow. Context windows are large, but token costs add up fast at the frequency auto-completion runs.
Grammar and Style Checking with LLMs
Traditional grammar checkers use rule-based systems. LLMs bring contextual understanding — they catch errors that LanguageTool misses because they understand what the sentence is trying to say, not just its grammatical structure.
function checkGrammarAndStyle(text, styleGuide, callback) {
var systemPrompt = "You are a professional editor. Analyze the text for grammar errors, " +
"style issues, and clarity problems. Return a JSON array of issues.\n\n" +
"Each issue should have: " +
"\"start\" (character offset), \"end\" (character offset), " +
"\"type\" (grammar|style|clarity), \"message\" (explanation), " +
"\"suggestion\" (corrected text).\n\n";
if (styleGuide) {
systemPrompt += "Apply these style rules:\n" + styleGuide + "\n\n";
}
systemPrompt += "Return ONLY valid JSON. No markdown fences.";
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: systemPrompt },
{ role: "user", content: text }
],
temperature: 0.2,
response_format: { type: "json_object" }
}).then(function(response) {
var raw = response.choices[0].message.content;
try {
      var parsed = JSON.parse(raw);
      callback(null, parsed.issues || []);
} catch (e) {
callback(new Error("Failed to parse style check response: " + e.message));
}
}).catch(function(err) {
callback(err);
});
}
The styleGuide parameter is where customization shines. Feed in your company's style guide — "Use active voice," "Never say 'utilize' when 'use' works," "Capitalize product names as defined in the glossary" — and the LLM enforces it. I have seen teams reduce editorial review time by 40% with a well-tuned style guide prompt.
Use lower temperature (0.2) for grammar checking. You want consistent, deterministic feedback, not creative suggestions.
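A call with a small inline style guide might look like this; the rules and product name are placeholders for your own guide:
// Hypothetical style guide; substitute your organization's real rules
var exampleStyleGuide = "- Use active voice.\n" +
  "- Never say 'utilize' when 'use' works.\n" +
  "- Spell the product name exactly as 'AcmeFlow'.";

checkGrammarAndStyle("The new feature was utilized by the team.", exampleStyleGuide,
  function(err, issues) {
    if (err) return console.error(err.message);
    console.log(JSON.stringify(issues, null, 2));
  });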
Tone Adjustment Features
Tone adjustment is one of the most requested features in content tools. A writer drafts in their natural voice, then the tool reshapes it for the target audience.
var TONE_PROMPTS = {
formal: "Rewrite in a formal, professional tone. Use complete sentences, " +
"avoid contractions, and maintain an authoritative voice.",
casual: "Rewrite in a conversational, casual tone. Use contractions, " +
"shorter sentences, and a friendly voice. Avoid jargon.",
persuasive: "Rewrite to be persuasive and compelling. Use power words, " +
"create urgency, and appeal to the reader's interests.",
technical: "Rewrite for a technical audience. Use precise terminology, " +
"include specific details, and assume domain knowledge.",
simplified: "Rewrite at a 6th-grade reading level. Use simple words, " +
"short sentences, and explain any technical concepts."
};
function adjustTone(text, targetTone, callback) {
var toneInstruction = TONE_PROMPTS[targetTone];
if (!toneInstruction) {
return callback(new Error("Unknown tone: " + targetTone +
". Available: " + Object.keys(TONE_PROMPTS).join(", ")));
}
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: toneInstruction + " Preserve the original meaning and key information. " +
"Return only the rewritten text."
},
{ role: "user", content: text }
],
temperature: 0.5
}).then(function(response) {
var rewritten = response.choices[0].message.content.trim();
callback(null, {
original: text,
rewritten: rewritten,
tone: targetTone
});
}).catch(function(err) {
callback(err);
});
}
Temperature 0.5 hits the sweet spot for tone adjustment — creative enough to reshape the prose, controlled enough to preserve meaning. I have tested this extensively: 0.3 produces wooden rewrites, 0.7 starts hallucinating new information.
Content Expansion and Condensation
Writers frequently need to hit a word count target. Expansion and condensation tools handle this without the writer manually padding or cutting.
function expandContent(text, targetWordCount, callback) {
var currentWords = text.split(/\s+/).length;
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: "Expand the following text to approximately " + targetWordCount +
" words (currently " + currentWords + " words). Add relevant details, " +
"examples, and explanations. Do not add filler. Every added sentence " +
"should provide genuine value. Return only the expanded text."
},
{ role: "user", content: text }
],
temperature: 0.6
}).then(function(response) {
var expanded = response.choices[0].message.content.trim();
var newCount = expanded.split(/\s+/).length;
callback(null, {
text: expanded,
originalWordCount: currentWords,
newWordCount: newCount
});
}).catch(function(err) {
callback(err);
});
}
function condenseContent(text, targetWordCount, callback) {
var currentWords = text.split(/\s+/).length;
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: "Condense the following text to approximately " + targetWordCount +
" words (currently " + currentWords + " words). Preserve key points " +
"and essential information. Remove redundancy and filler. " +
"Return only the condensed text."
},
{ role: "user", content: text }
],
temperature: 0.3
}).then(function(response) {
var condensed = response.choices[0].message.content.trim();
var newCount = condensed.split(/\s+/).length;
callback(null, {
text: condensed,
originalWordCount: currentWords,
newWordCount: newCount
});
}).catch(function(err) {
callback(err);
});
}
Implementing Outline Generation from Topics
Good outlines save hours of writing time. Given a topic and target audience, the tool generates a structured outline with headings, subheadings, and key points for each section.
function generateOutline(topic, audience, articleType, callback) {
var prompt = "Generate a detailed article outline for the topic: \"" + topic + "\"\n\n" +
"Target audience: " + (audience || "general technical readers") + "\n" +
"Article type: " + (articleType || "tutorial") + "\n\n" +
"Return a JSON object with this structure:\n" +
"{ \"title\": \"Suggested title\", " +
"\"sections\": [{ \"heading\": \"H2 heading\", " +
"\"subheadings\": [\"H3 subheading\"], " +
"\"keyPoints\": [\"point to cover\"], " +
"\"estimatedWords\": 300 }], " +
"\"estimatedTotalWords\": 2500, " +
"\"suggestedKeywords\": [\"keyword1\", \"keyword2\"] }";
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: "You are a content strategist. Generate comprehensive, well-structured outlines. " +
"Return only valid JSON."
},
{ role: "user", content: prompt }
],
temperature: 0.7,
response_format: { type: "json_object" }
}).then(function(response) {
var outline = JSON.parse(response.choices[0].message.content);
callback(null, outline);
}).catch(function(err) {
callback(err);
});
}
SEO Optimization Suggestions
SEO analysis does not need an LLM for everything. Combine deterministic analysis (keyword density, heading structure, meta description length) with LLM-powered suggestions for readability and semantic relevance.
function analyzeSEO(content, targetKeyword, callback) {
// Deterministic analysis first
var words = content.split(/\s+/);
var wordCount = words.length;
  // Escape regex metacharacters so keywords like "C++" don't break the pattern
  var keywordRegex = new RegExp(targetKeyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "gi");
var keywordMatches = content.match(keywordRegex) || [];
var keywordDensity = ((keywordMatches.length / wordCount) * 100).toFixed(2);
var headings = content.match(/^#{1,6}\s.+$/gm) || [];
var h1Count = headings.filter(function(h) { return /^#\s/.test(h); }).length;
var h2Count = headings.filter(function(h) { return /^##\s/.test(h); }).length;
var paragraphs = content.split(/\n\n+/);
var avgParagraphLength = Math.round(
paragraphs.reduce(function(sum, p) {
return sum + p.split(/\s+/).length;
}, 0) / paragraphs.length
);
var deterministicResults = {
wordCount: wordCount,
keywordDensity: parseFloat(keywordDensity),
keywordOccurrences: keywordMatches.length,
headingCount: headings.length,
h1Count: h1Count,
h2Count: h2Count,
averageParagraphLength: avgParagraphLength,
issues: []
};
// Flag deterministic issues
if (parseFloat(keywordDensity) < 0.5) {
deterministicResults.issues.push({
type: "warning",
message: "Keyword density is low (" + keywordDensity + "%). Aim for 1-2%."
});
}
if (parseFloat(keywordDensity) > 3.0) {
deterministicResults.issues.push({
type: "error",
message: "Keyword density is too high (" + keywordDensity + "%). This may trigger keyword stuffing penalties."
});
}
if (h1Count !== 1) {
deterministicResults.issues.push({
type: "error",
message: "Page should have exactly one H1 tag. Found " + h1Count + "."
});
}
if (h2Count < 2) {
deterministicResults.issues.push({
type: "warning",
message: "Add more H2 subheadings to improve content structure."
});
}
if (avgParagraphLength > 100) {
deterministicResults.issues.push({
type: "warning",
message: "Average paragraph is " + avgParagraphLength + " words. Break up long paragraphs for readability."
});
}
// LLM-powered analysis for semantic suggestions
openai.chat.completions.create({
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: "Analyze this content for SEO. Suggest a meta description (under 160 chars), " +
"3-5 related keywords to add, and identify any missing subtopics that competitors " +
"would cover. Return JSON with keys: metaDescription, relatedKeywords, missingTopics, readabilityScore (1-10)."
},
{
role: "user",
content: "Target keyword: " + targetKeyword + "\n\nContent:\n" + content.slice(0, 3000)
}
],
temperature: 0.3,
response_format: { type: "json_object" }
}).then(function(response) {
var aiSuggestions = JSON.parse(response.choices[0].message.content);
deterministicResults.suggestions = aiSuggestions;
callback(null, deterministicResults);
}).catch(function(err) {
// Return deterministic results even if LLM fails
deterministicResults.suggestions = null;
deterministicResults.llmError = err.message;
callback(null, deterministicResults);
});
}
This hybrid approach is important. Deterministic checks are instant and free. The LLM call adds semantic analysis that rule-based tools cannot provide, but if it fails, you still return useful data.
Plagiarism-Aware Generation
When generating or expanding content, you need to ensure the output is original. The approach is not to run a plagiarism checker after the fact — it is to instruct the model during generation and then verify.
function generateOriginalContent(topic, context, callback) {
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: "Generate original content on the given topic. Do NOT reproduce phrases " +
"from known sources verbatim. Use your own phrasing and structure. " +
"If referencing well-known concepts, explain them in original language. " +
"Include a 'sources' array listing any concepts or frameworks you reference " +
"so the writer can add proper citations. Return JSON with keys: content, sources."
},
{
role: "user",
content: "Topic: " + topic + "\n\nContext from the article so far:\n" +
(context || "No existing context.")
}
],
temperature: 0.8,
response_format: { type: "json_object" }
}).then(function(response) {
var result = JSON.parse(response.choices[0].message.content);
callback(null, result);
}).catch(function(err) {
callback(err);
});
}
Higher temperature (0.8) encourages more original phrasing. The sources array gives writers a trail to follow for citations, which is especially important for technical or research-heavy content.
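The verification half of that workflow can start as a cheap local heuristic before you reach for a dedicated plagiarism service. A minimal sketch, assuming you only need to catch verbatim copying from material you already have (here, the supplied context):
// Flags 8-word sequences the generated text copies verbatim from the
// source context. This does NOT check the open web; it only catches
// copying from text you supply.
function findVerbatimOverlaps(generated, context, n) {
  n = n || 8;
  var contextWords = context.toLowerCase().split(/\s+/);
  var contextGrams = {};
  for (var i = 0; i + n <= contextWords.length; i++) {
    contextGrams[contextWords.slice(i, i + n).join(" ")] = true;
  }
  var generatedWords = generated.toLowerCase().split(/\s+/);
  var overlaps = [];
  for (var j = 0; j + n <= generatedWords.length; j++) {
    var gram = generatedWords.slice(j, j + n).join(" ");
    if (contextGrams[gram]) {
      overlaps.push(gram);
    }
  }
  return overlaps;
}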
Building a Rewriting Tool
Rewriting is more nuanced than tone adjustment. Writers need specific operations: paraphrase (same meaning, different words), simplify (reduce complexity), and elaborate (add depth).
var REWRITE_MODES = {
paraphrase: "Rewrite the text using completely different words and sentence structures " +
"while preserving the exact same meaning. Do not add or remove information.",
simplify: "Rewrite the text to be simpler and easier to understand. Break complex " +
"sentences into shorter ones. Replace jargon with plain language. Explain technical " +
"terms inline.",
elaborate: "Expand on the text by adding relevant details, examples, and explanations. " +
"Make implicit information explicit. Add context that helps the reader understand why " +
"each point matters."
};
function rewriteText(text, mode, callback) {
var instruction = REWRITE_MODES[mode];
if (!instruction) {
return callback(new Error("Invalid rewrite mode. Use: " +
Object.keys(REWRITE_MODES).join(", ")));
}
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: instruction + " Return only the rewritten text." },
{ role: "user", content: text }
],
temperature: mode === "paraphrase" ? 0.7 : 0.5
}).then(function(response) {
callback(null, {
original: text,
rewritten: response.choices[0].message.content.trim(),
mode: mode
});
}).catch(function(err) {
callback(err);
});
}
Multi-Language Support and Translation
For teams publishing in multiple languages, translation with tone preservation is critical. Generic translation APIs lose nuance. An LLM-powered translation preserves the writer's intent.
function translateContent(text, targetLanguage, preserveTone, callback) {
var systemPrompt = "Translate the following text to " + targetLanguage + ". ";
if (preserveTone) {
systemPrompt += "Preserve the original tone, style, and formatting. " +
"Do not formalize casual text or casualize formal text. " +
"Keep markdown formatting intact. ";
}
systemPrompt += "Return only the translated text.";
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: systemPrompt },
{ role: "user", content: text }
],
temperature: 0.3
}).then(function(response) {
callback(null, {
original: text,
translated: response.choices[0].message.content.trim(),
language: targetLanguage
});
}).catch(function(err) {
callback(err);
});
}
Implementing Collaborative Writing with AI
In collaborative environments, the AI acts as a mediator — merging contributions, resolving style conflicts, and maintaining consistency across sections written by different people.
function harmonizeContent(sections, styleGuide, callback) {
var combinedContent = sections.map(function(section, index) {
return "=== Section " + (index + 1) + " (by " + section.author + ") ===\n" +
section.content;
}).join("\n\n");
openai.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "system",
content: "You are reviewing a document written by multiple authors. " +
"Harmonize the tone and style across all sections to read as if written " +
"by a single author. Fix inconsistencies in terminology, formatting, and voice. " +
"Apply the style guide if provided. Do not change the substance or meaning. " +
"Return the harmonized full text.\n\n" +
(styleGuide ? "Style guide:\n" + styleGuide : "")
},
{ role: "user", content: combinedContent }
],
temperature: 0.4
}).then(function(response) {
callback(null, {
harmonized: response.choices[0].message.content.trim(),
sectionCount: sections.length,
authors: sections.map(function(s) { return s.author; })
});
}).catch(function(err) {
callback(err);
});
}
Version History and Diff Tracking for AI Edits
Every AI edit should be tracked. Writers need to see what changed and revert if the AI mangled their prose. Store versions with diffs, not just snapshots.
var crypto = require("crypto");
var Diff = require("diff");
function VersionTracker() {
this.versions = {};
}
VersionTracker.prototype.addVersion = function(documentId, content, source) {
if (!this.versions[documentId]) {
this.versions[documentId] = [];
}
var history = this.versions[documentId];
var previousContent = history.length > 0 ? history[history.length - 1].content : "";
var diff = Diff.createPatch("document", previousContent, content);
var version = {
id: crypto.randomUUID(),
timestamp: new Date().toISOString(),
content: content,
source: source, // "user", "ai-tone", "ai-grammar", "ai-expand", etc.
diff: diff,
wordCount: content.split(/\s+/).length,
versionNumber: history.length + 1
};
history.push(version);
return version;
};
VersionTracker.prototype.getHistory = function(documentId) {
return (this.versions[documentId] || []).map(function(v) {
return {
id: v.id,
timestamp: v.timestamp,
source: v.source,
wordCount: v.wordCount,
versionNumber: v.versionNumber
};
});
};
VersionTracker.prototype.getVersion = function(documentId, versionId) {
var history = this.versions[documentId] || [];
return history.find(function(v) { return v.id === versionId; }) || null;
};
VersionTracker.prototype.getDiff = function(documentId, fromVersion, toVersion) {
var history = this.versions[documentId] || [];
var from = history.find(function(v) { return v.id === fromVersion; });
var to = history.find(function(v) { return v.id === toVersion; });
if (!from || !to) return null;
return {
diff: Diff.createPatch("document", from.content, to.content),
from: { id: from.id, timestamp: from.timestamp, source: from.source },
to: { id: to.id, timestamp: to.timestamp, source: to.source }
};
};
In production, store versions in a database — PostgreSQL with JSONB works well. The in-memory implementation above is for illustration. The important pattern is tagging each version with its source so writers can see "AI changed this" versus "Sarah edited this."
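A minimal persistence sketch using the pg package; the table and column names are illustrative, not a prescribed schema:
// Assumes a table like:
//   CREATE TABLE document_versions (
//     id UUID PRIMARY KEY, document_id TEXT NOT NULL,
//     created_at TIMESTAMPTZ NOT NULL, source TEXT, payload JSONB
//   );
var { Pool } = require("pg");
var pool = new Pool(); // connection settings come from PG* env variables

function saveVersionToDb(documentId, version, callback) {
  pool.query(
    "INSERT INTO document_versions (id, document_id, created_at, source, payload) " +
    "VALUES ($1, $2, $3, $4, $5)",
    [
      version.id, documentId, version.timestamp, version.source,
      JSON.stringify({
        content: version.content,
        diff: version.diff,
        wordCount: version.wordCount
      })
    ],
    function(err) { callback(err); }
  );
}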
Implementing a Suggestion Sidebar
A suggestion sidebar provides real-time feedback while the writer types. The trick is debouncing — you do not want to fire an API call on every keystroke.
function createSuggestionEngine(options) {
var debounceMs = options.debounceMs || 2000;
var timers = {};
return {
queueAnalysis: function(documentId, content, callback) {
if (timers[documentId]) {
clearTimeout(timers[documentId]);
}
timers[documentId] = setTimeout(function() {
var suggestions = [];
        // Counter pattern so asynchronous checks (e.g. LLM calls) can be
        // added later; the three checks below are synchronous heuristics.
var completed = 0;
var totalChecks = 3;
function checkDone() {
completed++;
if (completed === totalChecks) {
callback(null, suggestions);
}
}
// Readability check
var sentences = content.split(/[.!?]+/).filter(function(s) {
return s.trim().length > 0;
});
        var avgSentenceLength = sentences.length
          ? Math.round(content.split(/\s+/).length / sentences.length)
          : 0; // guard against empty content producing NaN
if (avgSentenceLength > 25) {
suggestions.push({
type: "readability",
severity: "warning",
message: "Average sentence length is " + avgSentenceLength +
" words. Aim for under 20 for better readability."
});
}
checkDone();
// Passive voice detection (heuristic)
var passivePattern = /\b(is|are|was|were|be|been|being)\s+\w+ed\b/gi;
var passiveMatches = content.match(passivePattern) || [];
if (passiveMatches.length > 3) {
suggestions.push({
type: "style",
severity: "info",
message: "Found " + passiveMatches.length + " potential passive voice constructions. " +
"Consider rewriting for clarity."
});
}
checkDone();
// Repeated word detection
var words = content.toLowerCase().split(/\s+/);
var wordFreq = {};
words.forEach(function(word) {
if (word.length > 4) {
wordFreq[word] = (wordFreq[word] || 0) + 1;
}
});
Object.keys(wordFreq).forEach(function(word) {
if (wordFreq[word] > 5 && wordFreq[word] / words.length > 0.02) {
suggestions.push({
type: "vocabulary",
severity: "info",
message: "\"" + word + "\" appears " + wordFreq[word] +
" times. Consider using synonyms."
});
}
});
checkDone();
}, debounceMs);
},
cancelAnalysis: function(documentId) {
if (timers[documentId]) {
clearTimeout(timers[documentId]);
delete timers[documentId];
}
}
};
}
This keeps the sidebar responsive without hammering the API. The heuristic checks (readability, passive voice, repetition) run locally with zero latency. Reserve LLM calls for deeper analysis triggered by explicit user action.
Complete Working Example: Content Writing API
Here is a complete Express.js API that ties all the features together into a deployable service.
var express = require("express");
var cors = require("cors");
var bodyParser = require("body-parser");
var OpenAI = require("openai");
var Diff = require("diff");
var crypto = require("crypto");
var app = express();
var PORT = process.env.PORT || 3000;
app.use(cors());
app.use(bodyParser.json({ limit: "2mb" }));
var openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// In-memory version store (use a database in production)
var versionStore = {};
// ==================== Helper Functions ====================
function callLLM(messages, options, callback) {
var config = {
model: options.model || "gpt-4o-mini",
messages: messages,
temperature: options.temperature || 0.5,
max_tokens: options.maxTokens || 2000
};
if (options.jsonMode) {
config.response_format = { type: "json_object" };
}
openai.chat.completions.create(config).then(function(response) {
var content = response.choices[0].message.content.trim();
var usage = response.usage;
callback(null, content, usage);
}).catch(function(err) {
callback(err);
});
}
function saveVersion(docId, content, source) {
if (!versionStore[docId]) {
versionStore[docId] = [];
}
var history = versionStore[docId];
var prev = history.length > 0 ? history[history.length - 1].content : "";
var version = {
id: crypto.randomUUID(),
timestamp: new Date().toISOString(),
content: content,
source: source,
diff: Diff.createPatch("doc", prev, content),
wordCount: content.split(/\s+/).length,
version: history.length + 1
};
history.push(version);
return version;
}
// ==================== API Routes ====================
// Auto-completion endpoint
app.post("/api/complete", function(req, res) {
var textBefore = req.body.textBefore || "";
var textAfter = req.body.textAfter || "";
if (!textBefore) {
return res.status(400).json({ error: "textBefore is required" });
}
callLLM([
{
role: "system",
content: "Complete the user's text naturally. Return ONLY the continuation, " +
"under 40 words. Match existing tone and style."
},
{
role: "user",
content: "Text before cursor:\n" + textBefore.slice(-500) +
(textAfter ? "\n\nText after cursor:\n" + textAfter.slice(0, 200) : "")
}
], { model: "gpt-4o-mini", temperature: 0.7, maxTokens: 80 },
function(err, completion, usage) {
if (err) {
return res.status(500).json({ error: err.message });
}
res.json({ completion: completion, tokens: usage.total_tokens });
});
});
// Style and grammar check endpoint
app.post("/api/check-style", function(req, res) {
var text = req.body.text;
var styleGuide = req.body.styleGuide || "";
if (!text) {
return res.status(400).json({ error: "text is required" });
}
var systemPrompt = "Analyze this text for grammar, style, and clarity issues. " +
"Return a JSON object with an \"issues\" array. Each issue: " +
"{\"type\": \"grammar|style|clarity\", \"message\": \"...\", \"suggestion\": \"...\", " +
"\"excerpt\": \"the problematic text\"}. ";
if (styleGuide) {
systemPrompt += "Apply these style rules: " + styleGuide;
}
callLLM([
{ role: "system", content: systemPrompt },
{ role: "user", content: text }
], { model: "gpt-4o", temperature: 0.2, jsonMode: true },
function(err, result) {
if (err) {
return res.status(500).json({ error: err.message });
}
try {
res.json(JSON.parse(result));
} catch (e) {
res.status(500).json({ error: "Failed to parse LLM response" });
}
});
});
// Tone adjustment endpoint
app.post("/api/adjust-tone", function(req, res) {
var text = req.body.text;
var tone = req.body.tone;
var docId = req.body.documentId;
var tones = {
formal: "Rewrite in a formal, professional tone. Avoid contractions.",
casual: "Rewrite in a conversational, casual tone. Use contractions and shorter sentences.",
persuasive: "Rewrite to be persuasive and compelling. Create urgency.",
technical: "Rewrite for a technical audience. Use precise terminology.",
simplified: "Rewrite at a 6th-grade reading level with simple words."
};
if (!text || !tone) {
return res.status(400).json({ error: "text and tone are required" });
}
if (!tones[tone]) {
return res.status(400).json({
error: "Invalid tone. Options: " + Object.keys(tones).join(", ")
});
}
callLLM([
{
role: "system",
content: tones[tone] + " Preserve original meaning. Return only rewritten text."
},
{ role: "user", content: text }
], { model: "gpt-4o", temperature: 0.5 },
function(err, rewritten) {
if (err) {
return res.status(500).json({ error: err.message });
}
var response = { original: text, rewritten: rewritten, tone: tone };
if (docId) {
var version = saveVersion(docId, rewritten, "ai-tone-" + tone);
response.version = version.version;
}
res.json(response);
});
});
// Outline generation endpoint
app.post("/api/generate-outline", function(req, res) {
var topic = req.body.topic;
var audience = req.body.audience || "general readers";
var articleType = req.body.articleType || "article";
if (!topic) {
return res.status(400).json({ error: "topic is required" });
}
callLLM([
{
role: "system",
content: "Generate a comprehensive article outline. Return JSON with: " +
"title (string), sections (array of {heading, subheadings[], keyPoints[], estimatedWords}), " +
"estimatedTotalWords (number), suggestedKeywords (string array)."
},
{
role: "user",
content: "Topic: " + topic + "\nAudience: " + audience + "\nType: " + articleType
}
], { model: "gpt-4o", temperature: 0.7, jsonMode: true },
function(err, result) {
if (err) {
return res.status(500).json({ error: err.message });
}
try {
res.json(JSON.parse(result));
} catch (e) {
res.status(500).json({ error: "Failed to parse outline response" });
}
});
});
// SEO analysis endpoint
app.post("/api/analyze-seo", function(req, res) {
var content = req.body.content;
var keyword = req.body.keyword;
if (!content || !keyword) {
return res.status(400).json({ error: "content and keyword are required" });
}
var words = content.split(/\s+/);
  // Escape regex metacharacters so keywords like "C++" don't break the pattern
  var keywordRegex = new RegExp(keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "gi");
var matches = content.match(keywordRegex) || [];
var density = ((matches.length / words.length) * 100).toFixed(2);
var headings = content.match(/^#{1,6}\s.+$/gm) || [];
var analysis = {
wordCount: words.length,
keywordDensity: parseFloat(density),
keywordOccurrences: matches.length,
headingCount: headings.length,
issues: []
};
if (parseFloat(density) < 0.5) {
analysis.issues.push("Keyword density too low (" + density + "%). Aim for 1-2%.");
}
if (parseFloat(density) > 3.0) {
analysis.issues.push("Keyword density too high (" + density + "%). Risk of keyword stuffing.");
}
if (headings.length < 3) {
analysis.issues.push("Add more headings to improve content structure and scanability.");
}
// LLM suggestions
callLLM([
{
role: "system",
content: "Analyze for SEO. Return JSON: {metaDescription (under 160 chars), " +
"relatedKeywords (array), missingTopics (array), readabilityScore (1-10)}."
},
{
role: "user",
content: "Keyword: " + keyword + "\n\n" + content.slice(0, 3000)
}
], { model: "gpt-4o-mini", temperature: 0.3, jsonMode: true },
function(err, result) {
if (err) {
analysis.aiSuggestions = null;
analysis.aiError = err.message;
} else {
try {
analysis.aiSuggestions = JSON.parse(result);
} catch (e) {
analysis.aiSuggestions = null;
}
}
res.json(analysis);
});
});
// Rewrite endpoint
app.post("/api/rewrite", function(req, res) {
var text = req.body.text;
var mode = req.body.mode; // paraphrase, simplify, elaborate
var docId = req.body.documentId;
var modes = {
paraphrase: "Rewrite using different words and sentence structures. Same meaning, different phrasing.",
simplify: "Simplify the text. Shorter sentences, simpler words, explain jargon.",
elaborate: "Expand with relevant details, examples, and explanations."
};
if (!text || !mode) {
return res.status(400).json({ error: "text and mode are required" });
}
if (!modes[mode]) {
return res.status(400).json({
error: "Invalid mode. Options: " + Object.keys(modes).join(", ")
});
}
callLLM([
{ role: "system", content: modes[mode] + " Return only rewritten text." },
{ role: "user", content: text }
], { model: "gpt-4o", temperature: mode === "paraphrase" ? 0.7 : 0.5 },
function(err, rewritten) {
if (err) {
return res.status(500).json({ error: err.message });
}
var response = { original: text, rewritten: rewritten, mode: mode };
if (docId) {
var version = saveVersion(docId, rewritten, "ai-rewrite-" + mode);
response.version = version.version;
}
res.json(response);
});
});
// Translation endpoint
app.post("/api/translate", function(req, res) {
var text = req.body.text;
var language = req.body.language;
if (!text || !language) {
return res.status(400).json({ error: "text and language are required" });
}
callLLM([
{
role: "system",
content: "Translate to " + language + ". Preserve tone and formatting. Return only the translation."
},
{ role: "user", content: text }
], { model: "gpt-4o", temperature: 0.3 },
function(err, translated) {
if (err) {
return res.status(500).json({ error: err.message });
}
res.json({ original: text, translated: translated, language: language });
});
});
// Version history endpoint
app.get("/api/versions/:documentId", function(req, res) {
var history = versionStore[req.params.documentId] || [];
res.json({
documentId: req.params.documentId,
versions: history.map(function(v) {
return {
id: v.id,
timestamp: v.timestamp,
source: v.source,
wordCount: v.wordCount,
version: v.version
};
})
});
});
// Health check
app.get("/api/health", function(req, res) {
res.json({
status: "ok",
hasApiKey: !!process.env.OPENAI_API_KEY,
uptime: process.uptime()
});
});
app.listen(PORT, function() {
console.log("Content Writing API running on port " + PORT);
});
Test it with curl:
# Auto-completion
curl -X POST http://localhost:3000/api/complete \
-H "Content-Type: application/json" \
-d '{"textBefore": "Building scalable APIs requires careful attention to"}'
# Tone adjustment
curl -X POST http://localhost:3000/api/adjust-tone \
-H "Content-Type: application/json" \
-d '{"text": "The system is broken and nobody cares.", "tone": "formal", "documentId": "doc-1"}'
# SEO analysis
curl -X POST http://localhost:3000/api/analyze-seo \
-H "Content-Type: application/json" \
-d '{"content": "Your article content here...", "keyword": "API development"}'
# Outline generation
curl -X POST http://localhost:3000/api/generate-outline \
-H "Content-Type: application/json" \
-d '{"topic": "Microservices Authentication Patterns", "audience": "backend engineers"}'
Install dependencies:
npm init -y
npm install express cors body-parser openai diff
Common Issues and Troubleshooting
1. Rate Limiting Errors from OpenAI
Error: 429 Too Many Requests - Rate limit reached for gpt-4o-mini
Auto-completion fires frequently. Implement client-side debouncing (at least 500ms) and server-side rate limiting per user. Use a token bucket algorithm or the express-rate-limit middleware. Also maintain a request queue that drops stale requests — if a newer completion request arrives before the old one finishes, cancel the old one.
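For example, with the express-rate-limit middleware (the window and limit are starting points to tune against your traffic and OpenAI tier):
var rateLimit = require("express-rate-limit");

// Cap completion traffic per client before it reaches the LLM
var completionLimiter = rateLimit({
  windowMs: 60 * 1000, // 1-minute window
  max: 30,             // 30 requests per IP per window
  standardHeaders: true,
  legacyHeaders: false,
  message: { error: "Too many completion requests. Slow down." }
});

app.use("/api/complete", completionLimiter);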
2. JSON Parsing Failures on Structured Responses
SyntaxError: Unexpected token 'H' at position 0
Failed to parse style check response: Unexpected token 'H'
Even with response_format: { type: "json_object" }, models occasionally return markdown-fenced JSON or explanatory text. Always wrap JSON.parse in try-catch. As a fallback, strip markdown fences before parsing:
function safeParseJSON(text) {
  // Strip a leading ``` or ```json fence and a trailing ``` fence
  var cleaned = text.replace(/^```(?:json)?\s*\n?/i, "").replace(/\n?```\s*$/i, "").trim();
try {
return JSON.parse(cleaned);
} catch (e) {
return null;
}
}
3. Token Limit Exceeded for Long Documents
Error: 400 - This model's maximum context length is 128000 tokens.
You requested 135420 tokens.
Never send entire documents for every request. Slice content to the relevant section. For grammar checking, break the document into chunks of 2000-3000 words and process them sequentially. Reassemble the results with adjusted character offsets.
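A minimal chunking sketch: split on paragraph boundaries, keep each chunk under a word budget, and record each chunk's starting character offset so per-chunk suggestions can be mapped back to document coordinates:
// Splits text into ~maxWords chunks on paragraph boundaries. Each chunk
// carries its starting character offset in the full document.
function chunkDocument(text, maxWords) {
  maxWords = maxWords || 2500;
  var paragraphs = text.split(/\n\n+/);
  var chunks = [];
  var current = "";
  var currentOffset = 0;
  var cursor = 0; // search position within the full text
  paragraphs.forEach(function(p) {
    var pStart = text.indexOf(p, cursor);
    cursor = pStart + p.length;
    var candidate = current ? current + "\n\n" + p : p;
    if (current && candidate.split(/\s+/).length > maxWords) {
      chunks.push({ text: current, offset: currentOffset });
      current = p;
      currentOffset = pStart;
    } else {
      if (!current) currentOffset = pStart;
      current = candidate;
    }
  });
  if (current) chunks.push({ text: current, offset: currentOffset });
  return chunks;
}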
4. Inconsistent Character Offsets in Grammar Suggestions
Issue: AI returns character offsets that don't match the original text positions
LLMs are unreliable at counting characters. Instead of trusting character offsets from the model, ask for the problematic text excerpt and use String.indexOf() to find the actual position:
function findActualOffset(fullText, excerpt) {
var index = fullText.indexOf(excerpt);
if (index === -1) {
// Try case-insensitive
index = fullText.toLowerCase().indexOf(excerpt.toLowerCase());
}
return index;
}
5. Version Store Memory Leak
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
The in-memory version store grows unbounded. In production, move to PostgreSQL or MongoDB. If you must use in-memory storage for prototyping, cap the history length per document and evict old versions:
var MAX_VERSIONS = 50;
if (history.length > MAX_VERSIONS) {
history.splice(0, history.length - MAX_VERSIONS);
}
Best Practices
Debounce aggressively on real-time features. Auto-completion should wait at least 500ms after the user stops typing. Sidebar suggestions should wait 2 seconds. Every unnecessary API call costs money and adds latency.
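A generic debounce helper covers both cases; wiring it to editor change events is up to your frontend:
// Delays fn until waitMs has passed with no new calls, so a burst of
// keystrokes produces a single request.
function debounce(fn, waitMs) {
  var timer = null;
  return function() {
    var args = arguments;
    var self = this;
    clearTimeout(timer);
    timer = setTimeout(function() {
      fn.apply(self, args);
    }, waitMs);
  };
}

// Usage: at most one completion request, 500ms after typing stops
var requestCompletion = debounce(function(textBefore) {
  // POST textBefore to /api/complete here
}, 500);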
Use the cheapest model that works for each feature. Auto-completion and SEO keyword analysis work fine with gpt-4o-mini. Reserve gpt-4o for tone adjustment, rewriting, and grammar analysis where quality differences are noticeable.
Always return partial results on failure. If the LLM call fails during SEO analysis, still return the deterministic results (word count, keyword density, heading count). Users get value even when the AI layer is down.
Tag every AI edit with its source. Version history should distinguish between "user edited" and "AI tone adjustment" and "AI grammar fix." Writers need to audit what the AI changed and revert specific operations without losing their own edits.
Enforce content length limits on API inputs. A malicious or careless client sending a 500,000-word document to your grammar endpoint will burn through your token budget in one request. Validate input length and return a 413 error for oversized payloads.
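A small middleware factory handles this; the 50,000-word cap below is illustrative, so set it to your largest legitimate document:
// Rejects oversized text fields before any tokens are spent
function enforceTextLimit(field, maxWords) {
  return function(req, res, next) {
    var value = req.body && req.body[field];
    if (typeof value === "string" && value.split(/\s+/).length > maxWords) {
      return res.status(413).json({
        error: field + " exceeds the " + maxWords + "-word limit"
      });
    }
    next();
  };
}

// Attach to any text-accepting route, e.g.:
// app.post("/api/check-style", enforceTextLimit("text", 50000), handler);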
Cache repeated requests. Writers often check the same paragraph multiple times. A simple in-memory cache keyed on a hash of the input text and operation type can reduce API costs by 20-30%. Set a TTL of 5 minutes — content changes frequently enough that stale caches are worse than cache misses.
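A sketch of such a cache, keyed on a hash of the operation plus input, with the 5-minute TTL described above:
var crypto = require("crypto");

var cache = {};
var CACHE_TTL_MS = 5 * 60 * 1000;

function cacheKey(operation, text) {
  // Hash operation + input so identical requests share an entry
  return crypto.createHash("sha256")
    .update(operation + "\u0000" + text)
    .digest("hex");
}

function cacheGet(operation, text) {
  var entry = cache[cacheKey(operation, text)];
  if (entry && Date.now() - entry.storedAt < CACHE_TTL_MS) {
    return entry.value;
  }
  return null;
}

function cacheSet(operation, text, value) {
  cache[cacheKey(operation, text)] = { value: value, storedAt: Date.now() };
}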
Stream responses for user-facing features. Auto-completion and rewriting feel dramatically faster when you stream tokens to the client via SSE. The first token arrives in 100ms instead of waiting 2 seconds for the full response. Use stream: true in the OpenAI SDK and pipe chunks to the response.
Log token usage per feature and per user. You need visibility into which features consume the most tokens and which users are heavy consumers. This data drives pricing decisions, model selection, and rate limiting policies.
References
- OpenAI API Documentation — Chat completions, streaming, structured outputs
- Express.js Documentation — Route handling, middleware, request/response
- diff npm package — Text diffing for version history
- Server-Sent Events (MDN) — Streaming AI responses to the browser
- express-rate-limit — Rate limiting middleware for Express
- OpenAI Cookbook: Text Generation Best Practices — Prompt engineering patterns for writing tools