Content Generation Systems: Architecture and Implementation

Build content generation systems with LLM pipelines, quality control, SEO optimization, and CMS publishing in Node.js.

Content generation systems use LLMs to produce articles, product descriptions, marketing copy, and documentation at scale through automated pipelines. These systems are not about replacing writers — they are about building infrastructure that turns structured briefs into publish-ready content with consistent quality, proper SEO, and human oversight at the right checkpoints.

I have built and operated content generation pipelines that produce hundreds of articles per week across multiple domains. This article covers the architecture, implementation details, and hard-won lessons from running these systems in production.

Prerequisites

  • Node.js 18+ installed
  • Working knowledge of Express.js and REST APIs
  • An OpenAI or Anthropic API key
  • Basic understanding of CMS concepts (headless CMS, content models)
  • Familiarity with message queues (Redis or BullMQ)
  • A PostgreSQL or MongoDB instance for content storage

What Content Generation Systems Do

Content generation systems automate the creation of text-based content through LLM pipelines. The scope varies widely:

  • Articles and tutorials — Long-form educational content with code examples, structured sections, and SEO metadata
  • Product descriptions — E-commerce catalog entries generated from product attributes and specifications
  • Marketing copy — Landing pages, email sequences, ad variations, social media posts
  • Documentation — API references, user guides, changelogs generated from code or specs
  • Localization — Translating and adapting existing content for different markets

The common thread is a structured input (a brief, a template, a data source) that gets transformed into polished output through a pipeline of generation, validation, and review steps.

What separates a toy script from a production system is everything around the LLM call: queue management, retry logic, quality gates, cost tracking, content versioning, and publishing workflows.
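
Concretely, the structured input is usually a brief. A minimal sketch of what one might look like — the field names here are illustrative, not a fixed schema:

```javascript
// A hypothetical brief object. Field names are illustrative, not a fixed schema.
var brief = {
  slug: 'rate-limiting-node',
  title: 'Rate Limiting Strategies in Node.js',
  category: 'backend',
  difficulty: 'intermediate',
  tags: ['rate-limiting', 'node.js', 'redis'],
  topicSections: [
    'Token bucket vs sliding window',
    'Implementing limits with Redis'
  ]
};

// A guard the pipeline can run before enqueueing anything
function isValidBrief(b) {
  return Boolean(b && b.slug && b.title &&
    Array.isArray(b.topicSections) && b.topicSections.length > 0);
}
```

Rejecting malformed briefs at the front of the pipeline is much cheaper than discovering the problem after a paid LLM call.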

System Architecture

A production content generation system follows this flow:

Input Sources → Content Queue → Generation Pipeline → Quality Control → Publishing
     ↓              ↓                  ↓                    ↓              ↓
  Briefs         Priority          Templates +          Automated       CMS API
  Data feeds     Scheduling        LLM Calls            Checks          Webhooks
  Templates      Rate limiting     Multi-pass           LLM Review      CDN Cache
  APIs           Deduplication     Formatting           Human Review    Notifications

Each stage is independent, and stages communicate through a message queue. This matters because LLM calls are slow, expensive, and occasionally fail. You need the ability to retry individual stages without reprocessing the entire pipeline.

// architecture.js — Core pipeline stages
var PIPELINE_STAGES = {
  QUEUED: 'queued',
  GENERATING: 'generating',
  REVIEWING: 'reviewing',
  FORMATTING: 'formatting',
  PUBLISHING: 'publishing',
  PUBLISHED: 'published',
  FAILED: 'failed',
  REJECTED: 'rejected'
};
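
One guard worth layering on top of these stage constants (the transition map below is my addition, not part of the snippet above): restrict which transitions are legal, so a bug can never move content from failed straight to published.

```javascript
// Legal transitions between the stages defined above. The exact edges are
// an assumption — adjust them to match your own workflow.
var TRANSITIONS = {
  queued: ['generating', 'failed'],
  generating: ['reviewing', 'failed'],
  reviewing: ['formatting', 'rejected', 'failed'],
  formatting: ['publishing', 'failed'],
  publishing: ['published', 'failed'],
  published: [],
  failed: ['queued'],    // failed items can be re-queued
  rejected: ['queued']   // rejected items can be revised and re-queued
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).indexOf(to) !== -1;
}
```

Calling canTransition before every status update turns silent state corruption into a loud, debuggable error.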

Designing Content Templates with Variable Slots

Templates are the backbone of consistent content. A template defines the structure, tone, and required sections, while variable slots get filled per-item.

// templates/article-template.js
var articleTemplate = {
  name: 'technical-article',
  version: '2.1',
  systemPrompt: [
    'You are a senior software engineer writing for a technical audience.',
    'Write in a practical, concise, authoritative voice.',
    'Include working code examples that readers can copy and run.',
    'Use markdown formatting with proper heading hierarchy.',
    'Every claim should be backed by a code example or concrete data.'
  ].join(' '),
  sections: [
    { name: 'overview', required: true, maxWords: 150 },
    { name: 'prerequisites', required: true, maxWords: 100 },
    { name: 'mainContent', required: true, minWords: 1500 },
    { name: 'workingExample', required: true, minWords: 500 },
    { name: 'troubleshooting', required: true, minItems: 4 },
    { name: 'bestPractices', required: true, minItems: 6 },
    { name: 'references', required: false }
  ],
  variables: {
    title: { type: 'string', required: true },
    category: { type: 'string', required: true },
    difficulty: { type: 'enum', values: ['beginner', 'intermediate', 'advanced'] },
    tags: { type: 'array', required: true },
    topicSections: { type: 'array', required: true },
    targetAudience: { type: 'string', default: 'software engineers' },
    codeLanguage: { type: 'string', default: 'javascript' }
  },
  buildPrompt: function(variables) {
    var sectionList = variables.topicSections.map(function(s) {
      return '- ' + s;
    }).join('\n');

    return [
      'Write a comprehensive technical article titled "' + variables.title + '".',
      '',
      'Target audience: ' + (variables.targetAudience || 'software engineers'),
      'Difficulty level: ' + variables.difficulty,
      'Primary code language: ' + (variables.codeLanguage || 'javascript'),
      '',
      'Cover these topics in depth:',
      sectionList,
      '',
      'Include a complete working example that readers can run.',
      'Add a troubleshooting section with at least 4 real issues and error messages.',
      'End with at least 6 best practices as bullet points.'
    ].join('\n');
  }
};

module.exports = articleTemplate;

The key insight is that templates are versioned. When you change a template, you need to know which content was generated with which version for debugging and regeneration.
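
In practice that means stamping every generated document with the template version and querying for anything produced by a stale version when the template changes. A sketch — the `generation` field layout mirrors what the content store writes, but the helper itself is my addition:

```javascript
// Select documents generated with an out-of-date template version so they
// can be re-queued for regeneration. Assumes each document carries a
// `generation` record with a templateVersion field.
function findStaleContent(docs, currentTemplateVersion) {
  return docs.filter(function(doc) {
    return doc.generation &&
      doc.generation.templateVersion !== currentTemplateVersion;
  });
}

var docs = [
  { slug: 'a', generation: { templateVersion: '2.0' } },
  { slug: 'b', generation: { templateVersion: '2.1' } }
];
var stale = findStaleContent(docs, '2.1');
```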

Implementing a Content Generation Pipeline in Node.js

Here is the core generation service that handles LLM calls with retries, token tracking, and structured output:

// services/generator.js
var axios = require('axios');

var OPENAI_URL = 'https://api.openai.com/v1/chat/completions';

function ContentGenerator(config) {
  this.apiKey = config.apiKey;
  this.model = config.model || 'gpt-4o';
  this.maxRetries = config.maxRetries || 3;
  this.tokenBudget = config.tokenBudget || 4096;
  this.totalTokensUsed = 0;
  this.totalCost = 0;
}

ContentGenerator.prototype.generate = function(template, variables, callback) {
  var self = this;
  var systemPrompt = template.systemPrompt;
  var userPrompt = template.buildPrompt(variables);
  var attempt = 0;

  function tryGenerate() {
    attempt++;
    console.log('[Generator] Attempt %d for "%s"', attempt, variables.title);

    var requestBody = {
      model: self.model,
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: userPrompt }
      ],
      max_tokens: self.tokenBudget,
      temperature: 0.7
    };

    axios.post(OPENAI_URL, requestBody, {
      headers: {
        'Authorization': 'Bearer ' + self.apiKey,
        'Content-Type': 'application/json'
      },
      timeout: 120000
    }).then(function(response) {
      var result = response.data;
      var content = result.choices[0].message.content;
      var usage = result.usage;

      self.totalTokensUsed += usage.total_tokens;
      self.totalCost += self.calculateCost(usage);

      console.log('[Generator] Generated %d tokens for "%s" (cost: $%s)',
        usage.total_tokens, variables.title, self.calculateCost(usage).toFixed(4));

      callback(null, {
        content: content,
        tokens: usage,
        model: self.model,
        attempt: attempt,
        generatedAt: new Date().toISOString()
      });
    }).catch(function(err) {
      var status = err.response ? err.response.status : 0;
      var message = (err.response && err.response.data && err.response.data.error)
        ? err.response.data.error.message
        : err.message;

      console.error('[Generator] Error on attempt %d: %s (status: %d)', attempt, message, status);

      if (status === 429 || status === 500 || status === 503) {
        if (attempt < self.maxRetries) {
          var delay = Math.pow(2, attempt) * 1000 + Math.random() * 1000;
          console.log('[Generator] Retrying in %dms...', Math.round(delay));
          setTimeout(tryGenerate, delay);
          return;
        }
      }

      callback(new Error('Generation failed after ' + attempt + ' attempts: ' + message));
    });
  }

  tryGenerate();
};

ContentGenerator.prototype.calculateCost = function(usage) {
  // GPT-4o pricing as of early 2026
  var inputCost = (usage.prompt_tokens / 1000000) * 2.50;
  var outputCost = (usage.completion_tokens / 1000000) * 10.00;
  return inputCost + outputCost;
};

module.exports = ContentGenerator;

Batch Content Generation with Queue-Based Processing

For high-volume generation, you need a job queue. Individual HTTP requests will not scale. I use BullMQ with Redis because it handles retries, priorities, concurrency limits, and dead-letter queues out of the box.

// queue/contentQueue.js
var Queue = require('bullmq').Queue;
var Worker = require('bullmq').Worker;
var Redis = require('ioredis');

var connection = new Redis(process.env.REDIS_URL || 'redis://localhost:6379');

var contentQueue = new Queue('content-generation', { connection: connection });

function enqueueBatch(briefs, priority) {
  var jobs = briefs.map(function(brief, index) {
    return {
      name: 'generate-' + brief.slug,
      data: {
        brief: brief,
        batchId: Date.now().toString(36),
        index: index,
        totalInBatch: briefs.length
      },
      opts: {
        priority: priority || 5,
        attempts: 3,
        backoff: { type: 'exponential', delay: 5000 },
        removeOnComplete: { count: 1000 },
        removeOnFail: { count: 5000 }
      }
    };
  });

  return contentQueue.addBulk(jobs);
}

function createWorker(generator, template, concurrency) {
  var worker = new Worker('content-generation', function(job) {
    return new Promise(function(resolve, reject) {
      var brief = job.data.brief;
      console.log('[Worker] Processing job %s (%d/%d in batch %s)',
        brief.slug, job.data.index + 1, job.data.totalInBatch, job.data.batchId);

      job.updateProgress(10);

      generator.generate(template, brief, function(err, result) {
        if (err) {
          reject(err);
          return;
        }
        job.updateProgress(100);
        resolve(result);
      });
    });
  }, {
    connection: connection,
    concurrency: concurrency || 3,
    limiter: {
      max: 10,
      duration: 60000  // Max 10 jobs per minute to respect rate limits
    }
  });

  worker.on('completed', function(job, result) {
    console.log('[Worker] Completed: %s (%d tokens)', job.data.brief.slug, result.tokens.total_tokens);
  });

  worker.on('failed', function(job, err) {
    console.error('[Worker] Failed: %s — %s', job.data.brief.slug, err.message);
  });

  return worker;
}

module.exports = {
  contentQueue: contentQueue,
  enqueueBatch: enqueueBatch,
  createWorker: createWorker
};

The concurrency setting is critical. Set it too high and you will hit API rate limits. Set it too low and batch processing takes forever. I run 3 concurrent workers for GPT-4o and 10 for smaller models.
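
A rough way to size that number, rather than guessing: divide your provider's requests-per-minute allowance by how many requests one worker can issue per minute. This sizing heuristic is mine, not from any provider's documentation:

```javascript
// Estimate how many concurrent workers will saturate a provider's
// requests-per-minute limit, given average seconds per generation.
function safeConcurrency(requestsPerMinute, avgSecondsPerJob) {
  var jobsPerWorkerPerMinute = 60 / avgSecondsPerJob;
  return Math.max(1, Math.floor(requestsPerMinute / jobsPerWorkerPerMinute));
}
```

With a 10-requests-per-minute limit and jobs averaging 30 seconds, this suggests 5 workers; slower jobs allow more concurrency because each worker issues fewer requests per minute.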

Quality Control Layers

Quality control is where most content generation systems fail. Generating text is easy. Generating consistently good text requires multiple validation layers.

Layer 1: Automated Structural Checks

// quality/structuralCheck.js
function StructuralChecker(template) {
  this.template = template;
}

StructuralChecker.prototype.check = function(content) {
  var issues = [];
  var headings = content.match(/^#{1,6}\s.+$/gm) || [];
  var codeBlocks = content.match(/```[\s\S]*?```/g) || [];
  var wordCount = content.split(/\s+/).length;

  // Check minimum word count
  if (wordCount < 2000) {
    issues.push({
      severity: 'error',
      rule: 'min-word-count',
      message: 'Content is ' + wordCount + ' words, minimum is 2000'
    });
  }

  // Check heading hierarchy
  var h1Count = headings.filter(function(h) { return h.startsWith('# '); }).length;
  if (h1Count !== 1) {
    issues.push({
      severity: 'error',
      rule: 'single-h1',
      message: 'Expected 1 H1 heading, found ' + h1Count
    });
  }

  // Check for required sections
  var requiredSections = this.template.sections.filter(function(s) { return s.required; });
  requiredSections.forEach(function(section) {
    var sectionName = section.name.toLowerCase();
    var found = headings.some(function(h) {
      return h.toLowerCase().indexOf(sectionName) !== -1;
    });
    if (!found && sectionName !== 'maincontent') {
      issues.push({
        severity: 'warning',
        rule: 'missing-section',
        message: 'Required section "' + section.name + '" not found in headings'
      });
    }
  });

  // Check for code examples
  if (codeBlocks.length < 3) {
    issues.push({
      severity: 'warning',
      rule: 'min-code-blocks',
      message: 'Only ' + codeBlocks.length + ' code blocks found, expected at least 3'
    });
  }

  // Check for broken markdown
  var fenceCount = (content.match(/```/g) || []).length;
  if (fenceCount % 2 !== 0) {
    issues.push({
      severity: 'error',
      rule: 'unmatched-code-fence',
      message: 'Odd number of code fences detected — likely an unclosed block'
    });
  }

  return {
    passed: issues.filter(function(i) { return i.severity === 'error'; }).length === 0,
    wordCount: wordCount,
    headingCount: headings.length,
    codeBlockCount: codeBlocks.length,
    issues: issues
  };
};

module.exports = StructuralChecker;
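
To see what the checker's regexes actually do, here is a condensed, self-contained version of the same heading and fence logic run against a tiny draft:

```javascript
// Condensed version of the structural checks above, for experimentation.
function quickCheck(content) {
  var headings = content.match(/^#{1,6}\s.+$/gm) || [];
  var fences = (content.match(/```/g) || []).length;
  var h1Count = headings.filter(function(h) { return h.indexOf('# ') === 0; }).length;
  return {
    singleH1: h1Count === 1,
    fencesBalanced: fences % 2 === 0,
    headingCount: headings.length
  };
}

var draft = '# Title\n\n## Setup\n\n```js\nvar x = 1;\n```\n';
var report = quickCheck(draft);
```

Note that the heading regex is multiline (`m` flag) so `^` anchors per line; without it, only a heading on the first line would ever match.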

Layer 2: LLM-Based Quality Review

Use a second LLM call to review the generated content. This catches issues that structural checks miss: factual errors, awkward phrasing, off-topic sections.

// quality/llmReview.js
var axios = require('axios');

function LLMReviewer(apiKey) {
  this.apiKey = apiKey;
  this.model = 'gpt-4o-mini';  // Use a cheaper model for review
}

LLMReviewer.prototype.review = function(content, brief, callback) {
  var reviewPrompt = [
    'Review this technical article for quality. The original brief was:',
    'Title: ' + brief.title,
    'Category: ' + brief.category,
    'Target audience: software engineers',
    '',
    'Check for:',
    '1. Technical accuracy — are code examples correct and runnable?',
    '2. Completeness — does it cover the required topics?',
    '3. Tone — is it practical and authoritative, not generic?',
    '4. Code quality — are examples realistic and not oversimplified?',
    '5. SEO — does it have a clear structure with descriptive headings?',
    '',
    'Respond with JSON: { "score": 1-10, "passed": true/false, "issues": ["issue1", "issue2"], "suggestions": ["suggestion1"] }',
    'Set passed=true if score >= 7.',
    '',
    'Article to review:',
    content.substring(0, 8000)  // Truncate to save tokens
  ].join('\n');

  axios.post('https://api.openai.com/v1/chat/completions', {
    model: this.model,
    messages: [{ role: 'user', content: reviewPrompt }],
    max_tokens: 1024,
    temperature: 0.3,
    response_format: { type: 'json_object' }
  }, {
    headers: {
      'Authorization': 'Bearer ' + this.apiKey,
      'Content-Type': 'application/json'
    }
  }).then(function(response) {
    var review = JSON.parse(response.data.choices[0].message.content);
    callback(null, review);
  }).catch(function(err) {
    callback(err);
  });
};

module.exports = LLMReviewer;
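
One failure mode to plan for: the JSON.parse on the response will throw if the model ever wraps its JSON in prose or a code fence. With response_format set this should be rare, but older models and other providers do it, and the error then surfaces as a reviewer crash rather than a review result. A defensive parser is cheap insurance — this helper is my addition, not part of the reviewer above:

```javascript
// Tolerant parser for model output that should be JSON but may be wrapped
// in markdown fences or stray text. Returns null when nothing parses.
function parseReviewJSON(raw) {
  try {
    return JSON.parse(raw);
  } catch (e) {
    // Fall back to the outermost brace pair
    var start = raw.indexOf('{');
    var end = raw.lastIndexOf('}');
    if (start !== -1 && end > start) {
      try {
        return JSON.parse(raw.substring(start, end + 1));
      } catch (e2) {
        return null;
      }
    }
    return null;
  }
}
```

A null result can then route the content to human review instead of failing the job.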

Layer 3: Human Review Queue

Automated checks and LLM review catch 80% of issues. The remaining 20% require human eyes. Route flagged content to a review dashboard.

// quality/reviewQueue.js
function ReviewQueue(db) {
  this.db = db;
  this.collection = db.collection('content_reviews');
}

ReviewQueue.prototype.addToReview = function(contentId, reason, callback) {
  this.collection.insertOne({
    contentId: contentId,
    reason: reason,
    status: 'pending',
    assignedTo: null,
    createdAt: new Date(),
    reviewedAt: null,
    decision: null,
    notes: null
  }, callback);
};

ReviewQueue.prototype.getPendingReviews = function(limit, callback) {
  this.collection.find({ status: 'pending' })
    .sort({ createdAt: 1 })
    .limit(limit || 20)
    .toArray(callback);
};

ReviewQueue.prototype.submitReview = function(reviewId, decision, notes, reviewer, callback) {
  this.collection.updateOne(
    { _id: reviewId },
    {
      $set: {
        status: 'reviewed',
        decision: decision,  // 'approve', 'reject', 'revise'
        notes: notes,
        assignedTo: reviewer,
        reviewedAt: new Date()
      }
    },
    callback
  );
};

module.exports = ReviewQueue;

Content Style Guides as System Prompts

Your system prompt is your style guide. It determines 90% of the output quality. Here is my approach:

// styles/technicalArticle.js
var styleGuide = {
  voice: [
    'Write as a practitioner who has shipped this in production.',
    'Be opinionated — recommend specific approaches and explain why.',
    'Use "I" when sharing personal experience. Use "you" when addressing the reader.',
    'Avoid filler phrases like "In today\'s world" or "As we all know".',
    'Never start a sentence with "It is important to note that".',
    'Show real error messages, real output, real file sizes.',
    'Every code block must be runnable — no pseudocode unless explicitly labeled.'
  ],
  formatting: [
    'Use ATX-style headings (# not underline).',
    'Code blocks must have language tags (```javascript not ```).',
    'Use inline code for file names, function names, and CLI commands.',
    'Keep paragraphs to 3-4 sentences maximum.',
    'Use bullet lists for 3+ related items.'
  ],
  codeStyle: [
    'Use var instead of const/let.',
    'Use function() syntax instead of arrow functions.',
    'Use require() not import.',
    'Include error handling in all examples.',
    'Add comments only where the code is non-obvious.'
  ],
  buildSystemPrompt: function() {
    var sections = [
      '## Voice\n' + this.voice.join('\n'),
      '## Formatting\n' + this.formatting.join('\n'),
      '## Code Style\n' + this.codeStyle.join('\n')
    ];
    return sections.join('\n\n');
  }
};

module.exports = styleGuide;

Store style guides as versioned configuration. When you tweak the voice, you want to regenerate only the content that was produced with the old version.

Generating SEO-Optimized Content

SEO metadata should be generated alongside the content, not bolted on afterward. The LLM has full context during generation — use that.

// seo/metadataGenerator.js
function generateSEOMetadata(content, brief, callback) {
  var prompt = [
    'Given this article content and brief, generate SEO metadata.',
    '',
    'Title: ' + brief.title,
    'Category: ' + brief.category,
    '',
    'Generate JSON with:',
    '- metaTitle: under 60 characters, includes primary keyword',
    '- metaDescription: under 155 characters, compelling, includes keyword',
    '- slug: lowercase, hyphens, under 60 characters',
    '- primaryKeyword: the main search term',
    '- secondaryKeywords: array of 3-5 related terms',
    '- headings: optimized H2 headings (array of strings)',
    '- ogTitle: Open Graph title for social sharing',
    '- ogDescription: Open Graph description',
    '',
    'Article excerpt (first 2000 chars):',
    content.substring(0, 2000)
  ].join('\n');

  // The real LLM call follows the same axios pattern as the generator above.
  // As a deterministic stand-in here, derive the metadata from the brief:
  callback(null, {
    metaTitle: brief.title.substring(0, 60),
    metaDescription: brief.description.substring(0, 155),
    slug: brief.title.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/(^-|-$)/g, ''),
    primaryKeyword: brief.tags[0],
    secondaryKeywords: brief.tags.slice(1, 6)
  });
}

module.exports = { generateSEOMetadata: generateSEOMetadata };

For headings, I enforce a keyword density rule: the primary keyword must appear in the H1 and at least one H2. The LLM usually gets this right if you specify it in the template.
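
The check itself is mechanical, so it belongs in the automated layer rather than the human review queue. A sketch:

```javascript
// Verify the primary keyword appears in the H1 and in at least one H2.
function checkKeywordPlacement(markdown, keyword) {
  var kw = keyword.toLowerCase();
  var h1 = markdown.match(/^#\s+(.+)$/m);
  var h2s = markdown.match(/^##\s+(.+)$/gm) || [];
  var inH1 = Boolean(h1) && h1[1].toLowerCase().indexOf(kw) !== -1;
  var inH2 = h2s.some(function(h) { return h.toLowerCase().indexOf(kw) !== -1; });
  return { inH1: inH1, inH2: inH2, passed: inH1 && inH2 };
}
```

Because `^#\s` requires whitespace right after the hash, the H1 pattern does not accidentally match H2 or deeper headings.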

Multi-Format Output

Generated content needs to be stored in a neutral format and rendered to multiple outputs. Markdown is the obvious choice for the canonical format.

// formatters/multiFormat.js
var showdown = require('showdown');
var converter = new showdown.Converter({
  tables: true,
  ghCodeBlocks: true,
  tasklists: true,
  strikethrough: true
});

function ContentFormatter() {}

ContentFormatter.prototype.toHTML = function(markdown) {
  return converter.makeHtml(markdown);
};

ContentFormatter.prototype.toJSON = function(markdown, metadata) {
  var sections = [];
  var currentSection = null;
  var lines = markdown.split('\n');

  lines.forEach(function(line) {
    var headingMatch = line.match(/^(#{1,6})\s+(.+)$/);
    if (headingMatch) {
      if (currentSection) sections.push(currentSection);
      currentSection = {
        level: headingMatch[1].length,
        title: headingMatch[2],
        content: ''
      };
    } else if (currentSection) {
      currentSection.content += line + '\n';
    }
  });
  if (currentSection) sections.push(currentSection);

  return {
    metadata: metadata,
    sections: sections,
    wordCount: markdown.split(/\s+/).length,
    generatedAt: new Date().toISOString()
  };
};

ContentFormatter.prototype.toPlainText = function(markdown) {
  return markdown
    .replace(/```[\s\S]*?```/g, '[code block]')  // fenced blocks first, before inline-code stripping can mangle them
    .replace(/#{1,6}\s/g, '')
    .replace(/\*\*(.+?)\*\*/g, '$1')
    .replace(/\*(.+?)\*/g, '$1')
    .replace(/`(.+?)`/g, '$1')
    .replace(/!\[.+?\]\(.+?\)/g, '[image]')      // images before links, so the leading ! is not left behind
    .replace(/\[(.+?)\]\(.+?\)/g, '$1');
};

module.exports = ContentFormatter;

Content Scheduling and Publishing Workflows

A publishing workflow ties generation to your CMS. Content moves through states: draft, review, scheduled, published, archived.

// publishing/scheduler.js
var cron = require('node-cron');

function PublishingScheduler(cmsClient, db) {
  this.cms = cmsClient;
  this.db = db;
  this.collection = db.collection('scheduled_content');
}

PublishingScheduler.prototype.schedule = function(contentId, publishAt, callback) {
  this.collection.updateOne(
    { contentId: contentId },
    {
      $set: {
        contentId: contentId,
        publishAt: new Date(publishAt),
        status: 'scheduled',
        scheduledAt: new Date()
      }
    },
    { upsert: true },
    callback
  );
};

PublishingScheduler.prototype.startCron = function() {
  var self = this;

  // Check every 5 minutes for content due to publish
  cron.schedule('*/5 * * * *', function() {
    var now = new Date();
    self.collection.find({
      status: 'scheduled',
      publishAt: { $lte: now }
    }).toArray(function(err, items) {
      if (err) {
        console.error('[Scheduler] Error fetching scheduled items:', err.message);
        return;
      }

      items.forEach(function(item) {
        self.publish(item.contentId, function(pubErr) {
          if (pubErr) {
            console.error('[Scheduler] Failed to publish %s: %s', item.contentId, pubErr.message);
            self.collection.updateOne(
              { contentId: item.contentId },
              { $set: { status: 'failed', error: pubErr.message } }
            );
          } else {
            console.log('[Scheduler] Published: %s', item.contentId);
            self.collection.updateOne(
              { contentId: item.contentId },
              { $set: { status: 'published', publishedAt: now } }
            );
          }
        });
      });
    });
  });

  console.log('[Scheduler] Publishing cron started');
};

PublishingScheduler.prototype.publish = function(contentId, callback) {
  var self = this;
  self.db.collection('generated_content').findOne({ _id: contentId }, function(err, content) {
    if (err || !content) {
      callback(err || new Error('Content not found: ' + contentId));
      return;
    }

    self.cms.createEntry('article', {
      title: content.metadata.title,
      body: content.html,
      slug: content.metadata.slug,
      metaDescription: content.metadata.metaDescription,
      category: content.metadata.category,
      tags: content.metadata.tags,
      publishDate: new Date().toISOString()
    }, callback);
  });
};

module.exports = PublishingScheduler;

A/B Testing Generated Content Performance

When you generate content at scale, you can produce multiple variants and let analytics pick the winner.

// ab/variantManager.js
function VariantManager(db) {
  this.db = db;
  this.collection = db.collection('content_variants');
}

VariantManager.prototype.createVariants = function(contentId, variants, callback) {
  var docs = variants.map(function(variant, index) {
    return {
      contentId: contentId,
      variantId: contentId + '-v' + index,
      title: variant.title,
      metaDescription: variant.metaDescription,
      introduction: variant.introduction,
      trafficShare: 1 / variants.length,
      impressions: 0,
      clicks: 0,
      timeOnPage: 0,
      bounceRate: 0,
      createdAt: new Date()
    };
  });

  this.collection.insertMany(docs, callback);
};

VariantManager.prototype.selectVariant = function(contentId, callback) {
  // Weighted random selection based on traffic share
  this.collection.find({ contentId: contentId }).toArray(function(err, variants) {
    if (err || !variants.length) {
      callback(err || new Error('No variants found'));
      return;
    }

    var random = Math.random();
    var cumulative = 0;
    var selected = variants[0];

    for (var i = 0; i < variants.length; i++) {
      cumulative += variants[i].trafficShare;
      if (random <= cumulative) {
        selected = variants[i];
        break;
      }
    }

    callback(null, selected);
  });
};

VariantManager.prototype.recordImpression = function(variantId, metrics, callback) {
  this.collection.updateOne(
    { variantId: variantId },
    {
      $inc: { impressions: 1 },
      $set: { lastImpression: new Date() }
    },
    callback
  );
};

module.exports = VariantManager;
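
To let winners earn traffic over time, periodically recompute trafficShare from observed click-through rate. A minimal proportional scheme — the 10% exploration floor is an arbitrary choice of mine, and shares may need renormalizing after flooring:

```javascript
// Reallocate traffic proportional to each variant's CTR, keeping a minimum
// share so losing variants still collect data. After applying the floor,
// shares can sum to slightly more than 1; renormalize if that matters.
function reallocateShares(variants, minShare) {
  var floor = minShare || 0.1;
  var ctrs = variants.map(function(v) {
    return v.impressions > 0 ? v.clicks / v.impressions : 0;
  });
  var total = ctrs.reduce(function(sum, c) { return sum + c; }, 0);

  return variants.map(function(v, i) {
    var raw = total > 0 ? ctrs[i] / total : 1 / variants.length;
    return { variantId: v.variantId, trafficShare: Math.max(floor, raw) };
  });
}
```

Run this on a schedule (daily is plenty) and write the new shares back to the variants collection so selectVariant picks them up.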

Content Versioning and Revision History

Every piece of generated content should have a complete audit trail: who requested it, which template and model version produced it, what quality scores it received, and every revision.

// versioning/contentStore.js
function ContentStore(db) {
  this.db = db;
  this.content = db.collection('generated_content');
  this.revisions = db.collection('content_revisions');
}

ContentStore.prototype.save = function(contentData, callback) {
  var self = this;
  var doc = {
    title: contentData.title,
    slug: contentData.slug,
    currentVersion: 1,
    markdown: contentData.markdown,
    html: contentData.html,
    metadata: contentData.metadata,
    generation: {
      model: contentData.model,
      templateVersion: contentData.templateVersion,
      tokens: contentData.tokens,
      cost: contentData.cost,
      generatedAt: contentData.generatedAt
    },
    quality: contentData.qualityResults,
    status: 'draft',
    createdAt: new Date(),
    updatedAt: new Date()
  };

  self.content.insertOne(doc, function(err, result) {
    if (err) { callback(err); return; }

    // Save initial revision
    self.revisions.insertOne({
      contentId: result.insertedId,
      version: 1,
      markdown: contentData.markdown,
      changedBy: 'system',
      changeType: 'initial_generation',
      createdAt: new Date()
    }, function(revErr) {
      callback(revErr, result.insertedId);
    });
  });
};

ContentStore.prototype.update = function(contentId, newMarkdown, changedBy, changeType, callback) {
  var self = this;

  self.content.findOne({ _id: contentId }, function(err, existing) {
    if (err || !existing) {
      callback(err || new Error('Content not found'));
      return;
    }

    var newVersion = existing.currentVersion + 1;

    self.content.updateOne(
      { _id: contentId },
      {
        $set: {
          markdown: newMarkdown,
          currentVersion: newVersion,
          updatedAt: new Date()
        }
      },
      function(updateErr) {
        if (updateErr) { callback(updateErr); return; }

        self.revisions.insertOne({
          contentId: contentId,
          version: newVersion,
          markdown: newMarkdown,
          changedBy: changedBy,
          changeType: changeType,
          createdAt: new Date()
        }, callback);
      }
    );
  });
};

ContentStore.prototype.getHistory = function(contentId, callback) {
  this.revisions.find({ contentId: contentId })
    .sort({ version: -1 })
    .toArray(callback);
};

module.exports = ContentStore;

Handling Images and Media References in Generated Content

LLMs generate text, not images. Your pipeline needs a strategy for visual content: placeholder references that get resolved by a separate media service.

// media/imageResolver.js
function ImageResolver(config) {
  this.unsplashKey = config.unsplashKey;
  this.imageDir = config.imageDir || '/images/generated';
}

ImageResolver.prototype.resolveReferences = function(markdown, brief, callback) {
  var self = this;
  // Find image placeholders like [IMAGE: description of needed image]
  var placeholders = [];
  var regex = /\[IMAGE:\s*(.+?)\]/g;
  var match;

  while ((match = regex.exec(markdown)) !== null) {
    placeholders.push({
      fullMatch: match[0],
      description: match[1],
      index: match.index
    });
  }

  if (placeholders.length === 0) {
    callback(null, markdown);
    return;
  }

  var resolved = 0;
  var result = markdown;

  placeholders.forEach(function(placeholder) {
    self.findImage(placeholder.description, brief.category, function(err, imageUrl) {
      var replacement = err
        ? '<!-- Image not found: ' + placeholder.description + ' -->'
        : '![' + placeholder.description + '](' + imageUrl + ')';

      result = result.replace(placeholder.fullMatch, replacement);
      resolved++;

      if (resolved === placeholders.length) {
        callback(null, result);
      }
    });
  });
};

ImageResolver.prototype.findImage = function(description, category, callback) {
  // Search stock photo API or internal asset library
  var axios = require('axios');
  var query = encodeURIComponent(description + ' ' + category);

  axios.get('https://api.unsplash.com/search/photos?query=' + query + '&per_page=1', {
    headers: { 'Authorization': 'Client-ID ' + this.unsplashKey }
  }).then(function(response) {
    if (response.data.results.length > 0) {
      callback(null, response.data.results[0].urls.regular);
    } else {
      callback(new Error('No images found for: ' + description));
    }
  }).catch(function(err) {
    callback(err);
  });
};

module.exports = ImageResolver;

Cost Management for High-Volume Content Generation

At scale, LLM costs add up fast. A 3000-word article costs roughly $0.03-0.08 with GPT-4o. Multiply that by quality review calls, SEO generation, and variant testing, and a single article can cost $0.15-0.30. A thousand articles per month is $150-300 just in API calls.
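
The arithmetic behind those per-article numbers, using the same GPT-4o rates as calculateCost earlier. The token counts are rough assumptions: at roughly 1.3 tokens per word, a 3000-word article is around 4000 output tokens.

```javascript
// Back-of-envelope estimate with the per-million-token rates used earlier
// ($2.50 input, $10.00 output). Token counts are rough assumptions.
function estimateCost(promptTokens, outputTokens) {
  return (promptTokens / 1000000) * 2.50 + (outputTokens / 1000000) * 10.00;
}

// One generation pass: ~1000-token prompt, ~4000-token article
var generation = estimateCost(1000, 4000);

// Review pass: ~8000 truncated input tokens, ~1000 output tokens.
// Computed at 4o rates here; the actual reviewer uses a cheaper model.
var review = estimateCost(8000, 1000);
```

That puts a single generation pass around $0.04, consistent with the $0.03-0.08 range above, with review and SEO passes stacking on top.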

// cost/tracker.js
function CostTracker(db) {
  this.db = db;
  this.collection = db.collection('cost_tracking');
}

CostTracker.prototype.record = function(entry, callback) {
  this.collection.insertOne({
    contentId: entry.contentId,
    stage: entry.stage,  // 'generation', 'review', 'seo', 'revision'
    model: entry.model,
    inputTokens: entry.inputTokens,
    outputTokens: entry.outputTokens,
    cost: entry.cost,
    timestamp: new Date()
  }, callback);
};

CostTracker.prototype.getDailyCost = function(date, callback) {
  var startOfDay = new Date(date);
  startOfDay.setHours(0, 0, 0, 0);
  var endOfDay = new Date(date);
  endOfDay.setHours(23, 59, 59, 999);

  this.collection.aggregate([
    {
      $match: {
        timestamp: { $gte: startOfDay, $lte: endOfDay }
      }
    },
    {
      $group: {
        _id: '$stage',
        totalCost: { $sum: '$cost' },
        totalTokens: { $sum: { $add: ['$inputTokens', '$outputTokens'] } },
        count: { $sum: 1 }
      }
    }
  ]).toArray(callback);
};

CostTracker.prototype.checkBudget = function(dailyLimit, callback) {
  var self = this;
  self.getDailyCost(new Date(), function(err, results) {
    if (err) { callback(err); return; }

    var totalToday = results.reduce(function(sum, r) {
      return sum + r.totalCost;
    }, 0);

    callback(null, {
      spent: totalToday,
      limit: dailyLimit,
      remaining: dailyLimit - totalToday,
      overBudget: totalToday >= dailyLimit
    });
  });
};

module.exports = CostTracker;

Strategies that actually reduce costs:

  • Use cheaper models for review — GPT-4o-mini for quality checks, Claude Haiku for classification
  • Cache system prompts — Most providers now support prompt caching that cuts input costs by 50-90%
  • Batch API calls — OpenAI's batch API is 50% cheaper with 24-hour turnaround
  • Template token budgets — Set max_tokens per template to prevent runaway generation
  • Skip review for low-stakes content — Product descriptions do not need the same quality pipeline as technical articles
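The cheaper-model strategy is easiest to enforce when the stage-to-model pairing lives in one routing table rather than being hardcoded at each call site. A sketch, where the stage names and model IDs are assumptions matching this article's pipeline:

```javascript
// cost/modelRouter.js — route each pipeline stage to the cheapest model
// that handles it acceptably (stage names and model IDs are assumptions)
var STAGE_MODELS = {
  generation:     'gpt-4o',        // quality matters most here
  review:         'gpt-4o-mini',   // pass/fail scoring is a cheaper task
  classification: 'gpt-4o-mini',
  seo:            'gpt-4o-mini'
};

function modelForStage(stage) {
  if (!STAGE_MODELS[stage]) {
    throw new Error('Unknown pipeline stage: ' + stage);
  }
  return STAGE_MODELS[stage];
}

module.exports = { STAGE_MODELS: STAGE_MODELS, modelForStage: modelForStage };
```

When a cheaper model ships, you change one table instead of hunting through call sites.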

Complete Working Example

Here is a complete system that takes topic briefs, generates articles through a quality pipeline, and publishes to a CMS via API.

// index.js — Complete content generation pipeline
var express = require('express');
var ContentGenerator = require('./services/generator');
var StructuralChecker = require('./quality/structuralCheck');
var LLMReviewer = require('./quality/llmReview');
var ContentFormatter = require('./formatters/multiFormat');
var ContentStore = require('./versioning/contentStore');
var CostTracker = require('./cost/tracker');
var articleTemplate = require('./templates/article-template');
var MongoClient = require('mongodb').MongoClient;

var app = express();
app.use(express.json());

var db;
var generator;
var checker;
var reviewer;
var formatter;
var store;
var costTracker;

MongoClient.connect(process.env.MONGODB_URL || 'mongodb://localhost:27017', function(err, client) {
  if (err) {
    console.error('MongoDB connection failed:', err.message);
    process.exit(1);
  }

  db = client.db('content_pipeline');
  generator = new ContentGenerator({
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o',
    maxRetries: 3,
    tokenBudget: 4096
  });
  checker = new StructuralChecker(articleTemplate);
  reviewer = new LLMReviewer(process.env.OPENAI_API_KEY);
  formatter = new ContentFormatter();
  store = new ContentStore(db);
  costTracker = new CostTracker(db);

  console.log('Pipeline initialized');
});

// POST /generate — Submit a content brief
app.post('/generate', function(req, res) {
  if (!generator) {
    // MongoClient.connect has not finished yet
    return res.status(503).json({ error: 'Pipeline still initializing' });
  }

  var brief = req.body;

  if (!brief.title || !brief.category || !brief.topicSections) {
    return res.status(400).json({ error: 'Missing required fields: title, category, topicSections' });
  }

  console.log('[Pipeline] Starting generation for: %s', brief.title);

  // Stage 1: Generate content
  generator.generate(articleTemplate, brief, function(genErr, genResult) {
    if (genErr) {
      console.error('[Pipeline] Generation failed:', genErr.message);
      return res.status(500).json({ error: 'Generation failed', details: genErr.message });
    }

    costTracker.record({
      contentId: brief.title,
      stage: 'generation',
      model: 'gpt-4o',
      inputTokens: genResult.tokens.prompt_tokens,
      outputTokens: genResult.tokens.completion_tokens,
      cost: generator.calculateCost(genResult.tokens)
    }, function() {});

    // Stage 2: Structural quality check
    var structuralResult = checker.check(genResult.content);
    console.log('[Pipeline] Structural check: %s (%d words, %d issues)',
      structuralResult.passed ? 'PASSED' : 'FAILED',
      structuralResult.wordCount,
      structuralResult.issues.length);

    if (!structuralResult.passed) {
      return res.status(422).json({
        error: 'Content failed structural checks',
        issues: structuralResult.issues
      });
    }

    // Stage 3: LLM quality review
    reviewer.review(genResult.content, brief, function(revErr, reviewResult) {
      if (revErr) {
        console.error('[Pipeline] Review failed:', revErr.message);
        // Continue without review rather than blocking
        reviewResult = { score: 0, passed: true, issues: ['Review unavailable'] };
      }

      console.log('[Pipeline] LLM review: score=%d, passed=%s',
        reviewResult.score, reviewResult.passed);

      // Stage 4: Format and store
      var html = formatter.toHTML(genResult.content);
      var metadata = {
        title: brief.title,
        slug: brief.title.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, ''),
        category: brief.category,
        tags: brief.tags || [],
        difficulty: brief.difficulty,
        metaDescription: brief.description || ''
      };

      store.save({
        title: brief.title,
        slug: metadata.slug,
        markdown: genResult.content,
        html: html,
        metadata: metadata,
        model: 'gpt-4o',
        templateVersion: articleTemplate.version,
        tokens: genResult.tokens,
        cost: generator.calculateCost(genResult.tokens),
        generatedAt: genResult.generatedAt,
        qualityResults: {
          structural: structuralResult,
          llmReview: reviewResult
        }
      }, function(saveErr, contentId) {
        if (saveErr) {
          return res.status(500).json({ error: 'Failed to save content', details: saveErr.message });
        }

        console.log('[Pipeline] Content saved: %s (id: %s)', brief.title, contentId);

        res.json({
          contentId: contentId,
          title: brief.title,
          wordCount: structuralResult.wordCount,
          qualityScore: reviewResult.score,
          cost: generator.calculateCost(genResult.tokens),
          status: reviewResult.passed ? 'approved' : 'needs_review'
        });
      });
    });
  });
});

// GET /status — Pipeline health and cost summary
app.get('/status', function(req, res) {
  costTracker.getDailyCost(new Date(), function(err, costs) {
    if (err) {
      return res.status(500).json({ error: 'Failed to load cost summary', details: err.message });
    }
    res.json({
      status: 'running',
      totalTokensUsed: generator.totalTokensUsed,
      totalCost: generator.totalCost,
      dailyCosts: costs
    });
  });
});

var PORT = process.env.PORT || 3000;
app.listen(PORT, function() {
  console.log('Content generation pipeline running on port %d', PORT);
});

Submit a brief like this:

curl -X POST http://localhost:3000/generate \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Building REST APIs with Express.js",
    "category": "backend/api-development",
    "difficulty": "intermediate",
    "tags": ["express", "rest", "api", "nodejs"],
    "description": "A practical guide to building production REST APIs with Express.js",
    "topicSections": [
      "Route organization and middleware",
      "Request validation",
      "Error handling patterns",
      "Authentication middleware",
      "Rate limiting and security"
    ]
  }'

Expected output:

{
  "contentId": "65f2a1b3c4d5e6f7a8b9c0d1",
  "title": "Building REST APIs with Express.js",
  "wordCount": 2847,
  "qualityScore": 8,
  "cost": 0.0412,
  "status": "approved"
}

Common Issues and Troubleshooting

1. Rate Limit Errors During Batch Processing

Error: 429 Too Many Requests — Rate limit reached for gpt-4o
  on tokens per min (TPM): Limit 30000, Used 28500, Requested 4096

This happens when your worker concurrency is too high. Reduce concurrency and add inter-request delays. Use BullMQ's built-in rate limiter instead of manual setTimeout calls. For large batches, use OpenAI's batch API endpoint which processes jobs asynchronously at a 50% discount.
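When you do retry a 429, back off exponentially with jitter instead of retrying at a fixed interval, or the whole batch re-collides at once. A sketch of the retry wrapper, assuming the error object carries a numeric status field (the exact shape depends on your client library):

```javascript
// retry/backoff.js — exponential backoff with jitter for 429 retries
// (a sketch; tune baseMs/capMs to your account's rate limits)
function backoffDelay(attempt, baseMs, capMs) {
  // 2s, 4s, 8s, ... capped at capMs, plus up to 25% random jitter
  var exp = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return exp + Math.floor(Math.random() * exp * 0.25);
}

function retryOn429(fn, maxAttempts, callback) {
  var attempt = 0;
  function tryOnce() {
    fn(function(err, result) {
      // err.status is an assumption about the client library's error shape
      if (err && err.status === 429 && attempt < maxAttempts - 1) {
        attempt++;
        setTimeout(tryOnce, backoffDelay(attempt, 1000, 30000));
        return;
      }
      callback(err, result);
    });
  }
  tryOnce();
}

module.exports = { backoffDelay: backoffDelay, retryOn429: retryOn429 };
```

For queue workers, prefer BullMQ's limiter option over hand-rolled delays so the rate limit is enforced across all workers.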

2. Content Truncated Mid-Sentence

[Generator] Warning: finish_reason=length for "Building REST APIs"
  Generated 4096 tokens but content appears incomplete

The max_tokens budget was too low for the content length. Increase max_tokens to at least 8192 for long-form articles. Check finish_reason in every response — if it is length instead of stop, the content was cut off. Implement a continuation strategy that sends the truncated content back with "Continue from where you left off."

3. MongoDB Write Failures on High-Volume Inserts

MongoError: E11000 duplicate key error collection: content_pipeline.generated_content
  index: slug_1 dup key: { slug: "building-rest-apis-with-express-js" }

Slug collisions when generating content with similar titles. Add a short hash suffix to slugs: slug + '-' + Date.now().toString(36). Alternatively, use updateOne with upsert: true if you want to overwrite existing content with the same slug.
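Putting the slugify rule and the suffix together, a small helper keeps slug generation in one place:

```javascript
// utils/slug.js — slugify plus a short base36 suffix to avoid E11000
// duplicate key errors on the slug index
function slugify(title) {
  return title.toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');  // trim leading/trailing hyphens
}

function uniqueSlug(title) {
  return slugify(title) + '-' + Date.now().toString(36);
}

module.exports = { slugify: slugify, uniqueSlug: uniqueSlug };
```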

4. LLM Returns Malformed JSON in Review Stage

SyntaxError: Unexpected token 'T' at position 0
  Input: "The article is well-written and covers..."

Even with response_format: { type: 'json_object' }, LLMs occasionally return plain text. Wrap the JSON.parse call in a try-catch and fall back to a regex-based extraction. Always validate the parsed object has the expected fields before using it.

function safeParseReview(text) {
  function valid(parsed) {
    return parsed && typeof parsed.score === 'number' && typeof parsed.passed === 'boolean';
  }

  try {
    var parsed = JSON.parse(text);
    if (valid(parsed)) return parsed;
  } catch (e) {
    // Fall through to the code-block extraction below
  }

  // Try to extract JSON from a markdown code block
  var jsonMatch = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  if (jsonMatch) {
    try {
      var extracted = JSON.parse(jsonMatch[1]);
      if (valid(extracted)) return extracted;
    } catch (e2) {}
  }

  return { score: 0, passed: false, issues: ['Failed to parse review response'] };
}

5. Memory Exhaustion During Large Batch Processing

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

This occurs when you load all content into memory for processing. Stream results to disk instead of accumulating them. Process batches in chunks of 50-100 rather than loading 1000+ briefs at once. Set --max-old-space-size=4096 as a safety net but fix the underlying memory pattern.
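Chunked processing is a few lines of code: slice the briefs into fixed-size batches and only hold one batch in memory at a time. A sketch (processChunk is whatever per-batch work your pipeline does):

```javascript
// batch/chunks.js — process briefs in fixed-size chunks instead of
// loading everything into memory at once
function chunk(items, size) {
  var out = [];
  for (var i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

function processInChunks(briefs, size, processChunk, callback) {
  var batches = chunk(briefs, size);
  var index = 0;
  (function next(err) {
    if (err || index >= batches.length) return callback(err || null);
    // Only batches[index] is live; earlier results should be flushed
    // to disk or the database inside processChunk, not accumulated here
    processChunk(batches[index++], next);
  })();
}

module.exports = { chunk: chunk, processInChunks: processInChunks };
```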

Best Practices

  • Version everything — Templates, style guides, model versions, and prompts should all be versioned. When content quality drops, you need to trace exactly what changed.

  • Separate generation from publishing — The generation pipeline should produce content to a staging area. Publishing should be a separate, deliberate step with its own approval workflow.

  • Budget alerts, not just limits — Set daily cost alerts at 50%, 75%, and 90% of your budget. A runaway batch job can burn through hundreds of dollars in minutes if your retry logic has a bug.

  • Cache aggressively — If two briefs produce similar prompts, cache the result. Use content hashing on the full prompt (system + user) to detect duplicates before hitting the API.

  • Log every LLM call — Store the full request and response for every API call. This is essential for debugging quality issues, disputing billing, and training evaluation datasets.

  • Use the cheapest model that works — GPT-4o for generation, GPT-4o-mini for review, Haiku for classification. Do not use your most expensive model for every stage of the pipeline.

  • Test templates with a sample batch first — Before running 500 articles through a new template, generate 5 and review them manually. Template bugs are expensive at scale.

  • Implement circuit breakers — If 3 consecutive generation calls fail, pause the pipeline and alert. Do not keep hammering a broken API and burning through retry budgets.

  • Separate content briefs from generation logic — Briefs should be data. Templates should be configuration. Generation logic should be code. Keep all three cleanly separated so non-engineers can create briefs without touching code.
