Documentation Automation with AI

Automate documentation generation with AI, producing API docs, code comments, READMEs, and changelogs from source code in Node.js.

Overview

Documentation is the first thing to rot in any codebase. Engineers write it once, then the code changes and the docs stay frozen in time. AI-powered documentation automation solves this by scanning your actual source code, extracting structure and intent, and generating accurate, current documentation on demand. This article covers building a complete documentation pipeline in Node.js that produces API docs, JSDoc comments, READMEs, changelogs, and tutorials from your Express.js projects.

Prerequisites

  • Node.js 18+ installed
  • An OpenAI API key (GPT-4o or GPT-4o-mini)
  • Familiarity with Express.js routing patterns
  • Basic understanding of AST (Abstract Syntax Tree) concepts
  • Git installed (for changelog generation)
  • An Express.js project to document (we will build a sample)

The Documentation Problem

Every engineering team I have worked with has the same complaint: the docs are outdated. This is not a discipline problem. It is a systems problem. Documentation decays because it exists separately from the code it describes. You change a route handler, add a query parameter, rename a field in the response body, and the API docs still say the old thing. Nobody notices until a consumer files a bug.

The manual documentation workflow looks like this: write code, then context-switch into a different file or system, then describe what you just wrote, then hope you remembered every detail. Multiply that by every pull request across every engineer on the team. It does not scale.

What does scale is treating documentation as a build artifact. The same way you compile TypeScript or bundle frontend assets, you generate documentation from the source of truth: the code itself. AI makes this feasible because it can read code and produce human-readable explanations that were previously impossible to automate without brittle template systems.
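To make this concrete: in package.json, documentation becomes just another script target. A minimal sketch, assuming the generator CLI built later in this article (scripts/generate-docs.js); the test and release scripts are illustrative:

{
  "scripts": {
    "test": "jest",
    "docs": "node scripts/generate-docs.js",
    "release": "npm test && npm run docs"
  }
}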

Types of Documentation AI Can Generate

Not all documentation is created equal. AI excels at some types and struggles with others. Here is the breakdown from most to least automatable:

API Documentation - Route paths, HTTP methods, request parameters, response shapes. This is highly structured and lives directly in the code. AI can extract it with near-perfect accuracy.

JSDoc Comments - Function signatures, parameter types, return values, brief descriptions. The function itself contains almost everything the comment needs to say.

README Files - Project structure, installation instructions, available scripts, dependencies. Most of this lives in package.json and the file tree.

Changelogs - What changed between versions. Git commits and diffs contain the raw material. AI summarizes and groups them.

Tutorials - Step-by-step guides based on code examples. This requires the most human judgment and produces the least reliable output without review.

Architecture Documentation - System diagrams, data flow descriptions, design decisions. AI can draft these but lacks context about why decisions were made. Always needs heavy human editing.

Parsing Source Code for Documentation Context

Before you can generate documentation, you need to extract structured information from the codebase. There are two primary approaches: AST analysis for precise code structure, and regex-based scanning for quick extraction.

AST Analysis with Acorn

AST parsing gives you the actual structure of the code, not just pattern matches against text.

// lib/ast-parser.js
var acorn = require("acorn");
var walk = require("acorn-walk");
var fs = require("fs");

function extractFunctions(filePath) {
  var source = fs.readFileSync(filePath, "utf8");
  var ast = acorn.parse(source, {
    ecmaVersion: 2020,
    sourceType: "module",
    locations: true
  });

  var functions = [];

  walk.simple(ast, {
    FunctionDeclaration: function(node) {
      functions.push({
        name: node.id ? node.id.name : "(anonymous)",
        params: node.params.map(function(p) {
          return p.name || p.left && p.left.name || "unknown";
        }),
        line: node.loc.start.line,
        endLine: node.loc.end.line,
        body: source.substring(node.start, node.end)
      });
    },
    VariableDeclarator: function(node) {
      // Also catch const fn = function() {} and const fn = () => {},
      // which are not FunctionDeclarations in the AST
      if (node.init && (node.init.type === "FunctionExpression" ||
                        node.init.type === "ArrowFunctionExpression")) {
        functions.push({
          name: node.id.name,
          params: node.init.params.map(function(p) {
            return p.name || (p.left && p.left.name) || "unknown";
          }),
          line: node.loc.start.line,
          endLine: node.loc.end.line,
          body: source.substring(node.start, node.end)
        });
      }
    }
  });

  return functions;
}

module.exports = { extractFunctions: extractFunctions };

JSDoc Extraction

If your codebase already has partial JSDoc coverage, extract existing comments so the AI can preserve and extend them rather than starting from scratch.

// lib/jsdoc-extractor.js
var fs = require("fs");

function extractJSDocComments(filePath) {
  var source = fs.readFileSync(filePath, "utf8");
  var pattern = /\/\*\*[\s\S]*?\*\//g;
  var comments = [];
  var match;

  while ((match = pattern.exec(source)) !== null) {
    var lineNumber = source.substring(0, match.index).split("\n").length;
    comments.push({
      text: match[0],
      line: lineNumber,
      params: extractParams(match[0]),
      returns: extractReturns(match[0]),
      description: extractDescription(match[0])
    });
  }

  return comments;
}

function extractParams(comment) {
  var paramPattern = /@param\s+\{([^}]+)\}\s+(\w+)\s*[-–]?\s*(.*)/g;
  var params = [];
  var match;

  while ((match = paramPattern.exec(comment)) !== null) {
    params.push({
      type: match[1],
      name: match[2],
      description: match[3].trim()
    });
  }

  return params;
}

function extractReturns(comment) {
  var returnPattern = /@returns?\s+\{([^}]+)\}\s*(.*)/;
  var match = comment.match(returnPattern);
  if (match) {
    return { type: match[1], description: match[2].trim() };
  }
  return null;
}

function extractDescription(comment) {
  var lines = comment.split("\n");
  var desc = [];
  for (var i = 1; i < lines.length; i++) {
    var line = lines[i].replace(/^\s*\*\s?/, "").trim();
    if (line.startsWith("@")) break;
    if (line && line !== "/") desc.push(line);
  }
  return desc.join(" ");
}

module.exports = { extractJSDocComments: extractJSDocComments };

Generating API Documentation from Route Definitions

Express routes follow predictable patterns. You can extract method, path, middleware, and handler information by scanning route files.

// lib/route-scanner.js
var fs = require("fs");
var path = require("path");

function scanRouteFile(filePath) {
  var source = fs.readFileSync(filePath, "utf8");
  var routes = [];

  // Match router.get/post/put/delete/patch patterns
  var routePattern = /router\.(get|post|put|delete|patch)\(\s*["'`]([^"'`]+)["'`]/g;
  var match;

  while ((match = routePattern.exec(source)) !== null) {
    var method = match[1].toUpperCase();
    var routePath = match[2];

    // Extract the handler function body for context.
    // Note: this is naive brace counting; braces inside strings or
    // template literals in the handler can end the match early.
    var handlerStart = match.index;
    var braceCount = 0;
    var handlerEnd = handlerStart;
    var foundFirstBrace = false;

    for (var i = handlerStart; i < source.length; i++) {
      if (source[i] === "{") {
        braceCount++;
        foundFirstBrace = true;
      }
      if (source[i] === "}") {
        braceCount--;
      }
      if (foundFirstBrace && braceCount === 0) {
        handlerEnd = i + 1;
        break;
      }
    }

    var handlerBody = source.substring(handlerStart, handlerEnd);

    routes.push({
      method: method,
      path: routePath,
      file: filePath,
      handler: handlerBody,
      params: extractRouteParams(routePath),
      queryParams: extractQueryParams(handlerBody),
      responseFields: extractResponseFields(handlerBody)
    });
  }

  return routes;
}

function extractRouteParams(routePath) {
  var params = [];
  var paramPattern = /:(\w+)/g;
  var match;
  while ((match = paramPattern.exec(routePath)) !== null) {
    params.push(match[1]);
  }
  return params;
}

function extractQueryParams(handlerBody) {
  var params = [];
  var queryPattern = /req\.query\.(\w+)/g;
  var match;
  while ((match = queryPattern.exec(handlerBody)) !== null) {
    if (params.indexOf(match[1]) === -1) {
      params.push(match[1]);
    }
  }
  return params;
}

function extractResponseFields(handlerBody) {
  // Look for res.json() calls and extract object keys
  var jsonPattern = /res\.json\(\s*\{([^}]+)\}/;
  var match = handlerBody.match(jsonPattern);
  if (match) {
    var fieldPattern = /(\w+)\s*:/g;
    var fields = [];
    var fieldMatch;
    while ((fieldMatch = fieldPattern.exec(match[1])) !== null) {
      fields.push(fieldMatch[1]);
    }
    return fields;
  }
  return [];
}

function scanRouteDirectory(dirPath) {
  var allRoutes = [];
  var files = fs.readdirSync(dirPath);

  files.forEach(function(file) {
    if (path.extname(file) === ".js") {
      var filePath = path.join(dirPath, file);
      var routes = scanRouteFile(filePath);
      allRoutes = allRoutes.concat(routes);
    }
  });

  return allRoutes;
}

module.exports = {
  scanRouteFile: scanRouteFile,
  scanRouteDirectory: scanRouteDirectory
};

Auto-Generating JSDoc Comments for Functions

This is where the AI earns its keep. Given a function body, it generates a complete JSDoc comment that describes what the function does, what the parameters are, and what it returns.

// lib/jsdoc-generator.js
var OpenAI = require("openai");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function generateJSDoc(functionInfo) {
  var prompt = [
    "Generate a JSDoc comment for this JavaScript function.",
    "Include @param tags with types, @returns with type, and a brief description.",
    "Do NOT include the function itself, only the JSDoc comment.",
    "Be precise about types. Use {string}, {number}, {Object}, {Array}, {boolean}, {Promise} etc.",
    "Keep the description to one or two sentences.",
    "",
    "Function:",
    "```javascript",
    functionInfo.body,
    "```"
  ].join("\n");

  return client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "You are a technical documentation writer. Output only valid JSDoc comments, nothing else."
      },
      { role: "user", content: prompt }
    ],
    temperature: 0.2,
    max_tokens: 500
  }).then(function(response) {
    var comment = response.choices[0].message.content.trim();
    // Strip markdown code fences if the model wraps them
    comment = comment.replace(/^```[\w]*\n?/, "").replace(/\n?```$/, "");
    return {
      function: functionInfo.name,
      line: functionInfo.line,
      comment: comment
    };
  });
}

function generateJSDocsForFile(functions) {
  var promises = functions.map(function(fn) {
    return generateJSDoc(fn);
  });
  return Promise.all(promises);
}

module.exports = {
  generateJSDoc: generateJSDoc,
  generateJSDocsForFile: generateJSDocsForFile
};

README Generation from Project Structure

A README generator inspects the file tree, package.json, and existing configuration files to produce installation instructions, project structure, and available commands.

// lib/readme-generator.js
var fs = require("fs");
var path = require("path");
var OpenAI = require("openai");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function gatherProjectContext(projectRoot) {
  var context = {};

  // Read package.json
  var pkgPath = path.join(projectRoot, "package.json");
  if (fs.existsSync(pkgPath)) {
    context.packageJson = JSON.parse(fs.readFileSync(pkgPath, "utf8"));
  }

  // Build file tree (top 2 levels)
  context.fileTree = buildFileTree(projectRoot, 0, 2);

  // Check for common config files
  var configFiles = [
    ".env.example", "Dockerfile", "docker-compose.yml",
    ".eslintrc.json", "jest.config.js", "tsconfig.json",
    ".github/workflows/ci.yml"
  ];

  context.configs = [];
  configFiles.forEach(function(file) {
    var fullPath = path.join(projectRoot, file);
    if (fs.existsSync(fullPath)) {
      context.configs.push(file);
    }
  });

  // Check for existing README
  var readmePath = path.join(projectRoot, "README.md");
  if (fs.existsSync(readmePath)) {
    context.existingReadme = fs.readFileSync(readmePath, "utf8");
  }

  return context;
}

function buildFileTree(dirPath, currentDepth, maxDepth) {
  if (currentDepth >= maxDepth) return [];

  var entries = [];
  var items = fs.readdirSync(dirPath);

  var ignoreDirs = ["node_modules", ".git", "coverage", "dist", ".nyc_output"];

  items.forEach(function(item) {
    if (ignoreDirs.indexOf(item) !== -1) return;
    if (item.startsWith(".") && item !== ".env.example") return;

    var fullPath = path.join(dirPath, item);
    var stat = fs.statSync(fullPath);

    if (stat.isDirectory()) {
      entries.push({
        name: item + "/",
        type: "directory",
        children: buildFileTree(fullPath, currentDepth + 1, maxDepth)
      });
    } else {
      entries.push({ name: item, type: "file" });
    }
  });

  return entries;
}

function generateReadme(projectRoot) {
  var context = gatherProjectContext(projectRoot);

  var prompt = [
    "Generate a professional README.md for this Node.js project.",
    "",
    "Project name: " + (context.packageJson ? context.packageJson.name : "unknown"),
    "Description: " + (context.packageJson ? context.packageJson.description || "No description" : ""),
    "Version: " + (context.packageJson ? context.packageJson.version : "0.0.0"),
    "",
    "Scripts available:",
    JSON.stringify(context.packageJson ? context.packageJson.scripts : {}, null, 2),
    "",
    "Dependencies:",
    JSON.stringify(context.packageJson ? context.packageJson.dependencies : {}, null, 2),
    "",
    "File structure:",
    JSON.stringify(context.fileTree, null, 2),
    "",
    "Config files present: " + context.configs.join(", "),
    "",
    "Include these sections: Project Title, Description, Prerequisites,",
    "Installation, Usage, API Endpoints (if Express), Project Structure,",
    "Environment Variables, Testing, Deployment, License.",
    "Use markdown formatting. Be concise and practical."
  ].join("\n");

  return client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content: "You are a senior engineer writing clear, actionable README documentation. Output only markdown."
      },
      { role: "user", content: prompt }
    ],
    temperature: 0.3,
    max_tokens: 3000
  }).then(function(response) {
    return response.choices[0].message.content.trim();
  });
}

module.exports = { generateReadme: generateReadme };

Changelog Generation from Git Commit History

Changelogs are tedious to maintain manually but trivial to generate from git history. The key is grouping commits by type and filtering out noise.

// lib/changelog-generator.js
var execSync = require("child_process").execSync;
var OpenAI = require("openai");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function getCommitsSince(tag, cwd) {
  // Use %x1f (the ASCII unit separator) as the field delimiter so commit
  // messages that contain "|" do not break parsing
  var format = "%h%x1f%s%x1f%an%x1f%ai";
  var cmd = tag
    ? 'git log ' + tag + '..HEAD --pretty=format:"' + format + '" --no-merges'
    : 'git log --pretty=format:"' + format + '" --no-merges -50';

  try {
    var output = execSync(cmd, { cwd: cwd, encoding: "utf8" });
    return output.split("\n").filter(Boolean).map(function(line) {
      var parts = line.split("\x1f");
      return {
        hash: parts[0],
        message: parts[1],
        author: parts[2],
        date: parts[3]
      };
    });
  } catch (err) {
    console.error("Failed to read git log:", err.message);
    return [];
  }
}

function getLatestTag(cwd) {
  try {
    return execSync("git describe --tags --abbrev=0", {
      cwd: cwd,
      encoding: "utf8"
    }).trim();
  } catch (err) {
    return null;
  }
}

function generateChangelog(projectRoot, version) {
  var latestTag = getLatestTag(projectRoot);
  var commits = getCommitsSince(latestTag, projectRoot);

  if (commits.length === 0) {
    return Promise.resolve("No commits found since last tag.");
  }

  var commitList = commits.map(function(c) {
    return c.hash + " " + c.message + " (" + c.author + ")";
  }).join("\n");

  var prompt = [
    "Generate a changelog entry for version " + (version || "NEXT") + ".",
    "Group commits into: Added, Changed, Fixed, Removed, Security.",
    "Omit empty groups. Write human-readable bullet points, not raw commit messages.",
    "Combine related commits into single entries where appropriate.",
    "Skip trivial commits like typo fixes or merge commits.",
    "",
    "Raw commits:",
    commitList
  ].join("\n");

  return client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "You write concise changelogs following Keep a Changelog format. Output only markdown."
      },
      { role: "user", content: prompt }
    ],
    temperature: 0.2,
    max_tokens: 1500
  }).then(function(response) {
    return response.choices[0].message.content.trim();
  });
}

module.exports = { generateChangelog: generateChangelog };

Tutorial Generation from Code Examples

Tutorials require more human oversight than other doc types, but AI can produce solid first drafts when given well-structured examples.

// lib/tutorial-generator.js
var fs = require("fs");
var OpenAI = require("openai");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function generateTutorial(options) {
  var codeExample = fs.readFileSync(options.exampleFile, "utf8");

  var prompt = [
    "Write a step-by-step tutorial based on this code example.",
    "Title: " + options.title,
    "Target audience: " + (options.audience || "intermediate Node.js developers"),
    "",
    "Include: introduction, prerequisites, step-by-step walkthrough,",
    "explanation of each section, expected output, and common pitfalls.",
    "Break the code into logical steps. Do not dump the entire file at once.",
    "",
    "Code example:",
    "```javascript",
    codeExample,
    "```"
  ].join("\n");

  return client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content: "You write practical, hands-on tutorials for software engineers. Use clear step numbering and include code at each step."
      },
      { role: "user", content: prompt }
    ],
    temperature: 0.4,
    max_tokens: 4000
  }).then(function(response) {
    return response.choices[0].message.content.trim();
  });
}

module.exports = { generateTutorial: generateTutorial };

Implementing a Documentation Pipeline

The real power comes from wiring these generators into a single pipeline that scans your project, decides what needs documenting, generates the docs, and outputs formatted files.

// lib/doc-pipeline.js
var fs = require("fs");
var path = require("path");
var routeScanner = require("./route-scanner");
var astParser = require("./ast-parser");
var jsdocGenerator = require("./jsdoc-generator");
var readmeGenerator = require("./readme-generator");
var changelogGenerator = require("./changelog-generator");

function runPipeline(options) {
  var projectRoot = options.projectRoot;
  var outputDir = options.outputDir || path.join(projectRoot, "docs");

  if (!fs.existsSync(outputDir)) {
    fs.mkdirSync(outputDir, { recursive: true });
  }

  console.log("Documentation pipeline starting...");
  console.log("Project root:", projectRoot);
  console.log("Output directory:", outputDir);

  var results = {
    apiDocs: null,
    jsdocs: null,
    readme: null,
    changelog: null,
    errors: []
  };

  // Step 1: Scan routes
  return scanAndDocumentRoutes(projectRoot, outputDir, results)
    .then(function() {
      // Step 2: Generate JSDoc comments
      return generateJSDocs(projectRoot, outputDir, results);
    })
    .then(function() {
      // Step 3: Generate README
      return readmeGenerator.generateReadme(projectRoot)
        .then(function(readme) {
          results.readme = readme;
          var readmePath = path.join(outputDir, "README.generated.md");
          fs.writeFileSync(readmePath, readme);
          console.log("README generated:", readmePath);
        })
        .catch(function(err) {
          results.errors.push({ step: "readme", error: err.message });
        });
    })
    .then(function() {
      // Step 4: Generate changelog
      return changelogGenerator.generateChangelog(projectRoot, options.version)
        .then(function(changelog) {
          results.changelog = changelog;
          var changelogPath = path.join(outputDir, "CHANGELOG.generated.md");
          fs.writeFileSync(changelogPath, changelog);
          console.log("Changelog generated:", changelogPath);
        })
        .catch(function(err) {
          results.errors.push({ step: "changelog", error: err.message });
        });
    })
    .then(function() {
      // Step 5: Write summary report
      writeSummaryReport(outputDir, results);
      return results;
    });
}

function scanAndDocumentRoutes(projectRoot, outputDir, results) {
  var routesDir = path.join(projectRoot, "routes");

  if (!fs.existsSync(routesDir)) {
    console.log("No routes/ directory found, skipping API docs");
    return Promise.resolve();
  }

  var routes = routeScanner.scanRouteDirectory(routesDir);
  console.log("Found " + routes.length + " routes");

  var markdown = "# API Documentation\n\n";
  markdown += "Generated: " + new Date().toISOString() + "\n\n";

  routes.forEach(function(route) {
    markdown += "## " + route.method + " " + route.path + "\n\n";
    markdown += "**File:** `" + path.relative(projectRoot, route.file) + "`\n\n";

    if (route.params.length > 0) {
      markdown += "### Path Parameters\n\n";
      markdown += "| Parameter | Type |\n|-----------|------|\n";
      route.params.forEach(function(param) {
        markdown += "| `" + param + "` | string |\n";
      });
      markdown += "\n";
    }

    if (route.queryParams.length > 0) {
      markdown += "### Query Parameters\n\n";
      markdown += "| Parameter | Type |\n|-----------|------|\n";
      route.queryParams.forEach(function(param) {
        markdown += "| `" + param + "` | string |\n";
      });
      markdown += "\n";
    }

    if (route.responseFields.length > 0) {
      markdown += "### Response Fields\n\n";
      markdown += "| Field | Type |\n|-------|------|\n";
      route.responseFields.forEach(function(field) {
        markdown += "| `" + field + "` | - |\n";
      });
      markdown += "\n";
    }

    markdown += "---\n\n";
  });

  results.apiDocs = markdown;
  var apiDocPath = path.join(outputDir, "API.generated.md");
  fs.writeFileSync(apiDocPath, markdown);
  console.log("API docs generated:", apiDocPath);

  return Promise.resolve();
}

function generateJSDocs(projectRoot, outputDir, results) {
  var srcFiles = findJSFiles(projectRoot);
  console.log("Found " + srcFiles.length + " JS files to analyze");

  var allJSDocs = [];

  var chain = Promise.resolve();

  srcFiles.forEach(function(filePath) {
    chain = chain.then(function() {
      var functions = astParser.extractFunctions(filePath);
      if (functions.length === 0) return;

      return jsdocGenerator.generateJSDocsForFile(functions)
        .then(function(docs) {
          allJSDocs.push({
            file: path.relative(projectRoot, filePath),
            docs: docs
          });
        })
        .catch(function(err) {
          results.errors.push({
            step: "jsdoc",
            file: filePath,
            error: err.message
          });
        });
    });
  });

  return chain.then(function() {
    results.jsdocs = allJSDocs;
    var jsdocPath = path.join(outputDir, "JSDOC.generated.md");
    var markdown = "# Generated JSDoc Comments\n\n";

    allJSDocs.forEach(function(fileDoc) {
      markdown += "## " + fileDoc.file + "\n\n";
      fileDoc.docs.forEach(function(doc) {
        markdown += "### `" + doc.function + "()` (line " + doc.line + ")\n\n";
        markdown += "```javascript\n" + doc.comment + "\n```\n\n";
      });
    });

    fs.writeFileSync(jsdocPath, markdown);
    console.log("JSDoc comments generated:", jsdocPath);
  });
}

function findJSFiles(dirPath) {
  var results = [];
  var ignoreDirs = ["node_modules", ".git", "coverage", "dist", "docs", "test"];

  function walk(dir) {
    var items = fs.readdirSync(dir);
    items.forEach(function(item) {
      var fullPath = path.join(dir, item);
      var stat = fs.statSync(fullPath);

      if (stat.isDirectory()) {
        if (ignoreDirs.indexOf(item) === -1) {
          walk(fullPath);
        }
      } else if (path.extname(item) === ".js") {
        results.push(fullPath);
      }
    });
  }

  walk(dirPath);
  return results;
}

function writeSummaryReport(outputDir, results) {
  var report = "# Documentation Generation Report\n\n";
  report += "Generated: " + new Date().toISOString() + "\n\n";
  report += "## Results\n\n";
  report += "- API Documentation: " + (results.apiDocs ? "Generated" : "Skipped") + "\n";
  report += "- JSDoc Comments: " + (results.jsdocs ? results.jsdocs.length + " files processed" : "Skipped") + "\n";
  report += "- README: " + (results.readme ? "Generated" : "Skipped") + "\n";
  report += "- Changelog: " + (results.changelog ? "Generated" : "Skipped") + "\n";

  if (results.errors.length > 0) {
    report += "\n## Errors\n\n";
    results.errors.forEach(function(err) {
      report += "- **" + err.step + "**: " + err.error;
      if (err.file) report += " (" + err.file + ")";
      report += "\n";
    });
  }

  var reportPath = path.join(outputDir, "REPORT.md");
  fs.writeFileSync(reportPath, report);
  console.log("Summary report:", reportPath);
}

module.exports = { runPipeline: runPipeline };

Quality Checks for Generated Documentation

Generated docs need validation. Do not ship them blindly. Here are the checks I run on every generated artifact.

// lib/quality-checker.js
function checkAPIDocQuality(apiDocs, routes) {
  var issues = [];

  // Check coverage - every route should be documented
  routes.forEach(function(route) {
    var documented = apiDocs.indexOf(route.method + " " + route.path) !== -1;
    if (!documented) {
      issues.push({
        severity: "error",
        message: "Route not documented: " + route.method + " " + route.path
      });
    }
  });

  // Check for placeholder text the AI sometimes leaves in
  var placeholders = ["TODO", "FIXME", "INSERT", "PLACEHOLDER", "YOUR_", "EXAMPLE"];
  placeholders.forEach(function(placeholder) {
    if (apiDocs.indexOf(placeholder) !== -1) {
      issues.push({
        severity: "warning",
        message: "Possible placeholder text found: " + placeholder
      });
    }
  });

  // Check for empty sections
  var emptySection = /## .+\n\n##/;
  if (emptySection.test(apiDocs)) {
    issues.push({
      severity: "warning",
      message: "Empty documentation section detected"
    });
  }

  return issues;
}

function checkJSDocQuality(jsdocComment) {
  var issues = [];

  if (jsdocComment.indexOf("@param") === -1 &&
      jsdocComment.indexOf("@returns") === -1) {
    issues.push({
      severity: "warning",
      message: "JSDoc comment has no @param or @returns tags"
    });
  }

  // Check for generic descriptions
  var genericPhrases = [
    "This function does something",
    "Handles the request",
    "Processes the data"
  ];
  genericPhrases.forEach(function(phrase) {
    if (jsdocComment.toLowerCase().indexOf(phrase.toLowerCase()) !== -1) {
      issues.push({
        severity: "info",
        message: "Generic description detected: '" + phrase + "'"
      });
    }
  });

  return issues;
}

module.exports = {
  checkAPIDocQuality: checkAPIDocQuality,
  checkJSDocQuality: checkJSDocQuality
};
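The checker plugs into the pipeline right after the API markdown is assembled. A sketch, placed inside scanAndDocumentRoutes where markdown, routes, and results are all in scope:

// Inside scanAndDocumentRoutes, after building the markdown string
var qualityChecker = require("./quality-checker");

var issues = qualityChecker.checkAPIDocQuality(markdown, routes);
issues.forEach(function(issue) {
  console.log("[" + issue.severity + "] " + issue.message);
});
if (issues.some(function(i) { return i.severity === "error"; })) {
  results.errors.push({ step: "quality", error: "API doc coverage check failed" });
}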

Incremental Documentation Updates

Regenerating all documentation on every run is wasteful and slow. Track file hashes and only regenerate docs for files that have changed.

// lib/incremental.js
var fs = require("fs");
var path = require("path");
var crypto = require("crypto");

var CACHE_FILE = ".doc-cache.json";

function loadCache(projectRoot) {
  var cachePath = path.join(projectRoot, CACHE_FILE);
  if (fs.existsSync(cachePath)) {
    return JSON.parse(fs.readFileSync(cachePath, "utf8"));
  }
  return {};
}

function saveCache(projectRoot, cache) {
  var cachePath = path.join(projectRoot, CACHE_FILE);
  fs.writeFileSync(cachePath, JSON.stringify(cache, null, 2));
}

function hashFile(filePath) {
  var content = fs.readFileSync(filePath);
  return crypto.createHash("md5").update(content).digest("hex");
}

function getChangedFiles(projectRoot, files) {
  var cache = loadCache(projectRoot);
  var changed = [];
  var newCache = {};

  files.forEach(function(filePath) {
    var hash = hashFile(filePath);
    var relativePath = path.relative(projectRoot, filePath);
    newCache[relativePath] = hash;

    if (!cache[relativePath] || cache[relativePath] !== hash) {
      changed.push(filePath);
    }
  });

  return {
    changed: changed,
    cache: newCache,
    save: function() {
      saveCache(projectRoot, newCache);
    }
  };
}

module.exports = { getChangedFiles: getChangedFiles };

Use it in the pipeline by wrapping the JSDoc generation step:

var incremental = require("./incremental");

// Inside the pipeline, replace findJSFiles call:
var allFiles = findJSFiles(projectRoot);
var result = incremental.getChangedFiles(projectRoot, allFiles);
var filesToProcess = result.changed;

console.log(filesToProcess.length + " of " + allFiles.length + " files changed since last run");

// After successful generation:
result.save();

Integrating Documentation Generation into CI/CD

Add a GitHub Actions workflow that generates documentation on every push and commits the results if anything changed.

# .github/workflows/docs.yml
name: Generate Documentation

on:
  push:
    branches: [main]
    paths:
      - 'routes/**'
      - 'lib/**'
      - 'models/**'
      - 'app.js'

jobs:
  generate-docs:
    runs-on: ubuntu-latest
    # The default GITHUB_TOKEN needs write permission to push the docs commit
    permissions:
      contents: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - run: npm ci

      - name: Generate documentation
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: node scripts/generate-docs.js

      - name: Check for changes
        id: changes
        run: |
          # git diff misses untracked files, so use git status to catch
          # newly generated docs as well as modified ones
          if [ -n "$(git status --porcelain docs/)" ]; then
            echo "changed=true" >> $GITHUB_OUTPUT
          fi

      - name: Commit documentation
        if: steps.changes.outputs.changed == 'true'
        run: |
          git config user.name "docs-bot"
          git config user.email "docs-bot@example.com"
          git add docs/
          git commit -m "docs: auto-generate documentation [skip ci]"
          git push

The [skip ci] in the commit message prevents an infinite loop where the docs commit triggers another docs run. The paths filter on the workflow already excludes docs/, so this is a second layer of protection.

Maintaining Documentation Style Consistency

Without style guidance, AI-generated docs will be inconsistent across runs. Use a system prompt that enforces your team's conventions.

// lib/style-config.js
var STYLE_GUIDE = [
  "Follow these documentation style rules strictly:",
  "",
  "1. Use present tense ('Returns a user object', not 'Will return')",
  "2. Start function descriptions with a verb ('Fetches', 'Validates', 'Parses')",
  "3. Use backticks for parameter names, function names, and file paths",
  "4. Use 'Note:' for important callouts, not 'NB:' or 'Important:'",
  "5. Describe errors and edge cases when relevant",
  "6. Keep descriptions under 3 sentences",
  "7. Use American English spelling",
  "8. Do not use words like 'simply', 'just', 'easy', or 'obviously'",
  "9. Include units for numeric parameters (milliseconds, bytes, etc.)",
  "10. Always document side effects (file writes, network calls, mutations)"
].join("\n");

module.exports = { STYLE_GUIDE: STYLE_GUIDE };

Inject this into every LLM system prompt across your generators to get consistent output regardless of which model version you are using or which part of the pipeline is running.
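A sketch of the injection in the JSDoc generator; the same one-line change applies to the README, changelog, and tutorial generators:

// In lib/jsdoc-generator.js
var STYLE_GUIDE = require("./style-config").STYLE_GUIDE;

// Prepend the style guide to the existing system message
messages: [
  {
    role: "system",
    content: STYLE_GUIDE + "\n\n" +
      "You are a technical documentation writer. Output only valid JSDoc comments, nothing else."
  },
  { role: "user", content: prompt }
],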

Human Review Workflow for Generated Docs

AI-generated documentation is a draft, not a finished product. Build a review step into the process.

// scripts/review-docs.js
var fs = require("fs");
var path = require("path");

function createReviewChecklist(outputDir) {
  var files = fs.readdirSync(outputDir).filter(function(f) {
    return f.endsWith(".generated.md");
  });

  var checklist = "# Documentation Review Checklist\n\n";
  checklist += "Generated: " + new Date().toISOString() + "\n\n";

  files.forEach(function(file) {
    var content = fs.readFileSync(path.join(outputDir, file), "utf8");
    var wordCount = content.split(/\s+/).length;

    checklist += "## " + file + "\n\n";
    checklist += "- [ ] Accuracy: All technical details are correct\n";
    checklist += "- [ ] Completeness: No missing endpoints, params, or functions\n";
    checklist += "- [ ] Clarity: A new team member could understand this\n";
    checklist += "- [ ] No hallucinations: All referenced code actually exists\n";
    checklist += "- [ ] Style: Follows team documentation standards\n";
    checklist += "- Word count: " + wordCount + "\n\n";
  });

  var checklistPath = path.join(outputDir, "REVIEW_CHECKLIST.md");
  fs.writeFileSync(checklistPath, checklist);
  console.log("Review checklist created:", checklistPath);
  return checklistPath;
}

module.exports = { createReviewChecklist: createReviewChecklist };

The most important check is "no hallucinations." AI will sometimes document parameters that do not exist, invent response fields, or describe behavior the code does not actually implement. Every generated doc should be diffed against the actual source code by a human before merging.

Complete Working Example

Here is the entry point that ties everything together into a CLI tool you can run against any Express.js project.

// scripts/generate-docs.js
var path = require("path");
var pipeline = require("../lib/doc-pipeline");

var args = process.argv.slice(2);
var projectRoot = args[0] || process.cwd();
var version = args[1] || null;

if (!process.env.OPENAI_API_KEY) {
  console.error("Error: OPENAI_API_KEY environment variable is required");
  process.exit(1);
}

var options = {
  projectRoot: path.resolve(projectRoot),
  outputDir: path.join(path.resolve(projectRoot), "docs", "generated"),
  version: version
};

console.log("=== Documentation Automation Tool ===");
console.log("");
console.log("Project: " + options.projectRoot);
console.log("Output:  " + options.outputDir);
console.log("Version: " + (options.version || "auto-detect"));
console.log("");

pipeline.runPipeline(options)
  .then(function(results) {
    console.log("");
    console.log("=== Generation Complete ===");
    console.log("");

    if (results.errors.length > 0) {
      console.log("Completed with " + results.errors.length + " error(s):");
      results.errors.forEach(function(err) {
        console.log("  - [" + err.step + "] " + err.error);
      });
      process.exit(1);
    } else {
      console.log("All documentation generated successfully.");
      console.log("");
      console.log("Generated files:");
      console.log("  docs/generated/API.generated.md");
      console.log("  docs/generated/JSDOC.generated.md");
      console.log("  docs/generated/README.generated.md");
      console.log("  docs/generated/CHANGELOG.generated.md");
      console.log("  docs/generated/REPORT.md");
    }
  })
  .catch(function(err) {
    console.error("Pipeline failed:", err.message);
    console.error(err.stack);
    process.exit(1);
  });

Run it:

# Generate docs for current project
OPENAI_API_KEY=sk-... node scripts/generate-docs.js

# Generate docs for a specific project with version
OPENAI_API_KEY=sk-... node scripts/generate-docs.js /path/to/project v2.1.0

Expected output:

=== Documentation Automation Tool ===

Project: /home/user/my-express-app
Output:  /home/user/my-express-app/docs/generated
Version: v2.1.0

Documentation pipeline starting...
Project root: /home/user/my-express-app
Output directory: /home/user/my-express-app/docs/generated
Found 14 routes
API docs generated: /home/user/my-express-app/docs/generated/API.generated.md
Found 23 JS files to analyze
8 of 23 files changed since last run
JSDoc comments generated: /home/user/my-express-app/docs/generated/JSDOC.generated.md
README generated: /home/user/my-express-app/docs/generated/README.generated.md
Changelog generated: /home/user/my-express-app/docs/generated/CHANGELOG.generated.md
Summary report: /home/user/my-express-app/docs/generated/REPORT.md

=== Generation Complete ===

All documentation generated successfully.

Generated files:
  docs/generated/API.generated.md
  docs/generated/JSDOC.generated.md
  docs/generated/README.generated.md
  docs/generated/CHANGELOG.generated.md
  docs/generated/REPORT.md

Common Issues and Troubleshooting

1. AST Parsing Fails on Dynamic Requires

SyntaxError: Unexpected token (12:4)
  at Parser.raise (/node_modules/acorn/dist/acorn.js:3590:15)

Acorn rejects syntax newer than the configured ecmaVersion. The parser above uses ecmaVersion: 2020, which already covers optional chaining, but top-level await, class fields, and other post-2020 features need a bump:

var ast = acorn.parse(source, {
  ecmaVersion: 2022,
  sourceType: "module",
  allowReturnOutsideFunction: true,
  allowImportExportEverywhere: true
});

If a file still fails, catch the error and skip it rather than crashing the entire pipeline.
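A sketch of that guard, wrapped around the extractFunctions call in the pipeline's per-file chain:

var functions;
try {
  functions = astParser.extractFunctions(filePath);
} catch (err) {
  results.errors.push({ step: "ast-parse", file: filePath, error: err.message });
  console.warn("Skipping unparseable file:", filePath);
  return; // continue with the next file in the chain
}
if (functions.length === 0) return;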

2. Rate Limiting from OpenAI API

Error: 429 Too Many Requests
  Rate limit reached for gpt-4o-mini on tokens per min

When processing dozens of files, you will hit rate limits. Add a delay between API calls and process files sequentially rather than in parallel:

function delay(ms) {
  return new Promise(function(resolve) {
    setTimeout(resolve, ms);
  });
}

// Between each file
chain = chain.then(function() {
  return delay(500);  // 500ms between requests
}).then(function() {
  return processNextFile();
});

For larger projects, use GPT-4o-mini for JSDoc generation (high volume, lower complexity) and GPT-4o for README and tutorial generation (lower volume, higher quality needed).
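One way to encode that split is a small lookup module, so individual generators never hard-code a model name. The mapping below reflects this article's choices; adjust it to your own cost and quality tradeoffs:

// lib/models.js
var MODELS = {
  jsdoc: "gpt-4o-mini",      // high volume, structured input
  changelog: "gpt-4o-mini",  // commit summaries are short and simple
  readme: "gpt-4o",          // synthesizes multiple sources into prose
  tutorial: "gpt-4o"         // long-form output where coherence matters
};

function modelFor(docType) {
  return MODELS[docType] || "gpt-4o-mini";
}

module.exports = { modelFor: modelFor };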

3. Generated JSDoc Has Wrong Parameter Names

/**
 * @param {Object} request - The HTTP request object    <-- Wrong
 * @param {Object} response - The HTTP response object  <-- Wrong
 */
function getUser(req, res) { ... }

The AI sometimes expands abbreviated parameter names (req becomes request, res becomes response) instead of using the actual names from the code. Add a post-processing validation step:

function validateParamNames(jsdocComment, actualParams) {
  var docParams = [];
  var paramPattern = /@param\s+\{[^}]+\}\s+(\w+)/g;
  var match;

  while ((match = paramPattern.exec(jsdocComment)) !== null) {
    docParams.push(match[1]);
  }

  // Check that documented params match actual params
  for (var i = 0; i < actualParams.length; i++) {
    if (docParams[i] && docParams[i] !== actualParams[i]) {
      jsdocComment = jsdocComment.replace(
        new RegExp("@param\\s+(\\{[^}]+\\})\\s+" + docParams[i]),
        "@param $1 " + actualParams[i]
      );
    }
  }

  return jsdocComment;
}

4. Git Changelog Generation Returns Empty

fatal: No names found, cannot describe anything.

This happens when the repository has no tags. The getLatestTag function falls back to null, but getCommitsSince(null) should handle this by fetching the last 50 commits. If you still see empty output, check that the cwd option points to the actual git repository root:

var commits = getCommitsSince(null, projectRoot);
if (commits.length === 0) {
  // Check if we're actually in a git repo
  try {
    execSync("git rev-parse --git-dir", { cwd: projectRoot });
    console.log("Git repo found but no commits match the criteria");
  } catch (err) {
    console.error("Not a git repository:", projectRoot);
  }
}

5. Large Files Exceed Token Limits

Error: 400 This model's maximum context length is 128000 tokens.
  However, your messages resulted in 142856 tokens.

Some source files are too large to send to the API in a single request. Truncate function bodies to a reasonable size before sending:

var MAX_BODY_LENGTH = 3000; // characters

function truncateBody(body) {
  if (body.length <= MAX_BODY_LENGTH) return body;
  return body.substring(0, MAX_BODY_LENGTH) + "\n// ... truncated for documentation generation";
}

Best Practices

  • Generate docs from code, never the reverse. The code is the source of truth. If the documentation contradicts the code, the documentation is wrong. Never manually edit generated files; edit the generator or the source code instead.

  • Use different models for different doc types. GPT-4o-mini is fast and cheap for JSDoc comments where the function body gives it everything it needs. Use GPT-4o for READMEs and tutorials where the model needs to synthesize information from multiple sources and produce coherent prose.

  • Always include a human review gate. Automated generation is the first step, not the last. Every generated document should be reviewed by an engineer who knows the codebase before it ships. Create pull requests for doc changes so they go through the same review process as code.

  • Version your documentation generator alongside your code. The doc generator should live in the same repository as the project it documents. When the code changes, the generator and its output change in the same commit.

  • Cache aggressively, regenerate sparingly. Use file hashing to skip unchanged files. On a 100-file project, regenerating everything takes 3-5 minutes and costs $0.50-2.00 in API calls. Incremental runs finish in seconds and cost nearly nothing.

  • Keep generated files separate from hand-written docs. Use a naming convention like *.generated.md or a dedicated docs/generated/ directory. This makes it clear which files are safe to regenerate and which contain human-authored content that should never be overwritten.

  • Pin your model versions. Use gpt-4o-2024-08-06 not gpt-4o. Model aliases can change without notice, and when they do, your documentation output format and quality may shift. Pinned versions give you reproducible output.

  • Test your generators against known output. Keep a fixture file with a simple Express route and a snapshot of the expected documentation. Run this as a unit test to catch regressions in your generation pipeline; a minimal sketch follows this list.

  • Set temperature low for documentation. Use 0.1-0.3 for all documentation generation. You want deterministic, factual output. Higher temperatures introduce variation and occasional hallucinations that undermine trust in the generated docs.
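A minimal sketch of that fixture test for the route scanner, using Node's built-in assert module. The fixture path and its contents are illustrative:

// test/route-scanner.test.js
var assert = require("assert");
var routeScanner = require("../lib/route-scanner");

// test/fixtures/simple-route.js contains one route: router.get("/users/:id", ...)
var routes = routeScanner.scanRouteFile("test/fixtures/simple-route.js");

assert.strictEqual(routes.length, 1);
assert.strictEqual(routes[0].method, "GET");
assert.strictEqual(routes[0].path, "/users/:id");
assert.deepStrictEqual(routes[0].params, ["id"]);
console.log("route-scanner fixture test passed");

Snapshot the deterministic extraction layer this way. For the LLM layer, assert on the prompt you send rather than the response, since responses vary between runs even at low temperature.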
