AI-Assisted Code Review Implementation

Build AI-assisted code review with diff analysis, LLM-powered comments, security checks, and GitHub integration in Node.js.

AI-assisted code review transforms the traditional pull request workflow by catching issues that human reviewers routinely miss under time pressure. By combining deterministic rule checks with LLM-powered contextual analysis, you can build a review system that delivers consistent, thorough feedback on every PR without burning out your senior engineers. This guide walks through building a production-grade AI code review service in Node.js that parses diffs, applies security and quality checks, generates actionable comments with the Claude API, and posts them back to GitHub or Azure DevOps.

Prerequisites

  • Node.js 18+ installed
  • A GitHub account with a personal access token (or GitHub App credentials)
  • An Anthropic API key for Claude access
  • Familiarity with Express.js, webhooks, and Git diffs
  • Basic understanding of pull request workflows
  • A public-facing server or tunneling tool (ngrok) for webhook delivery

How AI Enhances Code Review

Manual code review suffers from three chronic problems: inconsistency, fatigue, and bottlenecks. A senior engineer reviewing their tenth PR on a Friday afternoon will miss things they would catch on Monday morning. AI review mitigates all three.

Catching patterns humans miss. Human reviewers are excellent at evaluating architecture and design intent, but they struggle with repetitive pattern detection across hundreds of lines of changed code. An LLM scanning a diff can identify that a new database query is missing parameterized inputs even when it appears in the middle of 400 changed lines. It can also spot that a try-catch block silently swallows errors, a pattern that humans skim past because the code "looks right" at a glance.

Consistency. AI review applies the same standards to every PR regardless of who authored it, what time of day it is, or how many other PRs are in the queue. This eliminates the social dynamics that plague human review: going easier on a senior engineer's code, being overly critical of a new hire, or rubber-stamping a PR because the submitter is blocked.

Speed. A typical AI review completes in 30-60 seconds. That means developers get initial feedback before they context-switch to another task. The faster the feedback loop, the cheaper the fix.

That said, AI review does not replace human review. It handles the mechanical checks so human reviewers can focus on architecture, business logic, and design decisions that require domain knowledge.

Parsing Diffs and Pull Requests Programmatically

The foundation of any AI code review system is the ability to parse unified diffs into structured data. GitHub delivers PR diffs in unified diff format, which looks straightforward but has edge cases you need to handle carefully.

var DIFF_HEADER_REGEX = /^diff --git a\/(.+) b\/(.+)$/;
var HUNK_HEADER_REGEX = /^@@ -(\d+),?(\d*) \+(\d+),?(\d*) @@(.*)$/;

function parseDiff(rawDiff) {
  var files = [];
  var currentFile = null;
  var currentHunk = null;
  var lines = rawDiff.split("\n");

  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    var fileMatch = line.match(DIFF_HEADER_REGEX);
    var hunkMatch = line.match(HUNK_HEADER_REGEX);

    if (fileMatch) {
      currentFile = {
        oldPath: fileMatch[1],
        newPath: fileMatch[2],
        hunks: [],
        additions: 0,
        deletions: 0
      };
      files.push(currentFile);
      currentHunk = null;
    } else if (hunkMatch && currentFile) {
      currentHunk = {
        oldStart: parseInt(hunkMatch[1], 10),
        oldLines: parseInt(hunkMatch[2] || "1", 10),
        newStart: parseInt(hunkMatch[3], 10),
        newLines: parseInt(hunkMatch[4] || "1", 10),
        context: hunkMatch[5].trim(),
        changes: []
      };
      currentFile.hunks.push(currentHunk);
    } else if (currentHunk) {
      if (line.startsWith("+") && !line.startsWith("+++")) {
        currentHunk.changes.push({
          type: "add",
          content: line.substring(1),
          // New-file line number: hunk start plus the added/context lines seen so far
          lineNumber: currentHunk.newStart + currentHunk.changes.filter(function(c) { return c.type !== "delete"; }).length
        });
        currentFile.additions++;
      } else if (line.startsWith("-") && !line.startsWith("---")) {
        currentHunk.changes.push({
          type: "delete",
          content: line.substring(1),
          // Old-file line number: hunk start plus the deleted/context lines seen so far
          lineNumber: currentHunk.oldStart + currentHunk.changes.filter(function(c) { return c.type !== "add"; }).length
        });
        currentFile.deletions++;
      } else if (line.startsWith(" ")) {
        currentHunk.changes.push({ type: "context", content: line.substring(1) });
      }
    }
  }

  return files;
}
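
A quick way to sanity-check the parser is to feed it a small hand-written diff. The file and contents below are made up for illustration:

var sampleDiff = [
  "diff --git a/src/sum.js b/src/sum.js",
  "--- a/src/sum.js",
  "+++ b/src/sum.js",
  "@@ -1,2 +1,3 @@",
  " function sum(a, b) {",
  "+  if (typeof a !== \"number\") throw new TypeError(\"expected a number\");",
  "   return a + b;"
].join("\n");

var parsed = parseDiff(sampleDiff);
console.log(parsed[0].newPath);    // "src/sum.js"
console.log(parsed[0].additions);  // 1
console.log(parsed[0].hunks[0].changes[1]);
// { type: "add", content: "  if (typeof a !== ...", lineNumber: 2 }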

To fetch the diff from GitHub, use the Octokit library:

var Octokit = require("@octokit/rest").Octokit;

function fetchPullRequestDiff(owner, repo, pullNumber) {
  var octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  return octokit.pulls.get({
    owner: owner,
    repo: repo,
    pull_number: pullNumber,
    mediaType: { format: "diff" }
  }).then(function(response) {
    return response.data;
  });
}

function fetchPullRequestFiles(owner, repo, pullNumber) {
  var octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  return octokit.pulls.listFiles({
    owner: owner,
    repo: repo,
    pull_number: pullNumber,
    per_page: 100
  }).then(function(response) {
    return response.data;
  });
}
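
Note that listFiles returns at most 100 files per page, so PRs with more files are silently truncated. If you expect larger changes, Octokit's built-in paginator can walk every page. A minimal sketch:

function fetchAllPullRequestFiles(owner, repo, pullNumber) {
  var octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  // octokit.paginate follows the Link headers and concatenates every page of results
  return octokit.paginate(octokit.pulls.listFiles, {
    owner: owner,
    repo: repo,
    pull_number: pullNumber,
    per_page: 100
  });
}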

Implementing Review Rules

Before involving an LLM, apply deterministic rules that are cheap to run and produce zero false positives. These catch the obvious stuff instantly.

var SECURITY_PATTERNS = [
  { pattern: /eval\s*\(/, severity: "critical", message: "Use of eval() detected. This creates code injection vulnerabilities." },
  { pattern: /innerHTML\s*=/, severity: "warning", message: "Direct innerHTML assignment risks XSS. Use textContent or a sanitization library." },
  { pattern: /process\.env\.\w+/, severity: "info", message: "Environment variable access detected. Verify this is not exposing secrets in client-side code." },
  { pattern: /new Function\s*\(/, severity: "critical", message: "Dynamic Function constructor is equivalent to eval(). Avoid in production code." },
  { pattern: /child_process/, severity: "warning", message: "Shell execution detected. Ensure inputs are sanitized to prevent command injection." },
  { pattern: /Math\.random\(\)/, severity: "warning", message: "Math.random() is not cryptographically secure. Use crypto.randomBytes() for security-sensitive values." },
  { pattern: /password|secret|api_key|apikey/i, severity: "warning", message: "Potential hardcoded credential detected. Move to environment variables." },
  { pattern: /SELECT\s+.*\s+FROM\s+.*\+\s*['"]?\w/, severity: "critical", message: "Possible SQL injection via string concatenation. Use parameterized queries." }
];

var QUALITY_PATTERNS = [
  { pattern: /console\.log\(/, severity: "suggestion", message: "Remove console.log before merging. Use a proper logging library." },
  { pattern: /TODO|FIXME|HACK|XXX/, severity: "suggestion", message: "Unresolved TODO/FIXME comment. Address or create a tracking issue." },
  { pattern: /catch\s*\(\s*\w+\s*\)\s*\{\s*\}/, severity: "warning", message: "Empty catch block swallows errors silently. At minimum, log the error." },
  { pattern: /\.then\(.*\.then\(.*\.then\(/, severity: "suggestion", message: "Deeply nested promise chain. Consider async/await for readability." }
];

function applyStaticRules(parsedFiles) {
  var findings = [];

  parsedFiles.forEach(function(file) {
    var allPatterns = SECURITY_PATTERNS.concat(QUALITY_PATTERNS);

    file.hunks.forEach(function(hunk) {
      hunk.changes.forEach(function(change) {
        if (change.type !== "add") return;

        allPatterns.forEach(function(rule) {
          if (rule.pattern.test(change.content)) {
            findings.push({
              file: file.newPath,
              line: change.lineNumber,
              severity: rule.severity,
              message: rule.message,
              source: "static-rule",
              code: change.content.trim()
            });
          }
        });
      });
    });
  });

  return findings;
}

Using LLMs to Generate Review Comments

Static rules catch known patterns. LLMs catch everything else: logic errors, naming problems, missing edge cases, and architectural concerns. The key is constructing a prompt that gives the model enough context to generate useful comments without hallucinating issues.

var Anthropic = require("@anthropic-ai/sdk");

var anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

function buildReviewPrompt(file, hunkContent, prDescription) {
  return [
    "You are an expert code reviewer. Analyze this code change and identify issues.",
    "",
    "Pull Request Description: " + (prDescription || "No description provided"),
    "",
    "File: " + file.newPath,
    "",
    "Code changes (unified diff format):",
    "```",
    hunkContent,
    "```",
    "",
    "Review for:",
    "1. Security vulnerabilities (injection, XSS, auth bypass, data exposure)",
    "2. Logic errors and edge cases (null checks, off-by-one, race conditions)",
    "3. Error handling gaps (uncaught exceptions, missing error responses)",
    "4. Performance issues (N+1 queries, unbounded loops, memory leaks)",
    "5. Code clarity (confusing naming, missing documentation for complex logic)",
    "",
    "For each issue found, respond with a JSON array of objects:",
    '{ "line": <line_number>, "severity": "critical|warning|suggestion", "message": "<concise issue description>", "suggestion": "<improved code if applicable>" }',
    "",
    "Only report genuine issues. If the code looks correct, return an empty array [].",
    "Focus ONLY on the added/changed lines (lines starting with +).",
    "Do not flag style preferences unless they impact readability significantly."
  ].join("\n");
}

function analyzeWithLLM(file, hunkContent, prDescription) {
  var prompt = buildReviewPrompt(file, hunkContent, prDescription);

  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 2048,
    messages: [{ role: "user", content: prompt }]
  }).then(function(response) {
    var text = response.content[0].text;
    var jsonMatch = text.match(/\[[\s\S]*\]/);

    if (!jsonMatch) return [];

    try {
      var findings = JSON.parse(jsonMatch[0]);
      return findings.map(function(f) {
        f.file = file.newPath;
        f.source = "llm-analysis";
        return f;
      });
    } catch (e) {
      console.error("Failed to parse LLM response:", e.message);
      return [];
    }
  });
}

Integrating with GitHub PR Webhooks

Your review service needs to receive webhook events when PRs are opened or updated, then post comments back to the PR. Here is the Express webhook handler (see the first troubleshooting item below for why signature verification should ultimately be done against the raw request body):

var express = require("express");
var crypto = require("crypto");

var app = express();
app.use(express.json());

function verifyWebhookSignature(req) {
  var signature = req.headers["x-hub-signature-256"];
  if (!signature) return false;

  var hmac = crypto.createHmac("sha256", process.env.GITHUB_WEBHOOK_SECRET);
  var digest = "sha256=" + hmac.update(JSON.stringify(req.body)).digest("hex");

  // timingSafeEqual throws if the buffers differ in length, so check that first
  if (signature.length !== digest.length) return false;

  return crypto.timingSafeEqual(
    Buffer.from(signature),
    Buffer.from(digest)
  );
}

app.post("/webhook/github", function(req, res) {
  if (!verifyWebhookSignature(req)) {
    return res.status(401).json({ error: "Invalid signature" });
  }

  var event = req.headers["x-github-event"];
  var action = req.body.action;

  if (event === "pull_request" && (action === "opened" || action === "synchronize")) {
    var pr = req.body.pull_request;
    var repo = req.body.repository;

    res.status(202).json({ status: "review started" });

    runReview({
      owner: repo.owner.login,
      repo: repo.name,
      pullNumber: pr.number,
      description: pr.body,
      headSha: pr.head.sha
    }).catch(function(err) {
      console.error("Review failed:", err);
    });
  } else {
    res.status(200).json({ status: "ignored" });
  }
});

For Azure DevOps, the webhook payload structure differs, but the pattern is identical:

app.post("/webhook/azuredevops", function(req, res) {
  var eventType = req.body.eventType;

  if (eventType === "git.pullrequest.created" || eventType === "git.pullrequest.updated") {
    var resource = req.body.resource;

    res.status(202).json({ status: "review started" });

    runAzureDevOpsReview({
      // NOTE: the organization is not the project name; derive it from the
      // collection/account information in the payload or from configuration
      // (AZURE_DEVOPS_ORG here is an example environment variable name).
      organization: process.env.AZURE_DEVOPS_ORG,
      project: resource.repository.project.name,
      repositoryId: resource.repository.id,
      pullRequestId: resource.pullRequestId
    }).catch(function(err) {
      console.error("Azure DevOps review failed:", err);
    });
  } else {
    res.status(200).json({ status: "ignored" });
  }
});

Prioritizing Review Findings

Not all findings deserve equal attention. Flooding a PR with dozens of "suggestion" level comments is noisy and counterproductive. Implement a priority system that controls what gets posted:

var SEVERITY_WEIGHTS = {
  critical: 10,
  warning: 5,
  suggestion: 1,
  info: 0
};

var DEFAULT_STRICTNESS = {
  maxComments: 20,
  minSeverity: "suggestion",
  postSummary: true,
  blockOnCritical: true
};

function filterAndPrioritize(findings, strictness) {
  var config = Object.assign({}, DEFAULT_STRICTNESS, strictness || {});
  var severityOrder = ["critical", "warning", "suggestion", "info"];
  var minIndex = severityOrder.indexOf(config.minSeverity);

  var filtered = findings.filter(function(f) {
    return severityOrder.indexOf(f.severity) <= minIndex;
  });

  filtered.sort(function(a, b) {
    return (SEVERITY_WEIGHTS[b.severity] || 0) - (SEVERITY_WEIGHTS[a.severity] || 0);
  });

  if (filtered.length > config.maxComments) {
    var truncated = filtered.slice(0, config.maxComments);
    truncated.push({
      severity: "info",
      message: "Showing " + config.maxComments + " of " + filtered.length + " total findings. Increase review strictness to see all.",
      source: "system"
    });
    return truncated;
  }

  return filtered;
}

Configuring Review Strictness Levels

Different repositories and teams need different review thresholds. A prototype repo should not get the same scrutiny as a payment processing service:

var STRICTNESS_PRESETS = {
  relaxed: {
    maxComments: 10,
    minSeverity: "warning",
    postSummary: true,
    blockOnCritical: false,
    skipFiles: ["*.test.js", "*.spec.js", "*.md", "package-lock.json"]
  },
  standard: {
    maxComments: 20,
    minSeverity: "suggestion",
    postSummary: true,
    blockOnCritical: true,
    skipFiles: ["package-lock.json", "*.md"]
  },
  strict: {
    maxComments: 50,
    minSeverity: "info",
    postSummary: true,
    blockOnCritical: true,
    skipFiles: ["package-lock.json"]
  }
};

function getStrictnessForRepo(owner, repo) {
  var key = owner + "/" + repo;
  var overrides = {
    "myorg/payments-service": "strict",
    "myorg/internal-tools": "relaxed"
  };

  return STRICTNESS_PRESETS[overrides[key] || "standard"];
}

Detecting Anti-Patterns with LLM Analysis

Beyond line-level issues, LLMs excel at detecting higher-level anti-patterns that span multiple changes. You can construct a dedicated prompt for architectural concerns:

function detectAntiPatterns(parsedFiles, prDescription) {
  var fileList = parsedFiles.map(function(f) {
    return f.newPath + " (+" + f.additions + "/-" + f.deletions + ")";
  }).join("\n");

  var prompt = [
    "Analyze this pull request for architectural anti-patterns.",
    "",
    "PR Description: " + (prDescription || "None"),
    "Files changed:",
    fileList,
    "",
    "Look for these anti-patterns:",
    "- God object: Single file with too many responsibilities",
    "- Shotgun surgery: Small change requiring modifications across many files",
    "- Feature envy: Code that heavily accesses another module's data",
    "- Primitive obsession: Using raw strings/numbers where a type would be clearer",
    "- Missing abstraction: Duplicated logic across files that should be extracted",
    "- Tight coupling: Direct dependencies that should go through an interface",
    "",
    "Return a JSON array of found patterns:",
    '{ "pattern": "<name>", "severity": "warning|suggestion", "message": "<explanation>", "files": ["<affected files>"] }',
    "",
    "Return an empty array if no anti-patterns are detected."
  ].join("\n");

  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }]
  }).then(function(response) {
    var text = response.content[0].text;
    var jsonMatch = text.match(/\[[\s\S]*\]/);
    if (!jsonMatch) return [];

    try {
      return JSON.parse(jsonMatch[0]);
    } catch (e) {
      return [];
    }
  });
}

Checking for Security Vulnerabilities

Security checks deserve special treatment because the stakes are higher. Run a dedicated security-focused LLM pass on files that handle authentication, input parsing, or data access:

var SECURITY_SENSITIVE_PATHS = [
  /auth/i, /login/i, /session/i, /token/i,
  /password/i, /crypto/i, /middleware/i,
  /routes?\//i, /api\//i, /controller/i
];

function isSecuritySensitive(filePath) {
  return SECURITY_SENSITIVE_PATHS.some(function(pattern) {
    return pattern.test(filePath);
  });
}

function deepSecurityReview(file, fullContent) {
  var prompt = [
    "Perform a thorough security review of this code change.",
    "",
    "File: " + file.newPath,
    "```",
    fullContent,
    "```",
    "",
    "Check specifically for:",
    "1. SQL injection, NoSQL injection, command injection",
    "2. Cross-site scripting (XSS) in output rendering",
    "3. Authentication and authorization bypass",
    "4. Insecure cryptographic practices",
    "5. Path traversal vulnerabilities",
    "6. Insecure deserialization",
    "7. Server-Side Request Forgery (SSRF)",
    "8. Exposure of sensitive data in logs or responses",
    "9. Missing input validation or sanitization",
    "10. Insecure default configurations",
    "",
    "Return JSON array. Each finding must include:",
    '{ "vulnerability": "<CWE category>", "severity": "critical|warning", "line": <number>, "message": "<description>", "remediation": "<how to fix>" }',
    "",
    "Only report genuine vulnerabilities. False positives erode trust."
  ].join("\n");

  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 2048,
    messages: [{ role: "user", content: prompt }]
  }).then(function(response) {
    var text = response.content[0].text;
    var jsonMatch = text.match(/\[[\s\S]*\]/);
    if (!jsonMatch) return [];

    try {
      var findings = JSON.parse(jsonMatch[0]);
      return findings.map(function(f) {
        f.file = file.newPath;
        f.source = "security-review";
        return f;
      });
    } catch (e) {
      return [];
    }
  });
}

Handling Large PRs

Large pull requests are the enemy of effective code review, both human and AI. When a PR exceeds token limits, you need to chunk the diff intelligently:

var MAX_TOKENS_PER_CHUNK = 3000;

function estimateTokens(text) {
  // Rough heuristic: roughly four characters per token for English text and code
  return Math.ceil(text.length / 4);
}

function chunkDiffForReview(parsedFiles) {
  var chunks = [];
  var currentChunk = [];
  var currentTokens = 0;

  parsedFiles.forEach(function(file) {
    file.hunks.forEach(function(hunk) {
      var hunkText = hunk.changes.map(function(c) {
        var prefix = c.type === "add" ? "+" : c.type === "delete" ? "-" : " ";
        return prefix + c.content;
      }).join("\n");

      var hunkTokens = estimateTokens(hunkText);

      if (currentTokens + hunkTokens > MAX_TOKENS_PER_CHUNK && currentChunk.length > 0) {
        chunks.push(currentChunk.slice());
        currentChunk = [];
        currentTokens = 0;
      }

      currentChunk.push({
        file: file.newPath,
        hunk: hunk,
        text: hunkText
      });
      currentTokens += hunkTokens;
    });
  });

  if (currentChunk.length > 0) {
    chunks.push(currentChunk);
  }

  return chunks;
}

function generatePRSummary(parsedFiles, prDescription) {
  var summary = parsedFiles.map(function(f) {
    return f.newPath + ": +" + f.additions + " / -" + f.deletions;
  }).join("\n");

  var prompt = [
    "Summarize this pull request in 3-5 bullet points.",
    "",
    "PR Description: " + (prDescription || "None"),
    "Files changed:",
    summary,
    "",
    "Focus on: what changed, why it might have changed, and any risks."
  ].join("\n");

  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 512,
    messages: [{ role: "user", content: prompt }]
  }).then(function(response) {
    return response.content[0].text;
  });
}

Review Templates Per File Type

Different file types warrant different review focus. A database migration file needs schema validation, while a React component needs accessibility checks:

var FILE_TYPE_TEMPLATES = {
  ".js": {
    focus: ["error handling", "input validation", "async correctness", "security"],
    ignore: ["formatting"]
  },
  ".sql": {
    focus: ["migration safety", "backward compatibility", "index usage", "data loss risk"],
    ignore: ["naming conventions"]
  },
  ".json": {
    focus: ["schema validity", "sensitive data exposure", "version changes"],
    ignore: ["formatting"]
  },
  ".yml": {
    focus: ["syntax validity", "security credentials", "configuration correctness"],
    ignore: []
  },
  ".dockerfile": {
    focus: ["base image pinning", "layer optimization", "security (running as root)", "secret exposure"],
    ignore: []
  }
};

function getTemplateForFile(filePath) {
  var path = require("path");
  var ext = path.extname(filePath).toLowerCase();

  if (filePath.toLowerCase().includes("dockerfile")) {
    return FILE_TYPE_TEMPLATES[".dockerfile"];
  }

  return FILE_TYPE_TEMPLATES[ext] || {
    focus: ["correctness", "security", "error handling"],
    ignore: []
  };
}

function buildTemplatedPrompt(file, hunkContent, template) {
  return [
    "Review this code change with focus on: " + template.focus.join(", ") + ".",
    template.ignore.length > 0 ? "Do not flag issues related to: " + template.ignore.join(", ") + "." : "",
    "",
    "File: " + file.newPath,
    "```",
    hunkContent,
    "```",
    "",
    "Return JSON array of findings with line, severity, and message fields.",
    "Return empty array if no issues found."
  ].join("\n");
}

Tracking Review Effectiveness Metrics

An AI review system that does not track its own effectiveness is flying blind. Measure precision (are the comments useful?) and recall (are issues still getting through?):

function trackReviewMetrics(reviewId, findings, prMeta) {
  var metrics = {
    reviewId: reviewId,
    timestamp: new Date().toISOString(),
    repo: prMeta.owner + "/" + prMeta.repo,
    pullNumber: prMeta.pullNumber,
    totalFindings: findings.length,
    bySeverity: {
      critical: findings.filter(function(f) { return f.severity === "critical"; }).length,
      warning: findings.filter(function(f) { return f.severity === "warning"; }).length,
      suggestion: findings.filter(function(f) { return f.severity === "suggestion"; }).length
    },
    bySource: {
      staticRule: findings.filter(function(f) { return f.source === "static-rule"; }).length,
      llmAnalysis: findings.filter(function(f) { return f.source === "llm-analysis"; }).length,
      securityReview: findings.filter(function(f) { return f.source === "security-review"; }).length
    },
    filesReviewed: prMeta.filesChanged,
    linesChanged: prMeta.additions + prMeta.deletions,
    reviewDurationMs: prMeta.durationMs
  };

  console.log("Review metrics:", JSON.stringify(metrics, null, 2));
  return metrics;
}

function trackFeedback(reviewId, commentId, reaction) {
  // Track thumbs up/down on review comments to measure precision
  // reaction: "helpful" | "not_helpful" | "false_positive"
  var feedback = {
    reviewId: reviewId,
    commentId: commentId,
    reaction: reaction,
    timestamp: new Date().toISOString()
  };

  // Store in your metrics database for analysis
  console.log("Review feedback:", JSON.stringify(feedback));
  return feedback;
}

Over time, analyze these metrics to tune your system. If a particular static rule generates more than 30% false positives, disable it. If the LLM consistently misses a class of issues, add a static rule for it.
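
One way to act on that guidance, assuming you persist the records from trackFeedback along with an identifier for the rule or prompt that produced each comment (the ruleId field below is hypothetical), is to compute a false-positive rate per rule:

function computeFalsePositiveRates(feedbackRecords) {
  var byRule = {};

  feedbackRecords.forEach(function(record) {
    var stats = byRule[record.ruleId] || { total: 0, falsePositives: 0 };
    stats.total++;
    if (record.reaction === "false_positive") stats.falsePositives++;
    byRule[record.ruleId] = stats;
  });

  // Return the rules that exceed the 30% false-positive threshold
  return Object.keys(byRule).map(function(ruleId) {
    return {
      ruleId: ruleId,
      total: byRule[ruleId].total,
      falsePositiveRate: byRule[ruleId].falsePositives / byRule[ruleId].total
    };
  }).filter(function(r) {
    return r.falsePositiveRate > 0.3;
  });
}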

Avoiding False Positives and Noise

False positives are the number one reason teams abandon AI review tools. Every incorrect comment trains the developer to ignore the bot. Defend against this aggressively:

var IGNORE_PATTERNS = [
  /package-lock\.json$/,
  /yarn\.lock$/,
  /\.min\.js$/,
  /\.map$/,
  /\.svg$/,
  /node_modules\//,
  /dist\//,
  /build\//,
  /coverage\//
];

function shouldSkipFile(filePath, strictness) {
  var minimatch = require("minimatch");

  if (IGNORE_PATTERNS.some(function(p) { return p.test(filePath); })) {
    return true;
  }

  if (strictness && strictness.skipFiles) {
    return strictness.skipFiles.some(function(glob) {
      return minimatch(filePath, glob);
    });
  }

  return false;
}

function deduplicateFindings(findings) {
  var seen = {};

  return findings.filter(function(f) {
    var key = f.file + ":" + f.line + ":" + f.severity;
    if (seen[key]) return false;
    seen[key] = true;
    return true;
  });
}

Also include an escape hatch. Developers should be able to suppress specific warnings with inline comments:

function isCommentSuppressed(change, previousLine) {
  var suppressPattern = /\/\/\s*ai-review-ignore|\/\/\s*noqa|\/\*\s*ai-review-disable\s*\*\//;
  return suppressPattern.test(change.content) || suppressPattern.test(previousLine || "");
}
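
The helper above is not yet wired into applyStaticRules. A minimal way to honor the suppression markers is to check each added line, and the line before it, before running any rules. This variant assumes the same patterns and parse structure defined earlier:

function applyStaticRulesWithSuppression(parsedFiles) {
  var findings = [];
  var allPatterns = SECURITY_PATTERNS.concat(QUALITY_PATTERNS);

  parsedFiles.forEach(function(file) {
    file.hunks.forEach(function(hunk) {
      hunk.changes.forEach(function(change, index) {
        if (change.type !== "add") return;

        // Skip lines the developer explicitly opted out of reviewing
        var previous = hunk.changes[index - 1];
        if (isCommentSuppressed(change, previous ? previous.content : "")) return;

        allPatterns.forEach(function(rule) {
          if (rule.pattern.test(change.content)) {
            findings.push({
              file: file.newPath,
              line: change.lineNumber,
              severity: rule.severity,
              message: rule.message,
              source: "static-rule",
              code: change.content.trim()
            });
          }
        });
      });
    });
  });

  return findings;
}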

Complete Working Example

Here is the full orchestration that ties everything together. This runReview function receives PR metadata, fetches the diff, runs static and LLM analysis, and posts the results back to GitHub:

var Octokit = require("@octokit/rest").Octokit;
var Anthropic = require("@anthropic-ai/sdk");

var octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
var anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

function runReview(prMeta) {
  var startTime = Date.now();
  var allFindings = [];
  var strictness = getStrictnessForRepo(prMeta.owner, prMeta.repo);

  console.log("Starting review for " + prMeta.owner + "/" + prMeta.repo + " #" + prMeta.pullNumber);

  return fetchPullRequestDiff(prMeta.owner, prMeta.repo, prMeta.pullNumber)
    .then(function(rawDiff) {
      var parsedFiles = parseDiff(rawDiff);

      // Filter out files we should skip
      parsedFiles = parsedFiles.filter(function(f) {
        return !shouldSkipFile(f.newPath, strictness);
      });

      // Step 1: Static rule analysis (instant)
      var staticFindings = applyStaticRules(parsedFiles);
      allFindings = allFindings.concat(staticFindings);

      // Step 2: LLM analysis (chunked for large PRs)
      var chunks = chunkDiffForReview(parsedFiles);
      var llmPromises = chunks.map(function(chunk) {
        var combinedText = chunk.map(function(c) {
          return "// File: " + c.file + "\n" + c.text;
        }).join("\n\n");

        // Note: a chunk can span multiple files, but findings from this call are
        // attributed to the first file in the chunk; split chunks per file if
        // precise attribution matters.
        return analyzeWithLLM(
          { newPath: chunk[0].file },
          combinedText,
          prMeta.description
        );
      });

      // Step 3: Deep security review for sensitive files
      var securityFiles = parsedFiles.filter(function(f) {
        return isSecuritySensitive(f.newPath);
      });

      var securityPromises = securityFiles.map(function(file) {
        var content = file.hunks.map(function(h) {
          return h.changes.map(function(c) {
            return c.content;
          }).join("\n");
        }).join("\n");

        return deepSecurityReview(file, content);
      });

      // Step 4: Anti-pattern detection
      var antiPatternPromise = detectAntiPatterns(parsedFiles, prMeta.description);

      // Step 5: PR summary
      var summaryPromise = generatePRSummary(parsedFiles, prMeta.description);

      return Promise.all([
        Promise.all(llmPromises),
        Promise.all(securityPromises),
        antiPatternPromise,
        summaryPromise,
        parsedFiles
      ]);
    })
    .then(function(results) {
      var llmResults = results[0];
      var securityResults = results[1];
      var antiPatterns = results[2];
      var summary = results[3];
      var parsedFiles = results[4];

      // Merge all findings
      llmResults.forEach(function(batch) {
        allFindings = allFindings.concat(batch);
      });
      securityResults.forEach(function(batch) {
        allFindings = allFindings.concat(batch);
      });
      antiPatterns.forEach(function(ap) {
        allFindings.push({
          severity: ap.severity,
          message: "[Anti-pattern: " + ap.pattern + "] " + ap.message,
          source: "anti-pattern",
          file: ap.files ? ap.files[0] : null
        });
      });

      // Deduplicate and prioritize
      allFindings = deduplicateFindings(allFindings);
      allFindings = filterAndPrioritize(allFindings, strictness);

      var hasCritical = allFindings.some(function(f) {
        return f.severity === "critical";
      });

      // Post results to GitHub
      return postReviewComments(prMeta, allFindings, summary, hasCritical)
        .then(function() {
          prMeta.durationMs = Date.now() - startTime;
          prMeta.filesChanged = parsedFiles.length;
          prMeta.additions = parsedFiles.reduce(function(sum, f) { return sum + f.additions; }, 0);
          prMeta.deletions = parsedFiles.reduce(function(sum, f) { return sum + f.deletions; }, 0);
          trackReviewMetrics("review-" + Date.now(), allFindings, prMeta);
        });
    });
}

function postReviewComments(prMeta, findings, summary, hasCritical) {
  var reviewEvent = hasCritical ? "REQUEST_CHANGES" : "COMMENT";

  // Build summary body
  var body = "## AI Code Review Summary\n\n";
  body += summary + "\n\n";
  body += "### Findings: " + findings.length + " total\n";

  var criticalCount = findings.filter(function(f) { return f.severity === "critical"; }).length;
  var warningCount = findings.filter(function(f) { return f.severity === "warning"; }).length;
  var suggestionCount = findings.filter(function(f) { return f.severity === "suggestion"; }).length;

  if (criticalCount > 0) body += "- **Critical:** " + criticalCount + "\n";
  if (warningCount > 0) body += "- **Warning:** " + warningCount + "\n";
  if (suggestionCount > 0) body += "- **Suggestion:** " + suggestionCount + "\n";

  // Build inline comments
  var comments = findings
    .filter(function(f) { return f.file && f.line; })
    .map(function(f) {
      var severityIcon = f.severity === "critical" ? "🔴" : f.severity === "warning" ? "🟡" : "🔵";
      var commentBody = severityIcon + " **" + f.severity.toUpperCase() + "**: " + f.message;

      if (f.suggestion) {
        commentBody += "\n\n**Suggested fix:**\n```suggestion\n" + f.suggestion + "\n```";
      }

      return {
        path: f.file,
        line: f.line,
        body: commentBody
      };
    });

  return octokit.pulls.createReview({
    owner: prMeta.owner,
    repo: prMeta.repo,
    pull_number: prMeta.pullNumber,
    commit_id: prMeta.headSha,
    body: body,
    event: reviewEvent,
    comments: comments
  });
}

// Start the server
var PORT = process.env.PORT || 3000;
app.listen(PORT, function() {
  console.log("AI Code Review service running on port " + PORT);
});

Common Issues and Troubleshooting

1. Webhook signature verification fails with "Invalid signature"

Error: Invalid signature
  at verifyWebhookSignature (/app/server.js:15:11)

This happens when the request body is parsed before signature verification. Express's express.json() middleware replaces the raw bytes with a parsed object, so JSON.stringify(req.body) may not reproduce the original payload byte for byte (key order and whitespace can differ). Use a raw body parser for the webhook route:

app.post("/webhook/github",
  express.raw({ type: "application/json" }),
  function(req, res) {
    var rawBody = req.body; // Buffer, not parsed JSON
    var hmac = crypto.createHmac("sha256", process.env.GITHUB_WEBHOOK_SECRET);
    var digest = "sha256=" + hmac.update(rawBody).digest("hex");
    // ... verify signature against raw body
    var payload = JSON.parse(rawBody);
  }
);

2. LLM returns malformed JSON causing review to crash

SyntaxError: Unexpected token 'I' at position 0
  at JSON.parse (<anonymous>)
  at analyzeWithLLM (/app/review.js:84:22)

The model sometimes wraps JSON in markdown fences or adds explanatory text. The regex text.match(/\[[\s\S]*\]/) usually handles this, but it fails when the model responds with plain prose such as "I found no issues." Always wrap the parse in a try-catch and return an empty array on failure, and add a system message reinforcing the JSON-only response format, as sketched below.
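
The Messages API accepts a top-level system parameter for exactly this purpose. A small sketch (the function name is illustrative):

function analyzeWithStrictJSON(prompt) {
  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 2048,
    // System prompt reinforces the JSON-only contract before the user prompt is seen
    system: "You are a code review engine. Respond with a JSON array only. " +
            "Do not wrap the JSON in markdown fences or add any other text.",
    messages: [{ role: "user", content: prompt }]
  });
}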

3. GitHub API rate limiting with 403 responses

HttpError: API rate limit exceeded for installation. Rate limit resets at 2026-01-15T14:30:00Z
  at /app/node_modules/@octokit/request/dist-node/index.js:86:21

When reviewing multiple PRs concurrently, you can burn through GitHub's API rate limit (5000 requests/hour for authenticated requests). Implement exponential backoff and batch comment posting:

function postWithRetry(fn, maxRetries) {
  var attempt = 0;

  function tryRequest() {
    return fn().catch(function(err) {
      if (err.status === 403 && attempt < maxRetries) {
        attempt++;
        var delay = Math.pow(2, attempt) * 1000;
        console.log("Rate limited. Retrying in " + delay + "ms");
        return new Promise(function(resolve) {
          setTimeout(resolve, delay);
        }).then(tryRequest);
      }
      throw err;
    });
  }

  return tryRequest();
}

4. Token limit exceeded on large diffs

Error: prompt is too long: 215847 tokens > 200000 maximum
  at createMessage (/app/node_modules/@anthropic-ai/sdk/src/resources/messages.ts:122:15)

The chunking logic prevents this for most PRs, but monorepo diffs with hundreds of files can still exceed limits. Add a hard cap that falls back to summary-only mode:

var MAX_FILES_FOR_FULL_REVIEW = 50;

function shouldUseFullReview(parsedFiles) {
  var totalLines = parsedFiles.reduce(function(sum, f) {
    return sum + f.additions + f.deletions;
  }, 0);

  return parsedFiles.length <= MAX_FILES_FOR_FULL_REVIEW && totalLines <= 2000;
}

When full review is not feasible, run only static rules and the PR summary, and post a comment suggesting the PR be split into smaller changes.
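
A sketch of that fallback path, posting a single PR-level comment via the issues endpoint instead of a line-by-line review (the function name is illustrative):

function postFallbackReview(prMeta, staticFindings, summary) {
  var body = "## AI Code Review (summary-only mode)\n\n" +
    "This pull request is too large for a full line-by-line review.\n\n" +
    summary + "\n\n" +
    "Static checks found " + staticFindings.length + " issue(s). " +
    "Consider splitting this change into smaller pull requests.";

  // Pull requests are issues, so issue comments work as PR-level comments
  return octokit.issues.createComment({
    owner: prMeta.owner,
    repo: prMeta.repo,
    issue_number: prMeta.pullNumber,
    body: body
  });
}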

5. Duplicate review comments on force-pushed PRs

Already reviewed commit abc123. Posting duplicate comments on def456.

When a developer force-pushes, GitHub fires a synchronize event. Without deduplication, you will post a second batch of identical comments. Track reviewed commits and skip re-reviews or dismiss the previous review before posting a new one:

var reviewedCommits = {};

function hasBeenReviewed(prKey, commitSha) {
  if (reviewedCommits[prKey] === commitSha) return true;
  reviewedCommits[prKey] = commitSha;
  return false;
}
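
The in-memory map above is lost whenever the service restarts, so persist it if that matters. To dismiss the bot's previous review before posting a new one, list the PR's reviews and dismiss the latest one authored by your bot account. A sketch, assuming the bot's login is known (only reviews in the CHANGES_REQUESTED state can be dismissed):

function dismissPreviousReview(prMeta, botLogin) {
  return octokit.pulls.listReviews({
    owner: prMeta.owner,
    repo: prMeta.repo,
    pull_number: prMeta.pullNumber
  }).then(function(response) {
    var previous = response.data.filter(function(review) {
      return review.user && review.user.login === botLogin && review.state === "CHANGES_REQUESTED";
    }).pop();

    if (!previous) return null;

    return octokit.pulls.dismissReview({
      owner: prMeta.owner,
      repo: prMeta.repo,
      pull_number: prMeta.pullNumber,
      review_id: previous.id,
      message: "Superseded by a new automated review."
    });
  });
}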

Best Practices

  • Run static rules before LLM analysis. Deterministic checks are free and instant. Use them as a first pass so the LLM can focus on higher-level concerns. This also reduces API costs by filtering out obvious issues early.

  • Keep review comments actionable. Every comment should tell the developer what to do, not just what is wrong. "This function is complex" is useless. "Extract the validation logic into a separate validateInput function to reduce cyclomatic complexity" is actionable.

  • Set a maximum comment count per PR. More than 20 comments overwhelms the developer and causes them to ignore all of them. Prioritize by severity and cap output. If you hit the cap, add a summary note indicating how many findings were suppressed.

  • Version your review prompts. Treat prompts like code. Store them in version control, tag releases, and track which prompt version produced which results. When you tweak a prompt, you need to know whether precision improved or regressed.

  • Use different models for different tasks. Fast, cheap models (Claude Haiku) work well for static rule augmentation and file-type classification. Reserve more capable models (Claude Sonnet) for security reviews and architectural analysis where accuracy matters most. A minimal routing helper is sketched after this list.

  • Implement a feedback loop. Add thumbs up/down reactions to review comments and track the ratio. If a particular rule or prompt consistently gets thumbs down, it needs tuning. Without feedback data, you are guessing at effectiveness.

  • Respect developer autonomy. AI review should inform, not block. Use REQUEST_CHANGES only for genuine security vulnerabilities. Everything else should be a comment that the developer can choose to address or dismiss.

  • Handle failures gracefully. If the LLM API is down, post a comment saying the automated review could not complete rather than silently failing. Developers should know whether the bot ran or not.

  • Test your review system against known vulnerable code. Maintain a set of intentionally vulnerable code samples and verify your system catches them. This is your regression test suite for review accuracy.
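
As a sketch of the model-routing idea from the list above, a small lookup keyed by task keeps the choice in one place. The Haiku model ID is an assumption; substitute whichever fast, inexpensive model your account has access to:

var MODEL_BY_TASK = {
  "file-classification": "claude-3-5-haiku-20241022",  // assumed fast/cheap model ID
  "rule-augmentation": "claude-3-5-haiku-20241022",
  "security-review": "claude-sonnet-4-20250514",
  "anti-pattern-detection": "claude-sonnet-4-20250514"
};

function modelForTask(task) {
  return MODEL_BY_TASK[task] || "claude-sonnet-4-20250514";
}

Routing cheaper models to high-volume, low-risk tasks keeps per-PR review costs predictable while reserving the stronger model for the analyses where accuracy matters most.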
