
When Your Document Pipeline Hallucinates: Building Guardrails for AI-Powered Data Extraction

Five-layer guardrail architecture for AI document extraction pipelines that fabricate data instead of failing visibly.

A thread on Hacker News last week described something that should stop every engineer building AI document pipelines cold: an AI system didn't misread a receipt. It fabricated one. Not an incorrect interpretation of existing data, but a complete confabulation of a document that didn't match what was actually submitted.

If you're building any system that extracts structured data from documents (invoices, receipts, contracts, forms, medical records), you need to read that thread, and then you need to build guardrails. This is how.


Why Document Extraction Hallucinates

Understanding the failure mode helps you build the right defenses.

Large language models and document AI models are trained to produce plausible output. When given an ambiguous or low-quality document (a blurry scan, a partially obscured field, a non-standard format), the model fills in gaps based on what's statistically likely. It's not lying. It's doing what it was trained to do: produce coherent, complete-looking output from incomplete input.

The problem is that "plausible" and "accurate" are different things. A plausible invoice has a vendor name, a date, line items, and a total. If the model can't clearly read the date, it will produce a date that looks reasonable: not necessarily the correct one.

This is categorically different from a parsing error, which fails visibly. A hallucinated extraction succeeds silently. The data looks good. Your validation passes. The wrong number hits your database and eventually your accounting system or your insurance claim or your patient record.


Layer 1: Schema Enforcement

The first guardrail is also the most basic: define exactly what the output should look like and reject anything that doesn't conform.

If you're extracting invoice data, your schema might look like:

const invoiceSchema = {
  vendor_name: { type: 'string', required: true, max_length: 255 },
  invoice_number: { type: 'string', required: true, pattern: /^[A-Z0-9\-\/]{1,50}$/ },
  invoice_date: { type: 'date', required: true, not_future: true },
  due_date: { type: 'date', required: false },
  line_items: {
    type: 'array',
    min_items: 1,
    item_schema: {
      description: { type: 'string', required: true },
      quantity: { type: 'number', positive: true },
      unit_price: { type: 'number', positive: true },
      total: { type: 'number', positive: true }
    }
  },
  subtotal: { type: 'number', positive: true },
  tax: { type: 'number', min: 0 },
  total: { type: 'number', positive: true }
};

Schema enforcement catches the obvious failures: missing required fields, wrong data types, values outside expected ranges. It doesn't catch plausible-but-wrong values: a date that's formatted correctly but wrong, a total that's realistic but fabricated.
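Applying a declarative schema like the one above takes only a small interpreter. Here's a minimal sketch; validateAgainstSchema and validateField are illustrative names, and only a subset of the rule types is handled (a production validator would also cover dates, nested item_schema, and so on):

```javascript
// Minimal interpreter for a declarative field schema.
// Handles required, type, pattern, positive, and max_length rules.
function validateField(name, value, rules, errors) {
  if (value === undefined || value === null) {
    if (rules.required) errors.push({ field: name, message: 'missing required field' });
    return;
  }
  if (rules.type === 'string' && typeof value !== 'string') {
    errors.push({ field: name, message: 'expected string' });
    return;
  }
  if (rules.type === 'number' && typeof value !== 'number') {
    errors.push({ field: name, message: 'expected number' });
    return;
  }
  if (rules.pattern && !rules.pattern.test(value)) {
    errors.push({ field: name, message: 'value does not match expected pattern' });
  }
  if (rules.positive && value <= 0) {
    errors.push({ field: name, message: 'expected positive number' });
  }
  if (rules.max_length && value.length > rules.max_length) {
    errors.push({ field: name, message: 'value exceeds maximum length' });
  }
}

function validateAgainstSchema(data, schema) {
  const errors = [];
  for (const [field, rules] of Object.entries(schema)) {
    if (rules.type === 'array') continue; // nested array rules omitted in this sketch
    validateField(field, data[field], rules, errors);
  }
  return errors;
}
```

The point of keeping the schema declarative is that the rules live in one reviewable place rather than being scattered through imperative checks.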


Layer 2: Internal Consistency Checks

The second layer checks whether the extracted data is internally consistent. This catches a large class of hallucinations because fabricated data rarely obeys all the mathematical and logical relationships that real data does.

For invoice extraction:

function validateInvoiceConsistency(invoice) {
  const errors = [];

  // Line item totals should match quantity * unit_price
  for (const item of invoice.line_items) {
    const expectedTotal = item.quantity * item.unit_price;
    const tolerance = 0.02; // Allow 2 cents for rounding
    if (Math.abs(item.total - expectedTotal) > tolerance) {
      errors.push({
        field: 'line_items',
        message: `Line item total ${item.total} doesn't match quantity * unit_price (${expectedTotal})`,
        severity: 'high'
      });
    }
  }

  // Subtotal should equal sum of line item totals
  const lineItemSum = invoice.line_items.reduce((sum, item) => sum + item.total, 0);
  if (Math.abs(invoice.subtotal - lineItemSum) > 0.02) {
    errors.push({
      field: 'subtotal',
      message: `Subtotal ${invoice.subtotal} doesn't match sum of line items (${lineItemSum})`,
      severity: 'high'
    });
  }

  // Total should equal subtotal + tax
  const expectedTotal = invoice.subtotal + (invoice.tax || 0);
  if (Math.abs(invoice.total - expectedTotal) > 0.02) {
    errors.push({
      field: 'total',
      message: `Total ${invoice.total} doesn't match subtotal + tax (${expectedTotal})`,
      severity: 'high'
    });
  }

  // Due date should not be before invoice date
  if (invoice.due_date && invoice.due_date < invoice.invoice_date) {
    errors.push({
      field: 'due_date',
      message: 'Due date is before invoice date',
      severity: 'medium'
    });
  }

  return errors;
}

A hallucinated invoice that the model invented wholesale is unlikely to have correct arithmetic throughout. Real documents have these relationships. Fabricated ones often don't.


Layer 3: Confidence Scoring

Many document AI APIs return confidence scores alongside extracted values. These scores are imperfect (the model can be confidently wrong), but low confidence is a reliable signal that you need human review.

Build confidence scoring into your pipeline:

function assessExtractionConfidence(extractionResult) {
  const CONFIDENCE_THRESHOLD = 0.85;
  const lowConfidenceFields = [];

  for (const [field, data] of Object.entries(extractionResult.fields)) {
    if (data.confidence < CONFIDENCE_THRESHOLD) {
      lowConfidenceFields.push({
        field,
        value: data.value,
        confidence: data.confidence
      });
    }
  }

  const overallConfidence = Object.values(extractionResult.fields)
    .reduce((sum, data) => sum + data.confidence, 0) /
    Object.keys(extractionResult.fields).length;

  return {
    overallConfidence,
    lowConfidenceFields,
    requiresReview: overallConfidence < CONFIDENCE_THRESHOLD || lowConfidenceFields.length > 0
  };
}

When requiresReview is true, route the document to a human queue instead of auto-processing it. The cost of human review on flagged documents is far lower than the cost of processing incorrect data downstream.


Layer 4: Cross-Reference Against Known Constraints

For some document types, you can validate against data you already have. This is one of the strongest guardrails because it requires the extracted data to agree with an independent source of truth.

Examples:

async function crossReferenceInvoice(invoice, db) {
  const errors = [];

  // Check if vendor exists in your vendor database
  const vendor = await db.query(
    'SELECT id, name, typical_payment_terms FROM vendors WHERE name ILIKE $1',
    [invoice.vendor_name]
  );

  if (vendor.rows.length === 0) {
    errors.push({
      type: 'unknown_vendor',
      message: `Vendor "${invoice.vendor_name}" not found in vendor database`,
      severity: 'medium',
      action: 'human_review'
    });
  } else {
    const knownVendor = vendor.rows[0];

    // Check if total is within expected range for this vendor
    const stats = await db.query(
      `SELECT AVG(total) as avg_total, STDDEV(total) as stddev_total
       FROM invoices
       WHERE vendor_id = $1 AND created_at > NOW() - INTERVAL '12 months'`,
      [knownVendor.id]
    );

    if (stats.rows[0].avg_total && stats.rows[0].stddev_total) {
      const avg = parseFloat(stats.rows[0].avg_total);
      const stddev = parseFloat(stats.rows[0].stddev_total);
      // Guard against division by zero when historical totals are identical
      // (STDDEV also returns NULL for a vendor with a single prior invoice)
      const zScore = stddev > 0 ? Math.abs(invoice.total - avg) / stddev : 0;

      // Flag invoices more than 3 standard deviations from historical average
      if (zScore > 3) {
        errors.push({
          type: 'unusual_amount',
          message: `Invoice total ${invoice.total} is unusual for vendor (avg: ${avg.toFixed(2)})`,
          severity: 'high',
          action: 'human_review'
        });
      }
    }

    // Check if invoice number format matches vendor's known format
    // (This catches the model inventing a plausible but wrong invoice number)
    const recentInvoices = await db.query(
      'SELECT invoice_number FROM invoices WHERE vendor_id = $1 ORDER BY created_at DESC LIMIT 10',
      [knownVendor.id]
    );

    // (Pattern matching logic here based on known invoice number formats)
  }

  return errors;
}

Cross-referencing is powerful because it's hard to hallucinate your way past a constraint that comes from a completely separate system.
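One way to sketch the invoice-number format check left as a comment in the example above is to reduce each known invoice number to a coarse character-class "shape" and flag extracted numbers whose shape matches none of them. The function names here are illustrative:

```javascript
// Reduce an invoice number to a coarse shape: letters become 'A',
// digits become '9', separators are kept as-is.
function invoiceNumberShape(invoiceNumber) {
  return invoiceNumber.replace(/[A-Za-z]/g, 'A').replace(/[0-9]/g, '9');
}

// True if the candidate's shape matches the shape of any recent
// invoice number from the same vendor.
function matchesKnownFormat(candidate, recentNumbers) {
  const knownShapes = new Set(recentNumbers.map(invoiceNumberShape));
  return knownShapes.has(invoiceNumberShape(candidate));
}
```

For example, "INV-2024-001" reduces to "AAA-9999-999", so a model that invents a bare numeric string for a vendor that always uses that prefix format gets flagged. This is deliberately coarse; real vendors do occasionally change formats, so treat a mismatch as a review signal, not a hard rejection.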


Layer 5: Human-in-the-Loop Checkpoints

For any document type where errors have meaningful consequences, build a human review queue. This is not a failure of the AI system: it's the correct architecture for high-stakes extraction.

Your routing logic should look something like:

function routeExtraction(extractionResult, consistencyErrors, confidenceAssessment, crossRefErrors) {
  const highSeverityErrors = [
    ...consistencyErrors.filter(e => e.severity === 'high'),
    ...crossRefErrors.filter(e => e.severity === 'high')
  ];

  const mediumSeverityErrors = [
    ...consistencyErrors.filter(e => e.severity === 'medium'),
    ...crossRefErrors.filter(e => e.severity === 'medium')
  ];

  if (highSeverityErrors.length > 0 || confidenceAssessment.overallConfidence < 0.75) {
    return {
      route: 'reject',
      reason: 'High-severity errors or very low extraction confidence',
      errors: highSeverityErrors
    };
  }

  if (mediumSeverityErrors.length > 0 || confidenceAssessment.requiresReview) {
    return {
      route: 'human_review',
      reason: 'Requires human verification',
      flags: [...mediumSeverityErrors, ...confidenceAssessment.lowConfidenceFields]
    };
  }

  return {
    route: 'auto_process',
    confidence: confidenceAssessment.overallConfidence
  };
}

The "reject" path returns the document to the submitter with a clear message about why it couldn't be processed automatically. The "humanreview" path routes to a queue where a human sees the original document alongside the extracted data and the specific flags. The "autoprocess" path proceeds without human involvement.

Getting this routing right requires knowing your error rate and your tolerance for false negatives. Start conservative: route more to human review initially, and loosen the thresholds as you understand your extraction quality.


Putting It Together

Think of these guardrails as the validation layer that wraps any extraction pipeline. The pipeline extracts data. The guardrails determine whether to trust it.
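One way to wire the layers together is a single orchestration function that receives the layer implementations as dependencies, which keeps the pipeline testable with stubs. This is a sketch, not a definitive implementation: processDocument and the layers object are illustrative names, and extract stands in for whatever document AI API you actually call:

```javascript
// Orchestrates the five guardrail layers around an extraction call.
// Each layer is injected via the `layers` object so it can be stubbed in tests.
async function processDocument(document, layers) {
  // Layer 0: the extraction call itself
  const extractionResult = await layers.extract(document);

  // Layer 1: schema enforcement -- reject malformed output immediately
  const schemaErrors = layers.validateSchema(extractionResult.invoice);
  if (schemaErrors.length > 0) {
    return { route: 'reject', reason: 'schema_violation', errors: schemaErrors };
  }

  // Layers 2-4: internal consistency, confidence scoring, cross-reference
  const consistencyErrors = layers.validateConsistency(extractionResult.invoice);
  const confidenceAssessment = layers.assessConfidence(extractionResult);
  const crossRefErrors = await layers.crossReference(extractionResult.invoice);

  // Layer 5: route to auto-process, human review, or rejection
  return layers.route(extractionResult, consistencyErrors, confidenceAssessment, crossRefErrors);
}
```

The dependency injection is not just a testing convenience: it also means you can swap the cross-reference layer (say, a different database) or tighten the routing logic without touching the rest of the pipeline.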

The Hacker News thread that prompted this article is a useful reminder that "the AI returned a result" is not the same as "the result is correct." These are different claims, and in any system where correctness matters (financial data, medical records, legal documents, insurance claims), the validation layer is not optional.

Build the pipeline. Build the guardrails. Deploy them together.
