LLM Response Parsing and Structured Output

Techniques for parsing structured data from LLM responses in Node.js, including JSON mode, schema validation, retry patterns, and streaming.

Overview

Large language models produce natural language by default, but most real-world applications need structured data they can actually work with: JSON objects, typed fields, arrays of records. Getting reliable, machine-readable output from an LLM is one of the most important engineering problems in AI integration, and the solutions range from provider-specific JSON modes to hand-rolled parsing pipelines with schema validation and automatic retry. This article covers every major approach, with working Node.js code you can drop into production.

Prerequisites

  • Node.js v18 or later installed
  • Working knowledge of Express.js and async/await patterns
  • An OpenAI API key and/or an Anthropic API key
  • Familiarity with npm package management
  • Basic understanding of JSON Schema concepts

Install the dependencies we will use throughout this article:

npm install openai @anthropic-ai/sdk zod

The Fundamental Challenge

When you call an LLM API, you get back a string. That string might contain JSON, or it might contain JSON wrapped in markdown code fences, or it might contain a friendly preamble followed by JSON, or it might contain something that looks like JSON but has trailing commas and single quotes. The model is a text generator, not a serializer. Every integration that needs structured data must solve this gap between free-text output and typed, validated data structures.

I have seen production systems break because an LLM decided to add a helpful "Here's the JSON you requested:" prefix before the actual payload. I have seen models return valid JSON 99% of the time and then, on one particular input, produce a truncated object because the response hit the token limit. These are not edge cases. They are the normal operating conditions of LLM-powered systems.

The solution is a layered approach: constrain the model as much as possible at the API level, parse the raw response defensively, validate the parsed result against a schema, and retry when validation fails.
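In code, the layering looks roughly like this. This is a condensed sketch only: callModelWithJsonConstraints, parseDefensively, schema, and retryWithFeedback are placeholder names for the concrete implementations developed in the rest of this article.

// A condensed sketch of the layered approach; every stage below is a
// placeholder that the following sections flesh out
function getStructuredData(input) {
  return callModelWithJsonConstraints(input)          // 1. constrain at the API level
    .then(function(rawText) {
      var parsed = parseDefensively(rawText);         // 2. strip wrappers, extract JSON
      var result = schema.safeParse(parsed);          // 3. validate against a schema
      if (!result.success) {
        return retryWithFeedback(input, rawText, result.error); // 4. retry with the errors
      }
      return result.data;
    });
}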

JSON Mode in OpenAI

OpenAI's API provides a response_format parameter that constrains the model to produce valid JSON. This is the simplest path to structured output when you are using GPT models.

var OpenAI = require("openai");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function extractProductInfo(description) {
  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content: "You are a product data extractor. Always respond with a JSON object containing: name (string), price (number), currency (string), features (array of strings), category (string)."
      },
      {
        role: "user",
        content: "Extract product information from this description: " + description
      }
    ]
  }).then(function(response) {
    return JSON.parse(response.choices[0].message.content);
  });
}

extractProductInfo("The UltraWidget Pro costs $49.99 and features wireless charging, water resistance, and a 3-year warranty. It's in the electronics category.")
  .then(function(product) {
    console.log(product.name);     // "UltraWidget Pro"
    console.log(product.price);    // 49.99
    console.log(product.features); // ["wireless charging", "water resistance", "3-year warranty"]
  })
  .catch(function(err) {
    console.error("Extraction failed:", err.message);
  });

A few things to know about JSON mode. First, you must mention "JSON" somewhere in the system or user prompt, or the API will return an error. Second, JSON mode guarantees syntactically valid JSON, but it does not guarantee the JSON matches any particular schema. The model might return {"result": "I don't know"} instead of the structure you asked for. Third, JSON mode does not work with streaming in the way you might expect: each chunk is a text fragment, not a complete JSON object.

OpenAI also offers a stricter "Structured Outputs" feature that takes a JSON Schema definition and guarantees the output conforms to it:

function extractWithSchema(description) {
  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "product_info",
        strict: true,
        schema: {
          type: "object",
          properties: {
            name: { type: "string" },
            price: { type: "number" },
            currency: { type: "string" },
            features: {
              type: "array",
              items: { type: "string" }
            },
            category: { type: "string" }
          },
          required: ["name", "price", "currency", "features", "category"],
          additionalProperties: false
        }
      }
    },
    messages: [
      {
        role: "system",
        content: "Extract product information from the user's description."
      },
      {
        role: "user",
        content: description
      }
    ]
  }).then(function(response) {
    return JSON.parse(response.choices[0].message.content);
  });
}

With strict: true, OpenAI uses constrained decoding to guarantee every response matches your schema. This is the gold standard for structured output when you are committed to the OpenAI ecosystem.

Structured Output with Claude

Anthropic's Claude API does not have a built-in JSON mode as of early 2026, but Claude is exceptionally good at following formatting instructions. The two most reliable approaches are XML tag delimiters and explicit JSON instructions.

XML Tag Approach

Claude has been trained to understand and produce XML-style tags reliably. You can use this to create clearly delimited structured regions in the response:

var Anthropic = require("@anthropic-ai/sdk");

var anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

function extractWithClaude(description) {
  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: "Extract product information from this description. Return ONLY a JSON object inside <json> tags with these fields: name (string), price (number), currency (string), features (array of strings), category (string).\n\nDescription: " + description
      }
    ]
  }).then(function(response) {
    var text = response.content[0].text;
    var match = text.match(/<json>([\s\S]*?)<\/json>/);
    if (!match) {
      throw new Error("No <json> tags found in response");
    }
    return JSON.parse(match[1].trim());
  });
}
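Usage mirrors the OpenAI example above, reusing the same hypothetical product description; the expected output in the comments assumes the model follows the formatting instructions:

extractWithClaude("The UltraWidget Pro costs $49.99 and features wireless charging, water resistance, and a 3-year warranty. It's in the electronics category.")
  .then(function(product) {
    console.log(product.name);  // "UltraWidget Pro"
    console.log(product.price); // 49.99
  })
  .catch(function(err) {
    console.error("Claude extraction failed:", err.message);
  });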

Direct JSON Instruction

For simpler cases, you can instruct Claude to return only JSON with no surrounding text:

function extractJsonDirect(description) {
  return anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    system: "You are a data extraction API. Respond with ONLY valid JSON, no markdown formatting, no explanation, no code fences.",
    messages: [
      {
        role: "user",
        content: "Extract from this description into {name, price, currency, features, category}: " + description
      }
    ]
  }).then(function(response) {
    var text = response.content[0].text.trim();
    // Strip markdown code fences if the model added them despite instructions
    text = text.replace(/^```(?:json)?\s*\n?/, "").replace(/\n?```\s*$/, "");
    return JSON.parse(text);
  });
}

Notice the defensive stripping of code fences. Even with explicit instructions, models sometimes wrap JSON in markdown formatting. Always handle this.

Parsing Strategies for Free-Text Responses

Not every LLM response comes in a clean format. Sometimes you are working with a model or a prompt setup that produces a mix of explanation and data. Here are the parsing strategies I use most frequently.

Regex Extraction

When you know the shape of the data but it might be embedded in prose:

function parseRatingFromText(text) {
  var result = {};

  var scoreMatch = text.match(/(?:score|rating)[:\s]*(\d+(?:\.\d+)?)\s*(?:\/\s*(\d+))?/i);
  if (scoreMatch) {
    result.score = parseFloat(scoreMatch[1]);
    result.maxScore = scoreMatch[2] ? parseInt(scoreMatch[2], 10) : 10;
  }

  var sentimentMatch = text.match(/(?:sentiment|tone)[:\s]*(positive|negative|neutral|mixed)/i);
  if (sentimentMatch) {
    result.sentiment = sentimentMatch[1].toLowerCase();
  }

  var keywordsMatch = text.match(/(?:keywords?|tags?)[:\s]*(.+?)(?:\n|$)/i);
  if (keywordsMatch) {
    result.keywords = keywordsMatch[1].split(/[,;]/).map(function(k) {
      return k.trim();
    }).filter(function(k) {
      return k.length > 0;
    });
  }

  return result;
}
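For example, given a typical free-text evaluation, the parser picks out whatever fields it can find and silently skips the rest:

var evaluation = "Overall score: 8.5/10. Sentiment: positive. Keywords: battery life, design, price";
console.log(parseRatingFromText(evaluation));
// { score: 8.5, maxScore: 10, sentiment: "positive",
//   keywords: ["battery life", "design", "price"] }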

Marker-Based Extraction

Ask the model to use specific markers that are easy to parse:

function parseMarkerResponse(text) {
  var sections = {};
  var lines = text.split("\n");

  lines.forEach(function(line) {
    var match = line.match(/^###(\w+)###\s*(.+)$/);
    if (match) {
      sections[match[1].toLowerCase()] = match[2].trim();
    }
  });

  return sections;
}

// Prompt would include: "Format each field on its own line as ###FIELDNAME### value"
// Model returns:
// ###TITLE### Understanding Async Patterns
// ###CATEGORY### programming
// ###DIFFICULTY### intermediate
// ###SUMMARY### A guide to async/await and Promise patterns in Node.js

JSON Block Extraction

When the response might contain JSON somewhere in a larger text body:

function extractJsonBlock(text) {
  // Try to parse the whole thing first
  try {
    return JSON.parse(text.trim());
  } catch (e) {
    // Not pure JSON, look for it
  }

  // Strip markdown code fences
  var cleaned = text.replace(/```(?:json)?\s*\n?([\s\S]*?)\n?```/g, "$1");
  try {
    return JSON.parse(cleaned.trim());
  } catch (e) {
    // Still not clean
  }

  // Find the first { and last } and try that
  var firstBrace = text.indexOf("{");
  var lastBrace = text.lastIndexOf("}");
  if (firstBrace !== -1 && lastBrace > firstBrace) {
    try {
      return JSON.parse(text.substring(firstBrace, lastBrace + 1));
    } catch (e) {
      // Malformed
    }
  }

  // Same for arrays
  var firstBracket = text.indexOf("[");
  var lastBracket = text.lastIndexOf("]");
  if (firstBracket !== -1 && lastBracket > firstBracket) {
    try {
      return JSON.parse(text.substring(firstBracket, lastBracket + 1));
    } catch (e) {
      // Malformed
    }
  }

  throw new Error("Could not extract JSON from response");
}

This layered extraction approach handles almost every format I have encountered in production. It tries the cleanest parse first and falls back to progressively more aggressive extraction.
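For instance, a response with both a conversational preamble and code fences still resolves via the brace-extraction fallback:

var messy = 'Sure! Here is the data you asked for:\n' +
  '```json\n{"name": "Widget", "price": 9.99}\n```\n' +
  'Let me know if you need anything else.';

console.log(extractJsonBlock(messy)); // { name: "Widget", price: 9.99 }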

Zod Schema Validation for LLM Outputs

Parsing JSON is only half the battle. You also need to verify that the parsed object has the right shape, the right types, and reasonable values. Zod is the best tool for this in the Node.js ecosystem.

var z = require("zod");

var ProductSchema = z.object({
  name: z.string().min(1, "Product name cannot be empty"),
  price: z.number().positive("Price must be positive"),
  currency: z.string().length(3, "Currency must be a 3-letter code"),
  features: z.array(z.string()).min(1, "Must have at least one feature"),
  category: z.enum(["electronics", "clothing", "food", "software", "other"]),
  inStock: z.boolean().optional().default(true)
});

function validateProductOutput(rawJson) {
  var result = ProductSchema.safeParse(rawJson);
  if (result.success) {
    return { valid: true, data: result.data };
  }
  return {
    valid: false,
    errors: result.error.issues.map(function(issue) {
      return issue.path.join(".") + ": " + issue.message;
    })
  };
}
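A quick sanity check with both a conforming and a non-conforming object:

var good = validateProductOutput({
  name: "Widget",
  price: 9.99,
  currency: "USD",
  features: ["compact"],
  category: "electronics"
});
console.log(good.valid); // true, with inStock defaulted to true

var bad = validateProductOutput({ name: "", price: -5, currency: "dollars", features: [], category: "toys" });
console.log(bad.errors);
// ["name: Product name cannot be empty", "price: Price must be positive",
//  "currency: Currency must be a 3-letter code", ...]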

Zod gives you type coercion, default values, custom error messages, and transformation pipelines. For LLM outputs, the safeParse method is essential because you always want to handle validation failures gracefully rather than throwing.

You can also use Zod's transform to normalize LLM outputs that might vary in format:

var FlexiblePriceSchema = z.union([
  z.number(),
  z.string().transform(function(val, ctx) {
    var num = parseFloat(val.replace(/[^0-9.]/g, ""));
    if (isNaN(num)) {
      // Report a validation issue instead of throwing: a plain Error thrown
      // inside a transform escapes safeParse rather than failing gracefully
      ctx.addIssue({ code: z.ZodIssueCode.custom, message: "Cannot parse price: " + val });
      return z.NEVER;
    }
    return num;
  })
]);

// Handles: 49.99, "49.99", "$49.99", "49.99 USD"

Retry-on-Parse-Failure Patterns

When structured output parsing fails, the right move is usually to retry with a more explicit prompt. Here is a retry pattern I use in every LLM integration:

function withRetry(fn, options) {
  var maxRetries = (options && options.maxRetries) || 3;
  var onRetry = (options && options.onRetry) || function() {};

  return function() {
    var args = Array.prototype.slice.call(arguments);
    var attempt = 0;

    function tryCall(lastError) {
      attempt++;
      // Pass the previous failure as a trailing argument so the wrapped
      // function can feed validation errors back to the model
      return fn.apply(null, args.concat([lastError])).catch(function(err) {
        if (attempt >= maxRetries) {
          throw new Error("Failed after " + maxRetries + " attempts. Last error: " + err.message);
        }
        onRetry(attempt, err);
        return tryCall(err);
      });
    }

    return tryCall(null);
  };
}

// Usage with feedback loop
function extractProduct(description, previousError) {
  var messages = [
    {
      role: "user",
      content: "Extract product info as JSON with fields: name, price (number), currency, features (array), category.\n\nDescription: " + description
    }
  ];

  if (previousError) {
    messages.push({
      role: "assistant",
      content: previousError.rawResponse
    });
    messages.push({
      role: "user",
      content: "That response had validation errors: " + previousError.errors.join("; ") + ". Please fix the JSON and return only the corrected JSON object."
    });
  }

  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: messages
  }).then(function(response) {
    var raw = response.choices[0].message.content;
    var parsed;
    try {
      parsed = JSON.parse(raw);
    } catch (e) {
      // Attach the raw text so the retry prompt can show the model its own output
      var parseErr = new Error("JSON parse failed: " + e.message);
      parseErr.rawResponse = raw;
      parseErr.errors = ["Response was not valid JSON: " + e.message];
      throw parseErr;
    }
    var validation = validateProductOutput(parsed);
    if (!validation.valid) {
      var err = new Error("Validation failed: " + validation.errors.join("; "));
      err.rawResponse = raw;
      err.errors = validation.errors;
      throw err;
    }
    return validation.data;
  });
}

The key insight is feeding the failed output and the specific validation errors back to the model. This gives the LLM a chance to correct its own mistakes with concrete feedback, which works surprisingly well.
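Wiring the two together: withRetry passes the last error as an extra trailing argument, which lines up with extractProduct's previousError parameter. The description string here is just the earlier hypothetical example.

var extractProductWithRetry = withRetry(extractProduct, {
  maxRetries: 3,
  onRetry: function(attempt, err) {
    console.warn("Attempt " + attempt + " failed: " + err.message);
  }
});

extractProductWithRetry("The UltraWidget Pro costs $49.99 and features wireless charging.")
  .then(function(product) { console.log(product); })
  .catch(function(err) { console.error(err.message); });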

Extracting Multiple Fields from a Single Response

When you need several pieces of structured data from one LLM call, define a clear schema up front and validate each field independently:

var AnalysisSchema = z.object({
  summary: z.string().min(10).max(500),
  sentiment: z.enum(["positive", "negative", "neutral", "mixed"]),
  confidence: z.number().min(0).max(1),
  topics: z.array(z.string()).min(1).max(10),
  actionItems: z.array(z.object({
    task: z.string(),
    priority: z.enum(["high", "medium", "low"]),
    assignee: z.string().optional()
  })),
  metadata: z.object({
    wordCount: z.number().int().positive(),
    language: z.string(),
    readingTime: z.number().positive()
  })
});

function analyzeDocument(text) {
  var prompt = [
    "Analyze this document and return a JSON object with exactly these fields:",
    "- summary: string (10-500 chars)",
    "- sentiment: one of positive/negative/neutral/mixed",
    "- confidence: number between 0 and 1",
    "- topics: array of 1-10 topic strings",
    "- actionItems: array of {task: string, priority: high/medium/low, assignee?: string}",
    "- metadata: {wordCount: integer, language: string, readingTime: number in minutes}",
    "",
    "Document:",
    text
  ].join("\n");

  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [{ role: "user", content: prompt }]
  }).then(function(response) {
    var parsed = JSON.parse(response.choices[0].message.content);
    return AnalysisSchema.parse(parsed);
  });
}
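Note that this version uses schema.parse, which throws a ZodError on mismatch, so the catch handler sees validation failures as well as API errors. The document text here is a hypothetical stand-in:

analyzeDocument("Q3 revenue grew 12% year over year, driven by strong enterprise demand...")
  .then(function(analysis) {
    console.log(analysis.sentiment, analysis.topics);
  })
  .catch(function(err) {
    // Either an API failure or a ZodError describing schema violations
    console.error("Analysis failed:", err.message);
  });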

Handling Partial and Malformed JSON from Streaming

Streaming responses arrive token by token. If you are accumulating JSON from a stream, you need to handle the incomplete state:

function StreamingJsonAccumulator() {
  this.buffer = "";
  this.braceDepth = 0;
  this.bracketDepth = 0;
  this.inString = false;
  this.escaped = false;
  this.objects = [];
}

StreamingJsonAccumulator.prototype.addChunk = function(chunk) {
  this.buffer += chunk;

  for (var i = 0; i < chunk.length; i++) {
    var ch = chunk[i];

    if (this.escaped) {
      this.escaped = false;
      continue;
    }

    if (ch === "\\") {
      this.escaped = true;
      continue;
    }

    if (ch === '"') {
      this.inString = !this.inString;
      continue;
    }

    if (this.inString) continue;

    if (ch === "{") this.braceDepth++;
    if (ch === "}") this.braceDepth--;
    if (ch === "[") this.bracketDepth++;
    if (ch === "]") this.bracketDepth--;
  }
};

StreamingJsonAccumulator.prototype.isComplete = function() {
  return this.buffer.trim().length > 0 &&
         this.braceDepth === 0 &&
         this.bracketDepth === 0 &&
         !this.inString;
};

StreamingJsonAccumulator.prototype.tryParse = function() {
  if (!this.isComplete()) {
    return { complete: false, data: null };
  }
  try {
    var data = JSON.parse(this.buffer.trim());
    return { complete: true, data: data };
  } catch (e) {
    return { complete: false, data: null, error: e.message };
  }
};
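You can exercise the accumulator without a live stream by feeding it chunks by hand:

var acc = new StreamingJsonAccumulator();
['{"items": [', '"a", "b"', ']}'].forEach(function(chunk) {
  acc.addChunk(chunk);
});
console.log(acc.isComplete());    // true
console.log(acc.tryParse().data); // { items: ["a", "b"] }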

// Usage with OpenAI streaming
function streamStructuredResponse(prompt) {
  var accumulator = new StreamingJsonAccumulator();

  return new Promise(function(resolve, reject) {
    // The SDK's beta streaming helper returns an event emitter; the raw
    // create({ stream: true }) call returns an async iterable instead
    var stream = client.beta.chat.completions.stream({
      model: "gpt-4o",
      response_format: { type: "json_object" },
      messages: [{ role: "user", content: prompt }]
    });

    stream.on("content", function(delta) {
      accumulator.addChunk(delta);
    });

    stream.on("end", function() {
      var result = accumulator.tryParse();
      if (result.complete) {
        resolve(result.data);
      } else {
        reject(new Error("Stream ended with incomplete JSON: " + (result.error || "unknown")));
      }
    });

    stream.on("error", function(err) {
      reject(err);
    });
  });
}

If you would rather consume the raw stream returned by create({ stream: true }), which is an async iterable rather than an event emitter, the consumption looks slightly different:

function streamWithAsyncIterator(prompt) {
  var accumulator = new StreamingJsonAccumulator();

  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    stream: true,
    messages: [{ role: "user", content: prompt }]
  }).then(function(stream) {
    return (function processStream() {
      // Use a manual iteration approach for CommonJS compatibility
      var iterator = stream[Symbol.asyncIterator]();

      function processNext() {
        return iterator.next().then(function(result) {
          if (result.done) {
            var parsed = accumulator.tryParse();
            if (parsed.complete) return parsed.data;
            throw new Error("Stream ended with incomplete JSON");
          }
          var chunk = result.value;
          var content = chunk.choices[0].delta.content;
          if (content) {
            accumulator.addChunk(content);
          }
          return processNext();
        });
      }

      return processNext();
    })();
  });
}

Building Type-Safe Response Parsers

Here is a pattern for building reusable, type-safe parsers that encapsulate the entire parse-validate-retry cycle:

var z = require("zod");

function createStructuredParser(options) {
  var schema = options.schema;
  var extractJson = options.extractJson || extractJsonBlock;

  function parse(rawText) {
    var extracted;
    try {
      extracted = extractJson(rawText);
    } catch (e) {
      return {
        success: false,
        error: "JSON extraction failed: " + e.message,
        raw: rawText
      };
    }

    var validation = schema.safeParse(extracted);
    if (validation.success) {
      return {
        success: true,
        data: validation.data,
        raw: rawText
      };
    }

    return {
      success: false,
      error: "Validation failed: " + validation.error.issues.map(function(i) {
        return i.path.join(".") + " - " + i.message;
      }).join("; "),
      parsed: extracted,
      raw: rawText
    };
  }

  function formatErrorFeedback(result) {
    return "Your previous response could not be parsed correctly. Error: " + result.error + "\n\nPlease return ONLY a valid JSON object matching the required schema.";
  }

  return {
    parse: parse,
    formatErrorFeedback: formatErrorFeedback,
    schema: schema
  };
}

// Usage
var sentimentParser = createStructuredParser({
  schema: z.object({
    sentiment: z.enum(["positive", "negative", "neutral"]),
    score: z.number().min(-1).max(1),
    explanation: z.string()
  })
});

var result = sentimentParser.parse('{"sentiment": "positive", "score": 0.85, "explanation": "The text expresses satisfaction"}');
console.log(result.success); // true
console.log(result.data.sentiment); // "positive"
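The failure path is just as important. A response with a preamble and out-of-range values still extracts (via brace extraction) but fails validation:

var bad = sentimentParser.parse('Here you go: {"sentiment": "great", "score": 2}');
console.log(bad.success); // false
console.log(bad.error);   // lists the enum violation, the out-of-range score,
                          // and the missing explanation field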

Output Format Negotiation in Prompts

The prompt itself is your first line of defense. Here are prompt patterns that consistently produce parseable output:

The Schema-in-Prompt Pattern:

var schemaPrompt = [
  "Respond with a JSON object matching this exact schema:",
  "{",
  '  "title": "string - the article title",',
  '  "author": "string - author name",',
  '  "tags": ["string array - relevant tags"],',
  '  "rating": "number 1-5",',
  '  "published": "boolean"',
  "}",
  "",
  "Return ONLY the JSON. No explanation, no code fences."
].join("\n");

The Example-Driven Pattern:

var examplePrompt = [
  "Classify the following text. Respond in the exact format shown in this example:",
  "",
  "Example input: 'I love this product!'",
  'Example output: {"label": "positive", "confidence": 0.95, "reasoning": "Strong positive language"}',
  "",
  "Now classify this text: "
].join("\n");

The Constraint Pattern:

var constraintPrompt = [
  "Rules for your response:",
  "1. Respond with ONLY valid JSON",
  "2. Do not wrap in code fences",
  "3. Do not add any text before or after the JSON",
  "4. All string values must be non-empty",
  "5. All number values must be finite",
  "6. Use null for unknown fields, never omit them"
].join("\n");

In my experience, combining the schema-in-prompt pattern with the constraint pattern produces the most reliable results across both OpenAI and Claude.
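Combining them is just string concatenation. A hypothetical wrapper using the two prompt fragments defined above might look like:

function buildExtractionPrompt(inputText) {
  // Schema first, then the hard constraints, then the input itself
  return schemaPrompt + "\n\n" + constraintPrompt + "\n\nText to analyze:\n" + inputText;
}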

Structured Extraction Pipeline

Here is the full pipeline pattern: raw LLM text goes in, validated typed data comes out.

function createExtractionPipeline(options) {
  var prompt = options.prompt;
  var schema = options.schema;
  var provider = options.provider || "openai";
  var maxRetries = options.maxRetries || 3;
  var parser = createStructuredParser({ schema: schema });

  function callLLM(messages) {
    if (provider === "openai") {
      return client.chat.completions.create({
        model: options.model || "gpt-4o",
        response_format: { type: "json_object" },
        messages: messages
      }).then(function(r) {
        return r.choices[0].message.content;
      });
    }

    if (provider === "claude") {
      return anthropic.messages.create({
        model: options.model || "claude-sonnet-4-20250514",
        max_tokens: options.maxTokens || 2048,
        messages: messages
      }).then(function(r) {
        return r.content[0].text;
      });
    }

    throw new Error("Unknown provider: " + provider);
  }

  function extract(input) {
    var messages = [
      { role: "user", content: prompt + "\n\nInput:\n" + input }
    ];
    var attempts = 0;

    function attempt() {
      attempts++;
      return callLLM(messages).then(function(rawText) {
        var result = parser.parse(rawText);

        if (result.success) {
          return {
            data: result.data,
            attempts: attempts,
            raw: rawText
          };
        }

        if (attempts >= maxRetries) {
          throw new Error("Extraction failed after " + attempts + " attempts: " + result.error);
        }

        // Feed error back for retry
        messages.push({ role: "assistant", content: rawText });
        messages.push({ role: "user", content: parser.formatErrorFeedback(result) });

        return attempt();
      });
    }

    return attempt();
  }

  return { extract: extract };
}

// Complete usage example
var invoiceExtractor = createExtractionPipeline({
  provider: "openai",
  prompt: "Extract invoice data from the following text. Return JSON with: vendor (string), invoiceNumber (string), date (YYYY-MM-DD string), lineItems (array of {description, quantity, unitPrice, total}), subtotal (number), tax (number), total (number).",
  schema: z.object({
    vendor: z.string().min(1),
    invoiceNumber: z.string().min(1),
    date: z.string().regex(/^\d{4}-\d{2}-\d{2}$/),
    lineItems: z.array(z.object({
      description: z.string(),
      quantity: z.number().positive(),
      unitPrice: z.number().nonnegative(),
      total: z.number().nonnegative()
    })).min(1),
    subtotal: z.number().nonnegative(),
    tax: z.number().nonnegative(),
    total: z.number().positive()
  }),
  maxRetries: 3
});

invoiceExtractor.extract("Invoice #2024-0892 from Acme Corp, dated 2024-11-15. 3x Widget at $10.00 each ($30.00), 1x Gadget at $25.00 ($25.00). Subtotal $55.00, tax $4.95, total $59.95.")
  .then(function(result) {
    console.log("Extracted in", result.attempts, "attempt(s)");
    console.log("Vendor:", result.data.vendor);
    console.log("Total:", result.data.total);
    console.log("Line items:", result.data.lineItems.length);
  });

Comparing Structured Output Approaches Across Providers

Feature                 | OpenAI JSON Mode     | OpenAI Structured Outputs | Claude (Prompt-Based)  | Claude (XML Tags)
Guarantees valid JSON   | Yes                  | Yes                       | No (but very reliable) | No (but very reliable)
Guarantees schema match | No                   | Yes                       | No                     | No
Works with streaming    | Yes (accumulate)     | Yes (accumulate)          | Yes (accumulate)       | Yes (extract after)
Extra latency           | Minimal              | Minimal                   | None                   | None
Flexibility             | Medium               | Low (strict schema)       | High                   | High
Requires prompt mention | Yes ("JSON" keyword) | No (schema in API call)   | Yes                    | Yes
Handles nested objects  | Yes                  | Yes                       | Yes                    | Yes
Cost impact             | None                 | None                      | None                   | None

My recommendation: use OpenAI Structured Outputs when you need iron-clad guarantees and are on the OpenAI platform. Use Claude with the XML tag pattern when you need flexibility or are doing complex multi-part extraction. Always layer Zod validation on top regardless of provider.

Complete Working Example

Here is a production-ready structured output parser that ties everything together:

var z = require("zod");
var OpenAI = require("openai");
var Anthropic = require("@anthropic-ai/sdk");

// ---- Configuration ----

var openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
var anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// ---- JSON Extraction ----

function extractJsonFromText(text) {
  var trimmed = text.trim();

  // Direct parse
  try { return JSON.parse(trimmed); } catch (e) { /* continue */ }

  // Strip code fences
  var fenceStripped = trimmed.replace(/^```(?:json)?\s*\n?/, "").replace(/\n?```\s*$/, "");
  try { return JSON.parse(fenceStripped); } catch (e) { /* continue */ }

  // Extract from XML tags
  var xmlMatch = trimmed.match(/<json>([\s\S]*?)<\/json>/);
  if (xmlMatch) {
    try { return JSON.parse(xmlMatch[1].trim()); } catch (e) { /* continue */ }
  }

  // Brace extraction
  var start = trimmed.indexOf("{");
  var end = trimmed.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try { return JSON.parse(trimmed.substring(start, end + 1)); } catch (e) { /* continue */ }
  }

  throw new Error("No valid JSON found in response");
}

// ---- Streaming Accumulator ----

function JsonStreamAccumulator() {
  this.buffer = "";
  this.depth = 0;
  this.inString = false;
  this.escaped = false;
}

JsonStreamAccumulator.prototype.append = function(chunk) {
  this.buffer += chunk;
  for (var i = 0; i < chunk.length; i++) {
    var c = chunk[i];
    if (this.escaped) { this.escaped = false; continue; }
    if (c === "\\") { this.escaped = true; continue; }
    if (c === '"') { this.inString = !this.inString; continue; }
    if (this.inString) continue;
    if (c === "{" || c === "[") this.depth++;
    if (c === "}" || c === "]") this.depth--;
  }
};

JsonStreamAccumulator.prototype.isComplete = function() {
  return this.buffer.trim().length > 0 && this.depth === 0 && !this.inString;
};

JsonStreamAccumulator.prototype.getResult = function() {
  return JSON.parse(this.buffer.trim());
};

// ---- Core Parser ----

function StructuredOutputParser(config) {
  this.schema = config.schema;
  this.provider = config.provider || "openai";
  this.model = config.model || (this.provider === "openai" ? "gpt-4o" : "claude-sonnet-4-20250514");
  this.maxRetries = config.maxRetries || 3;
  this.systemPrompt = config.systemPrompt || "";
  this.debug = config.debug || false;
}

StructuredOutputParser.prototype.log = function(msg) {
  if (this.debug) console.log("[StructuredOutputParser]", msg);
};

StructuredOutputParser.prototype.callOpenAI = function(messages) {
  var self = this;
  var allMessages = [];

  if (this.systemPrompt) {
    allMessages.push({ role: "system", content: this.systemPrompt });
  }
  allMessages = allMessages.concat(messages);

  return openai.chat.completions.create({
    model: this.model,
    response_format: { type: "json_object" },
    messages: allMessages,
    temperature: 0.1
  }).then(function(response) {
    var content = response.choices[0].message.content;
    self.log("OpenAI response length: " + content.length);
    return content;
  });
};

StructuredOutputParser.prototype.callClaude = function(messages) {
  var self = this;
  return anthropic.messages.create({
    model: this.model,
    max_tokens: 4096,
    system: this.systemPrompt || "Respond with only valid JSON. No code fences, no explanation.",
    messages: messages,
    temperature: 0.1
  }).then(function(response) {
    var content = response.content[0].text;
    self.log("Claude response length: " + content.length);
    return content;
  });
};

StructuredOutputParser.prototype.callLLM = function(messages) {
  if (this.provider === "openai") return this.callOpenAI(messages);
  if (this.provider === "claude") return this.callClaude(messages);
  throw new Error("Unsupported provider: " + this.provider);
};

StructuredOutputParser.prototype.parse = function(input, userPrompt) {
  var self = this;
  var messages = [
    { role: "user", content: userPrompt + "\n\nInput:\n" + input }
  ];
  var attempt = 0;

  function tryParse() {
    attempt++;
    self.log("Attempt " + attempt + " of " + self.maxRetries);

    return self.callLLM(messages).then(function(rawText) {
      // Step 1: Extract JSON
      var extracted;
      try {
        extracted = extractJsonFromText(rawText);
      } catch (e) {
        if (attempt >= self.maxRetries) {
          throw new Error("JSON extraction failed after " + attempt + " attempts: " + e.message);
        }
        messages.push({ role: "assistant", content: rawText });
        messages.push({
          role: "user",
          content: "Your response was not valid JSON. Error: " + e.message + ". Please respond with ONLY a valid JSON object."
        });
        return tryParse();
      }

      // Step 2: Validate with Zod
      var validation = self.schema.safeParse(extracted);
      if (validation.success) {
        return {
          success: true,
          data: validation.data,
          attempts: attempt,
          raw: rawText
        };
      }

      var errors = validation.error.issues.map(function(issue) {
        return issue.path.join(".") + ": " + issue.message;
      });

      self.log("Validation errors: " + errors.join("; "));

      if (attempt >= self.maxRetries) {
        return {
          success: false,
          errors: errors,
          attempts: attempt,
          parsed: extracted,
          raw: rawText
        };
      }

      // Step 3: Retry with feedback
      messages.push({ role: "assistant", content: rawText });
      messages.push({
        role: "user",
        content: "The JSON had validation errors:\n" + errors.map(function(e) { return "- " + e; }).join("\n") + "\n\nPlease fix these issues and return only the corrected JSON."
      });

      return tryParse();
    });
  }

  return tryParse();
};

// ---- Streaming Parse ----

StructuredOutputParser.prototype.parseStream = function(input, userPrompt) {
  var self = this;
  // Streaming here is wired to the OpenAI client only; a Claude streaming
  // path would need its own implementation
  if (this.provider !== "openai") {
    return Promise.reject(new Error("parseStream supports only the openai provider"));
  }
  var accumulator = new JsonStreamAccumulator();

  var allMessages = [];
  if (this.systemPrompt) {
    allMessages.push({ role: "system", content: this.systemPrompt });
  }
  allMessages.push({ role: "user", content: userPrompt + "\n\nInput:\n" + input });

  return openai.chat.completions.create({
    model: this.model,
    response_format: { type: "json_object" },
    stream: true,
    messages: allMessages,
    temperature: 0.1
  }).then(function(stream) {
    var iterator = stream[Symbol.asyncIterator]();

    function processNext() {
      return iterator.next().then(function(result) {
        if (result.done) {
          if (accumulator.isComplete()) {
            var data = accumulator.getResult();
            var validation = self.schema.safeParse(data);
            if (validation.success) {
              return { success: true, data: validation.data };
            }
            return {
              success: false,
              errors: validation.error.issues.map(function(i) {
                return i.path.join(".") + ": " + i.message;
              })
            };
          }
          throw new Error("Stream ended with incomplete JSON");
        }

        var delta = result.value.choices[0].delta;
        if (delta && delta.content) {
          accumulator.append(delta.content);
        }
        return processNext();
      });
    }

    return processNext();
  });
};

// ---- Example Usage ----

var ReviewSchema = z.object({
  productName: z.string().min(1),
  rating: z.number().int().min(1).max(5),
  sentiment: z.enum(["positive", "negative", "mixed", "neutral"]),
  pros: z.array(z.string()).min(1),
  cons: z.array(z.string()),
  summary: z.string().min(20).max(500),
  recommendsBuying: z.boolean()
});

var reviewParser = new StructuredOutputParser({
  schema: ReviewSchema,
  provider: "openai",
  maxRetries: 3,
  debug: true,
  systemPrompt: "You are a product review analyzer. Extract structured data from product reviews. Always respond with valid JSON."
});

var sampleReview = "I bought the SoundMax Pro headphones last month. The noise cancellation is incredible and the battery lasts forever - easily 30+ hours. Sound quality is crisp and the bass is deep without being muddy. My only complaints are the ear cups get warm after a couple hours and the carrying case feels cheap. Overall I'd rate them 4 out of 5 and would definitely recommend them to anyone looking for premium wireless headphones.";

reviewParser.parse(sampleReview, "Analyze this product review and extract structured data with fields: productName, rating (1-5), sentiment, pros (array), cons (array), summary (20-500 chars), recommendsBuying (boolean).")
  .then(function(result) {
    if (result.success) {
      console.log("Product:", result.data.productName);
      console.log("Rating:", result.data.rating + "/5");
      console.log("Sentiment:", result.data.sentiment);
      console.log("Pros:", result.data.pros.join(", "));
      console.log("Cons:", result.data.cons.join(", "));
      console.log("Recommends:", result.data.recommendsBuying ? "Yes" : "No");
      console.log("Parsed in", result.attempts, "attempt(s)");
    } else {
      console.error("Failed to parse review:", result.errors);
    }
  })
  .catch(function(err) {
    console.error("Fatal error:", err.message);
  });

Common Issues and Troubleshooting

1. SyntaxError: Unexpected token at position 0

SyntaxError: Unexpected token H in JSON at position 0

This happens when the LLM prefixes JSON with text like "Here is the JSON:" or wraps it in markdown code fences. The fix is to use the extractJsonFromText function shown above, which strips common wrappers before parsing. Also double-check that your system prompt explicitly says "respond with ONLY JSON."

2. Zod Validation Errors on Numeric Strings

price: Expected number, received string

LLMs frequently return numbers as strings, especially in Claude responses. Use Zod's coerce to handle this:

var schema = z.object({
  price: z.coerce.number().positive(),
  quantity: z.coerce.number().int().positive()
});

This will accept both 49.99 and "49.99" and coerce the string to a number.

3. Truncated JSON from Token Limits

SyntaxError: Unexpected end of JSON input

This occurs when max_tokens is too low for the expected response. The model generates valid JSON until it runs out of tokens, then cuts off mid-object. Solutions: increase max_tokens, reduce the amount of data you are asking for, or split large extractions into multiple calls. You can detect this by checking the finish_reason in the API response:

var finishReason = response.choices[0].finish_reason;
if (finishReason === "length") {
  console.warn("Response was truncated due to token limit");
  // Retry with higher max_tokens or smaller input
}

4. Schema Mismatch with enum Fields

Invalid enum value. Expected 'electronics' | 'clothing' | 'food', received 'Electronics'

LLMs do not always respect casing in enum values. Add a preprocessing transform:

var CategorySchema = z.string().transform(function(val) {
  return val.toLowerCase().trim();
}).pipe(z.enum(["electronics", "clothing", "food", "software", "other"]));

5. OpenAI JSON Mode Requires "JSON" in Prompt

BadRequestError: 'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.

This is a hard requirement from OpenAI's API. If you are using response_format: { type: "json_object" }, the word "JSON" (case-insensitive) must appear somewhere in the system or user prompt. A simple fix is adding it to your system prompt: "Respond with valid JSON."

6. Streaming Depth Tracking Off by One

If your streaming JSON accumulator reports completion too early, it is usually because the buffer started with whitespace or a non-JSON preamble. Always trim the buffer before checking depth, and start tracking only after you encounter the first { or [.
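One way to implement that guard, as a sketch extending the JsonStreamAccumulator from the complete example above (the started flag is new; it is undefined, hence falsy, until the JSON begins):

// Ignore everything before the first structural character
JsonStreamAccumulator.prototype.appendSafe = function(chunk) {
  if (!this.started) {
    var idx = chunk.search(/[{\[]/);
    if (idx === -1) return;     // still in the preamble, skip the whole chunk
    this.started = true;
    chunk = chunk.slice(idx);   // drop the preamble portion of this chunk
  }
  this.append(chunk);
};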

Best Practices

  • Always validate with a schema. Never trust raw LLM output, even with JSON mode enabled. Provider-level JSON guarantees only ensure syntactic validity, not semantic correctness. Zod or a similar validation library should be your last gate before the data enters your application.

  • Use the retry-with-feedback pattern. When validation fails, send the specific errors back to the model. This is dramatically more effective than a blind retry because the model can see exactly what went wrong and correct it. Three attempts with feedback is usually sufficient.

  • Set low temperature for structured output. Use temperature: 0 or temperature: 0.1 when you need consistent structured output. Higher temperatures increase creativity but also increase the chance of format deviations, unexpected field names, and values outside your expected ranges.

  • Prefer provider-native structured output when available. OpenAI's Structured Outputs with strict: true is the most reliable option for that platform. It uses constrained decoding at the model level, which is fundamentally more reliable than prompt-based constraints. Use it whenever your schema fits within its limitations.

  • Design prompts defensively. Include the expected schema in the prompt, provide an example of the expected output, and explicitly state what NOT to do ("no code fences, no explanation, no markdown"). Redundancy in prompt instructions is a feature, not a bug.

  • Handle streaming JSON as a state machine. Track brace/bracket depth, string context, and escape sequences. Do not try to parse partial JSON with JSON.parse on every chunk; instead wait until the depth tracker indicates completion. This prevents both errors and wasted CPU cycles.

  • Log raw responses in development. When structured output parsing fails in production, you need to see exactly what the model returned. Log the raw response text alongside the parse error so you can diagnose whether the issue is in the prompt, the model, or the parser.

  • Separate extraction from validation. Keep your JSON extraction logic (handling code fences, XML tags, brace extraction) separate from your schema validation logic. This makes each layer independently testable and lets you swap out extraction strategies without touching validation.

  • Version your schemas. When you change the structure of your LLM output schema, treat it like an API version change. Old prompts cached in your system might still produce the old format. Use schema versioning or graceful migration to handle this.

  • Monitor parse success rates. Track the percentage of LLM calls that parse and validate successfully on the first attempt, on retry, and that fail completely. This metric tells you more about your system's reliability than any unit test.
