Planning and Reasoning in AI Agents

Build planning and reasoning capabilities for AI agents with task decomposition, tree-of-thought, replanning, and adaptive execution in Node.js.

Overview

An LLM that can answer questions is useful. An LLM that can plan a sequence of actions, reason about dependencies, adapt when things go wrong, and execute a multi-step workflow autonomously is transformative. Planning and reasoning are the capabilities that separate a chatbot from a genuine AI agent — they give the system the ability to decompose ambiguous goals into concrete steps, evaluate trade-offs between approaches, and recover gracefully when reality diverges from the plan. In this article, we will build planning and reasoning infrastructure in Node.js from the ground up, covering plan generation, validation, execution, and replanning.

Prerequisites

  • Node.js v18 or later installed
  • Working knowledge of Express.js and asynchronous JavaScript
  • An OpenAI API key (or Anthropic API key — the patterns apply to any LLM provider)
  • Familiarity with basic agent concepts (tool calling, prompt engineering)
  • openai npm package installed (npm install openai)

Why Planning Matters for Complex Agent Tasks

Most agent frameworks treat the LLM as a reactive loop: observe the environment, pick a tool, execute, repeat. This works for simple tasks — "look up the weather" or "summarize this document." But hand a reactive agent a task like "migrate our user database from MySQL to PostgreSQL, validate data integrity, update the application connection strings, run the test suite, and roll back if anything fails," and you will watch it flounder.

The problem is that reactive agents have no foresight. They cannot reason about dependencies between steps, estimate whether they have the tools and permissions to complete the job, or decide upfront that one approach is better than another. They stumble forward one action at a time, often painting themselves into corners that require expensive backtracking.

Planning gives an agent three critical capabilities:

  1. Decomposition — breaking an ambiguous goal into concrete, ordered steps
  2. Anticipation — identifying potential failure points before committing resources
  3. Coordination — managing dependencies so that steps execute in the right order with the right inputs

Without planning, agents waste tokens, make redundant API calls, and produce brittle results. With planning, they behave more like experienced engineers who sketch out an approach before writing code.

Plan-Then-Execute vs. Interleaved Planning

There are two dominant paradigms for how agents incorporate planning into their execution loop.

Plan-then-execute generates the entire plan upfront, then hands it to an executor that runs each step sequentially. This is simpler to implement and easier to debug, but it is fragile — if step 3 fails or produces unexpected output, the remaining steps may be invalid.

Interleaved planning generates a high-level plan, executes one or two steps, then replans based on the new state of the world. This is more robust but more expensive in terms of tokens and latency, since the LLM is invoked for planning at every decision point.

In practice, a hybrid often works best: generate a full plan upfront, execute it step by step, and trigger replanning only when a step fails or produces output that materially changes the situation.

var PLANNING_MODE = {
  FULL_UPFRONT: "full_upfront",
  INTERLEAVED: "interleaved",
  HYBRID: "hybrid"
};

function selectPlanningMode(task) {
  // Simple tasks with well-known steps: plan once, execute
  if (task.complexity === "low") return PLANNING_MODE.FULL_UPFRONT;
  // Highly uncertain tasks: replan after every step
  if (task.uncertainty === "high") return PLANNING_MODE.INTERLEAVED;
  // Most real-world tasks: plan upfront, replan on failure
  return PLANNING_MODE.HYBRID;
}
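
Hybrid mode depends on deciding when a step result "materially changes the situation." The following is a minimal sketch of one possible trigger. It assumes a hypothetical result shape of { ok, output } and an optional mustContain hint on the step; neither is part of the plan schema used in this article.

```javascript
// Hypothetical replan trigger for hybrid mode.
// Assumes result = { ok: boolean, output: any } and an optional
// step.mustContain string; both shapes are illustrative only.
function shouldReplan(step, result) {
  // A hard failure always triggers replanning
  if (!result.ok) return true;
  // No output means downstream steps have nothing to consume
  if (result.output === undefined || result.output === null) return true;
  if (step.mustContain) {
    var text = typeof result.output === "string"
      ? result.output
      : JSON.stringify(result.output);
    // Replan when the output lacks the expected marker
    return text.indexOf(step.mustContain) === -1;
  }
  return false;
}
```

A real trigger would likely compare the output against the step's expectedOutput, possibly with an LLM check; the substring test here only keeps the sketch deterministic.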

Implementing Plan Generation with Structured Output

The first concrete piece of infrastructure is a plan generator that takes a goal and produces a structured plan. We want the LLM to output JSON, not free-form text, so that our executor can parse and traverse the plan programmatically.

var OpenAI = require("openai");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

var PLAN_GENERATION_PROMPT = [
  "You are a planning agent. Given a goal, produce a structured execution plan.",
  "Each step must have: id, description, dependencies (array of step IDs that must complete first),",
  "tool (the tool/function to call), parameters (object), expectedOutput (what success looks like),",
  "and fallback (what to do if this step fails).",
  "",
  "Rules:",
  "- Steps with no dependencies can run in parallel",
  "- Every step must have at least one fallback strategy",
  "- Identify resource constraints (API rate limits, file locks, etc.)",
  "- Estimate the complexity of each step as low/medium/high",
  "",
  "Respond with valid JSON only. No markdown, no explanation."
].join("\n");

function generatePlan(goal, availableTools, constraints) {
  var toolDescriptions = availableTools.map(function(t) {
    return t.name + ": " + t.description;
  }).join("\n");

  // Always list the tools, even when no constraints object is passed
  var constraintText = "\n\nAvailable tools:\n" + toolDescriptions;
  if (constraints) {
    constraintText += "\n\nConstraints:\n" +
      "- Max steps: " + (constraints.maxSteps || 20) + "\n" +
      "- Time budget: " + (constraints.timeBudgetMs ? constraints.timeBudgetMs + "ms" : "unlimited") + "\n" +
      "- Max LLM calls: " + (constraints.maxLlmCalls || "unlimited");
  }

  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      { role: "system", content: PLAN_GENERATION_PROMPT },
      {
        role: "user",
        content: "Goal: " + goal + constraintText
      }
    ],
    temperature: 0.2
  }).then(function(response) {
    var plan = JSON.parse(response.choices[0].message.content);
    plan.metadata = {
      generatedAt: new Date().toISOString(),
      goal: goal,
      constraints: constraints || {}
    };
    return plan;
  });
}

A well-structured plan output looks like this:

{
  "steps": [
    {
      "id": "step_1",
      "description": "Read the source MySQL database schema",
      "dependencies": [],
      "tool": "database_query",
      "parameters": { "query": "SHOW TABLES; DESCRIBE each table" },
      "expectedOutput": "Complete schema definition",
      "fallback": "Request schema file from user",
      "complexity": "low"
    },
    {
      "id": "step_2",
      "description": "Generate PostgreSQL-compatible DDL from MySQL schema",
      "dependencies": ["step_1"],
      "tool": "llm_transform",
      "parameters": { "input": "step_1.output", "targetDialect": "postgresql" },
      "expectedOutput": "Valid PostgreSQL CREATE TABLE statements",
      "fallback": "Use generic SQL and fix errors iteratively",
      "complexity": "medium"
    }
  ]
}
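
Even with JSON mode enabled, it can be worth parsing the response defensively before handing it to the executor. The sketch below is one way to do that; parsePlanResponse is a hypothetical helper, not part of the openai package, and the only structural check it makes is that a steps array exists.

```javascript
// Defensive parsing of a plan response. Returns { plan, error }.
// \x60 is the backtick character, used here to match a markdown code fence.
function parsePlanResponse(text) {
  var cleaned = String(text).trim();
  // Strip a surrounding code fence if the model added one anyway
  var fenced = cleaned.match(/^\x60{3}(?:json)?\s*([\s\S]*?)\s*\x60{3}$/);
  if (fenced) cleaned = fenced[1];

  var plan;
  try {
    plan = JSON.parse(cleaned);
  } catch (err) {
    return { plan: null, error: "Invalid JSON: " + err.message };
  }
  if (!plan || !Array.isArray(plan.steps)) {
    return { plan: null, error: "Missing 'steps' array" };
  }
  return { plan: plan, error: null };
}
```

Returning an error value instead of throwing lets the caller decide whether to retry generation or give up.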

Tree-of-Thought Planning

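For particularly complex or ambiguous tasks, a single linear plan is often insufficient. Tree-of-thought (ToT) planning generates multiple candidate plans, evaluates each one, and selects the best path forward. Full ToT searches a tree of intermediate reasoning steps with backtracking; the version below is a simplified, single-level variant that samples a few complete plans and keeps the highest-scoring one. Think of it as the agent brainstorming several approaches before committing.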

function treeOfThoughtPlan(goal, availableTools, constraints, numCandidates) {
  numCandidates = numCandidates || 3;

  // Generate multiple candidate plans in parallel
  var promises = [];
  for (var i = 0; i < numCandidates; i++) {
    var prompt = "Generate approach #" + (i + 1) + " for the following goal. " +
      "Each approach should be meaningfully different from the others. " +
      "Approach " + (i + 1) + " should " +
      (i === 0 ? "prioritize speed and simplicity" :
       i === 1 ? "prioritize reliability and error handling" :
       "prioritize minimal resource usage") + ".";

    promises.push(
      generatePlan(prompt + "\n\nGoal: " + goal, availableTools, constraints)
        .then(function(plan) {
          // Store the original goal rather than the approach prompt, since replan() reads it
          plan.metadata.goal = goal;
          return plan;
        })
    );
  }

  return Promise.all(promises).then(function(plans) {
    // Score each plan
    return evaluatePlans(plans, goal, constraints);
  });
}

function evaluatePlans(plans, goal, constraints) {
  var scored = plans.map(function(plan, index) {
    var stepCount = plan.steps ? plan.steps.length : 0;
    var maxDepth = calculateDependencyDepth(plan);
    var parallelizable = countParallelizableSteps(plan);
    var highComplexityCount = plan.steps ? plan.steps.filter(function(s) {
      return s.complexity === "high";
    }).length : 0;

    // Lower is better for step count and depth; higher is better for parallelism
    var score = 100;
    score -= stepCount * 2;            // Penalize many steps
    score -= maxDepth * 5;             // Penalize deep dependency chains
    score += parallelizable * 3;       // Reward parallelism
    score -= highComplexityCount * 10; // Penalize complex steps

    // Penalize plans that exceed constraints
    if (constraints && constraints.maxSteps && stepCount > constraints.maxSteps) {
      score -= 50;
    }

    return { plan: plan, score: score, index: index };
  });

  scored.sort(function(a, b) { return b.score - a.score; });
  return scored[0].plan;
}

function calculateDependencyDepth(plan) {
  if (!plan.steps) return 0;
  var depthMap = {};

  function getDepth(stepId) {
    if (depthMap[stepId] !== undefined) return depthMap[stepId];
    var step = plan.steps.find(function(s) { return s.id === stepId; });
    if (!step || !step.dependencies || step.dependencies.length === 0) {
      depthMap[stepId] = 0;
      return 0;
    }
    depthMap[stepId] = 0; // Provisional value so a cyclic plan cannot recurse forever
    var maxDep = 0;
    step.dependencies.forEach(function(depId) {
      var d = getDepth(depId) + 1;
      if (d > maxDep) maxDep = d;
    });
    depthMap[stepId] = maxDep;
    return maxDep;
  }

  var maxDepth = 0;
  plan.steps.forEach(function(step) {
    var d = getDepth(step.id);
    if (d > maxDepth) maxDepth = d;
  });
  return maxDepth;
}

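// Counts only dependency-free steps (the first batch that can run in parallel),
// which is a rough proxy for overall parallelism.
function countParallelizableSteps(plan) {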
  if (!plan.steps) return 0;
  return plan.steps.filter(function(step) {
    return !step.dependencies || step.dependencies.length === 0;
  }).length;
}
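
To make the scoring heuristic concrete, here is the same arithmetic applied to two made-up candidates. scoreFromMetrics is a hypothetical standalone helper that reuses the weights from evaluatePlans and omits the maxSteps penalty; the metric values are invented for illustration.

```javascript
// Same weights as the heuristic above, applied to precomputed metrics.
function scoreFromMetrics(m) {
  return 100 -
    m.stepCount * 2 -        // Penalize many steps
    m.maxDepth * 5 +         // Penalize deep dependency chains
    m.rootSteps * 3 -        // Reward dependency-free steps
    m.highComplexity * 10;   // Penalize complex steps
}

// Candidate A: short and simple, but strictly sequential
var scoreA = scoreFromMetrics({ stepCount: 4, maxDepth: 3, rootSteps: 1, highComplexity: 0 });
// Candidate B: more parallel, but longer and with one high-complexity step
var scoreB = scoreFromMetrics({ stepCount: 6, maxDepth: 2, rootSteps: 3, highComplexity: 1 });

console.log(scoreA); // 80  (100 - 8 - 15 + 3 - 0)
console.log(scoreB); // 77  (100 - 12 - 10 + 9 - 10)
```

Candidate A wins even though B parallelizes better, which shows how heavily a single high-complexity step weighs under these weights. Treat the weights as tunable starting points.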

Plan Validation and Feasibility Checking

Generating a plan is not enough — you need to validate it before execution. Invalid plans waste time and tokens. A good validator checks for structural correctness, feasibility, and completeness.

function validatePlan(plan, availableTools) {
  var errors = [];
  var warnings = [];
  var stepIds = {};

  if (!plan.steps || plan.steps.length === 0) {
    errors.push("Plan has no steps");
    return { valid: false, errors: errors, warnings: warnings };
  }

  var toolNames = {};
  availableTools.forEach(function(t) { toolNames[t.name] = true; });

  plan.steps.forEach(function(step, index) {
    // Check required fields
    if (!step.id) errors.push("Step " + index + " missing id");
    if (!step.tool) errors.push("Step " + index + " missing tool");
    if (!step.description) warnings.push("Step " + index + " missing description");

    // Check for duplicate IDs
    if (stepIds[step.id]) {
      errors.push("Duplicate step id: " + step.id);
    }
    stepIds[step.id] = true;

    // Check tool availability
    if (step.tool && !toolNames[step.tool] && step.tool !== "llm_transform") {
      errors.push("Step " + step.id + " references unavailable tool: " + step.tool);
    }

    // Check dependency references
    if (step.dependencies) {
      step.dependencies.forEach(function(depId) {
        if (!plan.steps.find(function(s) { return s.id === depId; })) {
          errors.push("Step " + step.id + " depends on non-existent step: " + depId);
        }
      });
    }
  });

  // Check for circular dependencies
  var circularCheck = detectCycles(plan);
  if (circularCheck.hasCycle) {
    errors.push("Circular dependency detected: " + circularCheck.cycle.join(" -> "));
  }

  return {
    valid: errors.length === 0,
    errors: errors,
    warnings: warnings
  };
}

function detectCycles(plan) {
  var visited = {};
  var inStack = {};

  function dfs(stepId, path) {
    if (inStack[stepId]) {
      return { hasCycle: true, cycle: path.concat(stepId) };
    }
    if (visited[stepId]) return { hasCycle: false };

    visited[stepId] = true;
    inStack[stepId] = true;

    var step = plan.steps.find(function(s) { return s.id === stepId; });
    if (step && step.dependencies) {
      for (var i = 0; i < step.dependencies.length; i++) {
        var result = dfs(step.dependencies[i], path.concat(stepId));
        if (result.hasCycle) return result;
      }
    }

    inStack[stepId] = false;
    return { hasCycle: false };
  }

  for (var i = 0; i < plan.steps.length; i++) {
    var result = dfs(plan.steps[i].id, []);
    if (result.hasCycle) return result;
  }
  return { hasCycle: false };
}
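
An alternative to the depth-first cycle check is Kahn's algorithm, which detects cycles and also yields a valid execution order in the same pass. The sketch below is a swapped-in technique rather than what the validator above uses; topologicalOrder is a hypothetical helper operating on the same steps array.

```javascript
// Kahn's algorithm over plan steps. Returns step IDs in a valid execution
// order, or null if the dependencies contain a cycle. A dependency on a
// step that does not exist also yields null, because it can never be met.
function topologicalOrder(steps) {
  var remainingDeps = {}; // stepId -> number of unmet dependencies
  var dependents = {};    // stepId -> IDs of steps waiting on it

  steps.forEach(function(step) {
    var deps = step.dependencies || [];
    remainingDeps[step.id] = deps.length;
    deps.forEach(function(depId) {
      (dependents[depId] = dependents[depId] || []).push(step.id);
    });
  });

  // Start with every step that has no dependencies
  var queue = steps
    .filter(function(s) { return remainingDeps[s.id] === 0; })
    .map(function(s) { return s.id; });
  var order = [];

  while (queue.length > 0) {
    var id = queue.shift();
    order.push(id);
    (dependents[id] || []).forEach(function(next) {
      remainingDeps[next]--;
      if (remainingDeps[next] === 0) queue.push(next);
    });
  }

  // Any step left unprocessed is part of, or blocked by, a cycle
  return order.length === steps.length ? order : null;
}
```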

Replanning When Steps Fail

Static plans break. APIs go down, data is not in the expected format, permissions are missing. The replanning module takes the current plan state — which steps completed, which failed, what outputs were collected — and generates a revised plan.

function replan(originalPlan, executionState, failedStep, failureReason) {
  var completedSteps = Object.keys(executionState.completed || {});
  var completedOutputs = executionState.completed || {};

  var context = {
    originalGoal: originalPlan.metadata.goal,
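    completedSteps: completedSteps.map(function(id) {
      var output = completedOutputs[id];
      // Tool outputs are not always strings, so serialize before truncating
      var text = typeof output === "string" ? output : JSON.stringify(output);
      return {
        id: id,
        output: String(text).substring(0, 500) // Truncate for token efficiency
      };
    }),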
    failedStep: {
      id: failedStep.id,
      description: failedStep.description,
      tool: failedStep.tool,
      error: failureReason
    },
    remainingSteps: originalPlan.steps.filter(function(s) {
      return !completedSteps.includes(s.id) && s.id !== failedStep.id;
    })
  };

  var replanPrompt = [
    "The following plan partially executed but step '" + failedStep.id + "' failed.",
    "",
    "Original goal: " + context.originalGoal,
    "Completed steps: " + JSON.stringify(context.completedSteps),
    "Failed step: " + JSON.stringify(context.failedStep),
    "Remaining steps: " + JSON.stringify(context.remainingSteps),
    "",
    "Generate a revised plan that:",
    "1. Does NOT repeat already-completed steps",
    "2. Uses the failed step's fallback strategy if viable",
    "3. Adjusts remaining steps to account for the failure",
    "4. Maintains the original goal",
    "",
    "If the goal is no longer achievable, set 'achievable' to false and explain why."
  ].join("\n");

  return client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      { role: "system", content: PLAN_GENERATION_PROMPT },
      { role: "user", content: replanPrompt }
    ],
    temperature: 0.2
  }).then(function(response) {
    var revisedPlan = JSON.parse(response.choices[0].message.content);
    revisedPlan.metadata = {
      generatedAt: new Date().toISOString(),
      goal: context.originalGoal,
      isReplan: true,
      replanReason: failureReason,
      replanCount: (originalPlan.metadata.replanCount || 0) + 1
    };
    return revisedPlan;
  });
}
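
Because step outputs end up in the replanning prompt, it helps to have one place that turns any output into a bounded string. The following is a minimal sketch of such a helper; summarizeOutput is a hypothetical name, and the truncation marker format is an arbitrary choice.

```javascript
// Convert any step output into a string of bounded length for use in a prompt.
function summarizeOutput(value, maxChars) {
  var text;
  if (typeof value === "string") {
    text = value;
  } else {
    try {
      // JSON.stringify returns undefined for undefined and for functions
      text = JSON.stringify(value);
    } catch (err) {
      // Circular structures and BigInt values make JSON.stringify throw
      text = undefined;
    }
    if (text === undefined) text = String(value);
  }
  if (text.length <= maxChars) return text;
  return text.substring(0, maxChars) +
    "... [truncated " + (text.length - maxChars) + " chars]";
}
```

Telling the model how much was cut gives it a hint that the output was longer than what it can see.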

Hierarchical Planning

Real-world tasks have natural hierarchies. "Deploy the application" breaks down into "build the artifact," "run tests," "push to registry," and "update the deployment." Each of those breaks down further. Hierarchical planning captures this structure and lets the agent reason at different levels of abstraction.

function hierarchicalPlan(goal, availableTools, depth) {
  depth = depth || 0;
  // With up to 5 phases per level, depth 3 can mean 1 + 5 + 25 + 125 = 156 LLM calls
  var maxDepth = 3;

  if (depth >= maxDepth) {
    // At leaf level, generate atomic actions
    return generatePlan(goal, availableTools, { maxSteps: 5 });
  }

  // Generate high-level plan first
  var highlevelPrompt = [
    "Break this goal into 3-5 HIGH-LEVEL phases. Each phase should be a major milestone.",
    "Do NOT include low-level implementation details.",
    "Each phase will be decomposed into detailed steps separately."
  ].join("\n");

  // Pass a constraints object so generatePlan includes the tool list in the prompt
  return generatePlan(highlevelPrompt + "\n\nGoal: " + goal, availableTools, { maxSteps: 5 })
    .then(function(highLevelPlan) {
      // Recursively decompose each phase
      var decompositions = highLevelPlan.steps.map(function(phase) {
        return hierarchicalPlan(
          "Implement: " + phase.description +
          " (as part of: " + goal + ")",
          availableTools,
          depth + 1
        ).then(function(subPlan) {
          phase.subSteps = subPlan.steps;
          return phase;
        });
      });

      return Promise.all(decompositions).then(function(phases) {
        highLevelPlan.steps = phases;
        highLevelPlan.metadata = {
          goal: goal,
          hierarchyDepth: maxDepth,
          generatedAt: new Date().toISOString()
        };
        return highLevelPlan;
      });
    });
}
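
A nested plan with subSteps has to be flattened before a DAG-based executor can run it. One possible approach is sketched below. flattenHierarchicalPlan is a hypothetical helper, and the "phaseId.subStepId" naming is an assumption made for illustration.

```javascript
// Flatten a nested plan into a flat step list. Leaf IDs are namespaced as
// "<parentId>.<childId>". A step that depends on a phase waits for every
// leaf of that phase.
function flattenHierarchicalPlan(plan) {
  var flat = [];
  var leavesByStep = {}; // step id at this level -> flattened leaf IDs

  // First pass: expand each step into its leaves (recursively)
  var expanded = plan.steps.map(function(step) {
    var leaves;
    if (step.subSteps && step.subSteps.length > 0) {
      leaves = flattenHierarchicalPlan({ steps: step.subSteps }).steps
        .map(function(leaf) {
          return Object.assign({}, leaf, {
            id: step.id + "." + leaf.id,
            dependencies: leaf.dependencies.map(function(d) {
              return step.id + "." + d;
            })
          });
        });
    } else {
      var copy = Object.assign({}, step, { dependencies: [] });
      delete copy.subSteps;
      leaves = [copy];
    }
    leavesByStep[step.id] = leaves.map(function(l) { return l.id; });
    return { step: step, leaves: leaves };
  });

  // Second pass: add dependencies inherited from prerequisite steps
  expanded.forEach(function(entry) {
    var inherited = [];
    (entry.step.dependencies || []).forEach(function(depId) {
      inherited = inherited.concat(leavesByStep[depId] || []);
    });
    entry.leaves.forEach(function(leaf) {
      leaf.dependencies = leaf.dependencies.concat(inherited);
      flat.push(leaf);
    });
  });

  return { steps: flat };
}
```

Making each leaf wait for every leaf of a prerequisite phase is conservative; depending only on the prerequisite's terminal steps would produce fewer edges with the same ordering.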

Plan Representation Formats

Different plan structures suit different execution models. Here are three common formats and when to use each.

JSON Task Lists — simplest format. A flat array of tasks with dependency pointers. Good for linear or mildly parallel workflows.

DAGs (Directed Acyclic Graphs) — express complex dependencies where multiple paths converge. Better for build-system-style workflows.

State Machines — model plans where the next action depends on the current state. Best for workflows with conditional branches and loops.

// Convert a task-list plan to a DAG for execution
function planToDAG(plan) {
  var nodes = {};
  var edges = [];

  plan.steps.forEach(function(step) {
    nodes[step.id] = {
      id: step.id,
      tool: step.tool,
      parameters: step.parameters,
      status: "pending",
      output: null
    };

    if (step.dependencies) {
      step.dependencies.forEach(function(depId) {
        edges.push({ from: depId, to: step.id });
      });
    }
  });

  return { nodes: nodes, edges: edges };
}

// Get all steps that are ready to execute (dependencies satisfied)
function getReadySteps(dag) {
  var ready = [];

  Object.keys(dag.nodes).forEach(function(nodeId) {
    var node = dag.nodes[nodeId];
    if (node.status !== "pending") return;

    var incomingEdges = dag.edges.filter(function(e) { return e.to === nodeId; });
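    var allDepsSatisfied = incomingEdges.every(function(edge) {
      var dep = dag.nodes[edge.from];
      // A missing node (a dangling dependency) counts as unsatisfied instead of throwing
      return !!dep && dep.status === "completed";
    });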

    if (allDepsSatisfied) ready.push(node);
  });

  return ready;
}
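
The helpers above cover the task-list and DAG formats. For the state-machine format, a minimal sketch might look like the following. The state names, the "success"/"failure" outcome labels, and runMachine itself are all hypothetical and chosen only to illustrate conditional branching.

```javascript
// Hypothetical state-machine plan: each state names a tool and maps an
// outcome ("success" or "failure") to the next state.
var migrationMachine = {
  initial: "export_data",
  states: {
    export_data: { tool: "database_query", on: { success: "import_data", failure: "abort" } },
    import_data: { tool: "database_query", on: { success: "verify", failure: "rollback" } },
    verify:      { tool: "llm_transform",  on: { success: "done", failure: "rollback" } },
    rollback:    { tool: "database_query", on: { success: "abort", failure: "abort" } },
    done:  { terminal: true },
    abort: { terminal: true }
  }
};

// Walk the machine using a callback that reports each state's outcome.
// maxTransitions guards against loops that never reach a terminal state.
function runMachine(machine, getOutcome, maxTransitions) {
  var current = machine.initial;
  var visited = [current];
  var transitions = 0;

  while (!machine.states[current].terminal) {
    if (transitions >= maxTransitions) {
      throw new Error("No terminal state reached within " + maxTransitions + " transitions");
    }
    current = machine.states[current].on[getOutcome(current)];
    visited.push(current);
    transitions++;
  }
  return visited;
}
```

In a real executor, getOutcome would run the state's tool and classify the result; here it is a plain callback so the control flow can be inspected on its own.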

Constraint-Aware Planning

Production agents operate under real constraints: API rate limits, budget caps, time windows, and tool availability. Constraint-aware planning bakes these into the plan generation process so the agent does not produce plans it cannot execute.

function ConstraintChecker(constraints) {
  this.maxLlmCalls = constraints.maxLlmCalls || Infinity;
  this.maxTimeBudgetMs = constraints.timeBudgetMs || Infinity;
  this.maxCostDollars = constraints.maxCostDollars || Infinity;
  this.availableTools = constraints.availableTools || [];
  this.llmCallCount = 0;
  this.startTime = Date.now();
  this.costAccumulated = 0;
}

ConstraintChecker.prototype.canProceed = function() {
  if (this.llmCallCount >= this.maxLlmCalls) {
    return { allowed: false, reason: "LLM call limit reached (" + this.maxLlmCalls + ")" };
  }
  var elapsed = Date.now() - this.startTime;
  if (elapsed >= this.maxTimeBudgetMs) {
    return { allowed: false, reason: "Time budget exhausted (" + this.maxTimeBudgetMs + "ms)" };
  }
  if (this.costAccumulated >= this.maxCostDollars) {
    return { allowed: false, reason: "Cost budget exhausted ($" + this.maxCostDollars + ")" };
  }
  return { allowed: true };
};

ConstraintChecker.prototype.recordLlmCall = function(inputTokens, outputTokens) {
  this.llmCallCount++;
  // Approximate GPT-4o pricing when this was written ($2.50/1M input, $10/1M output); verify current rates
  this.costAccumulated += (inputTokens / 1000000) * 2.50 + (outputTokens / 1000000) * 10;
};

ConstraintChecker.prototype.hasToolAvailable = function(toolName) {
  if (this.availableTools.length === 0) return true; // No restriction
  return this.availableTools.some(function(t) { return t.name === toolName; });
};
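
The same cost arithmetic can be used before execution to check whether a plan is likely to fit the budget. The sketch below assumes average token counts per call are known or estimated; estimateCostDollars and fitsBudget are hypothetical helpers, and the default prices simply mirror the approximate figures in the comment above.

```javascript
// Prices are in dollars per 1M tokens. The defaults are assumptions that
// mirror the approximate figures above; check current pricing before use.
function estimateCostDollars(inputTokens, outputTokens, prices) {
  prices = prices || { inputPerMillion: 2.50, outputPerMillion: 10 };
  return (inputTokens / 1000000) * prices.inputPerMillion +
    (outputTokens / 1000000) * prices.outputPerMillion;
}

// Rough pre-flight check: will the planned number of LLM calls fit the budget?
function fitsBudget(plannedCalls, avgInputTokens, avgOutputTokens, maxCostDollars) {
  var perCall = estimateCostDollars(avgInputTokens, avgOutputTokens);
  return plannedCalls * perCall <= maxCostDollars;
}

// 2,000 input + 500 output tokens is roughly $0.005 + $0.005 = $0.01 per call,
// so 50 calls cost about $0.50 and 150 calls about $1.50.
console.log(fitsBudget(50, 2000, 500, 1.00));  // true
console.log(fitsBudget(150, 2000, 500, 1.00)); // false
```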

Implementing a Plan Executor with Rollback

The executor runs each step, tracks state, and handles failures with rollback support. This is the heart of the agent's execution engine.

function PlanExecutor(plan, toolRegistry, constraintChecker) {
  this.plan = plan;
  this.dag = planToDAG(plan);
  this.toolRegistry = toolRegistry;
  this.constraintChecker = constraintChecker;
  this.executionLog = [];
  this.rollbackStack = [];
  this.state = {
    completed: {},
    failed: {},
    skipped: {}
  };
}

PlanExecutor.prototype.execute = function() {
  var self = this;
  return self._executeNextBatch();
};

PlanExecutor.prototype._executeNextBatch = function() {
  var self = this;

  // Check constraints before proceeding
  var check = self.constraintChecker.canProceed();
  if (!check.allowed) {
    return Promise.reject(new Error("Constraint violation: " + check.reason));
  }

  var readySteps = getReadySteps(self.dag);
  if (readySteps.length === 0) {
    // Check if we're done or stuck
    var pendingCount = Object.keys(self.dag.nodes).filter(function(id) {
      return self.dag.nodes[id].status === "pending";
    }).length;

    if (pendingCount === 0) {
      return Promise.resolve({
        success: true,
        state: self.state,
        log: self.executionLog
      });
    }

    return Promise.reject(new Error("Deadlock: " + pendingCount + " steps pending but none ready"));
  }

  // Execute ready steps in parallel
  var stepPromises = readySteps.map(function(step) {
    return self._executeStep(step);
  });

  return Promise.all(stepPromises).then(function() {
    return self._executeNextBatch();
  });
};

PlanExecutor.prototype._executeStep = function(step) {
  var self = this;
  var startTime = Date.now();

  self.executionLog.push({
    stepId: step.id,
    event: "started",
    timestamp: new Date().toISOString()
  });

  var tool = self.toolRegistry[step.tool];
  if (!tool) {
    return self._handleStepFailure(step, "Tool not found: " + step.tool);
  }

  // Resolve parameter references (e.g., "step_1.output" -> actual output)
  var resolvedParams = self._resolveParameters(step.parameters);

  return tool.execute(resolvedParams).then(function(output) {
    var duration = Date.now() - startTime;
    step.status = "completed";
    step.output = output;
    self.state.completed[step.id] = output;

    // Push rollback action if the tool supports it
    if (tool.rollback) {
      self.rollbackStack.push({
        stepId: step.id,
        rollbackFn: function() { return tool.rollback(resolvedParams, output); }
      });
    }

    self.executionLog.push({
      stepId: step.id,
      event: "completed",
      duration: duration,
      timestamp: new Date().toISOString()
    });

    return output;
  }).catch(function(error) {
    return self._handleStepFailure(step, error.message);
  });
};

PlanExecutor.prototype._handleStepFailure = function(step, errorMessage) {
  var self = this;

  self.executionLog.push({
    stepId: step.id,
    event: "failed",
    error: errorMessage,
    timestamp: new Date().toISOString()
  });

  step.status = "failed";
  self.state.failed[step.id] = errorMessage;

  // Find the original plan step for fallback info
  var originalStep = self.plan.steps.find(function(s) { return s.id === step.id; });

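  var replanCount = self.plan.metadata.replanCount || 0; // undefined on the initial plan

  if (originalStep && originalStep.fallback && replanCount < 3) {
    console.log("Step " + step.id + " failed. Attempting replan...");
    return replan(self.plan, self.state, originalStep, errorMessage)
      .then(function(revisedPlan) {
        if (revisedPlan.achievable === false || !Array.isArray(revisedPlan.steps)) {
          throw new Error("Replan for step " + step.id + " produced no executable plan");
        }
        // Replace remaining steps with revised plan
        self.plan = revisedPlan;
        self.dag = planToDAG(revisedPlan);
        // Carry over completed steps. The revised plan omits them, so re-add
        // them as completed nodes that later steps can still depend on.
        Object.keys(self.state.completed).forEach(function(completedId) {
          if (!self.dag.nodes[completedId]) {
            self.dag.nodes[completedId] = { id: completedId, output: self.state.completed[completedId] };
          }
          self.dag.nodes[completedId].status = "completed";
        });
        return null; // Continue execution with revised plan
      });
  }

  return Promise.reject(new Error("Step " + step.id + " failed and could not be replanned: " + errorMessage));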
};

PlanExecutor.prototype._resolveParameters = function(params) {
  var self = this;
  var resolved = {};

  Object.keys(params || {}).forEach(function(key) {
    var value = params[key];
    if (typeof value === "string" && value.match(/^step_\w+\.output$/)) {
      var refStepId = value.replace(".output", "");
      resolved[key] = self.state.completed[refStepId] || value;
    } else {
      resolved[key] = value;
    }
  });

  return resolved;
};

PlanExecutor.prototype.rollback = function() {
  var self = this;
  console.log("Rolling back " + self.rollbackStack.length + " steps...");

  function rollbackNext() {
    if (self.rollbackStack.length === 0) return Promise.resolve();
    var entry = self.rollbackStack.pop(); // LIFO: undo the most recent step first
    console.log("Rolling back step: " + entry.stepId);
    // Promise.resolve().then(...) also captures synchronous throws from rollbackFn
    return Promise.resolve().then(entry.rollbackFn).catch(function(err) {
      console.error("Rollback failed for step " + entry.stepId + ": " + err.message);
    }).then(rollbackNext); // Continue rolling back remaining steps
  }

  return rollbackNext();
};
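
The executor checks the time budget only between batches, so a single hung tool call could stall a run indefinitely. One way to bound each step is a timeout wrapper around the tool promise. This is a sketch under that assumption; withTimeout is a hypothetical helper, and it stops waiting without cancelling the underlying work.

```javascript
// Reject if the wrapped promise does not settle within ms milliseconds.
// This does not cancel the underlying operation; it only stops waiting for it.
function withTimeout(promise, ms, label) {
  var timer;
  var timeout = new Promise(function(resolve, reject) {
    timer = setTimeout(function() {
      reject(new Error((label || "operation") + " timed out after " + ms + "ms"));
    }, ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  return Promise.race([promise, timeout]).finally(function() {
    clearTimeout(timer);
  });
}
```

Inside _executeStep, the tool call could then be wrapped as withTimeout(tool.execute(resolvedParams), 30000, step.id), where the 30-second limit is an arbitrary example value.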

Plan Caching and Reuse

Many tasks are structurally similar. "Generate a report for Q1" and "Generate a report for Q2" should not require full plan regeneration. Plan caching stores successful plans keyed by a normalized version of the goal, so similar future tasks can reuse them.

function PlanCache() {
  this.cache = {};
}

PlanCache.prototype.normalizeGoal = function(goal) {
  // Strip specific values, keep structure
  return goal
    .toLowerCase()
    .replace(/\d{4}-\d{2}-\d{2}/g, "<DATE>")
    .replace(/\d+/g, "<NUM>") // No \b anchors, so digits inside tokens like "q1" are masked too
    .replace(/["'][^"']+["']/g, "<STRING>")
    .replace(/\s+/g, " ")
    .trim();
};

PlanCache.prototype.get = function(goal) {
  var key = this.normalizeGoal(goal);
  var entry = this.cache[key];
  if (!entry) return null;

  // Check if cached plan is still fresh (24 hours)
  var age = Date.now() - entry.cachedAt;
  if (age > 24 * 60 * 60 * 1000) {
    delete this.cache[key];
    return null;
  }

  entry.hitCount++;
  console.log("Plan cache hit for: " + key);
  // Deep clone to avoid mutation
  return JSON.parse(JSON.stringify(entry.plan));
};

PlanCache.prototype.set = function(goal, plan) {
  var key = this.normalizeGoal(goal);
  this.cache[key] = {
    plan: plan,
    cachedAt: Date.now(),
    hitCount: 0
  };
};
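
To see which goals end up sharing a cache key, here is a standalone variant of the normalization step. normalizeGoalKey is a hypothetical name, and this sketch assumes digits embedded in tokens such as "Q1" should be masked as well, which a word-boundary pattern like \b\d+\b would not do.

```javascript
// Standalone variant of the normalization step. /\d+/ masks digits
// anywhere, including inside tokens such as "q1".
function normalizeGoalKey(goal) {
  return goal
    .toLowerCase()
    .replace(/\d{4}-\d{2}-\d{2}/g, "<DATE>")
    .replace(/\d+/g, "<NUM>")
    .replace(/\s+/g, " ")
    .trim();
}

var keyQ1 = normalizeGoalKey("Generate a report for Q1");
var keyQ2 = normalizeGoalKey("Generate a  report for Q2");

console.log(keyQ1);           // "generate a report for q<NUM>"
console.log(keyQ1 === keyQ2); // true
```

A plan cached for one goal may still carry values specific to that goal in its step parameters, so a reused plan should have those parameters re-bound before execution.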

Reasoning Traces for Explainability

When a plan fails or produces unexpected results, you need to understand why the agent made the decisions it did. Reasoning traces capture the agent's thought process at each decision point.

function ReasoningTracer() {
  this.traces = [];
}

ReasoningTracer.prototype.record = function(stepId, category, reasoning) {
  this.traces.push({
    stepId: stepId,
    category: category, // "planning", "execution", "replanning", "constraint"
    reasoning: reasoning,
    timestamp: new Date().toISOString()
  });
};

ReasoningTracer.prototype.getTraceForStep = function(stepId) {
  return this.traces.filter(function(t) { return t.stepId === stepId; });
};

ReasoningTracer.prototype.getSummary = function() {
  var categories = {};
  this.traces.forEach(function(t) {
    if (!categories[t.category]) categories[t.category] = 0;
    categories[t.category]++;
  });
  return {
    totalTraces: this.traces.length,
    byCategory: categories,
    timeline: this.traces.map(function(t) {
      return "[" + t.category + "] " + t.stepId + ": " + t.reasoning.substring(0, 100);
    })
  };
};

Combining Planning with ReAct for Adaptive Execution

The ReAct (Reason + Act) pattern interleaves reasoning and action. Combined with planning, you get an agent that has a plan but can deviate intelligently when observations do not match expectations.

function reactExecuteStep(step, toolRegistry, tracer) {
  var maxIterations = 5;
  var iteration = 0;

  function iterate(observation) {
    iteration++;
    if (iteration > maxIterations) {
      return Promise.reject(new Error("Max ReAct iterations exceeded for step " + step.id));
    }

    // Reason about the observation
    return client.chat.completions.create({
      model: "gpt-4o",
      response_format: { type: "json_object" },
      messages: [
        {
          role: "system",
          content: "You are executing step: " + step.description + ". " +
            "Available tools: " + Object.keys(toolRegistry).join(", ") + ". " +
            "Given the current observation, decide: " +
            "(1) 'act' with a tool call, or (2) 'finish' with the final result. " +
            "Respond with JSON: { action: 'act'|'finish', tool?: string, params?: object, result?: string, reasoning: string }"
        },
        {
          role: "user",
          content: "Observation: " + (observation || "No prior observation. Beginning step execution.")
        }
      ],
      temperature: 0.1
    }).then(function(response) {
      var decision = JSON.parse(response.choices[0].message.content);

      tracer.record(step.id, "execution",
        "Iteration " + iteration + ": " + decision.reasoning
      );

      if (decision.action === "finish") {
        return decision.result;
      }

      // Execute the tool and observe the result
      var tool = toolRegistry[decision.tool];
      if (!tool) {
        return iterate("Error: Tool '" + decision.tool + "' not found. Available: " +
          Object.keys(toolRegistry).join(", "));
      }

      return tool.execute(decision.params || {}).then(function(output) {
        // JSON.stringify returns undefined for an undefined output, so coerce to a string first
        return iterate("Tool " + decision.tool + " returned: " + String(JSON.stringify(output)).substring(0, 1000));
      }).catch(function(err) {
        return iterate("Tool " + decision.tool + " failed: " + err.message);
      });
    });
  }

  return iterate(null);
}
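
The loop above passes only the latest observation to the model, so earlier tool results are forgotten between iterations. One option is to keep a bounded scratchpad of prior actions and observations and include it in the user message. The sketch below assumes a hypothetical history array of { action, observation } entries; buildScratchpad is not part of the code above.

```javascript
// Keep the most recent history entries that fit within maxChars.
// history is an array of { action, observation } objects (an assumed shape).
function buildScratchpad(history, maxChars) {
  var lines = [];
  var used = 0;
  // Walk backwards so the newest entries survive when space runs out
  for (var i = history.length - 1; i >= 0; i--) {
    var line = "Step " + (i + 1) + " | action: " + history[i].action +
      " | observation: " + history[i].observation;
    if (used + line.length > maxChars) break;
    lines.unshift(line); // Restore chronological order
    used += line.length;
  }
  return lines.join("\n");
}
```

The character budget is a crude stand-in for a token budget; a tokenizer-based limit would be more accurate.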

Complete Working Example

Here is a complete Node.js planning agent that ties everything together. It accepts a complex task, decomposes it into a plan, validates feasibility, executes with monitoring, and replans on failure.

var OpenAI = require("openai");
var http = require("http");

var client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// ---- Tool Registry ----
var tools = {
  file_read: {
    name: "file_read",
    description: "Read a file from disk",
    execute: function(params) {
      var fs = require("fs");
      return new Promise(function(resolve, reject) {
        fs.readFile(params.path, "utf8", function(err, data) {
          if (err) return reject(err);
          resolve(data);
        });
      });
    },
    rollback: null
  },
  file_write: {
    name: "file_write",
    description: "Write content to a file",
    execute: function(params) {
      var fs = require("fs");
      return new Promise(function(resolve, reject) {
        fs.writeFile(params.path, params.content, "utf8", function(err) {
          if (err) return reject(err);
          resolve("File written: " + params.path);
        });
      });
    },
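    // Caution: this deletes the file even if it existed before the write;
    // a safer rollback would restore the previous contents.
    rollback: function(params) {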
      var fs = require("fs");
      return new Promise(function(resolve) {
        fs.unlink(params.path, function() { resolve(); });
      });
    }
  },
  http_request: {
    name: "http_request",
    description: "Make an HTTP request",
    execute: function(params) {
      return new Promise(function(resolve, reject) {
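        var url = new URL(params.url);
        // Pick the transport by scheme; the http module cannot call https:// URLs
        var transport = url.protocol === "https:" ? require("https") : http;
        var req = transport.request({
          hostname: url.hostname,
          port: url.port || undefined, // Empty string means the scheme's default port
          path: url.pathname + url.search,
          method: params.method || "GET"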
        }, function(res) {
          var data = "";
          res.on("data", function(chunk) { data += chunk; });
          res.on("end", function() { resolve(data); });
        });
        req.on("error", reject);
        req.end();
      });
    },
    rollback: null
  },
  llm_transform: {
    name: "llm_transform",
    description: "Use an LLM to transform or analyze text",
    execute: function(params) {
      return client.chat.completions.create({
        model: "gpt-4o",
        messages: [
          { role: "system", content: params.instruction || "Transform the following input." },
          { role: "user", content: params.input || "" }
        ]
      }).then(function(r) { return r.choices[0].message.content; });
    },
    rollback: null
  }
};

// ---- Main Agent ----
function PlanningAgent(options) {
  options = options || {};  // allow new PlanningAgent() with all defaults
  this.tools = options.tools || tools;
  this.cache = new PlanCache();
  this.tracer = new ReasoningTracer();
  this.maxReplans = options.maxReplans || 3;
  this.constraints = new ConstraintChecker(options.constraints || {
    maxLlmCalls: 50,
    timeBudgetMs: 300000,  // 5 minutes
    maxCostDollars: 1.00,
    availableTools: Object.keys(options.tools || tools).map(function(name) {
      return { name: name };
    })
  });
}

PlanningAgent.prototype.run = function(goal) {
  var self = this;
  console.log("=== Planning Agent Started ===");
  console.log("Goal: " + goal);

  self.tracer.record("agent", "planning", "Received goal: " + goal);

  // Check cache first
  var cachedPlan = self.cache.get(goal);
  var planPromise;

  if (cachedPlan) {
    self.tracer.record("agent", "planning", "Using cached plan");
    planPromise = Promise.resolve(cachedPlan);
  } else {
    planPromise = generatePlan(goal, Object.values(self.tools), self.constraints);
  }

  return planPromise
    .then(function(plan) {
      self.tracer.record("agent", "planning",
        "Plan generated with " + plan.steps.length + " steps");

      // Validate
      var validation = validatePlan(plan, Object.values(self.tools));
      if (!validation.valid) {
        self.tracer.record("agent", "planning",
          "Plan validation failed: " + validation.errors.join("; "));
        return Promise.reject(new Error(
          "Invalid plan: " + validation.errors.join("; ")
        ));
      }

      if (validation.warnings.length > 0) {
        console.log("Plan warnings: " + validation.warnings.join("; "));
      }

      console.log("Plan validated. Executing " + plan.steps.length + " steps...");

      // Execute
      var executor = new PlanExecutor(plan, self.tools, self.constraints);
      return executor.execute().then(function(result) {
        // Cache successful plan
        self.cache.set(goal, plan);
        return result;
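      }).catch(function(err) {
        console.log("Execution failed: " + err.message);
        console.log("Attempting rollback...");
        return executor.rollback().catch(function(rollbackErr) {
          // Log rollback problems, but keep the original execution error as the failure reason
          console.log("Rollback failed: " + rollbackErr.message);
        }).then(function() {
          return Promise.reject(err);
        });
      });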
    })
    .then(function(result) {
      console.log("=== Planning Agent Completed ===");
      console.log("Trace summary:", self.tracer.getSummary());
      return result;
    });
};

// ---- Run the Agent ----
var agent = new PlanningAgent({
  tools: tools,
  constraints: {
    maxLlmCalls: 30,
    timeBudgetMs: 120000,
    maxCostDollars: 0.50
  }
});

agent.run(
  "Read the file config.json, extract all API endpoint URLs, " +
  "test each endpoint with a GET request, and write a health report " +
  "to health-report.txt with the status of each endpoint"
).then(function(result) {
  console.log("Final result:", JSON.stringify(result, null, 2));
}).catch(function(err) {
  console.error("Agent failed:", err.message);
});

Common Issues and Troubleshooting

1. Plan generation returns malformed JSON

SyntaxError: Unexpected token ` in JSON at position 0

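This happens when the LLM returns markdown-wrapped JSON (```json ... ```) instead of raw JSON. Always use response_format: { type: "json_object" } with OpenAI (the API requires the word "JSON" to appear somewhere in your messages when this mode is on), or add explicit instructions to return raw JSON without markdown fencing. Parse defensively: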

function safeParseJSON(text) {
  // Strip markdown code fences if present (trim first so the ^ and $ anchors match)
  var cleaned = text.trim()
    .replace(/^```(?:json)?\s*\n?/i, "")
    .replace(/\n?```\s*$/i, "");
  return JSON.parse(cleaned);
}

2. Circular dependency deadlock

Error: Deadlock: 3 steps pending but none ready

This occurs when steps reference each other in a cycle (A depends on B, B depends on C, C depends on A). The detectCycles validation catches this before execution, but if you skip validation, you will hit a deadlock in the executor. Always validate plans before executing them.

3. Token limit exceeded during replanning

Error: This model's maximum context length is 128000 tokens.
However, your messages resulted in 134521 tokens.

Replanning includes the full execution state, which grows with each completed step. Truncate step outputs in the replan context to stay within limits. A good rule is to cap each step's output to 500 characters in the replan prompt, which is enough for the LLM to understand what happened without blowing the token budget.
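
A minimal sketch of that truncation, assuming each completed step is an object with id, status, and output fields. That shape, along with the names summarizeForReplan, truncateOutput, and MAX_OUTPUT_CHARS, is hypothetical and should be adapted to whatever state your executor actually keeps:

```javascript
// Hypothetical helper: caps each step's output before it goes into the replan prompt.
var MAX_OUTPUT_CHARS = 500;

function truncateOutput(output, limit) {
  // Non-string outputs (objects, arrays) are serialized first so their length is measurable
  var text = typeof output === "string" ? output : JSON.stringify(output);
  if (text === undefined) return "";  // step produced no output
  if (text.length <= limit) return text;
  return text.slice(0, limit) + "... [truncated " + (text.length - limit) + " chars]";
}

function summarizeForReplan(completedSteps) {
  // Keep only the fields the replanner needs, with bounded output size
  return completedSteps.map(function(step) {
    return {
      id: step.id,
      status: step.status,
      output: truncateOutput(step.output, MAX_OUTPUT_CHARS)
    };
  });
}
```

Character counts are only a rough proxy for tokens, so leave some headroom rather than sizing the cap to fit the context window exactly.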

4. Constraint violation mid-execution

Error: Constraint violation: LLM call limit reached (30)

This fires when the agent exhausts its LLM call budget. It is tempting to just increase the limit, but the right fix is to make your plans more efficient: use fewer ReAct iterations, cache intermediate LLM results, and combine multiple small LLM calls into single batch operations. Set realistic budgets and tune them based on observed usage patterns.

5. Rollback fails after partial execution

Error: Rollback failed for step step_3: ENOENT: no such file or directory

Rollback functions must be idempotent — calling them when nothing needs to be rolled back should be a no-op, not an error. Wrap rollback logic in try-catch and check preconditions before acting. The executor already handles this by continuing rollback of remaining steps even if one fails.
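
One way to make a file rollback idempotent, as a sketch: treat a missing file (ENOENT) as already rolled back and surface only other errors. The name rollbackFileWrite is hypothetical, and the params.path shape is assumed to match the file_write tool shown earlier:

```javascript
var fs = require("fs");

// Hypothetical idempotent rollback: deleting a file that is already gone is a no-op.
function rollbackFileWrite(params) {
  return new Promise(function(resolve, reject) {
    fs.unlink(params.path, function(err) {
      // No error, or the file never existed: nothing left to undo
      if (!err || err.code === "ENOENT") return resolve();
      // Anything else (permissions, I/O) is a real rollback failure
      reject(err);
    });
  });
}
```

Unlike a rollback that swallows every error, this version still reports failures such as permission problems, while staying safe to call more than once.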

Best Practices

  • Always validate plans before execution. A five-millisecond validation pass saves minutes of debugging failed executions. Check for circular dependencies, missing tools, and dependencies that reference nonexistent step IDs.

  • Set hard constraints and enforce them. Budget limits, time limits, and LLM call caps are not suggestions — they are circuit breakers. Without them, a runaway agent can burn through your API budget in minutes.

  • Keep planning prompts stable and version them. Small changes to the planning prompt can dramatically change plan structure. Treat your planning prompts like code: version them, test them, and do not change them without validating the output.

  • Truncate context aggressively during replanning. The LLM does not need the full output of every completed step to replan. Pass summaries, not raw data. This keeps token usage predictable and prevents context window overflow.

  • Log reasoning traces in production. When an agent does something unexpected, the reasoning trace is your debugging lifeline. Store traces alongside execution logs and make them searchable. They are also invaluable for improving your planning prompts.

  • Cache plans for structurally similar tasks. Normalizing goals and caching successful plans eliminates redundant LLM calls. Every cache hit skips a plan-generation call, so even a 30-40% hit rate removes roughly that share of planning calls.

  • Limit replan depth. Set a maximum number of replans (typically 2-3) and fail gracefully if the agent cannot recover. Unbounded replanning loops are expensive and rarely converge on a solution.

  • Design tools with rollback in mind. Every tool that creates or modifies state should expose a rollback function. This makes the executor's job dramatically simpler and gives you a safety net for partial failures.

  • Use DAG execution for parallelism. Steps with no dependencies should run in parallel. The DAG-based executor handles this naturally, and the speedup can be substantial for I/O-bound tasks like API calls or file operations.
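
The replan-depth limit from the list above can be sketched as a bounded retry loop. The names runWithReplans, executePlan, and replan are hypothetical. Both callbacks are assumed to return promises and stand in for whatever executor and replanner you use:

```javascript
// Hypothetical sketch: try a plan, replan on failure, give up after maxReplans.
function runWithReplans(plan, executePlan, replan, maxReplans) {
  function attempt(currentPlan, replansUsed) {
    return executePlan(currentPlan).catch(function(err) {
      if (replansUsed >= maxReplans) {
        // Fail gracefully with context instead of looping forever
        var finalErr = new Error(
          "Gave up after " + replansUsed + " replans: " + err.message
        );
        finalErr.cause = err;
        return Promise.reject(finalErr);
      }
      // Ask for a revised plan, then retry with one more replan counted
      return replan(currentPlan, err).then(function(newPlan) {
        return attempt(newPlan, replansUsed + 1);
      });
    });
  }
  return attempt(plan, 0);
}
```

With maxReplans set to 2, the plan is executed at most three times: the original attempt plus two replanned attempts.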

References

  • Wei, J. et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS 2022.
  • Yao, S. et al. "Tree of Thoughts: Deliberate Problem Solving with Large Language Models." NeurIPS 2023.
  • Yao, S. et al. "ReAct: Synergizing Reasoning and Acting in Language Models." ICLR 2023.
  • OpenAI API Documentation — Structured Outputs and JSON Mode.
  • LangChain Plan-and-Execute Agent Documentation.
  • Russell, S. and Norvig, P. "Artificial Intelligence: A Modern Approach." Chapter 11: Classical Planning.