Streaming Responses in MCP Servers

Implement streaming responses in MCP servers for real-time progress updates, chunked content delivery, and long-running operations using stdio and SSE transports.

Overview

The Model Context Protocol (MCP) gives AI clients a standardized way to call tools, read resources, and run prompts on external servers. But the moment your tool does something that takes more than a couple of seconds -- analyzing a dataset, generating a report, crawling a site -- the client and the user are left staring at a spinner with no feedback. MCP solves this with progress notifications and streaming transport mechanisms that let your server push incremental updates back to the client while work is still in progress. This article covers both transport layers (stdio and Streamable HTTP), the progress notification protocol, chunked content delivery, and backpressure handling, all with working Node.js code you can run today.

Prerequisites

  • Node.js v18+ installed (v20 recommended)
  • @modelcontextprotocol/sdk v1.12+ (npm install @modelcontextprotocol/sdk)
  • zod v3 (npm install zod) -- used for input schema validation
  • express v4 (npm install express) -- for the Streamable HTTP transport examples
  • Familiarity with JSON-RPC 2.0 message format
  • Basic understanding of MCP concepts (tools, resources, transports)

MCP Transport Options and Streaming Basics

MCP defines two primary transport mechanisms. Your choice of transport dictates how streaming works at the wire level.

stdio Transport

The client spawns your server as a child process. Messages flow over process.stdin and process.stdout as newline-delimited JSON-RPC messages. This is the simplest transport and the one most MCP clients (Claude Desktop, Cursor, Cline) use for local tools.

Streaming over stdio means your server writes multiple JSON-RPC notification messages to stdout before sending the final response. The client reads them as they arrive.
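To make the framing concrete, here is a minimal sketch of how a receiver might split that byte stream into JSON-RPC messages. The SDK's transports do this for you; `createLineParser` is a hypothetical helper shown only to illustrate the newline-delimited framing.

```javascript
// Buffers incoming chunks and emits one parsed JSON-RPC message per
// newline-terminated line. Partial lines stay buffered until completed.
function createLineParser(onMessage) {
  var buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    var newlineIndex;
    while ((newlineIndex = buffer.indexOf("\n")) !== -1) {
      var line = buffer.slice(0, newlineIndex);
      buffer = buffer.slice(newlineIndex + 1);
      if (line.trim()) onMessage(JSON.parse(line));
    }
  };
}
```

Because messages can arrive split across arbitrary chunk boundaries, the parser only emits once it sees the terminating newline.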

Streamable HTTP Transport

Introduced in the MCP spec revision 2025-03-26, Streamable HTTP replaces the older two-endpoint HTTP+SSE transport. The client sends JSON-RPC requests as HTTP POST to a single endpoint (e.g., /mcp). The server can respond with either:

  • A single JSON response (Content-Type: application/json)
  • An SSE stream (Content-Type: text/event-stream) that delivers multiple JSON-RPC messages, including progress notifications, before the final result

This is the transport you use for remote servers, multi-client scenarios, and web-based integrations.

The Progress Notification Protocol

Regardless of transport, MCP streaming relies on the notifications/progress JSON-RPC notification. The flow works like this:

  1. The client includes a progressToken in the request's _meta field
  2. The server sends notifications/progress messages referencing that token
  3. The server sends the final response when the operation completes

Here is what the wire protocol looks like:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "analyze_dataset",
    "arguments": { "file": "sales_2025.csv" },
    "_meta": {
      "progressToken": "req-001"
    }
  }
}

The server sends progress updates:

{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": {
    "progressToken": "req-001",
    "progress": 250,
    "total": 1000,
    "message": "Processed 250 of 1000 rows"
  }
}

The progress value must increase with each notification. The total is optional -- you can omit it when the size is unknown. The message field is a human-readable status string that clients can display directly.
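Because the spec requires progress to increase with each notification, a defensive server can wrap its send function to drop stale updates. The `makeMonotonic` helper below is a hypothetical sketch, not an SDK API:

```javascript
// Drops any update whose progress is not strictly greater than the last
// value sent, so the emitted stream always satisfies the monotonicity rule.
function makeMonotonic(send) {
  var last = -Infinity;
  return function (params) {
    if (params.progress <= last) return false; // stale or duplicate update
    last = params.progress;
    send(params);
    return true;
  };
}
```

This matters when progress comes from concurrent workers that may report out of order.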


Implementing Streaming Over stdio Transport

The stdio transport is the default for local MCP servers. Here is a complete server that registers a long-running tool and sends progress notifications:

var { McpServer } = require("@modelcontextprotocol/sdk/server/mcp.js");
var { StdioServerTransport } = require("@modelcontextprotocol/sdk/server/stdio.js");
var { z } = require("zod");

var server = new McpServer({
  name: "data-analyzer",
  version: "1.0.0"
});

server.tool(
  "analyze_dataset",
  "Analyze a CSV dataset and return summary statistics",
  {
    rowCount: z.number().describe("Number of rows to process"),
    chunkSize: z.number().optional().default(100).describe("Rows per batch")
  },
  function(args, extra) {
    return new Promise(function(resolve) {
      var processed = 0;
      var total = args.rowCount;
      var chunkSize = args.chunkSize;
      var results = [];

      function processChunk() {
        var end = Math.min(processed + chunkSize, total);

        // Simulate processing work
        for (var i = processed; i < end; i++) {
          results.push({
            row: i,
            value: Math.random() * 1000
          });
        }

        processed = end;

        // Send a progress notification, guarded on the client's token
        var progressToken = extra._meta && extra._meta.progressToken;
        if (progressToken !== undefined) {
          extra.sendNotification({
            method: "notifications/progress",
            params: {
              progressToken: progressToken,
              progress: processed,
              total: total,
              message: "Processed " + processed + " of " + total + " rows"
            }
          });
        }

        if (processed < total) {
          setTimeout(processChunk, 50);
        } else {
          // Calculate summary
          var sum = 0;
          for (var j = 0; j < results.length; j++) {
            sum += results[j].value;
          }
          var avg = sum / results.length;

          resolve({
            content: [{
              type: "text",
              text: JSON.stringify({
                totalRows: total,
                average: avg.toFixed(2),
                min: Math.min.apply(null, results.map(function(r) { return r.value; })).toFixed(2),
                max: Math.max.apply(null, results.map(function(r) { return r.value; })).toFixed(2)
              }, null, 2)
            }]
          });
        }
      }

      processChunk();
    });
  }
);

var transport = new StdioServerTransport();
server.connect(transport).then(function() {
  // Server is running, listening on stdin/stdout
});

When you run this server and a client calls analyze_dataset with rowCount: 1000 and chunkSize: 100, the server writes 10 progress notifications to stdout followed by the final result. Each notification is a separate newline-delimited JSON-RPC message.

The key API detail: the second argument to the tool handler (extra) carries the request metadata (extra._meta, including the client's progressToken) and a sendNotification method for emitting messages tied to the current request. Read the token from the metadata, include it in every notifications/progress message, and skip the send entirely when the client did not supply one.
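That per-call boilerplate is easy to factor out. The sketch below assumes only the documented shape of the handler's extra argument (_meta plus sendNotification); makeProgressSender itself is a hypothetical helper, not an SDK API:

```javascript
// Wraps the token lookup and the notifications/progress envelope so a
// tool handler can report progress with a single call.
function makeProgressSender(extra) {
  var token = extra._meta && extra._meta.progressToken;
  return function (progress, total, message) {
    if (token === undefined) return false; // client did not ask for progress
    extra.sendNotification({
      method: "notifications/progress",
      params: { progressToken: token, progress: progress, total: total, message: message }
    });
    return true;
  };
}
```

Inside a handler you would then write var sendProgress = makeProgressSender(extra); and call sendProgress(250, 1000, "Processed 250 rows") from the processing loop.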


Implementing Streaming Over Streamable HTTP Transport

For remote servers, Streamable HTTP is the modern standard. When the server responds with Content-Type: text/event-stream, the client receives progress notifications as SSE events in real time.

var { McpServer } = require("@modelcontextprotocol/sdk/server/mcp.js");
var { StreamableHTTPServerTransport } = require("@modelcontextprotocol/sdk/server/streamableHttp.js");
var express = require("express");
var crypto = require("crypto");
var { z } = require("zod");

var app = express();
app.use(express.json());

var server = new McpServer({
  name: "remote-analyzer",
  version: "1.0.0"
});

// Register the same analysis tool
server.tool(
  "analyze_dataset",
  "Analyze a dataset with streaming progress",
  {
    rowCount: z.number(),
    chunkSize: z.number().optional().default(100)
  },
  function(args, extra) {
    return new Promise(function(resolve) {
      var processed = 0;
      var total = args.rowCount;

      function processNext() {
        processed = Math.min(processed + args.chunkSize, total);

        var progressToken = extra._meta && extra._meta.progressToken;
        if (progressToken !== undefined) {
          extra.sendNotification({
            method: "notifications/progress",
            params: {
              progressToken: progressToken,
              progress: processed,
              total: total,
              message: "Analyzing row " + processed + "/" + total
            }
          });
        }

        if (processed < total) {
          setTimeout(processNext, 100);
        } else {
          resolve({
            content: [{
              type: "text",
              text: "Analysis complete. Processed " + total + " rows."
            }]
          });
        }
      }

      processNext();
    });
  }
);

// Session management
var sessions = {};

app.post("/mcp", function(req, res) {
  var sessionId = req.headers["mcp-session-id"];

  if (sessionId && sessions[sessionId]) {
    // Existing session -- route the request to its transport
    sessions[sessionId].handleRequest(req, res, req.body);
  } else if (!sessionId) {
    // New session -- the transport generates the ID and returns it to
    // the client in the Mcp-Session-Id response header
    var transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: function() { return crypto.randomUUID(); },
      onsessioninitialized: function(newSessionId) {
        sessions[newSessionId] = transport;
      }
    });

    server.connect(transport).then(function() {
      transport.handleRequest(req, res, req.body);
    });
  } else {
    res.status(400).json({ error: "Unknown session" });
  }
});

app.get("/mcp", function(req, res) {
  // GET opens the standalone SSE stream for server-initiated messages
  var sessionId = req.headers["mcp-session-id"];
  if (sessionId && sessions[sessionId]) {
    sessions[sessionId].handleRequest(req, res);
  } else {
    res.status(400).json({ error: "Session required" });
  }
});

app.delete("/mcp", function(req, res) {
  var sessionId = req.headers["mcp-session-id"];
  if (sessionId && sessions[sessionId]) {
    sessions[sessionId].close();
    delete sessions[sessionId];
    res.status(200).json({ message: "Session terminated" });
  } else {
    res.status(404).json({ error: "Session not found" });
  }
});

var PORT = process.env.PORT || 3001;
app.listen(PORT, function() {
  console.log("MCP Streamable HTTP server on port " + PORT);
});

When a client sends a tools/call request with a progressToken, the transport automatically upgrades the response to an SSE stream. The client sees events like:

event: message
data: {"jsonrpc":"2.0","method":"notifications/progress","params":{"progressToken":"req-001","progress":100,"total":1000,"message":"Analyzing row 100/1000"}}

event: message
data: {"jsonrpc":"2.0","method":"notifications/progress","params":{"progressToken":"req-001","progress":200,"total":1000,"message":"Analyzing row 200/1000"}}

event: message
data: {"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"Analysis complete. Processed 1000 rows."}]}}

The final event contains the JSON-RPC result with the matching id from the original request.
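You rarely parse this by hand -- the SDK client does it -- but a minimal sketch of splitting an SSE payload into its JSON-RPC messages makes the wire format concrete. This simplified parser assumes each event carries its payload in data: lines, as in the frames above:

```javascript
// Splits a raw text/event-stream body into parsed JSON-RPC messages.
// Events are separated by a blank line; "data:" lines hold the payload.
function parseSseEvents(raw) {
  return raw.split("\n\n")
    .map(function (block) {
      return block.split("\n")
        .filter(function (line) { return line.indexOf("data:") === 0; })
        .map(function (line) { return line.slice(5).trim(); })
        .join("\n");
    })
    .filter(function (data) { return data.length > 0; })
    .map(function (data) { return JSON.parse(data); });
}
```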


Progress Notifications and Partial Results

Progress notifications are not limited to simple counters. You can use the message field to send rich status updates, and you can structure your tool to return partial results at each stage.

server.tool(
  "multi_stage_pipeline",
  "Run a multi-stage data processing pipeline",
  {
    input: z.string().describe("Input data identifier")
  },
  function(args, extra) {
    return new Promise(function(resolve, reject) {
      var stages = [
        { name: "Fetching data", weight: 10 },
        { name: "Validating schema", weight: 5 },
        { name: "Transforming records", weight: 40 },
        { name: "Running aggregations", weight: 30 },
        { name: "Generating report", weight: 15 }
      ];

      var totalWeight = 0;
      for (var i = 0; i < stages.length; i++) {
        totalWeight += stages[i].weight;
      }

      var completedWeight = 0;
      var currentStage = 0;
      var partialResults = [];

      function runStage() {
        if (currentStage >= stages.length) {
          resolve({
            content: [{
              type: "text",
              text: JSON.stringify({
                status: "complete",
                stages: partialResults,
                totalTime: Date.now() - startTime + "ms"
              }, null, 2)
            }]
          });
          return;
        }

        var stage = stages[currentStage];

        var progressToken = extra._meta && extra._meta.progressToken;
        if (progressToken !== undefined) {
          extra.sendNotification({
            method: "notifications/progress",
            params: {
              progressToken: progressToken,
              progress: completedWeight,
              total: totalWeight,
              message: "Stage " + (currentStage + 1) + "/" + stages.length + ": " + stage.name
            }
          });
        }

        // Simulate stage work
        var stageStart = Date.now();
        setTimeout(function() {
          completedWeight += stage.weight;
          partialResults.push({
            stage: stage.name,
            duration: Date.now() - stageStart + "ms",
            status: "ok"
          });
          currentStage++;
          runStage();
        }, stage.weight * 50);
      }

      var startTime = Date.now();
      runStage();
    });
  }
);

The output the client sees is a structured pipeline report:

{
  "status": "complete",
  "stages": [
    { "stage": "Fetching data", "duration": "502ms", "status": "ok" },
    { "stage": "Validating schema", "duration": "251ms", "status": "ok" },
    { "stage": "Transforming records", "duration": "2004ms", "status": "ok" },
    { "stage": "Running aggregations", "duration": "1501ms", "status": "ok" },
    { "stage": "Generating report", "duration": "753ms", "status": "ok" }
  ],
  "totalTime": "5011ms"
}

Streaming Tool Responses for Long-Running Operations

Some operations genuinely take minutes: database migrations, large file processing, external API crawls. For these, you need to think carefully about timeout management. Most MCP clients enforce a request timeout (often 60 seconds). Progress notifications can help keep the connection alive, but not all clients reset their timeout when they receive progress.

The defensive pattern is to send frequent progress updates and design your tool to operate within reasonable time bounds:

server.tool(
  "crawl_sitemap",
  "Crawl a website sitemap and extract page metadata",
  {
    url: z.string().url(),
    maxPages: z.number().optional().default(50)
  },
  function(args, extra) {
    var http = require("http");
    var https = require("https");

    return new Promise(function(resolve, reject) {
      var pages = [];
      var queue = [args.url];
      var visited = {};
      var maxPages = args.maxPages;
      var errors = [];

      function crawlNext() {
        if (queue.length === 0 || pages.length >= maxPages) {
          resolve({
            content: [{
              type: "text",
              text: JSON.stringify({
                pagesFound: pages.length,
                errors: errors.length,
                pages: pages.slice(0, 20),
                truncated: pages.length > 20
              }, null, 2)
            }]
          });
          return;
        }

        var pageUrl = queue.shift();
        if (visited[pageUrl]) {
          crawlNext();
          return;
        }
        visited[pageUrl] = true;

        // Progress update on every page, guarded on the client's token
        var progressToken = extra._meta && extra._meta.progressToken;
        if (progressToken !== undefined) {
          extra.sendNotification({
            method: "notifications/progress",
            params: {
              progressToken: progressToken,
              progress: pages.length,
              total: maxPages,
              message: "Crawling page " + (pages.length + 1) + ": " + pageUrl.substring(0, 60)
            }
          });
        }

        var client = pageUrl.startsWith("https") ? https : http;

        var request = client.get(pageUrl, function(response) {
          var body = "";

          response.on("data", function(chunk) {
            body += chunk;
          });

          response.on("end", function() {
            var titleMatch = body.match(/<title>(.*?)<\/title>/i);
            pages.push({
              url: pageUrl,
              title: titleMatch ? titleMatch[1] : "No title",
              status: response.statusCode,
              size: body.length
            });

            // Extract links for further crawling (simplified)
            var linkRegex = /href="(https?:\/\/[^"]+)"/gi;
            var match;
            while ((match = linkRegex.exec(body)) !== null) {
              if (!visited[match[1]] && queue.length < maxPages * 2) {
                queue.push(match[1]);
              }
            }

            crawlNext();
          });
        });

        request.on("error", function(err) {
          errors.push({ url: pageUrl, error: err.message });
          crawlNext();
        });

        request.setTimeout(5000, function() {
          request.destroy();
          errors.push({ url: pageUrl, error: "Timeout after 5s" });
          crawlNext();
        });
      }

      crawlNext();
    });
  }
);

The key insight here is that the progress notification fires on every page. Even if a single page takes 5 seconds to load, the client knows the server is still alive and working.


Chunked Content Delivery for Large Resources

MCP resources (the resources/read method) can also benefit from streaming. When a resource is large -- a log file, a database export, a generated report -- you do not want to buffer the entire thing in memory before responding.

The protocol does not have built-in chunking for resources the way it does for tool progress, but you can implement it at the application level using multiple content items:

var { McpServer, ResourceTemplate } = require("@modelcontextprotocol/sdk/server/mcp.js");
var fs = require("fs");
var path = require("path");

var server = new McpServer({
  name: "log-reader",
  version: "1.0.0"
});

server.resource(
  "logs",
  new ResourceTemplate("logs://{filename}", { list: undefined }),
  {
    description: "Read application log files in chunks",
    mimeType: "text/plain"
  },
  function(uri, variables, extra) {
    return new Promise(function(resolve, reject) {
      var filename = variables.filename;
      var logPath = path.join("/var/log/app", filename);

      fs.stat(logPath, function(err, stats) {
        if (err) {
          reject(new Error("Log file not found: " + filename));
          return;
        }

        var fileSizeKB = (stats.size / 1024).toFixed(1);
        var CHUNK_SIZE = 64 * 1024; // 64KB chunks
        var totalChunks = Math.ceil(stats.size / CHUNK_SIZE);
        var chunks = [];
        var chunkIndex = 0;
        var progressToken = extra._meta && extra._meta.progressToken;

        var stream = fs.createReadStream(logPath, {
          highWaterMark: CHUNK_SIZE,
          encoding: "utf8"
        });

        stream.on("data", function(chunk) {
          chunkIndex++;
          chunks.push(chunk);

          // Send progress if the client supplied a token
          if (progressToken !== undefined) {
            extra.sendNotification({
              method: "notifications/progress",
              params: {
                progressToken: progressToken,
                progress: chunkIndex,
                total: totalChunks,
                message: "Reading chunk " + chunkIndex + "/" + totalChunks +
                         " (" + fileSizeKB + " KB total)"
              }
            });
          }
        });
        });

        stream.on("end", function() {
          resolve({
            contents: [{
              uri: uri.href,
              mimeType: "text/plain",
              text: chunks.join("")
            }]
          });
        });

        stream.on("error", function(readErr) {
          reject(new Error("Failed to read log: " + readErr.message));
        });
      });
    });
  }
);

For truly massive resources (hundreds of megabytes), a better pattern is to return a summary with a resource_link content item that points to the full data, letting the client decide whether to fetch the complete payload:

resolve({
  content: [{
    type: "text",
    text: "Log file: " + filename + " (" + fileSizeKB + " KB, " +
          lineCount + " lines). Showing last 100 lines."
  }, {
    type: "text",
    text: lastHundredLines
  }, {
    type: "resource_link",
    uri: "logs://" + filename,
    name: filename,
    mimeType: "text/plain",
    description: "Full log file (" + fileSizeKB + " KB)"
  }]
});

Backpressure Handling and Flow Control

When your server produces progress notifications faster than the transport can deliver them, you get backpressure. Over stdio, this manifests as a growing write buffer on stdout. Over HTTP/SSE, it means the TCP send buffer fills up.
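At the Node.js level you can observe this directly: stream.write returns false once the internal buffer passes its highWaterMark, and writing should pause until the drain event fires. A minimal sketch (the writeMessage helper is illustrative, not part of the SDK, which manages its own write queue):

```javascript
// Writes one newline-delimited JSON-RPC message and calls done() only
// when it is safe to write the next one: immediately if the buffer has
// room, otherwise after the stream emits "drain".
function writeMessage(stream, message, done) {
  var ok = stream.write(JSON.stringify(message) + "\n");
  if (ok) {
    done();
  } else {
    stream.once("drain", done); // buffer full -- wait for it to empty
  }
}
```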

The MCP spec says receivers should implement rate limiting to prevent flooding. Here is a practical approach:

function createThrottledProgress(extra, minIntervalMs) {
  var token = extra._meta && extra._meta.progressToken;
  var lastSent = 0;
  var pending = null;
  var timer = null;

  function emit(progressData) {
    if (token === undefined) return; // client did not request progress
    extra.sendNotification({
      method: "notifications/progress",
      params: {
        progressToken: token,
        progress: progressData.progress,
        total: progressData.total,
        message: progressData.message
      }
    });
  }

  function send(progressData) {
    var now = Date.now();
    var elapsed = now - lastSent;

    if (elapsed >= minIntervalMs) {
      // Enough time has passed -- send immediately
      emit(progressData);
      lastSent = now;
      pending = null;
    } else {
      // Too soon -- queue it, replacing any earlier queued update
      pending = progressData;
      if (!timer) {
        timer = setTimeout(function() {
          timer = null;
          if (pending) {
            emit(pending);
            lastSent = Date.now();
            pending = null;
          }
        }, minIntervalMs - elapsed);
      }
    }
  }

  function flush() {
    if (timer) {
      clearTimeout(timer);
      timer = null;
    }
    if (pending) {
      emit(pending);
      pending = null;
    }
  }

  return { send: send, flush: flush };
}

// Usage in a tool handler
server.tool(
  "process_large_file",
  "Process a large file with throttled progress",
  {
    size: z.number()
  },
  function(args, extra) {
    return new Promise(function(resolve) {
      var throttle = createThrottledProgress(extra, 200); // Max 5 updates/second
      var processed = 0;
      var total = args.size;

      function processChunk() {
        processed += 1000;
        if (processed > total) processed = total;

        throttle.send({
          progress: processed,
          total: total,
          message: "Processed " + processed + " of " + total + " bytes"
        });

        if (processed < total) {
          setImmediate(processChunk);
        } else {
          throttle.flush(); // Send any pending progress
          resolve({
            content: [{
              type: "text",
              text: "Processed " + total + " bytes"
            }]
          });
        }
      }

      processChunk();
    });
  }
);

The createThrottledProgress wrapper ensures you never send more than 5 progress updates per second (configurable), regardless of how fast your processing loop runs. Without throttling, a tight loop processing millions of rows could flood stdout with thousands of notifications per second, drowning out the actual work.


Client-Side Consumption of Streaming Responses

To test your streaming server, you need a client. Here is how to consume progress notifications using the MCP client SDK:

stdio Client

var { Client } = require("@modelcontextprotocol/sdk/client/index.js");
var { StdioClientTransport } = require("@modelcontextprotocol/sdk/client/stdio.js");

var client = new Client({
  name: "test-client",
  version: "1.0.0"
});

var transport = new StdioClientTransport({
  command: "node",
  args: ["./server.js"]
});

client.connect(transport).then(function() {
  // Passing onprogress in the request options makes the SDK generate a
  // progressToken, attach it to the request's _meta, and route matching
  // notifications/progress messages to this callback
  return client.callTool({
    name: "analyze_dataset",
    arguments: { rowCount: 500, chunkSize: 50 }
  }, undefined, {
    onprogress: function(p) {
      var pct = p.total ? Math.round((p.progress / p.total) * 100) : "?";
      console.log("[Progress " + pct + "%] " + (p.message || ""));
    }
  });
}).then(function(result) {
  console.log("\nFinal result:");
  console.log(result.content[0].text);
  return client.close();
}).catch(function(err) {
  console.error("Error:", err.message);
  process.exit(1);
});

Running this produces:

[Progress 10%] Processed 50 of 500 rows
[Progress 20%] Processed 100 of 500 rows
[Progress 30%] Processed 150 of 500 rows
[Progress 40%] Processed 200 of 500 rows
[Progress 50%] Processed 250 of 500 rows
[Progress 60%] Processed 300 of 500 rows
[Progress 70%] Processed 350 of 500 rows
[Progress 80%] Processed 400 of 500 rows
[Progress 90%] Processed 450 of 500 rows
[Progress 100%] Processed 500 of 500 rows

Final result:
{
  "totalRows": 500,
  "average": "502.31",
  "min": "0.87",
  "max": "998.44"
}

Streamable HTTP Client

var { Client } = require("@modelcontextprotocol/sdk/client/index.js");
var { StreamableHTTPClientTransport } = require("@modelcontextprotocol/sdk/client/streamableHttp.js");

var client = new Client({
  name: "http-test-client",
  version: "1.0.0"
});

var transport = new StreamableHTTPClientTransport(
  new URL("http://localhost:3001/mcp")
);

client.connect(transport).then(function() {
  return client.callTool({
    name: "analyze_dataset",
    arguments: { rowCount: 1000, chunkSize: 100 }
  }, undefined, {
    onprogress: function(p) {
      var bar = "";
      if (p.total) {
        var pct = Math.round((p.progress / p.total) * 100);
        var filled = Math.round(pct / 5);
        bar = "[" + "#".repeat(filled) + "-".repeat(20 - filled) + "] " + pct + "% ";
      }
      process.stdout.write("\r" + bar + (p.message || ""));
    }
  });
}).then(function(result) {
  console.log("\n\nResult:", result.content[0].text);
  return client.close();
}).catch(function(err) {
  console.error("Error:", err);
});

This client renders a progress bar in the terminal:

[####################] 100% Analyzing row 1000/1000

Result: Analysis complete. Processed 1000 rows.

Complete Working Example

Here is a self-contained MCP server that performs a multi-step "data analysis" with both transport options, plus a client script to test it. Create three files:

server-stdio.js -- stdio transport server:

var { McpServer } = require("@modelcontextprotocol/sdk/server/mcp.js");
var { StdioServerTransport } = require("@modelcontextprotocol/sdk/server/stdio.js");
var { z } = require("zod");

var server = new McpServer({
  name: "analysis-server",
  version: "1.0.0"
});

function simulateAnalysis(data, chunkSize, onProgress) {
  return new Promise(function(resolve) {
    var rows = [];
    for (var i = 0; i < data.rows; i++) {
      rows.push({
        id: i,
        value: Math.random() * data.range,
        category: ["A", "B", "C"][i % 3]
      });
    }

    var processed = 0;
    var categories = {};

    function processChunk() {
      var end = Math.min(processed + chunkSize, rows.length);

      for (var j = processed; j < end; j++) {
        var row = rows[j];
        if (!categories[row.category]) {
          categories[row.category] = { count: 0, sum: 0, min: Infinity, max: -Infinity };
        }
        var cat = categories[row.category];
        cat.count++;
        cat.sum += row.value;
        cat.min = Math.min(cat.min, row.value);
        cat.max = Math.max(cat.max, row.value);
      }

      processed = end;
      onProgress(processed, rows.length, "Crunching numbers: " + processed + "/" + rows.length);

      if (processed < rows.length) {
        setTimeout(processChunk, 30);
      } else {
        // Build summary
        var summary = {};
        var keys = Object.keys(categories);
        for (var k = 0; k < keys.length; k++) {
          var key = keys[k];
          var c = categories[key];
          summary[key] = {
            count: c.count,
            average: (c.sum / c.count).toFixed(2),
            min: c.min.toFixed(2),
            max: c.max.toFixed(2)
          };
        }
        resolve(summary);
      }
    }

    processChunk();
  });
}

server.tool(
  "run_analysis",
  "Run statistical analysis on generated dataset",
  {
    rows: z.number().min(10).max(100000).describe("Number of data rows"),
    range: z.number().optional().default(1000).describe("Value range"),
    chunkSize: z.number().optional().default(200).describe("Processing batch size")
  },
  function(args, extra) {
    var startTime = Date.now();

    return simulateAnalysis(
      { rows: args.rows, range: args.range },
      args.chunkSize,
      function(progress, total, message) {
        var progressToken = extra._meta && extra._meta.progressToken;
        if (progressToken !== undefined) {
          extra.sendNotification({
            method: "notifications/progress",
            params: {
              progressToken: progressToken,
              progress: progress,
              total: total,
              message: message
            }
          });
        }
      }
    ).then(function(summary) {
      var elapsed = Date.now() - startTime;
      return {
        content: [{
          type: "text",
          text: JSON.stringify({
            analysis: summary,
            metadata: {
              rowsProcessed: args.rows,
              processingTime: elapsed + "ms",
              throughput: Math.round(args.rows / (elapsed / 1000)) + " rows/sec"
            }
          }, null, 2)
        }]
      };
    });
  }
);

var transport = new StdioServerTransport();
server.connect(transport);

server-http.js -- Streamable HTTP transport server:

var { McpServer } = require("@modelcontextprotocol/sdk/server/mcp.js");
var { StreamableHTTPServerTransport } = require("@modelcontextprotocol/sdk/server/streamableHttp.js");
var express = require("express");
var crypto = require("crypto");
var { z } = require("zod");

var app = express();
app.use(express.json());

var server = new McpServer({
  name: "analysis-server-http",
  version: "1.0.0"
});

// Register the same tool from server-stdio.js
// (copy the tool registration and simulateAnalysis function here)

var transports = {};

app.post("/mcp", function(req, res) {
  var sessionId = req.headers["mcp-session-id"];

  if (sessionId && transports[sessionId]) {
    transports[sessionId].handleRequest(req, res, req.body);
  } else if (!sessionId) {
    var transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: function() { return crypto.randomUUID(); },
      onsessioninitialized: function(newSessionId) {
        transports[newSessionId] = transport;
      }
    });

    server.connect(transport).then(function() {
      transport.handleRequest(req, res, req.body);
    });
  } else {
    res.status(400).json({ error: "Unknown session" });
  }
});

app.get("/mcp", function(req, res) {
  var sessionId = req.headers["mcp-session-id"];
  if (sessionId && transports[sessionId]) {
    transports[sessionId].handleRequest(req, res);
  } else {
    res.status(400).json({ error: "Session required" });
  }
});

app.delete("/mcp", function(req, res) {
  var sessionId = req.headers["mcp-session-id"];
  if (sessionId && transports[sessionId]) {
    transports[sessionId].close();
    delete transports[sessionId];
    res.status(200).end();
  } else {
    res.status(404).end();
  }
});

app.listen(3001, function() {
  console.log("Streamable HTTP MCP server running on http://localhost:3001/mcp");
});

client.js -- client that connects and runs the analysis:

var { Client } = require("@modelcontextprotocol/sdk/client/index.js");
var { StdioClientTransport } = require("@modelcontextprotocol/sdk/client/stdio.js");

var client = new Client({ name: "analysis-client", version: "1.0.0" });

var transport = new StdioClientTransport({
  command: "node",
  args: ["./server-stdio.js"]
});

var startTime = Date.now();

client.connect(transport).then(function() {
  console.log("Connected. Starting analysis of 10,000 rows...\n");
  startTime = Date.now();

  // onprogress makes the SDK attach a progressToken and deliver
  // matching notifications/progress messages to this callback
  return client.callTool({
    name: "run_analysis",
    arguments: { rows: 10000, chunkSize: 500 }
  }, undefined, {
    onprogress: function(p) {
      if (p.total) {
        var pct = Math.round((p.progress / p.total) * 100);
        var elapsed = ((Date.now() - startTime) / 1000).toFixed(1);
        console.log("[" + elapsed + "s] " + pct + "% - " + p.message);
      }
    }
  });
}).then(function(result) {
  var elapsed = ((Date.now() - startTime) / 1000).toFixed(2);
  console.log("\nCompleted in " + elapsed + "s");
  console.log(result.content[0].text);
  return client.close();
}).catch(function(err) {
  console.error("Failed:", err.message);
  process.exit(1);
});

Expected output:

Connected. Starting analysis of 10,000 rows...

[0.1s] 5% - Crunching numbers: 500/10000
[0.2s] 10% - Crunching numbers: 1000/10000
[0.3s] 15% - Crunching numbers: 1500/10000
[0.4s] 20% - Crunching numbers: 2000/10000
[0.5s] 25% - Crunching numbers: 2500/10000
[0.6s] 30% - Crunching numbers: 3000/10000
[0.7s] 35% - Crunching numbers: 3500/10000
[0.8s] 40% - Crunching numbers: 4000/10000
[0.8s] 45% - Crunching numbers: 4500/10000
[0.9s] 50% - Crunching numbers: 5000/10000
[1.0s] 55% - Crunching numbers: 5500/10000
[1.1s] 60% - Crunching numbers: 6000/10000
[1.2s] 65% - Crunching numbers: 6500/10000
[1.3s] 70% - Crunching numbers: 7000/10000
[1.4s] 75% - Crunching numbers: 7500/10000
[1.5s] 80% - Crunching numbers: 8000/10000
[1.5s] 85% - Crunching numbers: 8500/10000
[1.6s] 90% - Crunching numbers: 9000/10000
[1.7s] 95% - Crunching numbers: 9500/10000
[1.8s] 100% - Crunching numbers: 10000/10000

Completed in 1.82s
{
  "analysis": {
    "A": { "count": 3334, "average": "498.72", "min": "0.14", "max": "999.93" },
    "B": { "count": 3333, "average": "501.88", "min": "0.31", "max": "999.87" },
    "C": { "count": 3333, "average": "499.45", "min": "0.08", "max": "999.99" }
  },
  "metadata": {
    "rowsProcessed": 10000,
    "processingTime": "1820ms",
    "throughput": "5494 rows/sec"
  }
}

Common Issues and Troubleshooting

1. Progress Notifications Not Appearing

Error: No progress updates on the client, but the tool completes successfully.

Cause: The client did not include a progressToken in the request metadata. Without a token, the SDK silently discards sendProgress calls.

// Wrong - no progress token
client.callTool({ name: "my_tool", arguments: {} });

// Correct - include progressToken in the request params' _meta
client.callTool({
  name: "my_tool",
  arguments: {},
  _meta: { progressToken: "token-123" }
});

Fix: Always pass _meta.progressToken when calling tools that support progress. If you are building a client library, generate tokens automatically.
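The TypeScript SDK can also manage the token for you: its per-request options accept an `onprogress` callback, and when one is present the SDK attaches a progress token and routes matching notifications to it. A minimal sketch of the callback shape -- the simulated calls at the end stand in for real notifications arriving from a server:

```javascript
// Per-request options object accepted as the third argument to
// client.callTool / client.request. With `onprogress` set, the SDK
// generates the progressToken automatically.
var received = [];
var options = {
  onprogress: function (p) {
    // p mirrors the notification params: { progress, total?, message? }
    received.push(p.progress + "/" + (p.total || "?"));
  }
};

// Real call (requires a connected client):
// client.callTool({ name: "my_tool", arguments: {} }, undefined, options);

// Simulated notifications, to show what the callback sees:
options.onprogress({ progress: 50, total: 100, message: "halfway" });
options.onprogress({ progress: 100, total: 100, message: "done" });
console.log(received); // ["50/100", "100/100"]
```

This is convenient when you do not care about the token value itself and only want the stream of updates.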

2. Request Timeout on Long-Running Tools

Error:

McpError: Request timed out (60000ms) for method tools/call
  code: -32001

Cause: The MCP client enforces a default timeout (typically 60 seconds). Not all clients reset the timeout when they receive progress notifications.

Fix: Design tools to complete within the timeout window, or break them into smaller operations. If you control the client, raise the timeout per request via the request options (the third argument to callTool):

client.callTool({ name: "my_tool", arguments: {} }, undefined, {
  timeout: 300000,              // 5 minutes for this request
  resetTimeoutOnProgress: true  // restart the clock on each progress notification
});

For clients you do not control (Claude Desktop, Cursor), return an immediate acknowledgment with a task ID and provide a separate check_status tool for polling.
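That acknowledge-then-poll pattern can be sketched as below. The tool names (`run_analysis` returning a task ID, a companion `check_status`) and the in-memory task table are illustrative choices, not part of the MCP spec:

```javascript
// Illustrative acknowledge-then-poll pattern: the long-running tool returns
// a task ID immediately, and a separate tool reports on it when polled.
var tasks = {}; // taskId -> { status, progress, result }
var nextId = 1;

function startAnalysisTool(args) {
  var taskId = "task-" + (nextId++);
  tasks[taskId] = { status: "running", progress: 0, result: null };
  // Kick off the work without awaiting it, so the response returns at once.
  setImmediate(function () {
    // ...real processing would update tasks[taskId].progress as it goes...
    tasks[taskId].progress = 100;
    tasks[taskId].status = "done";
    tasks[taskId].result = { rowsProcessed: args.rows };
  });
  return { content: [{ type: "text", text: JSON.stringify({ taskId: taskId }) }] };
}

function checkStatusTool(args) {
  var task = tasks[args.taskId];
  if (!task) {
    return { content: [{ type: "text", text: "Unknown task" }], isError: true };
  }
  return { content: [{ type: "text", text: JSON.stringify(task) }] };
}
```

In a real server you would also expire finished entries from `tasks`, for the same memory-growth reasons covered in issue 5 below.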

3. Broken SSE Stream on Proxy or Load Balancer

Error: Client receives partial progress, then the connection drops. Server logs show the request completed normally.

Error: Premature close
  at ServerResponse.onclose (node:http:...)

Cause: A reverse proxy (nginx, AWS ALB, Cloudflare) closed the connection due to an idle timeout. SSE connections look idle between events.

Fix: Configure your proxy to allow long-lived connections, or send keepalive comments:

// In your SSE handler, send a comment every 15 seconds
var keepalive = setInterval(function() {
  res.write(":keepalive\n\n");
}, 15000);

// Clear on close
res.on("close", function() {
  clearInterval(keepalive);
});

For nginx, add:

proxy_read_timeout 300s;
proxy_buffering off;
proxy_cache off;

4. Progress Values Not Monotonically Increasing

Error:

Warning: Progress value decreased from 150 to 100 for token "abc123"

Cause: The MCP spec requires progress values to increase with each notification. If you are processing items out of order (parallel workers, retries), the reported progress can jump backwards.

Fix: Track the highest value sent and never go below it:

var highWaterMarks = {}; // highest value sent, tracked per progress token

function safeProgress(extra, token, current, total, message) {
  if (current > (highWaterMarks[token] || 0)) {
    highWaterMarks[token] = current;
    extra.sendProgress({
      progress: current,
      total: total,
      message: message
    });
  }
}

5. Memory Leak from Accumulating Session State

Error: Server memory grows continuously under load. Node.js process eventually crashes:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

Cause: HTTP transport sessions stored in a map are never cleaned up when clients disconnect without sending DELETE.

Fix: Implement session expiration:

var SESSION_TTL = 30 * 60 * 1000; // 30 minutes

setInterval(function() {
  var now = Date.now();
  var keys = Object.keys(sessions);
  for (var i = 0; i < keys.length; i++) {
    if (now - sessions[keys[i]].lastActivity > SESSION_TTL) {
      sessions[keys[i]].transport.close();
      delete sessions[keys[i]];
      console.log("Expired session:", keys[i]);
    }
  }
}, 60000);
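Expiry only works if lastActivity is refreshed on every request. A minimal helper, assuming the same sessions map (sessionId -> { transport, lastActivity }) used by the cleanup loop above:

```javascript
// sessionId -> { transport, lastActivity }, as in the cleanup loop above
var sessions = {};

// Refresh a session's lastActivity timestamp; returns false for unknown ids.
// Call this at the top of every request handler before routing to the transport.
function touchSession(sessionId) {
  var session = sessions[sessionId];
  if (!session) return false;
  session.lastActivity = Date.now();
  return true;
}

// e.g. in the Express handler, reading the session header:
// if (!touchSession(req.headers["mcp-session-id"])) {
//   return res.status(404).json({ error: "Unknown session" });
// }
```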

Best Practices

  • Always send a final progress notification at 100% before resolving. Clients use the final progress message to update their UI. If you skip it, the progress bar jumps from 90% to "done" with no visual closure.

  • Throttle progress notifications to 3-5 per second maximum. More frequent updates waste bandwidth and CPU on serialization, and updates arriving faster than every 100-200ms add no perceptible smoothness to a progress bar.

  • Include meaningful messages in every progress notification. A bare { progress: 50, total: 100 } tells the client a number. { progress: 50, total: 100, message: "Validating record 50 of 100 against schema" } tells the user what is actually happening.

  • Design tools to be cancellable. The MCP spec supports notifications/cancelled for in-progress requests. Check for cancellation between chunks and abort early when the client signals it no longer needs the result.

  • Use the total field when you know it, omit it when you do not. A progress bar with an unknown total is better than a progress bar that lies. Clients should render an indeterminate spinner when total is absent.

  • Never send progress after the tool response has been returned. Once you resolve the handler promise, the request is complete. Any subsequent sendProgress calls for that token are protocol violations and will be ignored or cause errors.

  • Test with the MCP Inspector. The official @modelcontextprotocol/inspector package lets you connect to your server and visualize progress notifications in real time. It is invaluable for debugging timing and message ordering issues.

  • Separate compute-heavy work from progress reporting. Use setImmediate or setTimeout(fn, 0) between processing chunks to yield the event loop. This ensures progress notifications are actually flushed to the transport and not queued behind a blocking computation.

  • Handle client disconnects gracefully on the server side. If the client closes the connection mid-operation, stop doing work. Listen for the close event on the response object (HTTP) or the transport's onclose callback (stdio) and abort processing.
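Two of the practices above -- throttling and always emitting the final 100% update -- can be combined in one small helper. Here `send` is a stand-in for whatever your server uses to emit the notification (for example a sendProgress helper):

```javascript
// Returns a reporter that forwards at most one update per intervalMs,
// but always lets the final (progress >= total) update through.
function makeThrottledReporter(send, intervalMs) {
  var lastSent = 0;
  return function (progress, total, message) {
    var now = Date.now();
    var isFinal = total !== undefined && progress >= total;
    if (isFinal || now - lastSent >= intervalMs) {
      lastSent = now;
      send({ progress: progress, total: total, message: message });
    }
  };
}

// Usage: at most ~4 updates/second no matter how often the work reports in.
var sent = [];
var report = makeThrottledReporter(function (p) { sent.push(p.progress); }, 250);
for (var i = 1; i <= 1000; i++) {
  report(i, 1000, "item " + i + "/1000");
}
// In a tight synchronous loop like this, typically only the first and final
// updates get through; real chunked work spread over time produces the
// intermediate ones too.
```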

