
Fixing Broken Streaming in Next.js + AI SDK on Production

fr4nk · Software Engineer, Hugging Face · 5 min read

Your streaming works perfectly in development but batches everything in production? Here's a systematic debugging approach to fix it.

The Problem

When deploying a Next.js application using Vercel AI SDK (createStreamableValue or streamText), you observe different behavior between environments:

| Environment | Behavior |
| --- | --- |
| localhost:3000 | Tokens stream smoothly, real-time updates |
| Production | UI freezes, then dumps all content at once |

In most cases this is not a code bug; it's an infrastructure issue.

Root Cause Analysis

1. Proxy Buffering (Most Common)

When your app sits behind a reverse proxy (Nginx, Traefik, AWS ALB, Cloudflare), the proxy buffers the entire response before forwarding to the client:

```
Client ←── [Buffered] ←── Proxy ←── [Streaming] ←── Next.js
                (waits for the complete response)
```

The proxy collects all chunks, then sends them as a single response. Your streaming becomes a batch.

How to identify: Check your network tab. If you see a single large response instead of incremental chunks, proxy buffering is the culprit.
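You can also confirm buffering from outside the browser by timing chunk arrivals. A minimal sketch in Node 18+ (the endpoint URL and the 50 ms threshold are illustrative assumptions, not from this post):

```typescript
// Sketch: classify a response as buffered vs. streamed by when chunks arrive.
// STREAM_URL and the 50 ms threshold are illustrative values.
const STREAM_URL = "https://your-app.com/api/stream";

// If every chunk lands within one short burst, the proxy (or the app)
// buffered the whole body and flushed it at once.
function looksBuffered(arrivalsMs: number[]): boolean {
  if (arrivalsMs.length < 2) return true; // a single chunk means no streaming
  const span = arrivalsMs[arrivalsMs.length - 1] - arrivalsMs[0];
  return span < 50;
}

// Run manually against your endpoint (fetch is built into Node 18+).
async function probe(): Promise<void> {
  const res = await fetch(STREAM_URL);
  const reader = res.body!.getReader();
  const arrivals: number[] = [];
  for (;;) {
    const { done } = await reader.read();
    if (done) break;
    arrivals.push(Date.now()); // record when each chunk arrived
  }
  console.log(looksBuffered(arrivals) ? "buffered" : "streaming");
}
```

If `probe()` reports `buffered` in production but `streaming` locally, the chunks are being held somewhere between Next.js and the client.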

2. Next.js Built-in Compression

Next.js enables gzip compression by default. Compression algorithms need sufficient data to compress efficiently, so they buffer content before sending.

```ts
// Next.js internal behavior (simplified)
const compressed = gzip(await collectAllChunks(response));
res.send(compressed);
```

3. React Production Batching

React's production build batches state updates more aggressively than development. Code using flushSync may work in dev but cause render blocking in production:

```tsx
// Works in dev, problematic in production
for await (const chunk of stream) {
  flushSync(() => setState(prev => [...prev, chunk]));
}
```

The Fix

Layer 1: Next.js Configuration

Disable compression and add streaming headers:

next.config.ts
```ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // Disable compression - let your proxy handle it
  compress: false,

  async headers() {
    return [
      {
        // Apply to all routes, or scope to specific API paths
        source: '/:path*',
        headers: [
          // Nginx: disable proxy buffering
          { key: 'X-Accel-Buffering', value: 'no' },
          // Prevent caching of streaming responses
          { key: 'Cache-Control', value: 'no-cache, no-transform' },
        ],
      },
    ];
  },
};

export default nextConfig;
```

Why X-Accel-Buffering? Nginx checks this header and disables buffering when set to no. Other proxies may ignore it, requiring additional configuration.
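The headers don't have to be global: a Route Handler can attach them to just the streaming response. A sketch, assuming an `app/api/stream/route.ts` route (the path and the hard-coded tokens are illustrative):

```typescript
// Illustrative Route Handler: anti-buffering headers set per response.
export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      for (const token of ["Hello", " ", "world"]) {
        controller.enqueue(encoder.encode(token));
        await new Promise((r) => setTimeout(r, 50)); // simulate token cadence
      }
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "X-Accel-Buffering": "no",                 // Nginx: don't buffer this
      "Cache-Control": "no-cache, no-transform", // don't cache or re-encode
    },
  });
}
```

Scoping the headers this way keeps `Cache-Control: no-cache` off pages that should remain cacheable.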

Layer 2: React Concurrent Rendering

Replace synchronous updates with React's concurrent features:

```tsx
// ❌ Before: Synchronous forced renders
import { flushSync } from "react-dom";

for await (const event of stream) {
  flushSync(() => {
    setMessages(prev => [...prev, event]);
  });
}

// ✅ After: Concurrent updates with priority
import { startTransition } from "react";

for await (const event of stream) {
  if (event.type === "text-delta") {
    // High priority: user-visible content
    setContent(prev => prev + event.text);
  } else {
    // Low priority: metadata, tool calls
    startTransition(() => {
      setMetadata(prev => [...prev, event]);
    });
  }
}
```

startTransition marks updates as non-urgent, allowing React to batch them without blocking user interactions.

Layer 3: Defer Expensive Renders

For components that process streaming data heavily, use useDeferredValue:

```tsx
import { useDeferredValue, useMemo } from 'react';

interface StreamDisplayProps {
  events: StreamEvent[];
}

function StreamDisplay({ events }: StreamDisplayProps) {
  // Defer the value to prevent blocking during rapid updates
  const deferredEvents = useDeferredValue(events);

  // Expensive computation uses deferred value
  const processed = useMemo(() => {
    return deferredEvents.map(e => parseMarkdown(e.content));
  }, [deferredEvents]);

  // Show loading indicator when deferred value is stale
  const isStale = events !== deferredEvents;

  return (
    <div className={isStale ? 'opacity-80' : ''}>
      {processed.map(item => <MessageBlock key={item.id} {...item} />)}
    </div>
  );
}
```

Layer 4: Infrastructure Configuration

Nginx

```nginx
location /api/ {
    proxy_pass http://upstream;

    # Disable buffering for streaming
    proxy_buffering off;
    proxy_cache off;

    # Required for SSE/streaming
    proxy_http_version 1.1;
    proxy_set_header Connection '';

    # Increase timeouts for long-running streams
    proxy_read_timeout 86400;
    proxy_send_timeout 86400;
}
```

Kubernetes Ingress (nginx-ingress-controller)

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-service
                port:
                  number: 3000
```

Traefik

```yaml
# docker-compose.yml
services:
  app:
    labels:
      - "traefik.http.middlewares.streaming.buffering.maxRequestBodyBytes=0"
      - "traefik.http.middlewares.streaming.buffering.maxResponseBodyBytes=0"
      - "traefik.http.routers.app.middlewares=streaming"
```

AWS Application Load Balancer

ALB doesn't support disabling response buffering. Options:

  1. Use Network Load Balancer (NLB) instead
  2. Deploy behind CloudFront with streaming enabled
  3. Use direct EC2/ECS connection for streaming endpoints

Debugging Checklist

```bash
# 1. Verify streaming at the source (-N disables curl's output buffering)
curl -N https://your-app.com/api/stream

# 2. Check response headers (-D - dumps headers; avoid curl -I here,
#    since some streaming routes reject HEAD requests)
curl -s -D - -o /dev/null https://your-app.com/api/stream | grep -i buffer

# 3. Test without proxy (if possible)
curl -N http://localhost:3000/api/stream
```
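A complementary signal is curl's timing variables: on a buffered response, time-to-first-byte collapses into the total time. `%{time_starttransfer}` and `%{time_total}` are standard curl write-out variables; the URL is a placeholder for your own endpoint.

```shell
# Buffered: first byte arrives only once the body is complete
# (time_starttransfer ≈ time_total). Streaming: first byte is much earlier.
curl -sN -o /dev/null \
  -w 'first byte: %{time_starttransfer}s  total: %{time_total}s\n' \
  https://your-app.com/api/stream
```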

Implementation Checklist

  • Set compress: false in next.config.ts
  • Add X-Accel-Buffering: no header
  • Replace flushSync with startTransition
  • Add useDeferredValue for heavy streaming components
  • Configure proxy_buffering off on Nginx/Ingress
  • Verify CDN doesn't cache streaming endpoints
  • Test with curl -N to confirm chunked transfer

Summary

Streaming failures on production are almost always infrastructure issues, not application bugs. The fix requires changes at multiple layers:

| Layer | Fix | Impact |
| --- | --- | --- |
| Next.js | compress: false, headers | Prevents app-level buffering |
| React | startTransition, useDeferredValue | Smooth concurrent updates |
| Proxy | proxy_buffering off | Enables true streaming |
| CDN | Disable caching for stream routes | Prevents edge buffering |

Start with the Next.js config and proxy settings; these solve roughly 90% of cases.