Server-Sent Events (SSE) Protocol

Server-Sent Events (SSE) is a standard HTTP-based protocol for server-to-client streaming. It provides:

  • Automatic reconnection - Browser handles connection drops
  • Event-driven - Native browser EventSource API
  • Simple protocol - Text-based, easy to debug
  • Wide support - Works in all modern browsers
  • Efficient - Single long-lived HTTP connection

This document describes how TanStack AI transmits StreamChunks over Server-Sent Events (SSE), the recommended protocol for most use cases.

Protocol Specification

HTTP Request

Method: POST

Headers:

http
Content-Type: application/json

Body:

json
{
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "data": {
    // Optional additional data
  }
}
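For reference, here is the same request shape expressed as TypeScript types. This is an illustrative local definition based on the JSON example above, not the official types exported by @tanstack/ai:

typescript
// Illustrative request body shape, mirroring the JSON example above.
// The actual types shipped with @tanstack/ai may differ.
interface ChatRequestMessage {
  role: string; // e.g. 'user' or 'assistant'
  content: string;
}

interface ChatRequestBody {
  messages: ChatRequestMessage[];
  data?: Record<string, unknown>; // optional additional data
}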

HTTP Response

Status: 200 OK

Headers:

http
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

Body: Stream of SSE events


SSE Format

Each StreamChunk is transmitted as an SSE event with the following format:

data: {JSON_ENCODED_CHUNK}\n\n

Key Points

  1. Each event starts with data:
  2. Followed by the JSON-encoded chunk
  3. Ends with double newline \n\n
  4. No event names or IDs (not required for our use case)

Examples

Content Chunk

data: {"type":"content","id":"chatcmpl-abc123","model":"gpt-4o","timestamp":1701234567890,"delta":"Hello","content":"Hello","role":"assistant"}\n\n
data: {"type":"content","id":"chatcmpl-abc123","model":"gpt-4o","timestamp":1701234567890,"delta":"Hello","content":"Hello","role":"assistant"}\n\n

Tool Call Chunk

data: {"type":"tool_call","id":"chatcmpl-abc123","model":"gpt-4o","timestamp":1701234567891,"toolCall":{"id":"call_xyz","type":"function","function":{"name":"get_weather","arguments":"{\"location\":\"SF\"}"}},"index":0}\n\n
data: {"type":"tool_call","id":"chatcmpl-abc123","model":"gpt-4o","timestamp":1701234567891,"toolCall":{"id":"call_xyz","type":"function","function":{"name":"get_weather","arguments":"{\"location\":\"SF\"}"}},"index":0}\n\n

Done Chunk

data: {"type":"done","id":"chatcmpl-abc123","model":"gpt-4o","timestamp":1701234567892,"finishReason":"stop","usage":{"promptTokens":10,"completionTokens":5,"totalTokens":15}}\n\n
data: {"type":"done","id":"chatcmpl-abc123","model":"gpt-4o","timestamp":1701234567892,"finishReason":"stop","usage":{"promptTokens":10,"completionTokens":5,"totalTokens":15}}\n\n

Stream Lifecycle

1. Client Initiates Connection

typescript
// Client code
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ messages }),
});

2. Server Sends Response Header

http
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

3. Server Streams Chunks

The server sends multiple data: events as chunks are generated:

data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567890,"delta":"The","content":"The"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567891,"delta":" weather","content":"The weather"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567892,"delta":" is","content":"The weather is"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567893,"delta":" sunny","content":"The weather is sunny"}\n\n
data: {"type":"done","id":"msg_1","model":"gpt-4o","timestamp":1701234567894,"finishReason":"stop"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567890,"delta":"The","content":"The"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567891,"delta":" weather","content":"The weather"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567892,"delta":" is","content":"The weather is"}\n\n
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567893,"delta":" sunny","content":"The weather is sunny"}\n\n
data: {"type":"done","id":"msg_1","model":"gpt-4o","timestamp":1701234567894,"finishReason":"stop"}\n\n

4. Stream Completion

After the final chunk, the server sends a completion marker:

data: [DONE]\n\n

The server then closes the connection.


Error Handling

Server-Side Errors

If an error occurs during generation, send an error chunk:

data: {"type":"error","id":"msg_1","model":"gpt-4o","timestamp":1701234567895,"error":{"message":"Rate limit exceeded","code":"rate_limit_exceeded"}}\n\n
data: {"type":"error","id":"msg_1","model":"gpt-4o","timestamp":1701234567895,"error":{"message":"Rate limit exceeded","code":"rate_limit_exceeded"}}\n\n

Then close the connection.

Connection Errors

When the client uses the browser's native EventSource API, SSE provides automatic reconnection:

  • Browser automatically reconnects on connection drop
  • Server can send retry: field to control reconnection delay
  • Client can handle error events from EventSource (see the sketch below)
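A minimal sketch of that native behavior using the browser EventSource API. Note that EventSource only makes GET requests, so this assumes a hypothetical GET streaming endpoint (/api/chat/stream); the fetch-based POST transport shown elsewhere in this document does not get automatic reconnection from the browser:

typescript
// Native EventSource sketch (GET only). '/api/chat/stream' is a
// hypothetical endpoint used purely for illustration.
const source = new EventSource('/api/chat/stream');

source.onmessage = (event) => {
  if (event.data === '[DONE]') {
    source.close(); // prevent the browser from reconnecting once the stream is complete
    return;
  }
  const chunk = JSON.parse(event.data);
  console.log('received chunk', chunk);
};

source.onerror = () => {
  // Fires on connection drops; unless close() is called, the browser reconnects
  // automatically, honoring any `retry:` delay the server has sent.
  if (source.readyState === EventSource.CLOSED) {
    console.error('SSE connection closed');
  }
};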

Implementation

Server-Side (Node.js/TypeScript)

TanStack AI provides toServerSentEventsStream() and toServerSentEventsResponse() utilities:

typescript
import { chat, toServerSentEventsResponse } from '@tanstack/ai';
import { openaiText } from '@tanstack/ai-openai';

export async function POST(request: Request) {
  const { messages } = await request.json();

  const stream = chat({
    adapter: openaiText('gpt-4o'),
    messages,
  });

  // Automatically converts StreamChunks to SSE format
  return toServerSentEventsResponse(stream);
}

What toServerSentEventsResponse() does:

  1. Creates a ReadableStream from the async iterable
  2. Wraps each chunk as data: {JSON}\n\n
  3. Sends data: [DONE]\n\n at the end
  4. Sets proper SSE headers
  5. Handles errors and cleanup

Client-Side (Browser/Node.js)

TanStack AI provides fetchServerSentEvents() connection adapter:

typescript
import { useChat, fetchServerSentEvents } from '@tanstack/ai-react';

const { messages, sendMessage } = useChat({
  connection: fetchServerSentEvents('/api/chat'),
});

What fetchServerSentEvents() does:

  1. Makes POST request with messages
  2. Reads response body as stream
  3. Parses SSE format (data: prefix)
  4. Deserializes JSON chunks
  5. Yields StreamChunk objects
  6. Stops on [DONE] marker

Manual Implementation (Advanced)

If you need custom handling:

Server

typescript
import { chat } from '@tanstack/ai';
import { openaiText } from '@tanstack/ai-openai';

export async function POST(request: Request) {
  const { messages } = await request.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of chat({ adapter: openaiText('gpt-4o'), messages })) {
          const sseData = `data: ${JSON.stringify(chunk)}\n\n`;
          controller.enqueue(encoder.encode(sseData));
        }
        controller.enqueue(encoder.encode('data: [DONE]\n\n'));
        controller.close();
      } catch (error) {
        // Surface the failure to the client as an error chunk before closing
        const errorChunk = {
          type: 'error',
          error: { message: error instanceof Error ? error.message : String(error) },
        };
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(errorChunk)}\n\n`));
        controller.close();
      }
    },
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}

Client

typescript
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ messages }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';
  
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') continue;
      
      const chunk = JSON.parse(data);
      // Handle chunk...
    }
  }
}
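The // Handle chunk... placeholder above is where you branch on the chunk type. A minimal sketch based on the chunk examples shown earlier in this document; the real StreamChunk union from @tanstack/ai may carry additional fields:

typescript
// Illustrative chunk handling based on the example chunks in this document.
// `chunk` is typed loosely here; in practice you would use the StreamChunk type.
let assistantText = '';

function handleChunk(chunk: any) {
  switch (chunk.type) {
    case 'content':
      assistantText = chunk.content; // `content` is cumulative, `delta` is only the new text
      break;
    case 'tool_call':
      console.log('tool call requested:', chunk.toolCall?.function?.name);
      break;
    case 'error':
      console.error('stream error:', chunk.error?.message);
      break;
    case 'done':
      console.log('finished:', chunk.finishReason, chunk.usage);
      break;
  }
}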

Debugging

Inspecting SSE Traffic

Browser DevTools:

  1. Open Network tab
  2. Look for requests with text/event-stream type
  3. View response as it streams in

cURL:

bash
curl -N -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'

The -N flag disables buffering so the output appears in real time.

Example Output:

data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567890,"delta":"Hello","content":"Hello"}

data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567891,"delta":" there","content":"Hello there"}

data: {"type":"done","id":"msg_1","model":"gpt-4o","timestamp":1701234567892,"finishReason":"stop"}

data: [DONE]
data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567890,"delta":"Hello","content":"Hello"}

data: {"type":"content","id":"msg_1","model":"gpt-4o","timestamp":1701234567891,"delta":" there","content":"Hello there"}

data: {"type":"done","id":"msg_1","model":"gpt-4o","timestamp":1701234567892,"finishReason":"stop"}

data: [DONE]

Advantages of SSE

  1. Built-in Reconnection - Browser handles connection drops automatically
  2. Simpler than WebSocket - No handshake, just HTTP
  3. Server-to-Client Only - Matches chat streaming use case perfectly
  4. Wide Browser Support - Works in all modern browsers (not in Internet Explorer)
  5. Proxy-Friendly - Works through most HTTP proxies
  6. Easy to Debug - Plain text format, visible in DevTools

Limitations

  1. One-Way Communication - Server to client only (fine for streaming responses)
  2. HTTP/1.1 Connection Limits - Browsers limit concurrent connections per domain (6-8)
  3. No Binary Data - Text-only (not an issue for JSON chunks)
  4. HTTP/2 Streams - Running SSE over HTTP/2 can be more efficient and avoids the connection limit above, but SSE over HTTP/1.1 works fine

Best Practices

  1. Always set proper headers - Content-Type, Cache-Control, Connection
  2. Send [DONE] marker - Helps client know when to close
  3. Handle errors gracefully - Send error chunk before closing
  4. Use compression - Enable gzip/brotli at the reverse proxy level
  5. Set timeouts - Prevent hanging connections (see the sketch after this list)
  6. Monitor connection count - Watch for connection leaks
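For the timeout recommendation above, one common pattern is a server-side heartbeat plus a client-side idle timeout. This is a sketch of the general technique, not part of the TanStack AI API; lines beginning with a colon are SSE comments and are ignored by SSE parsers:

typescript
// Server: emit an SSE comment every 15s so idle connections are not
// dropped by intermediaries. Not part of @tanstack/ai; illustrative only.
function startHeartbeat(controller: ReadableStreamDefaultController<Uint8Array>) {
  const encoder = new TextEncoder();
  const interval = setInterval(() => {
    controller.enqueue(encoder.encode(': keep-alive\n\n'));
  }, 15_000);
  return () => clearInterval(interval); // call this when the stream finishes or errors
}

// Client: abort the request if the response takes longer than 60s overall.
const abort = new AbortController();
const timeout = setTimeout(() => abort.abort(), 60_000);

const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ messages: [{ role: 'user', content: 'Hello' }] }),
  signal: abort.signal,
});
// ...read the stream as shown in the manual client example above...
clearTimeout(timeout);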

See Also