# AI Streaming

> Stream LLM tokens from your Cloudflare Worker to clients as they are generated. Both SSE and WebSocket delivery are supported out of the box.

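The pattern on this page uses WebSocket delivery: the server emits each token to a per-session channel as the model produces it, and the client appends tokens to the displayed text as they arrive.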

## Server-side streaming

```ts
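// Assumes `app` (the router) and `sdk` are already initialized elsewhere in the Worker.
app.post('/api/chat', async (c) => {
  const { message, sessionId } = await c.req.json()

  // Start the completion; the result is consumed below as an async iterable of tokens
  const stream = await sdk.ai.streamCompletion({
    prompt: message,
    model: 'gpt-4o-mini',
  })

  // One channel per chat session, so only that session's subscribers receive tokens
  const channel = `ai-chat/${sessionId}`

  // Forward each token to WebSocket subscribers as it arrives
  for await (const token of stream) {
    sdk.socket.emit('ai:token', { token }, channel)
  }

  // Tell subscribers that the response is complete
  sdk.socket.emit('ai:done', {}, channel)

  return c.json({ ok: true })
})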
```
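If the model stream throws partway through, the handler above never emits `ai:done`, and clients may stay in a streaming state. One way to guard against that is to move the loop into a small helper that emits `ai:done` in a `finally` block. The sketch below is illustrative only: `relayTokens`, `Emit`, and `fakeStream` are hypothetical names, not part of the SDK, and the `Emit` signature is simply modeled on the `sdk.socket.emit(event, payload, channel)` calls shown above.

```ts
// Hypothetical emit signature, modeled on the sdk.socket.emit(...) calls above.
type Emit = (event: string, payload: Record<string, unknown>, channel: string) => void

// Forward every token to the channel, then always signal completion,
// even if the stream throws partway through. Returns the token count.
async function relayTokens(
  stream: AsyncIterable<string>,
  emit: Emit,
  channel: string,
): Promise<number> {
  let count = 0
  try {
    for await (const token of stream) {
      emit('ai:token', { token }, channel)
      count++
    }
  } finally {
    // Runs on both success and failure, so clients can leave the streaming state
    emit('ai:done', {}, channel)
  }
  return count
}

// Stand-in for a model stream so the sketch runs without an LLM provider.
async function* fakeStream(tokens: string[]): AsyncGenerator<string> {
  for (const t of tokens) yield t
}
```

In the route handler, the loop and the final emit could then be replaced by something like `await relayTokens(stream, (e, p, ch) => sdk.socket.emit(e, p, ch), channel)`, assuming the SDK's stream yields string tokens. An error from the stream still propagates to the caller after `ai:done` is sent, so the handler can decide how to respond.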

## Client-side receiving

```tsx
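useEffect(() => {
  // Subscribe to the same per-session channel the server emits to
  const channel = realtime.channel(`ai-chat/${sessionId}`)

  channel
    .on('ai:token', ({ data }) => {
      // Append each token to the text rendered so far
      setStreamingText(prev => prev + data.token)
    })
    .on('ai:done', () => {
      setIsStreaming(false)
    })
    .subscribe()

  // Unsubscribe on unmount, or before re-subscribing for a new session
  return () => channel.unsubscribe()
}, [sessionId]) // re-run when the session changes so the channel name stays current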
```
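The two handlers above amount to a small state machine: `ai:token` appends text and `ai:done` ends the streaming state. As a rough sketch, that logic can be pulled into a pure function that is easy to unit test or to plug into `useReducer`. The names `StreamState`, `StreamEvent`, and `applyStreamEvent` are hypothetical and introduced only for illustration; the event shapes mirror the payloads emitted by the server example.

```ts
// Hypothetical client-side state for one streamed response.
type StreamState = { text: string; isStreaming: boolean }

// Events mirroring the 'ai:token' and 'ai:done' messages from the server example.
type StreamEvent =
  | { type: 'ai:token'; token: string }
  | { type: 'ai:done' }

// Pure transition function: returns a new state and never mutates the input.
function applyStreamEvent(state: StreamState, event: StreamEvent): StreamState {
  switch (event.type) {
    case 'ai:token':
      // Append the token and mark the response as in progress
      return { text: state.text + event.token, isStreaming: true }
    case 'ai:done':
      // Keep the accumulated text and leave the streaming state
      return { ...state, isStreaming: false }
  }
}
```

Keeping the transition pure means the subscription code only has to translate channel messages into `StreamEvent` values, which keeps the effect itself small.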

For the complete client + server implementation, see [AI Streaming](/features/realtime/ai-streaming) in the Realtime docs and the [AI Streaming Chat example](/examples/ai-streaming-chat).