AI Streaming — Realtime

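Stream LLM tokens to clients in real time over WebSocket. With SSE, your server holds one open response per client and writes every token to each of them. With Aerostack, your server emits each token once to a channel, and the channel delivers it to every subscriber, even thousands of concurrent watchers.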

How it works

1. The client subscribes to a session-specific channel.
2. The client sends an HTTP request to your server to trigger generation.
3. The server loops through the LLM tokens, emitting each one via sdk.socket.emit().
4. All channel subscribers receive the tokens as they arrive.
5. The server emits ai:done to signal completion.

Client A subscribes to 'ai-chat/session-xyz'
Client B subscribes to 'ai-chat/session-xyz'

POST /ai/chat → triggers server
Server: sdk.socket.emit('ai:token', { token: 'Hello' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:token', { token: ' world' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:done', {}, 'ai-chat/session-xyz')

Both Client A and Client B see the streaming response.
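
The fan-out in the flow above can be tried locally with a small in-memory stand-in. This is only a sketch: FakeHub and makeWatcher are hypothetical names, not part of the Aerostack SDK, and the hub merely mimics the "emit once, every subscriber receives" behavior using the same argument order (event, data, channel) as the sdk.socket.emit() calls shown above.

```javascript
// Hypothetical in-memory stand-in for a realtime channel hub.
// It is NOT the Aerostack SDK; it only mimics channel fan-out.
class FakeHub {
  constructor() {
    this.channels = new Map() // channel name -> Set of listener functions
  }

  subscribe(channel, listener) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set())
    this.channels.get(channel).add(listener)
    return () => this.channels.get(channel).delete(listener) // unsubscribe
  }

  // Same argument order as the emit calls above: event, data, channel
  emit(event, data, channel) {
    for (const listener of this.channels.get(channel) ?? []) {
      listener({ event, data })
    }
  }
}

// Builds a watcher that concatenates tokens until 'ai:done' arrives
function makeWatcher() {
  const state = { text: '', done: false }
  const listener = ({ event, data }) => {
    if (event === 'ai:token') state.text += data.token
    if (event === 'ai:done') state.done = true
  }
  return { state, listener }
}

const hub = new FakeHub()
const clientA = makeWatcher()
const clientB = makeWatcher()
hub.subscribe('ai-chat/session-xyz', clientA.listener)
hub.subscribe('ai-chat/session-xyz', clientB.listener)

// The server emits each event once; both watchers receive it
hub.emit('ai:token', { token: 'Hello' }, 'ai-chat/session-xyz')
hub.emit('ai:token', { token: ' world' }, 'ai-chat/session-xyz')
hub.emit('ai:done', {}, 'ai-chat/session-xyz')

console.log(clientA.state) // { text: 'Hello world', done: true }
console.log(clientB.state) // { text: 'Hello world', done: true }
```

Three emits on the server side produce the full response for both watchers, which is the property the rest of this page relies on.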

Client setup

import { useEffect, useRef, useState } from 'react'
import { useAerostack } from '@aerostack/react'
import { v4 as uuid } from 'uuid'

function AiChat() {
  const { realtime } = useAerostack()
  const [messages, setMessages] = useState([])
  const [isStreaming, setIsStreaming] = useState(false)
  const sessionId = useRef(uuid())

  useEffect(() => {
    const channel = realtime.channel(`ai-chat/${sessionId.current}`)

    channel
      .on('ai:token', ({ data }) => {
        // Append each token to the last assistant message
        setMessages(prev => {
          const last = prev[prev.length - 1]
          if (last?.role === 'assistant') {
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.token }
            ]
          }
          return [...prev, { role: 'assistant', content: data.token }]
        })
      })
      .on('ai:done', () => {
        setIsStreaming(false)
      })
      .subscribe()

    return () => channel.unsubscribe()
  }, [realtime])

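  const sendMessage = async (text) => {
    setMessages(prev => [...prev, { role: 'user', content: text }])
    setIsStreaming(true)

    try {
      // The path must match wherever your Worker's /ai/chat route is mounted
      const res = await fetch('/api/ai/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: text,
          sessionId: sessionId.current,
        }),
      })
      // A rejected request may never produce ai:done, so stop the cursor here
      if (!res.ok) setIsStreaming(false)
    } catch {
      // Network error: no tokens will arrive
      setIsStreaming(false)
    }
  }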

  return (
    <div>
      {messages.map((m, i) => (
        <div key={i} className={m.role === 'user' ? 'text-right' : 'text-left'}>
          {m.content}
          {isStreaming && i === messages.length - 1 && m.role === 'assistant' && (
            <span className="animate-pulse">▊</span>
          )}
        </div>
      ))}
    </div>
  )
}
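
The token-append logic inside the ai:token handler is a pure function of the previous messages, so it can be pulled out and checked without React. The sketch below does that; appendToken is a hypothetical helper name, not something the SDK provides.

```javascript
// Hypothetical helper: returns a NEW messages array with `token` appended.
// If the last message is from the assistant, its content is extended;
// otherwise a new assistant message is started.
function appendToken(messages, token) {
  const last = messages[messages.length - 1]
  if (last?.role === 'assistant') {
    return [...messages.slice(0, -1), { ...last, content: last.content + token }]
  }
  return [...messages, { role: 'assistant', content: token }]
}

const initial = [{ role: 'user', content: 'Hi' }]
let messages = appendToken(initial, 'Hello')
messages = appendToken(messages, ' world')

console.log(messages)
// [ { role: 'user', content: 'Hi' }, { role: 'assistant', content: 'Hello world' } ]
```

In the component, the handler body would then reduce to setMessages(prev => appendToken(prev, data.token)). Because the helper never mutates its input, it is safe to use inside a React state updater.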

Server handler

// In your Worker
import { sdk } from '@aerostack/sdk'

app.post('/ai/chat', async (c) => {
  const { message, sessionId } = await c.req.json()
  const roomId = `ai-chat/${sessionId}`

  // Stream from your AI provider, token by token
  const stream = await sdk.ai.streamCompletion({
    prompt: message,
    model: 'gpt-4o-mini',
  })

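  try {
    for await (const token of stream) {
      sdk.socket.emit('ai:token', { token }, roomId)
    }
  } finally {
    // Always signal completion, even if the provider stream throws,
    // so clients do not stay stuck in the streaming state
    sdk.socket.emit('ai:done', {}, roomId)
  }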

  return c.json({ ok: true })
})
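
To check the handler's emit sequence without calling a real AI provider, the loop can be moved into a helper that receives the token stream and the emit function as arguments. The following is a sketch under assumptions: pipeTokens, fakeStream, and makeRecorder are hypothetical names, and the only thing assumed about the provider stream is that it is async-iterable, as the for await loop in the handler implies.

```javascript
// Hypothetical helper: forwards every token from `stream` to `roomId` through
// the injected `emit` function and always finishes with 'ai:done'.
async function pipeTokens(stream, emit, roomId) {
  let count = 0
  try {
    for await (const token of stream) {
      emit('ai:token', { token }, roomId)
      count += 1
    }
  } finally {
    // Runs on success and on failure, so clients can leave the streaming state
    emit('ai:done', {}, roomId)
  }
  return count
}

// Stand-in for a provider stream; throws once `failAfter` tokens were yielded
async function* fakeStream(tokens, failAfter = Infinity) {
  let yielded = 0
  for (const token of tokens) {
    if (yielded >= failAfter) throw new Error('provider failed')
    yielded += 1
    yield token
  }
}

// Records emits instead of sending them anywhere
function makeRecorder() {
  const events = []
  const emit = (event, data, room) => { events.push([event, data, room]) }
  return { events, emit }
}

async function demo() {
  const ok = makeRecorder()
  const count = await pipeTokens(fakeStream(['Hello', ' world']), ok.emit, 'ai-chat/test')

  const failed = makeRecorder()
  // The error still propagates to the caller; it is swallowed here for the demo
  await pipeTokens(fakeStream(['Hello', ' world'], 1), failed.emit, 'ai-chat/test')
    .catch(() => {})

  return {
    count,
    ok: ok.events.map(e => e[0]),
    failed: failed.events.map(e => e[0]),
  }
}

demo().then(result => console.log(result))
```

In the Worker, the handler could call pipeTokens(stream, (event, data, room) => sdk.socket.emit(event, data, room), roomId). Injecting emit keeps the helper independent of the SDK, which is what makes the dry run possible.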

Advantages over SSE

Feature	SSE	Aerostack Realtime
Streams your server writes for 1000 watchers	1000	1 (shared channel)
Works through proxies/CDN	Sometimes	Yes (WebSocket)
Bi-directional	No	Yes
Persistent history	Manual	Built-in (`persist: true`)

See the AI Streaming Chat example for a complete working implementation.