
AI Streaming

Stream LLM tokens to clients in real time via WebSocket. Unlike SSE, where each watcher needs its own generation stream from the server, a single Aerostack channel can fan one AI stream out to thousands of concurrent watchers.

How it works

  1. Client subscribes to a session-specific channel
  2. Client sends an HTTP request to your server to trigger generation
  3. Server loops through LLM tokens, emitting each one via sdk.socket.emit()
  4. All channel subscribers receive tokens as they arrive
  5. Server emits ai:done to signal completion

Client A subscribes to 'ai-chat/session-xyz'
Client B subscribes to 'ai-chat/session-xyz'

POST /api/ai/chat → triggers server
Server: sdk.socket.emit('ai:token', { token: 'Hello' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:token', { token: ' world' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:done', {}, 'ai-chat/session-xyz')

Both Client A and Client B see the streaming response.
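The fan-out above can be sketched with a minimal in-memory stand-in for a channel. (MiniChannel is purely illustrative and not part of the Aerostack SDK; the real transport is Aerostack's WebSocket layer.)

```javascript
// Minimal in-memory stand-in for a realtime channel: one emit reaches
// every subscriber, which is what lets many clients watch one AI stream.
class MiniChannel {
  constructor() { this.handlers = [] }
  subscribe(handler) { this.handlers.push(handler) }
  emit(event, data) {
    for (const handler of this.handlers) handler(event, data)
  }
}

const channel = new MiniChannel() // stands in for 'ai-chat/session-xyz'
const seenByA = []
const seenByB = []
channel.subscribe((event, data) => seenByA.push([event, data]))
channel.subscribe((event, data) => seenByB.push([event, data]))

// One emit per token on the server side...
channel.emit('ai:token', { token: 'Hello' })
channel.emit('ai:token', { token: ' world' })
channel.emit('ai:done', {})

// ...and both subscribers observe the identical event sequence.
console.log(seenByA.length, seenByB.length) // 3 3
```

The key property is that the server does the generation work once per session, regardless of how many clients are watching.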

Client setup

import { useAerostack } from '@aerostack/react'
import { useEffect, useRef, useState } from 'react'
import { v4 as uuid } from 'uuid'
 
function AiChat() {
  const { realtime } = useAerostack()
  const [messages, setMessages] = useState([])
  const [isStreaming, setIsStreaming] = useState(false)
  const sessionId = useRef(uuid())
 
  useEffect(() => {
    const channel = realtime.channel(`ai-chat/${sessionId.current}`)
 
    channel
      .on('ai:token', ({ data }) => {
        // Append each token to the last assistant message
        setMessages(prev => {
          const last = prev[prev.length - 1]
          if (last?.role === 'assistant') {
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.token }
            ]
          }
          return [...prev, { role: 'assistant', content: data.token }]
        })
      })
      .on('ai:done', () => {
        setIsStreaming(false)
      })
      .subscribe()
 
    return () => channel.unsubscribe()
  }, [realtime])
 
  const sendMessage = async (text) => {
    setMessages(prev => [...prev, { role: 'user', content: text }])
    setIsStreaming(true)
 
    await fetch('/api/ai/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message: text,
        sessionId: sessionId.current,
      }),
    })
  }
 
  return (
    <div>
      {messages.map((m, i) => (
        <div key={i} className={m.role === 'user' ? 'text-right' : 'text-left'}>
          {m.content}
          {isStreaming && i === messages.length - 1 && m.role === 'assistant' && (
            <span className="animate-pulse">▊</span>
          )}
        </div>
      ))}
    </div>
  )
}
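The token-append logic inside the ai:token handler can be factored into a pure function, which makes it easy to unit-test independently of React. (appendToken is a local refactor for illustration, not part of the SDK.)

```javascript
// Pure version of the updater passed to setMessages in the ai:token handler:
// extend the trailing assistant message, or start a new one.
function appendToken(messages, token) {
  const last = messages[messages.length - 1]
  if (last?.role === 'assistant') {
    return [...messages.slice(0, -1), { ...last, content: last.content + token }]
  }
  return [...messages, { role: 'assistant', content: token }]
}

let msgs = [{ role: 'user', content: 'Hi' }]
msgs = appendToken(msgs, 'Hello')   // starts a new assistant message
msgs = appendToken(msgs, ' world')  // appends to it
console.log(msgs[1].content) // 'Hello world'
```

Returning new arrays and objects (rather than mutating) is what lets React detect the state change on every token.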

Server handler

// In your Worker
import { sdk } from '@aerostack/sdk'
 
app.post('/ai/chat', async (c) => {
  const { message, sessionId } = await c.req.json()
  const roomId = `ai-chat/${sessionId}`
 
  // Stream from your AI provider, token by token
  const stream = await sdk.ai.streamCompletion({
    prompt: message,
    model: 'gpt-4o-mini',
  })
 
  for await (const token of stream) {
    sdk.socket.emit('ai:token', { token }, roomId)
  }
 
  sdk.socket.emit('ai:done', {}, roomId)
 
  return c.json({ ok: true })
})
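If the model produces tokens faster than you want to broadcast, you can coalesce several tokens into one emit. This is an optional optimization, not something the SDK does for you; batchTokens below is a hypothetical helper shown over a finished token array for clarity.

```javascript
// Group a token list into strings of up to `size` tokens, so the server
// emits one ai:token event per batch instead of one per token.
// Hypothetical helper -- tune the batch size to your latency budget.
function batchTokens(tokens, size) {
  const batches = []
  for (let i = 0; i < tokens.length; i += size) {
    batches.push(tokens.slice(i, i + size).join(''))
  }
  return batches
}

const batches = batchTokens(['He', 'llo', ' ', 'wor', 'ld'], 2)
console.log(batches) // [ 'Hello', ' wor', 'ld' ]
```

In the streaming loop above you would instead accumulate tokens into a buffer and flush it every N tokens (or every few milliseconds), trading a little latency for far fewer events on busy channels.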

Use session-specific channel names (e.g., ai-chat/{uuid}) to keep conversations private. Multiple users watching the same session ID will all receive the same stream — useful for pair programming or collaborative Q&A.
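Because anyone who knows a channel name can subscribe to it, it is worth validating the client-supplied session ID before emitting to the derived channel. A minimal sketch, assuming UUID-shaped session IDs (sessionChannel is a hypothetical helper, not an SDK function):

```javascript
// Build a session channel name, rejecting anything that isn't a v4-style
// UUID so callers can't inject arbitrary channel names like 'admin/logs'.
// Hypothetical helper -- adapt to however your sessions are identified.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

function sessionChannel(sessionId) {
  if (!UUID_RE.test(sessionId)) {
    throw new Error('invalid session id')
  }
  return `ai-chat/${sessionId}`
}

console.log(sessionChannel('123e4567-e89b-12d3-a456-426614174000'))
// → 'ai-chat/123e4567-e89b-12d3-a456-426614174000'
```

In the server handler above, this check would run before the call to sdk.ai.streamCompletion, so a malformed sessionId never reaches the emit loop.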

Advantages over SSE

|                               | SSE       | Aerostack Realtime       |
| ----------------------------- | --------- | ------------------------ |
| Connections for 1000 watchers | 1000      | 1 (shared channel)       |
| Works through proxies/CDNs    | Sometimes | Yes (WebSocket)          |
| Bi-directional                | No        | Yes                      |
| Persistent history            | Manual    | Built-in (persist: true) |

See the AI Streaming Chat example for a complete working implementation.