
AI Streaming — Realtime

Stream LLM tokens to clients in real time via WebSocket. Unlike SSE (which requires one connection per client), a single Aerostack channel can deliver an AI stream to thousands of concurrent watchers.

  1. Client subscribes to a session-specific channel
  2. Client sends an HTTP request to your server to trigger generation
  3. Server loops through LLM tokens, emitting each one via sdk.socket.emit()
  4. All channel subscribers receive tokens as they arrive
  5. Server emits ai:done to signal completion
Client A subscribes to 'ai-chat/session-xyz'
Client B subscribes to 'ai-chat/session-xyz'
POST /api/ai/chat → triggers server
Server: sdk.socket.emit('ai:token', { token: 'Hello' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:token', { token: ' world' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:done', {}, 'ai-chat/session-xyz')
Both Client A and Client B see the streaming response.
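
The fan-out in the flow above can be sketched without any network at all. The snippet below is a minimal in-memory stand-in, not the Aerostack SDK: `InMemoryHub` and `simulateFlow` are hypothetical names introduced here, and only the `(event, data, channel)` argument order of `emit` is taken from the `sdk.socket.emit()` calls shown above. It illustrates that one emit per token reaches every subscriber of the channel.

```javascript
// Hypothetical in-memory stand-in for a realtime channel hub (not the Aerostack SDK).
class InMemoryHub {
  constructor() {
    this.channels = new Map() // channel name -> Set of subscriber callbacks
  }

  // Register a callback for a channel; returns an unsubscribe function.
  subscribe(channel, callback) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set())
    this.channels.get(channel).add(callback)
    return () => this.channels.get(channel).delete(callback)
  }

  // Same argument order as the sdk.socket.emit() calls above: event, data, channel.
  emit(event, data, channel) {
    for (const callback of this.channels.get(channel) ?? []) {
      callback({ event, data })
    }
  }
}

// Replays the flow above: two subscribers, three server-side emits.
function simulateFlow() {
  const hub = new InMemoryHub()
  const room = 'ai-chat/session-xyz'
  const clientA = []
  const clientB = []
  hub.subscribe(room, (msg) => clientA.push(msg))
  hub.subscribe(room, (msg) => clientB.push(msg))

  hub.emit('ai:token', { token: 'Hello' }, room)
  hub.emit('ai:token', { token: ' world' }, room)
  hub.emit('ai:done', {}, room)
  return { clientA, clientB }
}
```

Both `clientA` and `clientB` end up with the same three events in the same order, which is the behavior the shared channel is meant to provide.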
import { useEffect, useRef, useState } from 'react'
import { useAerostack } from '@aerostack/react'
import { v4 as uuid } from 'uuid'

function AiChat() {
  const { realtime } = useAerostack()
  const [messages, setMessages] = useState([])
  const [isStreaming, setIsStreaming] = useState(false)
  // One session ID per mounted component; it names the channel to subscribe to
  const sessionId = useRef(uuid())

  useEffect(() => {
    const channel = realtime.channel(`ai-chat/${sessionId.current}`)
    channel
      .on('ai:token', ({ data }) => {
        // Append each token to the last assistant message
        setMessages(prev => {
          const last = prev[prev.length - 1]
          if (last?.role === 'assistant') {
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.token }
            ]
          }
          // First token of a reply: start a new assistant message
          return [...prev, { role: 'assistant', content: data.token }]
        })
      })
      .on('ai:done', () => {
        setIsStreaming(false)
      })
      .subscribe()
    // Unsubscribe when the component unmounts
    return () => channel.unsubscribe()
  }, [realtime])

  const sendMessage = async (text) => {
    setMessages(prev => [...prev, { role: 'user', content: text }])
    setIsStreaming(true)
    // Trigger generation; tokens arrive over the channel, not in this response
    await fetch('/api/ai/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message: text,
        sessionId: sessionId.current,
      }),
    })
  }

  return (
    <div>
      {messages.map((m, i) => (
        <div key={i} className={m.role === 'user' ? 'text-right' : 'text-left'}>
          {m.content}
          {/* Show a cursor on the assistant message that is still streaming */}
          {isStreaming && i === messages.length - 1 && m.role === 'assistant' && (
            <span className="animate-pulse">▊</span>
          )}
        </div>
      ))}
    </div>
  )
}
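
The token-append logic inside the `ai:token` handler is a pure state transition, so it may be worth pulling out where it can be tested without React or a live channel. The sketch below assumes a helper named `appendToken` (a name chosen here, not part of any Aerostack package) that mirrors the updater passed to `setMessages` above.

```javascript
// Pure helper mirroring the updater passed to setMessages in the 'ai:token' handler.
// `appendToken` is a name chosen for this sketch.
function appendToken(messages, token) {
  const last = messages[messages.length - 1]
  if (last?.role === 'assistant') {
    // Extend the in-progress assistant message without mutating the previous state
    return [...messages.slice(0, -1), { ...last, content: last.content + token }]
  }
  // First token of a reply: start a new assistant message
  return [...messages, { role: 'assistant', content: token }]
}

// Feed two tokens through the helper, starting from a single user message
const before = [{ role: 'user', content: 'Hi' }]
const after = ['Hello', ' world'].reduce(appendToken, before)
```

In the component, the handler body would then reduce to `setMessages(prev => appendToken(prev, data.token))`.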
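// In your Worker
// `app` is assumed to be an existing router instance exposing app.post()
import { sdk } from '@aerostack/sdk'

app.post('/ai/chat', async (c) => {
  const { message, sessionId } = await c.req.json()
  // Must match the channel name the client subscribed to
  const roomId = `ai-chat/${sessionId}`

  // Stream from your AI provider, token by token
  const stream = await sdk.ai.streamCompletion({
    prompt: message,
    model: 'gpt-4o-mini',
  })

  // Forward each token to every subscriber of the channel
  for await (const token of stream) {
    sdk.socket.emit('ai:token', { token }, roomId)
  }

  // Signal completion so clients can clear their streaming state
  sdk.socket.emit('ai:done', {}, roomId)
  return c.json({ ok: true })
})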
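
If the provider stream throws partway through, the handler above never emits `ai:done`, and clients would stay in their streaming state. One possible way to guard against that is sketched below. `streamToChannel` and the `ai:error` event are hypothetical names introduced for illustration, not documented Aerostack features, and `emit` is passed in as a parameter so the helper runs without the SDK.

```javascript
// `streamToChannel` and the 'ai:error' event are hypothetical names for this sketch.
// `emit` is injected so the helper can be exercised without the SDK; in the Worker
// you could pass (event, data, room) => sdk.socket.emit(event, data, room).
async function streamToChannel(stream, emit, roomId) {
  let count = 0
  try {
    for await (const token of stream) {
      emit('ai:token', { token }, roomId)
      count++
    }
  } catch (err) {
    // Tell subscribers why the stream stopped early
    emit('ai:error', { message: String(err?.message ?? err) }, roomId)
  } finally {
    // Always signal completion so clients can clear their streaming state
    emit('ai:done', {}, roomId)
  }
  return count // number of tokens forwarded
}

// Stand-in provider stream that fails after two tokens
async function* flakyStream() {
  yield 'Hello'
  yield ' world'
  throw new Error('provider timeout')
}
```

Because completion is emitted in `finally`, a client that only listens for `ai:done` still recovers, while a client that also handles the assumed `ai:error` event can show a message.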
| | SSE | Aerostack Realtime |
|---|---|---|
| Connections for 1000 watchers | 1000 | 1 (shared channel) |
| Works through proxies/CDN | Sometimes | Yes (WebSocket) |
| Bi-directional | No | Yes |
| Persistent history | Manual | Built-in (persist: true) |

See the AI Streaming Chat example for a complete working implementation.