# AI Streaming
Stream LLM tokens to clients in real time via WebSocket. Unlike SSE, where your origin must push a separate stream to every client, a single Aerostack channel fans one stream out to thousands of concurrent watchers.
## How it works

- Client subscribes to a session-specific channel
- Client sends an HTTP request to your server to trigger generation
- Server loops through LLM tokens, emitting each one via `sdk.socket.emit()`
- All channel subscribers receive tokens as they arrive
- Server emits `ai:done` to signal completion
```
Client A subscribes to 'ai-chat/session-xyz'
Client B subscribes to 'ai-chat/session-xyz'

POST /ai/chat → triggers server

Server: sdk.socket.emit('ai:token', { token: 'Hello' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:token', { token: ' world' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:done', {}, 'ai-chat/session-xyz')
```

Both Client A and Client B see the streaming response.

## Client setup
```jsx
import { useAerostack } from '@aerostack/react'
import { useEffect, useRef, useState } from 'react'
import { v4 as uuid } from 'uuid'

function AiChat() {
  const { realtime } = useAerostack()
  const [messages, setMessages] = useState([])
  const [isStreaming, setIsStreaming] = useState(false)
  const sessionId = useRef(uuid())

  useEffect(() => {
    const channel = realtime.channel(`ai-chat/${sessionId.current}`)

    channel
      .on('ai:token', ({ data }) => {
        // Append each token to the last assistant message
        setMessages(prev => {
          const last = prev[prev.length - 1]
          if (last?.role === 'assistant') {
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.token },
            ]
          }
          return [...prev, { role: 'assistant', content: data.token }]
        })
      })
      .on('ai:done', () => {
        setIsStreaming(false)
      })
      .subscribe()

    return () => channel.unsubscribe()
  }, [realtime])

  const sendMessage = async (text) => {
    setMessages(prev => [...prev, { role: 'user', content: text }])
    setIsStreaming(true)

    await fetch('/api/ai/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message: text,
        sessionId: sessionId.current,
      }),
    })
  }

  // Render messages; wire sendMessage to your input UI
  return (
    <div>
      {messages.map((m, i) => (
        <div key={i} className={m.role === 'user' ? 'text-right' : 'text-left'}>
          {m.content}
          {isStreaming && i === messages.length - 1 && m.role === 'assistant' && (
            <span className="animate-pulse">▊</span>
          )}
        </div>
      ))}
    </div>
  )
}
```

## Server handler
```js
// In your Worker
import { sdk } from '@aerostack/sdk'

app.post('/ai/chat', async (c) => {
  const { message, sessionId } = await c.req.json()
  const roomId = `ai-chat/${sessionId}`

  // Stream from your AI provider, token by token
  const stream = await sdk.ai.streamCompletion({
    prompt: message,
    model: 'gpt-4o-mini',
  })

  for await (const token of stream) {
    sdk.socket.emit('ai:token', { token }, roomId)
  }

  sdk.socket.emit('ai:done', {}, roomId)
  return c.json({ ok: true })
})
```

Use session-specific channel names (e.g., `ai-chat/{uuid}`) to keep conversations private. Multiple users watching the same session ID will all receive the same stream, which is useful for pair programming or collaborative Q&A.
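If your provider yields tokens faster than you want to broadcast them, you can coalesce several tokens into one emit. Below is a minimal sketch of a flush-on-threshold buffer; the batch size and the idea of emitting joined text are illustrative assumptions, not part of the Aerostack API:

```typescript
// Collects tokens and flushes them in batches to reduce per-token emits.
// `flush` receives the joined text each time the buffer reaches `batchSize`
// tokens; call `drain()` once after the stream ends to flush any remainder.
function createTokenBatcher(
  batchSize: number,
  flush: (text: string) => void
) {
  let buffer: string[] = []
  return {
    push(token: string) {
      buffer.push(token)
      if (buffer.length >= batchSize) {
        flush(buffer.join(''))
        buffer = []
      }
    },
    drain() {
      if (buffer.length > 0) {
        flush(buffer.join(''))
        buffer = []
      }
    },
  }
}
```

In the server loop you would `push()` each token (with `flush` calling `sdk.socket.emit`) and `drain()` just before emitting `ai:done`.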
## Advantages over SSE
| | SSE | Aerostack Realtime |
|---|---|---|
| Connections for 1000 watchers | 1000 | 1 (shared channel) |
| Works through proxies/CDNs | Sometimes | Yes (WebSocket) |
| Bi-directional | No | Yes |
| Persistent history | Manual | Built-in (`persist: true`) |
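Because the client handler only ever appends a token to the trailing assistant message, the same logic works whether tokens arrive live or are replayed from persisted history after a late join. A sketch of that pure reducer (how replay is triggered depends on the channel's `persist` option, which is assumed here, not shown):

```typescript
type Message = { role: 'user' | 'assistant'; content: string }

// Appends a streamed token to the trailing assistant message,
// starting a new assistant message if the last one came from the user.
function appendToken(messages: Message[], token: string): Message[] {
  const last = messages[messages.length - 1]
  if (last?.role === 'assistant') {
    return [
      ...messages.slice(0, -1),
      { ...last, content: last.content + token },
    ]
  }
  return [...messages, { role: 'assistant', content: token }]
}
```

Replaying `['Hel', 'lo']` over an empty list yields a single assistant message with content `'Hello'`, so a reconnecting client ends up with the same state as one that watched the stream live.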
See the AI Streaming Chat example for a complete working implementation.