AI Streaming — Realtime
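Stream LLM tokens to clients in real time over WebSocket. With SSE, each client holds its own streaming response from your server. With Aerostack, your server emits each token once to a channel and the channel delivers it to every subscriber, so a single channel can serve an AI stream to thousands of concurrent watchers.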
How it works
- Client subscribes to a session-specific channel
- Client sends an HTTP request to your server to trigger generation
- Server loops through LLM tokens, emitting each one via sdk.socket.emit()
- All channel subscribers receive tokens as they arrive
- Server emits ai:done to signal completion
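
The steps above can be sketched as a small in-memory simulation. This is not Aerostack's implementation: FakeChannel, createWatcher, and fakeTokenStream are hypothetical stand-ins for the realtime channel, a subscribed client, and the LLM stream. The sketch only illustrates the event sequence (ai:token repeated, then ai:done) and that one emit reaches every subscriber.

```javascript
// Hypothetical in-memory stand-in for a realtime channel (not the Aerostack SDK).
class FakeChannel {
  constructor() {
    this.subscribers = []
  }

  // Each subscriber is a map of event name -> handler.
  subscribe(handlers) {
    this.subscribers.push(handlers)
  }

  // One emit fans out to every subscriber that handles the event.
  emit(event, data) {
    for (const handlers of this.subscribers) {
      if (handlers[event]) handlers[event]({ data })
    }
  }
}

// Builds a watcher that accumulates tokens until ai:done arrives.
function createWatcher() {
  const state = { text: '', done: false }
  const handlers = {
    'ai:token': ({ data }) => { state.text += data.token },
    'ai:done': () => { state.done = true },
  }
  return { state, handlers }
}

// Hypothetical token source standing in for an LLM stream.
async function* fakeTokenStream() {
  yield 'Hello'
  yield ' world'
}

// Server side of the flow: emit each token, then signal completion.
async function runGeneration(channel, stream) {
  for await (const token of stream) {
    channel.emit('ai:token', { token })
  }
  channel.emit('ai:done', {})
}

async function demo() {
  const channel = new FakeChannel()
  const clientA = createWatcher()
  const clientB = createWatcher()
  // Both clients subscribe before generation is triggered.
  channel.subscribe(clientA.handlers)
  channel.subscribe(clientB.handlers)
  await runGeneration(channel, fakeTokenStream())
  return [clientA.state, clientB.state]
}
```

Calling demo() resolves with the two watcher states. Both hold the same accumulated text and a done flag set to true, even though the server loop emitted each token only once.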
Client A subscribes to 'ai-chat/session-xyz'
Client B subscribes to 'ai-chat/session-xyz'

POST /ai/chat → triggers server
Server: sdk.socket.emit('ai:token', { token: 'Hello' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:token', { token: ' world' }, 'ai-chat/session-xyz')
Server: sdk.socket.emit('ai:done', {}, 'ai-chat/session-xyz')

Both Client A and Client B see the streaming response.

Client setup
import { useEffect, useRef, useState } from 'react'
import { useAerostack } from '@aerostack/react'
import { v4 as uuid } from 'uuid'

function AiChat() {
  const { realtime } = useAerostack()
  const [messages, setMessages] = useState([])
  const [isStreaming, setIsStreaming] = useState(false)
  // One session ID per mounted component; it names the channel to subscribe to
  const sessionId = useRef(uuid())

  useEffect(() => {
    const channel = realtime.channel(`ai-chat/${sessionId.current}`)

    channel
      .on('ai:token', ({ data }) => {
        // Append each token to the last assistant message
        setMessages(prev => {
          const last = prev[prev.length - 1]
          if (last?.role === 'assistant') {
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.token }
            ]
          }
          // First token of a reply: start a new assistant message
          return [...prev, { role: 'assistant', content: data.token }]
        })
      })
      .on('ai:done', () => {
        setIsStreaming(false)
      })
      .subscribe()

    // Unsubscribe when the component unmounts
    return () => channel.unsubscribe()
  }, [realtime])

  const sendMessage = async (text) => {
    setMessages(prev => [...prev, { role: 'user', content: text }])
    setIsStreaming(true)

    // Trigger generation; tokens arrive over the channel, not in this response
    await fetch('/api/ai/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message: text,
        sessionId: sessionId.current,
      }),
    })
  }

  return (
    <div>
      {messages.map((m, i) => (
        <div key={i} className={m.role === 'user' ? 'text-right' : 'text-left'}>
          {m.content}
          {isStreaming && i === messages.length - 1 && m.role === 'assistant' && (
            <span className="animate-pulse">▊</span>
          )}
        </div>
      ))}
    </div>
  )
}

Server handler
// In your Worker
import { sdk } from '@aerostack/sdk'

// The client example POSTs to /api/ai/chat; mount this route so the paths line up
app.post('/ai/chat', async (c) => {
  const { message, sessionId } = await c.req.json()
  // Must match the channel name the client subscribed to
  const roomId = `ai-chat/${sessionId}`

  // Stream from your AI provider, token by token
  const stream = await sdk.ai.streamCompletion({
    prompt: message,
    model: 'gpt-4o-mini',
  })

  // Emit each token to every subscriber of the channel
  for await (const token of stream) {
    sdk.socket.emit('ai:token', { token }, roomId)
  }

  // Signal completion so clients can stop showing the streaming cursor
  sdk.socket.emit('ai:done', {}, roomId)

  return c.json({ ok: true })
})

Advantages over SSE
| | SSE | Aerostack Realtime |
|---|---|---|
| Server connections for 1000 watchers | 1000 | 1 (shared channel) |
| Works through proxies/CDN | Sometimes | Yes (WebSocket) |
| Bi-directional | No | Yes |
| Persistent history | Manual | Built-in (persist: true) |
See the AI Streaming Chat example for a complete working implementation.
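
The server handler shown earlier emits ai:done only when the provider stream finishes normally. If the stream throws partway through, subscribers would never receive the completion event and the client's isStreaming flag would stay set. One possible hardening, sketched here under assumptions, wraps the loop in try/finally. The names streamToRoom and failingStream are hypothetical, and emit is an injected function assumed to take the same (event, data, roomId) arguments as the sdk.socket.emit calls above.

```javascript
// Streams tokens to a room and always emits ai:done, even if the stream fails.
// `emit` is a hypothetical injected function (event, data, roomId), mirroring
// the argument order of the sdk.socket.emit calls in the server handler.
async function streamToRoom(stream, emit, roomId) {
  let count = 0
  try {
    for await (const token of stream) {
      emit('ai:token', { token }, roomId)
      count += 1
    }
  } finally {
    // Runs on success and on failure, so watchers can always leave the streaming state.
    emit('ai:done', {}, roomId)
  }
  // Number of tokens emitted (only reached when the stream completes normally).
  return count
}

// Hypothetical stream that fails after one token, to exercise the error path.
async function* failingStream() {
  yield 'partial'
  throw new Error('provider disconnected')
}
```

In the handler, this would amount to calling streamToRoom(stream, (event, data, room) => sdk.socket.emit(event, data, room), roomId). The original error still propagates to the caller, so the HTTP response can report the failure.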