
AI & Vector Search

The AI module gives you access to 30+ LLMs through a unified API with streaming support. Combined with the built-in Vector Search module, it lets you build RAG pipelines, semantic search, and intelligent content systems without external services.

  • AI chat interfaces — Streaming chat completions with token-by-token rendering
  • Content generation — Blog posts, product descriptions, email drafts
  • Document summarization — Condense long documents into key points
  • Knowledge bases — Ingest documents and answer questions using RAG
  • Semantic search — Find content by meaning, not just keywords
  • FAQ auto-matching — Match user questions to existing answers
  • Product recommendations — Find similar products based on description embeddings
  • Code search — Semantic search across codebases and documentation
  • Classification — Categorize text, detect sentiment, extract entities

import { useState } from 'react'
import { useAI } from '@aerostack/react'

function AISummary({ text }: { text: string }) {
  const { chat, loading } = useAI()
  const [summary, setSummary] = useState('')

  const handleSummarize = async () => {
    const result = await chat({
      prompt: `Summarize this text in 3 bullet points:\n\n${text}`,
      model: 'gpt-4o-mini',
      maxTokens: 256,
    })
    setSummary(result.text)
  }

  return (
    <div>
      <button onClick={handleSummarize} disabled={loading}>Summarize</button>
      {summary && <p>{summary}</p>}
    </div>
  )
}

// Multi-turn conversation: pass the full message history
const result = await sdk.ai.chat({
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'How do I sort an array in JavaScript?' },
    { role: 'assistant', content: 'Use Array.prototype.sort()...' },
    { role: 'user', content: 'What about sorting objects by a property?' },
  ],
  model: 'gpt-4o',
  maxTokens: 512,
})

// Separate example: a single prompt with a system instruction
const result = await sdk.ai.chat({
  system: 'You are a customer support agent for Acme Corp. Be concise and helpful.',
  prompt: userMessage,
  model: 'claude-3-5-sonnet-20241022',
  maxTokens: 1024,
})
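
Multi-turn calls like the one above resend the whole conversation each time, so application code usually keeps the history in an array and appends to it after every exchange. A minimal sketch of that bookkeeping, assuming only the `{ role, content }` message shape shown above; `ChatMessage`, `appendTurn`, and the `maxTurns` trimming policy are hypothetical helpers, not part of the SDK:

```typescript
// Hypothetical helper types; only the { role, content } shape comes from the examples above.
type Role = 'system' | 'user' | 'assistant'

interface ChatMessage {
  role: Role
  content: string
}

// Append one user/assistant exchange and keep only the most recent `maxTurns`
// exchanges. System messages are always preserved at the front.
// Assumes the non-system history alternates user/assistant.
function appendTurn(
  history: ChatMessage[],
  userContent: string,
  assistantContent: string,
  maxTurns = 10,
): ChatMessage[] {
  const system = history.filter((m) => m.role === 'system')
  const dialogue = history.filter((m) => m.role !== 'system')
  const next: ChatMessage[] = [
    ...dialogue,
    { role: 'user', content: userContent },
    { role: 'assistant', content: assistantContent },
  ]
  // Each exchange is two messages (user + assistant); keep at least one exchange.
  const keep = Math.max(1, maxTurns) * 2
  return [...system, ...next.slice(-keep)]
}
```

The returned array could then be passed as `messages` on the next call, with the new user question appended. Trimming old turns is one simple way to keep the request within a model's context window; summarizing older turns is another.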

Stream completions token by token for chat interfaces, so users see output as it is generated instead of waiting for the full response.

The useGatewayChat hook provides a complete streaming chat experience with token wallet integration:

import { useGatewayChat } from '@aerostack/react'

function ChatInterface() {
  const {
    messages,
    input,
    setInput,
    sendMessage,
    isStreaming,
    error,
  } = useGatewayChat({
    model: 'gpt-4o-mini',
    systemPrompt: 'You are a helpful assistant.',
  })

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i}>
          <strong>{msg.role}:</strong> {msg.content}
        </div>
      ))}
      <form onSubmit={(e) => { e.preventDefault(); sendMessage() }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask anything..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Send</button>
      </form>
      {error && <p style={{ color: 'red' }}>{error}</p>}
    </div>
  )
}

For custom streaming implementations:

import { useState } from 'react'
import { useStream } from '@aerostack/react'

function CustomStream() {
  const { stream, isStreaming, error } = useStream()
  const [output, setOutput] = useState('')

  const handleGenerate = async () => {
    // Clear any previous output before starting a new stream
    setOutput('')
    await stream({
      url: '/api/ai/complete',
      body: { prompt: 'Write a haiku about coding', model: 'gpt-4o-mini' },
      onToken: (token) => setOutput(prev => prev + token),
      onDone: () => console.log('Stream complete'),
    })
  }

  return (
    <div>
      <button onClick={handleGenerate} disabled={isStreaming}>Generate</button>
      <pre>{output}</pre>
    </div>
  )
}

// Server side: stream a completion and forward each token to the client
const stream = await sdk.ai.streamCompletion({
  prompt: 'Write a short story about a robot.',
  model: 'gpt-4o',
  maxTokens: 1024,
})

for await (const token of stream) {
  // Send each token to client via SSE or WebSocket
  sdk.socket.emit('ai:token', { token }, `ai-chat/${sessionId}`)
}
sdk.socket.emit('ai:done', {}, `ai-chat/${sessionId}`)
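
The server example above consumes the token stream with `for await`. The same pattern can be tried without the SDK. In this sketch, `fakeTokenStream` is a hypothetical stand-in for the `AsyncIterable<string>` that `streamCompletion` is listed as returning, and `collectTokens` is an illustrative helper rather than an SDK function:

```typescript
// Hypothetical stand-in for a token stream; yields tokens one at a time.
async function* fakeTokenStream(tokens: string[]): AsyncGenerator<string> {
  for (const token of tokens) {
    yield token
  }
}

// Consume any async token stream, invoking `onToken` for each token
// (for example, to forward it to a client) and returning the full text.
async function collectTokens(
  stream: AsyncIterable<string>,
  onToken: (token: string) => void = () => {},
): Promise<string> {
  let text = ''
  for await (const token of stream) {
    onToken(token)
    text += token
  }
  return text
}
```

Calling `collectTokens(fakeTokenStream(['Hel', 'lo']))` resolves to `'Hello'`. Keeping the accumulated text alongside the per-token callback is handy when the complete response also needs to be stored once streaming finishes.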

Track AI token usage and balance with the built-in wallet system:

import { useGatewayWallet, useTokenBalance } from '@aerostack/react'

function WalletDisplay() {
  const { balance, usage, loading } = useGatewayWallet()

  // Avoid reading balance and usage before the wallet has loaded
  if (loading) return <p>Loading wallet...</p>

  return (
    <div>
      <p>Balance: {balance.remaining} tokens</p>
      <p>Used today: {usage.today} tokens</p>
    </div>
  )
}

// Or the simpler hook:
function SimpleBalance() {
  const { balance } = useTokenBalance()
  return <p>{balance} tokens remaining</p>
}

Ingest documents as vector embeddings and query them with natural language. The vector search module handles embedding generation, storage, and similarity matching automatically.
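
Similarity matching over embeddings is commonly done with cosine similarity. The source does not say which metric the module uses, so the following only illustrates the general idea with hand-written vectors; `cosineSimilarity`, `EmbeddedDoc`, and `rankBySimilarity` are hypothetical helpers, not SDK functions:

```typescript
// Cosine similarity between two equal-length vectors: 1 means same direction,
// 0 means orthogonal. Returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical in-memory document with a precomputed embedding.
interface EmbeddedDoc {
  id: string
  embedding: number[]
}

// Score every document against the query vector and return the top `limit`.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[], limit: number) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}
```

A real vector store would use an index rather than scoring every document, but the ranking principle is the same: documents whose embeddings point in a direction close to the query's come first.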

// Ingest a single document
await sdk.ai.search.ingest({
  id: 'doc-123',
  content: 'Aerostack provides WebSocket-based realtime with presence tracking...',
  type: 'documentation',
  metadata: {
    title: 'Realtime Guide',
    url: '/sdk/realtime',
    section: 'presence',
  },
})

// Ingest multiple documents
const documents = [
  { id: 'faq-1', content: 'How do I reset my password?', type: 'faq', metadata: { category: 'auth' } },
  { id: 'faq-2', content: 'What models are available for AI?', type: 'faq', metadata: { category: 'ai' } },
  { id: 'faq-3', content: 'How do I upload files?', type: 'faq', metadata: { category: 'storage' } },
]
for (const doc of documents) {
  await sdk.ai.search.ingest(doc)
}

// Find documents similar to a natural language query
const results = await sdk.ai.search.query({
  query: 'how to track who is online',
  limit: 5,
  type: 'documentation', // optional: filter by document type
})
// results: [{ id, content, metadata, score }, ...]
// score: 0-1 similarity (higher is more similar)

import { useState } from 'react'
import { useVectorSearch } from '@aerostack/react'

function SemanticSearch() {
  const { search, results, loading } = useVectorSearch()
  const [query, setQuery] = useState('')

  return (
    <div>
      <input
        value={query}
        onChange={(e) => setQuery(e.target.value)}
        placeholder="Search by meaning..."
      />
      <button onClick={() => search({ query, limit: 10 })} disabled={loading}>
        Search
      </button>
      {results.map(r => (
        <div key={r.id}>
          <h3>{r.metadata.title}</h3>
          <p>{r.content.slice(0, 200)}...</p>
          <small>Relevance: {(r.score * 100).toFixed(1)}%</small>
        </div>
      ))}
    </div>
  )
}

// Get a specific document
const doc = await sdk.ai.search.get('doc-123')

// Update a document (re-indexes the embedding)
await sdk.ai.search.update('doc-123', {
  content: 'Updated content...',
  metadata: { title: 'Updated Title' },
})

// Delete a document
await sdk.ai.search.delete('doc-123')

// Delete all documents of a type
await sdk.ai.search.deleteByType('faq')

// List all document types
const types = await sdk.ai.search.listTypes()
// ['documentation', 'faq', 'product']

// Count documents
const count = await sdk.ai.search.count()
const faqCount = await sdk.ai.search.count({ type: 'faq' })

// Configure search settings
await sdk.ai.search.configure({
  embeddingModel: 'text-embedding-3-small',
  similarityThreshold: 0.7,
})

RAG pipeline (retrieval-augmented generation)


Combine vector search with AI completions to build systems that answer questions using your own data.

User Question → Vector Search → Relevant Documents → LLM Completion → Answer with Sources

// rag-pipeline.ts
async function answerQuestion(question: string) {
  // Step 1: Find relevant documents
  const results = await sdk.ai.search.query({
    query: question,
    limit: 5,
    type: 'documentation',
  })

  // Step 2: Keep only strong matches and build context from them
  const relevant = results.filter(r => r.score > 0.7)
  const context = relevant
    .map(r => `[${r.metadata.title}]: ${r.content}`)
    .join('\n\n')

  // Step 3: Generate answer with context
  const answer = await sdk.ai.chat({
    system: `You are a helpful assistant. Answer questions using ONLY the provided context. If the context doesn't contain the answer, say "I don't have information about that."`,
    prompt: `Context:\n${context}\n\nQuestion: ${question}`,
    model: 'gpt-4o',
    maxTokens: 1024,
  })

  // Return the answer together with the documents it was grounded on
  return {
    answer: answer.text,
    sources: relevant.map(r => ({
      title: r.metadata.title,
      url: r.metadata.url,
      relevance: r.score,
    })),
  }
}

// src/components/RAGChat.tsx
import { useState } from 'react'
import { useGatewayChat, useVectorSearch } from '@aerostack/react'

export function KnowledgeBaseChat() {
  const { search } = useVectorSearch()
  const { messages, sendMessage, isStreaming } = useGatewayChat({
    model: 'gpt-4o',
    systemPrompt: 'Answer using only the provided context. Cite your sources.',
  })
  const [input, setInput] = useState('')

  const handleSubmit = async () => {
    // 1. Search for relevant docs
    const docs = await search({ query: input, limit: 5 })

    // 2. Build augmented prompt
    const context = docs
      .filter(d => d.score > 0.7)
      .map(d => `[${d.metadata.title}]: ${d.content}`)
      .join('\n\n')
    const augmentedPrompt = `Context:\n${context}\n\nQuestion: ${input}`

    // 3. Send to chat with context
    await sendMessage(augmentedPrompt)
    setInput('')
  }

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i}>
          <strong>{msg.role}:</strong> {msg.content}
        </div>
      ))}
      <form onSubmit={(e) => { e.preventDefault(); handleSubmit() }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask about our documentation..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Ask</button>
      </form>
    </div>
  )
}
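
The context-building step used in both RAG examples above can be factored into a pure function, which makes it easy to unit test without calling the search or chat APIs. A sketch, assuming results shaped like the `{ id, content, metadata, score }` objects described earlier; `buildRagPrompt`, the `minScore` parameter, and the `maxChars` budget are hypothetical additions:

```typescript
// Assumed result shape, based on the { id, content, metadata, score } comment earlier.
interface SearchResult {
  id: string
  content: string
  metadata: { title?: string; url?: string }
  score: number
}

interface RagPrompt {
  prompt: string
  sources: SearchResult[]
}

// Build the augmented prompt from search results. Results at or below
// `minScore` are dropped, the rest are added best-first until the
// character budget `maxChars` would be exceeded.
function buildRagPrompt(
  question: string,
  results: SearchResult[],
  minScore = 0.7,
  maxChars = 4000,
): RagPrompt {
  const ranked = results
    .filter((r) => r.score > minScore)
    .sort((a, b) => b.score - a.score)

  const parts: string[] = []
  const sources: SearchResult[] = []
  let used = 0
  for (const r of ranked) {
    // Fall back to the document id when no title is present in metadata.
    const part = `[${r.metadata.title ?? r.id}]: ${r.content}`
    if (used + part.length > maxChars) break
    parts.push(part)
    sources.push(r)
    used += part.length
  }

  const context = parts.join('\n\n')
  return { prompt: `Context:\n${context}\n\nQuestion: ${question}`, sources }
}
```

When `sources` comes back empty, a caller might skip the model call entirely and return a fallback message, since the system prompt instructs the model to answer only from the provided context.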

| Method | Signature | Returns |
| --- | --- | --- |
| sdk.ai.chat | (opts: { prompt?, messages?, system?, model?, maxTokens?, temperature? }) => Promise<CompletionResult> | { text, usage, finishReason } |
| sdk.ai.embed | (text: string) => Promise<number[]> | Embedding vector |
| sdk.ai.streamCompletion | (opts) => Promise<AsyncIterable<string>> | Token stream |

| Method | Signature | Returns |
| --- | --- | --- |
| sdk.ai.search.ingest | (doc: { id, content, type?, metadata? }) => Promise<void> | Nothing |
| sdk.ai.search.query | (opts: { query, limit?, type? }) => Promise<SearchResult[]> | Ranked results |
| sdk.ai.search.get | (id: string) => Promise<Document> | Single document |
| sdk.ai.search.update | (id: string, updates) => Promise<void> | Nothing |
| sdk.ai.search.delete | (id: string) => Promise<void> | Nothing |
| sdk.ai.search.deleteByType | (type: string) => Promise<void> | Nothing |
| sdk.ai.search.listTypes | () => Promise<string[]> | Document types |
| sdk.ai.search.count | (opts?: { type? }) => Promise<number> | Count |
| sdk.ai.search.configure | (opts) => Promise<void> | Nothing |