AI & Vector Search
The AI module gives you access to more than 30 LLMs through a unified API with streaming support. Combined with the built-in Vector Search module, you can build RAG pipelines, semantic search, and intelligent content systems without any external services.
Beta — AI and Vector Search APIs are stable but may receive non-breaking additions.
What you can build
- AI chat interfaces — Streaming chat completions with token-by-token rendering
- Content generation — Blog posts, product descriptions, email drafts
- Document summarization — Condense long documents into key points
- Knowledge bases — Ingest documents and answer questions using RAG
- Semantic search — Find content by meaning, not just keywords
- FAQ auto-matching — Match user questions to existing answers
- Product recommendations — Find similar products based on description embeddings
- Code search — Semantic search across codebases and documentation
- Classification — Categorize text, detect sentiment, extract entities
AI completions
Basic completion
```tsx
import { useState } from 'react'
import { useAI } from '@aerostack/react'

function AISummary({ text }: { text: string }) {
  const { chat, loading } = useAI()
  const [summary, setSummary] = useState('')

  const handleSummarize = async () => {
    const result = await chat({
      prompt: `Summarize this text in 3 bullet points:\n\n${text}`,
      model: 'gpt-4o-mini',
      maxTokens: 256,
    })
    setSummary(result.text)
  }

  return (
    <div>
      <button onClick={handleSummarize} disabled={loading}>Summarize</button>
      {summary && <p>{summary}</p>}
    </div>
  )
}
```

Chat with message history
```ts
const result = await sdk.ai.chat({
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'How do I sort an array in JavaScript?' },
    { role: 'assistant', content: 'Use Array.prototype.sort()...' },
    { role: 'user', content: 'What about sorting objects by a property?' },
  ],
  model: 'gpt-4o',
  maxTokens: 512,
})
```

System prompts
```ts
const result = await sdk.ai.chat({
  system: 'You are a customer support agent for Acme Corp. Be concise and helpful.',
  prompt: userMessage,
  model: 'claude-3-5-sonnet-20241022',
  maxTokens: 1024,
})
```

Streaming responses
Stream completions token-by-token for chat interfaces. This provides a much better user experience than waiting for the full response.
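Conceptually, a streaming completion is just an async iterable of tokens that you accumulate as they arrive. The sketch below illustrates that shape in plain TypeScript; the `tokenStream` generator is a stand-in for a real SDK stream, not part of the Aerostack API:

```typescript
// Stand-in for an SDK token stream: yields tokens one at a time,
// with a small delay to mimic network arrival.
async function* tokenStream(tokens: string[]): AsyncGenerator<string> {
  for (const token of tokens) {
    await new Promise(resolve => setTimeout(resolve, 5))
    yield token
  }
}

// Accumulate tokens as they arrive, invoking a callback per token --
// the same shape as the onToken callbacks used by the streaming hooks.
async function collect(
  stream: AsyncIterable<string>,
  onToken: (token: string) => void,
): Promise<string> {
  let text = ''
  for await (const token of stream) {
    text += token
    onToken(token)
  }
  return text
}
```

Rendering `text` after each `onToken` call is what produces the token-by-token effect in the UI.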
React: useGatewayChat
The useGatewayChat hook provides a complete streaming chat experience with token wallet integration:
```tsx
import { useGatewayChat } from '@aerostack/react'

function ChatInterface() {
  const {
    messages,
    input,
    setInput,
    sendMessage,
    isStreaming,
    error,
  } = useGatewayChat({
    model: 'gpt-4o-mini',
    systemPrompt: 'You are a helpful assistant.',
  })

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i}>
          <strong>{msg.role}:</strong> {msg.content}
        </div>
      ))}
      <form onSubmit={(e) => { e.preventDefault(); sendMessage() }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask anything..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Send</button>
      </form>
      {error && <p style={{ color: 'red' }}>{error}</p>}
    </div>
  )
}
```

React: useStream (low-level)
For custom streaming implementations:
```tsx
import { useState } from 'react'
import { useStream } from '@aerostack/react'

function CustomStream() {
  const { stream, isStreaming, error } = useStream()
  const [output, setOutput] = useState('')

  const handleGenerate = async () => {
    setOutput('')
    await stream({
      url: '/api/ai/complete',
      body: { prompt: 'Write a haiku about coding', model: 'gpt-4o-mini' },
      onToken: (token) => setOutput(prev => prev + token),
      onDone: () => console.log('Stream complete'),
    })
  }

  return (
    <div>
      <button onClick={handleGenerate} disabled={isStreaming}>Generate</button>
      <pre>{output}</pre>
    </div>
  )
}
```

Server-side streaming
```ts
const stream = await sdk.ai.streamCompletion({
  prompt: 'Write a short story about a robot.',
  model: 'gpt-4o',
  maxTokens: 1024,
})

for await (const token of stream) {
  // Send each token to the client via SSE or WebSocket
  sdk.socket.emit('ai:token', { token }, `ai-chat/${sessionId}`)
}
sdk.socket.emit('ai:done', {}, `ai-chat/${sessionId}`)
```

Token wallet
Track AI token usage and balance with the built-in wallet system:
```tsx
import { useGatewayWallet, useTokenBalance } from '@aerostack/react'

function WalletDisplay() {
  const { balance, usage, loading } = useGatewayWallet()

  return (
    <div>
      <p>Balance: {balance.remaining} tokens</p>
      <p>Used today: {usage.today} tokens</p>
    </div>
  )
}

// Or the simpler hook:
function SimpleBalance() {
  const { balance } = useTokenBalance()
  return <p>{balance} tokens remaining</p>
}
```

Vector Search
Ingest documents as vector embeddings and query them with natural language. The vector search module handles embedding generation, storage, and similarity matching automatically.
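Similarity matching compares embedding vectors, most commonly with cosine similarity. The sketch below is illustrative only (not the SDK's actual internals); it shows how a 0-1 style relevance score between two embeddings can be computed:

```typescript
// Cosine similarity between two embedding vectors of equal length:
// 1 means identical direction (very similar meaning), 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

This is why the `score` values returned by queries can be compared against a threshold (e.g. `0.7`) to keep only genuinely relevant matches.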
Ingest documents
```ts
// Ingest a single document
await sdk.ai.search.ingest({
  id: 'doc-123',
  content: 'Aerostack provides WebSocket-based realtime with presence tracking...',
  type: 'documentation',
  metadata: {
    title: 'Realtime Guide',
    url: '/sdk/realtime',
    section: 'presence',
  },
})

// Ingest multiple documents
const documents = [
  { id: 'faq-1', content: 'How do I reset my password?', type: 'faq', metadata: { category: 'auth' } },
  { id: 'faq-2', content: 'What models are available for AI?', type: 'faq', metadata: { category: 'ai' } },
  { id: 'faq-3', content: 'How do I upload files?', type: 'faq', metadata: { category: 'storage' } },
]

for (const doc of documents) {
  await sdk.ai.search.ingest(doc)
}
```

Semantic query
```ts
// Find documents similar to a natural language query
const results = await sdk.ai.search.query({
  query: 'how to track who is online',
  limit: 5,
  type: 'documentation', // optional: filter by document type
})

// results: [{ id, content, metadata, score }, ...]
// score: 0-1 similarity (higher is more similar)
```

React: useVectorSearch
```tsx
import { useState } from 'react'
import { useVectorSearch } from '@aerostack/react'

function SemanticSearch() {
  const { search, results, loading } = useVectorSearch()
  const [query, setQuery] = useState('')

  return (
    <div>
      <input
        value={query}
        onChange={(e) => setQuery(e.target.value)}
        placeholder="Search by meaning..."
      />
      <button onClick={() => search({ query, limit: 10 })} disabled={loading}>
        Search
      </button>
      {results.map(r => (
        <div key={r.id}>
          <h3>{r.metadata.title}</h3>
          <p>{r.content.slice(0, 200)}...</p>
          <small>Relevance: {(r.score * 100).toFixed(1)}%</small>
        </div>
      ))}
    </div>
  )
}
```

Manage indexed documents
```ts
// Get a specific document
const doc = await sdk.ai.search.get('doc-123')

// Update a document (re-indexes the embedding)
await sdk.ai.search.update('doc-123', {
  content: 'Updated content...',
  metadata: { title: 'Updated Title' },
})

// Delete a document
await sdk.ai.search.delete('doc-123')

// Delete all documents of a type
await sdk.ai.search.deleteByType('faq')

// List all document types
const types = await sdk.ai.search.listTypes()
// ['documentation', 'faq', 'product']

// Count documents
const count = await sdk.ai.search.count()
const faqCount = await sdk.ai.search.count({ type: 'faq' })

// Configure search settings
await sdk.ai.search.configure({
  embeddingModel: 'text-embedding-3-small',
  similarityThreshold: 0.7,
})
```

RAG pipeline (retrieval-augmented generation)
Combine vector search with AI completions to build systems that answer questions using your own data.
Implementation
```ts
async function answerQuestion(question: string) {
  // Step 1: Find relevant documents
  const results = await sdk.ai.search.query({
    query: question,
    limit: 5,
    type: 'documentation',
  })
  const relevant = results.filter(r => r.score > 0.7)

  // Step 2: Build context from the relevant results
  const context = relevant
    .map(r => `[${r.metadata.title}]: ${r.content}`)
    .join('\n\n')

  // Step 3: Generate answer with context
  const answer = await sdk.ai.chat({
    system: `You are a helpful assistant. Answer questions using ONLY the provided context. If the context doesn't contain the answer, say "I don't have information about that."`,
    prompt: `Context:\n${context}\n\nQuestion: ${question}`,
    model: 'gpt-4o',
    maxTokens: 1024,
  })

  return {
    answer: answer.text,
    sources: relevant.map(r => ({
      title: r.metadata.title,
      url: r.metadata.url,
      relevance: r.score,
    })),
  }
}
```

Complete RAG chat component
```tsx
import { useGatewayChat, useVectorSearch } from '@aerostack/react'
import { useState } from 'react'

export function KnowledgeBaseChat() {
  const { search } = useVectorSearch()
  const { messages, sendMessage, isStreaming } = useGatewayChat({
    model: 'gpt-4o',
    systemPrompt: 'Answer using only the provided context. Cite your sources.',
  })
  const [input, setInput] = useState('')

  const handleSubmit = async () => {
    // 1. Search for relevant docs
    const docs = await search({ query: input, limit: 5 })

    // 2. Build augmented prompt
    const context = docs
      .filter(d => d.score > 0.7)
      .map(d => `[${d.metadata.title}]: ${d.content}`)
      .join('\n\n')
    const augmentedPrompt = `Context:\n${context}\n\nQuestion: ${input}`

    // 3. Send to chat with context
    await sendMessage(augmentedPrompt)
    setInput('')
  }

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i}>
          <strong>{msg.role}:</strong> {msg.content}
        </div>
      ))}
      <form onSubmit={(e) => { e.preventDefault(); handleSubmit() }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask about our documentation..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Ask</button>
      </form>
    </div>
  )
}
```

API reference
AI completions
| Method | Signature | Returns |
|---|---|---|
| `sdk.ai.chat` | `(opts: { prompt?, messages?, system?, model?, maxTokens?, temperature? }) => Promise<CompletionResult>` | `{ text, usage, finishReason }` |
| `sdk.ai.embed` | `(text: string) => Promise<number[]>` | Embedding vector |
| `sdk.ai.streamCompletion` | `(opts) => Promise<AsyncIterable<string>>` | Token stream |
Vector Search
| Method | Signature | Returns |
|---|---|---|
| `sdk.ai.search.ingest` | `(doc: { id, content, type?, metadata? }) => Promise<void>` | Nothing |
| `sdk.ai.search.query` | `(opts: { query, limit?, type? }) => Promise<SearchResult[]>` | Ranked results |
| `sdk.ai.search.get` | `(id: string) => Promise<Document>` | Single document |
| `sdk.ai.search.update` | `(id: string, updates) => Promise<void>` | Nothing |
| `sdk.ai.search.delete` | `(id: string) => Promise<void>` | Nothing |
| `sdk.ai.search.deleteByType` | `(type: string) => Promise<void>` | Nothing |
| `sdk.ai.search.listTypes` | `() => Promise<string[]>` | Document types |
| `sdk.ai.search.count` | `(opts?: { type? }) => Promise<number>` | Count |
| `sdk.ai.search.configure` | `(opts) => Promise<void>` | Nothing |