# AI & Vector Search

> LLM chat completions, streaming responses, vector embeddings, semantic search, and RAG pipelines with the Aerostack SDK.

The AI module gives you access to 30+ LLMs through a unified API with streaming support. Combined with the built-in Vector Search module, you can build RAG pipelines, semantic search, and intelligent content systems without any external services.

**Beta** -- AI and Vector Search APIs are stable but may receive non-breaking additions.

## What you can build

- **AI chat interfaces** -- Streaming chat completions with token-by-token rendering
- **Content generation** -- Blog posts, product descriptions, email drafts
- **Document summarization** -- Condense long documents into key points
- **Knowledge bases** -- Ingest documents and answer questions using RAG
- **Semantic search** -- Find content by meaning, not just keywords
- **FAQ auto-matching** -- Match user questions to existing answers
- **Product recommendations** -- Find similar products based on description embeddings
- **Code search** -- Semantic search across codebases and documentation
- **Classification** -- Categorize text, detect sentiment, extract entities

## AI completions

### Basic completion

  
```tsx

import { useState } from 'react'

// useAI comes from the Aerostack React SDK.
function AISummary({ text }: { text: string }) {
  const { chat, loading } = useAI()
  const [summary, setSummary] = useState('')

  const handleSummarize = async () => {
    const result = await chat({
      prompt: `Summarize this text in 3 bullet points:\n\n${text}`,
      model: 'gpt-4o-mini',
      maxTokens: 256,
    })
    setSummary(result.text)
  }

  return (
    <div>
      <button onClick={handleSummarize} disabled={loading}>Summarize</button>
      {summary && <p>{summary}</p>}
    </div>
  )
}
```
  
  
```ts
const result = await sdk.ai.chat({
  prompt: 'Explain WebSocket pub/sub in two sentences.',
  model: 'gpt-4o-mini',
  maxTokens: 256,
  temperature: 0.7,
})

console.log(result.text)        // The completion text
console.log(result.usage)       // { promptTokens, completionTokens, totalTokens }
console.log(result.finishReason) // 'stop' | 'length'
```
  
  
```go
result, err := client.AI.Complete(ctx, aerostack.CompletionInput{
    Prompt:    "Explain WebSocket pub/sub in two sentences.",
    Model:     "gpt-4o-mini",
    MaxTokens: 256,
})
if err != nil {
    log.Fatal(err) // handle the error before reading result
}
fmt.Println(result.Text)
```
  
  
```python
result = await client.ai.complete(
    prompt="Explain WebSocket pub/sub in two sentences.",
    model="gpt-4o-mini",
    max_tokens=256
)
print(result.text)
```
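
Because each result reports its `usage`, per-request token counts can be summed to track consumption across a session. The following is a minimal sketch, not SDK code: `Usage`, `addUsage`, and `totalUsage` are hypothetical names, and the only assumption is the `{ promptTokens, completionTokens, totalTokens }` shape shown in the TypeScript example above.

```ts
// Hypothetical helpers: not part of the Aerostack SDK.
// Assumes only the usage shape shown in the example above.
interface Usage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

const ZERO_USAGE: Usage = { promptTokens: 0, completionTokens: 0, totalTokens: 0 }

// Returns a new Usage with each field summed; neither input is mutated.
function addUsage(a: Usage, b: Usage): Usage {
  return {
    promptTokens: a.promptTokens + b.promptTokens,
    completionTokens: a.completionTokens + b.completionTokens,
    totalTokens: a.totalTokens + b.totalTokens,
  }
}

// Sums usage over several completed requests.
function totalUsage(usages: Usage[]): Usage {
  return usages.reduce((acc, u) => addUsage(acc, u), ZERO_USAGE)
}
```

Collecting `result.usage` from each call into an array and passing it to `totalUsage` gives a running total that can be logged or compared against a budget.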
  

### Chat with message history

```ts
const result = await sdk.ai.chat({
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'How do I sort an array in JavaScript?' },
    { role: 'assistant', content: 'Use Array.prototype.sort()...' },
    { role: 'user', content: 'What about sorting objects by a property?' },
  ],
  model: 'gpt-4o',
  maxTokens: 512,
})
```
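
Long conversations grow the `messages` array on every turn, so it can help to trim history before sending it. The sketch below is one simple approach, assuming the `{ role, content }` message shape used above; `trimHistory` and `ChatMessage` are hypothetical names, not SDK exports.

```ts
// Hypothetical helper: not part of the Aerostack SDK.
type Role = 'system' | 'user' | 'assistant'

interface ChatMessage {
  role: Role
  content: string
}

// Returns all system messages first, followed by the most recent
// `maxTurns` non-system messages in their original order.
function trimHistory(messages: ChatMessage[], maxTurns: number): ChatMessage[] {
  const system = messages.filter(m => m.role === 'system')
  const rest = messages.filter(m => m.role !== 'system')
  // slice(-0) would return the whole array, so handle 0 explicitly.
  const recent = maxTurns > 0 ? rest.slice(-maxTurns) : []
  return [...system, ...recent]
}
```

Trimming by message count is a rough proxy for trimming by tokens; a token-aware version would need a tokenizer for the chosen model.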

### System prompts

```ts
const result = await sdk.ai.chat({
  system: 'You are a customer support agent for Acme Corp. Be concise and helpful.',
  prompt: userMessage,
  model: 'claude-3-5-sonnet-20241022',
  maxTokens: 1024,
})
```

---

## Streaming responses

Stream completions token-by-token for chat interfaces. Users see output as soon as the first token arrives instead of waiting for the full response.

### React: useGatewayChat

The `useGatewayChat` hook provides a complete streaming chat experience with token wallet integration:

```tsx

function ChatInterface() {
  const {
    messages,
    input,
    setInput,
    sendMessage,
    isStreaming,
    error,
  } = useGatewayChat({
    model: 'gpt-4o-mini',
    systemPrompt: 'You are a helpful assistant.',
  })

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i}>
          <strong>{msg.role}:</strong> {msg.content}
        </div>
      ))}

      <form onSubmit={(e) => { e.preventDefault(); sendMessage() }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask anything..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Send</button>
      </form>

      {error && <p style={{ color: 'red' }}>{error}</p>}
    </div>
  )
}
```

### React: useStream (low-level)

For custom streaming implementations:

```tsx

import { useState } from 'react'

// useStream comes from the Aerostack React SDK.
function CustomStream() {
  const { stream, isStreaming, error } = useStream()
  const [output, setOutput] = useState('')

  const handleGenerate = async () => {
    setOutput('')
    await stream({
      url: '/api/ai/complete',
      body: { prompt: 'Write a haiku about coding', model: 'gpt-4o-mini' },
      onToken: (token) => setOutput(prev => prev + token),
      onDone: () => console.log('Stream complete'),
    })
  }

  return (
    <div>
      <button onClick={handleGenerate} disabled={isStreaming}>Generate</button>
      <pre>{output}</pre>
    </div>
  )
}
```

### Server-side streaming

```ts
const stream = await sdk.ai.streamCompletion({
  prompt: 'Write a short story about a robot.',
  model: 'gpt-4o',
  maxTokens: 1024,
})

for await (const token of stream) {
  // Send each token to client via SSE or WebSocket
  sdk.socket.emit('ai:token', { token }, `ai-chat/${sessionId}`)
}
sdk.socket.emit('ai:done', {}, `ai-chat/${sessionId}`)
```
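
If you forward tokens over Server-Sent Events rather than a socket, each token has to be written as an SSE frame. The following sketch shows the standard frame layout (`event:` and `data:` lines ending with a blank line); `formatSSE` is a hypothetical helper, and reusing the `ai:token` / `ai:done` names as SSE event names is an assumption for illustration.

```ts
// Hypothetical helper: formats one Server-Sent Events frame.
// Not part of the Aerostack SDK.
function formatSSE(event: string, data: Record<string, unknown>): string {
  // JSON.stringify (without an indent argument) escapes newlines inside
  // strings, so the payload always fits on a single `data:` line.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`
}
```

Inside the `for await` loop, each frame would be written to a response served with `Content-Type: text/event-stream`; how the response object is obtained depends on your server framework.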

---

## Token wallet

Track AI token usage and balance with the built-in wallet system:

```tsx

function WalletDisplay() {
  const { balance, usage, loading } = useGatewayWallet()

  return (
    <div>
      <p>Balance: {balance.remaining} tokens</p>
      <p>Used today: {usage.today} tokens</p>
    </div>
  )
}

// Or the simpler hook:
function SimpleBalance() {
  const { balance } = useTokenBalance()
  return <p>{balance} tokens remaining</p>
}
```

---

## Vector Search

Ingest documents as vector embeddings and query them with natural language. The vector search module handles embedding generation, storage, and similarity matching automatically.

### Ingest documents

```ts
// Ingest a single document
await sdk.ai.search.ingest({
  id: 'doc-123',
  content: 'Aerostack provides WebSocket-based realtime with presence tracking...',
  type: 'documentation',
  metadata: {
    title: 'Realtime Guide',
    url: '/sdk/realtime',
    section: 'presence',
  },
})

// Ingest multiple documents
const documents = [
  { id: 'faq-1', content: 'How do I reset my password?', type: 'faq', metadata: { category: 'auth' } },
  { id: 'faq-2', content: 'What models are available for AI?', type: 'faq', metadata: { category: 'ai' } },
  { id: 'faq-3', content: 'How do I upload files?', type: 'faq', metadata: { category: 'storage' } },
]

for (const doc of documents) {
  await sdk.ai.search.ingest(doc)
}
```
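
Long documents usually retrieve better when split into smaller pieces before ingestion. The sketch below shows one simple way to do that with fixed-size, overlapping character windows; `chunkText` is a hypothetical helper, and whether the module chunks content on its own is not stated in this guide.

```ts
// Hypothetical helper: not part of the Aerostack SDK.
// Splits text into chunks of at most `maxChars` characters. Each chunk
// after the first starts `overlap` characters before the previous chunk ended.
function chunkText(text: string, maxChars: number, overlap = 0): string[] {
  if (maxChars <= 0) throw new Error('maxChars must be positive')
  if (overlap < 0 || overlap >= maxChars) {
    throw new Error('overlap must be at least 0 and less than maxChars')
  }
  const chunks: string[] = []
  const step = maxChars - overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChars))
    // Stop once a chunk has reached the end of the text.
    if (start + maxChars >= text.length) break
  }
  return chunks
}
```

Each chunk could then be ingested under a derived id such as `${docId}#${index}` (an illustrative scheme, not an SDK convention), with the parent document's metadata copied onto every chunk.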

### Semantic query

```ts
// Find documents similar to a natural language query
const results = await sdk.ai.search.query({
  query: 'how to track who is online',
  limit: 5,
  type: 'documentation', // optional: filter by document type
})

// results: [{ id, content, metadata, score }, ...]
// score: 0-1 similarity (higher is more similar)
```
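
To build intuition for what a similarity score measures, here is a minimal sketch of cosine similarity between two embedding vectors. It is a common metric for this kind of search, but this guide does not state which metric or normalization produces the 0-1 `score`, so treat it as an illustration rather than the module's actual implementation.

```ts
// Illustrative only: cosine similarity between two equal-length vectors.
// Returns a value in [-1, 1]; 1 means the vectors point the same way.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction; report 0 rather than dividing by zero.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

Vectors pointing in the same direction score near 1 regardless of their length, which is why two texts of very different sizes can still match closely.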

### React: useVectorSearch

```tsx

import { useState } from 'react'

// useVectorSearch comes from the Aerostack React SDK.
function SemanticSearch() {
  const { search, results, loading } = useVectorSearch()
  const [query, setQuery] = useState('')

  return (
    <div>
      <input
        value={query}
        onChange={(e) => setQuery(e.target.value)}
        placeholder="Search by meaning..."
      />
      <button onClick={() => search({ query, limit: 10 })} disabled={loading}>
        Search
      </button>

      {results.map(r => (
        <div key={r.id}>
          <h3>{r.metadata.title}</h3>
          <p>{r.content.slice(0, 200)}...</p>
          <small>Relevance: {(r.score * 100).toFixed(1)}%</small>
        </div>
      ))}
    </div>
  )
}
```

### Manage indexed documents

```ts
// Get a specific document
const doc = await sdk.ai.search.get('doc-123')

// Update a document (re-indexes the embedding)
await sdk.ai.search.update('doc-123', {
  content: 'Updated content...',
  metadata: { title: 'Updated Title' },
})

// Delete a document
await sdk.ai.search.delete('doc-123')

// Delete all documents of a type
await sdk.ai.search.deleteByType('faq')

// List all document types
const types = await sdk.ai.search.listTypes()
// ['documentation', 'faq', 'product']

// Count documents
const count = await sdk.ai.search.count()
const faqCount = await sdk.ai.search.count({ type: 'faq' })

// Configure search settings
await sdk.ai.search.configure({
  embeddingModel: 'text-embedding-3-small',
  similarityThreshold: 0.7,
})
```

---

## RAG pipeline (retrieval-augmented generation)

Combine vector search with AI completions to build systems that answer questions using your own data.

```mermaid
flowchart LR
    Q[User Question] --> VS[Vector Search]
    VS --> CTX[Relevant Documents]
    CTX --> LLM[LLM Completion]
    Q --> LLM
    LLM --> A[Answer with Sources]
```

### Implementation

```ts title="rag-pipeline.ts"
async function answerQuestion(question: string) {
  // Step 1: Find relevant documents
  const results = await sdk.ai.search.query({
    query: question,
    limit: 5,
    type: 'documentation',
  })

  // Step 2: Build context from search results
  const context = results
    .filter(r => r.score > 0.7)
    .map(r => `[${r.metadata.title}]: ${r.content}`)
    .join('\n\n')

  // Step 3: Generate answer with context
  const answer = await sdk.ai.chat({
    system: `You are a helpful assistant. Answer questions using ONLY the provided context. If the context doesn't contain the answer, say "I don't have information about that."`,
    prompt: `Context:\n${context}\n\nQuestion: ${question}`,
    model: 'gpt-4o',
    maxTokens: 1024,
  })

  return {
    answer: answer.text,
    sources: results.filter(r => r.score > 0.7).map(r => ({
      title: r.metadata.title,
      url: r.metadata.url,
      relevance: r.score,
    })),
  }
}
```
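
Step 2 above joins every result over the threshold, which can produce a very long prompt when documents are large. As a hedged sketch, the context builder can also enforce a character budget. `buildContext` and `SearchResult` are hypothetical names; the result shape is assumed from the `{ id, content, metadata, score }` fields shown earlier.

```ts
// Hypothetical types and helper: not part of the Aerostack SDK.
interface SearchResult {
  id: string
  content: string
  score: number
  metadata: { title?: string }
}

// Keeps results scoring above `minScore`, highest score first, and stops
// adding documents once the joined context would exceed `maxChars`.
function buildContext(results: SearchResult[], minScore: number, maxChars: number): string {
  const ranked = results
    .filter(r => r.score > minScore) // filter() copies, so sort() leaves the input untouched
    .sort((a, b) => b.score - a.score)

  const parts: string[] = []
  let used = 0
  for (const r of ranked) {
    const part = `[${r.metadata.title ?? r.id}]: ${r.content}`
    // Every part after the first also costs 2 characters for the '\n\n' separator.
    const cost = part.length + (parts.length > 0 ? 2 : 0)
    if (used + cost > maxChars) break
    parts.push(part)
    used += cost
  }
  return parts.join('\n\n')
}
```

A character budget is only a rough stand-in for a token budget, but it keeps the prompt bounded without needing a tokenizer.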

### Complete RAG chat component

```tsx title="src/components/RAGChat.tsx"

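import { useState } from 'react'

// useVectorSearch and useGatewayChat come from the Aerostack React SDK.
export function RAGChat() {
  const { search } = useVectorSearch()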
  const { messages, sendMessage, isStreaming } = useGatewayChat({
    model: 'gpt-4o',
    systemPrompt: 'Answer using only the provided context. Cite your sources.',
  })
  const [input, setInput] = useState('')

  const handleSubmit = async () => {
    // Skip empty submissions so no blank query is searched or sent
    if (!input.trim()) return

    // 1. Search for relevant docs
    const docs = await search({ query: input, limit: 5 })

    // 2. Build augmented prompt
    const context = docs
      .filter(d => d.score > 0.7)
      .map(d => `[${d.metadata.title}]: ${d.content}`)
      .join('\n\n')

    const augmentedPrompt = `Context:\n${context}\n\nQuestion: ${input}`

    // 3. Send to chat with context
    await sendMessage(augmentedPrompt)
    setInput('')
  }

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i}>
          <strong>{msg.role}:</strong> {msg.content}
        </div>
      ))}

      <form onSubmit={(e) => { e.preventDefault(); handleSubmit() }}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask about our documentation..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>Ask</button>
      </form>
    </div>
  )
}
```

---

## API reference

### AI completions

| Method | Signature | Returns |
|--------|-----------|---------|
| `sdk.ai.chat` | `(opts: { prompt?, messages?, system?, model?, maxTokens?, temperature? }) => Promise` | `{ text, usage, finishReason }` |
| `sdk.ai.embed` | `(text: string) => Promise<number[]>` | Embedding vector |
| `sdk.ai.streamCompletion` | `(opts) => Promise<AsyncIterable<string>>` | Token stream |

### Vector Search

| Method | Signature | Returns |
|--------|-----------|---------|
| `sdk.ai.search.ingest` | `(doc: { id, content, type?, metadata? }) => Promise<void>` | Nothing |
| `sdk.ai.search.query` | `(opts: { query, limit?, type? }) => Promise` | Ranked results |
| `sdk.ai.search.get` | `(id: string) => Promise` | Single document |
| `sdk.ai.search.update` | `(id: string, updates) => Promise<void>` | Nothing |
| `sdk.ai.search.delete` | `(id: string) => Promise<void>` | Nothing |
| `sdk.ai.search.deleteByType` | `(type: string) => Promise<void>` | Nothing |
| `sdk.ai.search.listTypes` | `() => Promise<string[]>` | Document types |
| `sdk.ai.search.count` | `(opts?: { type? }) => Promise<number>` | Count |
| `sdk.ai.search.configure` | `(opts) => Promise<void>` | Nothing |
