# AI

> LLM completions, vector embeddings, and token streaming via one SDK — no API key management, no provider juggling. Works on Cloudflare Workers.

Aerostack AI exposes LLM completions, vector embeddings, and token streaming through a single SDK, so you never manage API keys or provider SDKs directly.

## Quick start

```ts
// Assumes `sdk` is an already-initialized Aerostack SDK client (setup not shown here).

// Text completion
const result = await sdk.ai.complete({
  prompt: 'Summarize this article in 3 bullet points:\n\n' + articleText,
  model: 'gpt-4o-mini',
  maxTokens: 256,
})
console.log(result.text)

// Embeddings
const embedding = await sdk.ai.embed('What is the capital of France?')
// Returns: number[] (1536-dimensional vector)
```

## Completions

```ts
const result = await sdk.ai.complete({
  prompt: 'Write a product description for: ' + productName,
  model: 'gpt-4o-mini',   // or 'gpt-4o', 'claude-3-haiku', etc.
  maxTokens: 512,
  temperature: 0.7,
})

// result.text — the generated text
// result.usage — { promptTokens, completionTokens, totalTokens }
```
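If you make several completion calls per request, it can help to total the `usage` objects. The following is a minimal sketch, assuming `usage` has exactly the three numeric fields listed above; the `Usage` type and `addUsage` helper are hypothetical and not part of the SDK.

```ts
// Hypothetical type mirroring the usage fields described above.
type Usage = { promptTokens: number; completionTokens: number; totalTokens: number }

// Return a new Usage object holding the field-by-field sum of two usage records.
function addUsage(total: Usage, next: Usage): Usage {
  return {
    promptTokens: total.promptTokens + next.promptTokens,
    completionTokens: total.completionTokens + next.completionTokens,
    totalTokens: total.totalTokens + next.totalTokens,
  }
}

// Starting value for a running total.
const emptyUsage: Usage = { promptTokens: 0, completionTokens: 0, totalTokens: 0 }
```

A running total can then be kept with `total = addUsage(total, result.usage)` after each call.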

## Embeddings

Generate vector representations for semantic search and similarity:

```ts
// Single text
const vector = await sdk.ai.embed('user search query')

// Store in database alongside your content
await sdk.db.query(
  'INSERT INTO documents (id, text, embedding) VALUES (?, ?, ?)',
  [id, text, JSON.stringify(vector)]
)

// Find similar documents
const queryVector = await sdk.ai.embed(searchQuery)
const similar = await sdk.search.query(queryVector, { table: 'documents', limit: 10 })
```

## Streaming (via WebSocket)

To deliver tokens to clients in real time, see the [AI Streaming](/features/realtime/ai-streaming) guide. Tokens are pushed over your Realtime channel.

```ts
// Stream from server and push tokens via WebSocket
for await (const token of await sdk.ai.streamCompletion({ prompt })) {
  sdk.socket.emit('ai:token', { token }, sessionChannel)
}
sdk.socket.emit('ai:done', {}, sessionChannel)
```

## Model support

Configure your AI provider and model in **Dashboard → AI → Configuration**. Supported providers include OpenAI, Anthropic, and Cloudflare Workers AI.

AI requests are proxied through Aerostack's edge — your provider API key is never exposed to the client.

## Use cases

### Chat completions

Build a customer support chatbot or internal assistant. Send conversation history as the prompt and stream responses back to users token-by-token via the Realtime channel. Swap models (GPT-4o, Claude, etc.) from the dashboard without changing code.
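One way to send conversation history as the prompt is to flatten it into a single string. This is a rough sketch only: the `ChatMessage` shape, the `buildChatPrompt` helper, and the `User:` / `Assistant:` labels are illustrative assumptions, not part of the Aerostack SDK.

```ts
// Hypothetical message shape; adapt it to however you store conversations.
type ChatMessage = { role: 'user' | 'assistant'; content: string }

// Flatten the conversation into one prompt string, keeping only the most
// recent turns so the prompt does not grow without bound.
function buildChatPrompt(history: ChatMessage[], maxTurns = 10): string {
  const recent = maxTurns > 0 ? history.slice(-maxTurns) : []
  const lines = recent.map(
    (m) => `${m.role === 'user' ? 'User' : 'Assistant'}: ${m.content}`
  )
  // End with an open "Assistant:" turn so the model continues the dialogue.
  return [...lines, 'Assistant:'].join('\n')
}
```

The returned string can be passed as the `prompt` field of `sdk.ai.complete` or `sdk.ai.streamCompletion`.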

### Content moderation

Automatically screen user-generated content before publishing. Pass submitted text through an LLM with a moderation prompt to detect policy violations, spam, or harmful content. Run this as a queue job so moderation happens asynchronously without blocking the user.

```ts
const result = await sdk.ai.complete({
  model: 'gpt-4o-mini',
  system: 'You are a content moderator. Respond with JSON: { "safe": boolean, "reason": string }',
  prompt: `Review this user submission:\n\n${userContent}`,
  maxTokens: 128,
})
const verdict = JSON.parse(result.text)
```
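Models do not always return clean JSON, so a bare `JSON.parse` can throw. The sketch below shows one defensive approach, assuming you want to fail closed (treat anything unparseable as unsafe); the `Verdict` type, the `parseVerdict` helper, and the fallback reason string are hypothetical.

```ts
// Hypothetical verdict shape matching the JSON requested in the system prompt.
type Verdict = { safe: boolean; reason: string }

// Parse the model's reply defensively. Anything that is not the expected
// JSON shape is treated as unsafe so it can be routed to manual review.
function parseVerdict(text: string): Verdict {
  const fallback: Verdict = { safe: false, reason: 'unparseable moderation response' }
  // Tolerate extra text around the JSON by taking the outermost {...} span.
  const start = text.indexOf('{')
  const end = text.lastIndexOf('}')
  if (start === -1 || end <= start) return fallback
  try {
    const parsed: unknown = JSON.parse(text.slice(start, end + 1))
    if (typeof parsed !== 'object' || parsed === null) return fallback
    const candidate = parsed as { safe?: unknown; reason?: unknown }
    if (typeof candidate.safe === 'boolean' && typeof candidate.reason === 'string') {
      return { safe: candidate.safe, reason: candidate.reason }
    }
    return fallback
  } catch {
    // JSON.parse threw: fall back to the fail-closed default.
    return fallback
  }
}
```

With this helper, the last line of the example above would become `const verdict = parseVerdict(result.text)`.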

### Semantic search

Generate embeddings for your content library (help articles, product descriptions, documentation) and let users search by meaning instead of exact keywords. A query like "how do I reset my password" matches articles about account recovery even if they never use the word "reset."
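To make "search by meaning" concrete, the sketch below ranks documents by cosine similarity between embedding vectors. Cosine similarity is a common choice for comparing embeddings, but it is an assumption here: this page does not say how `sdk.search.query` scores results. The `cosineSimilarity` and `rankBySimilarity` helpers are hypothetical.

```ts
// Cosine similarity: 1 means same direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Return the `limit` documents whose embeddings are closest to the query vector.
function rankBySimilarity<T extends { embedding: number[] }>(
  query: number[],
  docs: T[],
  limit = 10
): T[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((x) => x.doc)
}
```

An in-memory ranking like this only suits small collections; for larger ones, rely on the vector search shown in the Embeddings section.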

### Document Q&A with RAG

Combine embeddings, vector search, and completions into a retrieval-augmented generation (RAG) pipeline. When a user asks a question, find the most relevant documents via vector search, then pass those documents as context to an LLM to generate a grounded answer with source citations.
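The "pass documents as context" step might look like the sketch below. It assumes retrieval has already produced a list of documents; the `RetrievedDoc` shape, the `buildRagPrompt` helper, and the prompt wording are all illustrative choices rather than anything prescribed by Aerostack.

```ts
// Hypothetical shape for a document returned by your vector search step.
type RetrievedDoc = { id: string; text: string }

// Build a prompt that numbers each source so the model can cite it as [n].
function buildRagPrompt(question: string, docs: RetrievedDoc[]): string {
  const context = docs
    .map((d, i) => `[${i + 1}] (source: ${d.id})\n${d.text}`)
    .join('\n\n')
  return [
    'Answer the question using only the numbered sources below.',
    'Cite sources by number, e.g. [1]. If the sources do not contain the answer, say so.',
    '',
    'Sources:',
    context,
    '',
    `Question: ${question}`,
    'Answer:',
  ].join('\n')
}
```

The resulting string can be passed as the `prompt` to `sdk.ai.complete`. Numbering the sources lets you map citations in the answer back to document IDs.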

### Code analysis

Feed source code into an LLM to generate documentation, find bugs, or suggest refactors. Use streaming to show results progressively as the model works through large files.
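Large files may not fit in a single prompt, so one option is to split the source into chunks and analyze each in turn. The following is a minimal sketch assuming simple line-based chunking; the `chunkByLines` helper and the default of 200 lines are arbitrary, hypothetical choices.

```ts
// Split source text into chunks of at most `maxLines` lines each.
// Joining the chunks with '\n' reproduces the original text.
function chunkByLines(source: string, maxLines = 200): string[] {
  if (maxLines < 1) throw new Error('maxLines must be at least 1')
  const lines = source.split('\n')
  const chunks: string[] = []
  for (let i = 0; i < lines.length; i += maxLines) {
    chunks.push(lines.slice(i, i + maxLines).join('\n'))
  }
  return chunks
}
```

Each chunk can then be sent as its own completion or streaming request. Line-based splitting can cut a function in half, so a real pipeline may prefer to split on syntactic boundaries.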

## Next steps

- [Completions](/features/ai/completions) — full completion API
- [Embeddings](/features/ai/embeddings) — vector embeddings for search
- [Streaming](/features/ai/streaming) — stream tokens to clients
- [AI Streaming Chat example](/examples/ai-streaming-chat) — complete working app