AI
Aerostack AI gives you LLM completions, vector embeddings, and token streaming via a single SDK — without managing API keys or provider SDKs directly.
Quick start
```typescript
import { sdk } from '@aerostack/sdk'

// Text completion
const result = await sdk.ai.complete({
  prompt: 'Summarize this article in 3 bullet points:\n\n' + articleText,
  model: 'gpt-4o-mini',
  maxTokens: 256,
})
console.log(result.text)

// Embeddings
const embedding = await sdk.ai.embed('What is the capital of France?')
// Returns: number[] (1536-dimensional vector)
```
Completions
```typescript
const result = await sdk.ai.complete({
  prompt: 'Write a product description for: ' + productName,
  model: 'gpt-4o-mini', // or 'gpt-4o', 'claude-3-haiku', etc.
  maxTokens: 512,
  temperature: 0.7,
})

// result.text: the generated text
// result.usage: { promptTokens, completionTokens, totalTokens }
```
Embeddings
Generate vector representations for semantic search and similarity:

```typescript
// Single text
const vector = await sdk.ai.embed('user search query')

// Store in database alongside your content
await sdk.db.query(
  'INSERT INTO documents (id, text, embedding) VALUES (?, ?, ?)',
  [id, text, JSON.stringify(vector)]
)

// Find similar documents
const queryVector = await sdk.ai.embed(searchQuery)
const similar = await sdk.search.query(queryVector, { table: 'documents', limit: 10 })
```
Streaming (via WebSocket)
For real-time token delivery to clients, see the AI Streaming guide. Tokens are pushed over your Realtime channel.

```typescript
// Stream from server and push tokens via WebSocket
for await (const token of await sdk.ai.streamCompletion({ prompt })) {
  sdk.socket.emit('ai:token', { token }, sessionChannel)
}
sdk.socket.emit('ai:done', {}, sessionChannel)
```
Model support
Configure your AI provider and model in Dashboard → AI → Configuration. Supported providers include OpenAI, Anthropic, and Cloudflare Workers AI.
Use Cases
Chat completions
Build a customer support chatbot or internal assistant. Send conversation history as the prompt and stream responses back to users token-by-token via the Realtime channel. Swap models (GPT-4o, Claude, etc.) from the dashboard without changing code.
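One way to send conversation history as the prompt is to flatten the most recent turns into a single string. The sketch below is a minimal illustration under that assumption; `ChatTurn`, `buildChatPrompt`, and the `maxTurns` cutoff are hypothetical names and are not part of the Aerostack SDK.

```typescript
// Hypothetical helper type, not part of @aerostack/sdk.
type ChatTurn = { role: 'user' | 'assistant'; content: string }

// Flatten the most recent turns into one prompt string.
// Older turns are dropped so the prompt stays within a rough size budget.
function buildChatPrompt(turns: ChatTurn[], maxTurns = 10): string {
  // slice(-0) would return every turn, so handle a zero budget explicitly.
  const recent = maxTurns > 0 ? turns.slice(-maxTurns) : []
  const transcript = recent
    .map((turn) => `${turn.role === 'user' ? 'User' : 'Assistant'}: ${turn.content}`)
    .join('\n')
  // End with the assistant label so the model continues the conversation.
  return `${transcript}\nAssistant:`
}

const chatHistory: ChatTurn[] = [
  { role: 'user', content: 'Where is my order?' },
  { role: 'assistant', content: 'Can you share your order number?' },
  { role: 'user', content: 'It is 1042.' },
]

// Keep only the last two turns.
const chatPrompt = buildChatPrompt(chatHistory, 2)
```

The resulting string could then be passed as `prompt` to `sdk.ai.complete` or `sdk.ai.streamCompletion`.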
Content moderation
Automatically screen user-generated content before publishing. Pass submitted text through an LLM with a moderation prompt to detect policy violations, spam, or harmful content. Run this as a queue job so moderation happens asynchronously without blocking the user.

```typescript
const result = await sdk.ai.complete({
  model: 'gpt-4o-mini',
  system: 'You are a content moderator. Respond with JSON: { "safe": boolean, "reason": string }',
  prompt: `Review this user submission:\n\n${userContent}`,
  maxTokens: 128,
})
// Model output is not guaranteed to be valid JSON, so guard the parse.
// Falling back to "unsafe" is an assumption; adjust it to your own policy.
let verdict
try {
  verdict = JSON.parse(result.text)
} catch {
  verdict = { safe: false, reason: 'Unparseable moderation response' }
}
```
Semantic search
Generate embeddings for your content library (help articles, product descriptions, documentation) and let users search by meaning instead of exact keywords. A query like “how do I reset my password” matches articles about account recovery even if they never use the word “reset.”
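Matching by meaning can be pictured as ranking documents by how close their embedding vectors are to the query vector. The sketch below assumes cosine similarity as the metric; the source does not say which metric `sdk.search.query` uses, so treat this as an illustration of the idea and not a description of the SDK. `EmbeddedDoc`, `cosineSimilarity`, and `rankBySimilarity` are hypothetical names, and the vectors are made-up toy values.

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical document shape for illustration.
type EmbeddedDoc = { id: string; embedding: number[] }

// Return the `limit` documents most similar to the query vector, best first.
function rankBySimilarity(queryVector: number[], docs: EmbeddedDoc[], limit: number) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVector, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}

// Toy 3-dimensional vectors; real embeddings have far more dimensions.
const toyDocs: EmbeddedDoc[] = [
  { id: 'account-recovery', embedding: [0.9, 0.1, 0.0] },
  { id: 'billing-faq', embedding: [0.0, 0.2, 0.9] },
  { id: 'password-policy', embedding: [0.7, 0.6, 0.1] },
]
const ranked = rankBySimilarity([1, 0, 0], toyDocs, 2)
```

Here `ranked` lists `account-recovery` first and `password-policy` second, because their vectors point most nearly in the same direction as the query.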
Document Q&A with RAG
Combine embeddings + vector search + completions into a retrieval-augmented generation pipeline. Users ask a question, you find the most relevant documents via vector search, then pass those documents as context to an LLM to generate a grounded answer with source citations.
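The "pass those documents as context" step can be sketched as a prompt builder that numbers each retrieved document so the model can cite it. This is a minimal sketch, not the SDK's own method: `RetrievedDoc` and `buildRagPrompt` are hypothetical names, and the sample documents are invented for illustration.

```typescript
// Hypothetical shape of a retrieved document; not an Aerostack SDK type.
type RetrievedDoc = { id: string; title: string; text: string }

// Build a prompt that numbers each source so the model can cite it as [1], [2], ...
function buildRagPrompt(question: string, docs: RetrievedDoc[]): string {
  const sources = docs
    .map((doc, i) => `[${i + 1}] ${doc.title}\n${doc.text}`)
    .join('\n\n')
  return [
    'Answer the question using only the sources below.',
    'Cite sources by number, for example [1]. If the sources do not contain the answer, say so.',
    '',
    'Sources:',
    sources,
    '',
    `Question: ${question}`,
  ].join('\n')
}

// Invented sample documents standing in for vector search results.
const ragPrompt = buildRagPrompt('How do I reset my password?', [
  { id: 'kb-12', title: 'Account recovery', text: 'Use the "Forgot password" link on the sign-in page.' },
  { id: 'kb-40', title: 'Two-factor authentication', text: 'Recovery codes are shown once at setup.' },
])
```

In a real pipeline the documents would come from the vector search step and the string would be passed as `prompt` to `sdk.ai.complete`. The shape of the search results is not documented here, so mapping them into `RetrievedDoc` is left as an assumption.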
Code analysis
Feed source code into an LLM to generate documentation, find bugs, or suggest refactors. Use streaming to show results progressively as the model works through large files.
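For files too large to fit in a single prompt, one option is to split the source into line-based chunks and analyze each chunk separately. This is an assumption about how you might prepare input, not something the SDK is documented to do; `CodeChunk` and `chunkByLines` are hypothetical names.

```typescript
// Hypothetical chunk shape: the text plus the 1-based line range it covers.
type CodeChunk = { startLine: number; endLine: number; text: string }

// Split source code into consecutive chunks of at most `maxLines` lines.
function chunkByLines(source: string, maxLines: number): CodeChunk[] {
  if (maxLines < 1) throw new Error('maxLines must be at least 1')
  const lines = source.split('\n')
  const chunks: CodeChunk[] = []
  for (let start = 0; start < lines.length; start += maxLines) {
    const slice = lines.slice(start, start + maxLines)
    chunks.push({
      startLine: start + 1,
      endLine: start + slice.length,
      text: slice.join('\n'),
    })
  }
  return chunks
}

// A five-line sample file split into chunks of two lines each.
const sampleSource = [
  'const a = 1',
  'const b = 2',
  'const c = a + b',
  'console.log(c)',
  'export {}',
].join('\n')
const codeChunks = chunkByLines(sampleSource, 2)
```

Keeping the line range on each chunk lets you map any finding the model reports back to its position in the original file.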
Next steps
- Completions: full completion API
- Embeddings: vector embeddings for search
- Streaming: stream tokens to clients
- AI Streaming Chat example: complete working app