AI

Aerostack AI gives you LLM completions, vector embeddings, and token streaming via a single SDK — without managing API keys or provider SDKs directly.

Quick start

import { sdk } from '@aerostack/sdk'

// Text completion
const result = await sdk.ai.complete({
  prompt: 'Summarize this article in 3 bullet points:\n\n' + articleText,
  model: 'gpt-4o-mini',
  maxTokens: 256,
})
console.log(result.text)

// Embeddings
const embedding = await sdk.ai.embed('What is the capital of France?')
// Returns: number[] (e.g. a 1536-dimensional vector; the length depends on the embedding model)

Completions

const result = await sdk.ai.complete({
  prompt: 'Write a product description for: ' + productName,
  model: 'gpt-4o-mini',   // or 'gpt-4o', 'claude-3-haiku', etc.
  maxTokens: 512,
  temperature: 0.7,
})

// result.text — the generated text
// result.usage — { promptTokens, completionTokens, totalTokens }

Embeddings

Generate vector representations for semantic search and similarity:

// Single text
const vector = await sdk.ai.embed('user search query')

// Store in database alongside your content
await sdk.db.query(
  'INSERT INTO documents (id, text, embedding) VALUES (?, ?, ?)',
  [id, text, JSON.stringify(vector)]
)

// Find similar documents
const queryVector = await sdk.ai.embed(searchQuery)
const similar = await sdk.search.query(queryVector, { table: 'documents', limit: 10 })
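
For intuition about what "similar" means here: vector search commonly ranks documents by cosine similarity between the query vector and each stored vector. This page does not say which metric `sdk.search.query` uses, so treat the following as a minimal sketch under that assumption. The helper names `cosineSimilarity` and `topK` are hypothetical and not part of the SDK.

```typescript
// Hypothetical helpers for illustration; not part of @aerostack/sdk.
// Assumption: similarity is measured as cosine similarity.

// Cosine similarity = dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`)
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction, so report no similarity.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Return the ids of the `limit` vectors most similar to the query, best first.
function topK(
  query: number[],
  docs: { id: string; embedding: number[] }[],
  limit: number
): string[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((d) => d.id)
}
```

In practice the search service does this ranking for you; a local version like this is mainly useful for tests or small in-memory datasets.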

Streaming (via WebSocket)

For real-time token delivery to clients, see the AI Streaming guide. Tokens are pushed to clients over your Realtime channel.

// Stream from server and push tokens via WebSocket
for await (const token of await sdk.ai.streamCompletion({ prompt })) {
  sdk.socket.emit('ai:token', { token }, sessionChannel)
}
sdk.socket.emit('ai:done', {}, sessionChannel)
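
On the receiving side, the client has to assemble those tokens into the text it displays. The sketch below shows one way to do that using the `ai:token` and `ai:done` event names from the snippet above. The `TokenAccumulator` class is hypothetical, and how the client subscribes to the channel is left out because it depends on the Realtime client API.

```typescript
// Hypothetical client-side helper; not part of @aerostack/sdk.
// Feed it the 'ai:token' and 'ai:done' events emitted by the server loop.
class TokenAccumulator {
  private parts: string[] = []
  private finished = false
  private onUpdate: (textSoFar: string, done: boolean) => void

  constructor(onUpdate: (textSoFar: string, done: boolean) => void) {
    this.onUpdate = onUpdate
  }

  handle(event: 'ai:token' | 'ai:done', payload: { token?: string }): void {
    if (this.finished) return // ignore anything that arrives after completion
    if (event === 'ai:token' && typeof payload.token === 'string') {
      this.parts.push(payload.token)
    } else if (event === 'ai:done') {
      this.finished = true
    }
    // Report the full text so far, so the UI can re-render it.
    this.onUpdate(this.parts.join(''), this.finished)
  }
}
```

Passing the full text on every update keeps the rendering code simple; for very long responses you might prefer to pass only the new token instead.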

Model support

Configure your AI provider and model in Dashboard → AI → Configuration. Supported providers include OpenAI, Anthropic, and Cloudflare Workers AI.

Use Cases

Chat completions

Build a customer support chatbot or internal assistant. Send conversation history as the prompt and stream responses back to users token-by-token via the Realtime channel. Swap models (GPT-4o, Claude, etc.) from the dashboard without changing code.
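
One way to send conversation history as the prompt is to flatten the recent turns into a single string. This is a minimal sketch: the `ChatMessage` type, the `User:`/`Assistant:` labels, and the `buildChatPrompt` helper are all hypothetical, and this page does not say whether the SDK also accepts a structured message list.

```typescript
// Hypothetical types and helper for illustration; not part of @aerostack/sdk.
type ChatMessage = { role: 'user' | 'assistant'; content: string }

// Flatten the most recent turns into one prompt string, ending with an
// "Assistant:" cue so the model continues the conversation.
function buildChatPrompt(history: ChatMessage[], maxTurns = 20): string {
  // Keep the prompt bounded. slice(-0) would return everything, so guard it.
  const recent = maxTurns > 0 ? history.slice(-maxTurns) : []
  const lines = recent.map(
    (m) => `${m.role === 'user' ? 'User' : 'Assistant'}: ${m.content.trim()}`
  )
  return [...lines, 'Assistant:'].join('\n')
}
```

The resulting string can be passed as `prompt` to `sdk.ai.complete` or `sdk.ai.streamCompletion`. Capping the number of turns is a simple way to limit prompt size; a token-based budget would be more precise.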

Content moderation

Automatically screen user-generated content before publishing. Pass submitted text through an LLM with a moderation prompt to detect policy violations, spam, or harmful content. Run this as a queue job so moderation happens asynchronously without blocking the user.

const result = await sdk.ai.complete({
  model: 'gpt-4o-mini',
  system: 'You are a content moderator. Respond with JSON: { "safe": boolean, "reason": string }',
  prompt: `Review this user submission:\n\n${userContent}`,
  maxTokens: 128,
})
const verdict = JSON.parse(result.text)
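
Note that `JSON.parse` throws if the model returns anything other than valid JSON, and models sometimes wrap JSON in a Markdown code fence. A more defensive parser might look like the sketch below. The `parseVerdict` helper and the choice to treat unparseable replies as unsafe are assumptions, not SDK behavior.

```typescript
// Hypothetical helper for illustration; not part of @aerostack/sdk.
type Verdict = { safe: boolean; reason: string }

// Parse the model's reply defensively. Anything that is not a well-formed
// verdict is treated as unsafe (fail closed) so it can go to human review.
function parseVerdict(text: string): Verdict {
  // Strip an optional Markdown code fence (three backticks) around the JSON.
  const cleaned = text
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, '')
    .replace(/\s*`{3}$/, '')
  try {
    const data: unknown = JSON.parse(cleaned)
    if (typeof data === 'object' && data !== null) {
      const { safe, reason } = data as Record<string, unknown>
      if (typeof safe === 'boolean' && typeof reason === 'string') {
        return { safe, reason }
      }
    }
  } catch {
    // Invalid JSON: fall through to the fail-closed default below.
  }
  return { safe: false, reason: 'Unparseable moderation response' }
}
```

With this in place, `const verdict = parseVerdict(result.text)` never throws, which matters when moderation runs unattended in a queue job.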

Semantic search

Generate embeddings for your content library (help articles, product descriptions, documentation) and let users search by meaning instead of exact keywords. A query like “how do I reset my password” matches articles about account recovery even if they never use the word “reset.”

Document Q&A with RAG

Combine embeddings, vector search, and completions into a retrieval-augmented generation (RAG) pipeline. When a user asks a question, you find the most relevant documents via vector search, then pass those documents as context to an LLM, which generates a grounded answer with source citations.
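
The pipeline can be sketched in three steps: embed the question, retrieve similar documents, and generate an answer from them. To avoid guessing at exact SDK signatures, the sketch takes a small client interface as a parameter. `RagClient`, `RetrievedDoc`, `buildRagPrompt`, and `answerQuestion` are hypothetical names, and the `{ id, text }` shape of search results is an assumption.

```typescript
// Hypothetical shapes for illustration. The result shape of the vector
// search is not documented on this page, so { id, text } is an assumption.
type RetrievedDoc = { id: string; text: string }

interface RagClient {
  embed(text: string): Promise<number[]>
  search(vector: number[], limit: number): Promise<RetrievedDoc[]>
  complete(req: {
    system?: string
    prompt: string
    maxTokens?: number
  }): Promise<{ text: string }>
}

// Number each document so the model can cite sources as [1], [2], ...
function buildRagPrompt(question: string, docs: RetrievedDoc[]): string {
  const context = docs
    .map((d, i) => `[${i + 1}] (id: ${d.id})\n${d.text}`)
    .join('\n\n')
  return `Context:\n${context}\n\nQuestion: ${question}\nAnswer:`
}

async function answerQuestion(client: RagClient, question: string, limit = 5) {
  // 1. Embed the question.
  const queryVector = await client.embed(question)
  // 2. Retrieve the most similar documents.
  const docs = await client.search(queryVector, limit)
  // 3. Generate an answer grounded in the retrieved context.
  const result = await client.complete({
    system:
      'Answer using only the provided context. Cite sources as [n]. ' +
      'If the context is insufficient, say so.',
    prompt: buildRagPrompt(question, docs),
    maxTokens: 512,
  })
  return { answer: result.text, sources: docs.map((d) => d.id) }
}
```

To wire this up, the client's three methods would delegate to `sdk.ai.embed`, `sdk.search.query`, and `sdk.ai.complete`, adapting the search results to the assumed shape. Injecting the client also makes the pipeline easy to test with a fake.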

Code analysis

Feed source code into an LLM to generate documentation, find bugs, or suggest refactors. Use streaming to show results progressively as the model works through large files.
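
Large files may not fit in a single prompt, so one option is to split the source into line-based chunks and analyze each chunk in turn. This is a minimal sketch of that idea. The `chunkByLines` helper and the default of 200 lines per chunk are assumptions, and real limits depend on the model's context window.

```typescript
// Hypothetical helper for illustration; not part of @aerostack/sdk.
type SourceChunk = { startLine: number; endLine: number; code: string }

// Split source text into chunks of at most maxLines lines.
// Line ranges are 1-based and inclusive so findings can be reported by line.
function chunkByLines(source: string, maxLines = 200): SourceChunk[] {
  if (maxLines < 1) throw new Error('maxLines must be at least 1')
  const lines = source.split('\n')
  const chunks: SourceChunk[] = []
  for (let start = 0; start < lines.length; start += maxLines) {
    const slice = lines.slice(start, start + maxLines)
    chunks.push({
      startLine: start + 1,
      endLine: start + slice.length,
      code: slice.join('\n'),
    })
  }
  return chunks
}
```

Each chunk's `code` can then be sent as part of the prompt, with `startLine` and `endLine` included so the model's comments can be mapped back to the file. Splitting on function or class boundaries would give the model better context than fixed line counts.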

Next steps

Completions — full completion API
Embeddings — vector embeddings for search
Streaming — stream tokens to clients
AI Streaming Chat example — complete working app