
Pipeline Builder

Available for BYOK (AI provider) products. Add stages between the user’s message and the LLM — RAG retrieval, pre-processing hooks, and post-processing hooks — all toggleable per request.

The pipeline builder is only available for BYOK (AI provider API) products. BYOC (your own server) products proxy requests directly to your server.

Pipeline stages

Stages run in this fixed order:

User message

[1] RAG retrieval       (optional) — inject relevant docs into context

[2] Pre-processing hook  (optional) — transform prompt, call APIs, detect language

[3] LLM call             (required) — primary model + fallback chain

[4] Post-processing hook (optional) — filter, format, store, trigger webhooks

Streamed response → user

Each stage can be toggled on or off from AI Products → your API → Pipeline in the dashboard.
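Conceptually, the pipeline is a reducer over the enabled stages in their fixed order. The sketch below illustrates the idea only; the stage names match this page, but the types and sequencing code are assumptions, not Aerostack internals:

```typescript
type Message = { role: string; content: string }

type Stage = {
  name: 'rag' | 'pre_hook' | 'llm' | 'post_hook'
  enabled: boolean
  run: (messages: Message[]) => Message[]
}

// Run the enabled stages in their fixed order; disabled stages are skipped.
function runPipeline(stages: Stage[], messages: Message[]): Message[] {
  const order = ['rag', 'pre_hook', 'llm', 'post_hook']
  return [...stages]
    .sort((a, b) => order.indexOf(a.name) - order.indexOf(b.name))
    .filter((s) => s.enabled)
    .reduce((msgs, stage) => stage.run(msgs), messages)
}
```

Toggling a stage off in the dashboard corresponds to `enabled: false` here: the stage is simply skipped, and the message list flows unchanged to the next one.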


Stage 1 — RAG Retrieval

Upload documents and let Aerostack automatically inject relevant content as context before the LLM call. Uses Cloudflare Vectorize + Workers AI embeddings.

How it works:

  1. Upload your documents via the dashboard (PDF, TXT, MD, DOCX)
  2. Aerostack chunks, embeds, and indexes them into Vectorize
  3. On each request, the user’s query is embedded and the top-k most relevant chunks are retrieved
  4. Chunks are injected into the system prompt before the LLM call
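Step 4 amounts to filtering retrieved chunks by score and prepending them to the system prompt. A minimal sketch of that injection step, assuming a hypothetical prompt template (Aerostack's actual format is not documented here):

```typescript
type Message = { role: string; content: string }
type Chunk = { text: string; score: number }

// Keep chunks at or above the similarity threshold and inject them
// into the system message; create one if the conversation has none.
function injectChunks(
  messages: Message[],
  chunks: Chunk[],
  scoreThreshold = 0.75
): Message[] {
  const kept = chunks.filter((c) => c.score >= scoreThreshold)
  if (kept.length === 0) return messages

  const context = 'Relevant context:\n' + kept.map((c) => `- ${c.text}`).join('\n')
  const hasSystem = messages.some((m) => m.role === 'system')

  if (hasSystem) {
    return messages.map((m) =>
      m.role === 'system' ? { ...m, content: `${context}\n\n${m.content}` } : m
    )
  }
  return [{ role: 'system', content: context }, ...messages]
}
```

Note how the score_threshold config maps directly onto the filter: chunks below it never reach the LLM, regardless of top_k.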

Config options:

| Option | Default | Description |
| --- | --- | --- |
| top_k | 5 | Number of chunks to retrieve |
| score_threshold | 0.75 | Minimum similarity score (0–1) |
| namespace | "default" | Isolate knowledge per namespace |

Test it: Dashboard → Pipeline → RAG stage → “Test RAG query” — enter a query and see exactly which chunks would be injected.

Documents API:

# Upload a document
curl -X POST https://api.aerocall.ai/api/v1/gateway/apis/:apiId/knowledge/upload \
  -H "Authorization: Bearer <your-developer-jwt>" \
  -F "file=@path/to/document.pdf"
 
# Test retrieval
curl -X POST https://api.aerocall.ai/api/v1/gateway/apis/:apiId/knowledge/test \
  -H "Authorization: Bearer <your-developer-jwt>" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the refund policy?"}'

Stage 2 & 4 — Hooks (Pre and Post)

Run an edge function before or after the LLM call. Functions are picked from your My Functions list.

Pre-hook use cases:

  • Translate the user’s message to English before the LLM
  • Detect intent and route to different system prompts
  • Validate input (block off-topic queries)
  • Fetch external data to enrich context (e.g. pull user’s account data)

Post-hook use cases:

  • Format the response (convert to JSON, strip markdown)
  • Filter for compliance (block PII in response)
  • Store to a database or trigger webhooks
  • Translate response back to the user’s language

Hook contract:

Your function receives the current message list and returns an updated one:

// Your edge function receives this request body
interface HookRequest {
  messages: Array<{ role: string; content: string }>
  metadata: Record<string, any>
}
 
// And must return this
interface HookResponse {
  messages: Array<{ role: string; content: string }>
  metadata?: Record<string, any>
}

Example pre-hook (language detection + translation):

export default {
  async fetch(request: Request): Promise<Response> {
    const { messages, metadata } = await request.json()
    const last = messages.at(-1)
 
    // Detect and translate (pseudocode — use your preferred translation API)
    const { lang, translated } = await detectAndTranslate(last.content)
 
    const updatedMessages = [
      ...messages.slice(0, -1),
      { role: 'user', content: translated },
    ]
 
    return Response.json({ messages: updatedMessages, metadata: { original_lang: lang } })
  }
}
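A post-hook follows the same contract. The sketch below redacts email addresses from assistant messages before they reach the user; the regex is a deliberately simple illustration, not production-grade PII detection:

```typescript
// Redact email addresses from a string (illustrative, not exhaustive PII detection).
function redactEmails(text: string): string {
  return text.replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, '[redacted]')
}

export default {
  async fetch(request: Request): Promise<Response> {
    const { messages, metadata } = await request.json()

    // Redact only assistant messages; user input passes through untouched.
    const filtered = messages.map((m: { role: string; content: string }) =>
      m.role === 'assistant' ? { ...m, content: redactEmails(m.content) } : m
    )

    return Response.json({ messages: filtered, metadata })
  },
}
```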

Stage 3 — LLM Call

The LLM stage is always enabled. Configure your primary model and optional fallbacks.

Supported providers:

| Provider | Models |
| --- | --- |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo |
| Anthropic | claude-3-5-sonnet, claude-3-haiku, claude-3-opus |
| Gemini | gemini-1.5-pro, gemini-1.5-flash, gemini-1.0-pro |
| Azure OpenAI | Any deployment name |
| Workers AI | @cf/meta/llama-3.1-8b-instruct, @cf/mistral/mistral-7b-instruct |

Fallback chains:

Configure fallbacks to handle provider outages or rate limits automatically:

{
  "primary": {
    "provider": "openai",
    "model": "gpt-4o",
    "secret_key_name": "OPENAI_API_KEY"
  },
  "fallbacks": [
    {
      "provider": "workers-ai",
      "model": "@cf/meta/llama-3.1-8b-instruct",
      "on_status": [429, 503]
    }
  ]
}

When the primary returns a 429 (rate limit) or 503, Aerostack automatically retries with the fallback — transparent to the user.
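The failover behavior can be pictured as walking the fallback chain until a candidate's on_status list covers the failing status code. This is a sketch of the selection logic under that assumption, not Aerostack's actual implementation:

```typescript
type Target = { provider: string; model: string; on_status?: number[] }

// Return the first fallback configured to handle the given HTTP status,
// or null if none matches (the original error is then surfaced to the caller).
function pickFallback(status: number, fallbacks: Target[]): Target | null {
  return fallbacks.find((f) => (f.on_status ?? []).includes(status)) ?? null
}
```

With the config above, a 429 from OpenAI would route the retry to Workers AI, while a 401 (bad key) would not trigger any fallback.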


Save via API

curl -X PUT https://api.aerocall.ai/api/v1/gateway/apis/:apiId/pipeline \
  -H "Authorization: Bearer <your-developer-jwt>" \
  -H "Content-Type: application/json" \
  -d '{
    "pipeline": [
      { "stage": "rag", "enabled": true, "config": { "top_k": 5, "score_threshold": 0.75 } },
      { "stage": "pre_hook", "enabled": false, "config": { "function_id": null } },
      {
        "stage": "llm",
        "enabled": true,
        "config": {
          "primary": { "provider": "openai", "model": "gpt-4o", "secret_key_name": "OPENAI_API_KEY" },
          "fallbacks": [{ "provider": "workers-ai", "model": "@cf/meta/llama-3.1-8b-instruct", "on_status": [429, 503] }]
        }
      },
      { "stage": "post_hook", "enabled": false, "config": { "function_id": null } }
    ]
  }'