Billing & Limits
Aerostack bots support three billing modes that control how LLM costs are charged. You can set the billing mode per bot.
Billing Modes
Wallet (Default)
The `wallet` mode deducts LLM costs from your prepaid Aerostack balance.
How it works:
- Before processing a message, the bot checks your wallet has sufficient balance
- The message is processed through the LLM (with or without tool calls)
- Cost is calculated based on tokens used
- The cost is automatically deducted from your wallet
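The four steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not Aerostack's implementation: `Wallet`, `InsufficientBalance`, `handle_message`, and the `min_balance_cents` threshold are hypothetical names and values, and prices are passed in as dollars per 1 million tokens.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class InsufficientBalance(Exception):
    """Raised when the wallet cannot cover a new message (hypothetical name)."""


@dataclass
class Wallet:
    balance_cents: float


def handle_message(
    wallet: Wallet,
    run_llm: Callable[[], Tuple[str, int, int]],
    input_price_per_1m: float,
    output_price_per_1m: float,
    min_balance_cents: float = 1.0,  # assumed threshold, not documented
) -> Tuple[str, float]:
    # Step 1: refuse to start if the balance is below the assumed minimum.
    if wallet.balance_cents < min_balance_cents:
        raise InsufficientBalance("wallet balance too low")

    # Step 2: run the LLM; the callable returns (reply, input_tokens, output_tokens).
    reply, input_tokens, output_tokens = run_llm()

    # Step 3: convert dollars per 1M tokens into a cost in cents.
    cost_usd = (
        input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    ) / 1_000_000
    cost_cents = cost_usd * 100

    # Step 4: deduct the cost from the wallet.
    wallet.balance_cents -= cost_cents
    return reply, cost_cents
```

For example, 1,000 input and 1,000 output tokens at $2.50 and $10.00 per 1 million tokens would deduct about 1.25 cents.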
```bash
# Create a bot with wallet billing (default)
curl -X POST https://api.aerostack.dev/api/bots \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Bot",
    "platform": "custom",
    "workspace_id": "...",
    "system_prompt": "...",
    "billing_mode": "wallet"
  }'
```
BYOK (Bring Your Own Key)
The `byok` mode uses your own LLM API key. Aerostack does not charge for LLM usage; you pay your provider directly at their standard rates.
To use BYOK:
- Set `billing_mode` to `byok`
- Provide your LLM API key via the `llm_api_key` field (encrypted at rest)
```bash
curl -X POST https://api.aerostack.dev/api/bots \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "BYOK Bot",
    "platform": "custom",
    "workspace_id": "...",
    "system_prompt": "...",
    "billing_mode": "byok",
    "llm_provider": "openai",
    "llm_model": "gpt-4o",
    "llm_api_key": "sk-..."
  }'
```
Plan Quota (Coming Soon)
The `plan_quota` mode will use your account plan's included LLM usage quota. This mode is not yet fully implemented.
LLM Pricing
All prices are per 1 million tokens and apply when using wallet mode (Aerostack's pooled keys).
Anthropic
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| claude-opus-4-6 | $15.00 | $75.00 |
| claude-sonnet-4-6 | $3.00 | $15.00 |
| claude-haiku-4-5 | $0.80 | $4.00 |
OpenAI
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| o1 | $15.00 | $60.00 |
Google
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gemini-2.5-pro | $1.25 | $5.00 |
| gemini-2.5-flash | $0.15 | $0.60 |
Groq
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| llama-3.3-70b-versatile | $0.59 | $0.79 |
| mixtral-8x7b-32768 | $0.24 | $0.24 |
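As a worked example of per-token pricing, the sketch below estimates the cost of one message from the prices in the tables above. The `PRICES_USD_PER_1M` table and the `message_cost_usd` helper are illustrative names, and how Aerostack rounds the result into `cost_cents` is not specified here, so the function returns an unrounded dollar amount.

```python
# (input price, output price) in dollars per 1 million tokens,
# copied from the pricing tables above.
PRICES_USD_PER_1M = {
    "claude-opus-4-6": (15.00, 75.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5": (0.80, 4.00),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "o1": (15.00, 60.00),
    "gemini-2.5-pro": (1.25, 5.00),
    "gemini-2.5-flash": (0.15, 0.60),
    "llama-3.3-70b-versatile": (0.59, 0.79),
    "mixtral-8x7b-32768": (0.24, 0.24),
}


def message_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the unrounded cost in dollars for one LLM call."""
    input_price, output_price = PRICES_USD_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 22,500 input tokens and 4,200 output tokens on claude-sonnet-4-6:
# (22500 * 3.00 + 4200 * 15.00) / 1,000,000
print(f"{message_cost_usd('claude-sonnet-4-6', 22_500, 4_200):.4f}")  # 0.1305
```

The same token counts on `gpt-4o-mini` come to roughly $0.0059, which shows how strongly model choice drives cost.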
Spending Caps
Set a spending cap on any bot to limit total lifetime spend. When the bot’s cumulative cost reaches the cap, it stops responding to messages.
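The cap rule can be stated as a small predicate. This is a sketch of the described behavior under two assumptions: that "reaches the cap" means greater than or equal to, and that a missing cap (`null` in the API, `None` here) means unlimited. The function name `is_capped` is hypothetical.

```python
from typing import Optional


def is_capped(cumulative_cost_cents: int, spending_cap_cents: Optional[int]) -> bool:
    """Return True when the bot should stop responding."""
    # No cap configured: the bot is never blocked by spend.
    if spending_cap_cents is None:
        return False
    # The cap is a lifetime total, so compare against cumulative cost.
    return cumulative_cost_cents >= spending_cap_cents
```

With a cap of 500 cents, a bot that has spent 499 cents still responds, and one that has spent 500 cents does not.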
```bash
# Set a $5 spending cap (500 cents)
curl -X PATCH https://api.aerostack.dev/api/bots/YOUR_BOT_ID \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "spending_cap_cents": 500 }'
```
When the cap is reached, the bot returns a message like: “Bot has reached its spending limit.”
To remove a spending cap, set it to null:
```bash
curl -X PATCH https://api.aerostack.dev/api/bots/YOUR_BOT_ID \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "spending_cap_cents": null }'
```
Cost Visibility
Per-Message Costs
The test endpoint returns detailed cost information:
```json
{
  "response": "...",
  "tokens": { "input": 485, "output": 42 },
  "cost_cents": 1,
  "latency_ms": 2340,
  "token_breakdown": {
    "systemPrompt": 120,
    "toolDefinitions": 200,
    "conversationHistory": 80,
    "currentMessage": 25,
    "toolResults": 40,
    "llmOutput": 42,
    "total": 507
  }
}
```
Analytics Dashboard
The analytics endpoint provides daily cost rollups:
```bash
curl "https://api.aerostack.dev/api/bots/YOUR_BOT_ID/analytics?from=2026-03-01&to=2026-03-15" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```
It returns daily breakdowns of messages, tokens, and costs, plus an overall summary:
```json
{
  "daily": [
    {
      "date": "2026-03-14",
      "messages_received": 45,
      "messages_sent": 45,
      "tokens_input": 22500,
      "tokens_output": 4200,
      "total_cost_cents": 12,
      "unique_users": 8
    }
  ],
  "summary": {
    "total_conversations": 120,
    "total_messages": 890,
    "total_tokens": 445000,
    "total_cost_cents": 156
  }
}
```
Rate Limits
| Platform | Requests per Minute |
|---|---|
| Telegram | 60 |
| Discord | 120 |
| 60 | |
| Slack | 60 |
| Custom | 60 |
| Test endpoint | 10 (per user) |
Rate limits are enforced per bot (webhooks) or per user (test endpoint). Exceeding the limit returns a 429 Too Many Requests response.
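A client that calls the API can treat `429` as a signal to wait and try again. The sketch below shows one common approach, exponential backoff. It is not an official client: `send_with_retry`, its parameters, and the delay schedule are assumptions, and the source does not say whether a `Retry-After` header is returned. The `send` callable stands in for whatever performs the HTTP request and returns `(status_code, body)`.

```python
import time
from typing import Callable, Tuple


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = 429, ""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Wait 1s, 2s, 4s, ... before the next attempt; skip the wait
        # after the final attempt because no further call will be made.
        if attempt < max_attempts - 1:
            sleep(base_delay_s * 2 ** attempt)
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.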
Conversation Limits
| Setting | Default | Description |
|---|---|---|
| `conversation_max_messages` | 20 | Maximum messages in the conversation context window. Older messages are summarized. |
| `conversation_ttl_hours` | 24 | Hours before a conversation expires. After expiry, a new conversation starts. |
| `max_loop_iterations` | 10 | Maximum agent loop iterations (tool call rounds) per message. |
| `max_tokens_per_turn` | 8192 | Maximum output tokens per LLM call. |
| `timeout_ms` | 30000 | Hard timeout for message processing, in milliseconds (maximum 60000). |
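When building a settings payload on the client side, it can help to encode the defaults and the documented `timeout_ms` ceiling in one place. The following is a minimal sketch: the `ConversationLimits` class is hypothetical, and only the 60000 ms maximum is taken from the table above; no other server-side validation rules are assumed.

```python
from dataclasses import asdict, dataclass

MAX_TIMEOUT_MS = 60_000  # documented ceiling for timeout_ms


@dataclass
class ConversationLimits:
    # Defaults mirror the table above.
    conversation_max_messages: int = 20
    conversation_ttl_hours: int = 24
    max_loop_iterations: int = 10
    max_tokens_per_turn: int = 8192
    timeout_ms: int = 30_000

    def __post_init__(self) -> None:
        # Reject values the API is documented not to accept.
        if self.timeout_ms > MAX_TIMEOUT_MS:
            raise ValueError(f"timeout_ms must be at most {MAX_TIMEOUT_MS}")

    def as_patch_body(self) -> dict:
        """Return the settings as a dict suitable for a JSON request body."""
        return asdict(self)
```

For instance, `ConversationLimits(max_loop_iterations=5, timeout_ms=45_000).as_patch_body()` yields a dict with those two overrides and the remaining defaults.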
All of these can be configured per bot:
```bash
curl -X PATCH https://api.aerostack.dev/api/bots/YOUR_BOT_ID \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation_max_messages": 30,
    "conversation_ttl_hours": 48,
    "max_loop_iterations": 5,
    "max_tokens_per_turn": 4096,
    "timeout_ms": 45000
  }'
```
Cost Optimization Tips
- Use cheaper models for simple bots. `gpt-4o-mini` and `gemini-2.5-flash` are roughly 8x to 17x cheaper than `gemini-2.5-pro` and `gpt-4o` at the prices listed above, while still capable for most tasks.
- Set spending caps. Always set a spending cap during development and testing.
- Reduce `max_loop_iterations`. If your bot rarely needs more than 2-3 tool calls, lower this from the default of 10 to limit runaway costs.
- Lower `conversation_max_messages`. Shorter context windows use fewer tokens per message.
- Use BYOK for high-volume bots. If you have negotiated rates or free credits with an LLM provider, BYOK lets you pay your provider directly.
- Monitor with analytics. Check the analytics endpoint regularly to identify cost spikes.
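To see why lowering `max_loop_iterations` and `max_tokens_per_turn` matters, the sketch below computes a rough upper bound on output-token cost for a single message. It assumes each loop iteration makes one LLM call that may emit up to `max_tokens_per_turn` output tokens; input-token cost is not included, and `max_output_cost_usd` is an illustrative helper, not part of the API.

```python
def max_output_cost_usd(
    max_loop_iterations: int,
    max_tokens_per_turn: int,
    output_price_per_1m_usd: float,
) -> float:
    """Upper bound on output-token cost for one message, in dollars."""
    max_output_tokens = max_loop_iterations * max_tokens_per_turn
    return max_output_tokens * output_price_per_1m_usd / 1_000_000


# gpt-4o output price is $10.00 per 1 million tokens.
print(f"{max_output_cost_usd(10, 8192, 10.00):.4f}")  # defaults: 0.8192
print(f"{max_output_cost_usd(5, 4096, 10.00):.4f}")   # tuned:    0.2048
```

Under these assumptions, halving both settings cuts the worst-case output cost per message to a quarter.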