# Pricing & Billing
Agent Endpoints support per-run pricing and two authentication modes. This page covers how costs are calculated and charged.
## Per-Run Pricing

The `price_per_run_cents` field lets you set a fixed charge per execution. This amount is deducted from the endpoint owner’s wallet after each successful run.
```bash
curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "invoice-parser",
    "workspace_id": "ws_abc123",
    "system_prompt": "Parse the invoice and extract structured data.",
    "output_format": "json",
    "price_per_run_cents": 5
  }'
```

In this example, each execution deducts 5 cents ($0.05) from the endpoint owner’s wallet.
The per-run charge is separate from the underlying LLM token cost. The token cost is always calculated and reported in the `usage.cost_cents` field, but it is the platform cost, not an additional charge.
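To make the distinction concrete, here is a small sketch (the numbers are illustrative) of what the owner is actually charged versus what `usage.cost_cents` merely reports:

```python
# Illustrative only: the owner is charged price_per_run_cents per run;
# usage.cost_cents is the reported platform token cost, not an extra charge.

def owner_charge_cents(price_per_run_cents: int, num_runs: int) -> int:
    """Total amount deducted from the owner's wallet."""
    return price_per_run_cents * num_runs

def reported_token_cost_cents(runs: list[dict]) -> float:
    """Sum of usage.cost_cents across runs -- informational, not billed on top."""
    return sum(r["usage"]["cost_cents"] for r in runs)

runs = [
    {"usage": {"cost_cents": 0.34}},
    {"usage": {"cost_cents": 0.41}},
]
print(owner_charge_cents(5, len(runs)))           # 10 cents deducted from the wallet
print(round(reported_token_cost_cents(runs), 2))  # 0.75 cents of reported platform cost
```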
### Pricing Limits

| Parameter | Min | Max | Default |
|---|---|---|---|
| `price_per_run_cents` | 0 | 10000 ($100) | 0 (free) |
### How Charging Works

- The agent executes and produces a result
- If `price_per_run_cents > 0`, the charge is deducted from the owner’s wallet
- The deduction happens asynchronously after the response is sent
- If the owner has insufficient balance, the run still succeeds but the charge may not be collected
There is currently no pre-execution balance check. Runs will succeed even if the owner’s wallet does not have enough funds. The charge is attempted but will not block execution.
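The flow above can be sketched as follows. `Wallet` and `charge_after_run` are hypothetical names used for illustration, not the platform’s actual internals:

```python
# Hypothetical sketch of the post-execution charge flow described above.
# There is no pre-execution balance check: the run result is returned first,
# and the deduction is attempted afterwards on a best-effort basis.

class Wallet:
    def __init__(self, balance_cents: int):
        self.balance_cents = balance_cents

def charge_after_run(wallet: Wallet, price_per_run_cents: int) -> bool:
    """Attempt the per-run deduction; never blocks or fails the run itself."""
    if price_per_run_cents <= 0:
        return True  # free endpoint, nothing to collect
    if wallet.balance_cents < price_per_run_cents:
        return False  # run already succeeded; charge goes uncollected
    wallet.balance_cents -= price_per_run_cents
    return True

wallet = Wallet(balance_cents=3)
result = "...agent output..."          # the run succeeds regardless of balance
collected = charge_after_run(wallet, 5)
print(collected)                       # False: balance (3) < price (5), run still succeeded
```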
## Cost Breakdown

Every execution returns a `usage` object with the full cost breakdown:

```json
{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
```

| Field | Description |
|---|---|
| `tokens_input` | Total input tokens across all LLM calls in the agentic loop |
| `tokens_output` | Total output tokens across all LLM calls |
| `cost_cents` | Platform cost based on model pricing with markup |
| `latency_ms` | Total wall-clock time from request to response |
| `iterations` | Number of LLM calls made (1 = no tool calls, 2+ = tool calls triggered re-inference) |
## Token Cost Calculation

Token costs are calculated based on the model used. The cost varies by model:

| Model | Approximate Cost |
|---|---|
| `gpt-4o-mini` | Low (best value for most tasks) |
| `gpt-4o` | Medium |
| `claude-sonnet-4-20250514` | Medium |
| `gemini-2.0-flash` | Low |
| `groq/llama-3.3-70b` | Low |
Bring your own LLM API key to use your provider’s pricing directly. When you provide an `llm_api_key`, you pay the provider directly for tokens. The `cost_cents` field still reports the estimated cost for observability.
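As a rough sketch of how a per-model estimate like `cost_cents` could be derived, assuming entirely hypothetical per-million-token rates (Aerostack’s actual rates and markup are not published here):

```python
# Hypothetical per-million-token rates in cents; NOT Aerostack's real pricing.
RATES_CENTS_PER_MTOK = {
    "gpt-4o-mini": (15, 60),                  # (input, output) -- placeholder values
    "claude-sonnet-4-20250514": (300, 1500),  # placeholder values
}

def estimate_cost_cents(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate token cost from placeholder per-million-token rates."""
    rate_in, rate_out = RATES_CENTS_PER_MTOK[model]
    return (tokens_in * rate_in + tokens_out * rate_out) / 1_000_000

# Using the token counts from the usage example above:
print(estimate_cost_cents("gpt-4o-mini", 1245, 189))
```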
## Auth Modes
Agent Endpoints support two authentication modes that affect both security and rate limiting.
### API Key (Default)

```bash
curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "Authorization: Bearer aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
```

Or using the `X-API-Key` header:

```bash
curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "X-API-Key: aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
```

- API key is generated on endpoint creation (prefixed `aek_`)
- The raw key is shown only once, at creation; store it securely
- Rate limit: 60 requests per minute per endpoint
- Use `POST /:id/regenerate-key` to rotate the key
### Public (No Auth)

```bash
curl -X POST https://api.aerostack.dev/api/run/my-public-agent \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
```

- No authentication required
- Anyone with the URL can call the endpoint
- Additional per-IP rate limiting: 20 requests per minute per IP per endpoint
- Standard per-endpoint rate limit still applies (60/min)
Set the auth mode when creating or updating:

```bash
curl -X PATCH https://api.aerostack.dev/api/agent-endpoints/aep_your_id \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"auth_mode": "public"}'
```

Public endpoints are accessible to anyone on the internet. Use them only for non-sensitive operations where you are comfortable with open access. Every run still costs tokens and deducts from the owner’s wallet.
## Rate Limits Summary
| Scope | Limit | Window | Applies To |
|---|---|---|---|
| Per endpoint | 60 requests | 1 minute | All auth modes |
| Per IP per endpoint | 20 requests | 1 minute | public mode only |
| Admin test runs | 10 requests | 1 minute | Dashboard test endpoint |
## Monitoring Costs

Track costs over time by querying the run history:

```bash
curl "https://api.aerostack.dev/api/agent-endpoints/aep_your_id/runs?limit=100" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

Each run record includes `cost_cents`, `tokens_input`, `tokens_output`, and `latency_ms`. Aggregate these in your application or dashboard to monitor spending.
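For example, given run records with the documented field names (the sample values here are made up), a simple aggregation might look like:

```python
# Aggregate spending metrics from run records returned by the runs endpoint.
# Field names match the documented schema; the sample data is illustrative.

runs = [
    {"cost_cents": 0.34, "tokens_input": 1245, "tokens_output": 189, "latency_ms": 2891},
    {"cost_cents": 0.52, "tokens_input": 2010, "tokens_output": 340, "latency_ms": 3544},
    {"cost_cents": 0.18, "tokens_input": 640,  "tokens_output": 95,  "latency_ms": 1620},
]

total_cost = sum(r["cost_cents"] for r in runs)
total_tokens = sum(r["tokens_input"] + r["tokens_output"] for r in runs)
avg_latency = sum(r["latency_ms"] for r in runs) / len(runs)

print(f"cost: {total_cost:.2f} cents, tokens: {total_tokens}, avg latency: {avg_latency:.0f} ms")
```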
## API Key Management

### Regenerate Key

If a key is compromised or you need to rotate it:

```bash
curl -X POST https://api.aerostack.dev/api/agent-endpoints/aep_your_id/regenerate-key \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

The response contains the new key, shown once:

```json
{
  "api_key": "aek_new_key_value_shown_once"
}
```

The old key is immediately invalidated. All consumers must update to the new key.
### Bring Your Own LLM Key

To use your own LLM provider account (and avoid platform rate limits on the shared key):

```bash
curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-agent",
    "workspace_id": "ws_abc123",
    "system_prompt": "...",
    "llm_provider": "anthropic",
    "llm_model": "claude-sonnet-4-20250514",
    "llm_api_key": "sk-ant-your-key-here"
  }'
```

The LLM API key is encrypted at rest. It is never exposed via the GET endpoints.