
Pricing & Billing

Agent Endpoints support per-run pricing and two authentication modes. This page covers how costs are calculated and charged.

Per-Run Pricing

The price_per_run_cents field lets you set a fixed charge per execution. This amount is deducted from the endpoint owner’s wallet after each successful run.

curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "invoice-parser",
    "workspace_id": "ws_abc123",
    "system_prompt": "Parse the invoice and extract structured data.",
    "output_format": "json",
    "price_per_run_cents": 5
  }'

In this example, each execution deducts 5 cents ($0.05) from the endpoint owner’s wallet.

The per-run charge is separate from the underlying LLM token cost. The token cost is always calculated and reported in the usage.cost_cents field, but it reflects the platform cost rather than an additional charge on top of the per-run price.

Pricing Limits

| Parameter           | Min | Max           | Default  |
|---------------------|-----|---------------|----------|
| price_per_run_cents | 0   | 10000 ($100)  | 0 (free) |

How Charging Works

  1. The agent executes and produces a result
  2. If price_per_run_cents > 0, the charge is deducted from the owner’s wallet
  3. The deduction happens asynchronously after the response is sent
  4. If the owner has insufficient balance, the run still succeeds but the charge may not be collected
⚠️ There is currently no pre-execution balance check. Runs will succeed even if the owner’s wallet does not have enough funds. The charge is attempted but will not block execution.
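The charging sequence above can be sketched as follows. The function and wallet representation are hypothetical; only the ordering and the "insufficient balance does not block the run" behavior come from this page.

```python
# Sketch of the post-execution charging flow. Names are illustrative,
# not the platform's actual implementation.

def run_endpoint(price_per_run_cents: int, wallet_balance_cents: int):
    # 1. The agent executes and produces a result.
    result = "...agent output..."

    # 2-3. After the response is sent, the charge is attempted asynchronously.
    charged = 0
    if price_per_run_cents > 0:
        if wallet_balance_cents >= price_per_run_cents:
            wallet_balance_cents -= price_per_run_cents
            charged = price_per_run_cents
        # 4. Insufficient balance: the run still succeeds, charge uncollected.

    return result, charged, wallet_balance_cents

# Owner has 3 cents, price is 5 cents: the run succeeds, nothing is collected.
result, charged, balance = run_endpoint(5, 3)
print(charged, balance)  # 0 3
```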

Cost Breakdown

Every execution returns a usage object with the full cost breakdown:

{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
| Field         | Description                                                                  |
|---------------|------------------------------------------------------------------------------|
| tokens_input  | Total input tokens across all LLM calls in the agentic loop                   |
| tokens_output | Total output tokens across all LLM calls                                      |
| cost_cents    | Platform cost based on model pricing with markup                              |
| latency_ms    | Total wall-clock time from request to response                                |
| iterations    | Number of LLM calls made (1 = no tool calls, 2+ = tool calls triggered re-inference) |
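As a quick illustration, a client can derive convenience metrics from the usage object shown above. The field names come from the response; the derived metrics (dollar conversion, cost per 1K tokens) are illustrative, not part of the API.

```python
import json

# Parse the example response from above and derive a few metrics.
response = json.loads("""
{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
""")

usage = response["usage"]
total_tokens = usage["tokens_input"] + usage["tokens_output"]
cost_dollars = usage["cost_cents"] / 100
cost_per_1k_tokens_cents = usage["cost_cents"] / total_tokens * 1000

print(total_tokens)                         # 1434
print(round(cost_dollars, 4))               # 0.0034
print(round(cost_per_1k_tokens_cents, 3))   # 0.237
```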

Token Cost Calculation

Token costs depend on the model used:

| Model                    | Approximate Cost               |
|--------------------------|--------------------------------|
| gpt-4o-mini              | Low (best value for most tasks)|
| gpt-4o                   | Medium                         |
| claude-sonnet-4-20250514 | Medium                         |
| gemini-2.0-flash         | Low                            |
| groq/llama-3.3-70b       | Low                            |

Bring your own LLM API key to use your provider’s pricing: when you provide an llm_api_key, you pay the provider directly for tokens. The cost_cents field still reports the estimated cost for observability.

Auth Modes

Agent Endpoints support two authentication modes that affect both security and rate limiting.

API Key (Default)

curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "Authorization: Bearer aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'

Or using the X-API-Key header:

curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "X-API-Key: aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
  • An API key is generated on endpoint creation (prefixed aek_)
  • The raw key is shown once at creation — store it securely
  • Rate limit: 60 requests per minute per endpoint
  • Use POST /:id/regenerate-key to rotate the key

Public (No Auth)

curl -X POST https://api.aerostack.dev/api/run/my-public-agent \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
  • No authentication required
  • Anyone with the URL can call the endpoint
  • Additional per-IP rate limiting: 20 requests per minute per IP per endpoint
  • Standard per-endpoint rate limit still applies (60/min)
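The layered limits above can be sketched with a sliding-window counter: a 60/min cap per endpoint plus, for public endpoints, a 20/min cap per IP. The data structures and function names are illustrative; only the limits come from this page.

```python
import time
from collections import defaultdict, deque

# Limits from the docs; the enforcement mechanism below is a sketch.
ENDPOINT_LIMIT, IP_LIMIT, WINDOW = 60, 20, 60.0

endpoint_hits: dict = defaultdict(deque)   # endpoint -> request timestamps
ip_hits: dict = defaultdict(deque)         # (endpoint, ip) -> request timestamps

def allow(endpoint: str, ip: str, public: bool, now: float) -> bool:
    def within_limit(q: deque, limit: int) -> bool:
        while q and now - q[0] >= WINDOW:  # drop hits older than the window
            q.popleft()
        if len(q) >= limit:
            return False
        q.append(now)
        return True

    # Public endpoints get the stricter per-IP check first.
    if public and not within_limit(ip_hits[(endpoint, ip)], IP_LIMIT):
        return False
    return within_limit(endpoint_hits[endpoint], ENDPOINT_LIMIT)

# 20 requests from one IP pass; the 21st in the same window is rejected.
now = time.time()
results = [allow("my-public-agent", "203.0.113.7", True, now) for _ in range(21)]
print(results[-2], results[-1])  # True False
```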

Set the auth mode when creating or updating:

curl -X PATCH https://api.aerostack.dev/api/agent-endpoints/aep_your_id \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"auth_mode": "public"}'
⚠️ Public endpoints are accessible to anyone on the internet. Use them only for non-sensitive operations where you are comfortable with open access. Every run still costs tokens and deducts from the owner’s wallet.

Rate Limits Summary

| Scope               | Limit       | Window   | Applies To              |
|---------------------|-------------|----------|-------------------------|
| Per endpoint        | 60 requests | 1 minute | All auth modes          |
| Per IP per endpoint | 20 requests | 1 minute | public mode only        |
| Admin test runs     | 10 requests | 1 minute | Dashboard test endpoint |

Monitoring Costs

Track costs over time by querying the run history:

curl "https://api.aerostack.dev/api/agent-endpoints/aep_your_id/runs?limit=100" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Each run record includes cost_cents, tokens_input, tokens_output, and latency_ms. Aggregate these in your application or dashboard to monitor spending.
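A minimal aggregation sketch: the record shape below mirrors the fields named above, and a sample list stands in for the paginated response (fetching it with the curl call shown is out of scope here).

```python
# Sample run records with the fields named above; values are illustrative.
runs = [
    {"cost_cents": 0.34, "tokens_input": 1245, "tokens_output": 189, "latency_ms": 2891},
    {"cost_cents": 0.21, "tokens_input": 804,  "tokens_output": 95,  "latency_ms": 1733},
    {"cost_cents": 0.52, "tokens_input": 2011, "tokens_output": 340, "latency_ms": 4120},
]

total_cost_cents = sum(r["cost_cents"] for r in runs)
total_tokens = sum(r["tokens_input"] + r["tokens_output"] for r in runs)
avg_latency_ms = sum(r["latency_ms"] for r in runs) / len(runs)

print(round(total_cost_cents, 2))  # 1.07
print(total_tokens)                # 4684
print(round(avg_latency_ms))       # 2915
```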

API Key Management

Regenerate Key

If a key is compromised or you need to rotate it:

curl -X POST https://api.aerostack.dev/api/agent-endpoints/aep_your_id/regenerate-key \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
The response contains the new key:

{
  "api_key": "aek_new_key_value_shown_once"
}

The old key is immediately invalidated. All consumers must update to the new key.

Bring Your Own LLM Key

To use your own LLM provider account (and avoid platform rate limits on the shared key):

curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-agent",
    "workspace_id": "ws_abc123",
    "system_prompt": "...",
    "llm_provider": "anthropic",
    "llm_model": "claude-sonnet-4-20250514",
    "llm_api_key": "sk-ant-your-key-here"
  }'

The LLM API key is encrypted at rest. It is never exposed via the GET endpoints.