
Pricing & Billing

Agent Endpoints support per-run pricing and two authentication modes. This page covers how costs are calculated and charged.

Per-Run Pricing

The price_per_run_cents field lets you set a fixed charge per execution. This amount is deducted from the endpoint owner’s wallet after each successful run.

curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "invoice-parser",
    "workspace_id": "ws_abc123",
    "system_prompt": "Parse the invoice and extract structured data.",
    "output_format": "json",
    "price_per_run_cents": 5
  }'

In this example, each execution deducts 5 cents ($0.05) from the endpoint owner’s wallet.

The per-run charge is separate from the underlying LLM token cost. The token cost is always calculated and reported in the usage.cost_cents field, but it reflects the platform cost rather than an additional charge on top of the per-run price.

Pricing Limits

| Parameter           | Min | Max           | Default  |
|---------------------|-----|---------------|----------|
| price_per_run_cents | 0   | 10000 ($100)  | 0 (free) |

How Charging Works

  1. The agent executes and produces a result
  2. If price_per_run_cents > 0, the charge is deducted from the owner’s wallet
  3. The deduction happens asynchronously after the response is sent
  4. If the owner has insufficient balance, the run still succeeds but the charge may not be collected
⚠️ There is currently no pre-execution balance check. Runs will succeed even if the owner’s wallet does not have enough funds. The charge is attempted but will not block execution.
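The charging sequence above can be sketched as follows. The function and wallet representation are hypothetical; only the ordering and the "insufficient balance does not block the run" behavior come from this page.

```python
# Sketch of the post-execution charging flow. Names are illustrative,
# not the platform's actual implementation.

def run_endpoint(price_per_run_cents: int, wallet_balance_cents: int):
    # 1. The agent executes and produces a result.
    result = "...agent output..."

    # 2-3. After the response is sent, the charge is attempted asynchronously.
    charged = 0
    if price_per_run_cents > 0:
        if wallet_balance_cents >= price_per_run_cents:
            wallet_balance_cents -= price_per_run_cents
            charged = price_per_run_cents
        # 4. Insufficient balance: the run still succeeds, charge uncollected.

    return result, charged, wallet_balance_cents

# Owner has 3 cents, price is 5 cents: the run succeeds, nothing is collected.
result, charged, balance = run_endpoint(5, 3)
print(charged, balance)  # 0 3
```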

Cost Breakdown

Every execution returns a usage object with the full cost breakdown:

{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
| Field         | Description                                                                  |
|---------------|------------------------------------------------------------------------------|
| tokens_input  | Total input tokens across all LLM calls in the agentic loop                   |
| tokens_output | Total output tokens across all LLM calls                                      |
| cost_cents    | Platform cost based on model pricing with markup                              |
| latency_ms    | Total wall-clock time from request to response                                |
| iterations    | Number of LLM calls made (1 = no tool calls, 2+ = tool calls triggered re-inference) |
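As a quick illustration, a client can derive convenience metrics from the usage object shown above. The field names come from the response; the derived metrics (dollar conversion, cost per 1K tokens) are illustrative, not part of the API.

```python
import json

# Parse the example response from above and derive a few metrics.
response = json.loads("""
{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
""")

usage = response["usage"]
total_tokens = usage["tokens_input"] + usage["tokens_output"]
cost_dollars = usage["cost_cents"] / 100
cost_per_1k_tokens_cents = usage["cost_cents"] / total_tokens * 1000

print(total_tokens)                         # 1434
print(round(cost_dollars, 4))               # 0.0034
print(round(cost_per_1k_tokens_cents, 3))   # 0.237
```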

Token Cost Calculation

Token costs depend on the model used:

| Model                    | Approximate Cost               |
|--------------------------|--------------------------------|
| gpt-4o-mini              | Low (best value for most tasks)|
| gpt-4o                   | Medium                         |
| claude-sonnet-4-20250514 | Medium                         |
| gemini-2.0-flash         | Low                            |
| groq/llama-3.3-70b       | Low                            |

Bring your own LLM API key to use your provider’s pricing: when you provide an llm_api_key, you pay the provider directly for tokens. The cost_cents field still reports the estimated cost for observability.

Auth Modes

Agent Endpoints support two authentication modes that affect both security and rate limiting.

API Key (Default)

curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "Authorization: Bearer aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'

Or using the X-API-Key header:

curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "X-API-Key: aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
  • An API key is generated on endpoint creation (prefixed aek_)
  • The raw key is shown once at creation — store it securely
  • Rate limit: 60 requests per minute per endpoint
  • Use POST /:id/regenerate-key to rotate the key

Public (No Auth)

curl -X POST https://api.aerostack.dev/api/run/my-public-agent \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
  • No authentication required
  • Anyone with the URL can call the endpoint
  • Additional per-IP rate limiting: 20 requests per minute per IP per endpoint
  • Standard per-endpoint rate limit still applies (60/min)
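The layered limits above can be sketched with a sliding-window counter: a 60/min cap per endpoint plus, for public endpoints, a 20/min cap per IP. The data structures and function names are illustrative; only the limits come from this page.

```python
import time
from collections import defaultdict, deque

# Limits from the docs; the enforcement mechanism below is a sketch.
ENDPOINT_LIMIT, IP_LIMIT, WINDOW = 60, 20, 60.0

endpoint_hits: dict = defaultdict(deque)   # endpoint -> request timestamps
ip_hits: dict = defaultdict(deque)         # (endpoint, ip) -> request timestamps

def allow(endpoint: str, ip: str, public: bool, now: float) -> bool:
    def within_limit(q: deque, limit: int) -> bool:
        while q and now - q[0] >= WINDOW:  # drop hits older than the window
            q.popleft()
        if len(q) >= limit:
            return False
        q.append(now)
        return True

    # Public endpoints get the stricter per-IP check first.
    if public and not within_limit(ip_hits[(endpoint, ip)], IP_LIMIT):
        return False
    return within_limit(endpoint_hits[endpoint], ENDPOINT_LIMIT)

# 20 requests from one IP pass; the 21st in the same window is rejected.
now = time.time()
results = [allow("my-public-agent", "203.0.113.7", True, now) for _ in range(21)]
print(results[-2], results[-1])  # True False
```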

Set the auth mode when creating or updating:

curl -X PATCH https://api.aerostack.dev/api/agent-endpoints/aep_your_id \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"auth_mode": "public"}'
⚠️ Public endpoints are accessible to anyone on the internet. Use them only for non-sensitive operations where you are comfortable with open access. Every run still costs tokens and deducts from the owner’s wallet.

Rate Limits Summary

| Scope               | Limit       | Window   | Applies To              |
|---------------------|-------------|----------|-------------------------|
| Per endpoint        | 60 requests | 1 minute | All auth modes          |
| Per IP per endpoint | 20 requests | 1 minute | public mode only        |
| Admin test runs     | 10 requests | 1 minute | Dashboard test endpoint |

Monitoring Costs

Track costs over time by querying the run history:

curl "https://api.aerostack.dev/api/agent-endpoints/aep_your_id/runs?limit=100" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Each run record includes cost_cents, tokens_input, tokens_output, and latency_ms. Aggregate these in your application or dashboard to monitor spending.
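A minimal aggregation sketch: the record shape below mirrors the fields named above, and a sample list stands in for the paginated response (fetching it with the curl call shown is out of scope here).

```python
# Sample run records with the fields named above; values are illustrative.
runs = [
    {"cost_cents": 0.34, "tokens_input": 1245, "tokens_output": 189, "latency_ms": 2891},
    {"cost_cents": 0.21, "tokens_input": 804,  "tokens_output": 95,  "latency_ms": 1733},
    {"cost_cents": 0.52, "tokens_input": 2011, "tokens_output": 340, "latency_ms": 4120},
]

total_cost_cents = sum(r["cost_cents"] for r in runs)
total_tokens = sum(r["tokens_input"] + r["tokens_output"] for r in runs)
avg_latency_ms = sum(r["latency_ms"] for r in runs) / len(runs)

print(round(total_cost_cents, 2))  # 1.07
print(total_tokens)                # 4684
print(round(avg_latency_ms))       # 2915
```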

API Key Management

Regenerate Key

If a key is compromised or you need to rotate it:

curl -X POST https://api.aerostack.dev/api/agent-endpoints/aep_your_id/regenerate-key \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
The response contains the new key:

{
  "api_key": "aek_new_key_value_shown_once"
}

The old key is immediately invalidated. All consumers must update to the new key.

Bring Your Own LLM Key

To use your own LLM provider account (and avoid platform rate limits on the shared key):

curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-agent",
    "workspace_id": "ws_abc123",
    "system_prompt": "...",
    "llm_provider": "anthropic",
    "llm_model": "claude-sonnet-4-20250514",
    "llm_api_key": "sk-ant-your-key-here"
  }'

The LLM API key is encrypted at rest. It is never exposed via the GET endpoints.