# Pricing & Billing

> Per-run billing, cost breakdown, and auth modes for agent endpoints.

Agent Endpoints support per-run pricing and two authentication modes. This page covers how costs are calculated and charged.

## Per-Run Pricing

The `price_per_run_cents` field lets you set a fixed charge per execution. This amount is deducted from the endpoint owner's wallet after each successful run.

```bash
curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "invoice-parser",
    "workspace_id": "ws_abc123",
    "system_prompt": "Parse the invoice and extract structured data.",
    "output_format": "json",
    "price_per_run_cents": 5
  }'
```

In this example, each execution deducts 5 cents ($0.05) from the endpoint owner's wallet.

The per-run charge is separate from the underlying LLM token cost. The token cost is always calculated and reported in the `usage.cost_cents` field, but it represents the platform's cost for the run and is not billed as an additional charge.

### Pricing Limits

| Parameter | Min | Max | Default |
|-----------|-----|-----|---------|
| `price_per_run_cents` | 0 | 10000 ($100) | 0 (free) |
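
To catch out-of-range values before sending a create or update request, a client-side check along these lines may help. This is a minimal sketch: the helper names `validate_price_per_run_cents` and `cents_to_dollars` are hypothetical and not part of the API, and only the 0 to 10000 range comes from the table above.

```python
MIN_PRICE_CENTS = 0
MAX_PRICE_CENTS = 10_000  # $100, the documented maximum


def validate_price_per_run_cents(value):
    """Return value if it lies in the documented range, else raise ValueError.

    Hypothetical client-side helper; the API performs its own validation.
    """
    if not MIN_PRICE_CENTS <= value <= MAX_PRICE_CENTS:
        raise ValueError(
            f"price_per_run_cents must be between {MIN_PRICE_CENTS} "
            f"and {MAX_PRICE_CENTS}, got {value}"
        )
    return value


def cents_to_dollars(cents):
    """Format a cent amount as a dollar string for display."""
    return f"${cents / 100:.2f}"


print(cents_to_dollars(validate_price_per_run_cents(5)))
```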

### How Charging Works

1. The agent executes and produces a result
2. If `price_per_run_cents > 0`, the charge is deducted from the owner's wallet
3. The deduction happens asynchronously after the response is sent
4. If the owner has insufficient balance, the run still succeeds but the charge may not be collected

There is currently no pre-execution balance check. The charge is attempted after the run and never blocks execution, so runs succeed even if the owner's wallet does not have enough funds.
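
The flow above can be modeled with a small simulation. This is an illustrative sketch, not the platform's implementation: `OwnerWallet`, `run_endpoint`, and `settle_pending` are hypothetical names, and the assumption that a failed charge leaves the balance untouched is ours, since the page only says the charge "may not be collected".

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OwnerWallet:
    balance_cents: int
    pending: deque = field(default_factory=deque)  # charges queued after responses
    uncollected_cents: int = 0  # charges that could not be deducted


def run_endpoint(wallet, price_per_run_cents, agent):
    """Execute the agent and return the response without checking the balance."""
    output = agent()  # step 1: the agent produces a result
    if price_per_run_cents > 0:
        # Steps 2-3: the charge is queued and settled after the response.
        wallet.pending.append(price_per_run_cents)
    return {"output": output}


def settle_pending(wallet):
    """Attempt each queued charge; insufficient funds never raise an error."""
    while wallet.pending:
        charge = wallet.pending.popleft()
        if wallet.balance_cents >= charge:
            wallet.balance_cents -= charge
        else:
            # Step 4: the run already succeeded, the charge is not collected.
            wallet.uncollected_cents += charge


wallet = OwnerWallet(balance_cents=7)
first = run_endpoint(wallet, 5, lambda: "parsed invoice A")
second = run_endpoint(wallet, 5, lambda: "parsed invoice B")
settle_pending(wallet)
print(wallet.balance_cents, wallet.uncollected_cents)
```

Both runs return a response even though the wallet can only cover one of the two 5-cent charges, which is the behavior to plan for when budgeting.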

## Cost Breakdown

Every execution returns a `usage` object with the full cost breakdown:

```json
{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
```

| Field | Description |
|-------|-------------|
| `tokens_input` | Total input tokens across all LLM calls in the agentic loop |
| `tokens_output` | Total output tokens across all LLM calls |
| `cost_cents` | Platform cost based on model pricing with markup |
| `latency_ms` | Total wall-clock time from request to response |
| `iterations` | Number of LLM calls made (1 = no tool calls, 2+ = tool calls triggered re-inference) |
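
As a rough illustration of reading these fields, the sketch below parses the sample response above and derives two values: the total token count and whether tool calls triggered re-inference. The function name `summarize_usage` is hypothetical.

```python
import json


def summarize_usage(response):
    """Derive simple metrics from the documented `usage` object."""
    usage = response["usage"]
    return {
        "total_tokens": usage["tokens_input"] + usage["tokens_output"],
        # Per the table, iterations of 2 or more mean tool calls occurred.
        "tool_calls_triggered": usage["iterations"] > 1,
        "cost_cents": usage["cost_cents"],
    }


sample = json.loads("""
{
  "output": "...",
  "usage": {
    "tokens_input": 1245,
    "tokens_output": 189,
    "cost_cents": 0.34,
    "latency_ms": 2891,
    "iterations": 2
  }
}
""")

summary = summarize_usage(sample)
print(summary)
```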

### Token Cost Calculation

Token cost depends on the model used:

| Model | Approximate Cost |
|-------|-----------------|
| `gpt-4o-mini` | Low (best value for most tasks) |
| `gpt-4o` | Medium |
| `claude-sonnet-4-20250514` | Medium |
| `gemini-2.0-flash` | Low |
| `groq/llama-3.3-70b` | Low |

Bring your own LLM API key to pay your provider's rates: when you set `llm_api_key`, the provider bills you directly for tokens. The `cost_cents` field still reports an estimated cost for observability.

## Auth Modes

Agent Endpoints support two authentication modes that affect both security and rate limiting.

### API Key (Default)

```bash
curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "Authorization: Bearer aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
```

Or using the `X-API-Key` header:

```bash
curl -X POST https://api.aerostack.dev/api/run/my-agent \
  -H "X-API-Key: aek_a1b2c3d4e5f6a7b8c9d0" \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
```

- An API key (prefixed `aek_`) is generated when the endpoint is created
- The raw key is shown only once, at creation, so store it securely
- Rate limit: 60 requests per minute per endpoint
- Use `POST /api/agent-endpoints/:id/regenerate-key` to rotate the key
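
For callers not using curl, the two header styles above could be built in Python roughly as follows. This sketch uses only the standard library and constructs the request without sending it. The function name `build_run_request` and its `use_x_api_key` flag are hypothetical; the URL, headers, and body mirror the curl examples.

```python
import json
import urllib.request

BASE_URL = "https://api.aerostack.dev/api/run"


def build_run_request(slug, api_key, payload, use_x_api_key=False):
    """Build (but do not send) a POST request for an agent endpoint."""
    headers = {"Content-Type": "application/json"}
    if use_x_api_key:
        headers["X-API-Key"] = api_key  # alternative header style
    else:
        headers["Authorization"] = f"Bearer {api_key}"  # default style
    return urllib.request.Request(
        f"{BASE_URL}/{slug}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_run_request(
    "my-agent", "aek_a1b2c3d4e5f6a7b8c9d0", {"input": "..."}
)
# To send it: urllib.request.urlopen(request, timeout=30)
```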

### Public (No Auth)

```bash
curl -X POST https://api.aerostack.dev/api/run/my-public-agent \
  -H "Content-Type: application/json" \
  -d '{"input": "..."}'
```

- No authentication required
- Anyone with the URL can call the endpoint
- Additional per-IP rate limiting: 20 requests per minute per IP per endpoint
- Standard per-endpoint rate limit still applies (60/min)

Set the auth mode with the `auth_mode` field when creating or updating an endpoint:

```bash
curl -X PATCH https://api.aerostack.dev/api/agent-endpoints/aep_your_id \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"auth_mode": "public"}'
```

Public endpoints are accessible to anyone on the internet. Use them only for non-sensitive operations where you are comfortable with open access. Every run still costs tokens and deducts from the owner's wallet.

## Rate Limits Summary

| Scope | Limit | Window | Applies To |
|-------|-------|--------|------------|
| Per endpoint | 60 requests | 1 minute | All auth modes |
| Per IP per endpoint | 20 requests | 1 minute | `public` mode only |
| Admin test runs | 10 requests | 1 minute | Dashboard test endpoint |
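
Clients that want to stay under these limits can pace themselves locally. The sketch below is one possible approach using a sliding window. The page does not say how the server measures its one-minute window, so the sliding-window behavior and the class name `SlidingWindowLimiter` are assumptions made for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._sent = deque()  # timestamps of recently allowed requests

    def try_acquire(self):
        """Return True and record the call if a request may be sent now."""
        now = self._clock()
        # Discard timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False


# Per-endpoint limit from the table: 60 requests per minute.
endpoint_limiter = SlidingWindowLimiter(max_requests=60, window_seconds=60.0)
if endpoint_limiter.try_acquire():
    pass  # safe to send the request here
```

For a `public` endpoint, a second limiter configured with `max_requests=20` could model the per-IP limit, with a request sent only when both limiters allow it.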

## Monitoring Costs

Track costs over time by querying the run history:

```bash
curl "https://api.aerostack.dev/api/agent-endpoints/aep_your_id/runs?limit=100" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

Each run record includes `cost_cents`, `tokens_input`, `tokens_output`, and `latency_ms`. Aggregate these in your application or dashboard to monitor spending.
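
One way to aggregate these fields is sketched below. The exact shape of the run history response is not shown on this page, so the code assumes you have already extracted a list of run records as dictionaries carrying the four fields named above. The function name `summarize_runs` and the sample records are hypothetical.

```python
from statistics import mean


def summarize_runs(runs):
    """Aggregate cost, tokens, and latency over a list of run records."""
    if not runs:
        return {
            "run_count": 0,
            "total_cost_cents": 0.0,
            "total_tokens": 0,
            "avg_latency_ms": 0.0,
        }
    return {
        "run_count": len(runs),
        # Rounded to limit floating-point noise from fractional cents.
        "total_cost_cents": round(sum(r["cost_cents"] for r in runs), 4),
        "total_tokens": sum(r["tokens_input"] + r["tokens_output"] for r in runs),
        "avg_latency_ms": mean(r["latency_ms"] for r in runs),
    }


# Made-up records for illustration only.
example_runs = [
    {"cost_cents": 0.34, "tokens_input": 1245, "tokens_output": 189, "latency_ms": 2891},
    {"cost_cents": 0.16, "tokens_input": 600, "tokens_output": 80, "latency_ms": 1109},
]
report = summarize_runs(example_runs)
print(report)
```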

## API Key Management

### Regenerate Key

If a key is compromised or you need to rotate it:

```bash
curl -X POST https://api.aerostack.dev/api/agent-endpoints/aep_your_id/regenerate-key \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
```

```json
{
  "api_key": "aek_new_key_value_shown_once"
}
```

The old key is immediately invalidated. All consumers must update to the new key.

### Bring Your Own LLM Key

To use your own LLM provider account (and avoid platform rate limits on the shared key):

```bash
curl -X POST https://api.aerostack.dev/api/agent-endpoints \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-agent",
    "workspace_id": "ws_abc123",
    "system_prompt": "...",
    "llm_provider": "anthropic",
    "llm_model": "claude-sonnet-4-20250514",
    "llm_api_key": "sk-ant-your-key-here"
  }'
```

The LLM API key is encrypted at rest. It is never exposed via the GET endpoints.
