Doubleword

Set Up Model Pricing

This guide shows you how to configure per-token pricing for your models using tariffs.

Prerequisites

Understanding Tariffs

A tariff defines per-token pricing for a model. Each model can have multiple tariffs for different purposes:

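| Purpose    | Description                   | Limit                        |
|------------|-------------------------------|------------------------------|
| Realtime   | Standard API requests         | One per model                |
| Batch      | Asynchronous batch processing | One per SLA (e.g., 24h, 1h)  |
| Playground | Dashboard testing             | One per model                |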

Models without tariffs are free to use.
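
The limits in the table can be checked client-side before sending a tariff list. The following is a minimal sketch, not the server's validation logic: `find_limit_violations` is a hypothetical helper, and it assumes tariffs are dictionaries using the `api_key_purpose` and `completion_window` fields described later in this guide.

```python
from collections import Counter


def find_limit_violations(tariffs: list[dict]) -> list[str]:
    """Return one message per purpose/SLA slot that holds more than one tariff."""
    slots = Counter()
    for tariff in tariffs:
        purpose = tariff.get("api_key_purpose")
        # Batch tariffs are limited per SLA, so the window is part of the slot key.
        # For other purposes the limit is one per model, so the window is ignored.
        window = tariff.get("completion_window") if purpose == "batch" else None
        slots[(purpose, window)] += 1
    return [
        f"{count} tariffs for {purpose}" + (f" with SLA {window}" if window else "")
        for (purpose, window), count in slots.items()
        if count > 1
    ]
```

An empty result means the list respects the one-per-model and one-per-SLA limits; two batch tariffs with different completion windows are allowed, two realtime tariffs are not.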

Set Pricing via the Dashboard

  1. Go to Models in the sidebar
  2. Click on the model you want to price
  3. Click Manage Pricing Tariffs
  4. Click + Add Tariff
  5. Fill in the pricing details:
    • Name: Descriptive label (e.g., "Standard Pricing")
    • Purpose: Select realtime, batch, or playground
    • Input price: Cost per 1M input tokens
    • Output price: Cost per 1M output tokens
  6. For batch tariffs, also select the SLA, which is the completion window (for example, "24h")
  7. Click Save Changes

Pricing takes effect immediately.

Set Pricing via the API

Include a tariffs array when creating or updating a model:

curl -X PATCH "https://your-instance/admin/api/v1/models/{model-id}" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tariffs": [
      {
        "name": "Realtime Pricing",
        "input_price_per_token": "0.000003",
        "output_price_per_token": "0.000015",
        "api_key_purpose": "realtime"
      }
    ]
  }'
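
The same request can be prepared from a script. This is a sketch using only the Python standard library; `build_tariff_update` is a hypothetical helper, and the base URL, model ID, and API key are placeholders you supply. It only builds the request, it does not send it.

```python
import json
import urllib.request


def build_tariff_update(base_url: str, model_id: str, api_key: str,
                        tariffs: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets a model's tariffs."""
    body = json.dumps({"tariffs": tariffs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/admin/api/v1/models/{model_id}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; error handling and response parsing are left out of this sketch.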

Tariff Fields

| Field                  | Required   | Description                           |
|------------------------|------------|---------------------------------------|
| name                   | Yes        | Descriptive name for the tariff       |
| input_price_per_token  | Yes        | Price per input token (decimal)       |
| output_price_per_token | Yes        | Price per output token (decimal)      |
| api_key_purpose        | No         | "realtime", "batch", or "playground"  |
| completion_window      | Batch only | SLA like "24h" or "1h"                |

Note: Prices are stored per token with 8 decimal places. The dashboard displays prices per 1M tokens for readability. To convert a dashboard price to an API price, divide by 1,000,000: $3.00 per 1M tokens = $0.000003 per token.
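
The conversion is easy to get wrong by a few zeros, so it can help to compute it rather than type it. A minimal sketch, assuming prices are passed as decimal strings; `per_million_to_per_token` is a hypothetical helper, and the rounding mode is an assumption since only the 8-decimal precision is stated.

```python
from decimal import Decimal, ROUND_HALF_UP

TOKENS_PER_MILLION = Decimal(1_000_000)
EIGHT_PLACES = Decimal("0.00000001")  # storage precision stated in the note


def per_million_to_per_token(price_per_million: str) -> str:
    """Convert a per-1M-token price into a per-token decimal string."""
    per_token = Decimal(price_per_million) / TOKENS_PER_MILLION
    # Round to 8 decimal places (rounding mode is an assumption).
    rounded = per_token.quantize(EIGHT_PLACES, rounding=ROUND_HALF_UP)
    # Drop trailing zeros and avoid scientific notation in the output.
    return format(rounded.normalize(), "f")


print(per_million_to_per_token("3.00"))  # prints 0.000003
```

Using `Decimal` rather than `float` keeps the result exact, which matters when the value ends up in billing records.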

Example: Pricing with Multiple Tiers

Configure different prices for realtime, batch, and playground:

{
  "tariffs": [
    {
      "name": "Realtime",
      "input_price_per_token": "0.00003",
      "output_price_per_token": "0.00006",
      "api_key_purpose": "realtime"
    },
    {
      "name": "Batch 24h",
      "input_price_per_token": "0.000015",
      "output_price_per_token": "0.00003",
      "api_key_purpose": "batch",
      "completion_window": "24h"
    },
    {
      "name": "Batch 1h (Express)",
      "input_price_per_token": "0.000025",
      "output_price_per_token": "0.00005",
      "api_key_purpose": "batch",
      "completion_window": "1h"
    },
    {
      "name": "Playground (Free)",
      "input_price_per_token": "0",
      "output_price_per_token": "0",
      "api_key_purpose": "playground"
    }
  ]
}

This configuration:

  • Charges full price for realtime API usage
  • Offers a 50% discount for 24-hour batch jobs
  • Charges near-realtime rates (about 17% below realtime) for express 1-hour batch jobs
  • Makes playground testing free
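
To double-check discount claims like these against the numbers in a tariff list, the percentages can be computed directly. This is an illustrative sketch; `discount_vs_realtime` is a hypothetical helper that assumes the list contains exactly one realtime tariff with a non-zero price.

```python
from decimal import Decimal


def discount_vs_realtime(tariffs: list[dict],
                         field: str = "input_price_per_token") -> dict:
    """Map each tariff name to its percentage discount relative to realtime."""
    realtime = next(t for t in tariffs if t.get("api_key_purpose") == "realtime")
    base = Decimal(realtime[field])
    return {
        t["name"]: ((1 - Decimal(t[field]) / base) * 100).quantize(Decimal("0.1"))
        for t in tariffs
    }
```

Applied to the input prices of the four tariffs in the example, this yields 0.0 for Realtime, 50.0 for Batch 24h, 16.7 for Batch 1h (Express), and 100.0 for Playground (Free).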

Updating Prices

When you update tariffs, the system:

  1. Closes old tariffs by setting their end date to now
  2. Creates new tariffs effective immediately
  3. Preserves historical pricing for accurate transaction records

Old transactions are charged at the rate that was active when they occurred. New transactions use the updated rates.
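
The close-and-replace behavior can be pictured with a small in-memory model. This is only an illustration of the idea, not the system's actual storage or code: the `TariffVersion` class, the `valid_from` and `valid_until` field names, and both functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass
class TariffVersion:
    input_price_per_token: Decimal
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None means the tariff is still active


def update_tariff(history: list, new_price: Decimal, now: datetime) -> None:
    """Close any active version at `now`, then add the new version."""
    for version in history:
        if version.valid_until is None:
            version.valid_until = now
    history.append(TariffVersion(new_price, valid_from=now))


def price_at(history: list, when: datetime) -> Optional[Decimal]:
    """Return the price that was active at `when`, or None if no tariff applied."""
    for version in history:
        ended = version.valid_until is not None and when >= version.valid_until
        if version.valid_from <= when and not ended:
            return version.input_price_per_token
    return None
```

Because closed versions are kept rather than overwritten, a transaction timestamp can always be matched to the rate that was in force when it occurred.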

Viewing Current Tariffs

To see a model's current pricing via the API, add include=pricing to the query string:

curl "https://your-instance/admin/api/v1/models/{model-id}?include=pricing" \
  -H "Authorization: Bearer $API_KEY"

The response includes a tariffs array with all active tariffs.
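
Since the API works in per-token prices, a small formatter can turn the response back into the per-1M figures the dashboard shows. This sketch assumes the tariffs in the response use the same field names as the request body, which this guide does not confirm; `summarize_tariffs` is a hypothetical helper that takes the already-parsed JSON response.

```python
from decimal import Decimal


def summarize_tariffs(model: dict) -> list[str]:
    """Render each tariff in a parsed model response as a per-1M-token summary."""
    lines = []
    for tariff in model.get("tariffs", []):
        # str() guards against prices arriving as JSON numbers instead of strings.
        inp = Decimal(str(tariff["input_price_per_token"])) * 1_000_000
        out = Decimal(str(tariff["output_price_per_token"])) * 1_000_000
        purpose = tariff.get("api_key_purpose", "unspecified")
        window = tariff.get("completion_window")
        label = f"{purpose} ({window})" if window else purpose
        lines.append(
            f"{tariff['name']}: {label}, input ${inp:.2f}/1M, output ${out:.2f}/1M"
        )
    return lines
```

A model with no tariffs produces an empty list, matching the rule that models without tariffs are free to use.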

Removing Pricing

To make a model free, update it with an empty tariffs array:

curl -X PATCH "https://your-instance/admin/api/v1/models/{model-id}" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"tariffs": []}'

This closes all active tariffs. Requests to the model will no longer incur charges.