Billing

Per-token pricing in EUR, monthly allowances, and spending caps you control.

What gets billed

Only requests through Servada-managed upstreams are billed. If your org brings its own provider keys, you pay the provider directly and those requests do not appear on your Servada invoice.

Billing is in EUR. Only successful requests contribute to cost — errored and aborted requests are never charged.

Everything below is visible and controllable from the Billing page in the portal: current-period spend, invoices, allowances, and your organisation's overage cap. No support ticket is required for day-to-day changes.

Pricing

Each model has three prices, quoted in EUR per million tokens:

input (EUR / Mtok)
Regular prompt tokens.
cached_input (EUR / Mtok)
Prompt tokens served from the provider cache. Typically much cheaper.
output (EUR / Mtok)
Completion tokens (including reasoning / thinking tokens).

The price of a request is frozen at the time of the call, so later price changes don't alter your historical invoices.
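
As a rough illustration of how the three prices combine into the cost of one request, here is a minimal sketch. The price values, the `ModelPrices` and `request_cost_eur` names, and the assumption that `input_tokens` excludes cached tokens are all hypothetical; Servada's actual rounding and accounting rules are not described here.

```python
from dataclasses import dataclass
from decimal import Decimal

MTOK = Decimal(1_000_000)  # prices are quoted per million tokens


@dataclass(frozen=True)
class ModelPrices:
    """EUR per million tokens. Frozen, like the price captured at call time."""
    input: Decimal
    cached_input: Decimal
    output: Decimal


def request_cost_eur(prices: ModelPrices, input_tokens: int,
                     cached_input_tokens: int, output_tokens: int) -> Decimal:
    # Assumption: input_tokens counts only non-cached prompt tokens.
    return (
        prices.input * input_tokens
        + prices.cached_input * cached_input_tokens
        + prices.output * output_tokens
    ) / MTOK


# Made-up prices, for illustration only.
prices = ModelPrices(Decimal("2.00"), Decimal("0.50"), Decimal("8.00"))
cost = request_cost_eur(prices, input_tokens=12_000,
                        cached_input_tokens=40_000, output_tokens=1_500)
print(cost)
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.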

Your monthly plan

A plan is a flat monthly fee plus an allowance. Usage within the allowance is covered by the flat fee; anything beyond it is “overage” and billed on top.

flat fee (EUR / month)
Base monthly charge.
allowance (tokens or EUR)
How much usage the flat fee includes.
```text
total_eur = flat_fee + overage_cost_eur
```
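
A worked example of the formula above, as a sketch. It assumes an allowance denominated in EUR and that overage is simply the usage beyond the allowance; the function name and the figures are made up for illustration.

```python
from decimal import Decimal


def monthly_total_eur(flat_fee: Decimal, allowance_eur: Decimal,
                      usage_eur: Decimal) -> Decimal:
    # Assumption: overage is the part of usage that exceeds the allowance.
    overage_cost_eur = max(Decimal("0"), usage_eur - allowance_eur)
    return flat_fee + overage_cost_eur


# Usage within the allowance: only the flat fee is due.
within = monthly_total_eur(Decimal("20"), Decimal("10"), Decimal("7.50"))

# Usage beyond the allowance: 3.25 EUR of overage is billed on top.
over = monthly_total_eur(Decimal("20"), Decimal("10"), Decimal("13.25"))

print(within, over)
```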

Spending caps

You choose what happens when the allowance runs out:

Stop at allowance (mode)
Requests are rejected with HTTP 402 once the allowance is used.
Allow overage (mode)
Requests continue up to the overage cap you set. Overage is billed separately.

Organisation administrators can switch mode and set the overage cap from Billing in the portal. The cap can be raised at any time; requests resume immediately.

One request may briefly cross the cap

The cap is checked before each request based on total spend so far. A single in-flight request can push you slightly over — we don't pre-reject based on guesses about its output size. The next request is then rejected.
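
The behaviour described above can be modelled with a small simulation. This is only a sketch of the idea, not the gateway's implementation: the `CapGate` class is hypothetical, and it assumes a request is admitted while recorded spend is strictly below the cap.

```python
from decimal import Decimal


class CapGate:
    """Toy model: admission looks only at spend recorded so far."""

    def __init__(self, cap_eur: Decimal) -> None:
        self.cap_eur = cap_eur
        self.spent_eur = Decimal("0")

    def admit(self) -> bool:
        # Checked before the request runs; its cost is not yet known.
        return self.spent_eur < self.cap_eur

    def record(self, cost_eur: Decimal) -> None:
        # Called after a successful request, once its cost is known.
        self.spent_eur += cost_eur


gate = CapGate(Decimal("10.00"))
gate.record(Decimal("9.99"))      # spend so far, just under the cap

first = gate.admit()              # admitted: 9.99 < 10.00
gate.record(Decimal("0.0225"))    # this request pushes spend past the cap
second = gate.admit()             # rejected: 10.0125 >= 10.00

print(first, second, gate.spent_eur)
```

The overshoot is bounded by the cost of the one request that was in flight when the cap was crossed.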

HTTP 402 — cap reached

When the cap is reached, billable requests return:

```json
{
  "type": "billing_cap_exceeded",
  "code": 402,
  "error": "Monthly spending cap reached.",
  "request_id": "uuid",
  "current_eur": 10.0125,
  "cap_eur": 10.00,
  "allowance_eur": 10.00,
  "overage_cap_eur": 0
}
```

Raise the overage cap from the portal (or contact your account manager if the ceiling is too low) and the next request goes through.
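
On the client side, the response body can be inspected to decide what to tell an operator. The following is a minimal sketch using only the field names shown in the sample body; the `summarize_cap_error` helper is hypothetical, and reading `overage_cap_eur` of 0 as "no overage configured" is an assumption.

```python
import json
from decimal import Decimal
from typing import Optional


def summarize_cap_error(status: int, body: str) -> Optional[dict]:
    """Return a short summary for a billing-cap 402, or None for anything else."""
    if status != 402:
        return None
    # Parse numbers as Decimal so EUR amounts are not subject to float rounding.
    payload = json.loads(body, parse_float=Decimal, parse_int=Decimal)
    if payload.get("type") != "billing_cap_exceeded":
        return None
    return {
        "request_id": payload["request_id"],
        "over_by_eur": payload["current_eur"] - payload["cap_eur"],
        # Assumption: an overage cap of 0 means no overage budget is configured.
        "overage_cap_set": payload["overage_cap_eur"] > 0,
    }


sample_body = """{
  "type": "billing_cap_exceeded",
  "code": 402,
  "error": "Monthly spending cap reached.",
  "request_id": "uuid",
  "current_eur": 10.0125,
  "cap_eur": 10.00,
  "allowance_eur": 10.00,
  "overage_cap_eur": 0
}"""

summary = summarize_cap_error(402, sample_body)
print(summary)
```

Retrying a 402 automatically is unlikely to help, since requests resume only after the cap is raised.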

Invoices

An invoice is produced for each calendar month. It lists one row per model with token counts and cost, split into within-allowance and overage portions. You can view and download invoices from Billing in the portal.

The billing dashboard also shows live current-month spend, a progress bar against your cap, and alerts as you approach it.

See also

  • Usage Accounting — how tokens are counted per request.
  • Errors — every HTTP status code the gateway can return.