Billing
Per-token pricing in EUR, monthly allowances, and spending caps you control.
What gets billed
Only requests through Servada-managed upstreams are billed. If your org brings its own provider keys, you pay the provider directly and those requests do not appear on your Servada invoice.
Billing is in EUR. Only successful requests contribute to cost — errored and aborted requests are never charged.
Everything below is visible and controllable from the Billing page in the portal: current-period spend, invoices, allowances, and your organisation's overage cap. No support ticket is required for day-to-day changes.
Pricing
Each model has three prices, quoted in EUR per million tokens:
input (EUR / Mtok): Regular prompt tokens.
cached_input (EUR / Mtok): Prompt tokens served from the provider cache. Typically much cheaper.
output (EUR / Mtok): Completion tokens (including reasoning / thinking tokens).
The price of a request is frozen at the time of the call, so later price changes don't alter your historical invoices.
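As a rough illustration of how the three per-million-token prices combine into a request cost, here is a minimal sketch. The function name, the price values, and the assumption that the three token counts are disjoint (cached prompt tokens are not also counted as regular input) are hypothetical and not taken from the Servada price list.

```python
from decimal import Decimal

MTOK = Decimal(1_000_000)  # prices are quoted per million tokens


def request_cost_eur(input_tokens, cached_input_tokens, output_tokens, prices):
    """Return the cost of one request in EUR.

    Assumption: the three token counts do not overlap.
    """
    total = (
        Decimal(input_tokens) * prices["input"]
        + Decimal(cached_input_tokens) * prices["cached_input"]
        + Decimal(output_tokens) * prices["output"]
    )
    return total / MTOK


# Hypothetical prices in EUR / Mtok, for illustration only.
prices = {
    "input": Decimal("2.00"),
    "cached_input": Decimal("0.50"),
    "output": Decimal("8.00"),
}

cost = request_cost_eur(1200, 800, 300, prices)
print(cost)  # 0.0052 EUR for this request
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift.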
Your monthly plan
A plan is a flat monthly fee plus an allowance. Usage within the allowance is covered by the flat fee; anything beyond it is “overage” and billed on top.
flat fee (EUR / month): Base monthly charge.
allowance (tokens or EUR): How much usage the flat fee includes.
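The split between allowance and overage in a monthly plan can be sketched as follows. This assumes an allowance expressed in EUR (the plan may also express it in tokens) and uses hypothetical names and amounts; it is an illustration, not the billing implementation.

```python
from decimal import Decimal


def monthly_total_eur(flat_fee_eur, allowance_eur, usage_eur):
    """Flat fee plus whatever usage exceeds the allowance.

    Usage inside the allowance is covered by the flat fee,
    so overage can never be negative.
    """
    overage_cost_eur = max(usage_eur - allowance_eur, Decimal("0"))
    return flat_fee_eur + overage_cost_eur


# Hypothetical plan: 20 EUR flat fee including 10 EUR of usage.
within = monthly_total_eur(Decimal("20"), Decimal("10"), Decimal("7.00"))
beyond = monthly_total_eur(Decimal("20"), Decimal("10"), Decimal("12.50"))
print(within)  # 20: usage stayed within the allowance
print(beyond)  # 22.50: 2.50 EUR of overage billed on top
```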
total_eur = flat_fee + overage_cost_eur
Spending caps
You choose what happens when the allowance runs out:
Stop at allowance (mode): Requests are rejected with HTTP 402 once the allowance is used.
Allow overage (mode): Requests continue up to the overage cap you set. Billed separately.
Organisation administrators can switch mode and set the overage cap from Billing in the portal. The cap can be raised at any time; requests resume immediately.
One request may briefly cross the cap
The cap is checked before each request based on total spend so far. A single in-flight request can push you slightly over — we don't pre-reject based on guesses about its output size. The next request is then rejected.
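To make the "checked before each request" behaviour concrete, here is a small simulation. The class and method names are hypothetical, and two details are assumptions: that a request is rejected when spend is equal to or above the cap, and that the cap equals the allowance plus the overage cap (which is consistent with the sample 402 body on this page).

```python
from decimal import Decimal


class CapGate:
    """Toy model of a pre-request spending check (not the real gateway)."""

    def __init__(self, cap_eur):
        self.cap_eur = cap_eur
        self.spent_eur = Decimal("0")

    def admit(self):
        # Only spend recorded so far is considered; the cost of the
        # request about to run is unknown and is not estimated.
        return self.spent_eur < self.cap_eur

    def record(self, cost_eur):
        # Called after a successful request with its actual cost.
        self.spent_eur += cost_eur


gate = CapGate(Decimal("10.00"))
gate.record(Decimal("9.99"))      # earlier usage this month

first_admitted = gate.admit()     # True: 9.99 is still below the cap
gate.record(Decimal("0.0225"))    # this request pushes spend to 10.0125

second_admitted = gate.admit()    # False: the next request is rejected
print(first_admitted, second_admitted, gate.spent_eur)
```

The final spend of 10.0125 EUR against a 10.00 EUR cap shows how a single in-flight request can leave the total slightly over the cap.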
HTTP 402 — cap reached
When the cap is reached, billable requests return:
{
"type": "billing_cap_exceeded",
"code": 402,
"error": "Monthly spending cap reached.",
"request_id": "uuid",
"current_eur": 10.0125,
"cap_eur": 10.00,
"allowance_eur": 10.00,
"overage_cap_eur": 0
}
Raise the overage cap from the portal (or contact your account manager if the ceiling is too low) and the next request goes through.
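On the client side, a cap error can be recognised from the status code and the `type` field of the body. The sketch below only parses a response that has already been received; the function name is hypothetical, and the field names are the ones shown in the sample body above.

```python
import json


def describe_cap_error(status, body):
    """Return a readable message for a cap error, or None otherwise."""
    if status != 402:
        return None
    payload = json.loads(body)
    if payload.get("type") != "billing_cap_exceeded":
        return None
    return (
        f"Spending cap of {payload['cap_eur']:.2f} EUR reached "
        f"(current spend {payload['current_eur']:.4f} EUR, "
        f"request {payload['request_id']})"
    )


# The sample body from this page, as a raw JSON string.
sample_body = """{
  "type": "billing_cap_exceeded",
  "code": 402,
  "error": "Monthly spending cap reached.",
  "request_id": "uuid",
  "current_eur": 10.0125,
  "cap_eur": 10.00,
  "allowance_eur": 10.00,
  "overage_cap_eur": 0
}"""

message = describe_cap_error(402, sample_body)
print(message)
```

Retrying automatically is unlikely to help here, since requests keep being rejected until the cap is raised; surfacing the message to an administrator is usually the more useful reaction.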
Invoices
An invoice is produced for each calendar month. It lists one row per model with token counts and cost, split into within-allowance and overage portions. You can view and download invoices from Billing in the portal.
The billing dashboard also shows live current-month spend, a progress bar against your cap, and alerts as you approach it.
See also
- Usage Accounting — how tokens are counted per request.
- Errors — every HTTP status code the gateway can return.