Usage Accounting

What we count per scope, how those counts are exposed, and how to read the request history for detail.

Two views of usage

You can query your own usage in two complementary ways:

  • Live totals — aggregated counters per period, optimised for quick dashboards and budget checks. Exposed via /v1/account/usage.
  • Request history — one entry per request, with full detail of what was sent, how long it took, and what was counted. Exposed via /v1/account/history.

Metrics per scope

Each scope counts what matters for that service — token counts for completions, characters for text-to-speech, and so on.

Scope          Metrics
completions    tokens, prompt_tokens, completion_tokens
language       requests, corrections
pii            requests, entities_found, entities_redacted, replacements_made
prompt_shield  requests, detections_count
stt            requests, audio_duration_seconds
tts            requests, characters_synthesised, output_audio_seconds

Live totals: /v1/account/usage

GET /v1/account/usage

Aggregated counters for the authenticated user, grouped per scope and per model.

Request

  • period (string): minute, day (default), week or month. Defines the time window.
  • scope (string): Optional filter, e.g. completions.
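
A minimal sketch of assembling this query with the Python standard library. The host https://api.example.com and the helper name build_usage_url are assumptions for illustration only; authentication is left out because this guide does not describe it.

```python
from typing import Optional
from urllib.parse import urlencode

# Period values listed in this guide; "day" is the documented default.
VALID_PERIODS = ("minute", "day", "week", "month")


def build_usage_url(base_url: str, period: str = "day", scope: Optional[str] = None) -> str:
    """Return the full URL for GET /v1/account/usage.

    base_url is a placeholder; substitute your real API host.
    """
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}, got {period!r}")
    params = {"period": period}
    if scope is not None:
        params["scope"] = scope  # optional filter, e.g. "completions"
    return f"{base_url.rstrip('/')}/v1/account/usage?{urlencode(params)}"


print(build_usage_url("https://api.example.com", period="week", scope="completions"))
```

Validating period on the client side is a design choice of this sketch, not documented server behaviour; it simply fails early on a typo.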

Response · 200

json
{
  "period": "day",
  "scopes": {
    "completions": {
      "tokens": 5200,
      "prompt_tokens": 3100,
      "completion_tokens": 2100
    }
  },
  "models": {
    "qwen3.5-35b": {
      "tokens": 5200,
      "prompt_tokens": 3100,
      "completion_tokens": 2100
    }
  }
}

Reading the response

  • scopes is keyed by scope name. Each value is a metric-count map matching the table above.
  • models is keyed by model ID. Only completions traffic contributes, because completions is the only scope that tracks per-model counts.
  • Empty {} objects mean no usage was recorded in that period. Counters for older periods are eventually dropped, so for hard audit records, query the request history instead.
  • Note: for the completions scope, a requests count is not part of these totals — use the request history for that. For language, pii, stt and tts it is counted here.
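
The points above can be sketched as a small reader for the response body. The sample is copied from the example response in this guide; the helper names scope_metric and tokens_by_model and the budget figure are hypothetical, chosen only to illustrate a budget check.

```python
import json

# Response body copied from the example in this guide.
SAMPLE_USAGE = json.loads("""
{
  "period": "day",
  "scopes": {
    "completions": {"tokens": 5200, "prompt_tokens": 3100, "completion_tokens": 2100}
  },
  "models": {
    "qwen3.5-35b": {"tokens": 5200, "prompt_tokens": 3100, "completion_tokens": 2100}
  }
}
""")


def scope_metric(usage: dict, scope: str, metric: str) -> int:
    """Return one counter, treating an absent or empty scope as zero usage."""
    return usage.get("scopes", {}).get(scope, {}).get(metric, 0)


def tokens_by_model(usage: dict) -> dict:
    """Map each model ID to its total token count."""
    return {model_id: counts.get("tokens", 0)
            for model_id, counts in usage.get("models", {}).items()}


# DAILY_TOKEN_BUDGET is a made-up number for illustration only.
DAILY_TOKEN_BUDGET = 10_000
used = scope_metric(SAMPLE_USAGE, "completions", "tokens")
print(f"{used}/{DAILY_TOKEN_BUDGET} tokens used, over budget: {used > DAILY_TOKEN_BUDGET}")
print(tokens_by_model(SAMPLE_USAGE))
```

Defaulting to zero keeps dashboard code from failing when a scope comes back as an empty object or is missing entirely.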

Request history: /v1/account/history

GET /v1/account/history

Paginated per-request audit log for the authenticated user.

Request

  • page (number): Page number (1-based). Default 1.
  • per_page (number): Page size. Default 50, max 100.
  • scope (string): Filter by scope.
  • model (string): Filter by model ID.
  • status (string): success or error.

Response · 200

json
{
  "data": [
    {
      "id": "uuid",
      "user_id": "uuid",
      "scope": "completions",
      "model_id": "qwen3.5-35b",
      "endpoint": "/v1/chat/completions",
      "usage": {
        "prompt_tokens": 200,
        "completion_tokens": 150,
        "processors": {
          "anonymise": { "entities_redacted": 2 }
        }
      },
      "latency_ms": 870,
      "status": "success",
      "stream": false,
      "timestamp": "2026-04-15T14:30:00.000Z"
    }
  ],
  "meta": {
    "total": 200,
    "per_page": 50,
    "current_page": 1,
    "last_page": 4
  }
}
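
One way to walk the whole history is to request pages until meta.last_page is reached. The sketch below assumes a caller-supplied fetch_page function (hypothetical) that performs the HTTP request and returns the parsed body; fake_fetch is an in-memory stand-in so the example runs offline.

```python
from typing import Callable, Dict, Iterator

# A caller-supplied function (hypothetical) that sends GET /v1/account/history
# with the given query parameters and returns the parsed JSON body.
FetchPage = Callable[[Dict[str, object]], dict]


def iter_history(fetch_page: FetchPage, per_page: int = 50, **filters: str) -> Iterator[dict]:
    """Yield every history entry by following meta.last_page."""
    # The guide documents a maximum of 100; the lower bound of 1 is an assumption.
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    page = 1
    while True:
        body = fetch_page({"page": page, "per_page": per_page, **filters})
        yield from body["data"]
        if page >= body["meta"]["last_page"]:
            break
        page += 1


def fake_fetch(params: Dict[str, object]) -> dict:
    """In-memory stand-in for the real HTTP call: serves five made-up entries."""
    entries = [{"id": f"req-{n}", "status": "success"} for n in range(5)]
    size = int(params["per_page"])
    start = (int(params["page"]) - 1) * size
    last_page = -(-len(entries) // size)  # ceiling division
    return {
        "data": entries[start:start + size],
        "meta": {"total": len(entries), "per_page": size,
                 "current_page": params["page"], "last_page": last_page},
    }


print([entry["id"] for entry in iter_history(fake_fetch, per_page=2)])
```

Filters such as scope, model or status can be passed as keyword arguments, for example iter_history(fetch_page, scope="completions", status="error").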

Shape of the usage object per scope

The usage field on a request history entry reflects what was counted for that specific call. The shape varies by scope:

completions

json
{
  "prompt_tokens": 200,
  "completion_tokens": 150,
  "processors": {
    "anonymise": { "entities_redacted": 2 }
  }
}

The processors key is present only when at least one processor ran on the request. Each processor adds its own keys under that object; see the Processors guide for the per-processor shape.

Some processors also persist forensic fields at the top level of usage (outside processors) on rejected requests — for example prompt_shield_detections. These are server-side only and never echoed to the client.

language

json
{ "requests": 1, "corrections": 2 }

pii (analyse)

json
{ "requests": 1, "entities_found": 3 }

pii (redact / restore)

json
{ "requests": 1, "entities_redacted": 2, "replacements_made": 2 }

stt

json
{ "requests": 1, "audio_duration_seconds": 4.2 }

tts

json
{ "requests": 1, "characters_synthesised": 45, "output_audio_seconds": 3.2 }
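
Because the usage shape differs per scope, code that totals history entries has to tolerate missing keys. The sketch below sums the numeric counters per scope and, since the completions shape shown above has no requests field, counts each such entry as one request (the history holds one entry per request). The function name summarise and the sample entries are illustrative, not part of the API.

```python
from collections import defaultdict
from typing import Dict, Iterable


def summarise(entries: Iterable[dict]) -> Dict[str, Dict[str, float]]:
    """Sum the numeric usage counters of history entries, grouped by scope."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for entry in entries:
        scope_totals = totals[entry["scope"]]
        usage = entry.get("usage", {})
        for key, value in usage.items():
            # Skip nested objects such as "processors" and any non-numeric fields.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                scope_totals[key] += value
        # Entries without a "requests" counter (the completions shape shown
        # in this guide) are counted as one request each.
        if "requests" not in usage:
            scope_totals["requests"] += 1
    return {scope: dict(counts) for scope, counts in totals.items()}


# Made-up entries that follow the shapes documented above.
sample_entries = [
    {"scope": "completions", "usage": {"prompt_tokens": 200, "completion_tokens": 150,
                                       "processors": {"anonymise": {"entities_redacted": 2}}}},
    {"scope": "completions", "usage": {"prompt_tokens": 100, "completion_tokens": 50}},
    {"scope": "stt", "usage": {"requests": 1, "audio_duration_seconds": 4.2}},
]
print(summarise(sample_entries))
```

This also gives a requests count for the completions scope, which the live totals do not include.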

See also

  • Rate Limits — how these counts are used to enforce limits.
  • Processors — each processor adds to its own scope's usage.