Getting Started

Everything you need to make your first request — base URL, tokens, authentication and scopes.

Base URL

All requests are made against a single base URL — the one Servada assigned to your organisation. Ask your administrator for the exact host. The rest of this documentation uses a placeholder:

text
https://api.inferada.com

Service tokens vs. session tokens

There are two kinds of bearer tokens. You can tell them apart by their prefix:

  • inf_… Service API tokens. Use these from your applications. They call the service endpoints (completions, language, PII, audio) and are what this documentation is about.
  • infp_… Portal session tokens. Issued by logging in to the portal. They only work for portal management endpoints — not the service APIs documented here.
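
As an illustration of the prefix rule above, a client could check which kind of token it has been handed before sending anything. This is only a sketch: the function name `token_kind` is hypothetical and not part of any official SDK, and it relies solely on the `inf_` and `infp_` prefixes described above.

```python
def token_kind(token: str) -> str:
    """Classify a bearer token by its documented prefix."""
    if token.startswith("infp_"):
        return "portal"   # portal session token: management endpoints only
    if token.startswith("inf_"):
        return "service"  # service API token: the endpoints documented here
    return "unknown"      # neither documented prefix
```

A check like this can catch a portal token pasted into an application's configuration before the first request is sent.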

Create a service token

  1. Sign in to the portal.
  2. Open the user menu in the bottom-left and choose API Tokens.
  3. Click Create Token, give it a name, choose the scopes you need, optionally set an expiry or IP allowlist, then click Create.
  4. Copy the token value immediately — it starts with inf_ and is shown only once.

Authentication

Pass your service token in the Authorization header as a Bearer token on every request.

http
Authorization: Bearer inf_MjQ...

Every response carries an x-request-id header, and error bodies include a request_id field. If you ever need to reach out to support, include that ID so we can look up the request on our side.
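
The header and request ID handling can be sketched in Python as follows. The helper names `auth_headers` and `extract_request_id` are hypothetical, and the sketch assumes that `request_id` sits at the top level of a JSON error body, which this page does not specify.

```python
import json


def auth_headers(token):
    """Return the Authorization header for a service token."""
    return {"Authorization": f"Bearer {token}"}


def extract_request_id(headers, body_text=""):
    """Return the request ID from the response headers, else from an error body."""
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {key.lower(): value for key, value in headers.items()}
    if "x-request-id" in lowered:
        return lowered["x-request-id"]
    try:
        body = json.loads(body_text)
    except ValueError:
        return None  # body is empty or not JSON
    # Assumption: request_id is a top-level field of the error JSON.
    if isinstance(body, dict):
        return body.get("request_id")
    return None
```

With the standard library's `urllib`, a failed call raises `urllib.error.HTTPError`; its `headers` attribute and the result of `read()` can be passed to `extract_request_id` to get the ID to quote to support.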

Token scopes

Every service token has one or more scopes, controlling which endpoints it can call. Scopes are set when the token is created.

  • completions: Chat completions, model listing. Endpoints: /v1/chat/completions, /v1/models.
  • language: Grammar and spell checking. Endpoints: /v1/language/correct, /v2/check, /v2/languages.
  • pii: PII analysis and redaction. Endpoints: /v1/pii/analyse, /v1/pii/redact, /v1/pii/restore.
  • stt: Speech-to-text transcription. Endpoint: /v1/audio/transcriptions.
  • tts: Text-to-speech synthesis. Endpoints: /v1/audio/speech, /v1/audio/voices.

A request to an endpoint outside the token's scopes returns a 403.
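
To avoid a 403 at runtime, a client can mirror the scope table locally and check a path before calling it. The sketch below copies the scope names and endpoints listed above; the helper names are hypothetical, and exact path matching is an assumption about how the gateway maps requests to scopes.

```python
# Scope-to-endpoint table copied from the list above.
SCOPE_ENDPOINTS = {
    "completions": {"/v1/chat/completions", "/v1/models"},
    "language": {"/v1/language/correct", "/v2/check", "/v2/languages"},
    "pii": {"/v1/pii/analyse", "/v1/pii/redact", "/v1/pii/restore"},
    "stt": {"/v1/audio/transcriptions"},
    "tts": {"/v1/audio/speech", "/v1/audio/voices"},
}


def required_scope(path):
    """Return the scope that covers a path, or None if the path is not listed."""
    for scope, endpoints in SCOPE_ENDPOINTS.items():
        if path in endpoints:
            return scope
    return None


def is_allowed(path, token_scopes):
    """True if the token's scopes include the one the path requires."""
    scope = required_scope(path)
    return scope is not None and scope in token_scopes
```

For example, `is_allowed("/v1/audio/speech", {"stt"})` is false, because speech synthesis falls under tts rather than stt.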

IP allowlist

Tokens can optionally be restricted to a list of source IPs or CIDR ranges. Requests from other IPs are rejected with a 403. Configure this when creating the token in the portal. Unbounded ranges (0.0.0.0/0, ::/0) are not accepted — leave the list empty to allow any IP.
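
The allowlist rules above can be checked locally before a list is submitted in the portal. This is a client-side sketch using Python's standard `ipaddress` module; the function names are hypothetical, and the portal's own validation may differ in details this page does not describe.

```python
import ipaddress


def parse_allowlist(entries):
    """Parse IPs and CIDR ranges, rejecting unbounded ranges such as 0.0.0.0/0 and ::/0."""
    networks = []
    for entry in entries:
        # A bare IP becomes a single-address network (/32 or /128).
        network = ipaddress.ip_network(entry, strict=False)
        if network.prefixlen == 0:
            raise ValueError(f"unbounded range not accepted: {entry}")
        networks.append(network)
    return networks


def ip_allowed(ip, networks):
    """An empty allowlist allows any IP; otherwise the IP must fall in some range."""
    if not networks:
        return True
    address = ipaddress.ip_address(ip)
    return any(
        address.version == network.version and address in network
        for network in networks
    )
```

Running the egress IPs of your own application through `ip_allowed` is a quick way to confirm that a planned allowlist will not lock that application out.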

Your first request

Start by listing the models available to your token. This only needs the completions scope.

bash
curl https://api.inferada.com/v1/models \
  -H "Authorization: Bearer inf_YOUR_TOKEN"

Once you have a model ID, send a chat completion:

bash
curl https://api.inferada.com/v1/chat/completions \
  -H "Authorization: Bearer inf_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-35b",
    "messages": [
      { "role": "user", "content": "Say hello in one word." }
    ]
  }'
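
The same two calls can be made from Python with only the standard library. This is a sketch that mirrors the curl commands above: the function names are hypothetical, the host is the placeholder used throughout this documentation, and the shape of the JSON responses is not described on this page, so `send` simply returns the decoded body.

```python
import json
import urllib.request

BASE_URL = "https://api.inferada.com"  # placeholder host; use your organisation's


def models_request(token):
    """Build the GET /v1/models request."""
    return urllib.request.Request(
        BASE_URL + "/v1/models",
        headers={"Authorization": f"Bearer {token}"},
    )


def chat_request(token, model, prompt):
    """Build the POST /v1/chat/completions request with a single user message."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),  # a body makes urllib send POST
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send a prepared request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (performs real network calls, so it is left commented out):
# models = send(models_request("inf_YOUR_TOKEN"))
# reply = send(chat_request("inf_YOUR_TOKEN", "qwen3.5-35b", "Say hello in one word."))
```

Keeping request construction separate from sending makes the builders easy to test without touching the network.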

Next steps

  • Chat Completions — OpenAI-compatible chat with our extensions.
  • Processors — wrap completions with anonymisation, spellcheck, PII guards, prompt-injection blocking and routing policy.
  • Prompt Templating — render the date and other gateway facts into system prompts.
  • Rate Limits — how limits work and how to handle 429 responses.
  • Usage Accounting — understand what's counted per scope.