Getting Started

Everything you need to make your first request — base URL, tokens, authentication and scopes.

Base URL

All requests are made against a single base URL — the one Servada assigned to your organisation. Ask your administrator for the exact host. The rest of this documentation uses a placeholder:

text

https://api.inferada.com

Service tokens vs. session tokens

There are two kinds of bearer tokens. You can tell them apart by their prefix:

inf_… — Service API tokens. Use these from your applications. They call the service endpoints (completions, language, PII, audio) and are what this documentation is about.
infp_… — Portal session tokens. Issued by logging in to the portal. They only work for portal management endpoints — not the service APIs documented here.
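
A client can use the prefixes above to catch a misconfigured token before it sends anything. The following is a minimal sketch; the function name `token_kind` and its return labels are illustrative and not part of the API.

```python
def token_kind(token: str) -> str:
    """Classify a bearer token by the prefixes described above."""
    if token.startswith("infp_"):
        return "portal"   # portal session token: not valid for service APIs
    if token.startswith("inf_"):
        return "service"  # service API token: the kind this documentation uses
    return "unknown"


print(token_kind("inf_example"))   # service
print(token_kind("infp_example"))  # portal
```

Checking this at startup gives a clearer error than a rejected request when a portal token is pasted into an application by mistake.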

Create a service token

Sign in to the portal.
Open the user menu in the bottom-left and choose API Tokens.
Click Create Token, give it a name, choose the scopes you need, optionally set an expiry or IP allowlist, then click Create.
Copy the token value immediately — it starts with inf_ and is shown only once.

Authentication

Pass your service token in the Authorization header as a Bearer token on every request.

http

Authorization: Bearer inf_MjQ...

Every response carries an x-request-id header, and error bodies include a request_id field. When you contact support, include that ID so we can look up the request on our side.
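
As a rough illustration, the header and the request ID lookup might be wrapped in two small helpers. This is a sketch under assumptions: the helper names are hypothetical, and `request_id` is assumed to sit at the top level of the error body, which this page does not specify.

```python
def auth_headers(token: str) -> dict:
    """Build the Authorization header sent with every request."""
    return {"Authorization": f"Bearer {token}"}


def request_id_from(headers: dict, error_body=None):
    """Return the request ID from the x-request-id header, else from an error body.

    Assumption: request_id is a top-level field of the error JSON.
    """
    for name, value in headers.items():
        if name.lower() == "x-request-id":  # header names are case-insensitive
            return value
    if isinstance(error_body, dict):
        return error_body.get("request_id")
    return None
```

Logging the value returned by `request_id_from` next to each failure makes it easy to quote the ID in a support request later.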

Token scopes

Every service token has one or more scopes, controlling which endpoints it can call. Scopes are set when the token is created.

completions scope: Chat completions, model listing. Endpoints: /v1/chat/completions, /v1/models.
language scope: Grammar and spell checking. Endpoints: /v1/language/correct, /v2/check, /v2/languages.
pii scope: PII analysis and redaction. Endpoints: /v1/pii/analyse, /v1/pii/redact, /v1/pii/restore.
stt scope: Speech-to-text transcription. Endpoint: /v1/audio/transcriptions.
tts scope: Text-to-speech synthesis. Endpoints: /v1/audio/speech, /v1/audio/voices.

A request to an endpoint outside the token's scopes returns a 403.
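
The scope list above can be mirrored on the client side to predict a 403 before making a call. The sketch below copies the endpoint paths from that list; the names `SCOPE_ENDPOINTS`, `required_scope` and `is_allowed` are illustrative, and the server remains the authority on what a token may call.

```python
# Endpoint paths per scope, copied from the scope list above.
SCOPE_ENDPOINTS = {
    "completions": {"/v1/chat/completions", "/v1/models"},
    "language": {"/v1/language/correct", "/v2/check", "/v2/languages"},
    "pii": {"/v1/pii/analyse", "/v1/pii/redact", "/v1/pii/restore"},
    "stt": {"/v1/audio/transcriptions"},
    "tts": {"/v1/audio/speech", "/v1/audio/voices"},
}


def required_scope(path: str):
    """Return the scope that covers an endpoint path, or None if it is not listed."""
    for scope, paths in SCOPE_ENDPOINTS.items():
        if path in paths:
            return scope
    return None


def is_allowed(path: str, token_scopes) -> bool:
    """True if a token holding token_scopes should be able to call path."""
    scope = required_scope(path)
    return scope is not None and scope in token_scopes
```

For example, `is_allowed("/v1/pii/redact", {"completions"})` is false, which corresponds to the 403 described above.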

Tokens can optionally be restricted to a list of source IPs or CIDR ranges. Requests from other IPs are rejected with a 403. Configure this when creating the token in the portal. Unbounded ranges (0.0.0.0/0, ::/0) are not accepted — leave the list empty to allow any IP.
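
The allowlist rules can be approximated locally with Python's standard `ipaddress` module, for instance to validate a list before entering it in the portal. This is only a sketch of the behaviour described above, not the gateway's actual implementation; the function names are hypothetical.

```python
import ipaddress


def parse_allowlist(entries):
    """Parse IPs and CIDR ranges, rejecting unbounded ranges such as 0.0.0.0/0 and ::/0."""
    networks = []
    for entry in entries:
        # A bare IP becomes a single-address network (/32 or /128).
        network = ipaddress.ip_network(entry, strict=False)
        if network.prefixlen == 0:
            raise ValueError(f"unbounded range not accepted: {entry}")
        networks.append(network)
    return networks


def ip_permitted(source_ip: str, networks) -> bool:
    """An empty allowlist permits any IP; otherwise the IP must match an entry."""
    if not networks:
        return True
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in networks)
```

With `networks = parse_allowlist(["203.0.113.0/24"])`, a call from 203.0.113.9 is permitted and one from 198.51.100.1 is not.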

Your first request

Start by listing the models available to your token. This only needs the completions scope.

bash

curl https://api.inferada.com/v1/models \
  -H "Authorization: Bearer inf_YOUR_TOKEN"

Once you have a model ID, send a chat completion, replacing the example model below with an ID returned by /v1/models:

bash

curl https://api.inferada.com/v1/chat/completions \
  -H "Authorization: Bearer inf_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-35b",
    "messages": [
      { "role": "user", "content": "Say hello in one word." }
    ]
  }'
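
The same two calls can be sketched in Python using only the standard library. The host is the placeholder used throughout this documentation, and the function names are illustrative. The response bodies are returned as parsed JSON without assuming any particular fields.

```python
import json
import urllib.request

BASE_URL = "https://api.inferada.com"  # placeholder host; use your organisation's base URL


def models_request(token: str) -> urllib.request.Request:
    """GET /v1/models with the service token as a Bearer credential."""
    return urllib.request.Request(
        BASE_URL + "/v1/models",
        headers={"Authorization": f"Bearer {token}"},
    )


def chat_request(token: str, model: str, user_message: str) -> urllib.request.Request:
    """POST /v1/chat/completions with a single user message."""
    body = {"model": model, "messages": [{"role": "user", "content": user_message}]}
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request):
    """Send a request; return (parsed JSON body, x-request-id header value)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response), response.headers.get("x-request-id")


# Usage (performs network calls, so it is left commented out):
# models, request_id = send(models_request("inf_YOUR_TOKEN"))
# reply, request_id = send(chat_request("inf_YOUR_TOKEN", "qwen3.5-35b", "Say hello in one word."))
```

Keeping request construction separate from `send` lets you inspect or log the exact URL, headers and body before anything goes over the network. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a production client would catch that to read the error body.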

Next steps

Chat Completions — OpenAI-compatible chat with our extensions.
Processors — wrap completions with anonymisation, spellcheck, PII guards, prompt-injection blocking and routing policy.
Prompt Templating — render the date and other gateway facts into system prompts.
Rate Limits — how limits work and how to handle 429 responses.
Usage Accounting — understand what's counted per scope.