Tools
Curated function-calling tools — real-time data lookups, small utilities, API wrappers — the gateway invokes for the model mid-completion and hands the answer back, all in one request.
What a tool does
A tool is a small, named capability the LLM can ask for while it's generating a response — "what time is it in Tokyo?", "what's the weather in Amsterdam?", "fetch that URL". You pick which tools your organisation wants from the catalogue, the model sees them as options on every completion it makes against a model that has them attached, and it calls them when it needs real-world information.
Your caller doesn't have to do anything — the gateway runs the tool, gives the result back to the model, and returns the finished answer. No client-side plumbing, no round-trips through your app.
How the catalogue is organised
Curated, reviewed, first-party. Use them by toggling them on for your org on the Tools page. Examples: current_time (no config), fetch_url (no config; safe-browses a public URL).
Added by Servada on request — usually wrapping a third-party HTTP API (weather, search, finance) or an API your team maintains. You supply the API key once in the portal; it's encrypted at rest and only used from inside the gateway. Examples: get_weather.
Enabling a tool
From the Tools page inside your organisation, click Enable (or Configure + enable if the tool needs an API key). The server saves your config encrypted and the tool is immediately available — no deploy, no reboot.
To actually have a model use the tool, open the model's settings under Modelsand add it to the model's enabled tools list. From that point, every chat completion against that model can invoke it.
What you see in the response
The gateway surfaces tool activity on the response so you can see what happened:
timings.tools_ms— total time spent running tools for this request.inferada.tools— one entry per call with its name, latency, and success / error status.usage.tools— the same entries, persisted in request logs for the history page.
Safety + limits
- Each tool call has a per-call timeout (default 10 s, hard-capped at 30 s).
- Tool-call chains are capped — the model can't spin infinitely. Default 5 steps; configurable per service model.
- API keys you configure are encrypted at rest and never returned to the client or the model.
- Tools that fetch a customer-supplied URL refuse private / link-local / cloud metadata addresses — scanning your internal network from inside a prompt doesn't work.