Guides

Prompt Templating

Substitute live values like the current date or weekday into system prompts before they reach the model — without forcing every caller to assemble them client-side.

What this is

When prompt templating is enabled on a service model, the gateway expands a small set of {{ var_name }} placeholders in system prompts before forwarding them to the upstream model. The headline use case is giving the model a sense of time — "Today is {{ now_date }} ({{ now_day_of_week }})." — without forcing every customer app to inject a date string into every request.

Templating applies to two surfaces and nothing else:

  • The model-default system prompt configured on the service model (or its template).
  • Any role: "system" message in the request body.

user, assistant, and tool messages always pass through verbatim — the gateway never modifies the conversation content your app sends.
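
The surface rule can be sketched in a few lines of Python. This is a toy model for intuition, not the gateway's implementation; in particular, leaving unknown variables untouched is an assumption, not documented behaviour:

```python
import re

def substitute(text, variables):
    # Logic-less pass: replace {{ var_name }} with its value.
    # Unknown variables are left as-is here (an assumption).
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: variables.get(m.group(1), m.group(0)), text)

def expand_system_prompts(messages, variables):
    # Only role:"system" messages are templated; every other role passes through verbatim.
    return [
        {**m, "content": substitute(m["content"], variables)}
        if m["role"] == "system" else m
        for m in messages
    ]

msgs = [
    {"role": "system", "content": "Today is {{ now_date }} ({{ now_day_of_week }})."},
    {"role": "user", "content": "Note how {{ now_date }} survives in user content."},
]
out = expand_system_prompts(msgs, {"now_date": "2026-05-02", "now_day_of_week": "Saturday"})
print(out[0]["content"])  # Today is 2026-05-02 (Saturday).
print(out[1]["content"])  # user text is untouched
```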

Three modes

Each service model (and service-model template) carries a template_render_mode column. Three values:

off (default)
No rendering anywhere. Prompts pass through literally, including any {{ }} sequences. Today's behaviour — opting in is always explicit.
vars (org-admin authored)
Variables are substituted with a logic-less engine on both the model-default prompt and any role: "system" message in the request body. Org admins can author the prompt freely; no super-admin involvement needed.
logic (operator authored)
Full Edge templating (@if, @each, expressions) on the model-default prompt — authored by an Inferada operator on your behalf. Caller role: "system" messages still get the same logic-less engine, never Edge.

Variable reference

The same variables work in every mode. They are split into three tiers by how often the rendered string changes — see Caching guidance below for why this matters.

Stable tier (recommended default)

The rendered value rolls over at most once per day — usually less. Safe to bake into a cached prompt prefix.

{{ now_year }} (string)
Current year as a 4-digit string. Example: 2026.
{{ now_quarter }} (string)
Q1–Q4. Example: Q2.
{{ now_month }} (string)
Zero-padded month (01–12). Example: 05.
{{ now_month_name }} (string, English)
English month name. Example: May.
{{ now_month_name_local }} (string, locale-aware)
Localised month name using inferada.language. Falls back to English when language is unset or unrecognised. Examples: mei (nl), mai (fr).
{{ now_iso_week }} (string)
ISO 8601 week tag. Example: 2026-W18.
{{ now_date }} (string)
YYYY-MM-DD UTC date. Example: 2026-05-02.
{{ now_day }} (string)
Day of month, no padding (1–31).
{{ now_day_of_week }} (string, English)
English weekday name. Example: Saturday.
{{ now_day_of_week_local }} (string, locale-aware)
Localised weekday using inferada.language. Falls back to English. Examples: zaterdag (nl), samedi (fr).
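
For intuition, the stable-tier values (minus the _local variants, which need locale data) can all be derived from a single UTC timestamp. A sketch, not the gateway's code; the English month and weekday names assume the default C locale:

```python
from datetime import datetime, timezone

def stable_vars(now: datetime) -> dict:
    # Stable-tier values for a given UTC instant (illustrative, not gateway code).
    iso = now.isocalendar()
    return {
        "now_year": now.strftime("%Y"),
        "now_quarter": f"Q{(now.month - 1) // 3 + 1}",
        "now_month": now.strftime("%m"),
        "now_month_name": now.strftime("%B"),  # English under the default C locale
        "now_iso_week": f"{iso.year}-W{iso.week:02d}",
        "now_date": now.strftime("%Y-%m-%d"),
        "now_day": str(now.day),
        "now_day_of_week": now.strftime("%A"),
    }

print(stable_vars(datetime(2026, 5, 2, 14, 23, 11, tzinfo=timezone.utc)))
```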

Hourly tier

Rolls over once per hour. Use when the model needs "time of day" awareness (morning vs evening tone) without invalidating the cache on every request.

{{ now_hour }} (string)
UTC hour as 00–23. Example: 14.
{{ now_iso_hour }} (string)
Hour-precision ISO 8601. Example: 2026-05-02T14Z.

Per-request tier (cache-busting)

The rendered value changes on every render. Reach for these only when second/minute precision is genuinely required — e.g. an agent doing duration arithmetic. Baking {{ now_iso }} into a 4 K-token system prompt forces a 0% cache-hit rate on every request.

{{ now_time }} (string)
HH:mm UTC. Example: 14:23 UTC.
{{ now_iso }} (string)
Second-precision ISO 8601 UTC. Example: 2026-05-02T14:23:11Z.
{{ now_unix }} (string)
Epoch seconds as a string. Example: 1777731791.
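
The hourly and per-request values fall out of the same UTC timestamp. Again an illustrative sketch, not gateway code; the exact shape of {{ now_time }}'s rendered string is an assumption based on the example above:

```python
from datetime import datetime, timezone

def time_vars(now: datetime) -> dict:
    # Hourly and per-request tier values for a UTC instant (illustrative, not gateway code).
    return {
        "now_hour": now.strftime("%H"),                 # hourly
        "now_iso_hour": now.strftime("%Y-%m-%dT%HZ"),   # hourly
        "now_time": now.strftime("%H:%M UTC"),          # per-request
        "now_iso": now.strftime("%Y-%m-%dT%H:%M:%SZ"),  # per-request
        "now_unix": str(int(now.timestamp())),          # per-request
    }

t = datetime(2026, 5, 2, 14, 23, 11, tzinfo=timezone.utc)
print(time_vars(t))
```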

Model facts

{{ model_provider }} (string)
Underlying upstream API format the model is being routed to — e.g. openai, anthropic, google. Useful when prompt phrasing should adapt to the provider.

Caching guidance — pick the coarsest variable that works

Anthropic and OpenAI both cache the prefix of the system prompt across requests. Cache hits are dramatically cheaper and faster than cold reads, so you want as much of your system prompt as possible to stay byte-identical between calls.

A variable in the per-request tier changes on every render — so the first byte that differs is wherever you placed it, and everything from that point on becomes a cache miss. {{ now_iso }} at the top of a 4 K-token system prompt is effectively a 100% cache-bust.

Default to the stable tier. Reach for hourly only when you need time-of-day awareness. Reach for per-request only when the use case truly requires it.
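
To see the prefix effect concretely, compare two renders of the same prompt. The sketch below uses str.format as a stand-in for the gateway's substitution, and the prompt text is invented:

```python
from os.path import commonprefix

template = "You are a helpful assistant. The current time is {t}. Follow the house style guide."

# Stable tier: two requests on the same day render byte-identically.
day_a = template.format(t="2026-05-02")
day_b = template.format(t="2026-05-02")

# Per-request tier: one second later, the prompts diverge at the timestamp.
iso_a = template.format(t="2026-05-02T14:23:11Z")
iso_b = template.format(t="2026-05-02T14:23:12Z")

shared = len(commonprefix([iso_a, iso_b]))
print(day_a == day_b)  # True: the whole prompt is a cacheable prefix
print(iso_a[shared:])  # everything from the divergent byte on is a cache miss
```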

Worked example

The same intent, three ways:

"Today is {{ now_date }}." (cache rolls over once per day)
Recommended. The prefix stays byte-identical for ~24 hours; every call within that window can reuse the cached prefix.
"It is {{ now_iso_hour }}." (cache rolls over once per hour)
Use when you specifically need the model to know it's morning vs afternoon. Cache works fine within the hour.
"The current time is {{ now_iso }}." (cache rolls over every request)
Reserve for agents doing relative-time arithmetic (e.g. "is this deadline in the past?"). Avoid for general "give the model a sense of time" — the date is enough.

See provider-specific docs for the exact caching mechanics: Anthropic prompt caching and OpenAI prompt caching.

Writing effective templates

The engine your prompt is rendered through depends on the mode and the surface:

vars mode — both surfaces (logic-less)
Only {{ var_name }} substitution and \{{ for an escaped literal {{. No member access, no method calls, no logic. Author both the model-default prompt and your own role:"system" messages with this engine in mind.
logic mode — model-default only (full Edge)
Full Edge templating: @if / @elseif / @else / @endif, @unless, @each, @let, @assign. Not available: @include, @layout, @component, @inject, @svg, @vite — these will fail to render. Authoring is operator-managed; ask support to update the prompt.
logic mode — your role:"system" message (logic-less)
Same engine as vars mode. Edge is never applied to caller-supplied content, regardless of mode. So if you need conditional logic in your own prompt, request a logic-mode template and let it own that branch.

Keep templates simple

Plain {{ var }} substitutions cache well, render in microseconds, and have predictable failure modes. Heavy @if/@each logic compounds the risk of a syntax error, and the failure mode is uglier — see below.

Render failure mode

Escaping literal {{

If you genuinely need the characters {{ to appear in the rendered output (for example, in instructions to the model about how to format a future template), escape them:

  • In vars mode: write \{{ — the gateway emits a literal {{.
  • In logic mode (Edge): use Edge's escape convention and prefix the braces with @; for example, @{{ now_date }} renders as the literal text {{ now_date }}.
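
The vars-mode escape can be modelled as a protect-substitute-restore pass. A sketch of the assumed semantics (the gateway's actual parser may differ):

```python
import re

def render_logicless(text, variables):
    # 1. Protect escaped \{{ sequences with a sentinel that cannot occur in prompts.
    # 2. Substitute the remaining {{ var_name }} placeholders.
    # 3. Restore the protected sequences as a literal {{.
    sentinel = "\x00ESCAPED\x00"
    text = text.replace("\\{{", sentinel)
    text = re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: variables.get(m.group(1), m.group(0)), text)
    return text.replace(sentinel, "{{")

out = render_logicless(
    "Today is {{ now_date }}. Emit \\{{ name }} exactly as written.",
    {"now_date": "2026-05-02"},
)
print(out)  # Today is 2026-05-02. Emit {{ name }} exactly as written.
```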

Worked patterns

Give the model a date sense (the canonical use case)

text
You are a helpful assistant. Today is {{ now_date }} ({{ now_day_of_week }}).
When the user asks about events or deadlines, anchor your reasoning to that date.

Caches well — only changes once per day. The standard fix for "the model thinks it's 2024".

Provider-aware tone

text
You are running on {{ model_provider }}. Format mathematical expressions in
plain text rather than markdown, since rendering varies by client.

Localised greeting

text
Begin your response with a friendly greeting in the user's language.
Today is {{ now_day_of_week_local }} {{ now_day }} {{ now_month_name_local }}.

Pair the _local variants with inferada.language on the request — pass language: "nl" and you'll get zaterdag 2 mei instead of Saturday 2 May.

Time-of-day awareness without cache-bust

text
The current hour is {{ now_hour }} UTC. If you reference work hours,
assume European business time (08–18 UTC).

Hourly tier — rolls over once per hour, so the cache stays effective within the hour.

What an operator-authored logic template looks like

A glimpse of what logic mode unlocks (this is what we author for you on request — you can't put logic into your own request body):

text
Today is {{ now_date }}.

@if(model_provider === 'anthropic')
You are running on Anthropic Claude. Lean on extended thinking when the user asks for
multi-step reasoning.
@else
You are running on {{ model_provider }}. Keep responses concise.
@endif

Switching out of logic mode

A logic-mode prompt is operator-managed — your org admins can't edit it while that mode is active. Two ways to take back control:

  • Permanent: an org admin switches template_render_mode to vars or off. The prompt content unlocks for editing immediately. The cost is Edge logic — you keep variable substitution in vars mode, lose it entirely in off.
  • Per-request: send your own role: "system" message in the API call. The model-default is bypassed entirely (status quo gateway behaviour) and your own system prompt still gets variable substitution via the logic-less engine — so {{ now_date }} etc. work in your custom prompt with no mode change.
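
For the per-request route, the call looks like any normal chat request. The body below is a hypothetical sketch ("my-service-model" and the prompt text are placeholders, not real identifiers); the point is that your own system message still carries working placeholders:

```python
# Hypothetical request body; field names follow the usual chat-completions shape.
payload = {
    "model": "my-service-model",
    "messages": [
        # Supplying your own system message bypasses the model-default prompt entirely,
        # but the gateway still runs logic-less substitution on it.
        {"role": "system",
         "content": "You are a support assistant. Today is {{ now_date }}."},
        {"role": "user", "content": "Is my renewal date in the past?"},
    ],
}
print(payload["messages"][0]["content"])
```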