Prompt Templating
Substitute live values like the current date or weekday into system prompts before they reach the model — without forcing every caller to assemble them client-side.
What this is
When prompt templating is enabled on a service model, the gateway expands a small set of {{ var_name }} placeholders in system prompts before forwarding them to the upstream model. The headline use case is giving the model a sense of time — "Today is {{ now_date }} ({{ now_day_of_week }})." — without forcing every customer app to inject a date string into every request.
Templating applies to two surfaces and nothing else:
- The model-default system prompt configured on the service model (or its template).
- Any role: "system" message in the request body.
user, assistant, and tool messages always pass through verbatim — the gateway never modifies the conversation content your app sends.
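The surface rule can be sketched in a few lines. This is an illustration, not the gateway's implementation: the render helper below is a hypothetical stand-in that substitutes only {{ now_date }}, but the shape of the rule is the documented one — system-role content is rendered, everything else passes through verbatim.

```python
from datetime import datetime, timezone

def render(text: str) -> str:
    """Stand-in for the gateway's logic-less engine (only {{ now_date }} here)."""
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return text.replace("{{ now_date }}", today)

def apply_templating(messages: list[dict]) -> list[dict]:
    # Only role "system" surfaces are rendered; user/assistant/tool
    # messages pass through byte-for-byte.
    return [
        {**m, "content": render(m["content"])} if m["role"] == "system" else m
        for m in messages
    ]

msgs = [
    {"role": "system", "content": "Today is {{ now_date }}."},
    {"role": "user", "content": "Literal {{ now_date }} stays as typed."},
]
out = apply_templating(msgs)
```

Note that the user message keeps its literal {{ now_date }} — placeholders in conversation content are never expanded.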
Three modes
Each service model (and service-model template) carries a template_render_mode column. Three values:
- off (default): No rendering anywhere. Prompts pass through literally, including any {{ }} sequences. Today's behaviour; opting in is always explicit.
- vars (org-admin authored): Variables are substituted with a logic-less engine on both the model-default prompt and any role: "system" message in the request body. Org admins can author the prompt freely; no super-admin involvement needed.
- logic (operator authored): Full Edge templating (@if, @each, expressions) on the model-default prompt, authored by an Inferada operator on your behalf. Caller role: "system" messages still get the same logic-less engine, never Edge.
Regardless of mode, the gateway never evaluates @if/@each on the system prompts you send via the API. You get variable substitution, that's it. If your use case needs conditional logic baked into the prompt, contact us and we'll set up a logic-mode template on the relevant model; that prompt is then operator-managed (we change it for you on request). You can take control back at any time by switching the model out of Advanced mode, or override per-request by supplying your own role: "system" message in the API call.
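The mode-and-surface matrix above boils down to one small decision table. Here is a sketch of it (illustrative only; the names "model_default" and "caller_system" are taken from the log-event fields documented later, not a public API):

```python
def engine_for(mode: str, surface: str) -> str:
    """Which engine renders a given surface.
    surface: 'model_default' or 'caller_system'."""
    if mode == "off":
        return "none"        # prompts pass through literally
    if mode == "vars":
        return "logicless"   # both surfaces, variables only
    if mode == "logic":
        # Edge only ever runs on the operator-authored model-default prompt;
        # caller-supplied system messages always get the logic-less engine.
        return "edge" if surface == "model_default" else "logicless"
    raise ValueError(f"unknown mode: {mode}")
```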
Variable reference
The same variables work in every mode. They are split into three tiers by how often the rendered string changes — see Caching guidance below for why this matters.
Stable tier (recommended default)
The rendered value rolls over at most once per day — usually less. Safe to bake into a cached prompt prefix.
- {{ now_year }} (string): Current year as a 4-digit string. Example: 2026.
- {{ now_quarter }} (string): Q1–Q4. Example: Q2.
- {{ now_month }} (string): Zero-padded month (01–12). Example: 05.
- {{ now_month_name }} (string, English): English month name. Example: May.
- {{ now_month_name_local }} (string, locale-aware): Localised month name using inferada.language. Falls back to English when language is unset or unrecognised. Examples: mei (nl), mai (fr).
- {{ now_iso_week }} (string): ISO 8601 week tag. Example: 2026-W18.
- {{ now_date }} (string): YYYY-MM-DD UTC date. Example: 2026-05-02.
- {{ now_day }} (string): Day of month, no padding (1–31).
- {{ now_day_of_week }} (string, English): English weekday name. Example: Saturday.
- {{ now_day_of_week_local }} (string, locale-aware): Localised weekday using inferada.language. Falls back to English. Examples: zaterdag (nl), samedi (fr).
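The stable-tier values can be derived the obvious way from a UTC timestamp. A sketch (the _local variants are omitted because locale data lives gateway-side; the English names here assume the default C locale):

```python
from datetime import datetime, timezone

def stable_vars(now: datetime) -> dict[str, str]:
    """Plausible derivations of the stable-tier variables (all rendered as strings)."""
    iso_year, iso_week, _ = now.isocalendar()
    return {
        "now_year": now.strftime("%Y"),
        "now_quarter": f"Q{(now.month - 1) // 3 + 1}",
        "now_month": now.strftime("%m"),        # zero-padded 01-12
        "now_month_name": now.strftime("%B"),   # English month name
        "now_iso_week": f"{iso_year}-W{iso_week:02d}",
        "now_date": now.strftime("%Y-%m-%d"),
        "now_day": str(now.day),                # no padding
        "now_day_of_week": now.strftime("%A"),  # English weekday
    }

# The instant used by the examples in this reference:
v = stable_vars(datetime(2026, 5, 2, tzinfo=timezone.utc))
```

Every value only rolls over at the UTC day boundary (or slower), which is what makes this tier safe inside a cached prompt prefix.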
Hourly tier
Rolls over once per hour. Use when the model needs "time of day" awareness (morning vs evening tone) without invalidating the cache on every request.
- {{ now_hour }} (string): UTC hour as 00–23. Example: 14.
- {{ now_iso_hour }} (string): Hour-precision ISO 8601. Example: 2026-05-02T14Z.
Per-request tier (cache-busting)
The rendered value changes on every render. Reach for these only when second- or minute-level precision is genuinely required (e.g. an agent doing duration arithmetic). Baking {{ now_iso }} into a 4 K-token system prompt drops the prompt-cache hit rate to zero.
- {{ now_time }} (string): HH:mm UTC. Example: 14:23 UTC.
- {{ now_iso }} (string): Second-precision ISO 8601 UTC. Example: 2026-05-02T14:23:11Z.
- {{ now_unix }} (string): Epoch seconds as a string. Example: 1777731791 (the epoch value of 2026-05-02T14:23:11Z).
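The hourly and per-request values follow the same pattern as the stable tier. A sketch of their likely derivations, again from the reference instant used in the examples:

```python
from datetime import datetime, timezone

def time_vars(now: datetime) -> dict[str, str]:
    """Hourly and per-request tier variables, derived the obvious way (a sketch)."""
    return {
        # Hourly tier: rolls over once per UTC hour.
        "now_hour": now.strftime("%H"),
        "now_iso_hour": now.strftime("%Y-%m-%dT%HZ"),
        # Per-request tier: changes on every render.
        "now_time": now.strftime("%H:%M UTC"),
        "now_iso": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "now_unix": str(int(now.timestamp())),
    }

v = time_vars(datetime(2026, 5, 2, 14, 23, 11, tzinfo=timezone.utc))
```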
Model facts
- {{ model_provider }} (string): Underlying upstream API format the model is being routed to (e.g. openai, anthropic, google). Useful when prompt phrasing should adapt to the provider.
Caching guidance — pick the coarsest variable that works
Anthropic and OpenAI both cache the prefix of the system prompt across requests. Cache hits are dramatically cheaper and faster than cold reads, so you want as much of your system prompt as possible to stay byte-identical between calls.
A variable in the per-request tier changes on every render, so the first byte that differs is wherever you placed it, and everything from that point on becomes a cache miss. {{ now_iso }} at the top of a 4 K-token system prompt is effectively a 100% cache-bust.
Default to the stable tier. Reach for hourly only when you need time-of-day awareness. Reach for per-request only when the use case truly requires it.
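To make the prefix effect concrete, here is a small illustration (the prompt strings are hypothetical): two stable-tier renders share their entire length, while two per-request renders diverge at the first timestamp digit that changed, taking everything after it out of the cache.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Length of the byte-identical prefix two rendered prompts share."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n

tail = " You are a careful assistant." * 50  # stands in for a long static prompt body

# Stable tier: identical all day, so the whole prompt is cacheable.
stable_a = "Today is 2026-05-02." + tail
stable_b = "Today is 2026-05-02." + tail

# Per-request tier at the top: one second later, the prefix dies early.
busting_a = "The current time is 2026-05-02T14:23:11Z." + tail
busting_b = "The current time is 2026-05-02T14:23:12Z." + tail
```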
Worked example
The same intent, three ways:
- "Today is {{ now_date }}." (cache rolls over once per day): Recommended. The prefix stays byte-identical for ~24 hours; every call within that window can reuse the cached prefix.
- "It is {{ now_iso_hour }}." (cache rolls over once per hour): Use when you specifically need the model to know it's morning vs afternoon. Cache works fine within the hour.
- "The current time is {{ now_iso }}." (cache rolls over every request): Reserve for agents doing relative-time arithmetic (e.g. "is this deadline in the past?"). Avoid for general "give the model a sense of time"; the date is enough.
See provider-specific docs for the exact caching mechanics: Anthropic prompt caching and OpenAI prompt caching.
Writing effective templates
The engine your prompt is rendered through depends on the mode and the surface:
- vars mode, both surfaces (logic-less): Only {{ var_name }} substitution and \{{ for an escaped literal {{. No member access, no method calls, no logic. Author both the model-default and your own role: "system" messages with this surface in mind.
- logic mode, model-default only (full Edge): @if/@elseif/@else/@endif, @unless, @each, @let, @assign. Not available: @include, @layout, @component, @inject, @svg, @vite; these will fail to render. Authoring is operator-managed; ask support to update the prompt.
- logic mode, your role: "system" message (logic-less): Same engine as vars mode. Edge is never applied to caller-supplied content, regardless of mode. So if you need conditional logic in your own prompt, request a logic-mode template instead and let it own that branch.
Keep templates simple
Plain {{ var }} substitutions cache well, render in microseconds, and have predictable failure modes. Heavy @if/@each logic compounds the risk of a syntax error, and the failure mode is uglier — see below.
Render failure mode
If a template fails to parse — typo in a tag name, malformed expression, anything — the gateway forwards the literal source string to the upstream model. That means an LLM may end up reading raw @if foo or {{ syntax }} in its system prompt and behave unpredictably. Simple templates degrade gracefully (the few {{ var }} placeholders that remain are mostly harmless to a model). Complex logic templates degrade ugly.
Render failures emit a prompt_template.render_failed log event with org id, service-model id, source (model_default or caller_system), and reason (syntax, undefined_var, timeout, oversize, denylist) — but never the rendered text or the template body. Operators can find these in the audit log.
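The forward-the-literal-source behaviour can be sketched as follows. This is an illustration of the documented contract, not the gateway's code; the toy engine and the strict_render / render_or_passthrough names are hypothetical, but the log event fields match the ones listed above.

```python
import re

class TemplateError(Exception):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason

def strict_render(source: str, variables: dict) -> str:
    """Toy logic-less engine: every {{ name }} must resolve."""
    def sub(m):
        name = m.group(1)
        if name not in variables:
            raise TemplateError("undefined_var")
        return variables[name]
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", sub, source)

def render_or_passthrough(source: str, variables: dict, emit_log) -> str:
    """On any render failure, forward the literal source and emit a log event
    that carries metadata but never the prompt text."""
    try:
        return strict_render(source, variables)
    except TemplateError as err:
        emit_log({"event": "prompt_template.render_failed",
                  "source": "caller_system",  # or "model_default"
                  "reason": err.reason})      # syntax / undefined_var / ...
        return source                         # literal template goes upstream

events = []
ok = render_or_passthrough("Today is {{ now_date }}.",
                           {"now_date": "2026-05-02"}, events.append)
bad = render_or_passthrough("Hello {{ nope }}!", {}, events.append)
```

The failure path is why simple templates degrade more gracefully: a stray {{ nope }} reaching the model is far less confusing than a page of raw @if blocks.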
Escaping literal {{
If you genuinely need the characters {{ to appear in the rendered output (for example, in instructions to the model about how to format a future template), escape them:
- In vars mode: write \{{ and the gateway emits a literal {{.
- In logic mode (Edge): use Edge's convention @{{ … }} for a literal.
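A toy rendering of the vars-mode escape, to show the intended round trip (a sketch under the assumption that \{{ is consumed before substitution; not the gateway's implementation):

```python
import re

def render_logicless(source: str, variables: dict) -> str:
    """Toy logic-less renderer honouring the documented \\{{ escape."""
    ESC = "\x00"                           # temporarily hide escaped braces
    text = source.replace(r"\{{", ESC)
    text = re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda m: variables.get(m.group(1), m.group(0)),
                  text)
    return text.replace(ESC, "{{")         # restore the literal {{

out = render_logicless(
    r"Today is {{ now_date }}. Write \{{ name }} to show a placeholder.",
    {"now_date": "2026-05-02"},
)
```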
Worked patterns
Give the model a date sense (the canonical use case)
You are a helpful assistant. Today is {{ now_date }} ({{ now_day_of_week }}).
When the user asks about events or deadlines, anchor your reasoning to that date.
Caches well: only changes once per day. The standard fix for "the model thinks it's 2024".
Provider-aware tone
You are running on {{ model_provider }}. Format mathematical expressions in
plain text rather than markdown, since rendering varies by client.
Localised greeting
Begin your response with a friendly greeting in the user's language.
Today is {{ now_day_of_week_local }} {{ now_day }} {{ now_month_name_local }}.
Pair the _local variants with inferada.language on the request: pass language: "nl" and you'll get zaterdag 2 mei instead of Saturday 2 May.
Time-of-day awareness without cache-bust
The current hour is {{ now_hour }} UTC. If you reference work hours,
assume European business time (08–18 UTC).
Hourly tier: rolls over once per hour, so the cache stays effective within the hour.
What an operator-authored logic template looks like
A glimpse of what logic mode unlocks (this is what we author for you on request; you can't put logic into your own request body):
Today is {{ now_date }}.
@if(model_provider === 'anthropic')
You are running on Anthropic Claude. Lean on extended thinking when the user asks for
multi-step reasoning.
@else
You are running on {{ model_provider }}. Keep responses concise.
@endif
Switching out of Advanced mode
A logic-mode prompt is operator-managed — your org admins can't edit it while the model is in Advanced mode. Two ways to take control back:
- Permanent: an org admin switches template_render_mode to vars or off. The prompt content unlocks for editing immediately. The cost is Edge logic: you keep variable substitution in vars mode, lose it entirely in off.
- Per-request: send your own role: "system" message in the API call. The model-default is bypassed entirely (status quo gateway behaviour) and your own system prompt still gets variable substitution via the logic-less engine, so {{ now_date }} etc. work in your custom prompt with no mode change.