Versioned Prompt Management in NorthStar

Prompt management in NorthStar connects every model call back to the exact prompt template that produced it. Instead of hardcoding prompt strings inside your agent code, you store templates server-side in the NorthStar dashboard, pull them at runtime by name and label, compile them with your variables, and pass the compiled content to your LLM. Because the SDK automatically records a prompt link on the surrounding model call span, every trace in your dashboard carries a precise reference to the prompt version, variable values, and content hash that drove it — giving you full reproducibility and the ability to correlate prompt changes with changes in agent behaviour.

Key objects

NorthStar’s prompt system is built around three Pydantic models that flow in sequence: the server-side version record, the compiled result, and the top-level prompt descriptor.

Prompt

object

The top-level prompt descriptor returned when listing prompts. Contains the prompt’s id, project_id, name, slug, optional description, the current_version_id, and a labels dict mapping label names (e.g. "production") to the version UUID they point to.

PromptVersion

object

An immutable snapshot of a prompt at a specific point in time. Carries the template content, optional model, temperature, and max_tokens hints, an auto-extracted variables schema, a content_hash, and metadata such as version_number, parent_version_id, and change_note. See Versioning for full field documentation.

CompiledPrompt

object

The result of rendering a PromptVersion with concrete variable values. Contains the final rendered content, the original raw_content (the unrendered template), the variables dict, model hints, and the content_hash inherited from the version. Passed directly to llm.record_input_messages() or used as a string in message payloads.

Pull and compile workflow

The standard pattern is to call client.pull_prompt() to fetch a CompiledPrompt, then use prompt.bind() as a context manager to render the template and automatically link it to the enclosing model call span:

from northstar import Northstar

client = Northstar(api_key="ns_...", project_id="<project-ref>")
prompt = client.pull_prompt("summarizer", label="production")

with client.session() as session:
    with session.run("agent") as run:
        with run.span("chat") as span:
            with prompt.bind(
                variables={"doc": doc_text, "max_words": 120},
                span=span,
            ) as compiled:
                # compiled.content is the fully rendered string
                response = call_llm(model="gpt-4o", content=compiled.content)

pull_prompt resolves the version

client.pull_prompt("summarizer", label="production") posts to the NorthStar resolve endpoint and returns a CompiledPrompt pre-loaded with the PromptVersion identified by the "production" label. The result is cached by default so subsequent calls within the same process are fast.

bind() renders the template

prompt.bind(variables={...}, span=span) is a context manager. On entry it calls compile_prompt on the underlying version, substituting every {{ variable }} and {variable} placeholder. Missing variables raise a ValueError immediately — nothing is silently dropped.

Prompt is automatically bound to the span

When the bind() context manager exits, the SDK enqueues a prompt_link record that associates the span’s trace_id and span_id with the prompt_version_id and the exact variable_values used — no extra calls required.

Label-based version resolution

Labels are named pointers stored alongside a prompt that map a human-readable name to a specific version UUID. When you call pull_prompt with a label argument, NorthStar resolves that label server-side and returns the version it currently points to.

# Resolve whichever version is currently tagged "production"
prompt = client.pull_prompt("summarizer", label="production")

# Resolve the "staging" label for pre-release testing
prompt = client.pull_prompt("summarizer", label="staging")

Labels like "production" and "staging" are conventions, not reserved keywords. You can create any label name in the dashboard and point it to any version. Promoting a new version to "production" is a label update — the old version is never deleted and can always be restored. The default label used when none is specified is "prod".

Automatic prompt-to-span binding

When bind() is used, NorthStar records two pieces of traceability data automatically:

prompt.compile.requested — written to the span’s attributes dict at compile time, containing the prompt_version_id and content_hash.
Prompt link — flushed with the trace payload, containing the trace_id, span_id, prompt_version_id, and the variable_values dict used during compilation.

This means you can always look up a trace in the dashboard and see exactly which prompt version was active, what variables were injected, and whether the content has drifted (via content_hash) since the trace was recorded.

When span is omitted from prompt.bind(), the SDK automatically picks up the active model call span from the current context. This is useful when the model call span is managed by northstar.model_call() and you want minimal boilerplate. See Templates for details.

Next steps

Versioning

Understand how versions are created, what each field means, and how labels pin named pointers to specific versions.

Templates

Learn the supported template syntax, the compile API, and how missing variables are caught at compile time.

Get Started

Tracing

Prompts

Evaluations

Configuration & Deployment

Versioned Prompt Management in NorthStar

Key objects

Pull and compile workflow

Label-based version resolution

Automatic prompt-to-span binding

Next steps

Versioning

Templates

Build docs developers (and LLMs) love

Get Started

Tracing

Prompts

Evaluations

Configuration & Deployment

Documentation Index

​Key objects

​Pull and compile workflow

​Label-based version resolution

​Automatic prompt-to-span binding

​Next steps

Versioning

Templates

Build docs developers (and LLMs) love

Key objects

Pull and compile workflow

Label-based version resolution

Automatic prompt-to-span binding

Next steps