NorthStar: Observability for AI Agents

NorthStar is an observability, debugging, and evaluation platform built for AI agent development. Instrument your agents with a single decorator or context manager, and get full visibility into every trace, LLM call, tool invocation, token cost, and error — all flowing into a real-time dashboard.

Quickstart

Get a traced agent running in under five minutes

Auto-Instrumentation

One-line setup for OpenAI and Anthropic

Evaluations

Run deterministic and LLM-based evals on agent runs

SDK Reference

Full reference for every public API

How NorthStar Works

Data flows from your agent application through the NorthStar SDK to a Supabase-backed ingest pipeline, and is then surfaced in a Next.js dashboard.

Agent App (Python) ──► SDK ──► Supabase Edge Function ──► Postgres ──► Dashboard
                          │            (Deno/TS)            (RLS)
                          └─► local queue + background worker

The SDK queues records locally and flushes them in the background, so your agent code never blocks waiting for telemetry. When the backend is unreachable, the SDK silently falls back to no-ops — your application keeps running.
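The queue-and-flush behavior described above can be pictured with a minimal sketch. This is not NorthStar's actual implementation: `BackgroundExporter`, `enqueue`, `dropped`, and `flaky_send` are hypothetical names chosen for illustration. The sketch only assumes that records go into a bounded in-memory queue, that a daemon thread drains it, and that send failures are swallowed instead of raised into agent code.

```python
import queue
import threading


class BackgroundExporter:
    """Hypothetical stand-in for an SDK exporter; not NorthStar's real code."""

    def __init__(self, send, max_pending=1000):
        self._send = send                          # callable that ships one record
        self._queue = queue.Queue(maxsize=max_pending)
        self._lock = threading.Lock()
        self.dropped = 0                           # records lost to overflow or send errors
        threading.Thread(target=self._drain, daemon=True).start()

    def _count_drop(self):
        with self._lock:
            self.dropped += 1

    def enqueue(self, record):
        """Called from agent code: returns immediately and never raises."""
        try:
            self._queue.put_nowait(record)
        except queue.Full:
            self._count_drop()

    def _drain(self):
        while True:
            record = self._queue.get()
            try:
                self._send(record)
            except Exception:
                self._count_drop()                 # backend unreachable: drop silently
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every queued record has been attempted."""
        self._queue.join()


sent = []


def flaky_send(record):
    # Simulates a backend that is unreachable for some records.
    if record.get("fail"):
        raise ConnectionError("backend unreachable")
    sent.append(record)


exporter = BackgroundExporter(flaky_send)
exporter.enqueue({"name": "span-1"})
exporter.enqueue({"name": "span-2", "fail": True})
exporter.flush()
```

In this sketch the failed record is counted and discarded, so the calling code never sees the `ConnectionError`. A real exporter would likely batch records and retry, which is omitted here.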

Key Features

Trace & Span Hierarchy

Sessions → Runs → Spans → Events with automatic context propagation via ContextVar

Auto-Instrumentation

Patches OpenAI and Anthropic clients to capture messages, tool calls, tokens, and USD cost automatically

Versioned Prompts

Store, label, and pull prompt templates server-side; compile with Jinja or Python variable syntax

Eval Suite

Deterministic graders (tool sequence, regex, ground truth) plus LLM judges (rubric, faithfulness) in one framework

LLM Cost Tracking

Token counting and USD pricing via LiteLLM on every model call span

Dashboard & Playground

Real-time trace viewer, prompt playground, eval runs, and alert rules in a hosted Next.js UI

Get Started

Install the SDK

uv add northstar-ai

Set credentials

export NORTHSTAR_API_KEY="ns_..."
export NORTHSTAR_PROJECT_ID="<supabase-project-ref>"

Initialize and instrument

import northstar

northstar.auto_instrument()   # patches OpenAI + Anthropic
northstar.init(
    api_key="ns_...",
    project_id="<supabase-project-ref>",
    project="My Agent",
    environment="production",
)

Trace your first run

@northstar.trace("my-agent")
def run(query: str) -> str:
    northstar.log_event("started", {"query": query})
    return call_llm(query)  # call_llm is a placeholder for your own LLM call

run("What is the capital of France?")
northstar.flush()

Install northstar-ai[pricing] to enable automatic USD cost tracking on every LLM call via LiteLLM.
