Squad has first-class OpenTelemetry support built into the SDK. Every agent session, coordinator dispatch, and watch mode cycle emits spans and metrics you can visualize in the .NET Aspire dashboard, Jaeger, Zipkin, or any OTLP-compatible backend. Telemetry is disabled by default and activates only when an OTLP endpoint is configured, so there is no performance overhead if you are not collecting traces.
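The opt-in behaviour can be pictured as a simple check on the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable. The following is a minimal sketch under stated assumptions, not the SDK's implementation: the helper names resolveOtlpEndpoint and telemetryEnabled are hypothetical, and treating a malformed value as "not configured" is an assumption.

```typescript
// Hypothetical helpers, not part of the Squad SDK.
type Env = Record<string, string | undefined>;

// Returns the configured endpoint, or undefined when telemetry should stay off.
function resolveOtlpEndpoint(env: Env): string | undefined {
  const raw = env["OTEL_EXPORTER_OTLP_ENDPOINT"]?.trim();
  if (!raw) return undefined; // unset or blank: telemetry stays disabled
  // Accept only http(s) URLs such as http://localhost:4317.
  // Assumption: anything else is treated as "not configured".
  return /^https?:\/\/\S+$/.test(raw) ? raw : undefined;
}

// True only when a usable endpoint is present.
function telemetryEnabled(env: Env): boolean {
  return resolveOtlpEndpoint(env) !== undefined;
}
```

In a Node.js process you would pass process.env, for example telemetryEnabled(process.env), and skip all exporter setup when it returns false.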
OpenTelemetry export requires Node.js ≥ 22.5.0 and a running OTLP-compatible collector. Set the OTEL_EXPORTER_OTLP_ENDPOINT environment variable to your collector’s gRPC endpoint (e.g. http://localhost:4317).
Aspire Dashboard
The fastest way to get traces locally is the .NET Aspire standalone dashboard, which Squad can launch for you.
What Gets Traced
Squad emits the following signals:
Agent session spans
One span per agent session covering start, messages, completion, and errors. Tagged with agent.name, model, and squad.mode. Includes child spans for individual tool calls and message turns.
Coordinator dispatch spans
Fan-out spans when the coordinator routes a request to multiple agents simultaneously. Shows which routing rule matched, which agents were selected, and how long each dispatch took.
Watch mode cycles
One span per poll cycle in Ralph watch mode. Covers issue discovery, triage decisions, agent execution dispatch, and state backend writes.
Token cost tracking
Per-session and per-agent token usage recorded as metrics: squad.tokens.input, squad.tokens.output, squad.tokens.cost, and squad.tokens.total. Tagged by agent.name and model.
Sub-agent spawns (.NET)
For Squad.Agents.AI consumers, one span per task-tool invocation, covering SubagentSelected, SubagentStarted, SubagentCompleted, and SubagentFailed events.
SDK Telemetry
The Squad SDK exports a one-call setup function that wires everything together. Pass serviceName and mode so CLI and agent-mode spans are distinguishable in dashboards.
OTEL Metrics
The otel-metrics module exports functions for recording the built-in Squad metrics:
| Metric | Type | Description |
|---|---|---|
| squad.tokens.input | Counter | Total input tokens consumed, by agent and model |
| squad.tokens.output | Counter | Total output tokens produced, by agent and model |
| squad.tokens.cost | Counter | Estimated cost in USD, by agent and model |
| squad.tokens.total | UpDownCounter | Running total of all tokens |
| squad.agents.spawns | Counter | Total agent spawns |
| squad.agents.duration | Histogram | Agent session duration in milliseconds |
| squad.agents.errors | Counter | Agent errors by error type |
| squad.agents.active | UpDownCounter | Currently active agent sessions |
| squad.pool.size | ObservableGauge | Current session pool size |
| squad.pool.available | ObservableGauge | Available (idle) pool slots |
| squad.response.latency | Histogram | End-to-end response latency in milliseconds |
| squad.pr.reworks | Counter | Number of PR rework cycles |
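To make the instrument types in the table concrete, here is a dependency-free sketch of how the four squad.agents.* metrics relate to a session's lifecycle. It uses in-memory stand-ins rather than real OpenTelemetry instruments; the SessionMetrics class and its injected clock are hypothetical and are not the otel-metrics module's API.

```typescript
// Illustrative in-memory stand-ins for OpenTelemetry instruments.
// Hypothetical names; this is not the otel-metrics API.
class SessionMetrics {
  spawns = 0;                         // squad.agents.spawns (Counter)
  active = 0;                         // squad.agents.active (UpDownCounter)
  durationsMs: number[] = [];         // squad.agents.duration (Histogram samples)
  errors = new Map<string, number>(); // squad.agents.errors, keyed by error type

  private readonly now: () => number;

  // The clock is injected so the sketch stays deterministic in tests.
  constructor(now: () => number) {
    this.now = now;
  }

  // Call when a session starts; returns a function to call when it ends.
  startSession(): (errorType?: string) => void {
    const startedAt = this.now();
    this.spawns += 1; // a Counter only ever increases
    this.active += 1; // an UpDownCounter rises on start...
    let ended = false;
    return (errorType?: string) => {
      if (ended) return; // guard against ending the same session twice
      ended = true;
      this.active -= 1; // ...and falls again on completion
      this.durationsMs.push(this.now() - startedAt);
      if (errorType !== undefined) {
        this.errors.set(errorType, (this.errors.get(errorType) ?? 0) + 1);
      }
    };
  }
}
```

The point of the sketch is the pairing: every start increments both the spawn counter and the active gauge, and every end decrements only the active gauge while recording one duration sample.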
.NET OpenTelemetry
For Squad.Agents.AI consumers, wire tracing in your DI setup with two lines. The activity source name, SquadAgentDiagnostics.ActivitySourceName, is "Microsoft.Agents.AI.Squad".
Span event kinds
The SquadAgentTraceEventKind enum describes the lifecycle events emitted as span events:
| Kind | Description |
|---|---|
| SubagentSelected | The coordinator selected a subagent to dispatch to |
| SubagentStarted | A subagent process started (the task tool spawned it) |
| SubagentCompleted | A subagent finished and returned its result |
| SubagentFailed | A subagent terminated abnormally |
| AssistantMessage | An assistant turn completed (from coordinator or subagent) |
| ToolStart | A tool started executing |
| ToolComplete | A tool finished executing |
| SessionIdle | The session became idle (the run is complete) |
SquadAgentTraceEvent carries the event kind, a raw event type name, a timestamp, an optional SdkAgentId (non-null for subagent-scoped events), and the subagent’s display name when known. Consumers receive these via the OnSubagentTrace callback in SquadAgentOptions — no transitive dependency on the Copilot SDK required.
Consuming trace events in .NET
Cost Tracking
The CostTracker accumulates token usage and estimated cost across all agents in a session.
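The accumulation itself can be sketched without the SDK. The example below is an illustration under stated assumptions, not the CostTracker API: the SketchCostTracker class, its method names, and the per-million-token price table are all hypothetical, and the prices are made-up numbers chosen for easy arithmetic.

```typescript
// Hypothetical sketch; the SDK's CostTracker API may differ.
interface Usage { agent: string; model: string; inputTokens: number; outputTokens: number; }
interface Totals { inputTokens: number; outputTokens: number; costUsd: number; }
interface Price { inputPerMTok: number; outputPerMTok: number; } // USD per million tokens (assumed unit)

class SketchCostTracker {
  private readonly perAgent = new Map<string, Totals>();
  private readonly prices: Record<string, Price | undefined>;

  constructor(prices: Record<string, Price | undefined>) {
    this.prices = prices;
  }

  // Adds one usage report to the running totals for its agent.
  record(u: Usage): void {
    // Assumption: a model missing from the price table contributes zero cost.
    const price = this.prices[u.model] ?? { inputPerMTok: 0, outputPerMTok: 0 };
    const cost = (u.inputTokens * price.inputPerMTok + u.outputTokens * price.outputPerMTok) / 1e6;
    const t = this.perAgent.get(u.agent) ?? { inputTokens: 0, outputTokens: 0, costUsd: 0 };
    t.inputTokens += u.inputTokens;
    t.outputTokens += u.outputTokens;
    t.costUsd += cost;
    this.perAgent.set(u.agent, t);
  }

  // Totals for a single agent, or undefined if it never reported usage.
  agentTotals(agent: string): Totals | undefined {
    return this.perAgent.get(agent);
  }

  // Totals across every agent seen in the session.
  sessionTotals(): Totals {
    const sum: Totals = { inputTokens: 0, outputTokens: 0, costUsd: 0 };
    for (const t of this.perAgent.values()) {
      sum.inputTokens += t.inputTokens;
      sum.outputTokens += t.outputTokens;
      sum.costUsd += t.costUsd;
    }
    return sum;
  }
}
```

Keeping totals per agent and summing on demand means the per-agent and per-session views cannot drift apart.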
TelemetryCollector
For opt-in aggregate telemetry (usage metrics with no PII or code content), the SDK provides TelemetryCollector.
Event names: squad.init, squad.run, squad.agent.spawn, squad.error, squad.upgrade. The collector never transmits when consent is false and caps the queue at 500 events to prevent unbounded memory growth.
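The two guarantees above, no transmission without consent and a queue capped at 500 events, can be sketched as follows. This is an independent illustration, not the TelemetryCollector implementation: the SketchCollector class and its send callback are hypothetical, and both the choice to drop the oldest event when the queue is full and the choice to skip queueing entirely without consent are assumptions.

```typescript
// Hypothetical sketch; not the SDK's TelemetryCollector.
interface TelemetryEvent { name: string; timestamp: number; }

class SketchCollector {
  private queue: TelemetryEvent[] = [];
  private readonly consent: boolean;
  private readonly maxQueue: number;
  private readonly send: (batch: TelemetryEvent[]) => void;

  constructor(consent: boolean, send: (batch: TelemetryEvent[]) => void, maxQueue = 500) {
    this.consent = consent;
    this.send = send;
    this.maxQueue = maxQueue;
  }

  // Queues an event. Assumption: without consent, nothing is queued at all.
  track(name: string, timestamp: number): void {
    if (!this.consent) return;
    // Assumption: when full, the oldest event is dropped to make room.
    if (this.queue.length >= this.maxQueue) this.queue.shift();
    this.queue.push({ name, timestamp });
  }

  // Hands the queued events to the sender and returns how many were sent.
  flush(): number {
    if (!this.consent || this.queue.length === 0) return 0;
    const batch = this.queue;
    this.queue = [];
    this.send(batch);
    return batch.length;
  }

  // Number of events currently waiting to be sent.
  get pending(): number {
    return this.queue.length;
  }
}
```

Checking consent in both track and flush keeps the guarantee local to each method, so neither depends on the other being called correctly.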