Squad has first-class OpenTelemetry support built into the SDK. Every agent session, coordinator dispatch, and watch mode cycle emits spans and metrics you can visualize in the .NET Aspire dashboard, Jaeger, Zipkin, or any OTLP-compatible backend. Telemetry is disabled by default — it activates only when an OTLP endpoint is configured, so there is no performance overhead if you are not collecting traces.
OpenTelemetry export requires Node.js ≥ 22.5.0 and a running OTLP-compatible collector. Set the OTEL_EXPORTER_OTLP_ENDPOINT environment variable to your collector’s gRPC endpoint (e.g. http://localhost:4317).
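
Before relying on export, it can help to verify both prerequisites at startup. The following is a minimal sketch, not part of the Squad SDK: `checkTelemetryPrerequisites` and `meetsMinimumVersion` are hypothetical helpers, and the comparison assumes plain `x.y.z` version strings (no pre-release suffixes).

```typescript
// Hypothetical helpers for illustration; they are not exported by the Squad SDK.
const MIN_NODE_VERSION = "22.5.0";

/** Returns true when `actual` >= `minimum`, comparing dotted numeric parts. */
function meetsMinimumVersion(actual: string, minimum: string): boolean {
  const a = actual.replace(/^v/, "").split(".").map(Number);
  const m = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, m.length); i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // all parts equal
}

interface PrerequisiteReport {
  ok: boolean;
  problems: string[];
}

/** Checks the Node.js version and the OTLP endpoint variable described above. */
function checkTelemetryPrerequisites(
  env: Record<string, string | undefined>,
  nodeVersion: string,
): PrerequisiteReport {
  const problems: string[] = [];
  if (!meetsMinimumVersion(nodeVersion, MIN_NODE_VERSION)) {
    problems.push(`Node.js ${nodeVersion} is older than ${MIN_NODE_VERSION}`);
  }
  if (!env["OTEL_EXPORTER_OTLP_ENDPOINT"]) {
    problems.push("OTEL_EXPORTER_OTLP_ENDPOINT is not set");
  }
  return { ok: problems.length === 0, problems };
}
```

In a Node.js script you would typically call `checkTelemetryPrerequisites(process.env, process.versions.node)` and log `problems` when `ok` is false.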

Aspire Dashboard

The fastest way to get traces locally is the .NET Aspire standalone dashboard, which Squad can launch for you:
squad aspire
This starts the Aspire dashboard in a local Docker container. It provides a live trace viewer, metrics explorer, and structured log viewer, all pointing at your Squad session output. Docker must be available, but no Aspire project or .NET SDK is required; Squad handles the container lifecycle.

What Gets Traced

Squad emits the following signals:
One span per agent session covering start, messages, completion, and errors. Tagged with agent.name, model, and squad.mode. Includes child spans for individual tool calls and message turns.
Fan-out spans when the coordinator routes a request to multiple agents simultaneously. Shows which routing rule matched, which agents were selected, and how long each dispatch took.
One span per poll cycle in Ralph watch mode. Covers issue discovery, triage decisions, agent execution dispatch, and state backend writes.
Per-session and per-agent token usage recorded as metrics: squad.tokens.input, squad.tokens.output, squad.tokens.cost, and squad.tokens.total. Tagged by agent.name and model.
For Squad.Agents.AI consumers, one span per task-tool invocation — covering SubagentSelected, SubagentStarted, SubagentCompleted, and SubagentFailed events.

SDK Telemetry

The Squad SDK exports a one-call setup function that wires everything together:
import { initSquadTelemetry } from '@bradygaster/squad-sdk';

// One call — everything lights up when OTEL_EXPORTER_OTLP_ENDPOINT is set
const telemetry = initSquadTelemetry();

// Access the cost tracker for manual queries
console.log(telemetry.costTracker.formatSummary());

// Access the event bus for custom subscriptions
telemetry.eventBus.on('agent.session.complete', (event) => {
  console.log('Session complete:', event);
});

// Graceful shutdown — flushes pending spans
await telemetry.shutdown();
For Copilot agent mode, use the pre-configured variant that sets serviceName and mode so CLI and agent-mode spans are distinguishable in dashboards:
import { initAgentModeTelemetry } from '@bradygaster/squad-sdk';

const telemetry = initAgentModeTelemetry({
  endpoint: 'http://localhost:4317',
});
To access the tracer and meter directly for custom instrumentation:
import { getTracer, getMeter } from '@bradygaster/squad-sdk';

const tracer = getTracer('my-extension');
const meter = getMeter('my-extension');

// Create a custom span
const span = tracer.startSpan('my-operation');
span.setAttribute('custom.attribute', 'value');
span.end();

// Create a custom counter
const counter = meter.createCounter('my.events', { unit: 'events' });
counter.add(1, { 'event.type': 'my-event' });
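
In real code, work happens between startSpan and end, and an exception thrown in between would leave the span open. One way to guard against that is a small wrapper that always ends the span and records failures. This is a sketch under assumptions: `withSpan`, `SpanLike`, and `TracerLike` are hypothetical names, and the interfaces cover only the subset of the OpenTelemetry span API used here, so the tracer returned by getTracer is assumed (not verified) to be structurally compatible.

```typescript
// Minimal structural types so the sketch compiles on its own. They mirror a
// subset of the OpenTelemetry Span and Tracer interfaces.
interface SpanLike {
  setAttribute(key: string, value: string | number | boolean): unknown;
  recordException(error: Error): void;
  setStatus(status: { code: number; message?: string }): unknown;
  end(): void;
}

interface TracerLike {
  startSpan(name: string): SpanLike;
}

// Numeric value of SpanStatusCode.ERROR in @opentelemetry/api.
const STATUS_ERROR = 2;

/**
 * Runs `work` inside a span. The span is always ended, and a thrown error is
 * recorded on the span before being rethrown to the caller.
 */
async function withSpan<T>(
  tracer: TracerLike,
  name: string,
  work: (span: SpanLike) => Promise<T> | T,
): Promise<T> {
  const span = tracer.startSpan(name);
  try {
    return await work(span);
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    span.recordException(error);
    span.setStatus({ code: STATUS_ERROR, message: error.message });
    throw err;
  } finally {
    span.end(); // runs on both the success and the failure path
  }
}
```

Assuming compatibility holds, usage would look like `await withSpan(getTracer('my-extension'), 'my-operation', async (span) => { span.setAttribute('custom.attribute', 'value'); })`.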

OTEL Metrics

The otel-metrics module exports functions for recording the built-in Squad metrics:

| Metric | Type | Description |
| --- | --- | --- |
| squad.tokens.input | Counter | Total input tokens consumed, by agent and model |
| squad.tokens.output | Counter | Total output tokens produced, by agent and model |
| squad.tokens.cost | Counter | Estimated cost in USD, by agent and model |
| squad.tokens.total | UpDownCounter | Running total of all tokens |
| squad.agents.spawns | Counter | Total agent spawns |
| squad.agents.duration | Histogram | Agent session duration in milliseconds |
| squad.agents.errors | Counter | Agent errors by error type |
| squad.agents.active | UpDownCounter | Currently active agent sessions |
| squad.pool.size | ObservableGauge | Current session pool size |
| squad.pool.available | ObservableGauge | Available (idle) pool slots |
| squad.response.latency | Histogram | End-to-end response latency in milliseconds |
| squad.pr.reworks | Counter | Number of PR rework cycles |

All metrics are no-ops when OTel is not configured — there is no overhead from recording them if no collector is active.
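
The no-op behaviour can be pictured as handing callers a do-nothing instrument whenever no endpoint is configured, so call sites never need their own guard. The sketch below is illustrative only and is not the SDK's implementation: `CounterLike`, `NOOP_COUNTER`, `InMemoryCounter`, and `createCounter` are hypothetical names, and the in-memory counter merely stands in for a real exporting instrument.

```typescript
// Hypothetical sketch of the no-op pattern; not taken from the Squad SDK.
interface CounterLike {
  add(value: number, attributes?: Record<string, string>): void;
}

// Shared do-nothing instrument returned when telemetry is not configured.
const NOOP_COUNTER: CounterLike = { add() {} };

// Stand-in for a real instrument: it just remembers what was recorded.
class InMemoryCounter implements CounterLike {
  readonly points: Array<{ value: number; attributes: Record<string, string> }> = [];

  add(value: number, attributes: Record<string, string> = {}): void {
    this.points.push({ value, attributes });
  }
}

/** Picks the instrument once, based on whether an OTLP endpoint is set. */
function createCounter(env: Record<string, string | undefined>): CounterLike {
  return env["OTEL_EXPORTER_OTLP_ENDPOINT"] ? new InMemoryCounter() : NOOP_COUNTER;
}
```

Because the decision is made once at creation time, recording code such as `counter.add(1, { 'agent.name': 'Keyser' })` stays identical whether or not a collector is active.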

.NET OpenTelemetry

For Squad.Agents.AI consumers, wire tracing in your DI setup with two lines:
using OpenTelemetry.Trace; // provides the AddSource extension method

builder.Services.AddOpenTelemetry()
    .WithTracing(t => t.AddSource(SquadAgentDiagnostics.ActivitySourceName));
SquadAgentDiagnostics.ActivitySourceName is "Microsoft.Agents.AI.Squad".

Span event kinds

The SquadAgentTraceEventKind enum describes the lifecycle events emitted as span events:

| Kind | Description |
| --- | --- |
| SubagentSelected | The coordinator selected a subagent to dispatch to |
| SubagentStarted | A subagent process started (the task tool spawned it) |
| SubagentCompleted | A subagent finished and returned its result |
| SubagentFailed | A subagent terminated abnormally |
| AssistantMessage | An assistant turn completed (from coordinator or subagent) |
| ToolStart | A tool started executing |
| ToolComplete | A tool finished executing |
| SessionIdle | The session became idle (the run is complete) |

Each SquadAgentTraceEvent carries the event kind, a raw event type name, a timestamp, an optional SdkAgentId (non-null for subagent-scoped events), and the subagent’s display name when known. Consumers receive these via the OnSubagentTrace callback in SquadAgentOptions — no transitive dependency on the Copilot SDK required.

Consuming trace events in .NET

builder.Services.AddSquadAgent(o =>
{
    o.SquadFolderPath = @"C:\path\to\team-root";
    o.OnSubagentTrace = (traceEvent) =>
    {
        if (traceEvent.Kind == SquadAgentTraceEventKind.SubagentStarted)
        {
            Console.WriteLine($"Agent started: {traceEvent.SubagentDisplayName}");
        }
    };
});

Cost Tracking

The CostTracker accumulates token usage and estimated cost across all agents in a session:
import { CostTracker } from '@bradygaster/squad-sdk';

const tracker = new CostTracker();

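// Wire to an existing EventBus for automatic updates
// (eventBus is not defined in this snippet; for example, use telemetry.eventBus from initSquadTelemetry())
const unwire = tracker.wireToEventBus(eventBus);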

// Get a formatted summary
console.log(tracker.formatSummary());
// e.g.:
// Session cost summary:
//   Keyser (claude-sonnet-4): 1,240 in / 892 out — $0.0024
//   McManus (claude-haiku-4.5): 380 in / 201 out — $0.0003
//   Total: 1,620 in / 1,093 out — $0.0027

// Get structured data
const summary = tracker.getSummary();
for (const [name, entry] of summary.agents) {
  console.log(`${name}: $${entry.estimatedCost.toFixed(4)}`);
}

// Cleanup
unwire();
The cost tracker stores per-agent and per-session breakdowns, so you can see exactly which agents consumed the most tokens and which sessions were most expensive.
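
To make the per-agent breakdown concrete, here is a minimal sketch of how such accumulation could work. `MiniCostTracker`, `UsageRecord`, and `AgentTotals` are hypothetical stand-ins written for this example; the real CostTracker's internals and field names may differ. The sample figures are the ones from the formatted summary above.

```typescript
// Hypothetical stand-in for illustration; not the SDK's CostTracker.
interface UsageRecord {
  agent: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  estimatedCost: number; // USD
}

interface AgentTotals {
  model: string;
  inputTokens: number;
  outputTokens: number;
  estimatedCost: number;
}

class MiniCostTracker {
  private readonly agents = new Map<string, AgentTotals>();

  /** Adds one usage report to the running totals for its agent. */
  record(usage: UsageRecord): void {
    const totals = this.agents.get(usage.agent) ?? {
      model: usage.model,
      inputTokens: 0,
      outputTokens: 0,
      estimatedCost: 0,
    };
    totals.inputTokens += usage.inputTokens;
    totals.outputTokens += usage.outputTokens;
    totals.estimatedCost += usage.estimatedCost;
    this.agents.set(usage.agent, totals);
  }

  /** Returns per-agent totals plus session-wide sums. */
  getSummary() {
    let totalInputTokens = 0;
    let totalOutputTokens = 0;
    let totalEstimatedCost = 0;
    this.agents.forEach((totals) => {
      totalInputTokens += totals.inputTokens;
      totalOutputTokens += totals.outputTokens;
      totalEstimatedCost += totals.estimatedCost;
    });
    return {
      agents: new Map(this.agents),
      totalInputTokens,
      totalOutputTokens,
      totalEstimatedCost,
    };
  }

  /** Name of the agent with the highest accumulated cost, if any. */
  mostExpensiveAgent(): string | undefined {
    let best: string | undefined;
    let bestCost = -Infinity;
    this.agents.forEach((totals, name) => {
      if (totals.estimatedCost > bestCost) {
        best = name;
        bestCost = totals.estimatedCost;
      }
    });
    return best;
  }
}

// Sample data matching the formatted summary shown earlier.
const demoTracker = new MiniCostTracker();
demoTracker.record({ agent: "Keyser", model: "claude-sonnet-4", inputTokens: 1240, outputTokens: 892, estimatedCost: 0.0024 });
demoTracker.record({ agent: "McManus", model: "claude-haiku-4.5", inputTokens: 380, outputTokens: 201, estimatedCost: 0.0003 });
console.log(demoTracker.mostExpensiveAgent()); // prints "Keyser"
```

Keying the totals by agent name keeps the lookup cheap on every usage report, and the session-wide sums are derived on demand rather than stored twice.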

TelemetryCollector

For opt-in aggregate telemetry (usage metrics with no PII or code content), the SDK provides TelemetryCollector:
import { TelemetryCollector } from '@bradygaster/squad-sdk';

const collector = new TelemetryCollector({ enabled: false });

// User opts in
collector.setConsent(true);

// Collect events
collector.collectEvent({ name: 'squad.init' });
collector.collectEvent({ name: 'squad.agent.spawn', properties: { role: 'developer' } });

// Flush to endpoint
await collector.flush();
Recognized event names: squad.init, squad.run, squad.agent.spawn, squad.error, squad.upgrade. The collector never transmits when consent is false and caps the queue at 500 events to prevent unbounded memory growth.
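
The two guarantees above (nothing is sent without consent, and the queue is bounded) can be sketched as follows. This is an illustration under stated assumptions, not the SDK's TelemetryCollector: `ConsentGatedQueue` and `CollectedEvent` are hypothetical names, and dropping the oldest event at the cap and clearing the queue when consent is withdrawn are assumptions of this sketch, since the text does not say how the real collector behaves in those cases.

```typescript
// Hypothetical sketch; the real TelemetryCollector may behave differently.
interface CollectedEvent {
  name: string;
  properties?: Record<string, string>;
}

type SendFn = (batch: CollectedEvent[]) => Promise<void>;

class ConsentGatedQueue {
  private consent = false;
  private queue: CollectedEvent[] = [];
  private readonly send: SendFn;
  private readonly maxEvents: number;

  constructor(send: SendFn, maxEvents = 500) {
    this.send = send;
    this.maxEvents = maxEvents;
  }

  setConsent(value: boolean): void {
    this.consent = value;
    // Assumption: withdrawing consent also discards anything already queued.
    if (!value) this.queue = [];
  }

  collectEvent(event: CollectedEvent): void {
    if (!this.consent) return; // never retain events without consent
    this.queue.push(event);
    // Assumption: once the cap is exceeded, the oldest event is dropped.
    if (this.queue.length > this.maxEvents) this.queue.shift();
  }

  get size(): number {
    return this.queue.length;
  }

  /** Sends the queued batch and returns how many events were transmitted. */
  async flush(): Promise<number> {
    if (!this.consent || this.queue.length === 0) return 0;
    const batch = this.queue;
    this.queue = [];
    await this.send(batch);
    return batch.length;
  }
}
```

Checking consent at collection time, and again at flush time, means a late opt-out cannot leak events that were queued earlier.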
