## What is OpenTelemetry?

OpenTelemetry is an observability framework that provides:

- Distributed Tracing: Track requests across system boundaries
- Metrics Collection: Monitor performance and usage patterns
- Spans: Represent individual operations with timing and metadata
- Exporters: Send telemetry to various backends
- Semantic Conventions: Standardized attribute names for AI/LLM operations
The Koog OpenTelemetry feature automatically creates spans for:

- Agent creation and invocation
- Strategy execution
- Node execution
- LLM calls (including token usage)
- Tool calls
## Installation
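A Gradle sketch of the dependencies used in this guide. The artifact coordinates and versions below are placeholders, not confirmed values; check the Koog release notes for the exact module name and current version.

```kotlin
// build.gradle.kts — coordinates are illustrative
dependencies {
    implementation("ai.koog:koog-agents:LATEST_VERSION") // Koog with the OpenTelemetry feature

    // OpenTelemetry exporters referenced later in this guide
    implementation("io.opentelemetry:opentelemetry-exporter-otlp:1.40.0")
    implementation("io.opentelemetry:opentelemetry-exporter-logging:1.40.0")
}
```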
## Configuration
### Basic Configuration
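A minimal sketch of installing the feature on an agent. The agent-constructor shape and `simpleOpenAIExecutor` helper are assumptions for illustration; `setVerbose(true)` and the `serviceName`/`serviceVersion` options are documented on this page, while the `setServiceInfo` method name is assumed.

```kotlin
val agent = AIAgent(
    executor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")), // assumed helper
    systemPrompt = "You are a helpful assistant.",
) {
    install(OpenTelemetry) {
        setServiceInfo("my-agent-service", "1.0.0") // serviceName, serviceVersion
        setVerbose(true)                            // isVerbose, for debugging
    }
}
```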
### Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| `serviceName` | `String` | `"ai.koog"` | Name of the service being instrumented |
| `serviceVersion` | `String` | `"0.0.0"` | Version of the service |
| `isVerbose` | `Boolean` | `false` | Enable verbose logging for debugging |
| `sdk` | `OpenTelemetrySdk` | Auto-configured | Custom SDK instance |
| `tracer` | `Tracer` | Auto-created | Custom tracer instance |
### Configuration Methods
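Later sections of this page reference `addSpanExporter()`, `setSdk()`, and `setVerbose(true)`; a sketch of how they combine inside the configuration block (`setServiceInfo` is an assumed name for setting the `serviceName`/`serviceVersion` options):

```kotlin
install(OpenTelemetry) {
    setServiceInfo("my-agent-service", "1.0.0")   // assumed method name
    setVerbose(true)                              // documented in Troubleshooting below
    addSpanExporter(LoggingSpanExporter.create()) // add any OpenTelemetry SpanExporter
    // setSdk(customSdk)                          // or supply a fully configured SDK instead
}
```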
## Exporters
### OTLP Exporter (Recommended)
Send telemetry to an OpenTelemetry Collector:
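A sketch using the OpenTelemetry Java SDK's OTLP/gRPC exporter, pointed at a local Collector on the standard port 4317; `addSpanExporter` is the method this page documents.

```kotlin
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter

install(OpenTelemetry) {
    addSpanExporter(
        OtlpGrpcSpanExporter.builder()
            .setEndpoint("http://localhost:4317") // default OTLP/gRPC port
            .build()
    )
}
```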
### Logging Exporter

Output traces to the console (useful for development):
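A sketch using the console exporter from `io.opentelemetry:opentelemetry-exporter-logging`:

```kotlin
import io.opentelemetry.exporter.logging.LoggingSpanExporter

install(OpenTelemetry) {
    addSpanExporter(LoggingSpanExporter.create()) // prints spans to stdout
}
```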
### Jaeger Exporter

Send traces directly to Jaeger:
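Recent Jaeger versions ingest OTLP natively, and the dedicated Jaeger exporter artifact is deprecated upstream, so the usual route is pointing an OTLP exporter at Jaeger's OTLP port; a sketch:

```kotlin
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter

install(OpenTelemetry) {
    addSpanExporter(
        OtlpGrpcSpanExporter.builder()
            .setEndpoint("http://localhost:4317") // Jaeger all-in-one OTLP/gRPC port
            .build()
    )
}
```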
### Zipkin Exporter

Send traces to Zipkin:
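A sketch using the Zipkin exporter from `io.opentelemetry:opentelemetry-exporter-zipkin`, with Zipkin's default span-ingest endpoint:

```kotlin
import io.opentelemetry.exporter.zipkin.ZipkinSpanExporter

install(OpenTelemetry) {
    addSpanExporter(
        ZipkinSpanExporter.builder()
            .setEndpoint("http://localhost:9411/api/v2/spans") // Zipkin default
            .build()
    )
}
```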
## Integration with Jaeger

Jaeger is a popular distributed tracing system. Here's how to set it up:

### 1. Start Jaeger with Docker
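A minimal `docker-compose.yml` for this step might look like the following; the image and port mappings are the standard Jaeger all-in-one defaults (adjust the tag as needed):

```yaml
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"  # Jaeger UI
      - "4317:4317"    # OTLP/gRPC ingest
```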
```bash
docker-compose up -d
```
### 2. Configure Agent
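A sketch pointing the OTLP exporter at the Jaeger container started above (`setServiceInfo` is an assumed method name; `addSpanExporter` is documented on this page):

```kotlin
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter

install(OpenTelemetry) {
    setServiceInfo("my-agent-service", "1.0.0") // assumed method name
    addSpanExporter(
        OtlpGrpcSpanExporter.builder()
            .setEndpoint("http://localhost:4317") // Jaeger's OTLP/gRPC port
            .build()
    )
}
```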
### 3. View Traces
Open the Jaeger UI at http://localhost:16686 to view your traces.
## Span Types and Attributes
Koog creates different span types following the OpenTelemetry Semantic Conventions for GenAI:

### Create Agent Span
Long-lived span for the agent's lifetime. Attributes:

- `gen_ai.operation.name` = `"create_agent"`
- `gen_ai.agent.id`
- `gen_ai.request.model`
### Invoke Agent Span
One execution run of an agent. Attributes:

- `gen_ai.operation.name` = `"invoke_agent"`
- `gen_ai.agent.id`
- `gen_ai.conversation.id`
- `gen_ai.system` (LLM provider)
- `gen_ai.response.finish_reasons` (on error)
### Strategy Span
Strategy execution. Attributes:

- `gen_ai.conversation.id`
- `koog.strategy.name`
- `koog.event.id`
### Node Execute Span
Individual node execution. Attributes:

- `gen_ai.conversation.id`
- `koog.node.id`
### Inference Span (LLM Call)
Single LLM call. Attributes:

- `gen_ai.operation.name` = `"chat"`
- `gen_ai.system` (provider: "openai", "anthropic", etc.)
- `gen_ai.request.model`
- `gen_ai.request.temperature`
- `gen_ai.request.max_tokens`
- `gen_ai.usage.input_tokens`
- `gen_ai.usage.output_tokens`
- `gen_ai.usage.total_tokens`
- `gen_ai.response.finish_reasons`
The span also captures message content, including:

- System, user, and assistant messages
- Tool choice and tool result messages
- Moderation responses
### Execute Tool Span
Tool execution. Attributes:

- `gen_ai.tool.name`
- `gen_ai.tool.description`
- `gen_ai.tool.arguments`
- `gen_ai.tool.call_id`
- `gen_ai.tool.output`
- `error.type` (on failure)
## Resource Attributes
Default resource attributes automatically added:

- `service.name`: Service name
- `service.version`: Service version
- `service.instance.time`: Instance creation timestamp
- `os.type`: Operating system type
- `os.version`: OS version
- `os.arch`: OS architecture
## Sampling
Control which spans are collected:
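Sampling is configured on the OpenTelemetry SDK's tracer provider. Since the knobs this page documents are exporters and `setSdk()`, a sketch via a custom tracer provider (see Custom SDK):

```kotlin
import io.opentelemetry.sdk.trace.SdkTracerProvider
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor
import io.opentelemetry.sdk.trace.samplers.Sampler

// Keep ~10% of traces; child spans follow their parent's decision.
val tracerProvider = SdkTracerProvider.builder()
    .setSampler(Sampler.parentBased(Sampler.traceIdRatioBased(0.1)))
    .addSpanProcessor(BatchSpanProcessor.builder(otlpExporter).build())
    .build()
```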
## Custom SDK

Provide a pre-configured OpenTelemetry SDK. When using `setSdk()`, other configuration methods like `addSpanExporter()` are ignored since the SDK is already configured.
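A sketch handing Koog a fully configured SDK through the documented `setSdk()` method:

```kotlin
import io.opentelemetry.sdk.OpenTelemetrySdk
import io.opentelemetry.sdk.trace.SdkTracerProvider
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor

val sdk = OpenTelemetrySdk.builder()
    .setTracerProvider(
        SdkTracerProvider.builder()
            .addSpanProcessor(BatchSpanProcessor.builder(otlpExporter).build())
            .build()
    )
    .build()

install(OpenTelemetry) {
    setSdk(sdk) // addSpanExporter() etc. are ignored from here on
}
```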
## Examples

### Basic Example
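A sketch of a complete minimal setup: console exporter, one agent run, and a flush delay. The agent-construction names (`AIAgent`, `simpleOpenAIExecutor`, `run`) are assumptions for illustration; the OpenTelemetry calls match the configuration shown earlier.

```kotlin
import io.opentelemetry.exporter.logging.LoggingSpanExporter
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    val agent = AIAgent(
        executor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")), // assumed helper
        systemPrompt = "You are a helpful assistant.",
    ) {
        install(OpenTelemetry) {
            addSpanExporter(LoggingSpanExporter.create()) // spans printed to stdout
        }
    }

    println(agent.run("Hello!")) // assumed run method
    delay(2_000) // let the exporter flush before the process exits
}
```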
### Production Example
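A sketch combining the practices recommended below: OTLP to a collector, batch processing, 10% parent-based sampling, and an extra resource attribute. The collector hostname is a placeholder.

```kotlin
import io.opentelemetry.api.common.AttributeKey
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter
import io.opentelemetry.sdk.OpenTelemetrySdk
import io.opentelemetry.sdk.resources.Resource
import io.opentelemetry.sdk.trace.SdkTracerProvider
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor
import io.opentelemetry.sdk.trace.samplers.Sampler

val resource = Resource.getDefault().merge(
    Resource.create(
        Attributes.of(AttributeKey.stringKey("deployment.environment"), "production")
    )
)

val sdk = OpenTelemetrySdk.builder()
    .setTracerProvider(
        SdkTracerProvider.builder()
            .setResource(resource)
            .setSampler(Sampler.parentBased(Sampler.traceIdRatioBased(0.1))) // 10%
            .addSpanProcessor(
                BatchSpanProcessor.builder(
                    OtlpGrpcSpanExporter.builder()
                        .setEndpoint("http://otel-collector:4317") // placeholder host
                        .build()
                ).build()
            )
            .build()
    )
    .build()

install(OpenTelemetry) { setSdk(sdk) }
```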
### Multi-Exporter Example
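A sketch fanning the same spans out to the console and to a collector by calling the documented `addSpanExporter()` more than once:

```kotlin
import io.opentelemetry.exporter.logging.LoggingSpanExporter
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter

install(OpenTelemetry) {
    addSpanExporter(LoggingSpanExporter.create()) // local debugging
    addSpanExporter(
        OtlpGrpcSpanExporter.builder()
            .setEndpoint("http://localhost:4317") // collector / Jaeger
            .build()
    )
}
```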
## Token Usage Tracking
OpenTelemetry automatically tracks token usage for LLM calls, which enables:

- Cost monitoring: Track API usage costs
- Performance analysis: Identify expensive prompts
- Optimization: Find opportunities to reduce token usage
## Troubleshooting
### No traces appearing
- Verify the exporter endpoint is accessible
- Check that sampling is not set to `alwaysOff()`
- Ensure you wait for async export (add a delay before exit)
- Enable verbose logging: `setVerbose(true)`
### Missing spans or incomplete traces
- Verify agent execution completes successfully
- Check for exceptions in your code
- Ensure proper span processor configuration
- Wait sufficient time for batch processing
### Too many spans
- Adjust the sampling rate: `Sampler.traceIdRatioBased(0.1)`
- Use parent-based sampling for consistency
- Filter at the collector level
### High memory usage
- Reduce the max queue size in `BatchSpanProcessor`
- Decrease the schedule delay for more frequent exports
- Decrease the sampling ratio to collect fewer traces
## Best Practices
### Use OTLP exporter
OTLP is the standard protocol and works with all major backends. Prefer it over backend-specific exporters.
### Configure appropriate sampling
In production, use ratio-based sampling (e.g., 0.1 = 10%) to balance observability with overhead.
### Add meaningful resource attributes
Include environment, region, and version attributes to make traces easier to filter and analyze.
### Monitor token usage
Use `gen_ai.usage.*` attributes to track and optimize LLM costs.

### Wait for export completion
Add a delay before application exit to ensure all spans are exported.
## Related Features
- Tracing: Koog-specific tracing with custom message processors
- Event Handlers: Lightweight hooks for custom monitoring logic