Prometheus Setup

Prometheus is the time-series database that stores and queries metrics collected by gNMIc.

Overview

Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability:
  • Pull-based metric collection (scraping)
  • Powerful query language (PromQL)
  • Time-series data storage
  • Built-in alerting capabilities
  • Service discovery support
Lab Configuration:
  • Container: prometheus
  • Management IP: 10.77.1.13
  • Web UI Port: 9090
  • Config File: configs/prometheus/prometheus.yml

Configuration File

The Prometheus configuration is minimal and focused:
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "gnmic"
    static_configs:
      - targets: ["gnmic:9273"]

Configuration Breakdown

global:
  scrape_interval: 5s
scrape_interval: How often Prometheus polls metric endpoints
  • Lab value: 5 seconds (Prometheus itself defaults to 1m when the setting is omitted)
  • Matches gNMIc’s sample-interval for consistent data
  • Lower values = more data points but higher storage/CPU usage
  • Higher values = less granular but more efficient
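To put numbers on this trade-off, the sketch below estimates how many samples Prometheus ingests per day at a few scrape intervals. The series count of 2,000 is a hypothetical figure chosen for illustration, not a measurement from this lab.

```python
def samples_per_day(active_series: int, scrape_interval_s: float) -> int:
    """Samples ingested per day if every series is scraped once per interval."""
    scrapes_per_day = 86_400 / scrape_interval_s
    return int(active_series * scrapes_per_day)


# Hypothetical series count, used only to compare intervals.
SERIES = 2_000
for interval in (5, 10, 15, 60):
    print(f"{interval:>2}s -> {samples_per_day(SERIES, interval):,} samples/day")
```

Doubling the interval halves the sample volume, which is why raising scrape_interval is usually the first knob to turn when storage or CPU becomes a concern.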

Data Retention

Prometheus stores data with default retention settings:
Setting          Default       Description
Retention time   15 days       How long to keep data
Retention size   No limit      Maximum storage size
Storage path     /prometheus   Data directory
To customize retention, modify the container command in lab.yml:
prometheus:
  kind: linux
  image: prom/prometheus
  cmd: >
    --config.file=/etc/prometheus/prometheus.yml
    --storage.tsdb.retention.time=30d
    --storage.tsdb.retention.size=10GB
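When choosing retention values, a rough capacity estimate helps. The Prometheus storage documentation gives the rule of thumb disk = retention seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample on average. The sketch below applies that formula; the series count is a hypothetical input, so substitute the value of prometheus_tsdb_head_series from your own lab.

```python
def estimated_disk_bytes(retention_days: float, active_series: int,
                         scrape_interval_s: float,
                         bytes_per_sample: float = 2.0) -> float:
    """Rule-of-thumb TSDB size: retention * ingest rate * bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    return retention_days * 86_400 * samples_per_second * bytes_per_sample


# Hypothetical lab: 2,000 active series scraped every 5 seconds, kept for 30 days.
size = estimated_disk_bytes(retention_days=30, active_series=2_000, scrape_interval_s=5)
print(f"~{size / 2**30:.1f} GiB")
```

If the estimate is well below the value passed to --storage.tsdb.retention.size, the time limit will be the one that actually trims data.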

Access Prometheus Web UI

Prometheus includes a built-in web interface:
http://localhost:9090

Key Web UI Features

Execute PromQL queries and visualize results
  1. Navigate to Graph tab
  2. Enter a query in the expression box
  3. Click Execute
  4. View table or graph visualization
Example queries:
# All CPU metrics
system_cpu

# Interface statistics for a specific device
port_statistics_in_packets{device="bng1"}

# Rate of change over 5 minutes
rate(port_statistics_in_packets[5m])

PromQL Query Language

Prometheus Query Language (PromQL) is used to query and aggregate metrics.

Basic Queries

# All samples of a metric
system_cpu

# Filter by label
system_cpu{device="bng1"}

# Multiple label filters
port_statistics_in_packets{device="bng1",port_id="1/1/c1/1"}

Common Functions

rate()
# Packets per second
rate(port_statistics_in_packets[5m])

# Bits per second (octet counter * 8)
rate(port_statistics_in_octets[5m]) * 8
Use for counter metrics that always increase.

irate()
# Instantaneous rate using last two samples
irate(port_statistics_in_packets[5m])
More responsive than rate() but can be volatile.

sum()
# Total packets across all interfaces
sum(port_statistics_in_packets)

# Total per device
sum by (device) (port_statistics_in_packets)

# Total per device and port
sum by (device, port_id) (port_statistics_in_packets)

avg()
# Average CPU across all devices
avg(system_cpu)

# Average per device
avg by (device) (system_cpu)

increase()
# Total packets in last 5 minutes
increase(port_statistics_in_packets[5m])
Like rate() but returns total change, not per-second.
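The relationship between rate(), irate(), and increase() is easier to see on a handful of raw samples. The following is a simplified model, not Prometheus's actual implementation: it handles counter resets but skips the extrapolation to window boundaries that Prometheus performs, so real query results can differ slightly. The function names and sample values are made up for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_increase(samples: List[Sample]) -> float:
    """Total counter growth in the window; a drop is treated as a reset to zero."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second growth between the first and last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    return simple_increase(samples) / elapsed


def simple_irate(samples: List[Sample]) -> float:
    """Per-second growth computed from the last two samples only."""
    return simple_rate(samples[-2:])


# Four samples taken 5 seconds apart, with a burst in the final interval.
window = [(0, 1000), (5, 1500), (10, 2000), (15, 4000)]
print(simple_increase(window), simple_rate(window), simple_irate(window))
```

On this window the averaged rate smooths the final burst while the two-sample rate reports it in full, which is the volatility mentioned above.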

Example Queries for BNG Lab

# CPU utilization by device
system_cpu{device=~"bng.*"}

# Memory usage percentage
(system_memory_pools_in_use / system_memory_pools_available) * 100

# Devices with high CPU
system_cpu > 80

Querying from Command Line

You can query Prometheus using its HTTP API:
# Query current value
curl -G http://localhost:9090/api/v1/query \
  --data-urlencode 'query=system_cpu{device="bng1"}'
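The same instant query can be issued from a script. This is a minimal sketch using only the Python standard library; it assumes port 9090 is reachable on localhost as in the curl example, and the helper names are illustrative rather than part of the lab.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumption: same address as the curl example


def build_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """URL for an instant query against /api/v1/query, with the expression encoded."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def parse_instant_vector(payload: dict) -> list:
    """Return (labels, value) pairs from an instant-vector response body."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    # Each result carries "value": [timestamp, "<value as string>"].
    return [(r["metric"], float(r["value"][1])) for r in payload["data"]["result"]]


def instant_query(expr: str) -> list:
    """Run the query over HTTP and return the parsed result."""
    with urllib.request.urlopen(build_query_url(expr), timeout=10) as resp:
        return parse_instant_vector(json.load(resp))
```

Calling instant_query('system_cpu{device="bng1"}') returns one (labels, value) pair per matching series, or an empty list if nothing matches.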

Monitoring Prometheus

Check Scrape Health

# View Prometheus logs
sudo docker logs prometheus

# Check targets via API
curl http://localhost:9090/api/v1/targets | jq

# Expected output: "health": "up" for gnmic target
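For scripted health checks, the targets response can be reduced to the few fields that matter. The sketch below works on an already-decoded JSON body; the sample payload is a trimmed, hand-written example of the /api/v1/targets response shape, not output captured from this lab.

```python
def target_health(payload: dict) -> dict:
    """Map each active target's scrape URL to (job, health, lastError)."""
    return {
        t["scrapeUrl"]: (t["labels"].get("job"), t["health"], t.get("lastError", ""))
        for t in payload["data"]["activeTargets"]
    }


def unhealthy(payload: dict) -> list:
    """Scrape URLs of targets whose health is anything other than "up"."""
    return [url for url, (_, health, _) in target_health(payload).items()
            if health != "up"]


# Trimmed, illustrative response body.
SAMPLE = {
    "status": "success",
    "data": {"activeTargets": [{
        "labels": {"instance": "gnmic:9273", "job": "gnmic"},
        "scrapeUrl": "http://gnmic:9273/metrics",
        "lastError": "",
        "health": "up",
    }]},
}
print(target_health(SAMPLE))
print(unhealthy(SAMPLE))
```

When a target is down, the lastError field usually names the cause (for example a refused connection or a timeout).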

Query Statistics

# Prometheus metrics about itself
prometheus_tsdb_head_series          # Number of time series
prometheus_tsdb_head_samples_appended_total  # Samples added
rate(prometheus_tsdb_head_samples_appended_total[5m])  # Sample rate

# Storage metrics
prometheus_tsdb_storage_blocks_bytes  # Storage size
prometheus_tsdb_head_chunks          # Active chunks in memory

Verify Data Collection

# Test gNMIc endpoint from Prometheus container
sudo docker exec prometheus wget -O- http://gnmic:9273/metrics | head

# Count metrics available
sudo docker exec prometheus wget -O- http://gnmic:9273/metrics | grep -c "^[a-z]"
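Note that the grep above counts sample lines, so a metric exported for many devices or ports is counted once per series. The sketch below separates the two numbers for Prometheus text-format output. It is a simplified parser for illustration, and the sample text is hand-written rather than taken from gNMIc.

```python
def summarize_exposition(text: str) -> tuple:
    """Return (sample_lines, distinct_metric_names) for text-format metrics."""
    names = set()
    samples = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        samples += 1
        # The metric name ends at the first '{' or at the first whitespace.
        names.add(line.split("{", 1)[0].split()[0])
    return samples, len(names)


# Hand-written sample in the exposition format.
SAMPLE_TEXT = """\
# TYPE system_cpu gauge
system_cpu{device="bng1"} 12
system_cpu{device="bng2"} 30
port_statistics_in_packets{device="bng1",port_id="1/1/c1/1"} 1000
"""
print(summarize_exposition(SAMPLE_TEXT))
```

The first number approximates how many time series Prometheus will store from this endpoint; the second is the number of distinct metric names you can query.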

Performance Tuning

Adjust Scrape Interval

global:
  scrape_interval: 10s  # Less frequent scraping
  
scrape_configs:
  - job_name: "gnmic"
    scrape_interval: 5s   # Override for specific job
    static_configs:
      - targets: ["gnmic:9273"]

Scrape Timeout

global:
  scrape_timeout: 10s  # Max time to wait for scrape; must not exceed scrape_interval

scrape_configs:
  - job_name: "gnmic"
    scrape_timeout: 5s   # Job-specific timeout
    static_configs:
      - targets: ["gnmic:9273"]

Relabeling

scrape_configs:
  - job_name: "gnmic"
    static_configs:
      - targets: ["gnmic:9273"]
    
    # Add custom labels
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'telemetry-collector'
    
    # Drop high-cardinality metrics
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'expensive_metric_.*'
        action: drop
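One detail worth knowing when writing drop rules: Prometheus anchors relabel regexes at both ends, so the pattern must match the whole metric name. The sketch below imitates that behaviour with Python's re.fullmatch. Prometheus uses RE2 syntax, so this is only an approximation for simple patterns, and the metric names are hypothetical.

```python
import re


def keep_after_drop(metric_names, drop_regex):
    """Names that survive a drop rule whose regex must match the full name."""
    pattern = re.compile(drop_regex)
    return [name for name in metric_names if not pattern.fullmatch(name)]


# Hypothetical metric names.
names = ["expensive_metric_a", "my_expensive_metric_b", "system_cpu"]
print(keep_after_drop(names, "expensive_metric_.*"))
```

Here my_expensive_metric_b is kept because the pattern does not match from the start of the name; a leading .* would be needed to drop it as well.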

Troubleshooting

Problem: gNMIc target shows as DOWN in Prometheus
Checklist:
  1. Verify gNMIc is running:
    sudo docker ps | grep gnmic
    
  2. Test connectivity from Prometheus:
    sudo docker exec prometheus wget -O- http://gnmic:9273/metrics
    
  3. Check Prometheus logs:
    sudo docker logs prometheus 2>&1 | grep gnmic
    
  4. Verify configuration:
    sudo docker exec prometheus cat /etc/prometheus/prometheus.yml
    
Problem: Queries return empty results
Solutions:
  1. Check if metrics exist:
    curl http://localhost:9273/metrics | grep system_cpu
    
  2. Verify scraping is working (Targets page should show UP)
  3. Check query syntax and label filters:
    # Wrong (no such label)
    system_cpu{host="bng1"}
    
    # Correct
    system_cpu{device="bng1"}
    
  4. Ensure time range includes data points
Problem: Prometheus container using excessive memory
Solutions:
  1. Reduce retention:
    --storage.tsdb.retention.time=7d
    --storage.tsdb.retention.size=5GB
    
  2. Increase scrape interval in prometheus.yml:
    global:
      scrape_interval: 15s
    
  3. Drop unused metrics in gNMIc or Prometheus config
  4. Monitor series cardinality:
    prometheus_tsdb_head_series
    
Problem: PromQL queries taking too long
Solutions:
  1. Reduce query time range
  2. Use recording rules for expensive queries (requires config reload)
  3. Add more specific label filters
  4. Use a shorter range window (for example [1m] instead of [1h]) so fewer samples are read per series
  5. Limit results with topk() or bottomk()

Integration with Grafana

Prometheus is pre-configured as a Grafana datasource:
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    uid: prometheus
    isDefault: true
Access in Grafana:
  1. Navigate to Configuration → Data Sources
  2. Select Prometheus
  3. Click Test to verify connection
Grafana uses the same PromQL query language. Queries in Prometheus Web UI can be copied directly to Grafana panels.

Advanced Configuration

Multiple Scrape Targets

To scrape additional exporters:
scrape_configs:
  - job_name: "gnmic"
    static_configs:
      - targets: ["gnmic:9273"]
  
  - job_name: "node_exporter"
    static_configs:
      - targets: ["node-exporter:9100"]
  
  - job_name: "custom_exporter"
    static_configs:
      - targets:
          - "exporter1:8080"
          - "exporter2:8080"

Service Discovery

For dynamic environments:
scrape_configs:
  - job_name: "docker"
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      - source_labels: [__meta_docker_container_name]
        target_label: container

Recording Rules

Pre-compute expensive queries. Save the groups in a separate rules file and list that file under rule_files in prometheus.yml:
groups:
  - name: bng_metrics
    interval: 30s
    rules:
      - record: interface_bps_in
        expr: rate(port_statistics_in_octets[5m]) * 8
      
      - record: interface_bps_out
        expr: rate(port_statistics_out_octets[5m]) * 8

Next Steps

Grafana Dashboards

Visualize metrics with pre-built dashboards

Available Metrics

Complete catalog of Nokia SROS metrics
