Monitoring

Overview

The Monitoring feature provides Prometheus-compatible metrics endpoints for tracking system performance, request patterns, and operational health. Integrate with your observability stack for real-time monitoring and alerting.

Prometheus Compatible

Standard Prometheus text format metrics

Request Tracking

Total request counters and performance metrics

Metrics Endpoint

Expose metrics in Prometheus text format:

GET /api/monitoring/metrics

Response:

# HELP requests_total Total number of requests
# TYPE requests_total counter
requests_total 1547.0

The metrics endpoint does not require authentication, so make it reachable from your monitoring infrastructure while restricting public access (see Security Considerations).
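To make the response format concrete, the following is a minimal sketch of reading the sample response above with only the standard library. The function name `parse_metrics` is hypothetical, and the parser assumes the simple `name value` form shown here (no timestamps); a real consumer would normally rely on Prometheus itself or an existing parser.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse simple samples from Prometheus text exposition output."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumes "name value" with no trailing timestamp.
        name, value = line.rsplit(" ", 1)
        samples[name] = float(value)
    return samples


# The sample response from this section.
SAMPLE = """\
# HELP requests_total Total number of requests
# TYPE requests_total counter
requests_total 1547.0
"""

metrics = parse_metrics(SAMPLE)
print(metrics["requests_total"])
```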

Available Metrics

Request Counter

REQUESTS = Counter("requests_total", "Total number of requests")

Metric Name: requests_total
Type: Counter
Description: Counts requests to the monitoring home endpoint (GET /api/monitoring/), which increments the counter on each call
Source: ~/workspace/source/app/features/monitoring/presentation/routes.py:7

Prometheus Integration

Scrape Configuration

Add to your prometheus.yml:

scrape_configs:
  - job_name: 'water-quality-api'
    scrape_interval: 15s
    static_configs:
      - targets: ['api.example.com:443']
    metrics_path: '/api/monitoring/metrics'
    scheme: https

Example Queries

Total requests:

requests_total

Request rate (per second):

rate(requests_total[5m])

Request increase over time:

increase(requests_total[1h])
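The arithmetic behind these queries can be sketched in a few lines. This is a simplified illustration, not Prometheus's actual implementation: it handles counter resets but does not model the extrapolation Prometheus applies at the edges of the range window. The function names and sample values are made up for the example.

```python
def counter_increase(samples: list[tuple[float, float]]) -> float:
    """Total increase across (timestamp, value) samples of a counter."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the process restarted and the counter began again at zero,
        # so the whole current value counts as new increase.
        total += cur - prev if cur >= prev else cur
    return total


def counter_rate(samples: list[tuple[float, float]]) -> float:
    """Average per-second increase over the sampled window."""
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


# Scrapes every 15 seconds, with a restart between t=30 and t=45.
samples = [(0.0, 100.0), (15.0, 130.0), (30.0, 160.0), (45.0, 10.0), (60.0, 40.0)]

print(counter_increase(samples))
print(counter_rate(samples))
```

Because both functions work on differences, the absolute counter value never matters, which is why `requests_total` on its own is rarely useful on a graph.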

Grafana Dashboard

Create visualizations for your metrics:

{
  "panels": [
    {
      "title": "Total Requests",
      "targets": [
        {
          "expr": "requests_total",
          "legendFormat": "Total Requests"
        }
      ],
      "type": "stat"
    },
    {
      "title": "Request Rate",
      "targets": [
        {
          "expr": "rate(requests_total[5m])",
          "legendFormat": "Requests/sec"
        }
      ],
      "type": "graph"
    }
  ]
}

Health Check

Basic health check endpoint:

GET /api/monitoring/

Response:

{
  "message": "Monitoring Home"
}

This endpoint increments the requests_total counter on each call.

Implementation Details

Prometheus Client

The monitoring feature uses the official Prometheus Python client:

from prometheus_client import Counter, generate_latest
from fastapi import Response

REQUESTS = Counter("requests_total", "Total number of requests")

@monitoring_router.get("/metrics")
async def metrics():
    data = generate_latest()
    return Response(content=data, media_type="text/plain")
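As a variation on the handler above, the sketch below shows two options the `prometheus_client` package provides: a dedicated `CollectorRegistry` (handy for isolating metrics in tests) and the `CONTENT_TYPE_LATEST` constant, which carries the full content type of the text format. Whether the application should adopt either is an assumption here, not something the source confirms.

```python
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    Counter,
    generate_latest,
)

# A dedicated registry keeps this sketch separate from the process-wide default.
registry = CollectorRegistry()
requests = Counter("requests_total", "Total number of requests", registry=registry)
requests.inc()

# generate_latest returns bytes in the Prometheus text format.
payload = generate_latest(registry)

print(CONTENT_TYPE_LATEST)
print(payload.decode("utf-8"))
```

In a FastAPI handler, `CONTENT_TYPE_LATEST` could be passed as `media_type` in place of the plain `"text/plain"` string.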

Custom Metrics

Extend monitoring by adding custom metrics:

from prometheus_client import Counter, Histogram, Gauge

# Counter: monotonically increasing value
ALERTS_SENT = Counter(
    "alerts_sent_total",
    "Total number of alerts sent",
    ["alert_type"]  # Labels
)

# Histogram: distribution of values
ANALYSIS_DURATION = Histogram(
    "analysis_duration_seconds",
    "Time spent processing analysis",
    ["analysis_type"]
)

# Gauge: value that can go up or down
ACTIVE_METERS = Gauge(
    "active_meters",
    "Number of currently active meters"
)

# Usage
ALERTS_SENT.labels(alert_type="dangerous").inc()
with ANALYSIS_DURATION.labels(analysis_type="prediction").time():
    # Process analysis
    pass
ACTIVE_METERS.set(42)
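When adding custom metrics it helps to check what they actually record. The sketch below assumes a separate `CollectorRegistry` and explicit histogram buckets (both choices made for illustration, not taken from the application), and reads values back with `get_sample_value`.

```python
from prometheus_client import CollectorRegistry, Counter, Histogram

registry = CollectorRegistry()

alerts = Counter(
    "alerts_sent_total",
    "Total number of alerts sent",
    ["alert_type"],
    registry=registry,
)
duration = Histogram(
    "analysis_duration_seconds",
    "Time spent processing analysis",
    ["analysis_type"],
    buckets=(0.1, 0.5, 1.0, 5.0),  # illustrative bucket bounds, in seconds
    registry=registry,
)

alerts.labels(alert_type="dangerous").inc()
duration.labels(analysis_type="prediction").observe(0.3)

# Read back individual samples by name and label set.
sent = registry.get_sample_value("alerts_sent_total", {"alert_type": "dangerous"})
within_half_second = registry.get_sample_value(
    "analysis_duration_seconds_bucket",
    {"analysis_type": "prediction", "le": "0.5"},
)
print(sent, within_half_second)
```

Histogram buckets are cumulative, so the 0.3 second observation is counted in the `le="0.5"` bucket and in every larger one.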

Best Practices

Meaningful Names

Use descriptive metric names following Prometheus naming conventions

Appropriate Types

Choose the right metric type (Counter, Gauge, Histogram, Summary)

Strategic Labels

Add labels for dimensions but avoid high cardinality

Regular Scraping

Configure reasonable scrape intervals (10-60 seconds)
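On the label cardinality point above: each distinct combination of label values becomes its own time series, so the worst case for one metric is the product of the distinct values per label. A small sketch, with label names and counts invented for illustration:

```python
from math import prod


def series_count(label_values: dict[str, int]) -> int:
    """Upper bound on time series for one metric.

    Maps each label name to its number of distinct values; a metric
    with no labels yields exactly one series.
    """
    return prod(label_values.values())


# Bounded label sets stay small.
bounded = series_count({"alert_type": 4, "severity": 3})

# A per-device identifier as a label multiplies the series count.
unbounded = series_count({"alert_type": 4, "meter_id": 50_000})

print(bounded, unbounded)
```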

Alerting Rules

Example Prometheus alerting rules:

groups:
  - name: water_quality_api
    interval: 30s
    rules:
      - alert: HighRequestRate
        expr: rate(requests_total[5m]) > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High request rate detected"
          description: "Request rate is {{ $value }} requests/sec"
      
      - alert: NoRecentRequests
        expr: rate(requests_total[10m]) == 0 or absent(requests_total)
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "No requests received"
          description: "API may be down or unreachable"
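The `for` clause in these rules delays firing until the expression has held continuously for the stated duration. The sketch below is a simplified model of that behavior, not Prometheus code: `for_steps` stands for the `for` duration divided by the evaluation interval (for example, 5m at a 30s interval gives 10), and the function name and sample values are hypothetical.

```python
def alert_states(values: list[float], threshold: float, for_steps: int) -> list[str]:
    """Return 'inactive', 'pending', or 'firing' for each rule evaluation."""
    states = []
    consecutive = 0  # evaluations in a row where the condition held
    for value in values:
        if value > threshold:
            consecutive += 1
            # The alert fires once the condition has held for `for_steps`
            # intervals after the evaluation where it first became true.
            states.append("firing" if consecutive > for_steps else "pending")
        else:
            # Any evaluation where the condition fails resets the timer.
            consecutive = 0
            states.append("inactive")
    return states


rates = [50, 120, 130, 140, 90, 150]
print(alert_states(rates, threshold=100, for_steps=2))
```

A short dip below the threshold resets the pending period, which is why brief spikes never reach the firing state.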

Expanding Metrics

Recommended additional metrics for production:

Application Metrics

Request latency by endpoint
Error rate and types
Active WebSocket connections
Database query performance

Business Metrics

Alerts sent by type
Analysis created/completed
Active users and workspaces
Meters registered and connected

Infrastructure Metrics

Memory usage
CPU utilization
Database connections
Cache hit rates

Water Quality Metrics

Average sensor values by type
Anomaly detection rates
Data collection frequency
Missing data percentages

Security Considerations

Metrics endpoints should be accessible to monitoring infrastructure but protected from public access. Consider:

Network-level access controls
VPN or private network access
IP allowlisting
Separate authentication for metrics

Example: Complete Monitoring Setup

# extended_monitoring.py
from prometheus_client import Counter, Histogram, Gauge, generate_latest
from fastapi import APIRouter, Response
import time

monitoring_router = APIRouter(prefix="/monitoring", tags=["Monitoring"])

# Define metrics
REQUESTS = Counter("requests_total", "Total requests")
ALERTS = Counter("alerts_sent_total", "Total alerts sent", ["type"])
ANALYSIS = Counter("analysis_created_total", "Total analyses created", ["type"])
ANALYSIS_DURATION = Histogram(
    "analysis_duration_seconds",
    "Analysis processing time",
    ["type"]
)
ACTIVE_METERS = Gauge("active_meters", "Currently connected meters")
ACTIVE_CONNECTIONS = Gauge("websocket_connections", "Active WebSocket connections")

@monitoring_router.get("/")
async def home():
    REQUESTS.inc()
    return {"message": "Monitoring Home", "timestamp": time.time()}

@monitoring_router.get("/metrics")
async def metrics():
    data = generate_latest()
    return Response(content=data, media_type="text/plain")

# Helper functions for other parts of the application
def track_alert(alert_type: str):
    ALERTS.labels(type=alert_type).inc()

def track_analysis(analysis_type: str, duration: float):
    ANALYSIS.labels(type=analysis_type).inc()
    ANALYSIS_DURATION.labels(type=analysis_type).observe(duration)

def update_active_meters(count: int):
    ACTIVE_METERS.set(count)

def update_websocket_connections(count: int):
    ACTIVE_CONNECTIONS.set(count)
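Callers of `track_analysis` need a duration to pass in. One possible approach, sketched below, is a small context manager that measures elapsed time and reports it. The `timed_analysis` name is hypothetical, and `track_analysis` is replaced here by a stub with the same signature so the example runs without Prometheus.

```python
import time
from contextlib import contextmanager

recorded = []  # stand-in for the metrics backend: (analysis_type, duration) pairs


def track_analysis(analysis_type: str, duration: float) -> None:
    """Stub with the same signature as the helper above."""
    recorded.append((analysis_type, duration))


@contextmanager
def timed_analysis(analysis_type: str):
    """Measure the wrapped block and report its duration in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed analyses are timed too.
        track_analysis(analysis_type, time.perf_counter() - start)


with timed_analysis("prediction"):
    sum(range(1000))  # placeholder for real analysis work

print(recorded)
```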

Monitoring Dashboard Example

Visualize your metrics with this Grafana dashboard JSON:

{
  "dashboard": {
    "title": "Water Quality API Monitoring",
    "panels": [
      {
        "title": "Request Rate",
        "targets": [{"expr": "rate(requests_total[5m])"}],
        "type": "graph"
      },
      {
        "title": "Active Meters",
        "targets": [{"expr": "active_meters"}],
        "type": "stat"
      },
      {
        "title": "Alerts Sent by Type",
        "targets": [{"expr": "rate(alerts_sent_total[1h])"}],
        "type": "graph"
      },
      {
        "title": "Analysis Duration (p95)",
        "targets": [{
          "expr": "histogram_quantile(0.95, rate(analysis_duration_seconds_bucket[5m]))"
        }],
        "type": "graph"
      }
    ]
  }
}
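The p95 panel relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below approximates that calculation under simplifying assumptions (at least one finite bucket, lowest bucket starting at zero); the bucket bounds and counts are invented for illustration.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with math.inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if len(buckets) < 2:
        raise ValueError("need at least one finite bucket plus +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Linear interpolation within the bucket that holds the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


buckets = [(0.1, 10), (0.5, 60), (1.0, 90), (5.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, buckets)
p50 = histogram_quantile(0.5, buckets)
print(p95, p50)
```

The estimate is only as precise as the bucket layout: here the p95 lands inside the wide 1.0 to 5.0 bucket, so the true value could be anywhere in that range.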
