The Kimbernetes cluster includes a complete observability stack that provides real-time monitoring, logging, and visualization capabilities. The stack is built on industry-standard tools including Grafana, Prometheus, Loki, and Alloy, all managed declaratively through Flux HelmReleases.

Architecture

The observability stack follows a unified telemetry collection pipeline:
┌─────────────────────────────────────────────────────────────┐
│                    Data Sources                              │
│  • Pod Logs        • Node Logs      • Cluster Events         │
│  • Metrics         • Custom Metrics • Service Monitors       │
└─────────────────┬───────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                  Grafana Alloy                               │
│  Unified telemetry collector with intelligent routing        │
│  • Log collection from pods and nodes                        │
│  • Metric scraping from exporters                            │
│  • Label extraction and enrichment                           │
└─────────────┬───────────────────────────────────────────────┘

       ┌──────────┴───────────┐
       ▼                      ▼
┌──────────────┐      ┌──────────────┐
│  Prometheus  │      │     Loki     │
│   Metrics    │      │     Logs     │
│   Storage    │      │   Storage    │
└──────┬───────┘      └──────┬───────┘
       │                     │
       └──────────┬──────────┘

         ┌────────────────┐
         │    Grafana     │
         │ Visualization  │
         │  & Dashboards  │
         └────────────────┘

Stack Components

Grafana Stack

Visualization platform with operator-managed instances, datasources, and dashboards

Prometheus

High-performance metrics collection and storage with 2-day retention and 25GiB capacity

Grafana Alloy

Unified telemetry collector for logs, metrics, and events with intelligent routing

Loki

Log aggregation system with 31-day retention and S3-compatible storage
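
As noted above, each component is managed declaratively through a Flux HelmRelease. A minimal sketch of what one such release might look like, using Loki as the example — the chart version, namespaces, and repository name here are illustrative, not this cluster's actual manifests:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: observability
spec:
  interval: 10m
  chart:
    spec:
      chart: loki
      sourceRef:
        kind: HelmRepository
        name: grafana            # assumed HelmRepository name
        namespace: flux-system
  values:
    deploymentMode: SingleBinary # matches the SingleBinary mode described below
```

Flux reconciles this resource on the given interval, so chart upgrades and value changes flow through Git rather than manual helm commands.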

Telemetry Flow

1. Collection

Grafana Alloy runs as a DaemonSet on every node, collecting:
  • Pod logs from all containers with automatic label extraction
  • Node logs from systemd journal (kubelet, containerd)
  • Cluster events from the Kubernetes API
  • Metrics from node-exporter and Kepler
2. Processing

Alloy enriches telemetry data with:
  • Cluster and namespace labels
  • Pod controller and application names
  • Node names and container images
  • Custom labels for filtering
3. Storage

Data is routed to appropriate backends:
  • Logs → Loki with MinIO S3 storage backend
  • Metrics → Prometheus with local persistent volumes
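
In the upstream Loki Helm chart, routing logs to an S3-compatible backend such as MinIO is expressed through storage values. A trimmed sketch — the endpoint and bucket name are assumptions, not this cluster's settings:

```yaml
loki:
  storage:
    type: s3
    bucketNames:
      chunks: loki-chunks                     # assumed bucket name
    s3:
      endpoint: http://minio.minio.svc:9000   # assumed in-cluster MinIO address
      s3ForcePathStyle: true                  # MinIO requires path-style addressing
      insecure: true                          # plain HTTP inside the cluster
```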
4. Visualization

Grafana provides:
  • Pre-configured datasources for Prometheus and Loki
  • Custom dashboards for cluster health
  • Query interface for LogQL and PromQL
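
With the grafana-operator, "pre-configured datasources" means each datasource is itself a Kubernetes resource. A sketch of a Prometheus datasource — the instance-selector label and service URL are assumptions (the URL shown is the Prometheus Operator's default `prometheus-operated` service):

```yaml
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: prometheus
  namespace: observability
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana   # assumed label on the Grafana CR it attaches to
  datasource:
    name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-operated.observability.svc:9090
```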

Key Features

High Availability

  • Prometheus: 2 replicas with hard pod anti-affinity
  • Alloy: DaemonSet deployment ensures coverage on all nodes
  • Loki: SingleBinary mode with persistent storage
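
Hard pod anti-affinity for the two Prometheus replicas is typically expressed in kube-prometheus-stack values roughly as follows — a sketch of the pattern, not the cluster's exact configuration:

```yaml
prometheus:
  prometheusSpec:
    replicas: 2
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:   # "hard": scheduling fails rather than co-locating
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: prometheus
            topologyKey: kubernetes.io/hostname           # never two replicas on the same node
```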

Data Retention

Component      Retention Period   Storage Capacity
Prometheus     2 days             25 GiB per replica
Loki           31 days            30 GiB (MinIO)
Alertmanager   N/A                50 GiB
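
These limits would typically be set in the respective Helm values. A hedged sketch, with field paths following the upstream kube-prometheus-stack and Loki charts rather than this repository's files:

```yaml
# Prometheus (kube-prometheus-stack values)
prometheus:
  prometheusSpec:
    retention: 2d
    storageSpec:
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: 25Gi   # per replica

# Loki (loki chart values) — 31 days expressed in hours
loki:
  limits_config:
    retention_period: 744h
```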

Flux Integration

The observability stack includes custom metrics for Flux resources:
  • gotk_resource_info for Kustomizations
  • gotk_resource_info for HelmReleases
  • gotk_resource_info for GitRepositories
  • gotk_resource_info for HelmRepositories
These metrics enable monitoring of GitOps deployments directly in Grafana.
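
In Flux's upstream monitoring example, gotk_resource_info is generated by kube-state-metrics via its customResourceState feature rather than by the Flux controllers themselves. A heavily trimmed sketch for one resource kind, assuming kube-state-metrics is deployed as a kube-prometheus-stack subchart:

```yaml
kube-state-metrics:
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: kustomize.toolkit.fluxcd.io
              version: v1
              kind: Kustomization
            metricNamePrefix: gotk
            metrics:
              - name: resource_info
                help: The current state of a Flux Kustomization
                each:
                  type: Info
                  info:
                    labelsFromPath:
                      name: [metadata, name]   # exposed as a label on the metric
```

The same resource list is extended with HelmRelease, GitRepository, and HelmRepository entries to produce the other three metrics.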

Access Grafana

Grafana is exposed via HTTPRoute at your cluster’s configured domain. Check the HTTPRoute configuration in overlays/kimawesome/infrastructure/observability/grafana-operator/httproute.yaml.
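
A representative HTTPRoute for Grafana might look like the following — the Gateway reference, hostname, and Service details are placeholders; consult the file above for the real values:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: grafana
  namespace: observability
spec:
  parentRefs:
    - name: gateway              # placeholder Gateway name
      namespace: gateway-system  # placeholder namespace
  hostnames:
    - grafana.example.com        # replace with your configured domain
  rules:
    - backendRefs:
        - name: grafana-service  # placeholder Service name
          port: 3000
```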
Default credentials are stored in a SealedSecret:
kubectl get secret credentials -n observability -o jsonpath='{.data.GF_SECURITY_ADMIN_USER}' | base64 -d

ServiceMonitor and PodMonitor

Prometheus automatically discovers metrics endpoints using:
  • ServiceMonitors: For services exposing metrics
  • PodMonitors: For pods with direct metrics endpoints
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: metallb-monitor
  namespace: observability
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: metallb
  namespaceSelector:
    any: true
  podMetricsEndpoints:
  - port: "monitoring"
    interval: 30s
    path: /metrics
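
A ServiceMonitor has the same overall shape but selects Services and declares endpoints instead of podMetricsEndpoints. The names below are illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app              # illustrative
  namespace: observability
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
  namespaceSelector:
    any: true
  endpoints:
    - port: metrics         # must match a named port on the Service
      interval: 30s
      path: /metrics
```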

Configuration Files

All observability components are defined in:
  • Base: overlays/base/grafana/ and overlays/base/prometheus/
  • Environment: overlays/kimawesome/infrastructure/observability/
  • Kustomization: overlays/kimawesome/infrastructure/observability/kustomization.yaml

Next Steps

Configure Grafana

Set up dashboards and datasources

Add Service Monitors

Expose custom application metrics

Query Logs

Search and analyze application logs

Customize Alloy

Configure telemetry collection
