Prometheus is deployed as part of the kube-prometheus-stack, providing comprehensive metrics collection, storage, and alerting for the Kimbernetes cluster. The stack includes Prometheus server, Alertmanager, kube-state-metrics, and node-exporter.
## HelmRelease Configuration

`overlays/base/prometheus/helmrelease.yaml`:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: prometheus
spec:
  chart:
    spec:
      chart: kube-prometheus-stack
      sourceRef:
        kind: HelmRepository
        name: prometheus
      version: "=79.5.0"
  interval: 24h
  releaseName: prometheus
  targetNamespace: observability
  install:
    crds: Create
  upgrade:
    crds: CreateReplace
```
The HelmRepository source, `overlays/base/prometheus/helmrepository.yaml`:

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: prometheus
spec:
  interval: 24h
  url: https://prometheus-community.github.io/helm-charts
```
## Prometheus Server Configuration

The Prometheus server is configured for high availability and persistence:

```yaml
prometheus:
  prometheusSpec:
    tolerations:
      - effect: NoSchedule
        operator: Exists
    externalLabels:
      cluster: ${CLUSTER}
    enableRemoteWriteReceiver: false
    podAntiAffinity: hard
    replicas: 2
    retention: 2d
    retentionSize: 25GiB
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
          storageClassName: local-storage
```

The `serviceMonitorSelectorNilUsesHelmValues: false` and `podMonitorSelectorNilUsesHelmValues: false` settings let Prometheus discover all ServiceMonitors and PodMonitors in the cluster, regardless of their labels. With the chart's default of `true`, only monitors carrying the Helm release label are selected.
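If you instead want to restrict discovery to explicitly labeled monitors, a label-based selector can be set. A minimal sketch, assuming a hypothetical `release: prometheus` label convention:

```yaml
prometheus:
  prometheusSpec:
    # Only select ServiceMonitors carrying this label
    # (the label value is an assumption; match it to your own convention)
    serviceMonitorSelector:
      matchLabels:
        release: prometheus
```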
## Key Features

- **High Availability**: 2 replicas with hard pod anti-affinity
- **Retention**: 2 days or 25GiB per replica, whichever limit is reached first
- **Storage**: 50Gi persistent volume per replica
- **Cluster Label**: `cluster` external label for multi-cluster setups
- **Tolerations**: Runs on all nodes, including the control plane
## ServiceMonitor Usage

ServiceMonitors define how Prometheus scrapes metrics from Kubernetes services:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-monitor
  namespace: default
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
```
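The `selector` matches labels on the Service object (not the pods), and `port` refers to a named port on that Service. A matching Service might look like the following sketch; the names and port number are illustrative assumptions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app          # matched by the ServiceMonitor selector
spec:
  selector:
    app: my-app
  ports:
    - name: metrics      # referenced by the ServiceMonitor endpoint port
      port: 8080         # hypothetical port
      targetPort: 8080
```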
### Basic ServiceMonitor

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: default
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: web
      interval: 30s
```

### With Metric Relabeling

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: default
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: metrics
      interval: 30s
      metricRelabelings:
        # Drop Go runtime metrics to reduce cardinality
        - sourceLabels: [__name__]
          regex: 'go_.*'
          action: drop
        # Copy the instance label into a pod label
        - sourceLabels: [instance]
          targetLabel: pod
```

### Cross-Namespace

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cross-namespace
  namespace: observability
spec:
  selector:
    matchLabels:
      monitoring: "true"
  namespaceSelector:
    any: true
  endpoints:
    - port: metrics
      interval: 1m
```
## PodMonitor Usage

PodMonitors scrape metrics directly from pods, which is useful for DaemonSets or pods without Services:

`overlays/kimawesome/infrastructure/observability/monitors-infrastructure/metallb-monitor.yaml`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: metallb-monitor
  namespace: observability
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: metallb
  namespaceSelector:
    any: true
  podMetricsEndpoints:
    - port: "monitoring"
      interval: 30s
      path: /metrics
```
### Example: Kubernetes Gateway Monitor

`overlays/kimawesome/infrastructure/observability/monitors-infrastructure/kgateway-monitor.yaml`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kgateway-monitor
  namespace: observability
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: gateway
  namespaceSelector:
    matchNames:
      - gateway-system
  podMetricsEndpoints:
    - port: metrics
      interval: 30s
```
## Flux Metrics Integration

The stack includes custom resource state metrics for Flux resources. The Kustomization configuration is shown below; HelmRelease metrics follow the same pattern with `group: helm.toolkit.fluxcd.io` and `kind: HelmRelease`.

### Kustomization Metrics

```yaml
- groupVersionKind:
    group: kustomize.toolkit.fluxcd.io
    version: v1
    kind: Kustomization
  metricNamePrefix: gotk
  metrics:
    - name: "resource_info"
      help: "The current state of a Flux Kustomization resource."
      each:
        type: Info
        info:
          labelsFromPath:
            name: [metadata, name]
      labelsFromPath:
        exported_namespace: [metadata, namespace]
        ready: [status, conditions, "[type=Ready]", status]
        suspended: [spec, suspend]
        revision: [status, lastAppliedRevision]
        source_name: [spec, sourceRef, name]
```
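This snippet is consumed by the custom-resource-state feature of kube-state-metrics. In kube-prometheus-stack values it would sit roughly as follows; the `rbac.extraRules` and `customResourceState` keys reflect the kube-state-metrics sub-chart, and the exact nesting should be verified against your chart version:

```yaml
kube-state-metrics:
  rbac:
    extraRules:
      # Allow kube-state-metrics to watch Flux Kustomizations
      - apiGroups: ["kustomize.toolkit.fluxcd.io"]
        resources: ["kustomizations"]
        verbs: ["list", "watch"]
  customResourceState:
    enabled: true
    config:
      spec:
        resources:
          - groupVersionKind:
              group: kustomize.toolkit.fluxcd.io
              version: v1
              kind: Kustomization
            # metricNamePrefix and metrics as in the Kustomization
            # Metrics configuration above
```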
## Example PromQL Queries

### Resource Metrics

```promql
# CPU usage by pod
rate(container_cpu_usage_seconds_total{
  namespace="default",
  pod!=""
}[5m])

# Memory usage by namespace
sum by (namespace) (
  container_memory_working_set_bytes{}
)

# Disk I/O rate
rate(container_fs_reads_bytes_total[5m])
```

### Flux Status

```promql
# HelmRelease ready status
gotk_resource_info{
  customresource_kind="HelmRelease",
  ready="True"
}

# Failed Kustomizations
gotk_resource_info{
  customresource_kind="Kustomization",
  ready="False"
}

# GitRepository sync lag
time() - gotk_reconcile_condition{
  type="Ready",
  status="True",
  kind="GitRepository"
}
```

### Cluster Health

```promql
# Node memory pressure (note: Kubernetes has no CPUPressure condition)
kube_node_status_condition{
  condition="MemoryPressure",
  status="true"
}

# Pods in CrashLoopBackOff
kube_pod_container_status_waiting_reason{
  reason="CrashLoopBackOff"
}

# Persistent volume usage (percent)
kubelet_volume_stats_used_bytes /
  kubelet_volume_stats_capacity_bytes * 100
```
## Alertmanager Configuration

Alertmanager is configured with persistent storage:

`overlays/base/prometheus/helmrelease.yaml`:

```yaml
alertmanager:
  podDisruptionBudget:
    enabled: false
    maxUnavailable: 1
    minAvailable: ""
  alertmanagerSpec:
    logFormat: json
    replicas: 1
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: managed-csi-zrs
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
```

Alertmanager runs with a single replica. Multi-replica setups require additional configuration (gossip clustering between the replicas) for proper alert deduplication.
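Alert routing and receivers are defined under `alertmanager.config` in the chart values. A minimal sketch, assuming a hypothetical Slack receiver; the channel and webhook URL are placeholders, not values from this repository:

```yaml
alertmanager:
  config:
    route:
      receiver: "default"
      group_by: ["alertname", "namespace"]
      routes:
        # Send critical alerts to the hypothetical Slack receiver
        - receiver: "slack-critical"
          matchers:
            - severity = "critical"
    receivers:
      - name: "default"
      - name: "slack-critical"
        slack_configs:
          - channel: "#alerts"  # placeholder channel
            api_url: "https://hooks.slack.com/services/..."  # placeholder URL
```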
## Creating Alerts

Alerting rules are deployed as PrometheusRule resources:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-alerts
  namespace: observability
spec:
  groups:
    - name: example
      interval: 30s
      rules:
        - alert: HighPodMemory
          expr: |
            sum by (namespace, pod) (
              container_memory_working_set_bytes{}
            ) > 1e9
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} memory usage high"
            description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is using {{ $value | humanize }}B of memory."
```
## Node Exporter

Node exporter is enabled to collect host-level metrics:

```yaml
nodeExporter:
  enabled: true
```

Node exporter provides:

- CPU, memory, and disk metrics
- Network interface statistics
- Filesystem usage
- Hardware sensor data
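Node-exporter metrics use the `node_` prefix. For example, per-node CPU utilization can be derived from the idle-mode counter; these are common query patterns, not queries taken from this repository:

```promql
# Percent CPU utilization per node (100% minus idle time)
100 * (1 - avg by (instance) (
  rate(node_cpu_seconds_total{mode="idle"}[5m])
))

# Percent filesystem space used, excluding pseudo-filesystems
100 * (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
         / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"})
```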
## Accessing Prometheus UI

1. **Port Forward**

   ```shell
   kubectl port-forward -n observability svc/prometheus-kube-prometheus-prometheus 9090:9090
   ```

2. **Open Browser**

   Navigate to http://localhost:9090

3. **Query Metrics**

   Use the query interface to explore metrics and test PromQL expressions.
## Troubleshooting

### ServiceMonitor not discovered

1. Check Prometheus service discovery:

   ```shell
   # View active ServiceMonitors
   kubectl port-forward -n observability svc/prometheus-kube-prometheus-prometheus 9090:9090
   # Navigate to Status → Service Discovery
   ```

2. Verify the ServiceMonitor exists:

   ```shell
   kubectl get servicemonitor -A
   kubectl describe servicemonitor my-monitor -n default
   ```

3. Check that the service endpoint is reachable:

   ```shell
   # Get service endpoints
   kubectl get endpoints my-app -n default

   # Test metrics endpoint
   kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- \
     curl http://my-app.default.svc:8080/metrics
   ```
### Storage issues

1. Check PVC status:

   ```shell
   kubectl get pvc -n observability | grep prometheus
   kubectl describe pvc prometheus-prometheus-kube-prometheus-prometheus-0
   ```

2. View disk usage:

   ```shell
   kubectl exec -n observability prometheus-prometheus-kube-prometheus-prometheus-0 -- \
     df -h /prometheus
   ```
## Next Steps

- **Visualize in Grafana**: create dashboards for Prometheus metrics
- **Configure Alloy**: send metrics to remote endpoints
- **Query Logs**: correlate metrics with logs
- **Overview**: return to the observability architecture