What to monitor
Cluster components
| Component | What to monitor |
|---|---|
| API Server (kube-apiserver) | Request rates, error rates, and latency |
| Scheduler (kube-scheduler) | Scheduling latency and errors |
| Controller Manager (kube-controller-manager) | Status of various controllers |
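As a rough illustration of the API server row, the sketch below computes a server-error ratio from metrics text in the Prometheus exposition format. It assumes the `apiserver_request_total` counter with a `code` label; the sample values and the helper names `requests_by_code` and `server_error_ratio` are made up for this example.

```python
import re
from collections import defaultdict

# Illustrative sample only; the numbers are invented for this sketch.
SAMPLE = '''\
# HELP apiserver_request_total Counter of apiserver requests.
# TYPE apiserver_request_total counter
apiserver_request_total{code="200",resource="pods",verb="GET"} 950
apiserver_request_total{code="201",resource="pods",verb="POST"} 30
apiserver_request_total{code="500",resource="pods",verb="GET"} 15
apiserver_request_total{code="503",resource="nodes",verb="LIST"} 5
'''

LINE_RE = re.compile(r'^apiserver_request_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def requests_by_code(text):
    """Sum apiserver_request_total samples per HTTP status code (hypothetical helper)."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips comments and unrelated metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get("code", "unknown")] += float(match.group("value"))
    return dict(totals)


def server_error_ratio(totals):
    """Fraction of requests that ended with a 5xx status code."""
    total = sum(totals.values())
    errors = sum(value for code, value in totals.items() if code.startswith("5"))
    return errors / total if total else 0.0


if __name__ == "__main__":
    print(server_error_ratio(requests_by_code(SAMPLE)))
```

Counters of this kind are cumulative, so a real error rate would compare two scrapes taken some interval apart rather than a single snapshot.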
Nodes
- The number of nodes in the cluster
- CPU and memory usage — ensure nodes are not over or underutilized
- Disk usage — monitor disk space and I/O operations
- Network — monitor bandwidth and errors
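The node checks above can be sketched as a small utilization classifier. This is a minimal example that handles only a subset of Kubernetes quantity suffixes; the 20% and 85% thresholds and the function names are assumptions chosen for illustration, not recommended values.

```python
# Subset of Kubernetes quantity suffixes handled by this sketch.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '250m' or '2' into cores."""
    if quantity.endswith("n"):  # nanocores
        return int(quantity[:-1]) / 1e9
    if quantity.endswith("m"):  # millicores
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as '512Mi' or '1G' into bytes."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


def classify(usage, capacity, low=0.20, high=0.85):
    """Label a node resource as under- or overutilized (thresholds are hypothetical)."""
    ratio = usage / capacity
    if ratio < low:
        return "underutilized"
    if ratio > high:
        return "overutilized"
    return "ok"


if __name__ == "__main__":
    print(classify(parse_cpu("3600m"), parse_cpu("4")))
    print(classify(parse_memory("512Mi"), parse_memory("8Gi")))
```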
Pods
- The number of pods in the cluster
- Resource usage — CPU, memory, and disk usage per pod and container
- Pod status — Running, Pending, Failed, and other states
- Container logs — errors and warnings
- Health checks — results of liveness and readiness probes
- Application metrics — custom metrics such as request rates and error rates
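To make the pod status and health check items concrete, the sketch below summarizes a pod list shaped like the JSON returned by `kubectl get pods -o json`. The sample pods and the helper names are invented for this example; only the `status.phase` and `status.containerStatuses` fields (`name`, `ready`, `restartCount`) are taken from the Pod API.

```python
from collections import Counter

# Trimmed, made-up pod list; a real one carries many more fields.
POD_LIST = {
    "items": [
        {"metadata": {"name": "web-1"},
         "status": {"phase": "Running",
                    "containerStatuses": [{"name": "app", "ready": True, "restartCount": 0}]}},
        {"metadata": {"name": "web-2"},
         "status": {"phase": "Running",
                    "containerStatuses": [{"name": "app", "ready": False, "restartCount": 4}]}},
        {"metadata": {"name": "web-3"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "job-1"},
         "status": {"phase": "Failed",
                    "containerStatuses": [{"name": "task", "ready": False, "restartCount": 0}]}},
    ]
}


def phase_counts(pod_list):
    """Count pods per phase (Running, Pending, Failed, ...)."""
    return Counter(
        item.get("status", {}).get("phase", "Unknown")
        for item in pod_list.get("items", [])
    )


def unready_containers(pod_list):
    """List (pod, container, restarts) for containers not ready in Running pods."""
    found = []
    for item in pod_list.get("items", []):
        status = item.get("status", {})
        if status.get("phase") != "Running":
            continue
        for container in status.get("containerStatuses", []):
            if not container.get("ready", False):
                found.append((item["metadata"]["name"],
                              container["name"],
                              container.get("restartCount", 0)))
    return found


if __name__ == "__main__":
    print(dict(phase_counts(POD_LIST)))
    print(unready_containers(POD_LIST))
```

A container that is not ready in a Running pod usually points at a failing readiness probe, and a rising restart count often accompanies liveness probe failures.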
Metrics Server
Metrics Server collects resource metrics from the kubelets and exposes them through the Kubernetes API server (kube-apiserver) for autoscaling purposes.
- Horizontal Pod Autoscaler (HPA) uses this data to automatically adjust the number of pod replicas in a deployment.
You can use other metrics solutions such as Prometheus, Datadog, or Dynatrace in place of Metrics Server.
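The replica adjustment made by HPA can be approximated with the scaling formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below is a simplification: the function name and the default bounds are assumptions, and it ignores details such as stabilization windows, missing metrics, and pods that are not yet ready.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Simplified HPA-style calculation; the bounds here are illustrative defaults."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Skip scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

Rounding up with `ceil` biases the result toward having slightly more capacity than strictly needed.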
How metrics are collected
Each kubelet contains a component called cAdvisor (Container Advisor) that collects resource usage data from running pods and exposes it through the kubelet API. Metrics Server reads from this API to gather cluster-wide resource data.
Commands
The kubectl top commands require Metrics Server to be correctly configured and running in the cluster.