## Control flow

### Disruption controller

The Disruption Controller automatically discovers disruptable nodes and provisions replacements. It executes one automated method at a time — first Drift, then Consolidation — and uses disruption budgets to control the rate of disruption. The disruption flow:

1. Identify prioritized candidates for the disruption method.
   - Skip nodes with pods that block eviction.
   - If no disruptable nodes exist, move to the next disruption method.
2. For each disruptable node:
   - Check whether disrupting it would violate the NodePool’s disruption budget.
   - Run a scheduling simulation to determine whether replacement nodes are needed.
3. Add the `karpenter.sh/disrupted:NoSchedule` taint to prevent new pods from scheduling.
4. Pre-spin any needed replacement nodes and wait for them to become ready.
   - If a replacement node fails to initialize, un-taint the disrupted nodes and restart from step 1.
5. Delete the disrupted nodes and wait for the Termination Controller to complete graceful shutdown.
6. After termination, restart from step 1.
### Termination controller

When a Karpenter node is deleted, the finalizer blocks deletion while the Termination Controller gracefully shuts down the node:

1. Add the `karpenter.sh/disrupted:NoSchedule` taint to prevent new pods from scheduling.
2. Begin evicting pods using the Kubernetes Eviction API, respecting PDBs. Static pods, pods tolerating the disruption taint, and completed pods are ignored.
3. Verify that all VolumeAttachment resources for drainable pods are deleted.
4. Terminate the NodeClaim in the cloud provider.
5. Remove the finalizer to allow the API server to delete the node.
## Manual methods

You can manually disrupt nodes using `kubectl`:
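For example, deleting a node or its NodeClaim triggers graceful termination via the finalizer (the node and NodeClaim names are placeholders):

```shell
kubectl delete node "${NODE_NAME}"

# Or delete the backing NodeClaim directly
kubectl delete nodeclaim "${NODECLAIM_NAME}"
```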
Deleting a node object without a Karpenter finalizer does not terminate the EC2 instance — the instance keeps running. Karpenter’s finalizer ensures proper cleanup.
## Automated graceful methods

These methods can be rate-limited using NodePool disruption budgets.

### Consolidation

Consolidation reduces cluster cost by removing or replacing nodes when possible. Karpenter performs consolidation in this order:

1. Empty node consolidation — delete empty nodes in parallel
2. Multi-node consolidation — delete two or more nodes in parallel, possibly launching a single cheaper replacement
3. Single-node consolidation — delete a single node, possibly launching a cheaper replacement
Consolidation is configured per NodePool via `consolidationPolicy` and `consolidateAfter`.

`consolidationPolicy` values:

- `WhenEmptyOrUnderutilized` — consider all nodes; consolidate when empty or underutilized
- `WhenEmpty` — only consider nodes with no workload pods

When prioritizing candidates, Karpenter prefers to disrupt nodes that:

- Run fewer pods
- Are expiring soon
- Have lower-priority pods
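Both fields live under `spec.disruption` in the NodePool; a minimal sketch (the NodePool name and values are illustrative):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    # Consider both empty and underutilized nodes for consolidation
    consolidationPolicy: WhenEmptyOrUnderutilized
    # Wait 1m after a node becomes consolidatable before acting
    consolidateAfter: 1m
```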
For spot nodes, Karpenter only enables deletion consolidation by default. To enable spot-to-spot replacement consolidation, enable the `SpotToSpotConsolidation` feature flag.

### Drift
Drift detects when NodeClaims no longer match their owning NodePool or EC2NodeClass spec, and replaces those nodes. Karpenter annotates NodePools and EC2NodeClasses with a hash of the spec to detect drift. A NodeClaim is drifted when its values no longer match the owning resource. Fields that trigger drift:

| Resource | Drifted fields |
|---|---|
| NodePool | `spec.template.spec.requirements` |
| EC2NodeClass | `spec.subnetSelectorTerms`, `spec.securityGroupSelectorTerms`, `spec.amiSelectorTerms` |
Behavioral fields (e.g., `spec.weight`, `spec.limits`, `spec.disruption.*`) do not trigger drift.
Drift can also occur without CRD changes — for example, when a new AMI is published and amiSelectorTerms discovers a newer AMI ID.
## Automated forceful methods

These methods begin draining nodes immediately when triggered. They cannot be rate-limited by disruption budgets and do not wait for pre-spun replacements to be healthy.

### Expiration

Nodes are forcefully drained once their lifetime exceeds `spec.expireAfter` on the owning NodeClaim. Set `expireAfter: Never` to disable expiration.
`expireAfter` is an upper bound on node lifetime, not a guaranteed minimum. Nodes may be disrupted earlier by drift or consolidation if their disruption budgets allow.

### Interruption

If interruption handling is enabled, Karpenter watches an SQS queue for events that signal upcoming EC2 disruptions:

- Spot interruption warnings
- Scheduled maintenance events
- Instance terminating events
- Instance stopping events
To enable interruption handling, set the `--interruption-queue` CLI argument to the name of the SQS queue.
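For example, when installing with the Helm chart this is exposed as the `settings.interruptionQueue` value (the cluster and queue names are placeholders):

```shell
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace kube-system \
  --set settings.interruptionQueue="Karpenter-${CLUSTER_NAME}"
```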
## Node auto repair

Node auto repair is an Alpha feature (Karpenter v1.1.0+). Enable it with the `NodeRepair=true` feature flag. Karpenter forcibly terminates a node when one of the following conditions has been unhealthy for longer than its toleration duration:

| Condition | Status | Toleration duration |
|---|---|---|
| Ready | False or Unknown | 30 minutes |
| AcceleratedHardwareReady | False | 10 minutes |
| StorageReady | False | 30 minutes |
| NetworkingReady | False | 30 minutes |
| KernelReady | False | 30 minutes |
| ContainerRuntimeReady | False | 30 minutes |
## NodePool disruption budgets

Budgets control how many nodes Karpenter can voluntarily disrupt at once. They apply to drift, emptiness, and consolidation, but not to expiration. If undefined, Karpenter defaults to a single budget of `nodes: 10%`.
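Budgets are declared under `spec.disruption.budgets`; a sketch showing two combined budgets (the NodePool name and values are illustrative):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    budgets:
      # Never disrupt more than 20% of nodes at once, for any reason
      - nodes: "20%"
      # Additionally, block drift disruptions during weekday business hours (UTC)
      - nodes: "0"
        reasons: ["Drifted"]
        schedule: "0 9 * * mon-fri"
        duration: 8h
```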
### Budget fields

`nodes` — Maximum number of nodes to disrupt. Accepts an integer (`"5"`) or percentage (`"20%"`). Karpenter calculates allowed disruptions as `roundup(total * percentage) - total_deleting - total_notready`. For multiple budgets, Karpenter uses the most restrictive value.

`reasons` — Disruption reasons this budget applies to: `Empty`, `Drifted`, or `Underutilized`. When omitted, the budget applies to all reasons.

`schedule` — Cron expression defining when the budget becomes active. Must be set together with `duration`. Always in UTC.

`duration` — How long the budget is active after the schedule fires. Accepts minutes and hours (e.g., `10h5m`, `30m`). Must be set together with `schedule`.

### Disable disruption for a NodePool

To disable voluntary disruption for a NodePool, specify a single budget of `nodes: "0"`.
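A minimal sketch, assuming a NodePool named `default`:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    budgets:
      # Zero nodes may be voluntarily disrupted
      - nodes: "0"
```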
## Pod-level controls

### PodDisruptionBudgets

Pods with blocking PDBs are not evicted and not considered for voluntary disruption. If any PDB on a node is blocking, the entire node cannot be voluntarily disrupted. Complex PDB scenarios:

- A pod matching multiple PDBs requires ALL of those PDBs to allow disruption
- All PDBs across all pods on the same node must simultaneously permit eviction
- A single blocking PDB prevents the entire node from being disrupted
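For example, a PDB that blocks eviction whenever fewer than two matching pods would remain available (the name and labels are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  # At least 2 pods matching the selector must stay available
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```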
### do-not-disrupt annotation

Add `karpenter.sh/do-not-disrupt: "true"` to a pod to block Karpenter from voluntarily disrupting its node:
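A sketch of the annotation on a pod template (the Deployment name, labels, and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateful-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: stateful-worker
  template:
    metadata:
      labels:
        app: stateful-worker
      annotations:
        # Block voluntary disruption of the node running this pod
        karpenter.sh/do-not-disrupt: "true"
    spec:
      containers:
        - name: worker
          image: public.ecr.aws/docker/library/busybox:latest
```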
The `karpenter.sh/do-not-disrupt` annotation does not block forceful methods: expiration, interruption, node repair, and manual deletion.

## Node-level controls

Add `karpenter.sh/do-not-disrupt: "true"` as an annotation on a node to block Karpenter from voluntarily disrupting it:
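For example (the node name is a placeholder):

```shell
kubectl annotate node "${NODE_NAME}" karpenter.sh/do-not-disrupt=true
```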
## TerminationGracePeriod

Sets a maximum draining duration. Once this period elapses, remaining pods are forcibly deleted and the node is terminated, even if they have PDBs or the `do-not-disrupt` annotation.

Configure it in the NodePool:
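A sketch of both fields on a NodePool (the name and durations are illustrative):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      # Node may live at most 30 days before forceful expiration begins
      expireAfter: 720h
      # Draining may take at most 48 hours before pods are force-deleted
      terminationGracePeriod: 48h
```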
Combined with `expireAfter`, this enforces a hard maximum node lifetime:

- Node begins draining at `expireAfter`.
- Node is forcibly terminated `terminationGracePeriod` after draining starts.
- Maximum lifetime = `expireAfter` + `terminationGracePeriod`.