Karpenter disrupts nodes to reduce costs, keep nodes up to date, and respond to infrastructure events. Disruption is controlled through the Disruption Controller and the Termination Controller.

Control flow

Disruption controller

The Disruption Controller automatically discovers disruptable nodes and provisions replacements. It executes one automated method at a time — first Drift, then Consolidation — and uses disruption budgets to control the rate of disruption. The disruption flow:
  1. Identify prioritized candidates for the disruption method.
  2. For each disruptable node:
    • Check whether disrupting it would violate the NodePool’s disruption budget.
    • Run a scheduling simulation to determine if replacement nodes are needed.
  3. Add the karpenter.sh/disrupted:NoSchedule taint to prevent new pods from scheduling.
  4. Pre-spin any needed replacement nodes and wait for them to become ready.
    • If a replacement node fails to initialize, un-taint the disrupted nodes and restart from step 1.
  5. Delete the disrupted nodes and wait for the Termination Controller to complete graceful shutdown.
  6. After termination, restart from step 1.
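While the controller is mid-disruption, the taint from step 3 is visible on candidate nodes. One way to list them (a sketch that assumes `jq` is installed):

```shell
# List nodes currently carrying the disruption taint added in step 3
kubectl get nodes -o json | jq -r \
  '.items[] | select(any(.spec.taints[]?; .key == "karpenter.sh/disrupted")) | .metadata.name'
```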

Termination controller

When a Karpenter node is deleted, the finalizer blocks deletion while the Termination Controller gracefully shuts down the node:
  1. Add the karpenter.sh/disrupted:NoSchedule taint to prevent new pods from scheduling.
  2. Begin evicting pods using the Kubernetes Eviction API, respecting PDBs. Static pods, pods tolerating the disruption taint, and completed pods are ignored.
  3. Verify all VolumeAttachment resources for drainable pods are deleted.
  4. Terminate the NodeClaim in the cloud provider.
  5. Remove the finalizer to allow the API server to delete the node.
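You can observe the finalizer that blocks deletion while the Termination Controller works. The finalizer key shown in the comment is an assumption about Karpenter's current naming, so verify it against your version:

```shell
# A Karpenter-managed node keeps its finalizer until graceful shutdown finishes;
# the list typically includes "karpenter.sh/termination"
kubectl get node $NODE_NAME -o jsonpath='{.metadata.finalizers}'
```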

Manual methods

You can manually disrupt nodes using kubectl:
# Delete a specific NodeClaim
kubectl delete nodeclaim $NODECLAIM_NAME

# Delete a specific node
kubectl delete node $NODE_NAME

# Delete all NodeClaims
kubectl delete nodeclaims --all

# Delete all nodes managed by any NodePool
kubectl delete nodes -l karpenter.sh/nodepool

# Delete all NodeClaims owned by a specific NodePool
kubectl delete nodeclaims -l karpenter.sh/nodepool=$NODEPOOL_NAME
Deleting a NodePool cascades deletion to all NodeClaims it owns through Kubernetes owner references.
Deleting a node object without a Karpenter finalizer does not terminate the EC2 instance — the instance keeps running. Karpenter’s finalizer ensures proper cleanup.
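Because owner references handle the cascade, removing an entire NodePool and everything it owns is a one-liner ($NODEPOOL_NAME is a placeholder):

```shell
# Deletes the NodePool; its NodeClaims (and their nodes) are cleaned up
# via Kubernetes owner references and Karpenter's finalizer
kubectl delete nodepool $NODEPOOL_NAME
```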

Automated graceful methods

These methods can be rate-limited using NodePool disruption budgets.

Consolidation

Consolidation reduces cluster cost by removing or replacing nodes when possible. Karpenter performs consolidation in this order:
  1. Empty node consolidation — delete empty nodes in parallel
  2. Multi-node consolidation — delete two or more nodes in parallel, possibly launching a single cheaper replacement
  3. Single-node consolidation — delete a single node, possibly launching a cheaper replacement
Consolidation is configured by consolidationPolicy and consolidateAfter:
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized  # or WhenEmpty
  consolidateAfter: 1m
consolidationPolicy values:
  • WhenEmptyOrUnderutilized — consider all nodes; consolidate when empty or underutilized
  • WhenEmpty — only consider nodes with no workload pods
When multiple candidates exist, Karpenter prefers to terminate nodes that:
  • Run fewer pods
  • Are expiring soon
  • Have lower-priority pods
Karpenter emits events on nodes that can’t be consolidated:
Normal   Unconsolidatable   66s   karpenter   pdb default/inflate-pdb prevents pod evictions
Normal   Unconsolidatable   33s   karpenter   can't replace with a lower-priced node
Preferred anti-affinity and topology spread constraints can reduce consolidation effectiveness. Karpenter won’t disrupt nodes in ways that violate these preferences, even if the scheduler could fit the pods elsewhere.
For spot nodes, Karpenter only enables deletion consolidation by default. To enable spot-to-spot replacement consolidation, enable the SpotToSpotConsolidation feature flag.
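If you install Karpenter with Helm, the feature flag can be toggled through the chart's feature-gate settings. The values path below is an assumption; check it against your chart version:

```shell
# Enable spot-to-spot replacement consolidation (Helm values path assumed)
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace kube-system --reuse-values \
  --set settings.featureGates.spotToSpotConsolidation=true
```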

Drift

Drift detects when NodeClaims no longer match their owning NodePool or EC2NodeClass spec, and replaces those nodes. Karpenter annotates NodePools and EC2NodeClasses with a hash of the spec to detect drift. A NodeClaim is drifted when its values no longer match the owning resource. Fields that trigger drift:
  Resource       Drifted fields
  NodePool       spec.template.spec.requirements
  EC2NodeClass   spec.subnetSelectorTerms, spec.securityGroupSelectorTerms, spec.amiSelectorTerms
Behavioral fields (like spec.weight, spec.limits, spec.disruption.*) do not trigger drift. Drift can also occur without CRD changes — for example, when a new AMI is published and amiSelectorTerms discovers a newer AMI ID.
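Drift is surfaced as a status condition on each NodeClaim. A quick way to check it, assuming the condition type is named Drifted (verify for your Karpenter version):

```shell
# Print each NodeClaim and the status of its Drifted condition
kubectl get nodeclaims -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Drifted")].status}{"\n"}{end}'
```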

Automated forceful methods

These methods begin draining nodes immediately when triggered. They cannot be rate-limited by disruption budgets and do not wait for pre-spun replacements to be healthy.

Expiration

Nodes are forcefully drained once their lifetime exceeds spec.expireAfter on the owning NodeClaim.
spec:
  template:
    spec:
      expireAfter: 720h  # 30 days (default)
Use expireAfter: Never to disable expiration.
expireAfter is an upper bound on node lifetime, not a guaranteed minimum. Nodes may be disrupted earlier by drift or consolidation if their disruption budgets allow.
Misconfigured PDBs or karpenter.sh/do-not-disrupt pods may block draining indefinitely. If you use expireAfter, also set terminationGracePeriod to ensure nodes eventually terminate.

Interruption

If interruption handling is enabled, Karpenter watches an SQS queue for events that signal upcoming EC2 disruptions:
  • Spot interruption warnings
  • Scheduled maintenance events
  • Instance terminating events
  • Instance stopping events
For spot interruptions, Karpenter starts a replacement node immediately upon receiving the 2-minute warning, then drains the interrupted node in parallel. Enable interruption handling by setting the --interruption-queue CLI argument to the name of the SQS queue.
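With a Helm install, the same setting is usually exposed as a chart value rather than a raw CLI flag. The values path below is an assumption; $QUEUE_NAME is the SQS queue that receives the EC2 interruption events:

```shell
# Helm equivalent of the --interruption-queue CLI argument (values path assumed)
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace kube-system --reuse-values \
  --set settings.interruptionQueue=$QUEUE_NAME
```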

Node auto repair

Node auto repair is an Alpha feature (Karpenter v1.1.0+). Enable with the NodeRepair=true feature flag.
Karpenter automatically replaces nodes with unhealthy status conditions. When a node has been unhealthy beyond its toleration duration, Karpenter forcefully terminates the node and its NodeClaim, bypassing standard drain and grace period procedures. To prevent cascading failures, Karpenter will not repair if more than 20% of nodes in a NodePool are unhealthy. Monitored conditions:
  Condition                 Status             Toleration duration
  Ready                     False or Unknown   30 minutes
  AcceleratedHardwareReady  False              10 minutes
  StorageReady              False              30 minutes
  NetworkingReady           False              30 minutes
  KernelReady               False              30 minutes
  ContainerRuntimeReady     False              30 minutes
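The NodeRepair=true feature flag can also be set through Helm; the values path below is an assumption, so confirm it for your chart version:

```shell
# Enable the alpha NodeRepair feature gate (Helm values path assumed)
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace kube-system --reuse-values \
  --set settings.featureGates.nodeRepair=true
```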

NodePool disruption budgets

Budgets cap how many nodes Karpenter can voluntarily disrupt at once. They apply to drift, emptiness, and consolidation, but not to expiration or other forceful methods. If no budget is defined, Karpenter defaults to a single budget of nodes: 10%.
disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  budgets:
  # Allow 20% of nodes to be disrupted when drifted or empty
  - nodes: "20%"
    reasons:
    - "Empty"
    - "Drifted"
  # Absolute ceiling of 5 concurrent disruptions
  - nodes: "5"
  # Block disruptions for underutilized nodes at midnight daily for 10 minutes
  - nodes: "0"
    schedule: "@daily"
    duration: 10m
    reasons:
    - "Underutilized"

Budget fields

nodes
string
required
Maximum number of nodes to disrupt. Accepts an integer ("5") or a percentage ("20%"). Karpenter calculates allowed disruptions as roundup(total * percentage) - total_deleting - total_notready. When multiple budgets are active, Karpenter uses the most restrictive value.
reasons
array
Disruption reasons this budget applies to: Empty, Drifted, or Underutilized. When omitted, the budget applies to all reasons.
schedule
string
Cron expression defining when the budget becomes active. Must be set together with duration. Always in UTC.
# ┌── minute (0-59)
# │ ┌── hour (0-23)
# │ │ ┌── day of month (1-31)
# │ │ │ ┌── month (1-12)
# │ │ │ │ ┌── day of week (0-6)
# * * * * *
duration
string
How long the budget is active after the schedule fires. Accepts minutes and hours (e.g., 10h5m, 30m). Must be set together with schedule.
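The allowed-disruptions formula for the nodes field can be checked with quick arithmetic. A worked example with hypothetical numbers: a 20% budget on a 12-node NodePool where 1 node is already deleting and 1 is not ready:

```shell
total=12; pct=20; deleting=1; notready=1
# roundup(total * pct / 100) via integer ceiling division
allowed=$(( (total * pct + 99) / 100 - deleting - notready ))
echo "$allowed"  # roundup(2.4) = 3; 3 - 1 - 1 = 1
```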

Disable disruption for a NodePool

disruption:
  budgets:
    - nodes: "0"

Pod-level controls

PodDisruptionBudgets

Pods with blocking PDBs are not evicted and not considered for voluntary disruption. If any PDB on a node is blocking, the entire node cannot be voluntarily disrupted. Complex PDB scenarios:
  • A pod matching multiple PDBs requires ALL of those PDBs to allow disruption
  • All PDBs across all pods on the same node must simultaneously permit eviction
  • A single blocking PDB prevents the entire node from being disrupted

do-not-disrupt annotation

Add karpenter.sh/do-not-disrupt: "true" to a pod to block Karpenter from voluntarily disrupting its node:
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        karpenter.sh/do-not-disrupt: "true"
This is useful for long-running batch jobs or stateful workloads that cannot tolerate interruption.
The karpenter.sh/do-not-disrupt annotation does not block forceful methods: expiration, interruption, node repair, and manual deletion.
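For a quick in-place experiment, the annotation can also be added to a running pod directly ($POD_NAME is a placeholder; annotations set this way are lost when the pod is recreated, so the Deployment template above is the durable option):

```shell
# Mark one running pod as do-not-disrupt without editing its Deployment
kubectl annotate pod $POD_NAME karpenter.sh/do-not-disrupt=true
```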

Node-level controls

Add karpenter.sh/do-not-disrupt: "true" to a node to block Karpenter from voluntarily disrupting it:
apiVersion: v1
kind: Node
metadata:
  annotations:
    karpenter.sh/do-not-disrupt: "true"

TerminationGracePeriod

Sets a maximum draining duration. Once this period elapses, remaining pods are forcibly deleted and the node is terminated — even if they have PDBs or the do-not-disrupt annotation. Configure in the NodePool:
spec:
  template:
    spec:
      terminationGracePeriod: 48h
Used together with expireAfter, this enforces a hard maximum node lifetime:
  • Node begins draining at expireAfter
  • Node is forcibly terminated terminationGracePeriod after draining starts
  • Maximum lifetime = expireAfter + terminationGracePeriod
Pods are preemptively deleted before terminationGracePeriod elapses to allow their own terminationGracePeriodSeconds to be respected. If a pod’s terminationGracePeriodSeconds exceeds the node’s terminationGracePeriod, the pod is deleted immediately when the node begins to drain.
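Using the values shown earlier, the hard maximum lifetime works out as follows:

```shell
# Hard maximum node lifetime when expireAfter (720h) and
# terminationGracePeriod (48h) are both set
expire_h=720; grace_h=48
echo "max lifetime: $(( expire_h + grace_h ))h"  # max lifetime: 768h
```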
