- Watching for pods that the Kubernetes scheduler has marked as unschedulable
- Evaluating scheduling constraints (resource requests, node selectors, affinities, tolerations, and topology spread constraints) requested by the pods
- Provisioning nodes that meet the requirements of the pods
- Disrupting nodes when they are no longer needed
Why Karpenter
Before Karpenter, Kubernetes users relied on the Cluster Autoscaler (CAS) and manually provisioned node groups to dynamically scale cluster capacity. This approach has fundamental limitations:
- Node groups are coarse-grained. You pre-define a fixed set of instance types per node group. If your workloads need a variety of instance types (e.g., GPU for ML jobs, high-memory for databases, Spot for batch), you must create and manage many node groups.
- Scaling is slow. CAS works by scaling existing Auto Scaling Groups (ASGs). It must wait for the ASG to launch an instance, for the instance to bootstrap, and for the node to become ready before pods can be scheduled.
- Over-provisioning is common. Teams often keep buffer capacity to avoid latency, which increases cost.
- Bin-packing is limited. Because CAS selects node groups rather than individual instance types, it cannot always choose the most cost-efficient instance for a given workload.
| | Cluster Autoscaler | Karpenter |
|---|---|---|
| Provisioning model | Scales pre-defined node groups (ASGs) | Launches individual EC2 instances directly |
| Instance selection | Fixed per node group | Dynamically chosen per workload need |
| Speed | Minutes (ASG warm-up + bootstrap) | Seconds (direct EC2 RunInstances API) |
| Bin packing | Node-group-constrained | Workload-aware, fleet-wide |
| Configuration | One node group per instance type set | One NodePool covers many instance types |
| Spot handling | Requires separate Spot node groups | Native capacity-type awareness |
Groupless provisioning
Karpenter operates without node groups. Rather than scaling a pre-configured ASG, Karpenter evaluates pending pod requirements and calls the EC2 RunInstances API directly, selecting the best-fitting instance type from the entire EC2 fleet. This means a single NodePool can satisfy vastly different workload shapes.
Just-in-time provisioning
Karpenter provisions nodes in seconds. When the Kubernetes scheduler marks pods as unschedulable, Karpenter immediately evaluates the pod requirements and launches a matching instance. There is no ASG warm-up period and no pre-provisioned buffer required.
Automatic consolidation
Karpenter continuously evaluates running nodes. When it detects that nodes are underutilized or empty, it can consolidate workloads onto fewer, better-fitting nodes and terminate the rest — reducing cost without any manual intervention.
Core architecture
Karpenter introduces three Kubernetes custom resources:
NodePool
A NodePool is a cluster-scoped resource that defines the constraints for the nodes Karpenter is allowed to provision. You configure:
- Requirements — acceptable instance categories, architectures, OS, capacity types (on-demand or Spot), and availability zones
- Limits — maximum total CPU and memory Karpenter can provision under this pool
- Disruption — consolidation policies and node expiry settings
- Node template — labels, taints, and annotations to apply to provisioned nodes
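A minimal NodePool manifest covering the fields above might look like the following sketch; the pool name, instance requirements, limits, and consolidation settings are illustrative placeholders, not recommendations:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        # Allow both on-demand and Spot capacity
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    # Cap the total capacity this pool may provision
    cpu: "1000"
    memory: 1000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```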
EC2NodeClass
An EC2NodeClass is an AWS-specific resource that defines the cloud provider configuration used when launching EC2 instances. It covers:
- IAM role — the instance profile role assigned to nodes
- AMI selection — via alias (e.g., al2023@latest) or tag-based selectors
- Subnet selection — by tags or IDs
- Security group selection — by tags or IDs
- User data — custom bootstrap scripts
- Block device mappings — EBS volume configuration
Each NodePool references an EC2NodeClass through its nodeClassRef field, which tells Karpenter which AWS-specific configuration to use when launching nodes for that pool.
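As a hedged sketch, an EC2NodeClass touching the fields above might look like this; the role name, cluster name, and discovery tag values are placeholders for your environment:

```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  role: KarpenterNodeRole-my-cluster   # placeholder node instance role
  amiSelectorTerms:
    - alias: al2023@latest             # resolve the latest AL2023 EKS AMI
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
```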
NodeClaim
A NodeClaim is an internal resource created by Karpenter when it decides to provision a node. It represents a request for a specific node with specific properties and tracks the lifecycle of the corresponding EC2 instance. NodeClaims are managed entirely by Karpenter — you do not create them directly.
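Although you never create NodeClaims yourself, you can observe the ones Karpenter manages; for example (the NodeClaim name below is illustrative):

```shell
# List the NodeClaims Karpenter has created, with instance type and zone
kubectl get nodeclaims -o wide

# Inspect the lifecycle status of a single NodeClaim
kubectl describe nodeclaim default-abc12
```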
AWS prerequisites
To run Karpenter on AWS you need:
An existing Amazon EKS cluster
Karpenter runs as a controller inside your EKS cluster. The cluster must be running Kubernetes 1.29 or later. See the compatibility matrix for the mapping between Karpenter versions and Kubernetes versions.
IAM permissions for the Karpenter controller
Karpenter requires an IAM role with permissions to call EC2 APIs (RunInstances, TerminateInstances, DescribeInstances, etc.), SSM for AMI resolution, SQS for interruption handling, and IAM for instance profile management. The recommended approach is to use EKS Pod Identity or IRSA to bind this role to the Karpenter service account.
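As a sketch of the Pod Identity route, once the controller role and its trust policy exist, it can be bound to the Karpenter service account with the AWS CLI; the cluster name, account ID, and role name here are placeholders:

```shell
# Associate the Karpenter controller IAM role with its service account
aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace kube-system \
  --service-account karpenter \
  --role-arn arn:aws:iam::111122223333:role/KarpenterControllerRole
```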
An IAM role for Karpenter-managed nodes
Nodes launched by Karpenter need an EC2 instance profile that grants them the standard EKS worker-node policies: AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, AmazonEC2ContainerRegistryPullOnly, and AmazonSSMManagedInstanceCore.
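As a hedged sketch, these managed policies can be attached with the AWS CLI; the node role name is a placeholder:

```shell
# Attach the standard worker-node managed policies to the node role
for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy \
              AmazonEC2ContainerRegistryPullOnly AmazonSSMManagedInstanceCore; do
  aws iam attach-role-policy \
    --role-name KarpenterNodeRole-my-cluster \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done
```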
Tagged subnets and security groups
Karpenter uses tag-based discovery to find subnets and security groups. The recommended tag is karpenter.sh/discovery: <cluster-name> on the subnets and security groups you want Karpenter to use.
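For example, the discovery tag can be applied with the AWS CLI; the subnet ID, security group ID, and cluster name below are placeholders:

```shell
# Tag the cluster's subnets and security groups for Karpenter discovery
aws ec2 create-tags \
  --resources subnet-0abc1234 sg-0def5678 \
  --tags Key=karpenter.sh/discovery,Value=my-cluster
```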
Helm 3
Karpenter is distributed as an OCI Helm chart at oci://public.ecr.aws/karpenter/karpenter. Helm 3 is required to install it.
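An install from that registry might look like the following sketch; the chart version, namespace, and cluster name are illustrative, and additional chart values (such as the interruption queue) are typically required in practice:

```shell
# Install the Karpenter chart from the public ECR OCI registry
helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version 1.9.0 \
  --namespace kube-system \
  --set settings.clusterName=my-cluster
```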
EC2 Spot service-linked role (if using Spot instances)
If your AWS account has not previously used EC2 Spot, you must create the Spot service-linked role before Karpenter can launch Spot instances:
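```shell
# Create the EC2 Spot service-linked role (one-time, per account)
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com
```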
Compatibility
The table below shows the minimum Karpenter version required for each Kubernetes version:

| Kubernetes | Minimum Karpenter |
|---|---|
| 1.29 | >= 0.34 |
| 1.30 | >= 0.37 |
| 1.31 | >= 1.0.5 |
| 1.32 | >= 1.2 |
| 1.33 | >= 1.5 |
| 1.34 | >= 1.6 |
| 1.35 | 1.9.x |
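For tooling that pins chart versions, the table can be expressed as a small lookup helper; this is an illustrative sketch, not an official Karpenter API:

```python
# Minimum Karpenter version per Kubernetes minor version, from the table above.
MIN_KARPENTER = {
    "1.29": "0.34",
    "1.30": "0.37",
    "1.31": "1.0.5",
    "1.32": "1.2",
    "1.33": "1.5",
    "1.34": "1.6",
    "1.35": "1.9",
}

def min_karpenter_for(k8s_version: str) -> str:
    """Return the minimum Karpenter version for a Kubernetes minor version."""
    try:
        return MIN_KARPENTER[k8s_version]
    except KeyError:
        raise ValueError(f"no compatibility data for Kubernetes {k8s_version}")
```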
Next steps
Installation
Create an EKS cluster, install Karpenter with Helm, and provision your first nodes.
Migrating from Cluster Autoscaler
Step-by-step guide to switching from CAS to Karpenter on an existing cluster.
NodePool concepts
Learn how to configure requirements, limits, disruption policies, and more.
EC2NodeClass concepts
Configure AMI selection, subnets, security groups, and instance settings.