Getting Started with Karpenter

Set up a cluster and add Karpenter

Karpenter automatically provisions new nodes in response to unschedulable pods. Karpenter does this by observing events within the Kubernetes cluster, and then sending commands to the underlying cloud provider.

This guide shows how to get started with Karpenter by creating a Kubernetes cluster and installing Karpenter. To use Karpenter, you must be running a supported Kubernetes cluster on a supported cloud provider.

The guide below explains how to utilize the Karpenter provider for AWS with EKS.

See the AKS Node autoprovisioning article on how to use Karpenter on Azure’s AKS or go to the Karpenter provider for Azure open source repository for self-hosting on Azure and additional information.

Create a cluster and add Karpenter

This guide uses eksctl to create the cluster. It should take less than 1 hour to complete, and cost less than $0.25. Follow the clean-up instructions to reduce any charges.

1. Install utilities

Karpenter is installed in clusters with a Helm chart.

Karpenter requires cloud provider permissions to provision nodes, for AWS IAM Roles for Service Accounts (IRSA) should be used. IRSA permits Karpenter (within the cluster) to make privileged requests to AWS (as the cloud provider) via a ServiceAccount.

Install these tools before proceeding:

AWS CLI
kubectl - the Kubernetes CLI
eksctl (>= v0.202.0) - the CLI for AWS EKS
helm - the package manager for Kubernetes

Configure the AWS CLI with a user that has sufficient privileges to create an EKS cluster. Verify that the CLI can authenticate properly by running aws sts get-caller-identity.

2. Set environment variables

After setting up the tools, set the Karpenter and Kubernetes version:

export KARPENTER_NAMESPACE="kube-system"
export KARPENTER_VERSION="1.14.0"
export K8S_VERSION="1.36"

Then set the following environment variable:

export AWS_PARTITION="aws" # if you are not using standard partitions, you may need to configure to aws-cn / aws-us-gov
export ENABLE_ZONAL_SHIFT="true" # if you are not using Zonal Shift, change this to false
export CLUSTER_NAME="${USER}-karpenter-demo"
export AWS_DEFAULT_REGION="us-west-2"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT="$(mktemp)"
export ALIAS_VERSION="$(aws ssm get-parameter --name "/aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2023/x86_64/standard/recommended/image_id" --query Parameter.Value | xargs aws ec2 describe-images --query 'Images[0].Name' --image-ids | sed -r 's/^.*(v[[:digit:]]+).*$/\1/')"

Warning

If you open a new shell to run steps in this procedure, you need to set some or all of the environment variables again. To remind yourself of these values, type:

echo "${KARPENTER_NAMESPACE}" "${KARPENTER_VERSION}" "${K8S_VERSION}" "${CLUSTER_NAME}" "${AWS_DEFAULT_REGION}" "${AWS_ACCOUNT_ID}" "${TEMPOUT}" "${ALIAS_VERSION}" ${ENABLE_ZONAL_SHIFT}

3. Create a Cluster

Create a basic cluster with eksctl. The following cluster configuration will:

Use CloudFormation to set up the infrastructure needed by the EKS cluster. See CloudFormation for a complete description of what cloudformation.yaml does for Karpenter.
Create a Kubernetes service account and AWS IAM Role, and associate them using IRSA to let Karpenter launch instances.
Add the Karpenter node role to the aws-auth configmap to allow nodes to connect.
Use AWS EKS managed node groups for the kube-system and karpenter namespaces. Uncomment fargateProfiles settings (and comment out managedNodeGroups settings) to use Fargate for both namespaces instead.
Set KARPENTER_IAM_ROLE_ARN variables.
Create a role to allow spot instances.
Run Helm to install Karpenter

curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml  > "${TEMPOUT}" \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}

iam:
  withOIDC: true
  podIdentityAssociations:
  - namespace: "${KARPENTER_NAMESPACE}"
    serviceAccountName: karpenter
    roleName: ${CLUSTER_NAME}-karpenter
    permissionPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerNodeLifecyclePolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerIAMIntegrationPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerEKSIntegrationPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerInterruptionPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerResourceDiscoveryPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerZonalShiftPolicy-${CLUSTER_NAME}

iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes
  ## If you intend to run Windows workloads, the kube-proxy group should be specified.
  # For more information, see https://github.com/aws/karpenter/issues/5099.
  # - eks:kube-proxy-windows

managedNodeGroups:
- instanceType: m5.large
  amiFamily: AmazonLinux2023
  name: ${CLUSTER_NAME}-ng
  desiredCapacity: 2
  minSize: 1
  maxSize: 10

zonalShiftConfig:
  enabled: ${ENABLE_ZONAL_SHIFT}

addons:
- name: eks-pod-identity-agent
EOF

export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name "${CLUSTER_NAME}" --query "cluster.endpoint" --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"

echo "${CLUSTER_ENDPOINT} ${KARPENTER_IAM_ROLE_ARN}"

curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml  > $TEMPOUT \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}

iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: karpenter
      namespace: "${KARPENTER_NAMESPACE}"
    roleName: ${CLUSTER_NAME}-karpenter
    attachPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerNodeLifecyclePolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerIAMIntegrationPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerEKSIntegrationPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerInterruptionPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerResourceDiscoveryPolicy-${CLUSTER_NAME}
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerZonalShiftPolicy-${CLUSTER_NAME}
    roleOnly: true

iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes
  ## If you intend to run Windows workloads, the kube-proxy group should be specified.
  # For more information, see https://github.com/aws/karpenter/issues/5099.
  # - eks:kube-proxy-windows

zonalShiftConfig:
  enabled: ${ENABLE_ZONAL_SHIFT}

fargateProfiles:
- name: karpenter
  selectors:
  - namespace: "${KARPENTER_NAMESPACE}"
EOF

export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"

echo $CLUSTER_ENDPOINT $KARPENTER_IAM_ROLE_ARN

Unless your AWS account has already onboarded to EC2 Spot, you will need to create the service linked role to avoid the ServiceLinkedRoleCreationNotPermitted error.

aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true
# If the role has already been successfully created, you will see:
# An error occurred (InvalidInput) when calling the CreateServiceLinkedRole operation: Service role name AWSServiceRoleForEC2Spot has been taken in this account, please try a different suffix.

Windows Support Notice

In order to run Windows workloads, Windows support should be enabled in your EKS Cluster. See Enabling Windows support to learn more.

4. Install Karpenter

# Logout of helm registry to perform an unauthenticated pull against the public ECR
helm registry logout public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set "settings.enableZonalShift=${ENABLE_ZONAL_SHIFT}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

# Logout of helm registry to perform an unauthenticated pull against the public ECR
helm registry logout public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set "settings.enableZonalShift=${ENABLE_ZONAL_SHIFT}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter" \
  --wait

As the OCI Helm chart is signed by Cosign as part of the release process you can verify the chart before installing it by running the following command.

cosign verify public.ecr.aws/karpenter/karpenter:1.14.0 \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  --certificate-identity-regexp='https://github\.com/aws/karpenter-provider-aws/\.github/workflows/release\.yaml@.+' \
  --certificate-github-workflow-repository=aws/karpenter-provider-aws \
  --certificate-github-workflow-name=Release \
  --certificate-github-workflow-ref=refs/tags/v1.14.0 \
  --annotations version=1.14.0

DNS Policy Notice

Karpenter uses the ClusterFirst pod DNS policy by default. This is the Kubernetes cluster default and this ensures that Karpenter can reach-out to internal Kubernetes services during its lifetime. There may be cases where you do not have the DNS service that you are using on your cluster up-and-running before Karpenter starts up. The most common case of this is you want Karpenter to manage the node capacity where your DNS service pods are running.

If you need Karpenter to manage the DNS service pods’ capacity, this means that DNS won’t be running when Karpenter starts-up. In this case, you will need to set the pod DNS policy to Default with --set dnsPolicy=Default. This will tell Karpenter to use the host’s DNS resolution instead of the internal DNS resolution, ensuring that you don’t have a dependency on the DNS service pods to run. More details on this issue can be found in the following Github issues: #2186 and #4947.

Warning

Karpenter creates a mapping between CloudProvider machines and CustomResources in the cluster for capacity tracking. To ensure this mapping is consistent, Karpenter utilizes the following tag keys:

karpenter.sh/managed-by
karpenter.sh/nodepool
kubernetes.io/cluster/${CLUSTER_NAME}

Because Karpenter takes this dependency, any user that has the ability to Create/Delete these tags on CloudProvider machines will have the ability to orchestrate Karpenter to Create/Delete CloudProvider machines as a side effect. We recommend that you enforce tag-based IAM policies on these tags against any EC2 instance resource (i-*) for any users that might have CreateTags/DeleteTags permissions but should not have RunInstances/TerminateInstances permissions.

5. Create NodePool

A single Karpenter NodePool is capable of handling many different pod shapes. Karpenter makes scheduling and provisioning decisions based on pod attributes such as labels and affinity. In other words, Karpenter eliminates the need to manage many different node groups.

Create a default NodePool using the command below. This NodePool uses securityGroupSelectorTerms and subnetSelectorTerms to discover resources used to launch nodes. We applied the tag karpenter.sh/discovery in the eksctl command above. Depending on how these resources are shared between clusters, you may need to use different tagging schemes.

The consolidationPolicy set to WhenEmptyOrUnderutilized in the disruption block configures Karpenter to reduce cluster cost by removing and replacing nodes. As a result, consolidation will terminate any empty nodes on the cluster. This behavior can be disabled by setting consolidateAfter to Never, telling Karpenter that it should never consolidate nodes. Review the NodePool API docs for more information.

Note: This NodePool will create capacity as long as the sum of all created capacity is less than the specified limit.

cat <<EOF | envsubst | kubectl apply -f -
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      expireAfter: 720h # 30 * 24h = 720h
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  role: "KarpenterNodeRole-${CLUSTER_NAME}" # replace with your cluster name
  amiSelectorTerms:
    - alias: "al2023@${ALIAS_VERSION}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
EOF

Karpenter is now active and ready to begin provisioning nodes.

6. Scale up deployment

This deployment uses the pause image and starts with zero replicas.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      containers:
      - name: inflate
        image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
        resources:
          requests:
            cpu: 1
        securityContext:
          allowPrivilegeEscalation: false
EOF

kubectl scale deployment inflate --replicas 5
kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller

7. Scale down deployment

Now, delete the deployment. After a short amount of time, Karpenter should terminate the empty nodes due to consolidation.

kubectl delete deployment inflate
kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller

8. Delete Karpenter nodes manually

If you delete a node with kubectl, Karpenter will gracefully cordon, drain, and shutdown the corresponding instance. Under the hood, Karpenter adds a finalizer to the node object, which blocks deletion until all pods are drained and the instance is terminated. Keep in mind, this only works for nodes provisioned by Karpenter.

kubectl delete node "${NODE_NAME}"

9. Delete the cluster

To avoid additional charges, remove the demo infrastructure from your AWS account.

helm uninstall karpenter --namespace "${KARPENTER_NAMESPACE}"
aws cloudformation delete-stack --stack-name "Karpenter-${CLUSTER_NAME}"
aws ec2 describe-launch-templates --filters "Name=tag:karpenter.k8s.aws/cluster,Values=${CLUSTER_NAME}" |
    jq -r ".LaunchTemplates[].LaunchTemplateName" |
    xargs -I{} aws ec2 delete-launch-template --launch-template-name {}
eksctl delete cluster --name "${CLUSTER_NAME}"

Monitoring with Grafana (optional)

This section describes optional ways to configure Karpenter to enhance its capabilities. In particular, the following commands deploy a Prometheus and Grafana stack that is suitable for this guide but does not include persistent storage or other configurations that would be necessary for monitoring a production deployment of Karpenter. This deployment includes two Karpenter dashboards that are automatically onboarded to Grafana. They provide a variety of visualization examples on Karpenter metrics.

helm repo add grafana-charts https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

kubectl create namespace monitoring

curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/prometheus-values.yaml | envsubst | tee prometheus-values.yaml
helm install --namespace monitoring prometheus prometheus-community/prometheus --values prometheus-values.yaml

curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/grafana-values.yaml | tee grafana-values.yaml
helm install --namespace monitoring grafana grafana-charts/grafana --values grafana-values.yaml

The Grafana instance may be accessed using port forwarding.

kubectl port-forward --namespace monitoring svc/grafana 3000:80

The new stack has only one user, admin, and the password is stored in a secret. The following command will retrieve the password.

kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode

Zonal Shift Onboarding (optional)

You can optionally enable the Zonal Shift integration with Karpenter which provides additional improvements to your application environment’s fault tolerance and application recovery. For more details on how Zonal Shift works with EKS, see Learn about ARC zonal shift in Amazon EKS.

To do this, the EKS cluster must first be enabled for Zonal Shift. This can be done through the AWS Console, AWS CLI, or eksctl. For step-by-step instructions, see Enable EKS zonal shift.

eksctl utils update-zonal-shift-config --cluster=${CLUSTER_NAME} --enabled

Once the cluster has been enabled for Zonal Shift, you can use Zonal Shift controls to shift traffic and scaling operations. For more details see the guide on using Zonal Shift with EKS.

If you are installing Karpenter for the first time, pass --set "settings.enableZonalShift=true" during the initial Helm install to enable the integration.

Onboarding an existing cluster to Zonal Shift

If you are upgrading an existing Karpenter installation to v1.12.0+ and want to enable Zonal Shift, follow these steps:

Add the required IAM permission. The Karpenter controller role needs the arc-zonal-shift:GetManagedResource permission. If you are using the getting started guide’s cloudformation template, re-apply the template to pick up the new KarpenterControllerZonalShiftPolicy. If you manage IAM manually, add the following statement to your controller role policy:
```
{
  "Sid": "AllowZonalShiftStatusReadOnly",
  "Effect": "Allow",
  "Resource": "*",
  "Action": [
    "arc-zonal-shift:GetManagedResource"
  ],
  "Condition": {
    "StringEquals": {
      "arc-zonal-shift:ResourceIdentifier": "arn:aws:eks:<region>:<account-id>:cluster/<cluster-name>"
    }
  }
}
```
Karpenter also requires permission for eks:DescribeCluster when Zonal Shift is enabled. The standard onboarding’s KarpenterControllerEKSIntegrationPolicy already grants this; if you manage IAM manually, make sure the controller role has eks:DescribeCluster on the cluster resource.

Enable Zonal Shift on the EKS cluster if not already enabled:

eksctl utils update-zonal-shift-config --cluster=${CLUSTER_NAME} --enabled

Enable Zonal Shift in Karpenter by upgrading the Helm release with the settings.enableZonalShift value:
```
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --reuse-values \
  --set "settings.enableZonalShift=true"
```
Alternatively, set the ENABLE_ZONAL_SHIFT=true environment variable on the Karpenter controller deployment.

Next Steps: Validate with a Zonal Shift

After enabling the integration, test that Karpenter correctly avoids launching nodes in a shifted zone by starting a zonal shift on your cluster. You can do this in one of two ways:

Manual zonal shift: Start a zonal shift from the ARC console or CLI to temporarily shift traffic away from one AZ.
FIS experiment: Create an AWS FIS experiment using the aws:arc:start-zonal-autoshift action to simulate a zonal autoshift.

While the shift is active, scale up a workload to trigger new node provisioning and verify that Karpenter only launches nodes in non-shifted zones:

kubectl scale deployment/inflate --replicas=5
kubectl get nodeclaims -o wide

For more details on designing your cluster for AZ resilience, see Learn about ARC zonal shift in Amazon EKS.

Advanced Installation

The section below covers advanced installation techniques for installing Karpenter. This includes things such as running Karpenter on a cluster without public internet access or ensuring that Karpenter avoids getting throttled by other components in your cluster.

Private Clusters

You can optionally install Karpenter on a private cluster using the eksctl installation by setting privateCluster.enabled to true in your ClusterConfig and by setting --set settings.isolatedVPC=true when installing the karpenter Helm chart.

privateCluster:
  enabled: true

Private clusters have no outbound access to the internet. This means that in order for Karpenter to reach out to the services that it needs to access, you need to enable specific VPC private endpoints. Below shows the endpoints that you need to enable to successfully run Karpenter in a private cluster:

com.amazonaws.<region>.ec2
com.amazonaws.<region>.ecr.api
com.amazonaws.<region>.ecr.dkr
com.amazonaws.<region>.s3 – For pulling container images
com.amazonaws.<region>.sts – For IAM roles for service accounts
com.amazonaws.<region>.ssm - For resolving default AMIs
com.amazonaws.<region>.sqs - For accessing SQS if using interruption handling
com.amazonaws.<region>.eks - For Karpenter to discover the cluster endpoint

If you do not currently have these endpoints surfaced in your VPC, you can add the endpoints by running

aws ec2 create-vpc-endpoint --vpc-id ${VPC_ID} --service-name ${SERVICE_NAME} --vpc-endpoint-type Interface --subnet-ids ${SUBNET_IDS} --security-group-ids ${SECURITY_GROUP_IDS}

Note

Karpenter (controller and webhook deployment) container images must be in or copied to Amazon ECR private or to another private registry accessible from inside the VPC. If these are not available from within the VPC, or from networks peered with the VPC, you will get Image pull errors when Kubernetes tries to pull these images from ECR public.

Note

There is currently no VPC private endpoint for the IAM API. As a result, you cannot use the default spec.role field in your EC2NodeClass. Instead, you need to provision and manage an instance profile manually and then specify Karpenter to use this instance profile through the spec.instanceProfile field.

You can provision an instance profile manually and assign a Node role to it by calling the following command

aws iam create-instance-profile --instance-profile-name "KarpenterNodeInstanceProfile-${CLUSTER_NAME}"
aws iam add-role-to-instance-profile --instance-profile-name "KarpenterNodeInstanceProfile-${CLUSTER_NAME}" --role-name "KarpenterNodeRole-${CLUSTER_NAME}"

Note

There is currently no VPC private endpoint for the Price List Query API. As a result, pricing data can go stale over time. By default, Karpenter ships a static price list that is updated when each binary is released.

Failed requests for pricing data will result in the following error messages

ERROR   controller.aws.pricing  updating on-demand pricing, RequestError: send request failed
caused by: Post "https://api.pricing.us-east-1.amazonaws.com/": dial tcp 52.94.231.236:443: i/o timeout; RequestError: send request failed
caused by: Post "https://api.pricing.us-east-1.amazonaws.com/": dial tcp 52.94.231.236:443: i/o timeout, using existing pricing data from 2022-08-17T00:19:52Z  {"commit": "4b5f953"}

Preventing APIServer Request Throttling

Kubernetes uses FlowSchemas and PriorityLevelConfigurations to map calls to the API server into buckets which determine each user agent’s throttling limits.

By default, Karpenter is installed into the kube-system namespace, which leverages the system-leader-election and kube-system-service-accounts FlowSchemas to map calls from the kube-system namespace to the leader-election and workload-high PriorityLevelConfigurations respectively. By putting Karpenter in these PriorityLevelConfigurations, we ensure that Karpenter and other critical cluster components are able to run even if other components on the cluster are throttled in other PriorityLevelConfigurations.

If you install Karpenter in a different namespace than the default kube-system namespace, Karpenter will not be put into these higher-priority FlowSchemas by default. Instead, you will need to create custom FlowSchemas for the namespace and service account where Karpenter is installed to ensure that requests are put into this higher PriorityLevelConfiguration.

cat <<EOF | envsubst | kubectl apply -f -
---
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: karpenter-leader-election
spec:
  distinguisherMethod:
    type: ByUser
  matchingPrecedence: 200
  priorityLevelConfiguration:
    name: leader-election
  rules:
  - resourceRules:
    - apiGroups:
      - coordination.k8s.io
      namespaces:
      - '*'
      resources:
      - leases
      verbs:
      - get
      - create
      - update
    subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: karpenter
        namespace: "${KARPENTER_NAMESPACE}"

---
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: karpenter-workload
spec:
  distinguisherMethod:
    type: ByUser
  matchingPrecedence: 1000
  priorityLevelConfiguration:
    name: workload-high
  rules:
  - nonResourceRules:
    - nonResourceURLs:
      - '*'
      verbs:
      - '*'
    resourceRules:
    - apiGroups:
      - '*'
      clusterScope: true
      namespaces:
      - '*'
      resources:
      - '*'
      verbs:
      - '*'
    subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: karpenter
        namespace: "${KARPENTER_NAMESPACE}"
EOF

Last modified July 22, 2026: docs: add conceptual documentation for NodeClaims (#9125) (3deed39)