v1 Migration

Migrating to Karpenter v1.0

This migration guide is designed to help you migrate to Karpenter v1.0.x from v0.33.x through v0.37.x. Use this document as a reference to the changes that were introduced in this release and as a guide to how you need to update the manifests and other Karpenter objects you created in previous Karpenter releases.

Before continuing with this guide, you should know that Karpenter v1.0.x only supports Karpenter v1 and v1beta1 APIs. Earlier Provisioner, AWSNodeTemplate, and Machine APIs are not supported. Do not upgrade to v1.0.x without first upgrading to v0.32.x and then upgrading to v0.33+.

Additionally, validate that you are running at least Kubernetes 1.25. Use the compatibility matrix to confirm you are on a supported Kubernetes version.
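
For example, you can confirm your control plane and node versions with standard kubectl commands before starting:

kubectl version
kubectl get nodes   # kubelet versions appear in the VERSION column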

Before You Start

Karpenter v1.0 is a major release and contains a number of breaking changes. The following section will highlight some of the major breaking changes, but you should review the full changelog before proceeding with the upgrade.

Forceful Expiration

Previously, Karpenter would only begin draining an expired node after ensuring graceful disruption was possible and replacement capacity was provisioned. Now, NodeClaim expiration is forceful. Once a NodeClaim expires, Karpenter will immediately begin draining the corresponding Node without first provisioning a replacement Node. This forceful approach means that all expired NodeClaims that were previously blocked from disruption (due to Pod Disruption Budgets (PDBs), do-not-disrupt pods, or other constraints) will be expired. The scheduler will then attempt to provision new Nodes for the disrupted workloads as they are draining. This behavior may lead to an increased number of pods in the “Pending” state while replacement capacity is being provisioned. Before upgrading, users should check for expired NodeClaims and resolve any disruption blockers to ensure a smooth upgrade process.
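
If you want to audit this ahead of time, one option (a sketch using standard kubectl and jq, not an official tool) is to compare NodeClaim ages against your NodePools’ expireAfter values and look for workloads that would previously have blocked expiration:

# NodeClaim ages, to compare against each NodePool's expireAfter
kubectl get nodeclaims -o wide

# Pods that opt out of graceful disruption and would previously have blocked expiration
kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.annotations."karpenter.sh/do-not-disrupt" == "true") | "\(.metadata.namespace)/\(.metadata.name)"'

# Pod Disruption Budgets that may leave pods pending while replacement capacity is provisioned
kubectl get pdb -A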

Deprecated Annotations, Labels, and Tags Removed

The following annotations, labels, and tags have been removed in v1.0.0:

Key | Type
karpenter.sh/do-not-consolidate | annotation
karpenter.sh/do-not-evict | annotation
karpenter.sh/managed-by | tag

Both the karpenter.sh/do-not-consolidate and the karpenter.sh/do-not-evict annotations were deprecated in v0.32.0. They have now been dropped in favor of their replacement, karpenter.sh/do-not-disrupt.

The karpenter.sh/managed-by tag, which stores the cluster name in its value, has been replaced by eks:eks-cluster-name to support EKS Pod Identity ABAC policies.

Zap Logging Config Removed

Support for setting the Zap logging config was deprecated in v0.32.0 and has been removed in v1.0.0. The following environment variables are now available to configure logging:

  • LOG_LEVEL
  • LOG_OUTPUT_PATHS
  • LOG_ERROR_OUTPUT_PATHS

Refer to Settings for more details.
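
For example, to raise the log level you can set the corresponding environment variable on the controller. The snippet below is a sketch that assumes your chart version exposes controller.env for additional environment variables (check the chart's values before relying on it):

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" \
  --namespace "${KARPENTER_NAMESPACE}" --reuse-values \
  --set "controller.env[0].name=LOG_LEVEL" \
  --set "controller.env[0].value=debug"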

New MetadataOptions Defaults

The default value for httpPutResponseHopLimit has been reduced from 2 to 1. This prevents pods that are not using hostNetworking from accessing IMDS by default. If you have pods which rely on access to IMDS, and are not using hostNetworking, you will need to either update the pod’s networking config or configure httpPutResponseHopLimit on your EC2NodeClass. This change aligns Karpenter’s defaults with EKS’ Best Practices.
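
If you need to restore the previous behavior for such pods, a minimal sketch looks like the following; the name is illustrative and other required EC2NodeClass fields (role, subnet and security group selectors, AMI selection) are omitted:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: example # hypothetical name
spec:
  metadataOptions:
    httpPutResponseHopLimit: 2 # restore the pre-v1 default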

Ubuntu AMIFamily Removed

Support for automatic AMI selection and UserData generation for Ubuntu has been dropped in Karpenter v1.0.0. To continue using Ubuntu AMIs you will need to specify an AMI using amiSelectorTerms.

UserData generation can be achieved using amiFamily: AL2, which has an identical UserData format. However, compatibility is not guaranteed long-term and changes to either AL2 or Ubuntu’s UserData format may introduce incompatibilities. If this occurs, amiFamily: Custom should be used for Ubuntu AMIs and UserData will need to be entirely maintained by the user.

If you are upgrading to v1.0.x and already have v1beta1 Ubuntu EC2NodeClasses, all you need to do is specify amiSelectorTerms; Karpenter will translate your EC2NodeClasses to the v1 equivalent (as shown below). Failure to specify amiSelectorTerms will result in the EC2NodeClass and all referencing NodePools becoming NotReady. These NodePools and EC2NodeClasses would then be ignored for provisioning and drift.

# Original v1beta1 EC2NodeClass
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
spec:
  amiFamily: Ubuntu
  amiSelectorTerms:
    - id: ami-foo
---
# Conversion Webhook Output
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  annotations:
    compatibility.karpenter.k8s.aws/v1beta1-ubuntu: amiFamily,blockDeviceMappings
spec:
  amiFamily: AL2
  amiSelectorTerms:
    - id: ami-foo
  blockDeviceMappings:
    - deviceName: '/dev/sda1'
      rootVolume: true
      ebs:
        encrypted: true
        volumeType: gp3
        volumeSize: 20Gi

New Registration Taint

EC2NodeClasses using amiFamily: Custom must configure the kubelet to register with the karpenter.sh/unregistered:NoExecute taint. For example, to achieve this with an AL2023 AMI you would use the following UserData:

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
spec:
  amiFamily: Custom
  amiSelectorTerms:
    - id: ami-custom-al2023-ami
  userData: |
    apiVersion: node.eks.aws/v1alpha1
    kind: NodeConfig
    spec:
      # ...
      kubelet:
        config:
          # ...
          registerWithTaints:
            - key: karpenter.sh/unregistered
              effect: NoExecute    

If you are using one of Karpenter’s managed AMI families, this will be handled for you by Karpenter’s generated UserData.

Upgrading

Before proceeding with the upgrade, be sure to review the changelog and the upgrade procedure in its entirety. The procedure can be split into two sections:

  • Steps 1 through 6 will upgrade you to the latest patch release on your current minor version.
  • Steps 7 through 11 will then upgrade you to the latest v1.0 release.

While it is possible to upgrade directly from any patch version on v0.33 through v0.37, rollback from v1.0.x is only supported from the latest patch releases. Upgrading directly from an older patch may leave you unable to roll back. For more information on the rollback procedure, refer to the downgrading section.

Upgrade Procedure

  1. Configure environment variables for the cluster you’re upgrading:

    export AWS_PARTITION="aws" # if you are not using standard partitions, you may need to set this to aws-cn / aws-us-gov
    export CLUSTER_NAME="${USER}-karpenter-demo"
    export AWS_REGION="us-west-2"
    export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
    export KARPENTER_NAMESPACE=kube-system
    export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
    
  2. Determine your current Karpenter version:

    kubectl get deployment -A -l app.kubernetes.io/name=karpenter -ojsonpath="{.items[0].metadata.labels['app\.kubernetes\.io/version']}{'\n'}"
    

    To upgrade to v1, you must be running a Karpenter version between v0.33 and v0.37. If you are on an older version, you must upgrade before continuing with this guide.

  3. Before upgrading to v1, we’re going to upgrade to a patch release that supports rollback. Set the KARPENTER_VERSION environment variable to the latest patch release for your current minor version. The following releases are the current latest:

    • 0.37.6
    • 0.36.8
    • 0.35.11
    • v0.34.12
    • v0.33.11
    # Note: v0.33.x and v0.34.x releases include the v prefix; omit it for v0.35+
    export KARPENTER_VERSION="0.37.6" # Replace with the latest patch for your minor version
    
  4. Upgrade Karpenter to the latest patch release for your current minor version. Note that webhooks must be enabled.

    # Service account annotation can be dropped when using pod identity
    helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
      --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
      --set settings.clusterName=${CLUSTER_NAME} \
      --set settings.interruptionQueue=${CLUSTER_NAME} \
      --set controller.resources.requests.cpu=1 \
      --set controller.resources.requests.memory=1Gi \
      --set controller.resources.limits.cpu=1 \
      --set controller.resources.limits.memory=1Gi \
      --set webhook.enabled=true \
      --set webhook.port=8443 \
      --wait
    
  5. Apply the latest patch version of your current minor version’s Custom Resource Definitions (CRDs). Applying this version of the CRDs will enable the use of both the v1 and v1beta1 APIs on this version via the conversion webhooks. Note that this is only for rollback purposes, and new features available with the v1 APIs will not work on your minor version.

    helm upgrade --install karpenter-crd oci://public.ecr.aws/karpenter/karpenter-crd --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
        --set webhook.enabled=true \
        --set webhook.serviceName="karpenter" \
        --set webhook.port=8443
    
  6. Validate that Karpenter is operating as expected on this patch release. If you need to roll back after upgrading to v1, this is the version you will need to roll back to.
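
    For example, you might confirm that the controller is healthy and still reconciling your resources (a simple spot check, not an exhaustive validation):

    kubectl get pods -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter
    kubectl logs -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter --tail=50
    kubectl get nodepools,nodeclaims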

  7. We’re now ready to begin the upgrade to v1. Set the KARPENTER_VERSION environment variable to the latest v1.0.x release.

    export KARPENTER_VERSION="1.0.8"
    
  8. Attach the v1 policy to your existing Karpenter controller IAM role. Notable changes to the IAM policy include additional tag-scoping for the eks:eks-cluster-name tag for instances and instance profiles. We will remove this additional policy later, once the controller has been migrated to v1 and the Karpenter CloudFormation stack has been updated.

    POLICY_DOCUMENT=$(mktemp)
    curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/13d6fc014ea59019b1c3b1953184efc41809df11/website/content/en/v1.0/upgrading/get-controller-policy.sh | sh | envsubst > ${POLICY_DOCUMENT}
    POLICY_NAME="KarpenterControllerPolicy-${CLUSTER_NAME}-v1"
    ROLE_NAME="${CLUSTER_NAME}-karpenter"
    POLICY_ARN="$(aws iam create-policy --policy-name "${POLICY_NAME}" --policy-document "file://${POLICY_DOCUMENT}" | jq -r .Policy.Arn)"
    aws iam attach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
    
  9. Apply the v1 Custom Resource Definitions (CRDs):

    helm upgrade --install karpenter-crd oci://public.ecr.aws/karpenter/karpenter-crd --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
        --set webhook.enabled=true \
        --set webhook.serviceName="karpenter" \
        --set webhook.port=8443
    
  10. Upgrade Karpenter to the latest v1.0.x release.

    # Service account annotation can be dropped when using pod identity
    helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
      --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
      --set settings.clusterName=${CLUSTER_NAME} \
      --set settings.interruptionQueue=${CLUSTER_NAME} \
      --set controller.resources.requests.cpu=1 \
      --set controller.resources.requests.memory=1Gi \
      --set controller.resources.limits.cpu=1 \
      --set controller.resources.limits.memory=1Gi \
      --wait
    
  11. Upgrade your CloudFormation stack and remove the temporary v1 controller policy.

    TEMPOUT=$(mktemp)
    curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}"
    aws cloudformation deploy \
      --stack-name "Karpenter-${CLUSTER_NAME}" \
      --template-file "${TEMPOUT}" \
      --capabilities CAPABILITY_NAMED_IAM \
      --parameter-overrides "ClusterName=${CLUSTER_NAME}"
    
    ROLE_NAME="${CLUSTER_NAME}-karpenter"
    POLICY_NAME="KarpenterControllerPolicy-${CLUSTER_NAME}-v1"
    POLICY_ARN=$(aws iam list-policies --query "Policies[?PolicyName=='${POLICY_NAME}'].Arn" --output text)
    aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
    aws iam delete-policy --policy-arn "${POLICY_ARN}"
    

Downgrading

Once you upgrade to Karpenter v1.0.x, both v1 and v1beta1 resources may be stored in etcd. Because of this, you can only roll back to a version of Karpenter that includes the conversion webhooks. The following releases should be used as rollback targets:

  • v0.37.6
  • v0.36.8
  • v0.35.12
  • v0.34.12
  • v0.33.11

Downgrade Procedure

  1. Configure environment variables for the cluster you’re downgrading:

    export AWS_PARTITION="aws" # if you are not using standard partitions, you may need to set this to aws-cn / aws-us-gov
    export CLUSTER_NAME="${USER}-karpenter-demo"
    export AWS_REGION="us-west-2"
    export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
    export KARPENTER_NAMESPACE=kube-system
    export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
    
  2. Configure your target Karpenter version. You should select one of the following versions:

    • 0.37.6
    • 0.36.8
    • 0.35.12
    • v0.34.12
    • v0.33.11
    # Note: v0.33.x and v0.34.x releases include the v prefix; omit it for v0.35+
    export KARPENTER_VERSION="0.37.6" # Replace with the latest patch for your minor version
    
  3. Attach the v1beta1 policy from your target version to your existing Karpenter controller IAM role.

    POLICY_DOCUMENT=$(mktemp)
    curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/website/docs/v1.0/upgrading/get-controller-policy.sh | sh | envsubst > ${POLICY_DOCUMENT}
    POLICY_NAME="KarpenterControllerPolicy-${CLUSTER_NAME}-${KARPENTER_VERSION}"
    ROLE_NAME="${CLUSTER_NAME}-karpenter"
    POLICY_ARN="$(aws iam create-policy --policy-name "${POLICY_NAME}" --policy-document "file://${POLICY_DOCUMENT}" | jq -r .Policy.Arn)"
    aws iam attach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
    
  4. Roll back the Karpenter controller. Note that webhooks must be enabled to roll back; without them, Karpenter will be unable to correctly operate on v1 versions of the resources already stored in etcd.

    # Service account annotation can be dropped when using pod identity
    helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
      --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
      --set settings.clusterName=${CLUSTER_NAME} \
      --set settings.interruptionQueue=${CLUSTER_NAME} \
      --set controller.resources.requests.cpu=1 \
      --set controller.resources.requests.memory=1Gi \
      --set controller.resources.limits.cpu=1 \
      --set controller.resources.limits.memory=1Gi \
      --set webhook.enabled=true \
      --set webhook.port=8443 \
      --wait
    
  5. Roll back the CRDs.

    helm upgrade --install karpenter-crd oci://public.ecr.aws/karpenter/karpenter-crd --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
      --set webhook.enabled=true \
      --set webhook.serviceName=karpenter \
      --set webhook.port=8443
    
  6. Roll back your CloudFormation stack and remove the temporary v1beta1 controller policy.

    TEMPOUT=$(mktemp)
    VERSION_TAG=$([[ ${KARPENTER_VERSION} == v* ]] && echo "${KARPENTER_VERSION}" || echo "v${KARPENTER_VERSION}")
    curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/${VERSION_TAG}/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}"
    aws cloudformation deploy \
      --stack-name "Karpenter-${CLUSTER_NAME}" \
      --template-file "${TEMPOUT}" \
      --capabilities CAPABILITY_NAMED_IAM \
      --parameter-overrides "ClusterName=${CLUSTER_NAME}"
    
    ROLE_NAME="${CLUSTER_NAME}-karpenter"
    POLICY_NAME="KarpenterControllerPolicy-${CLUSTER_NAME}-${KARPENTER_VERSION}"
    POLICY_ARN=$(aws iam list-policies --query "Policies[?PolicyName=='${POLICY_NAME}'].Arn" --output text)
    aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "${POLICY_ARN}"
    aws iam delete-policy --policy-arn "${POLICY_ARN}"
    

Before Upgrading to v1.1.0

You’ve successfully upgraded to v1.0, but your manifests are most likely still v1beta1. You can continue to apply these v1beta1 manifests on v1.0, but support for them will be dropped in v1.1. Before upgrading to v1.1+, you will need to migrate your manifests to v1.

Manifest Migration

You can manually migrate your manifests by referring to the changelog and the updated API docs (NodePool, EC2NodeClass). Alternatively, you can take advantage of the conversion webhooks. Performing a get using kubectl will return the v1 version of the resource, even if it was applied with a v1beta1 manifest.

For example, applying the following v1beta1 manifest and performing a get will return the v1 equivalent:

cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # 30 * 24h = 720h
EOF
kubectl get nodepools default -o yaml > v1-nodepool.yaml

Kubelet Configuration Migration

One of the changes made to the NodePool and EC2NodeClass schemas for v1 was the migration of the kubelet field from the NodePool to the EC2NodeClass. This change is difficult to properly handle with conversion webhooks due to the many-to-one relation between NodePools and EC2NodeClasses. To facilitate this, Karpenter adds the compatibility.karpenter.sh/v1beta1-kubelet-conversion annotation to converted NodePools. If this annotation is present, it will take precedence over the kubelet field in the EC2NodeClass.

This annotation is only meant to support migration, and support will be dropped in v1.1. Before upgrading to v1.1+, you must migrate your kubelet configuration to your EC2NodeClasses, and remove the compatibility annotation from your NodePools.

If you have multiple NodePools that refer to the same EC2NodeClass but have varying kubelet configurations, you will need to create a separate EC2NodeClass for each unique set of kubelet configurations.

For example, consider the following v1beta1 manifests:

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: nodepool-a
spec:
  template:
    spec:
      kubelet:
        maxPods: 10
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: nodeclass
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: nodepool-b
spec:
  template:
    spec:
      kubelet:
        maxPods: 20
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: nodeclass
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: nodeclass

In this example, we have two NodePools with different kubelet values, but they refer to the same EC2NodeClass. The conversion webhook will annotate the NodePools with the compatibility.karpenter.sh/v1beta1-kubelet-conversion annotation. This is the result of that conversion:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool-a
  annotations:
    compatibility.karpenter.sh/v1beta1-kubelet-conversion: "{\"maxPods\": 10}"
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: nodeclass
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool-b
  annotations:
    compatibility.karpenter.sh/v1beta1-kubelet-conversion: "{\"maxPods\": 20}"
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: nodeclass
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: nodeclass

Before upgrading to v1.1, you must update your NodePools to refer to separate EC2NodeClasses to retain this behavior. Note that this will drift the Nodes associated with these NodePools due to the updated nodeClassRef.

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool-a
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: nodeclass-a
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool-b
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: nodeclass-b
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: nodeclass-a
spec:
  kubelet:
    maxPods: 10
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: nodeclass-b
spec:
  kubelet:
    maxPods: 20
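
Once the kubelet configuration has been moved onto the EC2NodeClasses, the compatibility annotation can be removed from the NodePools. For example, using kubectl's trailing-dash syntax to remove an annotation (NodePool names taken from the example above):

kubectl annotate nodepools nodepool-a nodepool-b compatibility.karpenter.sh/v1beta1-kubelet-conversion-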

NodeClassRef Requirements

Starting with Karpenter v1.1.0, nodeClassRef.group and nodeClassRef.kind are strictly required on both NodePools and NodeClaims. Ensure these values are set for all resources before upgrading Karpenter. Failing to do so will result in Karpenter being unable to operate against those resources. For the AWS provider, the group will always be karpenter.k8s.aws and the kind will always be EC2NodeClass.
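
One way to audit this (a sketch using kubectl and jq, not an official check) is to list resources whose nodeClassRef is missing either field:

kubectl get nodepools -o json | jq -r '.items[] | select((.spec.template.spec.nodeClassRef.group // "") == "" or (.spec.template.spec.nodeClassRef.kind // "") == "") | .metadata.name'
kubectl get nodeclaims -o json | jq -r '.items[] | select((.spec.nodeClassRef.group // "") == "" or (.spec.nodeClassRef.kind // "") == "") | .metadata.name'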

Stored Version Migration

Once you have upgraded all of your manifests, you need to ensure that all existing resources are stored as v1 in etcd. Karpenter v1.0.6+ includes a controller that automatically migrates all stored resources to v1. To validate that the migration was successful, check the stored versions for Karpenter’s CRDs:

for crd in "nodepools.karpenter.sh" "nodeclaims.karpenter.sh" "ec2nodeclasses.karpenter.k8s.aws"; do
    kubectl get crd ${crd} -ojsonpath="{.status.storedVersions}{'\n'}"
done
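
Once the migration controller has completed its work, each CRD should report only the v1 API as a stored version, so you should expect output similar to:

["v1"]
["v1"]
["v1"]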

For more details on this migration process, refer to the Kubernetes docs.

Changelog

  • Features:
    • AMI Selector Terms has a new Alias field, which can only be set by itself in EC2NodeClass.Spec.AMISelectorTerms (see the sketch after this changelog)
    • Disruption Budgets by Reason was added to NodePool.Spec.Disruption.Budgets
    • TerminationGracePeriod was added to NodePool.Spec.Template.Spec
    • LOG_OUTPUT_PATHS and LOG_ERROR_OUTPUT_PATHS environment variables were added
  • API Rename: NodePool’s ConsolidationPolicy WhenUnderutilized is now renamed to WhenEmptyOrUnderutilized
  • Behavior Changes:
    • Expiration is now forceful: Karpenter begins draining a node as soon as it expires. It does not wait for replacement capacity to be available before draining, but it will start provisioning a replacement as soon as the node expires and begins draining.
    • Karpenter’s generated NodeConfig now takes precedence when generating UserData with the AL2023 amiFamily. If you’re setting any values managed by Karpenter in your AL2023 UserData, configure these through Karpenter natively (e.g. kubelet configuration fields).
    • Karpenter now adds a karpenter.sh/unregistered:NoExecute taint to nodes in injected UserData when using alias in AMISelectorTerms or non-Custom AMIFamily. When using amiFamily: Custom, users will need to add this taint into their UserData, where Karpenter will automatically remove it when provisioning nodes.
    • Discovered standard AL2023 AMIs will no longer be considered compatible with GPU / accelerator workloads. If you’re using an AL2023 EC2NodeClass (without AMISelectorTerms) for these workloads, you will need to select your AMI via AMISelectorTerms (non-alias).
    • Karpenter now waits for underlying instances to be completely terminated before removing the associated nodes. This means it may take longer for nodes to be deleted and for nodeclaims to get cleaned up.
    • NodePools now have status conditions that indicate if they are ready. If not, then they will not be considered during scheduling.
    • NodeClasses now have status conditions that indicate if they are ready. If they are not ready, NodePools that reference them through their nodeClassRef will not be considered during scheduling.
    • Karpenter will no longer set associatePublicIPAddress to false in private subnets by default. Users with IAM policies / SCPs that require this field to be set explicitly should configure this through their EC2NodeClass (ref).
  • API Moves:
    • ExpireAfter has moved from the NodePool.Spec.Disruption block to NodePool.Spec.Template.Spec, and is now a drift-able field.
    • Kubelet was moved to the EC2NodeClass from the NodePool.
  • RBAC changes: added delete pods | added get, patch crds | added update nodes | removed create nodes
  • Breaking API (Manual Migration Needed):
    • Ubuntu is dropped as a first class supported AMI Family
    • karpenter.sh/do-not-consolidate (annotation), karpenter.sh/do-not-evict (annotation), and karpenter.sh/managed-by (tag) are all removed. karpenter.sh/managed-by, which currently stores the cluster name in its value, will be replaced by eks:eks-cluster-name. karpenter.sh/do-not-consolidate and karpenter.sh/do-not-evict are both replaced by karpenter.sh/do-not-disrupt.
    • The taint used to mark nodes for disruption and termination changed from karpenter.sh/disruption=disrupting:NoSchedule to karpenter.sh/disrupted:NoSchedule. It is not recommended to tolerate this taint, however, if you were tolerating it in your applications, you’ll need to adjust your taints to reflect this.
  • Environment Variable Changes:
    • LOGGING_CONFIG, ASSUME_ROLE_ARN, and ASSUME_ROLE_DURATION were dropped
    • LEADER_ELECT was renamed to DISABLE_LEADER_ELECTION
    • The Drift feature gate (FEATURE_GATES.DRIFT) was dropped; drift is now stable and cannot be disabled
      • Users who were opting out of drift by disabling the feature gate will no longer be able to do so
  • Defaults changed:
    • API: Karpenter will drop support for IMDS access from containers by default on new EC2NodeClasses by updating the default of httpPutResponseHopLimit from 2 to 1.
    • API: ConsolidateAfter is now required. Previously it could not be set alongside ConsolidationPolicy: WhenUnderutilized; it must now always be set, and setting it to 0 gives the same behavior as in v1beta1.
    • API: All NodeClassRef fields are now required, and apiVersion has been renamed to group
    • API: AMISelectorTerms are required. Setting an Alias cannot be done with any other type of term, and must match the AMI Family that’s set or be Custom.
    • Helm: The Deployment’s TopologySpreadConstraint now requires zonal spread rather than preferring it. Users who were running their Karpenter deployment on a single node need to either:
      • Have two nodes in different zones to ensure both Karpenter replicas schedule
      • Scale down their Karpenter replicas from 2 to 1 in the helm chart
      • Edit and relax the topology spread constraint in their helm chart from DoNotSchedule to ScheduleAnyway
    • Helm/Binary: controller.METRICS_PORT default changed back to 8080
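
As a quick reference for several of the v1 changes called out above (the new alias selector, terminationGracePeriod, reason-scoped budgets, and the expireAfter move), here is a minimal, partial sketch; the names are illustrative and required EC2NodeClass fields such as role and subnet/security group selectors are omitted:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: example # hypothetical name
spec:
  template:
    spec:
      expireAfter: 720h            # moved from spec.disruption
      terminationGracePeriod: 24h  # new in v1
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: example
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 0s
    budgets:
      - nodes: "20%"
        reasons: ["Drifted", "Underutilized"]  # budgets can now be scoped by reason
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: example # hypothetical name
spec:
  amiSelectorTerms:
    - alias: al2023@latest  # new alias field; cannot be combined with other term types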

Updated metrics

The following changes have been made to Karpenter’s metrics in v1.0.0.

Renamed Metrics

Type | Original Name | New Name
Node | karpenter_nodes_termination_time_seconds | karpenter_nodes_termination_duration_seconds
Node | karpenter_nodes_terminated | karpenter_nodes_terminated_total
Node | karpenter_nodes_leases_deleted | karpenter_nodes_leases_deleted_total
Node | karpenter_nodes_created | karpenter_nodes_created_total
Pod | karpenter_pods_startup_time_seconds | karpenter_pods_startup_duration_seconds
Disruption | karpenter_disruption_replacement_nodeclaim_failures_total | karpenter_voluntary_disruption_queue_failures_total
Disruption | karpenter_disruption_evaluation_duration_seconds | karpenter_voluntary_disruption_decision_evaluation_duration_seconds
Disruption | karpenter_disruption_eligible_nodes | karpenter_voluntary_disruption_eligible_nodes
Disruption | karpenter_disruption_consolidation_timeouts_total | karpenter_voluntary_disruption_consolidation_timeouts_total
Disruption | karpenter_disruption_budgets_allowed_disruptions | karpenter_nodepools_allowed_disruptions
Disruption | karpenter_disruption_actions_performed_total | karpenter_voluntary_disruption_decisions_total
Provisioner | karpenter_provisioner_scheduling_simulation_duration_seconds | karpenter_scheduler_scheduling_duration_seconds
Provisioner | karpenter_provisioner_scheduling_queue_depth | karpenter_scheduler_queue_depth
Interruption | karpenter_interruption_received_messages | karpenter_interruption_received_messages_total
Interruption | karpenter_interruption_deleted_messages | karpenter_interruption_deleted_messages_total
Interruption | karpenter_interruption_message_latency_time_seconds | karpenter_interruption_message_queue_duration_seconds
NodePool | karpenter_nodepool_usage | karpenter_nodepools_usage
NodePool | karpenter_nodepool_limit | karpenter_nodepools_limit
NodeClaim | karpenter_nodeclaims_terminated | karpenter_nodeclaims_terminated_total
NodeClaim | karpenter_nodeclaims_disrupted | karpenter_nodeclaims_disrupted_total
NodeClaim | karpenter_nodeclaims_created | karpenter_nodeclaims_created_total

Dropped Metrics

Type | Name
Disruption | karpenter_disruption_replacement_nodeclaim_initialized_seconds
Disruption | karpenter_disruption_queue_depth
Disruption | karpenter_disruption_pods_disrupted_total
 | karpenter_consistency_errors
NodeClaim | karpenter_nodeclaims_registered
NodeClaim | karpenter_nodeclaims_launched
NodeClaim | karpenter_nodeclaims_initialized
NodeClaim | karpenter_nodeclaims_drifted
Provisioner | karpenter_provisioner_scheduling_duration_seconds
Interruption | karpenter_interruption_actions_performed