You are definitely looking for ways to autoscale the nodes of your Kubernetes cluster up and down, and figuring out which autoscaler is the best can be hard since there are many options. Which one should you go for?
Well, I would advise going for Karpenter instead of the native Cluster Autoscaler. Both projects are sponsored by the AWS team, but Karpenter is fast when it comes to scaling nodes up and down.
Prerequisites
- Running EKS Cluster Provisioned with Terraform
- Create a Terraform file named karpenter.tf
Let's jump into it.
Provision the Metrics Server on your EKS cluster
Karpenter needs the metrics server because it provides accurate metrics about the pods running on your nodes. That also means you need to set resource requests and limits on your deployments for better performance.
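As a quick illustration, here is a minimal sketch of a deployment with requests and limits set; the app name, image, and values are placeholders, not part of this setup:

# Hypothetical sample deployment with resource requests and limits set,
# so the scheduler (and Karpenter) can make accurate bin-packing decisions.
resource "kubectl_manifest" "sample_app" {
  yaml_body = <<-YAML
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-app
      namespace: default
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: sample-app
      template:
        metadata:
          labels:
            app: sample-app
        spec:
          containers:
            - name: web
              image: nginx:1.25
              resources:
                requests:
                  cpu: "250m"
                  memory: "256Mi"
                limits:
                  cpu: "500m"
                  memory: "512Mi"
    YAML
}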
Now let's deploy the metrics server with the helm_release resource:
resource "helm_release" "metric-server" {
name = "metric-server"
repository = "https://kubernetes-sigs.github.io/metrics-server/"
chart = "metrics-server"
version = "3.10.0"
namespace = "kube-system"
cleanup_on_fail = true
timeout = "1200"
set {
name = "apiService.create"
value = "true"
}
}
Provision the Karpenter Policy, IRSA, Instance Profile and Karpenter Helm Release
At the end of the day, Karpenter needs permission to create and spin up EC2 instances (nodes) in your EKS cluster on your behalf; this is where the IAM Role for Service Accounts (IRSA) for EKS comes in.
Step 1: Create the Kubernetes namespace first
resource "kubernetes_namespace" "karpenter" {
metadata {
name = "karpenter"
}
}
Step 2: Create the Karpenter controller policy
resource "aws_iam_policy" "karpenter_controller" {
name = "KarpenterController"
path = "/"
description = "Karpenter controller policy for autoscaling"
policy = <<EOF
{
"Statement": [
{
"Action": [
"ec2:CreateLaunchTemplate",
"ec2:CreateFleet",
"ec2:RunInstances",
"ec2:CreateTags",
"ec2:TerminateInstances",
"ec2:DeleteLaunchTemplate",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeInstances",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeImages",
"ec2:DescribeInstanceTypes",
"ec2:DescribeInstanceTypeOfferings",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeSpotPriceHistory",
"iam:PassRole",
"ssm:GetParameter",
"pricing:GetProducts"
],
"Effect": "Allow",
"Resource": "*",
"Sid": "Karpenter"
},
{
"Action": "ec2:TerminateInstances",
"Condition": {
"StringLike": {
"ec2:ResourceTag/Name": "*karpenter*"
}
},
"Effect": "Allow",
"Resource": "*",
"Sid": "ConditionalEC2Termination"
},
{
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": "arn:aws:iam::777XXXX:role/KarpenterNodeRole-${module.eks.cluster_name}",
"Sid": "PassNodeIAMRole"
},
{
"Effect": "Allow",
"Action": "eks:DescribeCluster",
"Resource": "arn:aws:eks:US-EAST-2:777XXXX:cluster/${module.eks.cluster_name}",
"Sid": "EKSClusterEndpointLookup"
}
],
"Version": "2012-10-17"
}
EOF
}
Replace 777XXXX with your AWS account ID and us-east-2 with whichever region you are using. The ${module.eks.cluster_name} value just pulls your cluster name from the EKS module; you can replace it with your cluster name directly if you didn't use the EKS module to provision your cluster.
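If you would rather not hardcode the account ID and region at all, one option (a sketch, assuming the AWS provider is already configured) is to look them up with data sources and interpolate them into the policy ARNs:

# Optional: look up the current account ID and region instead of hardcoding them.
data "aws_caller_identity" "current" {}

data "aws_region" "current" {}

# The cluster ARN in the policy above could then be built like this:
# "arn:aws:eks:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:cluster/${module.eks.cluster_name}"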
Step 3: Create the Karpenter EC2 instance profile
Now let's create the EC2 instance profile, which we will hand to Karpenter in step 5 as the default instance profile, and attach the existing EKS node IAM role to it using module.eks.eks_managed_node_groups.regular.iam_role_name.
resource "aws_iam_instance_profile" "karpenter" {
name = "KarpenterNodeInstanceProfile"
role = module.eks.eks_managed_node_groups.regular.iam_role_name
}
Step 4: Create the Karpenter IAM Role for Service Accounts (IRSA) for EKS
Here you can just use the IRSA module instead of wiring everything up by hand; it lets you move faster with fewer lines of code.
module "karpenter_irsa_role" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "5.20.0"
role_name = "karpenter_controller"
## i am attaching the policy i created in step 2 here instead of using the attach_karpenter_controller_policy = true argument
role_policy_arns = {
policy = aws_iam_policy.karpenter_controller.arn
}
karpenter_controller_cluster_id = module.eks.cluster_id
karpenter_controller_node_iam_role_arns = [module.eks.eks_managed_node_groups["regular"].iam_role_arn]
oidc_providers = {
main = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["karpenter:karpenter"]
}
}
}
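For reference, if you prefer to let the module create and attach the controller policy itself instead of the custom policy from step 2, a sketch of that variant of the same module block could look like this:

# Alternative sketch: replace the module block above with this one and let the
# IRSA module create and attach the Karpenter controller policy itself.
module "karpenter_irsa_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "5.20.0"

  role_name                               = "karpenter_controller"
  attach_karpenter_controller_policy      = true
  karpenter_controller_cluster_id         = module.eks.cluster_id
  karpenter_controller_node_iam_role_arns = [module.eks.eks_managed_node_groups["regular"].iam_role_arn]

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["karpenter:karpenter"]
    }
  }
}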
Step 5: Deploy Karpenter on EKS using a Helm release
Now we can deploy Karpenter using the helm_release resource:
resource "helm_release" "karpenter" {
name = "karpenter"
chart = "karpenter"
repository = "oci://public.ecr.aws/karpenter"
version = "v0.27.5"
namespace = kubernetes_namespace.karpenter.id #refrenced the namespaced we created in step 1
cleanup_on_fail = true
set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = module.karpenter_irsa_role.iam_role_arn #here we refrenced the IRSA ARN created in setp 4
}
set {
name = "replicas"
value = "1"
}
set {
name = "settings.aws.clusterName"
value = module.eks.cluster_name
}
set {
name = "settings.aws.clusterEndpoint"
value = module.eks.cluster_endpoint
}
set {
name = "settings.aws.defaultInstanceProfile"
value = aws_iam_instance_profile.karpenter.name #here we refrenced the Intanceprofile we created in step 4
}
}
Configure Karpenter Node Autoscaling using a Provisioner and NodeTemplate
Now you are almost done setting up Karpenter on EKS; you just have to configure and deploy a Provisioner and an AWSNodeTemplate.
What is a Provisioner? The Provisioner is the configuration you use to declare the type of nodes you want Karpenter to create, the type of pods that can run on those nodes, how long empty nodes should live before being terminated, and more; you can read more about Provisioners in the Karpenter docs.
With the provisioner below, we are simply telling Karpenter to spin up on-demand nodes from either the m5 or t3 EC2 instance families, but not in the nano, micro, small, or large sizes. So we won't see a t3.large or an m5.large, but we can see a t3.medium, an m5.xlarge, and so on.
resource "kubectl_manifest" "karpenter_provisioner" {
yaml_body = <<-YAML
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
name: default
spec:
ttlSecondsAfterEmpty: 30
limits:
resources:
cpu: 100
requirements:
- key: karpenter.sh/capacity-type
operator: In
values: ["on-demand"]
- key: karpenter.k8s.aws/instance-family
operator: In
values: [m5, t3]
- key: karpenter.k8s.aws/instance-size
operator: NotIn
values: [nano, micro, small, large]
providerRef:
name: my-provider
YAML
}
And here is the NodeTemplate. The AWSNodeTemplate tells the Provisioner which subnets and security groups to attach to the nodes it creates, whether private or public. If a private subnet is matched by the subnetSelector, the nodes will be created in that private subnet; likewise for the securityGroupSelector. Head over to the docs on Node Templates to learn more.
resource "kubectl_manifest" "karpenter_node_template" {
yaml_body = <<-YAML
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
name: my-provider
spec:
subnetSelector:
kubernetes.io/cluster/${var.eks-name}: owned
securityGroupSelector:
kubernetes.io/cluster/${var.eks-name}: owned
YAML
}
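As a side note, another common pattern from the Karpenter docs is to select subnets and security groups by a dedicated discovery tag. The sketch below assumes you have tagged them with karpenter.sh/discovery = your cluster name, which is not part of the setup above:

# Alternative sketch: select subnets and security groups by a karpenter.sh/discovery
# tag instead of the kubernetes.io/cluster ownership tag. Assumes those resources
# carry the tag karpenter.sh/discovery = <your cluster name>.
resource "kubectl_manifest" "karpenter_node_template_discovery" {
  yaml_body = <<-YAML
    apiVersion: karpenter.k8s.aws/v1alpha1
    kind: AWSNodeTemplate
    metadata:
      name: my-provider
    spec:
      subnetSelector:
        karpenter.sh/discovery: ${var.eks-name}
      securityGroupSelector:
        karpenter.sh/discovery: ${var.eks-name}
    YAML
}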
Now that we are done with the code, you can run terraform plan and terraform apply, then create a small deployment and scale it up and down to test your freshly installed Karpenter: scale it beyond what your existing nodes can hold and watch Karpenter launch a node, then scale it back down and watch the empty node get terminated after the ttlSecondsAfterEmpty period.
The result on my side is that it helped us use each node to its maximum before provisioning another, whereas before Karpenter we had a lot of wasted resource capacity.

t3.medium node created by Karpenter, utilized close to capacity

t3.large node created manually, with a lot of resources left over after the deployment settled

Second t3.large node created manually, with a lot of resources left over after the deployment settled
I hope you've learned something useful from this blog to take home for your cluster autoscaling and better deployment management using Karpenter.
Till next time.