Karpenter Migrating from Cluster Autoscaler

이언철 · September 7, 2023



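We are migrating to Karpenter to achieve the following objectives.

O1. Given the problems and issues encountered while operating a WarmPool, switch from Cluster Autoscaler (CAS) to Karpenter to minimize operational overhead.

O2. Improve the node autoscaling environment to shorten node creation time, so that HPA can be expected to respond faster and more smoothly when traffic increases.

O3. Create nodes that fit the spec of the resources, and respond quickly and automatically to changes in resource requirements, securing flexible workload operation and thereby improving application availability.

O4. Flexibly adjust the nodes available for each purpose, and reduce cluster compute cost by continuously replacing nodes with cheaper, more effective alternatives so that workloads are consolidated onto more efficient compute resources.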

Migrating from Cluster Autoscaler

Scenario

AWS Karpenter Policy & Role check
- KarpenterControllerRole-my-cluster-xxxx
- KarpenterNodeRole-my-cluster-xxxx
  (created for develop, staging, and production)
AWS Cluster SG & Subnet tagging check
- EKS connected SG
- EKS connected Subnets
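
The tagging check above can be sketched as a small helper that reports which security groups and subnets still lack a discovery tag. This is a minimal sketch under stated assumptions: the checklist does not name the tag, so the `karpenter.sh/discovery` key with the cluster name as its value (the convention used in the Karpenter guides) is assumed, and the function name `untagged_resources` and the resource IDs are hypothetical.

```python
def untagged_resources(resources, cluster_name, tag_key="karpenter.sh/discovery"):
    """Return the sorted IDs of resources missing the discovery tag.

    `resources` maps a resource ID (subnet or security group) to its tags.
    The tag key and the "value equals cluster name" rule are assumptions
    based on the Karpenter guides, not something the checklist states.
    """
    return sorted(
        resource_id
        for resource_id, tags in resources.items()
        if tags.get(tag_key) != cluster_name
    )


# Hypothetical inventory; real IDs and tags would come from the AWS console or API.
inventory = {
    "subnet-aaa": {"karpenter.sh/discovery": "my-cluster"},
    "subnet-bbb": {},
    "sg-ccc": {"karpenter.sh/discovery": "other-cluster"},
}
print(untagged_resources(inventory, "my-cluster"))
```

Anything the helper returns would need tagging before a Karpenter node template selecting on that tag could discover it.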

Added mapRoles - aws-auth.yaml

- groups:
  - system:bootstrappers
  - system:nodes
  rolearn: arn:aws:iam::ACCOUNT_ID:role/KarpenterNodeRole-my-cluster-xxxx
  username: system:node:{{EC2PrivateDNSName}}

Provisioner NodeGroup values per EKS NodeGroup Check
EKS NodeGroup EBS check
- Provisioner providerRef Select
AWSNodeTemplates SG check (+ subnets)
(optional) system-critical NodeGroup Check
- aws-ebs-csi-controller advanced configuration
- core-dns advanced configuration
karpenter NodeGroup Check
Create Karpenter
- kubectl apply -k {cluster-env}
Patch CAS (Cluster Autoscaler) replicas = 0
Scale down NodeGroups desired size
- 10%: check Deployments with PDBs → temporarily adjust replicas
  - Pods with only 1 replica: temporary update (+1 replica) via scripts, plus PDB check
- 20~100%
  - Pods with only 1 replica: temporary update (+0 replica) via scripts
Karpenter Node Scale-In Monitoring
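
The staged reduction of NodeGroup desired size in the checklist above can be sketched as a planning helper. This is a minimal sketch, not the scripts used in the migration: the function names, the intermediate stage percentages (the checklist only gives a 10% step followed by 20~100%), rounding the removed node count up, and the input shape for Deployments are all assumptions.

```python
def staged_desired_sizes(current_size, stages=(10, 20, 40, 60, 80, 100)):
    """Return (percent_removed, desired_size) pairs for a staged scale-down.

    `stages` lists the cumulative percentage of the original node group to
    remove at each step. Values between 20 and 100 are assumed here.
    """
    if current_size < 0:
        raise ValueError("current_size must be non-negative")
    plan = []
    for percent in stages:
        # Integer ceiling: the first stage removes at least one node
        # from any non-empty node group.
        removed = min(current_size, (current_size * percent + 99) // 100)
        plan.append((percent, current_size - removed))
    return plan


def temporary_replica_targets(deployments, extra=1):
    """Map each single-replica Deployment to its temporary replica count.

    `deployments` maps a Deployment name to its current replica count
    (a hypothetical input shape). Only Deployments running exactly one
    replica are returned, since those are the ones the checklist adjusts.
    """
    return {
        name: replicas + extra
        for name, replicas in deployments.items()
        if replicas == 1
    }


# Hypothetical node group of 10 nodes and two example Deployments.
for percent, desired in staged_desired_sizes(10):
    print(f"remove {percent}% -> desired size {desired}")
print(temporary_replica_targets({"api": 1, "web": 3}))
```

Each desired size would then be applied to the node group one stage at a time, watching PDBs and pod rescheduling onto Karpenter nodes before moving to the next stage.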

Succeeded!

이언철

DevOps Engineer @Soomgo | Grafana Champion
