Vertical Pod Autoscaling

cloud2000 · October 13, 2024

Vertical Pod Autoscaling (VPA) automatically sets the resource requests and limits of containers. The goal is to minimize the risk of performance degradation caused by CPU or memory shortages while also reducing wasted resources.

Architecture

How Kubernetes VPA allocates resources

VPA admission hook
Every pod submitted to the cluster passes through this webhook automatically. The webhook checks whether a VerticalPodAutoscaler object references the pod or one of its parents (a ReplicaSet, a Deployment, etc.).
VPA recommender
Connects to the metrics-server application in the cluster, fetches historical and current usage data (CPU and memory) for each VPA-enabled pod and generates recommendations for scaling up or down the requests and limits of these pods.
VPA updater
Runs every 1 minute. If a pod is running outside the calculated recommendation range, the updater evicts it. The replacement pod then passes through the VPA admission webhook, which rewrites its CPU and memory settings before it starts.
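The admission hook's lookup ("does a VPA reference this pod or one of its parents?") can be pictured with a toy model. This is only a sketch of the description above, not the webhook's actual code: the `Ref` type, the `OWNERS` map, the `VPAS` map, and all object names are hypothetical stand-ins for data that really lives in the Kubernetes API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ref:
    """A (kind, name) pair identifying an object in a single namespace."""
    kind: str
    name: str


# Hypothetical ownership map: object -> its direct owner (None at the top).
OWNERS: dict[Ref, Optional[Ref]] = {
    Ref("Pod", "web-7c9d-abcde"): Ref("ReplicaSet", "web-7c9d"),
    Ref("ReplicaSet", "web-7c9d"): Ref("Deployment", "web"),
    Ref("Deployment", "web"): None,
}

# Hypothetical VPA objects, reduced to: VPA name -> targetRef.
VPAS: dict[str, Ref] = {"web-vpa": Ref("Deployment", "web")}


def find_vpa(pod: Ref) -> Optional[str]:
    """Walk from the pod up through its owners; return the first VPA
    whose targetRef matches, or None if nothing references the pod."""
    current: Optional[Ref] = pod
    while current is not None:
        for vpa_name, target in VPAS.items():
            if target == current:
                return vpa_name
        current = OWNERS.get(current)
    return None


print(find_vpa(Ref("Pod", "web-7c9d-abcde")))  # web-vpa
print(find_vpa(Ref("Pod", "standalone")))      # None
```

A pod with no matching VPA is simply admitted unchanged in this model.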
VPA example

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: prometheus-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: StatefulSet
    name: prometheus
  updatePolicy:
    updateMode: Auto
    # evictionRequirements is a field of updatePolicy, not of spec
    evictionRequirements:
      - resources: ["cpu", "memory"]
        changeRequirement: TargetHigherThanRequests
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 0m
          memory: 0Mi
        maxAllowed:
          cpu: 1
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits

updatePolicy.updateMode
- Off: VPA will not automatically change resource requirements. Autoscaler computes the recommendations and stores them in the VPA object’s status field.
- Initial: VPA only assigns resource requests on pod creation and never changes them later.
- Recreate: VPA assigns resource requests on pod creation and updates them on existing pods by evicting them when the requested resources differ significantly from the new recommendation.
- Auto: currently does the same as Recreate. In the future, it may take advantage of restart-free updates once they are available.
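Over time, the VPA recommender stores the recommended resource amounts in the Status of the VPA object. VPA uses the lower and upper bounds to decide when to evict pods: eviction can occur when the current resource request is below the lower bound or above the upper bound, and the request differs from the target estimate by 10%.
If the initial request and limit are set to the same minimal value and a sudden load spike gets the container OOM-killed, the VPA recommender may not have enough data to learn the resource usage. In that case the pod is sometimes not restarted with changed resources as expected.
For a CronJob, if no resources are configured at first and the VPA object is set to Initial mode, appropriate resources are applied automatically after 2 to 3 learning runs, with no eviction or interruption, which makes this mode worth using.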
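As a minimal sketch of the CronJob case above, the snippet below builds a VPA manifest in Initial mode as a plain dictionary and prints it as JSON. The helper name `initial_mode_vpa` and the CronJob name `nightly-report` are hypothetical; the field names follow the VPA example earlier in this post.

```python
import json


def initial_mode_vpa(cronjob_name: str) -> dict:
    """Build a VPA manifest that only sets requests at pod creation time."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{cronjob_name}-vpa"},
        "spec": {
            "targetRef": {
                "apiVersion": "batch/v1",
                "kind": "CronJob",
                "name": cronjob_name,  # hypothetical CronJob name
            },
            # Initial: requests are assigned on creation and never changed later,
            # so running job pods are not evicted.
            "updatePolicy": {"updateMode": "Initial"},
        },
    }


manifest = initial_mode_vpa("nightly-report")
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output can be saved to a file and applied with `kubectl apply -f`.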

Status:
  Conditions:
    Last Transition Time:  2020-12-23T08:03:07Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  prometheus
      Lower Bound:
        Cpu:     25m
        Memory:  380220488
      Target:
        Cpu:     410m
        Memory:  380258472
      Uncapped Target:
        Cpu:     410m
        Memory:  380258472
      Upper Bound:
        Cpu:     704m
        Memory:  464927423
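The eviction rule described earlier can be sketched against the CPU values in this status (25m, 410m, 704m). This is a simplified reading of that rule, not the updater's real logic. The function name `may_evict` is hypothetical, and measuring the 10% change relative to the current request is an assumption.

```python
def may_evict(request: float, lower: float, target: float, upper: float,
              min_change: float = 0.10) -> bool:
    """Return True when the request is outside [lower, upper] and differs
    from the target by at least min_change (relative to the request)."""
    outside_range = request < lower or request > upper
    if request == 0:
        # No request set: any positive target counts as a change (assumption).
        changed_enough = target > 0
    else:
        changed_enough = abs(target - request) / request >= min_change
    return outside_range and changed_enough


# CPU values in millicores, taken from the status output above.
LOWER, TARGET, UPPER = 25, 410, 704

print(may_evict(100, LOWER, TARGET, UPPER))   # False: inside the range
print(may_evict(1000, LOWER, TARGET, UPPER))  # True: above the upper bound
print(may_evict(20, LOWER, TARGET, UPPER))    # True: below the lower bound
```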

memory-aggregation-interval: The length of a single interval, for which the peak memory usage is computed. Memory usage peaks are aggregated in multiples of this interval. In other words there is one memory usage sample per interval (the maximum usage over that interval)
memory-aggregation-interval-count: The number of consecutive memory-aggregation-intervals which make up the MemoryAggregationWindowLength which in turn is the period for memory usage aggregation by VPA.
That is, MemoryAggregationWindowLength = memory-aggregation-interval * memory-aggregation-interval-count
memory-histogram-decay-half-life: The amount of time it takes a historical memory usage sample to lose half of its weight. In other words, a fresh usage sample is twice as 'important' as one with age equal to the half life period.
oom-bump-up-ratio: The ratio by which memory is bumped up when an OOM occurs. The default is 1.2.
oom-min-bump-up-bytes: The minimum memory increase, in bytes, when an OOM occurs. The default is 100 * 1024 * 1024 (100 MiB).
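The arithmetic behind these settings can be sketched as follows. The function names are hypothetical. The 24h interval, the count of 8, and the 24h half-life are assumed example values, while 1.2 and 100 MiB are the defaults quoted above. Taking the larger of the two OOM increases is an assumption about how the ratio and the minimum combine.

```python
from datetime import timedelta

MIB = 1024 * 1024


def aggregation_window(interval: timedelta, count: int) -> timedelta:
    """MemoryAggregationWindowLength = interval * interval-count."""
    return interval * count


def sample_weight(age: timedelta, half_life: timedelta) -> float:
    """Relative weight of a sample: halves every half_life of age."""
    return 0.5 ** (age / half_life)


def memory_after_oom(used_bytes: int, ratio: float = 1.2,
                     min_bump_bytes: int = 100 * MIB) -> float:
    """Apply whichever increase is larger: the ratio or the minimum bump."""
    return max(used_bytes * ratio, used_bytes + min_bump_bytes)


# Assumed example values: 24h interval, 8 intervals, 24h half-life.
print(aggregation_window(timedelta(hours=24), 8))                # 8 days, 0:00:00
print(sample_weight(timedelta(hours=24), timedelta(hours=24)))   # 0.5
print(memory_after_oom(200 * MIB) / MIB)                         # 300.0
```

For a small container the fixed 100 MiB bump dominates (200 MiB becomes 300 MiB). For a large one the 1.2 ratio dominates.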
References
- https://povilasv.me/vertical-pod-autoscaling-the-definitive-guide/
- https://www.kubecost.com/kubernetes-autoscaling/kubernetes-vpa/
- https://docs.aws.amazon.com/ko_kr/eks/latest/userguide/vertical-pod-autoscaler.html
- https://medium.com/@muppedaanvesh/a-hands-on-guide-to-kubernetes-horizontal-vertical-pod-autoscalers-%EF%B8%8F-58903382ef71
- Percentile
  Percentile (score) = 100 - ((my current rank / total count) * 100)
  e.g. ranking 20th out of 300: percentile (score) = 100 - ((20 / 300) * 100) ≈ 93.3
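The percentile formula above is easy to check in code. This is a small sketch of that exact formula, and the function name `percentile_score` is hypothetical.

```python
def percentile_score(rank: int, total: int) -> float:
    """Percentile (score) = 100 - (rank / total) * 100, with rank 1 as the best."""
    if total <= 0 or not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100 - (rank / total) * 100


print(round(percentile_score(20, 300), 1))  # 93.3
```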
