[AWS EKS] EKS Autoscaling 2 - HPA(Horizontal Pod Autoscaler)

주영·2025년 3월 8일

AWS EKS Workshop Study 3기

목록 보기

13/31

1. HPA 개요

Horizontal Pod Autoscaler (HPA)는 Kubernetes 클러스터 내에서 CPU 및 메모리 사용률을 기반으로 Pod의 개수를 자동으로 조절하는 기능입니다. 사용량이 증가하면 Pod 개수를 늘리고, 사용량이 감소하면 Pod 개수를 줄여 리소스를 효율적으로 관리할 수 있습니다.

HPA는 Metrics API를 통해 CPU 및 메모리 사용률을 측정합니다.
기본적으로 CPU 사용량을 기준으로 동작하며, 메모리 및 커스텀 메트릭 기반의 오토스케일링도 가능합니다.
Pod 개수 조정 시 기본 대기 시간이 적용됩니다.
- 증가 시: 30초 기본 대기
- 감소 시: 5분 기본 대기 (조정 가능)

2. 실습 환경 구성

HPA의 동작을 실습하기 위해 Grafana, Prometheus, kube-ops-view 등의 모니터링 도구를 활용하여 리소스 변화를 확인합니다.

2.1 Grafana 대시보드 설정

HPA 관련 메트릭을 시각적으로 모니터링하기 위해 Grafana 대시보드를 Import합니다.

Grafana HPA 대시보드 22128
Grafana HPA 대시보드 22251

2.2 샘플 애플리케이션 배포

HPA 실습을 위해 CPU 과부하를 발생시키는 PHP 기반 샘플 애플리케이션을 배포합니다.

# Run and expose php-apache server
cat << EOF > php-apache.yaml
apiVersion: apps/v1
kind: Deployment
metadata: 
  name: php-apache
spec: 
  selector: 
    matchLabels: 
      run: php-apache
  template: 
    metadata: 
      labels: 
        run: php-apache
    spec: 
      containers: 
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports: 
        - containerPort: 80
        resources: 
          limits: 
            cpu: 500m
          requests: 
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata: 
  name: php-apache
  labels: 
    run: php-apache
spec: 
  ports: 
  - port: 80
  selector: 
    run: php-apache
EOF

# 배포
kubectl apply -f php-apache.yaml

# 애플리케이션 정상 배포 확인
kubectl exec -it deploy/php-apache -- cat /var/www/html/index.php
...

# 모니터링 : 터미널2개 사용
watch -d 'kubectl get hpa,pod;echo;kubectl top pod;echo;kubectl top node'
kubectl exec -it deploy/php-apache -- top

# [운영서버 EC2] 파드IP로 직접 접속
PODIP=$(kubectl get pod -l run=php-apache -o jsonpath="{.items[0].status.podIP}")
curl -s $PODIP; echo

2.3 HPA 생성

Pod의 CPU 사용량이 50% 이상일 경우 자동으로 스케일링되도록 HPA를 설정합니다.

# Create the HorizontalPodAutoscaler : requests.cpu=200m - 알고리즘
# Since each pod requests 200 milli-cores by kubectl run, this means an average CPU usage of 100 milli-cores.
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        averageUtilization: 50
        type: Utilization
EOF
혹은 kubectl 명령어로 생성 가능
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10


# HPA 상태 확인
kubectl describe hpa

# 출력 예시
...
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (1m) / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       1 current / 1 desired
...

# HPA 설정 확인
kubectl get hpa php-apache -o yaml | kubectl neat
spec: 
  minReplicas: 1               # [4] 또는 최소 1개까지 줄어들 수도 있습니다
  maxReplicas: 10              # [3] 포드를 최대 10개까지 늘립니다
  scaleTargetRef: 
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache           # [1] php-apache 의 자원 사용량에서
  metrics: 
  - type: Resource
    resource: 
      name: cpu
      target: 
        type: Utilization
        averageUtilization: 50  # [2] CPU 활용률이 50% 이상인 경우

3. HPA 오토스케일링 테스트

3.1 부하 발생

# Pod IP를 가져온 후, 직접 호출
# 반복 접속 1 (파드1 IP로 접속) >> 증가 확인 후 중지
PODIP=$(kubectl get pod -l run=php-apache -o jsonpath="{.items[0].status.podIP}")
while true;do curl -s $PODIP; sleep 0.5; done

# 서비스 도메인 기반 요청(로드 밸런싱)
# 반복 접속 2 (서비스명 도메인으로 파드들 분산 접속) >> 증가 확인(몇개까지 증가되는가? 그 이유는?) 후 중지
## >> [scale back down] 중지 5분 후 파드 갯수 감소 확인
# Run this in a separate terminal
# so that the load generation continues and you can carry on with the rest of the steps
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

3.2 Pod 개수 변동 확인

# Horizontal Pod Autoscaler Status Conditions
kubectl describe hpa
...
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ----   ----                       -------
  Normal  SuccessfulRescale  13m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  11m    horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  11m    horizontal-pod-autoscaler  New size: 6; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  10m    horizontal-pod-autoscaler  New size: 8; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  5m35s  horizontal-pod-autoscaler  New size: 7; reason: All metrics below target
  Normal  SuccessfulRescale  4m35s  horizontal-pod-autoscaler  New size: 5; reason: All metrics below target
  Normal  SuccessfulRescale  4m5s   horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal  SuccessfulRescale  3m50s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target

4. Prometheus 기반 HPA 메트릭 수집

Prometheus에서 HPA 관련 메트릭을 수집하여 모니터링할 수 있습니다.

# Prometheus HPA 관련 메트릭
kube_horizontalpodautoscaler_status_current_replicas
kube_horizontalpodautoscaler_status_desired_replicas
kube_horizontalpodautoscaler_status_target_metric
kube_horizontalpodautoscaler_status_condition

kube_horizontalpodautoscaler_spec_target_metric
kube_horizontalpodautoscaler_spec_min_replicas
kube_horizontalpodautoscaler_spec_max_replicas

# kube-state-metrics 활용
# [운영서버 EC2]
kubectl get pod -n monitoring -l app.kubernetes.io/name=kube-state-metrics -owide
kubectl get pod -n monitoring -l app.kubernetes.io/name=kube-state-metrics -o jsonpath="{.items[*].status.podIP}"
PODIP=$(kubectl get pod -n monitoring -l app.kubernetes.io/name=kube-state-metrics -o jsonpath="{.items[*].status.podIP}")
curl -s http://$PODIP:8080/metrics | grep -i horizontalpodautoscaler | grep HELP
# HELP kube_horizontalpodautoscaler_info Information about this autoscaler.
# HELP kube_horizontalpodautoscaler_metadata_generation [STABLE] The generation observed by the HorizontalPodAutoscaler controller.
# HELP kube_horizontalpodautoscaler_spec_max_replicas [STABLE] Upper limit for the number of pods that can be set by the autoscaler; cannot be smaller than MinReplicas.
# HELP kube_horizontalpodautoscaler_spec_min_replicas [STABLE] Lower limit for the number of pods that can be set by the autoscaler, default 1.
# HELP kube_horizontalpodautoscaler_spec_target_metric The metric specifications used by this autoscaler when calculating the desired replica count.
# HELP kube_horizontalpodautoscaler_status_target_metric The current metric status used by this autoscaler when calculating the desired replica count.
# HELP kube_horizontalpodautoscaler_status_current_replicas [STABLE] Current number of replicas of pods managed by this autoscaler.
# HELP kube_horizontalpodautoscaler_status_desired_replicas [STABLE] Desired number of replicas of pods managed by this autoscaler.
# HELP kube_horizontalpodautoscaler_annotations Kubernetes annotations converted to Prometheus labels.
# HELP kube_horizontalpodautoscaler_labels [STABLE] Kubernetes labels converted to Prometheus labels.
# HELP kube_horizontalpodautoscaler_status_condition [STABLE] The condition of this autoscaler.

curl -s http://$PODIP:8080/metrics | grep -i horizontalpodautoscaler
...