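The Horizontal Pod Autoscaler (HPA) is a Kubernetes feature that automatically adjusts the number of Pods based on resource metrics such as CPU and memory utilization. It adds Pods when usage rises and removes them when usage falls, so cluster resources are used efficiently.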
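As a rough illustration of how a utilization reading becomes a replica count, the sketch below applies the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), together with the default 10% tolerance and clamping to the min/max bounds. The function name `desired_replicas` and its argument order are hypothetical. The real controller also handles unready Pods, missing metrics, and stabilization windows, which this sketch ignores.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximate the HPA replica calculation with integer math.
# Usage: desired_replicas <current_replicas> <current_util_%> <target_util_%> [min] [max]
desired_replicas() {
  local current=$1 util=$2 target=$3 min=${4:-1} max=${5:-10}
  local diff desired

  # Absolute difference between current and target utilization.
  diff=$(( util > target ? util - target : target - util ))

  if (( diff * 10 <= target )); then
    # Within the default 10% tolerance: keep the current replica count.
    desired=$current
  else
    # Integer ceiling of current * util / target.
    desired=$(( (current * util + target - 1) / target ))
  fi

  # Clamp to the configured bounds.
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

desired_replicas 1 250 50   # prints 5  (one Pod at 250% of request, target 50%)
desired_replicas 4 52 50    # prints 4  (inside the tolerance band, no change)
```

With a 200m request and a 50% target, the first call corresponds to a single Pod consuming about 500m of CPU.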
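To observe HPA behavior in this lab, we use monitoring tools such as Grafana, Prometheus, and kube-ops-view to watch how resource usage changes.
To monitor HPA-related metrics visually, import a Grafana dashboard.
For the HPA lab, deploy a PHP-based sample application that generates CPU load.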
# Run and expose php-apache server
cat << EOF > php-apache.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
EOF
# Deploy
kubectl apply -f php-apache.yaml
# Verify that the application was deployed correctly
kubectl exec -it deploy/php-apache -- cat /var/www/html/index.php
...
# Monitoring: use two terminals
watch -d 'kubectl get hpa,pod;echo;kubectl top pod;echo;kubectl top node'
kubectl exec -it deploy/php-apache -- top
# [Operations server EC2] Connect directly to the Pod IP
PODIP=$(kubectl get pod -l run=php-apache -o jsonpath="{.items[0].status.podIP}")
curl -s $PODIP; echo





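Configure the HPA so that the Deployment scales out automatically when average Pod CPU utilization exceeds 50% of the requested CPU.
# Create the HorizontalPodAutoscaler (requests.cpu=200m is the basis for the scaling algorithm)
# Since each Pod requests 200 millicores, a 50% target corresponds to an average CPU usage of 100 millicores.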
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        averageUtilization: 50
        type: Utilization
EOF
# Alternatively, create the same HPA with a kubectl command
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
# Check the HPA status
kubectl describe hpa
# Example output
...
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (1m) / 50%
Min replicas:                                          1
Max replicas:                                          10
Deployment pods:                                       1 current / 1 desired
...
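The `0% (1m) / 50%` line reports utilization as a percentage of the CPU request. Below is a minimal sketch of that arithmetic, assuming the controller divides the summed Pod usage by the summed requests and truncates to a whole percent. `avg_cpu_utilization` is a hypothetical helper, and every Pod is assumed to request the same amount of CPU.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average CPU utilization as a percentage of the request.
# Usage: avg_cpu_utilization <request_millicores> <usage_millicores>...
avg_cpu_utilization() {
  local request_m=$1
  shift
  local total=0 count=0 usage

  # Sum the per-Pod usage values (in millicores).
  for usage in "$@"; do
    total=$(( total + usage ))
    count=$(( count + 1 ))
  done

  if (( count == 0 || request_m == 0 )); then
    echo "need a non-zero request and at least one usage value" >&2
    return 1
  fi

  # Integer division truncates, so 1m against a 200m request shows as 0%.
  echo $(( total * 100 / (request_m * count) ))
}

avg_cpu_utilization 200 1         # prints 0   (matches "0% (1m)" above)
avg_cpu_utilization 200 100 140   # prints 60  (240m used of 400m requested)
```

The usage values can be read from `kubectl top pod`, which the monitoring terminal in this lab already displays.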
# Check the HPA configuration
kubectl get hpa php-apache -o yaml | kubectl neat
spec:
  minReplicas: 1               # [4] or scale in to as few as 1 Pod
  maxReplicas: 10              # [3] scale out to at most 10 Pods
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache           # [1] based on the resource usage of php-apache
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # [2] when average CPU utilization exceeds 50%

# Fetch the Pod IP, then call it directly
# Repeated access 1 (connect to the IP of the first Pod) >> stop after confirming the scale-out
PODIP=$(kubectl get pod -l run=php-apache -o jsonpath="{.items[0].status.podIP}")
while true;do curl -s $PODIP; sleep 0.5; done
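# Requests through the Service domain name (load balanced)
# Repeated access 2 (requests spread across Pods via the Service name) >> confirm the scale-out (how many Pods does it reach, and why?), then stop
## >> [scale back down] About 5 minutes after stopping, confirm that the Pod count decreases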
# Run this in a separate terminal
# so that the load generation continues and you can carry on with the rest of the steps
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
# Horizontal Pod Autoscaler Status Conditions
kubectl describe hpa
...
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ----   ----                       -------
  Normal  SuccessfulRescale  13m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  11m    horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  11m    horizontal-pod-autoscaler  New size: 6; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  10m    horizontal-pod-autoscaler  New size: 8; reason: cpu resource utilization (percentage of request) above target
  Normal  SuccessfulRescale  5m35s  horizontal-pod-autoscaler  New size: 7; reason: All metrics below target
  Normal  SuccessfulRescale  4m35s  horizontal-pod-autoscaler  New size: 5; reason: All metrics below target
  Normal  SuccessfulRescale  4m5s   horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
  Normal  SuccessfulRescale  3m50s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
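The roughly five-minute delay before the Pod count drops is consistent with the HPA's default scale-down stabilization window of 300 seconds. If a faster scale-in is useful for the lab, the `behavior` field of `autoscaling/v2` can shorten it. The sketch below only writes a manifest to disk. The file name `php-apache-hpa-behavior.yaml` and the 60-second value are assumptions chosen for illustration, not settings from this lab.

```shell
#!/usr/bin/env bash
# Write an HPA manifest with a shorter scale-down stabilization window.
# The 60-second value is illustrative; the default is 300 seconds.
cat << 'EOF' > php-apache-hpa-behavior.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
EOF

# Apply it against the lab cluster when ready:
# kubectl apply -f php-apache-hpa-behavior.yaml
```

A shorter window makes scale-in visible sooner, at the cost of more replica churn when load fluctuates.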


Prometheus can collect HPA-related metrics for monitoring.
# HPA-related Prometheus metrics
kube_horizontalpodautoscaler_status_current_replicas
kube_horizontalpodautoscaler_status_desired_replicas
kube_horizontalpodautoscaler_status_target_metric
kube_horizontalpodautoscaler_status_condition
kube_horizontalpodautoscaler_spec_target_metric
kube_horizontalpodautoscaler_spec_min_replicas
kube_horizontalpodautoscaler_spec_max_replicas







# Using kube-state-metrics
# [Operations server EC2]
kubectl get pod -n monitoring -l app.kubernetes.io/name=kube-state-metrics -owide
kubectl get pod -n monitoring -l app.kubernetes.io/name=kube-state-metrics -o jsonpath="{.items[*].status.podIP}"
PODIP=$(kubectl get pod -n monitoring -l app.kubernetes.io/name=kube-state-metrics -o jsonpath="{.items[*].status.podIP}")
curl -s http://$PODIP:8080/metrics | grep -i horizontalpodautoscaler | grep HELP
# HELP kube_horizontalpodautoscaler_info Information about this autoscaler.
# HELP kube_horizontalpodautoscaler_metadata_generation [STABLE] The generation observed by the HorizontalPodAutoscaler controller.
# HELP kube_horizontalpodautoscaler_spec_max_replicas [STABLE] Upper limit for the number of pods that can be set by the autoscaler; cannot be smaller than MinReplicas.
# HELP kube_horizontalpodautoscaler_spec_min_replicas [STABLE] Lower limit for the number of pods that can be set by the autoscaler, default 1.
# HELP kube_horizontalpodautoscaler_spec_target_metric The metric specifications used by this autoscaler when calculating the desired replica count.
# HELP kube_horizontalpodautoscaler_status_target_metric The current metric status used by this autoscaler when calculating the desired replica count.
# HELP kube_horizontalpodautoscaler_status_current_replicas [STABLE] Current number of replicas of pods managed by this autoscaler.
# HELP kube_horizontalpodautoscaler_status_desired_replicas [STABLE] Desired number of replicas of pods managed by this autoscaler.
# HELP kube_horizontalpodautoscaler_annotations Kubernetes annotations converted to Prometheus labels.
# HELP kube_horizontalpodautoscaler_labels [STABLE] Kubernetes labels converted to Prometheus labels.
# HELP kube_horizontalpodautoscaler_status_condition [STABLE] The condition of this autoscaler.
curl -s http://$PODIP:8080/metrics | grep -i horizontalpodautoscaler
...
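To compare current and desired replica counts without opening Grafana, the raw metrics text can be filtered directly. This is a minimal sketch, assuming the replica series carry a `horizontalpodautoscaler` label as kube-state-metrics emits them. `hpa_replica_summary` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Prometheus text-format metrics on stdin and print
# one line per HPA with its current and desired replica counts.
hpa_replica_summary() {
  awk '
    # Sample lines only; "# HELP" and "# TYPE" lines do not start with "kube_".
    /^kube_horizontalpodautoscaler_status_(current|desired)_replicas/ {
      kind = ($0 ~ /_current_replicas/) ? "current" : "desired"
      name = $0
      sub(/.*horizontalpodautoscaler="/, "", name)  # drop everything before the label value
      sub(/".*/, "", name)                          # drop everything after it
      value[name, kind] = $NF                       # last field is the sample value
      seen[name] = 1
    }
    END {
      for (n in seen)
        printf "%s current=%s desired=%s\n", n, value[n, "current"], value[n, "desired"]
    }'
}

# Against the lab cluster (PODIP as set in the commands above):
# curl -s http://$PODIP:8080/metrics | hpa_replica_summary
```

A gap between the two numbers indicates a rescale that is in progress or blocked.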

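# Clean up the lab resources (deletes every Deployment, Service, HPA, and Pod in the current namespace)
kubectl delete deploy,svc,hpa,pod --all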