Docker and k8s: Horizontal Pod Autoscaler (HPA)

Peter Jeon · June 14, 2023

HPA

The Horizontal Pod Autoscaler (HPA) is one of the most useful features Kubernetes provides out of the box. It is an API resource that automatically scales the number of Pod replicas in a ReplicationController, Deployment, ReplicaSet, or StatefulSet based on observed metrics such as CPU utilization.

Understanding HPA

In its simplest form, HPA scales on observed CPU utilization, measured as a percentage of the CPU that each Pod requests. Support for custom metrics first appeared as an alpha feature in Kubernetes 1.6, and the autoscaling/v2 API (stable since Kubernetes 1.23) also allows scaling on memory and on custom metrics served through the Kubernetes metrics APIs.

HPA is implemented as a Kubernetes API resource and a controller. The resource describes the desired behavior, and the controller periodically adjusts the number of replicas in the target workload so that the observed average metric value, such as CPU utilization, moves toward the target specified by the user.
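To make "moving toward the target" concrete, here is the scaling rule as described in the Kubernetes HPA documentation, followed by a worked example. The numbers in the example (3 Pods, 120% average utilization, 80% target) are assumptions chosen purely for illustration.

```latex
% Scaling rule as described in the Kubernetes HPA documentation
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]

% Illustrative numbers (assumed): 3 Pods averaging 120% CPU against an 80% target
\[
\left\lceil 3 \times \frac{120}{80} \right\rceil = \lceil 4.5 \rceil = 5
\]
```

The result is then clamped to the configured minimum and maximum replica counts, so in this example an HPA with a maximum of 5 would scale from 3 to 5 Pods.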

How HPA Works

kubectl autoscale deployment foo --min=2 --max=5 --cpu-percent=80

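The above command creates an HPA that keeps the "foo" Deployment between 2 and 5 Pods while aiming for an average CPU utilization of 80% across all Pods. Utilization is measured against each Pod's CPU request, so the containers must declare CPU requests for this to work.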
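The same autoscaler can also be written declaratively, which is easier to keep in version control. The manifest below is a sketch of roughly what the kubectl autoscale command above produces, assuming a cluster that serves the autoscaling/v2 API. The HPA name "foo" is an assumption based on kubectl naming the HPA after its target by default.

```yaml
# Sketch of a manifest roughly equivalent to:
#   kubectl autoscale deployment foo --min=2 --max=5 --cpu-percent=80
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: foo            # assumed name, matching the target Deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 2       # corresponds to --min=2
  maxReplicas: 5       # corresponds to --max=5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # corresponds to --cpu-percent=80
```

Saved under a hypothetical file name such as foo-hpa.yaml, it could be applied with kubectl apply -f foo-hpa.yaml and inspected with kubectl get hpa foo.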

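HPA operates as a feedback loop. The HPA controller runs in the Kubernetes control plane as part of kube-controller-manager. At a regular interval (15 seconds by default) it reads resource metrics such as CPU and memory usage from the Metrics Server, compares them with the target, and adjusts the replica count accordingly. It can also operate on custom metrics if a suitable metrics adapter is configured.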
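As an illustration of scaling on more than one resource metric, the sketch below targets both CPU and memory for the "foo" Deployment. The 70% memory target is an assumed value, not a recommendation. This manifest is meant to replace a CPU-only HPA on the same Deployment rather than run alongside it, since two HPAs acting on one workload would conflict.

```yaml
# Sketch: one HPA with two resource metrics (assumed targets, for illustration).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: foo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # percent of each Pod's CPU request
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70   # assumed value; percent of each Pod's memory request
```

When several metrics are listed, the controller computes a desired replica count for each one and uses the largest, so whichever resource is under the most pressure drives the scale-up.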

HPA with Custom Metrics

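HPA can also react to custom metrics, a capability first introduced as an alpha feature in Kubernetes 1.6. These can be any metrics that your application exposes, provided a metrics adapter (for example, the Prometheus Adapter) serves them through the custom metrics API. Suppose your application increases its outgoing network requests when it is overloaded. In that case, a total count of outgoing requests could be a good candidate for a custom metric.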

Here is an example of HPA configuration with custom metrics:

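apiVersion: autoscaling/v2   # autoscaling/v2beta2 is no longer served as of Kubernetes 1.26
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  # The workload whose replica count is adjusted
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # A per-Pod custom metric, averaged across all Pods of the target
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k   # Kubernetes quantity notation for 1000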

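This configuration scales the Deployment named "sample-deployment" between 1 and 10 replicas so that each Pod handles an average of about 1000 packets per second. If the per-Pod average rises above that target, HPA adds Pods; if it falls below, HPA removes them.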

Conclusion

HPA is an essential tool for any Kubernetes setup where the workload varies. With options to autoscale based on CPU and memory usage, or even custom metrics, it gives you the flexibility to maintain the availability of your applications without manual intervention.
