[Application Scheduling and Lifecycle Management] Manual Scheduling: Placing the Pod You Want on the Node You Want

hi · August 8, 2023

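Manual Pod Scheduling

  • In special environments, a pod can be restricted so that it runs only on specific nodes
  • The scheduler normally places pods on a reasonable node automatically, so such restrictions are usually unnecessary
  • Some cases where more control may be needed
    • Running pods on nodes that have SSDs
    • Workloads such as blockchain or deep learning systems that need a GPU
    • Placing all related pods on a single node to maximize service performance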


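Manual Scheduling with the nodeName Field

  • Forces a pod onto the node you choose; the scheduler is bypassed and the kubelet on that node runs the pod
  • Set the node name under spec, for example nodeName: kube-01 as in the manifest below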
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeName: kube-01


Scheduling with a Node Selector

  • Used when a pod should run on nodes that have specific hardware
  • Covers hardware requirements such as GPUs or SSDs
  • Convenient when managing many workloads
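
As a minimal sketch of the many-workloads case, the same selector can be placed in a Deployment's pod template so that every replica is constrained the same way. The Deployment name `nginx-gpu`, the `app: nginx-gpu` label, and the replica count are hypothetical, and the sketch assumes at least one node already carries the label gpu=true.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-gpu            # hypothetical name
spec:
  replicas: 2                # hypothetical replica count
  selector:
    matchLabels:
      app: nginx-gpu         # must match the pod template labels below
  template:
    metadata:
      labels:
        app: nginx-gpu
    spec:
      containers:
      - name: nginx
        image: nginx
      nodeSelector:
        gpu: "true"          # replicas are scheduled only onto nodes labeled gpu=true
```

If no node carries the label, the replicas stay Pending until a matching node becomes available.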

Node Labeling

  • Apply the label gpu=true with the following command
    kubectl label node <node_name> gpu=true
imkunyoung@cloudshell:~ (kubernetes-397511)$ kubectl label node gke-artbridge-default-pool-65403ed8-7zvx gpu=true
node/gke-artbridge-default-pool-65403ed8-7zvx labeled
imkunyoung@cloudshell:~ (kubernetes-397511)$ kubectl describe nodes gke-artbridge-default-pool-65403ed8-7zvx
Name:               gke-artbridge-default-pool-65403ed8-7zvx
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=e2-medium
                    beta.kubernetes.io/os=linux
                    cloud.google.com/gke-boot-disk=pd-balanced
                    cloud.google.com/gke-container-runtime=containerd
                    cloud.google.com/gke-cpu-scaling-level=2
                    cloud.google.com/gke-logging-variant=DEFAULT
                    cloud.google.com/gke-max-pods-per-node=110
                    cloud.google.com/gke-nodepool=default-pool
                    cloud.google.com/gke-os-distribution=cos
                    cloud.google.com/gke-provisioning=standard
                    cloud.google.com/gke-stack-type=IPV4
                    cloud.google.com/machine-family=e2
                    cloud.google.com/private-node=false
                    failure-domain.beta.kubernetes.io/region=us-central1
                    failure-domain.beta.kubernetes.io/zone=us-central1-c
                    gpu=true
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=gke-artbridge-default-pool-65403ed8-7zvx
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=e2-medium
                    topology.gke.io/zone=us-central1-c
                    topology.kubernetes.io/region=us-central1
                    topology.kubernetes.io/zone=us-central1-c
Annotations:        container.googleapis.com/instance_id: 2705762411001197920
                    csi.volume.kubernetes.io/nodeid:
                      {"pd.csi.storage.gke.io":"projects/kubernetes-397511/zones/us-central1-c/instances/gke-artbridge-default-pool-65403ed8-7zvx"}
                    node.alpha.kubernetes.io/ttl: 0
                    node.gke.io/last-applied-node-labels:
                      cloud.google.com/gke-boot-disk=pd-balanced,cloud.google.com/gke-container-runtime=containerd,cloud.google.com/gke-cpu-scaling-level=2,clou...
                    node.gke.io/last-applied-node-taints: 
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 06 Sep 2023 23:35:12 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  gke-artbridge-default-pool-65403ed8-7zvx
  AcquireTime:     <unset>
  RenewTime:       Sun, 10 Sep 2023 06:36:46 +0000
Conditions:
  Type                          Status  LastHeartbeatTime                 LastTransitionTime                Reason                          Message
  ----                          ------  -----------------                 ------------------                ------                          -------
  ReadonlyFilesystem            False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   FilesystemIsNotReadOnly         Filesystem is not read-only
  CorruptDockerOverlay2         False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   NoCorruptDockerOverlay2         docker overlay2 is functioning properly
  FrequentUnregisterNetDevice   False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   NoFrequentUnregisterNetDevice   node is functioning properly
  FrequentKubeletRestart        False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   NoFrequentKubeletRestart        kubelet is functioning properly
  FrequentDockerRestart         False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   NoFrequentDockerRestart         docker is functioning properly
  FrequentContainerdRestart     False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   NoFrequentContainerdRestart     containerd is functioning properly
  KernelDeadlock                False   Sun, 10 Sep 2023 06:33:49 +0000   Wed, 06 Sep 2023 23:35:16 +0000   KernelHasNoDeadlock             kernel has no deadlock
  NetworkUnavailable            False   Wed, 06 Sep 2023 23:35:13 +0000   Wed, 06 Sep 2023 23:35:12 +0000   RouteCreated                    NodeController create implicit route
  MemoryPressure                False   Sun, 10 Sep 2023 06:32:33 +0000   Wed, 06 Sep 2023 23:30:56 +0000   KubeletHasSufficientMemory      kubelet has sufficient memory available
  DiskPressure                  False   Sun, 10 Sep 2023 06:32:33 +0000   Wed, 06 Sep 2023 23:30:56 +0000   KubeletHasNoDiskPressure        kubelet has no disk pressure
  PIDPressure                   False   Sun, 10 Sep 2023 06:32:33 +0000   Wed, 06 Sep 2023 23:30:56 +0000   KubeletHasSufficientPID         kubelet has sufficient PID available
  Ready                         True    Sun, 10 Sep 2023 06:32:33 +0000   Wed, 06 Sep 2023 23:35:32 +0000   KubeletReady                    kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.128.0.3
  ExternalIP:  35.184.98.58
  Hostname:    gke-artbridge-default-pool-65403ed8-7zvx
Capacity:
  cpu:                2
  ephemeral-storage:  98831908Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             4022952Ki
  pods:               110
Allocatable:
  cpu:                940m
  ephemeral-storage:  47060071478
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2877096Ki
  pods:               110
System Info:
  Machine ID:                 ba2028bc3a517ab615092f0373101b4b
  System UUID:                ba2028bc-3a51-7ab6-1509-2f0373101b4b
  Boot ID:                    1d09c47b-376f-49a0-b21e-6deee29a5595
  Kernel Version:             5.15.109+
  OS Image:                   Container-Optimized OS from Google
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.0
  Kubelet Version:            v1.27.3-gke.100
  Kube-Proxy Version:         v1.27.3-gke.100
PodCIDR:                      10.40.0.0/24
PodCIDRs:                     10.40.0.0/24
ProviderID:                   gce://kubernetes-397511/us-central1-c/gke-artbridge-default-pool-65403ed8-7zvx
Non-terminated Pods:          (12 in total)
  Namespace                   Name                                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                                   ------------  ----------  ---------------  -------------  ---
  argocd                      argocd-dex-server-656864dd94-ph98s                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d8h
  argocd                      argocd-redis-b5d6bf5f5-mbfck                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d8h
  argocd                      argocd-server-7f758fccf6-mpds7                         0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d8h
  default                     admin-page                                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d4h
  default                     jenkins-deployment-58f88d9746-gmvkc                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d9h
  default                     jhipster-prometheus-operator-77c8f847cb-sr7sx          100m (10%)    200m (21%)  50Mi (1%)        100Mi (3%)     45h
  flask0                      flask-6b7fbcfd94-h6xd5                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d7h
  gmp-system                  collector-5mkmp                                        5m (0%)       0 (0%)      36M (1%)         3032M (102%)   29h
  kube-system                 fluentbit-gke-p5n7f                                    100m (10%)    0 (0%)      200Mi (7%)       500Mi (17%)    3d7h
  kube-system                 gke-metrics-agent-hhbz2                                14m (1%)      0 (0%)      160Mi (5%)       160Mi (5%)     3d7h
  kube-system                 kube-proxy-gke-artbridge-default-pool-65403ed8-7zvx    100m (10%)    0 (0%)      0 (0%)           0 (0%)         3d7h
  kube-system                 pdcsi-node-l9dq5                                       10m (1%)      0 (0%)      20Mi (0%)        100Mi (3%)     29h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests         Limits
  --------           --------         ------
  cpu                329m (35%)       200m (21%)
  memory             486887680 (16%)  3933775360 (133%)
  ephemeral-storage  0 (0%)           0 (0%)
  hugepages-1Gi      0 (0%)           0 (0%)
  hugepages-2Mi      0 (0%)           0 (0%)
Events:              <none>

Scheduling with Labels

  • Write the nginx.yaml below and create the pod
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    gpu: "true"
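
A nodeSelector can list more than one label, in which case a node must carry all of them to be eligible. The sketch below assumes a second, hypothetical label disktype=ssd has also been applied to the node, and the pod name `nginx-ssd-gpu` is only illustrative. The value "true" is quoted because label values are strings, and an unquoted true would be parsed as a YAML boolean.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ssd-gpu   # hypothetical name
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    gpu: "true"         # the label applied earlier with kubectl label
    disktype: ssd       # hypothetical second label; both labels must match
```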
