Manual Pod Scheduling
- Restricts pods to run on specific nodes in special environments
- Usually unnecessary, since the scheduler automatically places pods on suitable nodes
- Cases where more control may be needed:
- Running pods on nodes that have SSDs
- Needing GPU resources for blockchain or deep-learning workloads
- Co-locating all of a service's pods on a single node to maximize performance
Manual scheduling with the nodeName field
- Forces a pod to be scheduled onto the desired node
- Set the node name under spec, e.g. nodeName: kube-01
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeName: kube-01
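To try this, save the manifest above (the filename below is arbitrary) and verify placement. Note that nodeName bypasses the scheduler entirely, so scheduler-level checks such as taints are skipped, and the pod simply fails if the named node does not exist or lacks resources:

```shell
# Create the pod; with nodeName set, the scheduler is bypassed
# and the kubelet on kube-01 starts the pod directly.
kubectl apply -f pod.yaml

# The NODE column should show kube-01.
kubectl get pod nginx -o wide
```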
Scheduling with a node selector
- Used when a pod should run on nodes with specific hardware
- Expresses hardware requirements such as GPUs or SSDs as node labels
- Convenient when managing many workloads
Node labeling
- Apply the label gpu=true with the following command:
kubectl label node <node_name> gpu=true
imkunyoung@cloudshell:~ (kubernetes-397511)$ kubectl label node gke-artbridge-default-pool-65403ed8-7zvx gpu=true
node/gke-artbridge-default-pool-65403ed8-7zvx labeled
imkunyoung@cloudshell:~ (kubernetes-397511)$ kubectl describe nodes gke-artbridge-default-pool-65403ed8-7zvx
Name: gke-artbridge-default-pool-65403ed8-7zvx
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=e2-medium
beta.kubernetes.io/os=linux
cloud.google.com/gke-boot-disk=pd-balanced
cloud.google.com/gke-container-runtime=containerd
cloud.google.com/gke-cpu-scaling-level=2
cloud.google.com/gke-logging-variant=DEFAULT
cloud.google.com/gke-max-pods-per-node=110
cloud.google.com/gke-nodepool=default-pool
cloud.google.com/gke-os-distribution=cos
cloud.google.com/gke-provisioning=standard
cloud.google.com/gke-stack-type=IPV4
cloud.google.com/machine-family=e2
cloud.google.com/private-node=false
failure-domain.beta.kubernetes.io/region=us-central1
failure-domain.beta.kubernetes.io/zone=us-central1-c
gpu=true
kubernetes.io/arch=amd64
kubernetes.io/hostname=gke-artbridge-default-pool-65403ed8-7zvx
kubernetes.io/os=linux
node.kubernetes.io/instance-type=e2-medium
topology.gke.io/zone=us-central1-c
topology.kubernetes.io/region=us-central1
topology.kubernetes.io/zone=us-central1-c
Annotations: container.googleapis.com/instance_id: 2705762411001197920
csi.volume.kubernetes.io/nodeid:
{"pd.csi.storage.gke.io":"projects/kubernetes-397511/zones/us-central1-c/instances/gke-artbridge-default-pool-65403ed8-7zvx"}
node.alpha.kubernetes.io/ttl: 0
node.gke.io/last-applied-node-labels:
cloud.google.com/gke-boot-disk=pd-balanced,cloud.google.com/gke-container-runtime=containerd,cloud.google.com/gke-cpu-scaling-level=2,clou...
node.gke.io/last-applied-node-taints:
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 06 Sep 2023 23:35:12 +0000
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: gke-artbridge-default-pool-65403ed8-7zvx
AcquireTime: <unset>
RenewTime: Sun, 10 Sep 2023 06:36:46 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
ReadonlyFilesystem False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 FilesystemIsNotReadOnly Filesystem is not read-only
CorruptDockerOverlay2 False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 NoCorruptDockerOverlay2 docker overlay2 is functioning properly
FrequentUnregisterNetDevice False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 NoFrequentUnregisterNetDevice node is functioning properly
FrequentKubeletRestart False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 NoFrequentKubeletRestart kubelet is functioning properly
FrequentDockerRestart False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 NoFrequentDockerRestart docker is functioning properly
FrequentContainerdRestart False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 NoFrequentContainerdRestart containerd is functioning properly
KernelDeadlock False Sun, 10 Sep 2023 06:33:49 +0000 Wed, 06 Sep 2023 23:35:16 +0000 KernelHasNoDeadlock kernel has no deadlock
NetworkUnavailable False Wed, 06 Sep 2023 23:35:13 +0000 Wed, 06 Sep 2023 23:35:12 +0000 RouteCreated NodeController create implicit route
MemoryPressure False Sun, 10 Sep 2023 06:32:33 +0000 Wed, 06 Sep 2023 23:30:56 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sun, 10 Sep 2023 06:32:33 +0000 Wed, 06 Sep 2023 23:30:56 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sun, 10 Sep 2023 06:32:33 +0000 Wed, 06 Sep 2023 23:30:56 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sun, 10 Sep 2023 06:32:33 +0000 Wed, 06 Sep 2023 23:35:32 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.128.0.3
ExternalIP: 35.184.98.58
Hostname: gke-artbridge-default-pool-65403ed8-7zvx
Capacity:
cpu: 2
ephemeral-storage: 98831908Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 4022952Ki
pods: 110
Allocatable:
cpu: 940m
ephemeral-storage: 47060071478
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 2877096Ki
pods: 110
System Info:
Machine ID: ba2028bc3a517ab615092f0373101b4b
System UUID: ba2028bc-3a51-7ab6-1509-2f0373101b4b
Boot ID: 1d09c47b-376f-49a0-b21e-6deee29a5595
Kernel Version: 5.15.109+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.7.0
Kubelet Version: v1.27.3-gke.100
Kube-Proxy Version: v1.27.3-gke.100
PodCIDR: 10.40.0.0/24
PodCIDRs: 10.40.0.0/24
ProviderID: gce://kubernetes-397511/us-central1-c/gke-artbridge-default-pool-65403ed8-7zvx
Non-terminated Pods: (12 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
argocd argocd-dex-server-656864dd94-ph98s 0 (0%) 0 (0%) 0 (0%) 0 (0%) 2d8h
argocd argocd-redis-b5d6bf5f5-mbfck 0 (0%) 0 (0%) 0 (0%) 0 (0%) 2d8h
argocd argocd-server-7f758fccf6-mpds7 0 (0%) 0 (0%) 0 (0%) 0 (0%) 2d8h
default admin-page 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d4h
default jenkins-deployment-58f88d9746-gmvkc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 2d9h
default jhipster-prometheus-operator-77c8f847cb-sr7sx 100m (10%) 200m (21%) 50Mi (1%) 100Mi (3%) 45h
flask0 flask-6b7fbcfd94-h6xd5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 2d7h
gmp-system collector-5mkmp 5m (0%) 0 (0%) 36M (1%) 3032M (102%) 29h
kube-system fluentbit-gke-p5n7f 100m (10%) 0 (0%) 200Mi (7%) 500Mi (17%) 3d7h
kube-system gke-metrics-agent-hhbz2 14m (1%) 0 (0%) 160Mi (5%) 160Mi (5%) 3d7h
kube-system kube-proxy-gke-artbridge-default-pool-65403ed8-7zvx 100m (10%) 0 (0%) 0 (0%) 0 (0%) 3d7h
kube-system pdcsi-node-l9dq5 10m (1%) 0 (0%) 20Mi (0%) 100Mi (3%) 29h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 329m (35%) 200m (21%)
memory 486887680 (16%) 3933775360 (133%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
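Instead of scanning the full describe output above, the label can be checked, filtered on, and later removed directly:

```shell
# Show the gpu label value as an extra column for every node.
kubectl get nodes -L gpu

# List only the nodes that carry the label.
kubectl get nodes -l gpu=true

# Remove the label again (the trailing dash deletes it).
kubectl label node gke-artbridge-default-pool-65403ed8-7zvx gpu-
```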
Scheduling using the label
- Write the nginx.yaml below and create the pod
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    gpu: "true"
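The same selector also works inside a Deployment's pod template, which is where it helps most when managing many workloads: every replica is constrained to labeled nodes. A sketch (the name gpu-nginx is illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-nginx          # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gpu-nginx
  template:
    metadata:
      labels:
        app: gpu-nginx
    spec:
      nodeSelector:
        gpu: "true"        # all replicas schedule only onto labeled nodes
      containers:
      - name: nginx
        image: nginx
```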