Environment: Windows 11 + Rancher Desktop + WSL2 + kind v1.35.0 + Kubernetes 1.35
After creating a kind cluster, most Pods in the kube-proxy DaemonSet went into the Error state and then restarted repeatedly in CrashLoopBackOff.
PS C:\projects\my-k8s-project> kind create cluster --config .\k8s\0-config\0-kind-config.yaml --name kind-cluster
Creating cluster "kind-cluster" ...
✓ Ensuring node image (kindest/node:v1.35.0) 🖼
✓ Preparing nodes 📦 📦 📦 📦 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-kind-cluster"
# Check Pod status right after cluster creation
PS C:\projects\my-k8s-project> kubectl get pods -n kube-system
NAME                                                 READY   STATUS    RESTARTS      AGE
coredns-7d764666f9-n4dcj                             1/1     Running   0             63s
coredns-7d764666f9-qm5dp                             1/1     Running   0             63s
etcd-kind-cluster-control-plane                      1/1     Running   0             70s
kindnet-26h2s                                        1/1     Running   0             63s
kindnet-6sq64                                        1/1     Running   0             61s
kindnet-d8fhq                                        1/1     Running   0             61s
kindnet-dhhpv                                        1/1     Running   0             61s
kindnet-jw8ph                                        1/1     Running   0             61s
kindnet-s6p2d                                        1/1     Running   0             61s
kube-apiserver-kind-cluster-control-plane            1/1     Running   0             70s
kube-controller-manager-kind-cluster-control-plane   1/1     Running   0             70s
kube-proxy-5cs9j                                     1/1     Running   0             63s
kube-proxy-7mk5j                                     0/1     Error     3 (46s ago)   61s   # ❌
kube-proxy-7trt2                                     0/1     Error     3 (44s ago)   60s   # ❌
kube-proxy-gtzc7                                     0/1     Error     3 (43s ago)   61s   # ❌
kube-proxy-krmp8                                     0/1     Error     3 (41s ago)   61s   # ❌
kube-proxy-nsxgm                                     0/1     Error     3 (41s ago)   61s   # ❌
kube-scheduler-kind-cluster-control-plane            1/1     Running   0             70s
I checked the logs of one of the failing kube-proxy Pods.
PS C:\projects\my-k8s-project> kubectl -n kube-system logs kube-proxy-7mk5j -c kube-proxy --tail=200
E0127 23:34:28.752692 1 run.go:72] "command failed" err="failed complete: too many open files"
The "too many open files" message looked like a file descriptor shortage.
PS C:\projects\my-k8s-project> kubectl -n kube-system describe pod kube-proxy-7mk5j
Name:             kube-proxy-7mk5j
Namespace:        kube-system
Node:             kind-cluster-worker3/172.20.0.4
...
Containers:
  kube-proxy:
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 28 Jan 2026 08:31:36 +0900
      Finished:     Wed, 28 Jan 2026 08:31:36 +0900
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
    Ready:          False
    Restart Count:  4
...
Events:
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Normal   Pulled   27s (x5 over 2m4s)   kubelet  spec.containers{kube-proxy}: Container image already present
  Normal   Created  27s (x5 over 2m4s)   kubelet  spec.containers{kube-proxy}: Container created
  Normal   Started  27s (x5 over 2m4s)   kubelet  spec.containers{kube-proxy}: Container started
  Warning  BackOff  26s (x5 over 2m2s)   kubelet  Back-off restarting failed container kube-proxy
PS C:\projects\my-k8s-project> kubectl get nodes -o wide
NAME                         STATUS   ROLES           AGE    VERSION   INTERNAL-IP   OS-IMAGE                         KERNEL-VERSION
kind-cluster-control-plane   Ready    control-plane   111s   v1.35.0   172.20.0.5    Debian GNU/Linux 12 (bookworm)   6.6.87.2-microsoft-standard-WSL2
kind-cluster-worker          Ready    <none>          100s   v1.35.0   172.20.0.7    Debian GNU/Linux 12 (bookworm)   6.6.87.2-microsoft-standard-WSL2
kind-cluster-worker2         Ready    <none>          100s   v1.35.0   172.20.0.2    Debian GNU/Linux 12 (bookworm)   6.6.87.2-microsoft-standard-WSL2
kind-cluster-worker3         Ready    <none>          100s   v1.35.0   172.20.0.4    Debian GNU/Linux 12 (bookworm)   6.6.87.2-microsoft-standard-WSL2
kind-cluster-worker4         Ready    <none>          100s   v1.35.0   172.20.0.3    Debian GNU/Linux 12 (bookworm)   6.6.87.2-microsoft-standard-WSL2
kind-cluster-worker5         Ready    <none>          100s   v1.35.0   172.20.0.6    Debian GNU/Linux 12 (bookworm)   6.6.87.2-microsoft-standard-WSL2
Because the error was "too many open files", I first suspected the file descriptor limit (nofile) and checked it inside one of the failing nodes.
PS C:\projects\my-k8s-project> docker exec -it kind-cluster-worker3 bash
root@kind-cluster-worker3:/# ulimit -n
1048576
root@kind-cluster-worker3:/# cat /proc/1/limits | grep -i "open files"
Max open files 1048576 1048576 files
root@kind-cluster-worker3:/# cat /proc/self/limits | grep -i "open files"
Max open files 1048576 1048576 files
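The nofile limit was already 1048576, so file descriptors were not the bottleneck, and I turned to inotify. inotify is the Linux mechanism for monitoring file system changes, and Kubernetes components use it to watch files, for example configuration mounted from ConfigMaps and Secrets.
I checked the inotify settings on a kind node.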
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker3 sysctl fs.inotify.max_user_instances
fs.inotify.max_user_instances = 128
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker3 sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 524288
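I concluded that fs.inotify.max_user_instances = 128 was too low: when several kube-proxy Pods tried to create inotify instances at the same time, the limit was exhausted. All kind nodes are containers running on the same WSL2 kernel, so they appear to draw on the same per-user limit, and the kernel reports an exhausted inotify instance limit as EMFILE, the same "too many open files" error it uses for file descriptors.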
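One way to sanity-check this diagnosis is to count the inotify instances that are currently open and compare the total with the limit. The sketch below is a hypothetical helper, not something from the original troubleshooting session. It assumes a Linux /proc filesystem and a find that supports -lname (GNU findutils does). It only sees processes in the PID namespace where it runs, and only those whose fd directories the caller may read, so treat the number as a lower bound.

```shell
#!/usr/bin/env bash
# count_inotify_instances: hypothetical helper, not part of the original session.
# Each inotify instance is a file descriptor whose /proc/<pid>/fd symlink
# points at "anon_inode:inotify", so counting those links approximates usage.
# Run as root to see the descriptors of other users' processes.
count_inotify_instances() {
  find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l
}

# Print usage next to the limit when the sysctl file is readable.
limit_file=/proc/sys/fs/inotify/max_user_instances
if [ -r "$limit_file" ]; then
  echo "inotify instances in use: $(count_inotify_instances) / limit: $(cat "$limit_file")"
fi
```

Running it inside a node (for example through docker exec) and again in the WSL distribution gives a rough picture of how close the environment is to the limit.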
For reference, max_user_instances is often left at its default of 128, and you can read it directly under /proc as shown below.
$ cat /proc/sys/fs/inotify/max_user_instances
128
$ cat /proc/sys/fs/inotify/max_user_watches
524288
In an environment that runs many containers, such as a kind cluster, the default can be too low.
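A small preflight check can catch a low limit before the cluster is created. The following is a minimal sketch under stated assumptions: check_inotify_limit is a hypothetical helper name, and the threshold of 8192 is simply the value used later in this post, not an official recommendation.

```shell
#!/usr/bin/env bash
# check_inotify_limit FILE MIN: hypothetical preflight helper.
# Reads an integer from FILE and reports whether it is at least MIN.
check_inotify_limit() {
  local file="$1" min="$2" current
  current="$(cat "$file")"
  if [ "$current" -lt "$min" ]; then
    echo "LOW: $current < $min"
    return 1
  fi
  echo "OK: $current >= $min"
}

# Example call (threshold taken from this post, adjust as needed):
# check_inotify_limit /proc/sys/fs/inotify/max_user_instances 8192
```

The function returns a non-zero status when the limit is too low, so it can gate a kind create cluster call in a setup script.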
In the WSL terminal:
# Apply to the current session
$ sudo sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
$ sudo sysctl -w fs.inotify.max_user_watches=524288
fs.inotify.max_user_watches = 524288
# Persist the settings across reboots
$ echo "fs.inotify.max_user_instances=8192" | sudo tee -a /etc/sysctl.conf
$ echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
# Exit WSL
$ exit
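Appending with tee -a adds a duplicate line every time the commands are repeated. As a possible alternative, the sketch below rewrites a key in place so the file ends up with exactly one entry for it. set_sysctl_conf is a hypothetical helper, and whether the file is applied at startup depends on how the WSL distribution boots, which this post does not verify.

```shell
#!/usr/bin/env bash
# set_sysctl_conf KEY VALUE FILE: hypothetical helper.
# Removes any existing "KEY = ..." assignment from FILE, then appends KEY=VALUE,
# so repeated runs leave a single entry for the key.
set_sysctl_conf() {
  local key="$1" value="$2" file="$3" tmp
  tmp="$(mktemp)"
  touch "$file"
  # Keep every line whose name part (text before "=", spaces removed) is not KEY.
  awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through the original file to keep its ownership and mode.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example (needs root for /etc/sysctl.conf):
# set_sysctl_conf fs.inotify.max_user_instances 8192 /etc/sysctl.conf
```

After editing the file, sudo sysctl -p reloads it for the current session.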
PS C:\projects\my-k8s-project> docker exec kind-cluster-control-plane sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker2 sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker3 sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker4 sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
PS C:\projects\my-k8s-project> docker exec kind-cluster-worker5 sysctl -w fs.inotify.max_user_instances=8192
fs.inotify.max_user_instances = 8192
PS C:\projects\my-k8s-project> kubectl delete pod -n kube-system -l k8s-app=kube-proxy
pod "kube-proxy-5cs9j" deleted from kube-system namespace
pod "kube-proxy-7mk5j" deleted from kube-system namespace
pod "kube-proxy-7trt2" deleted from kube-system namespace
pod "kube-proxy-gtzc7" deleted from kube-system namespace
pod "kube-proxy-krmp8" deleted from kube-system namespace
pod "kube-proxy-nsxgm" deleted from kube-system namespace
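The six docker exec calls above differ only in the node name, so they can be collapsed into a loop. This is a sketch for a bash shell where the docker and kind CLIs are reachable (an assumption, since the original commands were run from PowerShell). raise_inotify_limit and the DOCKER override variable are hypothetical names introduced here so the loop can be dry-run with DOCKER=echo.

```shell
#!/usr/bin/env bash
# raise_inotify_limit LIMIT NODE...: hypothetical helper.
# Runs "sysctl -w fs.inotify.max_user_instances=LIMIT" in each node container.
# Set DOCKER=echo to print the commands instead of executing them.
raise_inotify_limit() {
  local limit="$1" node
  shift
  for node in "$@"; do
    "${DOCKER:-docker}" exec "$node" sysctl -w "fs.inotify.max_user_instances=${limit}"
  done
}

# Example against the cluster from this post:
# raise_inotify_limit 8192 $(kind get nodes --name kind-cluster)
```

kind get nodes prints one node container name per line, so the loop keeps working if the number of workers in the kind config changes.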
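After raising the limit on every node and deleting the kube-proxy Pods so that the DaemonSet recreates them, I checked whether they were healthy: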
PS C:\projects\my-k8s-project> kubectl get pods -n kube-system -l k8s-app=kube-proxy
NAME               READY   STATUS    RESTARTS   AGE
kube-proxy-75n2x   1/1     Running   0          12s
kube-proxy-hv7zm   1/1     Running   0          12s
kube-proxy-pltnv   1/1     Running   0          12s
kube-proxy-rzd98   1/1     Running   0          12s
kube-proxy-xtvjq   1/1     Running   0          12s
kube-proxy-xxs8d   1/1     Running   0          12s
Every kube-proxy Pod was READY 1/1 with STATUS Running,
and they kept running normally with no restarts :)
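With more Pods, scanning the table by eye gets tedious. The filter below is a minimal sketch that reads kubectl get pods output on standard input and prints only the Pods whose STATUS column is not Running. not_running is a hypothetical helper name, and it assumes the default column layout shown above (NAME, READY, STATUS as the first three columns).

```shell
#!/usr/bin/env bash
# not_running: hypothetical filter for default "kubectl get pods" output.
# Skips the header row and prints "NAME STATUS" for every Pod not in Running.
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Example:
# kubectl get pods -n kube-system -l k8s-app=kube-proxy | not_running
```

Empty output means every listed Pod is Running. kubectl rollout status daemonset/kube-proxy -n kube-system is another way to wait for the DaemonSet to settle.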
Because the error said too many open files, I assumed the nofile limit was the cause, but this turned out to be a case of inotify resources running out.