Today, when I tried to run a Kubernetes command, I got the following error.
$ kubectl get node
E0920 22:26:27.495184 25796 memcache.go:265] couldn't get current server API group list: Get "https://192.168.0.213:6443/api?timeout=32s": dial tcp 192.168.0.213:6443: connect: connection refused
The connection to the server 192.168.0.213:6443 was refused - did you specify the right host or port?
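journalctl had no entries at all for kube-apiserver. On a kubeadm cluster kube-apiserver runs as a static Pod rather than a systemd unit, so an empty journal is expected and does not by itself show that it had stopped.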
$ sudo journalctl -u kube-apiserver
-- Logs begin at Sun 2024-09-15 20:54:44 KST, end at Fri 2024-09-20 22:29:42 KST. --
-- No entries --
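Judging from the container log below, kube-apiserver had exited because a certificate had expired: its TLS handshake with etcd (127.0.0.1:2379) failed on an expired certificate.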
$ sudo crictl ps -a | grep kube-apiserver
a9ab410eded16 19b9246d37c8b About a minute ago Exited kube-apiserver 811 a917030e3d193 kube-apiserver-com
$ sudo crictl logs a9ab410eded16
W0920 13:31:21.680393 1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "127.0.0.1:2379",
"ServerName": "127.0.0.1",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-09-20T13:31:21Z is after 2024-09-20T03:21:59Z"
E0920 13:31:24.410777 1 run.go:74] "command failed" err="context deadline exceeded"
$ openssl x509 -noout -dates -in /etc/kubernetes/pki/apiserver.crt
notBefore=Sep 21 03:16:57 2023 GMT
notAfter=Sep 20 03:21:58 2024 GMT
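apiserver.crt is only one of several certificates under /etc/kubernetes/pki, and the log above does not say which file the expiry time belongs to. The following is a minimal sketch for printing the notAfter date of every certificate in a directory. The function name list_cert_expiry is hypothetical, and the sketch assumes openssl is installed and the files are readable (a root shell may be needed).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the expiry date of every *.crt file under a directory.
list_cert_expiry() {
  local dir="${1:-/etc/kubernetes/pki}"
  local crt
  # find also descends into subdirectories such as etcd/.
  find "$dir" -type f -name '*.crt' 2>/dev/null | sort | while IFS= read -r crt; do
    # -enddate prints a single line of the form "notAfter=<date>".
    printf '%s\t%s\n' "$crt" "$(openssl x509 -noout -enddate -in "$crt" 2>/dev/null)"
  done
}
```

Calling list_cert_expiry with no argument scans /etc/kubernetes/pki; any other directory can be passed as the first argument.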
$ sudo kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Sep 20, 2024 03:22 UTC <invalid> ca no
apiserver Sep 20, 2025 13:39 UTC 364d ca no
!MISSING! apiserver-etcd-client
apiserver-kubelet-client Sep 20, 2024 03:21 UTC <invalid> ca no
controller-manager.conf Sep 20, 2024 03:22 UTC <invalid> ca no
etcd-healthcheck-client Sep 20, 2024 03:22 UTC <invalid> etcd-ca no
etcd-peer Sep 20, 2024 03:22 UTC <invalid> etcd-ca no
etcd-server Sep 20, 2024 03:21 UTC <invalid> etcd-ca no
front-proxy-client Sep 20, 2024 03:21 UTC <invalid> front-proxy-ca no
scheduler.conf Sep 20, 2024 03:22 UTC <invalid> ca no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Sep 18, 2033 03:21 UTC 8y no
etcd-ca Sep 18, 2033 03:21 UTC 8y no
front-proxy-ca Sep 18, 2033 03:21 UTC 8y no
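The table above is easier to act on when reduced to just the problem entries. As a rough sketch, the filter below reads the check-expiration output on standard input and prints the names marked as invalid or missing. The function name list_invalid_certs is hypothetical, and the filter assumes the column layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubeadm certs check-expiration` output on stdin and
# print the entries reported as <invalid> or !MISSING!.
list_invalid_certs() {
  awk '
    $1 == "!MISSING!" { print $2 " (missing)"; next }  # certificate file not found
    /<invalid>/       { print $1 }                     # residual time is <invalid>
  '
}
```

It could be used as: sudo kubeadm certs check-expiration | list_invalid_certs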
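# Generate certificates (this only creates missing ones; certificates already on disk are reused, as the "Using existing" lines below show)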
$ sudo kubeadm init phase certs all
I0920 22:56:40.672223 30625 version.go:256] remote version is much newer: v1.31.0; falling back to: stable-1.28
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Using the existing "sa" key
# Backup (with the old files moved aside, kubeadm init phase certs generates new ones instead of reusing them)
$ sudo mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.bak
$ sudo mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.bak
$ sudo mv /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/apiserver-etcd-client.crt.bak
$ sudo mv /etc/kubernetes/pki/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.key.bak
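The mv commands above rename individual files in place. A more conservative option is to copy the whole PKI directory to a timestamped location before touching anything. The sketch below shows one way to do that. The function name backup_pki and the default destination /root/k8s-pki-backup are assumptions made for illustration, not something from this cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a PKI directory to a timestamped backup directory
# and print the backup path. Nothing in the source directory is modified.
backup_pki() {
  local src="${1:-/etc/kubernetes/pki}"
  local dest_root="${2:-/root/k8s-pki-backup}"   # assumed location, change as needed
  local dest
  dest="${dest_root}/pki-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dest_root" || return 1
  # -a preserves permissions, ownership and timestamps of keys and certificates.
  cp -a "$src" "$dest" || return 1
  printf '%s\n' "$dest"
}
```

Run from a root shell, backup_pki with no arguments copies /etc/kubernetes/pki and prints where the copy was written.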
# It still did not work, so I tried various things... I am not sure whether this command actually helped.
$ sudo kubeadm init phase kubeconfig all
# Copy the config
$ sudo cp /etc/kubernetes/admin.conf /home/[username]/.kube/config
# Check that it works normally
$ kubectl get node
# How to view the logs if it does not work
$ sudo crictl ps | grep kube-apiserver
a2650fe07fc7f...
$ sudo crictl logs a2650fe07fc7f -f
# Errors logged while it was not working
E0920 14:47:44.447360 1 authentication.go:70] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z, verifying certificate SN=5751791429713888989, SKID=, AKID=95:58:3C:70:E1:8B:20:61:2B:A6:80:4E:93:52:2D:F6:D0:38:74:02 failed: x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z]"
E0920 14:47:44.448531 1 authentication.go:70] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z, verifying certificate SN=5751791429713888989, SKID=, AKID=95:58:3C:70:E1:8B:20:61:2B:A6:80:4E:93:52:2D:F6:D0:38:74:02 failed: x509: certificate has expired or is not yet valid: current time 2024-09-20T14:47:44Z is after 2024-09-20T03:22:01Z]"
# After this point it suddenly returned to normal
I0920 14:48:03.755701 1 handler.go:232] Adding GroupVersion metrics.k8s.io v1beta1 to ResourceManager
I0920 14:48:03.760819 1 handler.go:232] Adding GroupVersion projectcalico.org v3 to ResourceManager
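Even after I renewed the certificates, I kept getting errors that looked as if kube-apiserver was still looking at the old certificates. Restarting did not help at first, and then it suddenly started working normally out of nowhere. I still need to look into the exact sequence of events.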
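One possible explanation, offered as an assumption rather than something confirmed on this cluster: the kubeadm documentation notes that control-plane Pods should be restarted after renewal because not every component reloads certificates dynamically. For static Pods it describes temporarily moving the manifest out of /etc/kubernetes/manifests and waiting for the kubelet to notice (about 20 seconds by default). The sketch below follows that idea. The function name restart_static_pods is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart static Pods by moving their manifests away and back.
restart_static_pods() {
  local manifests="${1:-/etc/kubernetes/manifests}"
  local wait_seconds="${2:-20}"   # kubelet fileCheckFrequency defaults to 20s
  local holding f
  holding="$(mktemp -d)" || return 1
  # Move every manifest out so the kubelet stops the corresponding Pods.
  for f in "$manifests"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$holding"/ || return 1
  done
  sleep "$wait_seconds"
  # Move the manifests back so the kubelet recreates the Pods.
  for f in "$holding"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$manifests"/ || return 1
  done
  rmdir "$holding"
}
```

On a real node this would need a root shell, and would typically follow a renewal such as sudo kubeadm certs renew all. The whole control plane on that node is briefly down while the manifests are moved away, so on a single-node cluster kubectl will be refused until the Pods come back.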