etcd is a distributed key:value data store. Kubernetes keeps its data on disk under /var/lib/etcd (persisted separately from what is held in memory), and etcd is administered with the etcdctl command-line tool.
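As a quick, minimal sketch of the key:value model (run against a local, unauthenticated etcd; the foo/bar key is made up for illustration):

% etcdctl put foo bar
OK
% etcdctl get foo
foo
bar

On a kubeadm cluster the server requires TLS client certificates, so real invocations need the --cacert/--cert/--key flags shown in the backup section below.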
Access the k8s master node:
% ssh k8s-master
Check the etcd version and confirm the etcdctl tool is installed:
% etcd --version
etcd Version: 3.5.2
Git SHA: 99018a77b
Go Version: go1.16.3
Go OS/Arch: linux/amd64
% etcdctl version
etcdctl version: 3.5.2
API version: 3.5
Let's take a look at the etcd files on disk:
% cd /var/lib/etcd
% tree
.
└── member
    ├── snap
    │   ├── 0000000000000013-00000000000caa85.snap
    │   ├── 0000000000000013-00000000000cd196.snap
    │   ├── 0000000000000013-00000000000cf8a7.snap
    │   ├── 0000000000000013-00000000000d1fb8.snap
    │   ├── 0000000000000013-00000000000d46c9.snap
    │   └── db
    └── wal
        ├── 0000000000000005-000000000007743f.wal
        ├── 0000000000000006-000000000008e55a.wal
        ├── 0000000000000007-00000000000a5eaa.wal
        ├── 0000000000000008-00000000000bc9c6.wal
        ├── 0000000000000009-00000000000d4758.wal
        └── 0.tmp

3 directories, 12 files
etcd runs as a pod. Let's take a look. The kube-system namespace is the space where the APIs that operate Kubernetes itself run; it contains the pods, volumes, services, and other objects needed to actually run Kubernetes.
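For reference, you can list every namespace on the cluster (names beyond the kube-* defaults will vary by cluster):

% kubectl get namespaces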
Return to your local console and list the pods in kube-system:
% exit
% kubectl get pod -n kube-system
NAME                                 READY   STATUS    RESTARTS         AGE
coredns-78fcd69978-5fjhs             1/1     Running   16 (101d ago)    162d
coredns-78fcd69978-d6xk5             1/1     Running   16 (101d ago)    162d
etcd-k8s-master                      1/1     Running   17 (98d ago)     162d
kube-apiserver-k8s-master            1/1     Running   17 (98d ago)     162d
kube-controller-manager-k8s-master   1/1     Running   24 (4h45m ago)   162d
kube-flannel-ds-9jr8f                1/1     Running   15               162d
kube-flannel-ds-dvg5d                1/1     Running   12 (102d ago)    162d
kube-flannel-ds-gtlgg                1/1     Running   14 (101d ago)    162d
kube-proxy-q28vc                     1/1     Running   14 (101d ago)    162d
kube-proxy-tfhrt                     1/1     Running   16 (98d ago)     162d
kube-proxy-vd6zl                     1/1     Running   12 (102d ago)    162d
kube-scheduler-k8s-master            1/1     Running   24 (4h45m ago)   162d
In the list of pods running in the kube-system namespace, you can see etcd-k8s-master.
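Everything the API server persists is stored in etcd under the /registry key prefix. A hedged sketch of peeking at those keys, using the TLS flags that are derived later in this post:

% sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry --prefix --keys-only | head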
The /etc/kubernetes/manifests directory contains etcd.yaml:
# ssh k8s-master
# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.0.2.10:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.0.2.10:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://10.0.2.10:2380
    - --initial-cluster=k8s-master=https://10.0.2.10:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://10.0.2.10:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.0.2.10:2380
    - --name=k8s-master
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.5.0-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
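etcd.yaml sits in /etc/kubernetes/manifests because that is the kubelet's static pod directory: the kubelet watches it and runs whatever manifests it finds there, with no controller involved. Assuming the kubeadm-default kubelet config location, you can confirm the path like this:

% sudo grep staticPodPath /var/lib/kubelet/config.yaml
staticPodPath: /etc/kubernetes/manifests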
Backing up etcd means preserving a copy of /var/lib/etcd as a snapshot file:
% etcdctl snapshot save <snapshot-file-path>
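Note that snapshot save talks to the live etcd server over its client API rather than copying files off disk, so the endpoint must be reachable and healthy. A quick pre-check, using the same placeholder TLS flags as the save command below:

% ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=<trusted-ca-file> --cert=<cert-file> --key=<key-file> \
  endpoint health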
SSH into the master node that hosts etcd:
% ssh k8s-master
Check that etcdctl is installed:
% etcdctl version
etcdctl version: 3.5.2
API version: 3.5
Create a snapshot file (/tmp/etcd-backup) of /var/lib/etcd. The generic form of the command:
% ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=<trusted-ca-file> --cert=<cert-file> --key=<key-file> \
snapshot save <backup-file-location>
Here is how to look up the values for cacert, cert, and key. For --cacert, take the value of the flag named exactly trusted-ca-file (not peer-trusted-ca-file):
% ps -ef | grep kube | grep trusted-ca-file
root 2392 2349 3 Jul05 ? 00:40:21 etcd --advertise-client-urls=https://10.0.2.10:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://10.0.2.10:2380 --initial-cluster=k8s-master=https://10.0.2.10:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://10.0.2.10:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://10.0.2.10:2380 --name=k8s-master --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
For --cert, take the value of the flag named exactly cert-file:
% ps -ef | grep kube | grep cert-file
root 2359 2264 11 Jul05 ? 02:10:53 kube-apiserver --advertise-address=10.0.2.10 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
root 2392 2349 3 Jul05 ? 00:40:26 etcd --advertise-client-urls=https://10.0.2.10:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://10.0.2.10:2380 --initial-cluster=k8s-master=https://10.0.2.10:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://10.0.2.10:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://10.0.2.10:2380 --name=k8s-master --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
root 86597 86541 7 00:14 ? 00:24:18 kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/kubernetes/pki/ca.crt --cluster-cidr=10.244.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --leader-elect=true --port=0 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --use-service-account-credentials=true
For --key, take the value of the flag named exactly key-file:
% ps -ef | grep kube | grep key-file
root 2359 2264 11 Jul05 ? 02:11:16 kube-apiserver --advertise-address=10.0.2.10 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-account-signing-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
root 2392 2349 3 Jul05 ? 00:40:34 etcd --advertise-client-urls=https://10.0.2.10:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://10.0.2.10:2380 --initial-cluster=k8s-master=https://10.0.2.10:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://10.0.2.10:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://10.0.2.10:2380 --name=k8s-master --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
root 86597 86541 7 00:14 ? 00:24:28 kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/kubernetes/pki/ca.crt --cluster-cidr=10.244.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --leader-elect=true --port=0 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/12 --use-service-account-credentials=true
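Alternatively, since all of these flags come from the static pod manifest shown earlier, you can read the same paths straight out of etcd.yaml:

% sudo grep -E "trusted-ca-file|cert-file|key-file" /etc/kubernetes/manifests/etcd.yaml
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt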
Putting these together, the completed command is:
% sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /tmp/etcd-backup
Running the command produces the following output and creates the snapshot:
{"level":"info","ts":1657053500.6789536,"caller":"snapshot/v3_snapshot.go:68","msg":"created temporary db file","path":"/tmp/etcd-backup.part"}
{"level":"info","ts":1657053500.7074778,"logger":"client","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1657053500.707609,"caller":"snapshot/v3_snapshot.go:76","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":1657053501.0209572,"logger":"client","caller":"v3/maintenance.go:219","msg":"completed snapshot read; closing"}
{"level":"info","ts":1657053501.0416455,"caller":"snapshot/v3_snapshot.go:91","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"4.9 MB","took":"now"}
{"level":"info","ts":1657053501.0417738,"caller":"snapshot/v3_snapshot.go:100","msg":"saved","path":"/tmp/etcd-backup"}
Snapshot saved at /tmp/etcd-backup
Listing the /tmp directory confirms that etcd-backup has been created:
% ls -l /tmp
total 4752
-rw-------. 1 root root 4861984 Jul 6 05:38 etcd-backup
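If you want to sanity-check the file before relying on it, etcdctl can inspect a snapshot and report its hash, revision, key count, and size (on etcd 3.5 this prints a deprecation notice pointing at etcdutl, but it still works):

% sudo ETCDCTL_API=3 etcdctl --write-out=table snapshot status /tmp/etcd-backup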
Restoring means rolling the cluster back to the state it was in when the snapshot was taken. The plan:
- Restore the snapshot file into /var/lib/etcd-new.
- Edit etcd.yaml so the etcd data path points at the new directory (/var/lib/etcd-new).
First, so we can tell whether the restore really worked, let's delete one of the currently running workloads:
% kubectl get pods
NAME                        READY   STATUS        RESTARTS       AGE
eshop-cart-app              1/1     Running       4 (101d ago)   104d
front-end-8dc556958-f98xs   1/1     Running       0              18h
front-end-8dc556958-fvlpx   1/1     Terminating   2 (102d ago)   124d
front-end-8dc556958-vcr4s   1/1     Terminating   3 (102d ago)   124d
front-end-8dc556958-vtqf8   1/1     Running       0              18h
nginx-79488c9578-5xzjn      1/1     Running       0              18h
nginx-79488c9578-qwnfk      1/1     Running       2 (101d ago)   102d
nginx-79488c9578-xpsvp      1/1     Terminating   1 (102d ago)   102d
% kubectl delete deployments.apps nginx
deployment.apps "nginx" deleted
% kubectl get pods
NAME                        READY   STATUS        RESTARTS       AGE
eshop-cart-app              1/1     Running       4 (101d ago)   104d
front-end-8dc556958-f98xs   1/1     Running       0              18h
front-end-8dc556958-fvlpx   1/1     Terminating   2 (102d ago)   124d
front-end-8dc556958-vcr4s   1/1     Terminating   3 (102d ago)   124d
front-end-8dc556958-vtqf8   1/1     Running       0              18h
nginx-79488c9578-5xzjn      1/1     Terminating   0              18h
nginx-79488c9578-qwnfk      1/1     Terminating   2 (101d ago)   102d
nginx-79488c9578-xpsvp      1/1     Terminating   1 (102d ago)   102d
Using the /tmp/etcd-backup snapshot file, restore into /var/lib/etcd-new. (There is no need to create the new directory in advance.) The generic form:
% sudo ETCDCTL_API=3 etcdctl \
--data-dir <data-dir-location> \
snapshot restore <snapshot-file-location>
The completed command:
% sudo ETCDCTL_API=3 etcdctl \
--data-dir /var/lib/etcd-new \
snapshot restore /tmp/etcd-backup
Running it produces output like the following, and the restore completes:
Deprecated: Use `etcdutl snapshot restore` instead.
2022-07-06T05:53:57+09:00 info snapshot/v3_snapshot.go:251 restoring snapshot {"path": "/tmp/etcd-backup", "wal-dir": "/var/lib/etcd-new/member/wal", "data-dir": "/var/lib/etcd-new", "snap-dir": "/var/lib/etcd-new/member/snap", "stack": "go.etcd.io/etcd/etcdutl/v3/snapshot.(*v3Manager).Restore\n\t/tmp/etcd-release-3.5.2/etcd/release/etcd/etcdutl/snapshot/v3_snapshot.go:257\ngo.etcd.io/etcd/etcdutl/v3/etcdutl.SnapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.2/etcd/release/etcd/etcdutl/etcdutl/snapshot_command.go:147\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.snapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.2/etcd/release/etcd/etcdctl/ctlv3/command/snapshot_command.go:128\ngithub.com/spf13/cobra.(*Command).execute\n\t/usr/local/google/home/siarkowicz/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/usr/local/google/home/siarkowicz/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\t/usr/local/google/home/siarkowicz/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/tmp/etcd-release-3.5.2/etcd/release/etcd/etcdctl/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/tmp/etcd-release-3.5.2/etcd/release/etcd/etcdctl/ctlv3/ctl.go:111\nmain.main\n\t/tmp/etcd-release-3.5.2/etcd/release/etcd/etcdctl/main.go:59\nruntime.main\n\t/usr/local/google/home/siarkowicz/.gvm/gos/go1.16.3/src/runtime/proc.go:225"}
2022-07-06T05:53:57+09:00 info membership/store.go:141 Trimming membership information from the backend...
2022-07-06T05:53:57+09:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2022-07-06T05:53:57+09:00 info snapshot/v3_snapshot.go:272 restored snapshot {"path": "/tmp/etcd-backup", "wal-dir": "/var/lib/etcd-new/member/wal", "data-dir": "/var/lib/etcd-new", "snap-dir": "/var/lib/etcd-new/member/snap"}
Viewing /var/lib/etcd-new as a tree shows the restored data:
% sudo tree /var/lib/etcd-new
/var/lib/etcd-new
└── member
    ├── snap
    │   ├── 0000000000000001-0000000000000001.snap
    │   └── db
    └── wal
        └── 0000000000000000-0000000000000000.wal

3 directories, 3 files
Edit etcd.yaml to point at the restored etcd location. (Static pods restart automatically when their manifest file is modified.)
% sudo vim /etc/kubernetes/manifests/etcd.yaml
...
  - hostPath:
      path: /var/lib/etcd-new
      type: DirectoryOrCreate
    name: etcd-data
...
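Once the file is saved, the kubelet notices the change and recreates the etcd pod on its own; the API server may be briefly unreachable while that happens. One way to watch it come back:

% kubectl get pod etcd-k8s-master -n kube-system -w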
Check that the cluster has been restored to its earlier state:
% kubectl get pods
NAME                        READY   STATUS        RESTARTS       AGE
eshop-cart-app              1/1     Running       4 (101d ago)   104d
front-end-8dc556958-f98xs   1/1     Running       0              18h
front-end-8dc556958-fvlpx   1/1     Terminating   2 (102d ago)   124d
front-end-8dc556958-vcr4s   1/1     Terminating   3 (102d ago)   124d
front-end-8dc556958-vtqf8   1/1     Running       0              18h
nginx-79488c9578-5xzjn      1/1     Running       0              18h
nginx-79488c9578-qwnfk      1/1     Running       2 (101d ago)   102d
nginx-79488c9578-xpsvp      1/1     Terminating   1 (102d ago)   102d
The nginx pods have changed from Terminating back to Running. The restore was successful.
Exercise: create a snapshot of the etcd instance running at https://127.0.0.1:2379 and save it to /data/etcd-snapshot.db. Then restore the existing previous snapshot located at /data/etcd-snapshot-previous.db. The following TLS certificates/keys are provided for connecting to the server with etcdctl:
- /etc/kubernetes/pki/etcd/ca.crt
- /etc/kubernetes/pki/
- /etc/kubernetes/pki/etcd/server.key
You can write the backup and restore commands by referring to the steps above (Kubernetes Cluster - etcd backup & restore).
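A minimal sketch of an answer, following the steps above. The --cert path and the /var/lib/etcd-previous data directory are assumptions on my part: the truncated certificate path in the problem statement is presumably the etcd server certificate used earlier in this post, and the restore target directory is your own choice.

% sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /data/etcd-snapshot.db

% sudo ETCDCTL_API=3 etcdctl \
  --data-dir /var/lib/etcd-previous \
  snapshot restore /data/etcd-snapshot-previous.db

Then edit /etc/kubernetes/manifests/etcd.yaml so the etcd-data hostPath points at /var/lib/etcd-previous, and wait for the static pod to restart.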