etcd is the database that stores Kubernetes cluster data in key-value form, which makes it critical to operating a Kubernetes cluster.
This post collects various ways to work with etcd through etcdctl.
Setup before using etcdctl
Register an etcdctl command that includes the etcd API version, the endpoint, and the authentication information as a shell alias.
alias etcdctl='ETCDCTL_API=3 etcdctl \
--endpoints=https://{ node IP }:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key'
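On a kubeadm-managed cluster, the endpoint and certificate paths used above can be confirmed from the etcd static Pod manifest before registering the alias; a quick check, assuming the default manifest path /etc/kubernetes/manifests/etcd.yaml:
$ grep -E 'listen-client-urls|cert-file|key-file|trusted-ca-file' /etc/kubernetes/manifests/etcd.yaml
The paths printed there should match the --cacert, --cert, and --key values in the alias.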
$ etcdctl version
## Example output
etcdctl version: 3.4.13
API version: 3.4
$ etcdctl member list -w=table
Adding the -w=table option makes the output easier to read.
## Example output
+------------------+---------+-------------+------------------------------+------------------------------+------------+
|        ID        | STATUS  |    NAME     |          PEER ADDRS          |         CLIENT ADDRS         | IS LEARNER |
+------------------+---------+-------------+------------------------------+------------------------------+------------+
| 85792d9beda616ff | started | k8s-master2 | https://192.168.178.166:2380 | https://192.168.178.166:2379 |      false |
| 90046ab4d4049c30 | started | k8s-master3 | https://192.168.178.167:2380 | https://192.168.178.167:2379 |      false |
| e5a6eae61bf0e6a6 | started | k8s-master1 | https://192.168.178.165:2380 | https://192.168.178.165:2379 |      false |
+------------------+---------+-------------+------------------------------+------------------------------+------------+
$ etcdctl endpoint health --cluster -w=table
## Example output
+------------------------------+--------+------------+-------+
|           ENDPOINT           | HEALTH |    TOOK    | ERROR |
+------------------------------+--------+------------+-------+
| https://192.168.178.165:2379 |   true | 8.721097ms |       |
| https://192.168.178.166:2379 |   true |  9.55216ms |       |
| https://192.168.178.167:2379 |   true | 9.634818ms |       |
+------------------------------+--------+------------+-------+
$ etcdctl endpoint status --cluster -w=table
## Example output
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|           ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.178.166:2379 | 85792d9beda616ff |   3.4.3 |  9.7 MB |     false |      false |        21 |     325815 |             325815 |        |
| https://192.168.178.167:2379 | 90046ab4d4049c30 |   3.4.3 |  9.7 MB |     false |      false |        21 |     325815 |             325815 |        |
| https://192.168.178.165:2379 | e5a6eae61bf0e6a6 |   3.4.3 |  9.7 MB |      true |      false |        21 |     325815 |             325815 |        |
+------------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
$ etcdctl snapshot save { backup path }/etcd-`date +%Y%m%d_%H%M%S`
The etcd-`date +%Y%m%d_%H%M%S` part is just the file name the snapshot will be saved as; you can replace it with any name you prefer.
## Example output
{"level":"info","ts":1688107398.147482,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/root/etcd/etcd-20230630_154318.part"}
{"level":"info","ts":"2023-06-30T15:43:18.154+0900","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1688107398.154079,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://192.168.178.165:2379"}
{"level":"info","ts":"2023-06-30T15:43:18.221+0900","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1688107398.3090427,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://192.168.178.165:2379","size":"9.7 MB","took":0.16142193}
{"level":"info","ts":1688107398.3091292,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/root/etcd/etcd-20230630_154318"}
Snapshot saved at /root/etcd/etcd-20230630_154318
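Before relying on the snapshot, it can be sanity-checked with etcdctl's snapshot status subcommand, which prints the hash, revision, total keys, and total size of the file; a quick check using the example path above:
$ etcdctl snapshot status /root/etcd/etcd-20230630_154318 -w=table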
$ etcdctl snapshot restore { etcd snapshot file } \
--name { hostname of the node running the command } \
--data-dir /var/lib/etcd/recover \
--initial-cluster { master1 hostname }=https://{ master1 IP }:2380,{ master2 hostname }=https://{ master2 IP }:2380,{ master3 hostname }=https://{ master3 IP }:2380 \
--initial-advertise-peer-urls https://{ IP of the node running the command }:2380
In a multi-master setup, this must be run on every master node with the same snapshot file.
## Example run
$ etcdctl snapshot restore etcd-20220908_094848 --name master1 --data-dir /var/lib/etcd/recover --initial-cluster master1=https://172.21.4.2:2380,master2=https://172.21.4.3:2380,master3=https://172.21.4.13:2380 --initial-advertise-peer-urls https://172.21.4.2:2380
{"level":"info","ts":1663568652.8124788,"caller":"snapshot/v3_snapshot.go:296","msg":"restoring snapshot","path":"etcd-20220908_094848","wal-dir":"/var/lib/etcd/recover/member/wal","data-dir":"/var/lib/etcd/recover","snap-dir":"/var/lib/etcd/recover/member/snap"}
{"level":"info","ts":1663568653.2258673,"caller":"mvcc/kvstore.go:380","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":383414000}
{"level":"info","ts":1663568653.2696786,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"50cb115086dee509","local-member-id":"0","added-peer-id":"55c32c756b1d29c5","added-peer-peer-urls":["https://172.21.4.3:2380"]}
{"level":"info","ts":1663568653.2697916,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"50cb115086dee509","local-member-id":"0","added-peer-id":"80d5ede2757aacc6","added-peer-peer-urls":["https://172.21.4.2:2380"]}
{"level":"info","ts":1663568653.2698317,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"50cb115086dee509","local-member-id":"0","added-peer-id":"ac792b69d6b05f9b","added-peer-peer-urls":["https://172.21.4.13:2380"]}
{"level":"info","ts":1663568653.3989177,"caller":"snapshot/v3_snapshot.go:309","msg":"restored snapshot","path":"etcd-20220908_094848","wal-dir":"/var/lib/etcd/recover/member/wal","data-dir":"/var/lib/etcd/recover","snap-dir":"/var/lib/etcd/recover/member/snap"}
The restored data is created in /var/lib/etcd/recover, the directory set with --data-dir.
Next, delete the existing /var/lib/etcd/member directory and move the member directory from the recover directory into /var/lib/etcd.
Finally, restart the etcd container via crictl, as sketched below.
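A condensed sketch of those post-restore steps, assuming the default kubeadm data directory /var/lib/etcd and a CRI runtime managed with crictl; the { etcd container ID } placeholder is whatever crictl reports on that node:
## Move the old data directory aside (or delete it, as described above)
$ mv /var/lib/etcd/member /var/lib/etcd/member.old
## Put the restored data in place
$ mv /var/lib/etcd/recover/member /var/lib/etcd/member
## Find and stop the etcd container; kubelet recreates it because etcd runs as a static Pod
$ crictl ps --name etcd
$ crictl stop { etcd container ID }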
Once the restore has finished on every master, you can confirm it succeeded by checking the member list and endpoint status again.
A separate post walks through etcd backup and restore in more detail with hands-on practice.