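I joined the second cohort of Gasida's Production Kubernetes Online Study
and am now taking part in the study.
It runs once a week for 4 weeks in total (later changed to 5 weeks),
and you have to finish each week's assignment to avoid being dropped. (I think that's a good thing.)
The goal of this study is to build a k8s cluster on AWS infrastructure with kops, but since that alone would be no fun, Gasida has prepared a lot more. (Amazing :)
Let's work through the labs in detail and write up whatever fun lies ahead.
kops requires an AWS account and an IAM access key.
kubectl is also needed beforehand.
Reference: https://kubernetes.io/docs/setup/production-environment/tools/kops/
# Download the yaml file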
curl -O https://s3.ap-northeast-2.amazonaws.com/cloudformation.cloudneta.net/K8S/kops-new-ec2.yaml
Ah, on my VM I first have to install a CloudFormation client. (Strictly speaking, aws cloudformation deploy only needs the AWS CLI.)
References:
https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
https://docs.aws.amazon.com/cloudformation-cli/latest/userguide/what-is-cloudformation-cli.html
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html
# Python 3.6 or above is required, so I use a virtual environment.
(ansible_venv) [root@san-1 ~]# python --version
Python 3.6.8
pip install cloudformation-cli cloudformation-cli-java-plugin cloudformation-cli-go-plugin cloudformation-cli-python-plugin cloudformation-cli-typescript-plugin
Deploy with the CLI
aws cloudformation deploy --template-file ./kops-new-ec2.yaml --stack-name mykops --parameter-overrides KeyName=spark SgIngressSshCidr=$(curl -s ipinfo.io/ip)/32 --region ap-northeast-2
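The deploy command above builds SgIngressSshCidr from whatever ipinfo.io returns. A minimal sketch of a guard, assuming a hypothetical helper named to_cidr32 (not part of the study material), that refuses anything that is not a plain IPv4 address before it reaches the stack parameters:

```shell
# Hypothetical helper: print "<ip>/32" only if the argument is a dotted-quad
# IPv4 address. The function body runs in a subshell so IFS and the
# variables used here do not leak into the caller.
to_cidr32() (
  ip=$1
  # Reject empty input, non digit/dot characters, and misplaced dots.
  case "$ip" in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  IFS=.
  set -- $ip
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  printf '%s/32\n' "$ip"
)

# Usage sketch (same stack and parameters as in this post):
# MY_CIDR=$(to_cidr32 "$(curl -s ipinfo.io/ip)") || { echo "could not detect public IP" >&2; exit 1; }
# aws cloudformation deploy --template-file ./kops-new-ec2.yaml --stack-name mykops \
#   --parameter-overrides KeyName=spark SgIngressSshCidr="$MY_CIDR" --region ap-northeast-2
```

If the lookup fails or returns an error page, the deploy is skipped instead of creating a security group rule from a broken value.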
To use the aws cli, credentials must be configured first.
Create an access key in the IAM console,
then use the generated Access Key and Secret Access Key
to run aws configure.
(ansible_venv) [root@san-1 pkos]# aws configure
AWS Access Key ID [None]: AK--7W
AWS Secret Access Key [None]: F4--------d
Default region name [None]: ap-northeast-2
Default output format [None]: json
Check the result once the stack deployment completes.
(ansible_venv) [root@san-1 pkos]# aws cloudformation describe-stacks --stack-name mykops --query 'Stacks[*].Outputs[*].OutputValue' --output text
13.124.5.243
# Connect
(ansible_venv) [root@san-1 pkos]# ssh -i spark.pem ec2-user@$(aws cloudformation describe-stacks --stack-name mykops --query 'Stacks[*].Outputs[0].OutputValue' --output text)
__| __|_ )
_| ( / Amazon Linux 2 AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-2/
[root@kops-ec2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:01:f3:ec:0a:c4 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.10/24 brd 10.0.0.255 scope global dynamic eth0
valid_lft 2972sec preferred_lft 2972sec
inet6 fe80::1:f3ff:feec:ac4/64 scope link
valid_lft forever preferred_lft forever
[root@kops-ec2 ~]#
sudo tail -f /var/log/cloud-init-output.log
sudo su -
# Check that the basic tools and SSH key are installed
kubectl version --client=true -o yaml | yh
... gitVersion: v1.26.2 ...
# kops version
[root@kops-ec2 ~]# kops version
Client version: 1.25.4 (git-v1.25.4)
[root@kops-ec2 ~]# aws --version
aws-cli/2.11.1 Python/3.11.2 Linux/4.14.305-227.531.amzn2.x86_64 exe/x86_64.amzn.2 prompt/off
ls /root/.ssh/id_rsa*
# Try it without any credentials configured
aws ec2 describe-instances
# IAM User credentials: for lab convenience, enter the credentials of an IAM User with administrator permissions
[root@kops-ec2 ~]# aws configure
AWS Access Key ID [None]: AK--7W
AWS Secret Access Key [None]: F4--------d
Default region name [None]: ap-northeast-2
Default output format [None]: json
# Verify the credentials are applied: check node IPs
[root@kops-ec2 ~]# aws ec2 describe-instances | grep IpAddress
# Disable the aws cli output pager
export AWS_PAGER=""
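# Store the name of the region where resources will be placed in a variable
REGION=ap-northeast-2 # use the Seoul region
# Create the bucket where the k8s configuration files will be stored
## aws s3 mb s3://<globally unique bucket name> --region <AWS region for the S3 bucket>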
aws s3 mb s3://pkos2 --region $REGION
aws s3 ls
[root@kops-ec2 ~]# aws s3 mb s3://pkos2 --region $REGION
make_bucket: pkos2
[root@kops-ec2 ~]# aws s3 ls
2023-03-10 14:27:51 pkos2
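# Store the information used during deployment in environment variables
## export NAME=<your public hosted domain name>
## export KOPS_STATE_STORE=s3://(name of the bucket you created above)
export KOPS_CLUSTER_NAME=<your public hosted domain name>
export KOPS_STATE_STORE=<s3://(name of the bucket you created above)>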
export AWS_PAGER=""
export REGION=ap-northeast-2
## Example)
export AWS_PAGER=""
export REGION=ap-northeast-2
export KOPS_CLUSTER_NAME=sparkandassociates.net
export KOPS_STATE_STORE=s3://pkos2
echo 'export AWS_PAGER=""' >>~/.bashrc
echo 'export REGION=ap-northeast-2' >>~/.bashrc
echo 'export KOPS_CLUSTER_NAME=sparkandassociates.net' >>~/.bashrc
echo 'export KOPS_STATE_STORE=s3://pkos2' >>~/.bashrc
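Running the echo lines above more than once leaves duplicate exports in ~/.bashrc. A small sketch of an idempotent alternative, assuming a hypothetical helper named append_once that is not part of the study material:

```shell
# Hypothetical helper: append LINE to FILE only when that exact line is not
# already present. Runs in a subshell so its variables stay local.
append_once() (
  line=$1
  file=$2
  touch "$file"
  # -x matches whole lines, -F treats the line as a fixed string.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
)

# Usage sketch with the values from this post:
# append_once 'export AWS_PAGER=""' ~/.bashrc
# append_once 'export REGION=ap-northeast-2' ~/.bashrc
# append_once 'export KOPS_CLUSTER_NAME=sparkandassociates.net' ~/.bashrc
# append_once 'export KOPS_STATE_STORE=s3://pkos2' ~/.bashrc
```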
# Optional [Terminal 1] monitor EC2 creation
while true; do aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value,Status:State.Name}" --filters Name=instance-state-name,Values=running --output text ; echo "------------------------------" ; sleep 1; done
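# Create the kops configuration (stored in S3) and deploy the k8s cluster: takes about 6 minutes
## CNI is aws vpc cni, 1 master node (t3.medium), 2 worker nodes (t3.medium), network CIDR 172.30.0.0/16 (with the VPC CNI, pods get their IPs from this range too)
## --container-runtime containerd --kubernetes-version 1.24.0 ~ 1.25.6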
kops create cluster --zones="$REGION"a,"$REGION"c --networking amazonvpc --cloud aws \
--master-size t3.medium --node-size t3.medium --node-count=2 --network-cidr 172.30.0.0/16 \
--ssh-public-key ~/.ssh/id_rsa.pub --name=$KOPS_CLUSTER_NAME --kubernetes-version "1.24.10" --dry-run -o yaml > mykops.yaml
kops create cluster --zones="$REGION"a,"$REGION"c --networking amazonvpc --cloud aws \
--master-size t3.medium --node-size t3.medium --node-count=2 --network-cidr 172.30.0.0/16 \
--ssh-public-key ~/.ssh/id_rsa.pub --name=$KOPS_CLUSTER_NAME --kubernetes-version "1.24.10" -y
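Both kops create cluster invocations above depend on REGION, KOPS_CLUSTER_NAME and KOPS_STATE_STORE being set in the current shell. A minimal preflight sketch, assuming a hypothetical helper named require_vars, that reports anything missing before kops is run:

```shell
# Hypothetical helper: return non-zero if any of the named variables is
# unset or empty, and say which ones on stderr. The names are expected to
# come from the script itself, since they are expanded through eval.
require_vars() (
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
)

# Usage sketch:
# require_vars REGION KOPS_CLUSTER_NAME KOPS_STATE_STORE || exit 1
```

This matters after a fresh login in particular, because a variable that was only exported in an earlier session and never written to ~/.bashrc silently expands to an empty string.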
# validate
kops validate cluster --wait 10m
[root@kops-ec2 ~]# kops validate cluster --wait 10m
Validating cluster sparkandassociates.net
INSTANCE GROUPS
NAME ROLE MACHINETYPE MIN MAX SUBNETS
master-ap-northeast-2a Master t3.medium 1 1 ap-northeast-2a
nodes-ap-northeast-2a Node t3.medium 1 1 ap-northeast-2a
nodes-ap-northeast-2c Node t3.medium 1 1 ap-northeast-2c
NODE STATUS
NAME ROLE READY
i-014efbf2d513ddb46 node True
i-05bf2ec120f5a7a9e node True
i-0d547c750bb8c97c1 master True
Your cluster sparkandassociates.net is ready
[root@kops-ec2 ~]#
# Set your domain variable: enter a domain that you own
MyDomain=<your domain>
MyDomain=sparkandassociates.net
# Look up your Route 53 hosted zone ID and store it in a variable
aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." | jq
aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." --query "HostedZones[0].Name"
aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." --query "HostedZones[0].Id" --output text
MyDnzHostedZoneId=`aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." --query "HostedZones[0].Id" --output text`
echo $MyDnzHostedZoneId
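The Id returned by the query above comes back in the form /hostedzone/<ID>. The aws route53 commands in this post work with that value as it is, but other tools or templates may want the bare ID. A tiny sketch, assuming a hypothetical helper named zone_id_only:

```shell
# Hypothetical helper: strip a leading "/hostedzone/" if present and print
# the bare hosted zone ID; a bare ID passes through unchanged.
zone_id_only() {
  printf '%s\n' "${1#/hostedzone/}"
}

# Usage sketch:
# MyBareZoneId=$(zone_id_only "$MyDnzHostedZoneId")
```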
# Query A records
aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A']" | jq
aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A'].Name" | jq
aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A'].Name" --output text
# Query A record values in a loop
while true; do aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A']" | jq ; date ; echo ; sleep 1; done
Confirmed in the Route 53 console.
# Check node IPs
aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,PrivateIPAdd:PrivateIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value,Status:State.Name}" --filters Name=instance-state-name,Values=running --output table
---------------------------------------------------------------------------------------------------------
| DescribeInstances |
+--------------------------------------------------------+----------------+-----------------+-----------+
| InstanceName | PrivateIPAdd | PublicIPAdd | Status |
+--------------------------------------------------------+----------------+-----------------+-----------+
| nodes-ap-northeast-2c.sparkandassociates.net | 172.30.79.8 | 15.164.251.18 | running |
| kops-ec2 | 10.0.0.10 | 13.124.5.243 | running |
| master-ap-northeast-2a.masters.sparkandassociates.net | 172.30.43.70 | 54.180.156.238 | running |
| nodes-ap-northeast-2a.sparkandassociates.net | 172.30.33.224 | 43.200.253.62 | running |
+--------------------------------------------------------+----------------+-----------------+-----------+
# The IP held by the api.sparkandassociates.net A record
# maps to the master node, i.e. the k8s API node: 54.180.156.238
# Check pod IPs
kubectl get pod -n kube-system -o=custom-columns=NAME:.metadata.name,IP:.status.podIP,STATUS:.status.phase
[root@kops-ec2 ~]# kubectl get pod -n kube-system -o=custom-columns=NAME:.metadata.name,IP:.status.podIP,STATUS:.status.phase
NAME IP STATUS
aws-cloud-controller-manager-h6lth 172.30.43.70 Running
aws-node-7k8zb 172.30.33.224 Running
aws-node-hfrd5 172.30.43.70 Running
aws-node-jqxsf 172.30.79.8 Running
coredns-6897c49dc4-rp5d9 172.30.75.143 Running
coredns-6897c49dc4-zs7jt 172.30.60.153 Running
coredns-autoscaler-5685d4f67b-pc257 172.30.95.96 Running
dns-controller-b9c5c9476-7fnr9 172.30.43.70 Running
ebs-csi-controller-54fb95868b-cck46 172.30.46.73 Running
ebs-csi-node-4q7x2 172.30.92.39 Running
ebs-csi-node-tf499 172.30.37.137 Running
ebs-csi-node-z8fmh 172.30.44.191 Running
etcd-manager-events-i-0d547c750bb8c97c1 172.30.43.70 Running
etcd-manager-main-i-0d547c750bb8c97c1 172.30.43.70 Running
kops-controller-j5xl7 172.30.43.70 Running
kube-apiserver-i-0d547c750bb8c97c1 172.30.43.70 Running
kube-controller-manager-i-0d547c750bb8c97c1 172.30.43.70 Running
kube-proxy-i-014efbf2d513ddb46 172.30.33.224 Running
kube-proxy-i-05bf2ec120f5a7a9e 172.30.79.8 Running
kube-proxy-i-0d547c750bb8c97c1 172.30.43.70 Running
kube-scheduler-i-0d547c750bb8c97c1 172.30.43.70 Running
# Check kops cluster information
kops get cluster
[root@kops-ec2 ~]# kops get cluster
NAME CLOUD ZONES
sparkandassociates.net aws ap-northeast-2a,ap-northeast-2c
kops get cluster -o yaml
kops get cluster -o yaml | yh
...
# Check instance group information
kops get ig
[root@kops-ec2 ~]# kops get ig
NAME ROLE MACHINETYPE MIN MAX ZONES
master-ap-northeast-2a Master t3.medium 1 1 ap-northeast-2a
nodes-ap-northeast-2a Node t3.medium 1 1 ap-northeast-2a
nodes-ap-northeast-2c Node t3.medium 1 1 ap-northeast-2c
kops get ig -o yaml
kops get ig -o yaml | yh
...
# Check instance information
kops get instances
kops get instances -o yaml | yh
...
# Set up auto-completion and a short alias
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -F __start_kubectl k' >> ~/.bashrc
exit
exit
# Check cluster information
k cluster-info
k cluster-info dump
# Check node information
k get nodes -v6
[root@kops-ec2 ~]# k get nodes -v6
I0310 14:57:05.685877 5468 loader.go:373] Config loaded from file: /root/.kube/config
I0310 14:57:05.706084 5468 round_trippers.go:553] GET https://api.sparkandassociates.net/api/v1/nodes?limit=500 200 OK in 11 milliseconds
NAME STATUS ROLES AGE VERSION
i-014efbf2d513ddb46 Ready node 19m v1.24.10
i-05bf2ec120f5a7a9e Ready node 20m v1.24.10
i-0d547c750bb8c97c1 Ready control-plane 21m v1.24.10
# Which CRI container runtime is in use? => containerd
[root@kops-ec2 ~]# k get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
i-014efbf2d513ddb46 Ready node 24m v1.24.10 172.30.33.224 43.200.253.62 Ubuntu 20.04.5 LTS 5.15.0-1031-aws containerd://1.6.18
i-05bf2ec120f5a7a9e Ready node 24m v1.24.10 172.30.79.8 15.164.251.18 Ubuntu 20.04.5 LTS 5.15.0-1031-aws containerd://1.6.18
i-0d547c750bb8c97c1 Ready control-plane 25m v1.24.10 172.30.43.70 54.180.156.238 Ubuntu 20.04.5 LTS 5.15.0-1031-aws containerd://1.6.18
[root@kops-ec2 ~]#
# Check the configuration after deployment completes
tree -L 1 ~/.kube
cat .kube/config
cat .kube/config | yh
# Network/storage details are covered in week 2
# volume(sc)
kubectl get sc
kubectl get sc kops-csi-1-21 -o jsonpath={.parameters} ;echo
{"encrypted":"true","type":"gp3"}
kubectl get sc kops-ssd-1-17 -o jsonpath={.parameters} ;echo
{"encrypted":"true","type":"gp2"}
# [master node] aws vpc cni log
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME ls /var/log/aws-routed-eni
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME cat /var/log/aws-routed-eni/plugin.log | jq
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME cat /var/log/aws-routed-eni/ipamd.log | jq
# Check pod IPs
kubectl get pod -n kube-system
kubectl get pod -n kube-system -owide
# [master node] iptables rules
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME sudo iptables -t nat -S
# [master node] Check container information
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME ps axf |grep /usr/bin/containerd
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME ps afxuwww
## [master node] Install the tree tool
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME sudo apt install -y tree jq
# [master node] Check volumes/mounts: nvme1n1 and nvme2n1 are used for etcd-events and etcd-main
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME lsblk
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME df -hT --type=ext4
Filesystem Type Size Used Avail Use% Mounted on
/dev/root ext4 62G 5.5G 57G 9% /
/dev/nvme1n1 ext4 20G 167M 20G 1% /mnt/master-vol-03072035bffdd8252
/dev/nvme2n1 ext4 20G 170M 20G 1% /mnt/master-vol-0db4765644c7d6ecb
## [master node] Check the directories on nvme1n1 and nvme2n1
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME tree /mnt/master-vol-0677e154ec157c895
/mnt/master-vol-0677e154ec157c895
├── data [error opening dir]
├── lost+found [error opening dir]
├── pki
│ └── aVtGb7iDprc8tloOBs_H7w
│ ├── clients
│ │ ├── ca.crt
│ │ ├── server.crt
│ │ └── server.key
│ └── peers
│ ├── ca.crt
│ ├── me.crt
│ └── me.key
└── state
6 directories, 7 files
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME tree /mnt/master-vol-0409200c053e944be
/mnt/master-vol-0409200c053e944be
├── data [error opening dir]
├── lost+found [error opening dir]
├── pki
│ └── q6M2HIDRjxaUXhZIjaDiqw
│ ├── clients
│ │ ├── ca.crt
│ │ ├── server.crt
│ │ └── server.key
│ └── peers
│ ├── ca.crt
│ ├── me.crt
│ └── me.key
└── state
6 directories, 7 files
# [master node] Check kubelet status
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME systemctl status kubelet
Check after connecting to the master node over SSH
# [master node] SSH connection
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME
# [master node] Check EC2 metadata
TOKEN=`curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
echo $TOKEN
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v
ubuntu@i-0d547c750bb8c97c1:~$ curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/masters.sparkandassociates.net| jq
{
"Code": "Success",
"LastUpdated": "2023-03-10T06:15:19Z",
"Type": "AWS-HMAC",
"AccessKeyId": "ASIA3NGL2HIFJRHH",
"SecretAccessKey": "kxkMtLFvjhY+3/KgBYxSLg8VAKgGlV3",
"Token": "IQoJb3JpZ2luX2VjELb//////////wEaDmFwLW5vcnRoZWFzdC0yIkYwRAIgaubwNUXQQ3DdSUbQqqxSEFCqSfTYWh212fr9na+QqEACIDkMJ/E70H5EnLIXFcAM+UCffhUHzMlsJx5O+ZiZs/CtKssFCG8QBBoMNzg0MjQ2MTY0Njk1IgyXiuhVIUJp2qzCVdoqqAVNQaXC1nn4kgFaRryjQxbkXclhFkvv8bCbRtAZKe8iY9Dme+OyBj91i+ez1NrGNT6bVJ65tVMIDAjs/zZbUU0d2c7l8yUDGPOkOT9fYPnyF1EQldrsFGFwfTsp+eDwd6l3hGRQSD764INMPvLS1WSUtiyJx4UXfScrJcNTjEANQlFvwrqf6nQtboPkCVGAzOwMk6rzxAadjSybgGvn434nktgmGoSoo9cQCTu872/1y2y6Jt2Jx5CYhup5+0Z/sj7Labq0us4XvKJ/IoTp8/78LgbemNCLXhL2oIBG9kwnDoGUObV0KDMsScnmkGx6lGSMPo/pH6RPfH3+Z/XHK1CMT7TMQnVNm6sIcGb9DJqmsb/uDOZf9QfIDvyojWLgDbttysLmEorkJv5AreydbH1uLukfEcdxGIDJ8Fh1sMJRX6aDwMgUrSU5MSLEDtrqVhDLeVR5kp/VLnybi0znZg1doLvXHRypKfcqJ4VZZ9R1CtW9TXSmlJ4AsQL2samA94HjS5SW8R7rZUHbm8gE/C5tOHRZsRAye1Qhl2TLO91c/KTIFWf+pQmX+bUWnQCOy2lcrU1gdKSxzqfGYBhJ2XBnL+KhkhCVK6w3CNtuq6CYivycfTSmqtO+naRf/WMER0J+qdukHpWW0ITvf0MBTziT2Y/dnlRgw1YE8ajTIE3ICYJPz0B27OZroYbzli6DKB38aLXk5gW0yOn1n3DH+Rs+q1LElFPCLhG2nMTsDcaEnvCNPeHEWN8cNH1nj+jCTZAMnKmYgBbuK8ywTiSCZIu8IdoFsNwn5c78w4GwYxfGySWk5hDmQwMlTwKyrXQhuQzCPlqugBjqyAV7tVXfDBzsM5xN6/MQCe+NIjQX3kZsodB5N+R1oF8x1MiXjezJtoVkDwN4lIhqTMGv7IS9naD3184Gez6sjNNqqk4DgJiu9keYEs6b+c0xRMsu7vHQdPD30kmeFg9jzn+W6u7rkCDvF09X73V5Q0A4UaDAT9VX+xA18IfL+piaWpsnO9RlGhgOLKrad1hyCEo6bbYTFZzBOpt2cV1dSbB/7ZyGOAXIkmRTfPI8jeN9c8gQ=",
"Expiration": "2023-03-10T12:37:27Z"
}
ubuntu@i-0d547c750bb8c97c1:~$
Check after connecting to the worker nodes over SSH: connect using each worker node's public IP
# Check worker node public IPs
aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value}" --filters Name=instance-state-name,Values=running --output table
[root@kops-ec2 ~]# aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value}" --filters Name=instance-state-name,Values=running --output table
-----------------------------------------------------------------------------
| DescribeInstances |
+---------------------------------------------------------+-----------------+
| InstanceName | PublicIPAdd |
+---------------------------------------------------------+-----------------+
| nodes-ap-northeast-2c.sparkandassociates.net | 15.164.251.18 |
| kops-ec2 | 13.124.5.243 |
| master-ap-northeast-2a.masters.sparkandassociates.net | 54.180.156.238 |
| nodes-ap-northeast-2a.sparkandassociates.net | 43.200.253.62 |
+---------------------------------------------------------+-----------------+
[root@kops-ec2 ~]#
# Set worker node public IP variables
W1PIP=43.200.253.62
W2PIP=15.164.251.18
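The two IPs above are copied by hand from the table. As a sketch of how they could be picked out automatically, the hypothetical worker_ips filter below reads "name<TAB>public-ip" lines and keeps only instances whose Name tag starts with nodes-, which is how the worker instances are named in this post. The list-style --query in the usage comment is my own variation on the query used above, chosen so the text output has a fixed column order.

```shell
# Hypothetical filter: read "name<TAB>public-ip" lines on stdin and print
# the IP of every instance whose name starts with "nodes-".
worker_ips() {
  awk -F '\t' '$1 ~ /^nodes-/ && $2 != "None" { print $2 }'
}

# Usage sketch:
# aws ec2 describe-instances \
#   --filters Name=instance-state-name,Values=running \
#   --query "Reservations[].Instances[].[Tags[?Key=='Name']|[0].Value, PublicIpAddress]" \
#   --output text | worker_ips
```

The order of the printed IPs follows the order returned by the API, so which one becomes W1PIP or W2PIP still needs a look at the instance names.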
# SSH into the worker nodes
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP
exit
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP
exit
# Check worker node storage
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP lsblk
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP lsblk
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP df -hT -t ext4
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4
# SSH into a worker node, then check the node's EC2 metadata: IAM Role
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP
TOKEN=`curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
echo $TOKEN
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/nodes.sparkandassociates.net
ubuntu@i-014efbf2d513ddb46:~$ curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/nodes.sparkandassociates.net
{
"Code" : "Success",
"LastUpdated" : "2023-03-10T06:10:31Z",
"Type" : "AWS-HMAC",
"AccessKeyId" : "ASIA3NG2PY4PIYD",
"SecretAccessKey" : "ayl+IVkBokMOljz5mc/Tu0BRl8f8DwiW6",
"Token" : "IQoJb3JpZ2luX2VjELb//////////wEaDmFwLW5vcnRoZWFzdC0yIkYwRAIgAwaSeuaHb3eZ/t36zlzUK+6Ry3kRNSysDIJW1qSdaVwCIFMUVOBAFiwArt95lZSXociouVxvmUggIP2pNXADxrF4KswFCG8QBBoMNzg0MjQ2MTY0Njk1Igxzi3uq+jhZ2jJgoukqqQU8EX/9nTjNQNDrzvF0W+X+/VCH+OxawhWWKqKlsb3KKlk1d9CyVp1+rNJ1vhL19FV9EFnGPdfMJ95kc4RXuAzu99vwhetK4s52SBpTSC60btNSouCoJ1Oyzf6uhYuRHQx8YykM9URvesQupakVo6KHfrcd6SV9khXPcbA9OmoDsrb4u7pT8e2bte3LPMqogt9Cc/OAGm6vk6V7bNIgJlNHs7+ypgcW+BEVeNHc0al0mAjzE/Ctuo8eqTX8YgS/FC2F3hZ14Hr+jNsGf3tUtx06sq9pk74HpGHbiqmIggtbOzmvPWM2tNGwIHlIbLpEGRHu4ipAgXOLDpQbdOqs+IdxrXoQ6F60fAWpp+65jVjDmVbn8jbBpfPTPJT5U0BeGPp8vk2CMUKxHZQOBU+GBdMdHw0J892J1eZUcOU5XAwDF3Z4ijgzOpYtM1AcO2WZ/V5ywkWPckHM3huy7BoXiPrs26YqStwFmbq9c+66OARjMGU3X8XvrmbMbK8vV3ZSAcY7DdUQ3biRAQg8OVggfTPsA+UV6BVt/3kRUrinre5H09WS38Zn/Kq7jDM1EzsGd46p12byRZfh9Se2UM6f7bF9SuyNPdC1V+JXTrcxHRpA7cJbSEsJ99tgpqwo5vCWcDIAZZVZMcOJN7itknP+Jas8IIrp+ksSO5GiVljy6W+FwcYfaXc4RM3vvtPwYSyL6VtY/ul0oVjQhA6vivw6pWjmua53sxgpeQimFChrl9DFrxn/BKMuDzyoqjTpoXYJcOZftGWZqgEsZ36orsFTeGyWcEWLU/h7EML9YOC/VPuT+DgA3YD+lL5oiwmCfM4w7ZOroAY6sgFob5AqzlL5OuwLUavN0TW5EjZaG7p3PIDVtTaRmBkJOU2ayYMtZTvuVPDYGbDTad4aObuCtQ8kNh21M05Yyi93arWrxYwbhs+NW/L8Cj31Y/rsNaZf+SSd/HugzJJY4bKoKV4hh8AUlV2uQdckE7vLAfAqeuqPms+OaT8dqAUmvy4oFGOUZ7Uv+gjbt6Z2AmXkg5Wb/pbNYm0HtDHG6AOyqt33+ZxEbwLr1LqbNIrrhhFI",
"Expiration" : "2023-03-10T12:36:16Z"
# Deploy the Super Mario deployment
curl -s -O https://raw.githubusercontent.com/gasida/PKOS/main/1/mario.yaml
k apply -f mario.yaml
cat mario.yaml | yh
# Verify the deployment: check the CLB >> takes more than 5 minutes
k get deploy,svc,ep mario
watch kubectl get svc mario
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
mario LoadBalancer 100.69.37.166 <pending> 80:31785/TCP 111s
# Once the classic LB is created, its address shows up under EXTERNAL-IP as below.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
mario LoadBalancer 100.69.37.166 af928db1a812d464895ff5b54fc8861b-208157803.ap-northeast-2.elb.amazonaws.com 80:31785/TCP 4m36s
EXTERNAL-IP stays in the pending state at first because the AWS LB takes about 5 minutes to be created.
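Instead of watching the service by eye until the address appears, the wait can be scripted. A minimal sketch, assuming a hypothetical generic helper named wait_for_output that retries any command until it prints something:

```shell
# Hypothetical helper: wait_for_output TRIES DELAY CMD [ARGS...]
# Runs CMD up to TRIES times, sleeping DELAY seconds between attempts, and
# prints the first non-empty output. Returns 1 if every attempt was empty.
wait_for_output() (
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    out=$("$@" 2>/dev/null)
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
)

# Usage sketch: poll every 10 seconds for up to 10 minutes.
# LB_HOST=$(wait_for_output 60 10 kubectl get svc mario \
#   -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') && echo "http://$LB_HOST"
```

The 60 x 10 seconds budget is only a guess based on the roughly 5 minutes observed here.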
# Open the Mario game: browse to the CLB address
k get svc mario -o jsonpath={.status.loadBalancer.ingress[0].hostname} | awk '{ print "Mario URL = http://"$1 }'
[root@kops-ec2 ~]# k get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
mario 0/1 1 0 4s
[root@kops-ec2 ~]# k get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 100.64.0.1 <none> 443/TCP 52m
mario LoadBalancer 100.69.37.166 <pending> 80:31785/TCP 8s
[root@kops-ec2 ~]# k get ep
NAME ENDPOINTS AGE
kubernetes 172.30.43.70:443 53m
mario 172.30.38.69:8080 15s
When a domain is set on a K8S Service/Ingress, ExternalDNS automatically creates and deletes the matching A record (plus a TXT record) in AWS (Route 53), Azure (DNS), or GCP (Cloud DNS)
(https://edgehog.blog/a-self-hosted-external-dns-resolver-for-kubernetes-111a27d6fc2c)
# https://github.com/kubernetes-sigs/external-dns
# Monitoring
watch -d kubectl get pod -A
# Create the policy -> attach it to the master/worker nodes
curl -s -O https://s3.ap-northeast-2.amazonaws.com/cloudformation.cloudneta.net/AKOS/externaldns/externaldns-aws-r53-policy.json
aws iam create-policy --policy-name AllowExternalDNSUpdates --policy-document file://externaldns-aws-r53-policy.json
[root@kops-ec2 ~]# aws iam create-policy --policy-name AllowExternalDNSUpdates --policy-document file://externaldns-aws-r53-policy.json
{
"Policy": {
"PolicyName": "AllowExternalDNSUpdates",
"PolicyId": "ANPA3NL4NUOGGZDI",
"Arn": "arn:aws:iam::7842464695:policy/AllowExternalDNSUpdates",
"Path": "/",
"DefaultVersionId": "v1",
"AttachmentCount": 0,
"PermissionsBoundaryUsageCount": 0,
"IsAttachable": true,
"CreateDate": "2023-03-10T06:40:32+00:00",
"UpdateDate": "2023-03-10T06:40:32+00:00"
}
}
# Set the ACCOUNT_ID variable
export ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
# Attach the IAM policy to the roles behind the EC2 instance profiles
aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name masters.$KOPS_CLUSTER_NAME
aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name nodes.$KOPS_CLUSTER_NAME
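The attach commands above build the policy ARN by string interpolation, so an empty ACCOUNT_ID (for example when the sts call fails) produces a malformed ARN. A small sketch of a guard, assuming a hypothetical helper named policy_arn and relying on AWS account IDs being 12 digits:

```shell
# Hypothetical helper: policy_arn ACCOUNT_ID POLICY_NAME
# Prints the customer managed policy ARN, or fails when ACCOUNT_ID is not a
# 12-digit number.
policy_arn() (
  account_id=$1
  policy_name=$2
  case "$account_id" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "${#account_id}" -eq 12 ] || return 1
  printf 'arn:aws:iam::%s:policy/%s\n' "$account_id" "$policy_name"
)

# Usage sketch:
# ARN=$(policy_arn "$ACCOUNT_ID" AllowExternalDNSUpdates) || { echo "bad ACCOUNT_ID" >&2; exit 1; }
# aws iam attach-role-policy --policy-arn "$ARN" --role-name "masters.$KOPS_CLUSTER_NAME"
# aws iam attach-role-policy --policy-arn "$ARN" --role-name "nodes.$KOPS_CLUSTER_NAME"
```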
[root@kops-ec2 ~]# aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name masters.$KOPS_CLUSTER_NAME
[root@kops-ec2 ~]# aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name nodes.$KOPS_CLUSTER_NAME
# Install
kops edit cluster
--------------------------
spec:
  externalDns:
    provider: external-dns
--------------------------
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2023-03-10T05:32:31Z"
  name: sparkandassociates.net
spec:
  externalDns:
    provider: external-dns
  api:
    dns: {}
  authorization:
    rbac: {}
# Apply the update
kops update cluster --yes && echo && sleep 3 && kops rolling-update cluster
[root@kops-ec2 ~]# kops update cluster --yes && echo && sleep 3 && kops rolling-update cluster
*********************************************************************************
A new kubernetes version is available: 1.24.11
Upgrading is recommended (try kops upgrade cluster)
More information: https://github.com/kubernetes/kops/blob/master/permalinks/upgrade_k8s.md#1.24.11
*********************************************************************************
W0310 15:46:28.800558 5322 builder.go:230] failed to digest image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.11.4"
W0310 15:46:29.264401 5322 builder.go:230] failed to digest image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.11.4"
I0310 15:46:33.555789 5322 executor.go:111] Tasks: 0 done / 103 total; 48 can run
I0310 15:46:34.423479 5322 executor.go:111] Tasks: 48 done / 103 total; 21 can run
I0310 15:46:35.329983 5322 executor.go:111] Tasks: 69 done / 103 total; 28 can run
I0310 15:46:35.616911 5322 executor.go:111] Tasks: 97 done / 103 total; 3 can run
I0310 15:46:35.719104 5322 executor.go:111] Tasks: 100 done / 103 total; 3 can run
I0310 15:46:35.785512 5322 executor.go:111] Tasks: 103 done / 103 total; 0 can run
I0310 15:46:36.438277 5322 dns.go:238] Pre-creating DNS records
I0310 15:46:37.036912 5322 update_cluster.go:326] Exporting kubeconfig for cluster
kOps has set your kubectl context to sparkandassociates.net
W0310 15:46:37.052521 5322 update_cluster.go:350] Exported kubeconfig with no user authentication; use --admin, --user or --auth-plugin flags with `kops export kubeconfig`
Cluster changes have been applied to the cloud.
Changes may require instances to restart: kops rolling-update cluster
NAME STATUS NEEDUPDATE READY MIN TARGET MAX NODES
master-ap-northeast-2a Ready 0 1 1 1 1 1
nodes-ap-northeast-2a Ready 0 1 1 1 1 1
nodes-ap-northeast-2c Ready 0 1 1 1 1 1
# After the update, the external-dns pod has been created
[root@kops-ec2 ~]# k get pod -n kube-system -l k8s-app=external-dns
NAME READY STATUS RESTARTS AGE
external-dns-7657648665-bw2sg 1/1 Running 0 3m21s
[root@kops-ec2 ~]#
# Attach a domain to the CLB with ExternalDNS
k annotate service mario "external-dns.alpha.kubernetes.io/hostname=mario.$KOPS_CLUSTER_NAME"
# Check the annotation
[root@kops-ec2 ~]# k describe svc mario | grep -i anno
Annotations: external-dns.alpha.kubernetes.io/hostname: mario.sparkandassociates.net
# Verify
dig +short mario.$KOPS_CLUSTER_NAME
kubectl logs -n kube-system -l k8s-app=external-dns
Check the A record registered in Route 53
https://www.whatsmydns.net/#A/{domain}
This lets you watch, in real time, how the record is propagating across name servers around the world.
https://www.whatsmydns.net/#A/mario.sparkandassociates.net
# Print the web URL and connect
echo -e "Mario Game URL = http://mario.$KOPS_CLUSTER_NAME"
# Domain check
echo -e "My Domain Checker = https://www.whatsmydns.net/#A/mario.$KOPS_CLUSTER_NAME"
Connection confirmed.
# Terminal 1
watch kubectl get pod -owide
# Terminal 2
kubectl get pod -w
# Deploy the deployment
curl -s -O https://raw.githubusercontent.com/junghoon2/kube-books/main/ch05/busybox-deploy.yml
cat busybox-deploy.yml | sed -e 's/replicas: 10/replicas: 6/g' | kubectl apply -f -
# Check worker node public IPs
aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value}" --filters Name=instance-state-name,Values=running --output table
# Set worker node public IP variables
W1PIP=<worker node 1 public IP>
W2PIP=<worker node 2 public IP>
# Check worker node storage
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP df -hT -t ext4
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4
# Create a large file on node 2's disk
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP fallocate -l 110g 110g-file
# Check disk usage on node 2 >> it has reached 92%!
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4
Filesystem Type Size Used Avail Use% Mounted on
/dev/root ext4 124G 114G 11G 92% /
# Check pod status >> pods were evicted from node 2!
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox-6b5c698b45-5mkp6 1/1 Running 0 44s 172.30.41.142 i-014efbf2d513ddb46 <none> <none>
busybox-6b5c698b45-6kd24 1/1 Running 0 113s 172.30.34.14 i-014efbf2d513ddb46 <none> <none>
busybox-6b5c698b45-8dpdf 1/1 Running 0 113s 172.30.49.31 i-014efbf2d513ddb46 <none> <none>
busybox-6b5c698b45-d5kml 1/1 Running 0 113s 172.30.80.30 i-05bf2ec120f5a7a9e <none> <none>
busybox-6b5c698b45-gpxpl 1/1 Running 0 13s 172.30.36.132 i-014efbf2d513ddb46 <none> <none>
busybox-6b5c698b45-k68gd 0/1 Error 0 113s 172.30.76.99 i-05bf2ec120f5a7a9e <none> <none>
busybox-6b5c698b45-mcvx2 1/1 Running 0 113s 172.30.41.191 i-014efbf2d513ddb46 <none> <none>
busybox-6b5c698b45-thnnk 0/1 Error 0 113s 172.30.92.217 i-05bf2ec120f5a7a9e <none> <none>
mario-687bcfc9cc-cv4l4 1/1 Running 0 39m 172.30.38.69 i-014efbf2d513ddb46 <none> <none>
# This is a node-level event, so check the cluster events
k get events
4m48s Warning Failed pod/nginx-19 Failed to pull image "nginx:1.19.19": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/library/nginx:1.19.19": failed to resolve reference "docker.io/library/nginx:1.19.19": docker.io/library/nginx:1.19.19: not found
4m48s Warning Failed pod/nginx-19 Error: ErrImagePull
5m2s Normal BackOff pod/nginx-19 Back-off pulling image "nginx:1.19.19"
5m2s Warning Failed pod/nginx-19 Error: ImagePullBackOff
4m39s Normal Pulling pod/nginx-19 Pulling image "nginx:1.19"
4m33s Normal Pulled pod/nginx-19 Successfully pulled image "nginx:1.19" in 6.166666328s
4m33s Normal Created pod/nginx-19 Created container nginx-pod
4m33s Normal Started pod/nginx-19 Started container nginx-pod
3m53s Normal Killing pod/nginx-19 Stopping container nginx-pod
# Check the disk shortage event
k describe nodes
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Fri, 10 Mar 2023 16:05:48 +0900 Fri, 10 Mar 2023 14:36:46 +0900 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 10 Mar 2023 16:05:48 +0900 Fri, 10 Mar 2023 14:36:46 +0900 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Fri, 10 Mar 2023 16:05:48 +0900 Fri, 10 Mar 2023 14:36:46 +0900 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Fri, 10 Mar 2023 16:05:48 +0900 Fri, 10 Mar 2023 14:37:37 +0900 KubeletReady kubelet is posting ready status. AppArmor enabled
[root@kops-ec2 ~]# k get events| grep FreeDiskSpaceFailed
4m34s Warning FreeDiskSpaceFailed node/i-05bf2ec120f5a7a9e failed to garbage collect required amount of images. Wanted to free 15788516966 bytes, but freed 0 bytes
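To relate the 92% reading to what the kubelet did, here is a rough sketch that classifies a df usage percentage. It assumes the upstream kubelet defaults (image garbage collection starts above 85% usage, hard eviction when less than 10% of the node filesystem is available); I have not checked which values kops actually configured on this cluster, and usage_state is a hypothetical name.

```shell
# Hypothetical helper: usage_state USED_PERCENT (with or without "%")
# Thresholds below are the assumed upstream kubelet defaults, not values
# read from this cluster.
usage_state() (
  used=${1%"%"}
  avail=$((100 - used))
  if [ "$avail" -lt 10 ]; then
    echo "eviction"      # nodefs.available < 10%
  elif [ "$used" -gt 85 ]; then
    echo "image-gc"      # above the image GC high threshold
  else
    echo "ok"
  fi
)

# Usage sketch:
# usage_state "$(ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df --output=pcent / | tail -1 | tr -d ' ')"
```

Under these assumptions 92% used leaves 8% available, which is below the eviction threshold and matches the pods being pushed off node 2.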
# Clean up
[root@kops-ec2 ~]# k delete deploy busybox
deployment.apps "busybox" deleted
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP rm -rf 110g-file
[root@kops-ec2 ~]#
https://learnk8s.io/troubleshooting-deployments
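I have only just started week 1,
but it feels like I was simply handed knowledge and know-how that are hard to pick up at work or in solo study (?).
I realized why people join study groups, and once again experienced the greatness (?) of collective intelligence.
No more using a busy job as an excuse; I should put in a few more hours at night..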