Kops on AWS

Spark · March 10, 2023

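PKOS, 2nd cohort

I joined the second cohort of Gasida's Production Kubernetes Online Study (PKOS)
and am taking part in the study.
It runs once a week for four weeks (later changed to five),
and you must finish each week's assignment to avoid being dropped. (I actually like this rule.)
The goal of this study is to build a k8s cluster on AWS infrastructure with kops, but since that alone would not be much fun, Gasida has prepared a lot more. (Impressive :)
Let's work through the labs in detail and write up whatever interesting things come next.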

Kops on AWS

kops requires an AWS account and IAM keys.
kubectl also needs to be installed beforehand.
Reference: https://kubernetes.io/docs/setup/production-environment/tools/kops/

  • An EC2 instance with kops installed (used to deploy k8s): created by CloudFormation
  • kops creates the k8s cluster: the k8s configuration is stored in S3
  • Versions: k8s v1.24.10, OS Ubuntu 20.04 LTS
  • Role of kops-ec2: runs the kOps deployment, kubectl commands, and so on
  • Master and worker nodes are configured through EC2 Auto Scaling Groups (ASG)
  • A public domain is used

Deploying the base infrastructure

# Download the YAML template
curl -O https://s3.ap-northeast-2.amazonaws.com/cloudformation.cloudneta.net/K8S/kops-new-ec2.yaml

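On my VM I first had to install a CloudFormation client. (Strictly speaking, aws cloudformation deploy only needs the AWS CLI; the cloudformation-cli package installed below is a separate tool for developing resource types.)

References: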
https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
https://docs.aws.amazon.com/cloudformation-cli/latest/userguide/what-is-cloudformation-cli.html
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html

# Python 3.6 or above is required, so I used a virtual environment.
(ansible_venv) [root@san-1 ~]# python --version
Python 3.6.8

pip install cloudformation-cli cloudformation-cli-java-plugin cloudformation-cli-go-plugin cloudformation-cli-python-plugin cloudformation-cli-typescript-plugin

Deploying with the CLI

aws cloudformation deploy --template-file ./kops-new-ec2.yaml --stack-name mykops --parameter-overrides KeyName=spark SgIngressSshCidr=$(curl -s ipinfo.io/ip)/32 --region ap-northeast-2
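
The deploy command above passes whatever curl returns straight into SgIngressSshCidr. A minimal sketch of validating that value first, assuming a hypothetical helper named ssh_cidr_from_ip (it is not part of the AWS CLI or the study material):

```shell
# Hypothetical helper: turn an IPv4 address into a /32 CIDR, or fail.
ssh_cidr_from_ip() {
  ip=$1
  # Reject empty input, characters other than digits and dots, and empty octets.
  case $ip in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  old_ifs=$IFS
  IFS=.
  set -- $ip            # split on dots into positional parameters
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] || return 1
    [ "$octet" -le 255 ] || return 1
  done
  printf '%s/32\n' "$ip"
}
```

It could be used as SgIngressSshCidr=$(ssh_cidr_from_ip "$(curl -s ipinfo.io/ip)"), so that an error page or an empty response stops the deployment instead of ending up in the security group rule.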

The AWS CLI needs credentials configured before it can be used.
Create an access key in the IAM console,
then run aws configure
with the generated Access Key and Secret Access Key.

(ansible_venv) [root@san-1 pkos]# aws configure
AWS Access Key ID [None]: AK--7W
AWS Secret Access Key [None]: F4--------d
Default region name [None]: ap-northeast-2
Default output format [None]: json

Check after the stack deployment completes.

(ansible_venv) [root@san-1 pkos]# aws cloudformation describe-stacks --stack-name mykops --query 'Stacks[*].Outputs[*].OutputValue' --output text
13.124.5.243

# Connect
(ansible_venv) [root@san-1 pkos]# ssh -i spark.pem ec2-user@$(aws cloudformation describe-stacks --stack-name mykops --query 'Stacks[*].Outputs[0].OutputValue' --output text)

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
[root@kops-ec2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:01:f3:ec:0a:c4 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.10/24 brd 10.0.0.255 scope global dynamic eth0
       valid_lft 2972sec preferred_lft 2972sec
    inet6 fe80::1:f3ff:feec:ac4/64 scope link
       valid_lft forever preferred_lft forever
[root@kops-ec2 ~]#

Deploying and checking the kOps cluster: run after connecting to the kops EC2 instance

sudo tail -f /var/log/cloud-init-output.log
sudo su -

# Check the basic tools, SSH key installation, etc.
kubectl version --client=true -o yaml | yh
... gitVersion: v1.26.2 ...

# kops version
[root@kops-ec2 ~]# kops version
Client version: 1.25.4 (git-v1.25.4)

[root@kops-ec2 ~]# aws --version
aws-cli/2.11.1 Python/3.11.2 Linux/4.14.305-227.531.amzn2.x86_64 exe/x86_64.amzn.2 prompt/off

ls /root/.ssh/id_rsa*

# Check without configuring credentials
aws ec2 describe-instances

# Configure IAM user credentials: for lab convenience, enter the credentials of an IAM user with administrator permissions
[root@kops-ec2 ~]# aws configure
AWS Access Key ID [None]: AK--7W
AWS Secret Access Key [None]: F4--------d
Default region name [None]: ap-northeast-2
Default output format [None]: json

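# Verify the credentials are applied: check node IPs
[root@kops-ec2 ~]# aws ec2 describe-instances | grep IpAddress

# AWS CLI pager option (disable paging)
export AWS_PAGER=""

# Set the region where resources will be placed as a variable
REGION=ap-northeast-2  # use the Seoul region

# Create the bucket that will store the k8s configuration
## aws s3 mb s3://<globally unique bucket name> --region <AWS region for the S3 bucket>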
aws s3 mb s3://pkos2 --region $REGION
aws s3 ls
[root@kops-ec2 ~]# aws s3 mb s3://pkos2 --region $REGION
make_bucket: pkos2
[root@kops-ec2 ~]# aws s3 ls
2023-03-10 14:27:51 pkos2

# Store the information needed for deployment in environment variables
## export KOPS_CLUSTER_NAME=<your public hosted zone domain>
## export KOPS_STATE_STORE=s3://(name of the bucket you created above)
export KOPS_CLUSTER_NAME=<your public hosted zone domain>
export KOPS_STATE_STORE=<s3://(name of the bucket you created above)>
export AWS_PAGER=""
export REGION=ap-northeast-2

## Example)
export AWS_PAGER=""
export REGION=ap-northeast-2
export KOPS_CLUSTER_NAME=sparkandassociates.net
export KOPS_STATE_STORE=s3://pkos2
echo 'export AWS_PAGER=""' >>~/.bashrc
echo 'export REGION=ap-northeast-2' >>~/.bashrc
echo 'export KOPS_CLUSTER_NAME=sparkandassociates.net' >>~/.bashrc
echo 'export KOPS_STATE_STORE=s3://pkos2' >>~/.bashrc
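
Every kops command that follows depends on these variables being set in the current shell. A small sketch of a pre-flight check, assuming a hypothetical helper named require_env:

```shell
# Hypothetical helper: fail if any of the named environment variables is empty or unset.
require_env() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "missing environment variables:$missing" >&2
    return 1
  fi
  return 0
}
```

Running require_env KOPS_CLUSTER_NAME KOPS_STATE_STORE REGION before kops create cluster makes a forgotten export fail early with a clear message.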

# Optional [terminal 1]: monitor EC2 instance creation
while true; do aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value,Status:State.Name}" --filters Name=instance-state-name,Values=running --output text ; echo "------------------------------" ; sleep 1; done

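# Create the kops configuration (in S3) and deploy the k8s cluster: takes about 6 minutes
## CNI: AWS VPC CNI; 1 master node (t3.medium); 2 worker nodes (t3.medium); VPC network CIDR 172.30.0.0/16 (with the AWS VPC CNI, pods also get their IPs from this range)
## --container-runtime containerd --kubernetes-version 1.24.0 ~ 1.25.6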
kops create cluster --zones="$REGION"a,"$REGION"c --networking amazonvpc --cloud aws \
--master-size t3.medium --node-size t3.medium --node-count=2 --network-cidr 172.30.0.0/16 \
--ssh-public-key ~/.ssh/id_rsa.pub --name=$KOPS_CLUSTER_NAME --kubernetes-version "1.24.10" --dry-run -o yaml > mykops.yaml

kops create cluster --zones="$REGION"a,"$REGION"c --networking amazonvpc --cloud aws \
--master-size t3.medium --node-size t3.medium --node-count=2 --network-cidr 172.30.0.0/16 \
--ssh-public-key ~/.ssh/id_rsa.pub --name=$KOPS_CLUSTER_NAME --kubernetes-version "1.24.10" -y
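
The --zones value in the commands above is just the region name with an availability-zone letter appended to it. A sketch of building that list, assuming a hypothetical helper named zones_for_region (which zone letters actually exist depends on the region and the account):

```shell
# Hypothetical helper: zones_for_region <region> <letter>... -> comma-separated zone list.
zones_for_region() {
  region=$1
  shift
  zones=""
  for suffix in "$@"; do
    # Add a comma only when the list already has an entry.
    zones="${zones:+$zones,}${region}${suffix}"
  done
  printf '%s\n' "$zones"
}
```

With REGION=ap-northeast-2, zones_for_region "$REGION" a c produces the same string that "$REGION"a,"$REGION"c expands to.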

# validate
kops validate cluster --wait 10m

[root@kops-ec2 ~]# kops validate cluster --wait 10m
Validating cluster sparkandassociates.net

INSTANCE GROUPS
NAME                    ROLE    MACHINETYPE     MIN     MAX     SUBNETS
master-ap-northeast-2a  Master  t3.medium       1       1       ap-northeast-2a
nodes-ap-northeast-2a   Node    t3.medium       1       1       ap-northeast-2a
nodes-ap-northeast-2c   Node    t3.medium       1       1       ap-northeast-2c

NODE STATUS
NAME                    ROLE    READY
i-014efbf2d513ddb46     node    True
i-05bf2ec120f5a7a9e     node    True
i-0d547c750bb8c97c1     master  True

Your cluster sparkandassociates.net is ready
[root@kops-ec2 ~]#

Checking the AWS Route 53 domain

# Set your domain as a variable: enter a domain that you own
MyDomain=<your domain>
MyDomain=sparkandassociates.net

# Look up your Route 53 hosted zone ID and store it in a variable
aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." | jq
aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." --query "HostedZones[0].Name"
aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." --query "HostedZones[0].Id" --output text
MyDnzHostedZoneId=`aws route53 list-hosted-zones-by-name --dns-name "${MyDomain}." --query "HostedZones[0].Id" --output text`
echo $MyDnzHostedZoneId
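
The Id field returned by list-hosted-zones-by-name comes with a /hostedzone/ prefix. If another script or tool expects only the bare zone ID, the prefix can be stripped with plain parameter expansion. A sketch, assuming a hypothetical helper named bare_zone_id (the zone ID in the usage line is made up):

```shell
# Hypothetical helper: drop a leading "/hostedzone/" from a Route 53 zone ID.
bare_zone_id() {
  printf '%s\n' "${1#/hostedzone/}"
}
```

For example, bare_zone_id /hostedzone/Z0123456789ABC prints Z0123456789ABC, and an ID that has no prefix is returned unchanged.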

# Query A records
aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A']" | jq
aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A'].Name" | jq
aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A'].Name" --output text

# Query the A record values repeatedly
while true; do aws route53 list-resource-record-sets --hosted-zone-id "${MyDnzHostedZoneId}" --query "ResourceRecordSets[?Type == 'A']" | jq ; date ; echo ; sleep 1; done

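Confirmed in the Route 53 console.

Checking the kOps installation: do this after installation completes; convenience settings (auto-completion, aliases)

# Check node IPs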
aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,PrivateIPAdd:PrivateIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value,Status:State.Name}" --filters Name=instance-state-name,Values=running --output table

---------------------------------------------------------------------------------------------------------
|                                           DescribeInstances                                           |
+--------------------------------------------------------+----------------+-----------------+-----------+
|                      InstanceName                      | PrivateIPAdd   |   PublicIPAdd   |  Status   |
+--------------------------------------------------------+----------------+-----------------+-----------+
|  nodes-ap-northeast-2c.sparkandassociates.net          |  172.30.79.8   |  15.164.251.18  |  running  |
|  kops-ec2                                              |  10.0.0.10     |  13.124.5.243   |  running  |
|  master-ap-northeast-2a.masters.sparkandassociates.net |  172.30.43.70  |  54.180.156.238 |  running  |
|  nodes-ap-northeast-2a.sparkandassociates.net          |  172.30.33.224 |  43.200.253.62  |  running  |
+--------------------------------------------------------+----------------+-----------------+-----------+

# The IP held by the api.sparkandassociates.net A record
# maps to the master node, i.e. the k8s API node: 54.180.156.238

# Check pod IPs
kubectl get pod -n kube-system -o=custom-columns=NAME:.metadata.name,IP:.status.podIP,STATUS:.status.phase

[root@kops-ec2 ~]# kubectl get pod -n kube-system -o=custom-columns=NAME:.metadata.name,IP:.status.podIP,STATUS:.status.phase
NAME                                          IP              STATUS
aws-cloud-controller-manager-h6lth            172.30.43.70    Running
aws-node-7k8zb                                172.30.33.224   Running
aws-node-hfrd5                                172.30.43.70    Running
aws-node-jqxsf                                172.30.79.8     Running
coredns-6897c49dc4-rp5d9                      172.30.75.143   Running
coredns-6897c49dc4-zs7jt                      172.30.60.153   Running
coredns-autoscaler-5685d4f67b-pc257           172.30.95.96    Running
dns-controller-b9c5c9476-7fnr9                172.30.43.70    Running
ebs-csi-controller-54fb95868b-cck46           172.30.46.73    Running
ebs-csi-node-4q7x2                            172.30.92.39    Running
ebs-csi-node-tf499                            172.30.37.137   Running
ebs-csi-node-z8fmh                            172.30.44.191   Running
etcd-manager-events-i-0d547c750bb8c97c1       172.30.43.70    Running
etcd-manager-main-i-0d547c750bb8c97c1         172.30.43.70    Running
kops-controller-j5xl7                         172.30.43.70    Running
kube-apiserver-i-0d547c750bb8c97c1            172.30.43.70    Running
kube-controller-manager-i-0d547c750bb8c97c1   172.30.43.70    Running
kube-proxy-i-014efbf2d513ddb46                172.30.33.224   Running
kube-proxy-i-05bf2ec120f5a7a9e                172.30.79.8     Running
kube-proxy-i-0d547c750bb8c97c1                172.30.43.70    Running
kube-scheduler-i-0d547c750bb8c97c1            172.30.43.70    Running

# Check kops cluster information
kops get cluster
[root@kops-ec2 ~]# kops get cluster
NAME                    CLOUD   ZONES
sparkandassociates.net  aws     ap-northeast-2a,ap-northeast-2c

kops get cluster -o yaml
kops get cluster -o yaml | yh
...

# Check instance group information
kops get ig

[root@kops-ec2 ~]# kops get ig
NAME                    ROLE    MACHINETYPE     MIN     MAX     ZONES
master-ap-northeast-2a  Master  t3.medium       1       1       ap-northeast-2a
nodes-ap-northeast-2a   Node    t3.medium       1       1       ap-northeast-2a
nodes-ap-northeast-2c   Node    t3.medium       1       1       ap-northeast-2c

kops get ig -o yaml
kops get ig -o yaml | yh
...

# Check instance information
kops get instances
kops get instances -o yaml | yh
...

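# Set up auto-completion and a short alias
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -F __start_kubectl k' >> ~/.bashrc
# Leave the root shell and the SSH session, then reconnect so ~/.bashrc is reloaded
exit
exit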

# Check cluster information
k cluster-info
k cluster-info dump

# Check node information
k get nodes -v6
[root@kops-ec2 ~]# k get nodes -v6
I0310 14:57:05.685877    5468 loader.go:373] Config loaded from file:  /root/.kube/config
I0310 14:57:05.706084    5468 round_trippers.go:553] GET https://api.sparkandassociates.net/api/v1/nodes?limit=500 200 OK in 11 milliseconds
NAME                  STATUS   ROLES           AGE   VERSION
i-014efbf2d513ddb46   Ready    node            19m   v1.24.10
i-05bf2ec120f5a7a9e   Ready    node            20m   v1.24.10
i-0d547c750bb8c97c1   Ready    control-plane   21m   v1.24.10
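
The node listing above can also be checked in a script instead of by eye. A sketch that counts nodes whose STATUS column is anything other than Ready, assuming a hypothetical helper named count_not_ready that reads the table from standard input:

```shell
# Hypothetical helper: count rows of `kubectl get nodes` output whose STATUS is not exactly "Ready".
# Note: a cordoned node ("Ready,SchedulingDisabled") is counted as not ready by this check.
count_not_ready() {
  awk 'NR > 1 && $2 != "Ready" { n++ } END { print n + 0 }'
}
```

It would be used as kubectl get nodes | count_not_ready, which prints 0 when every node is Ready.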

# Which CRI container runtime is in use? => containerd
[root@kops-ec2 ~]# k get node -o wide
NAME                  STATUS   ROLES           AGE   VERSION    INTERNAL-IP     EXTERNAL-IP      OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
i-014efbf2d513ddb46   Ready    node            24m   v1.24.10   172.30.33.224   43.200.253.62    Ubuntu 20.04.5 LTS   5.15.0-1031-aws   containerd://1.6.18
i-05bf2ec120f5a7a9e   Ready    node            24m   v1.24.10   172.30.79.8     15.164.251.18    Ubuntu 20.04.5 LTS   5.15.0-1031-aws   containerd://1.6.18
i-0d547c750bb8c97c1   Ready    control-plane   25m   v1.24.10   172.30.43.70    54.180.156.238   Ubuntu 20.04.5 LTS   5.15.0-1031-aws   containerd://1.6.18
[root@kops-ec2 ~]#


# Check the kubeconfig after deployment
tree -L 1 ~/.kube
cat .kube/config
cat .kube/config | yh

# Networking and storage details are covered in week 2
# volume (StorageClass)
kubectl get sc

kubectl get sc kops-csi-1-21 -o jsonpath={.parameters} ;echo
{"encrypted":"true","type":"gp3"}

kubectl get sc kops-ssd-1-17 -o jsonpath={.parameters} ;echo
{"encrypted":"true","type":"gp2"}

# [master node] AWS VPC CNI logs
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME ls /var/log/aws-routed-eni
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME cat /var/log/aws-routed-eni/plugin.log | jq
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME cat /var/log/aws-routed-eni/ipamd.log | jq

# Check pod IPs
kubectl get pod -n kube-system
kubectl get pod -n kube-system -owide

# [master node] iptables rules
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME sudo iptables -t nat -S

# [master node] Check container information
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME ps axf |grep /usr/bin/containerd
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME ps afxuwww

## [master node] Install the tree and jq tools
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME sudo apt install -y tree jq

# [master node] Check volumes/mounts: nvme1n1 and nvme2n1 are used for etcd-events and etcd-main
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME lsblk
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME df -hT --type=ext4
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/root      ext4   62G  5.5G   57G   9% /
/dev/nvme1n1   ext4   20G  167M   20G   1% /mnt/master-vol-03072035bffdd8252
/dev/nvme2n1   ext4   20G  170M   20G   1% /mnt/master-vol-0db4765644c7d6ecb

## [master node] Check the directories on nvme1n1 and nvme2n1
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME tree /mnt/master-vol-0677e154ec157c895
/mnt/master-vol-0677e154ec157c895
├── data [error opening dir]
├── lost+found [error opening dir]
├── pki
│   └── aVtGb7iDprc8tloOBs_H7w
│       ├── clients
│       │   ├── ca.crt
│       │   ├── server.crt
│       │   └── server.key
│       └── peers
│           ├── ca.crt
│           ├── me.crt
│           └── me.key
└── state

6 directories, 7 files
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME tree /mnt/master-vol-0409200c053e944be
/mnt/master-vol-0409200c053e944be
├── data [error opening dir]
├── lost+found [error opening dir]
├── pki
│   └── q6M2HIDRjxaUXhZIjaDiqw
│       ├── clients
│       │   ├── ca.crt
│       │   ├── server.crt
│       │   └── server.key
│       └── peers
│           ├── ca.crt
│           ├── me.crt
│           └── me.key
└── state

6 directories, 7 files

# [master node] Check kubelet status
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME systemctl status kubelet

Checking after SSH into the master node

# [master node] SSH connection
ssh -i ~/.ssh/id_rsa ubuntu@api.$KOPS_CLUSTER_NAME

# [master node] Check EC2 metadata
TOKEN=`curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
echo $TOKEN

curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v

ubuntu@i-0d547c750bb8c97c1:~$ curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/masters.sparkandassociates.net| jq
{
  "Code": "Success",
  "LastUpdated": "2023-03-10T06:15:19Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIA3NGL2HIFJRHH",
  "SecretAccessKey": "kxkMtLFvjhY+3/KgBYxSLg8VAKgGlV3",
  "Token": "IQoJb3JpZ2luX2VjELb//////////wEaDmFwLW5vcnRoZWFzdC0yIkYwRAIgaubwNUXQQ3DdSUbQqqxSEFCqSfTYWh212fr9na+QqEACIDkMJ/E70H5EnLIXFcAM+UCffhUHzMlsJx5O+ZiZs/CtKssFCG8QBBoMNzg0MjQ2MTY0Njk1IgyXiuhVIUJp2qzCVdoqqAVNQaXC1nn4kgFaRryjQxbkXclhFkvv8bCbRtAZKe8iY9Dme+OyBj91i+ez1NrGNT6bVJ65tVMIDAjs/zZbUU0d2c7l8yUDGPOkOT9fYPnyF1EQldrsFGFwfTsp+eDwd6l3hGRQSD764INMPvLS1WSUtiyJx4UXfScrJcNTjEANQlFvwrqf6nQtboPkCVGAzOwMk6rzxAadjSybgGvn434nktgmGoSoo9cQCTu872/1y2y6Jt2Jx5CYhup5+0Z/sj7Labq0us4XvKJ/IoTp8/78LgbemNCLXhL2oIBG9kwnDoGUObV0KDMsScnmkGx6lGSMPo/pH6RPfH3+Z/XHK1CMT7TMQnVNm6sIcGb9DJqmsb/uDOZf9QfIDvyojWLgDbttysLmEorkJv5AreydbH1uLukfEcdxGIDJ8Fh1sMJRX6aDwMgUrSU5MSLEDtrqVhDLeVR5kp/VLnybi0znZg1doLvXHRypKfcqJ4VZZ9R1CtW9TXSmlJ4AsQL2samA94HjS5SW8R7rZUHbm8gE/C5tOHRZsRAye1Qhl2TLO91c/KTIFWf+pQmX+bUWnQCOy2lcrU1gdKSxzqfGYBhJ2XBnL+KhkhCVK6w3CNtuq6CYivycfTSmqtO+naRf/WMER0J+qdukHpWW0ITvf0MBTziT2Y/dnlRgw1YE8ajTIE3ICYJPz0B27OZroYbzli6DKB38aLXk5gW0yOn1n3DH+Rs+q1LElFPCLhG2nMTsDcaEnvCNPeHEWN8cNH1nj+jCTZAMnKmYgBbuK8ywTiSCZIu8IdoFsNwn5c78w4GwYxfGySWk5hDmQwMlTwKyrXQhuQzCPlqugBjqyAV7tVXfDBzsM5xN6/MQCe+NIjQX3kZsodB5N+R1oF8x1MiXjezJtoVkDwN4lIhqTMGv7IS9naD3184Gez6sjNNqqk4DgJiu9keYEs6b+c0xRMsu7vHQdPD30kmeFg9jzn+W6u7rkCDvF09X73V5Q0A4UaDAT9VX+xA18IfL+piaWpsnO9RlGhgOLKrad1hyCEo6bbYTFZzBOpt2cV1dSbB/7ZyGOAXIkmRTfPI8jeN9c8gQ=",
  "Expiration": "2023-03-10T12:37:27Z"
}
ubuntu@i-0d547c750bb8c97c1:~$
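
The credentials returned by the metadata service are temporary, and the Expiration field shows when they stop being valid. A sketch that pulls out just that field without needing jq, assuming a hypothetical helper named credential_expiration that reads the JSON response from standard input:

```shell
# Hypothetical helper: print the value of the "Expiration" field from the
# security-credentials JSON (handles both "key": and "key" : spacing).
credential_expiration() {
  sed -n 's/^[[:space:]]*"Expiration"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p'
}
```

On the master node it would be fed from the same request as above, for example curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/iam/security-credentials/masters.$KOPS_CLUSTER_NAME | credential_expiration.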

Checking after SSH into the worker nodes: connect over SSH using each worker node's public IP

# Check the worker nodes' public IPs
aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value}" --filters Name=instance-state-name,Values=running --output table

[root@kops-ec2 ~]# aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value}" --filters Name=instance-state-name,Values=running --output table
-----------------------------------------------------------------------------
|                             DescribeInstances                             |
+---------------------------------------------------------+-----------------+
|                      InstanceName                       |   PublicIPAdd   |
+---------------------------------------------------------+-----------------+
|  nodes-ap-northeast-2c.sparkandassociates.net           |  15.164.251.18  |
|  kops-ec2                                               |  13.124.5.243   |
|  master-ap-northeast-2a.masters.sparkandassociates.net  |  54.180.156.238 |
|  nodes-ap-northeast-2a.sparkandassociates.net           |  43.200.253.62  |
+---------------------------------------------------------+-----------------+
[root@kops-ec2 ~]#

# Set the worker nodes' public IPs as variables
W1PIP=43.200.253.62
W2PIP=15.164.251.18

# SSH into the worker nodes
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP
exit
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP
exit

# Check worker node storage
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP lsblk
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP lsblk
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP df -hT -t ext4
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4

# SSH into a worker node, then check the node's EC2 metadata: IAM Role
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP

TOKEN=`curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
echo $TOKEN
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/nodes.sparkandassociates.net

ubuntu@i-014efbf2d513ddb46:~$ curl -s -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/nodes.sparkandassociates.net
{
  "Code" : "Success",
  "LastUpdated" : "2023-03-10T06:10:31Z",
  "Type" : "AWS-HMAC",
  "AccessKeyId" : "ASIA3NG2PY4PIYD",
  "SecretAccessKey" : "ayl+IVkBokMOljz5mc/Tu0BRl8f8DwiW6",
  "Token" : "IQoJb3JpZ2luX2VjELb//////////wEaDmFwLW5vcnRoZWFzdC0yIkYwRAIgAwaSeuaHb3eZ/t36zlzUK+6Ry3kRNSysDIJW1qSdaVwCIFMUVOBAFiwArt95lZSXociouVxvmUggIP2pNXADxrF4KswFCG8QBBoMNzg0MjQ2MTY0Njk1Igxzi3uq+jhZ2jJgoukqqQU8EX/9nTjNQNDrzvF0W+X+/VCH+OxawhWWKqKlsb3KKlk1d9CyVp1+rNJ1vhL19FV9EFnGPdfMJ95kc4RXuAzu99vwhetK4s52SBpTSC60btNSouCoJ1Oyzf6uhYuRHQx8YykM9URvesQupakVo6KHfrcd6SV9khXPcbA9OmoDsrb4u7pT8e2bte3LPMqogt9Cc/OAGm6vk6V7bNIgJlNHs7+ypgcW+BEVeNHc0al0mAjzE/Ctuo8eqTX8YgS/FC2F3hZ14Hr+jNsGf3tUtx06sq9pk74HpGHbiqmIggtbOzmvPWM2tNGwIHlIbLpEGRHu4ipAgXOLDpQbdOqs+IdxrXoQ6F60fAWpp+65jVjDmVbn8jbBpfPTPJT5U0BeGPp8vk2CMUKxHZQOBU+GBdMdHw0J892J1eZUcOU5XAwDF3Z4ijgzOpYtM1AcO2WZ/V5ywkWPckHM3huy7BoXiPrs26YqStwFmbq9c+66OARjMGU3X8XvrmbMbK8vV3ZSAcY7DdUQ3biRAQg8OVggfTPsA+UV6BVt/3kRUrinre5H09WS38Zn/Kq7jDM1EzsGd46p12byRZfh9Se2UM6f7bF9SuyNPdC1V+JXTrcxHRpA7cJbSEsJ99tgpqwo5vCWcDIAZZVZMcOJN7itknP+Jas8IIrp+ksSO5GiVljy6W+FwcYfaXc4RM3vvtPwYSyL6VtY/ul0oVjQhA6vivw6pWjmua53sxgpeQimFChrl9DFrxn/BKMuDzyoqjTpoXYJcOZftGWZqgEsZ36orsFTeGyWcEWLU/h7EML9YOC/VPuT+DgA3YD+lL5oiwmCfM4w7ZOroAY6sgFob5AqzlL5OuwLUavN0TW5EjZaG7p3PIDVtTaRmBkJOU2ayYMtZTvuVPDYGbDTad4aObuCtQ8kNh21M05Yyi93arWrxYwbhs+NW/L8Cj31Y/rsNaZf+SSd/HugzJJY4bKoKV4hh8AUlV2uQdckE7vLAfAqeuqPms+OaT8dqAUmvy4oFGOUZ7Uv+gjbt6Z2AmXkg5Wb/pbNYm0HtDHG6AOyqt33+ZxEbwLr1LqbNIrrhhFI",
  "Expiration" : "2023-03-10T12:36:16Z"
}

Service deployment test (deploying the Mario game)

# Deploy the Super Mario deployment
curl -s -O https://raw.githubusercontent.com/gasida/PKOS/main/1/mario.yaml
k apply -f mario.yaml
cat mario.yaml | yh

# Check the deployment: confirm the CLB is created >> takes 5 minutes or more
k get deploy,svc,ep mario
watch kubectl get svc mario
NAME    TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
mario   LoadBalancer   100.69.37.166   <pending>     80:31785/TCP   111s

# Once the Classic LB is created, its address appears under EXTERNAL-IP as shown below.
NAME    TYPE           CLUSTER-IP      EXTERNAL-IP                                                                   PORT(S)        AGE
mario   LoadBalancer   100.69.37.166   af928db1a812d464895ff5b54fc8861b-208157803.ap-northeast-2.elb.amazonaws.com   80:31785/TCP   4m36s

EXTERNAL-IP shows <pending> at first because the AWS load balancer takes about 5 minutes to be created.
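
Instead of watching the pending state by hand, the wait can be scripted. A sketch of a generic retry loop, assuming two hypothetical helpers, wait_for and mario_lb_ready (the attempt count and delay in the usage note are arbitrary choices):

```shell
# Hypothetical helper: wait_for <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts run out.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  try=1
  while [ "$try" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$try" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    try=$((try + 1))
  done
  return 1
}

# Hypothetical check: succeeds once the mario Service has a load balancer hostname.
mario_lb_ready() {
  host=$(kubectl get svc mario -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)
  [ -n "$host" ]
}
```

For example, wait_for 40 15 mario_lb_ready polls every 15 seconds for up to roughly 10 minutes and returns non-zero if the hostname never appears.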

# Access the Mario game: open the CLB address in a browser
k get svc mario -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' | awk '{ print "Mario URL = http://"$1 }'



[root@kops-ec2 ~]# k get deploy
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
mario   0/1     1            0           4s
[root@kops-ec2 ~]# k get svc
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP      100.64.0.1      <none>        443/TCP        52m
mario        LoadBalancer   100.69.37.166   <pending>     80:31785/TCP   8s
[root@kops-ec2 ~]# k get ep
NAME         ENDPOINTS           AGE
kubernetes   172.30.43.70:443    53m
mario        172.30.38.69:8080   15s

ExternalDNS

When you set a domain on a K8S Service/Ingress, ExternalDNS automatically creates and deletes the A record (along with a TXT record) in AWS (Route 53), Azure (DNS), or GCP (Cloud DNS).


(https://edgehog.blog/a-self-hosted-external-dns-resolver-for-kubernetes-111a27d6fc2c)

Installing the ExternalDNS addon

# https://github.com/kubernetes-sigs/external-dns
# Monitoring
watch -d kubectl get pod -A

# Create the policy -> attach it to the master/worker nodes
curl -s -O https://s3.ap-northeast-2.amazonaws.com/cloudformation.cloudneta.net/AKOS/externaldns/externaldns-aws-r53-policy.json
aws iam create-policy --policy-name AllowExternalDNSUpdates --policy-document file://externaldns-aws-r53-policy.json

[root@kops-ec2 ~]# aws iam create-policy --policy-name AllowExternalDNSUpdates --policy-document file://externaldns-aws-r53-policy.json
{
    "Policy": {
        "PolicyName": "AllowExternalDNSUpdates",
        "PolicyId": "ANPA3NL4NUOGGZDI",
        "Arn": "arn:aws:iam::7842464695:policy/AllowExternalDNSUpdates",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 0,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "CreateDate": "2023-03-10T06:40:32+00:00",
        "UpdateDate": "2023-03-10T06:40:32+00:00"
    }
}


# Set the ACCOUNT_ID variable
export ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)

# Add (attach) the IAM policy to the roles behind the EC2 instance profiles
aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name masters.$KOPS_CLUSTER_NAME
aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name nodes.$KOPS_CLUSTER_NAME

[root@kops-ec2 ~]# aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name masters.$KOPS_CLUSTER_NAME
[root@kops-ec2 ~]# aws iam attach-role-policy --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/AllowExternalDNSUpdates --role-name nodes.$KOPS_CLUSTER_NAME


# Install
kops edit cluster
--------------------------
spec:
  externalDns:
    provider: external-dns
--------------------------

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2023-03-10T05:32:31Z"
  name: sparkandassociates.net
spec:
  externalDns:
    provider: external-dns
  api:
    dns: {}
  authorization:
    rbac: {}

# Apply the update
kops update cluster --yes && echo && sleep 3 && kops rolling-update cluster

[root@kops-ec2 ~]# kops update cluster --yes && echo && sleep 3 && kops rolling-update cluster

*********************************************************************************

A new kubernetes version is available: 1.24.11
Upgrading is recommended (try kops upgrade cluster)

More information: https://github.com/kubernetes/kops/blob/master/permalinks/upgrade_k8s.md#1.24.11

*********************************************************************************

W0310 15:46:28.800558    5322 builder.go:230] failed to digest image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.11.4"
W0310 15:46:29.264401    5322 builder.go:230] failed to digest image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.11.4"
I0310 15:46:33.555789    5322 executor.go:111] Tasks: 0 done / 103 total; 48 can run
I0310 15:46:34.423479    5322 executor.go:111] Tasks: 48 done / 103 total; 21 can run
I0310 15:46:35.329983    5322 executor.go:111] Tasks: 69 done / 103 total; 28 can run
I0310 15:46:35.616911    5322 executor.go:111] Tasks: 97 done / 103 total; 3 can run
I0310 15:46:35.719104    5322 executor.go:111] Tasks: 100 done / 103 total; 3 can run
I0310 15:46:35.785512    5322 executor.go:111] Tasks: 103 done / 103 total; 0 can run
I0310 15:46:36.438277    5322 dns.go:238] Pre-creating DNS records
I0310 15:46:37.036912    5322 update_cluster.go:326] Exporting kubeconfig for cluster
kOps has set your kubectl context to sparkandassociates.net
W0310 15:46:37.052521    5322 update_cluster.go:350] Exported kubeconfig with no user authentication; use --admin, --user or --auth-plugin flags with `kops export kubeconfig`

Cluster changes have been applied to the cloud.


Changes may require instances to restart: kops rolling-update cluster


NAME                    STATUS  NEEDUPDATE      READY   MIN     TARGET  MAX     NODES
master-ap-northeast-2a  Ready   0               1       1       1       1       1
nodes-ap-northeast-2a   Ready   0               1       1       1       1       1
nodes-ap-northeast-2c   Ready   0               1       1       1       1       1

# After the update, the external-dns pod is created
[root@kops-ec2 ~]# k get pod -n kube-system -l k8s-app=external-dns
NAME                            READY   STATUS    RESTARTS   AGE
external-dns-7657648665-bw2sg   1/1     Running   0          3m21s
[root@kops-ec2 ~]#

Lab: connecting a domain to the mario service

# Connect a domain to the CLB with ExternalDNS
k annotate service mario "external-dns.alpha.kubernetes.io/hostname=mario.$KOPS_CLUSTER_NAME"

# Check the annotation
[root@kops-ec2 ~]# k describe svc mario | grep -i anno
Annotations:              external-dns.alpha.kubernetes.io/hostname: mario.sparkandassociates.net
      
# Verify
dig +short mario.$KOPS_CLUSTER_NAME
kubectl logs -n kube-system -l k8s-app=external-dns

Check the A record registered in Route 53

https://www.whatsmydns.net/#A/{domain}
This URL lets you watch propagation across global name servers in real time.
https://www.whatsmydns.net/#A/mario.sparkandassociates.net

# Print the web URL and open it
echo -e "Mario Game URL = http://mario.$KOPS_CLUSTER_NAME"

# Domain check
echo -e "My Domain Checker = https://www.whatsmydns.net/#A/mario.$KOPS_CLUSTER_NAME"

Access confirmed.

Troubleshooting

# Terminal 1
watch kubectl get pod -owide

# Terminal 2
kubectl get pod -w

# Deploy the busybox deployment
curl -s -O https://raw.githubusercontent.com/junghoon2/kube-books/main/ch05/busybox-deploy.yml
cat busybox-deploy.yml | sed -e 's/replicas: 10/replicas: 6/g' | kubectl apply -f -

# Check the worker nodes' public IPs
aws ec2 describe-instances --query "Reservations[*].Instances[*].{PublicIPAdd:PublicIpAddress,InstanceName:Tags[?Key=='Name']|[0].Value}" --filters Name=instance-state-name,Values=running --output table
# Set the worker nodes' public IPs as variables
W1PIP=<worker node 1 public IP>
W2PIP=<worker node 2 public IP>

# Check worker node storage
ssh -i ~/.ssh/id_rsa ubuntu@$W1PIP df -hT -t ext4
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4

# Create a large file on node 2's disk
ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP fallocate -l 110g 110g-file

# Check disk usage on node 2 >> now at 92%!
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/root      ext4  124G  114G   11G  92% /
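
The same df output can be checked against a threshold in a script. A sketch, assuming a hypothetical helper named over_threshold that reads df -hT output from standard input. For reference, the upstream kubelet defaults start image garbage collection at 85% disk usage and hard-evict pods when nodefs.available falls below 10%; kOps may configure different values.

```shell
# Hypothetical helper: over_threshold <percent>
# Prints "<mount point> <use%>" for every filesystem at or above the given usage.
# Expects `df -hT` columns: Filesystem Type Size Used Avail Use% Mounted-on.
over_threshold() {
  awk -v limit="$1" 'NR > 1 {
    use = $6
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $7, use "%"
  }'
}
```

For example, ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP df -hT -t ext4 | over_threshold 85 would list the root filesystem in the situation above.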

# Check pod status >> pods are evicted from node 2!
NAME                       READY   STATUS    RESTARTS   AGE    IP              NODE                  NOMINATED NODE   READINESS GATES
busybox-6b5c698b45-5mkp6   1/1     Running   0          44s    172.30.41.142   i-014efbf2d513ddb46   <none>           <none>
busybox-6b5c698b45-6kd24   1/1     Running   0          113s   172.30.34.14    i-014efbf2d513ddb46   <none>           <none>
busybox-6b5c698b45-8dpdf   1/1     Running   0          113s   172.30.49.31    i-014efbf2d513ddb46   <none>           <none>
busybox-6b5c698b45-d5kml   1/1     Running   0          113s   172.30.80.30    i-05bf2ec120f5a7a9e   <none>           <none>
busybox-6b5c698b45-gpxpl   1/1     Running   0          13s    172.30.36.132   i-014efbf2d513ddb46   <none>           <none>
busybox-6b5c698b45-k68gd   0/1     Error     0          113s   172.30.76.99    i-05bf2ec120f5a7a9e   <none>           <none>
busybox-6b5c698b45-mcvx2   1/1     Running   0          113s   172.30.41.191   i-014efbf2d513ddb46   <none>           <none>
busybox-6b5c698b45-thnnk   0/1     Error     0          113s   172.30.92.217   i-05bf2ec120f5a7a9e   <none>           <none>
mario-687bcfc9cc-cv4l4     1/1     Running   0          39m    172.30.38.69    i-014efbf2d513ddb46   <none>           <none>

# This is a node-level event, so check the cluster events
k get events
4m48s       Warning   Failed                   pod/nginx-19                    Failed to pull image "nginx:1.19.19": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/library/nginx:1.19.19": failed to resolve reference "docker.io/library/nginx:1.19.19": docker.io/library/nginx:1.19.19: not found
4m48s       Warning   Failed                   pod/nginx-19                    Error: ErrImagePull
5m2s        Normal    BackOff                  pod/nginx-19                    Back-off pulling image "nginx:1.19.19"
5m2s        Warning   Failed                   pod/nginx-19                    Error: ImagePullBackOff
4m39s       Normal    Pulling                  pod/nginx-19                    Pulling image "nginx:1.19"
4m33s       Normal    Pulled                   pod/nginx-19                    Successfully pulled image "nginx:1.19" in 6.166666328s
4m33s       Normal    Created                  pod/nginx-19                    Created container nginx-pod
4m33s       Normal    Started                  pod/nginx-19                    Started container nginx-pod
3m53s       Normal    Killing                  pod/nginx-19                    Stopping container nginx-pod

# Check for disk-shortage events
k describe nodes

Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 10 Mar 2023 16:05:48 +0900   Fri, 10 Mar 2023 14:36:46 +0900   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 10 Mar 2023 16:05:48 +0900   Fri, 10 Mar 2023 14:36:46 +0900   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 10 Mar 2023 16:05:48 +0900   Fri, 10 Mar 2023 14:36:46 +0900   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 10 Mar 2023 16:05:48 +0900   Fri, 10 Mar 2023 14:37:37 +0900   KubeletReady                 kubelet is posting ready status. AppArmor enabled

[root@kops-ec2 ~]# k get events| grep FreeDiskSpaceFailed
4m34s       Warning   FreeDiskSpaceFailed      node/i-05bf2ec120f5a7a9e        failed to garbage collect required amount of images. Wanted to free 15788516966 bytes, but freed 0 bytes
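
The "Wanted to free ... bytes" figure in this event comes from the kubelet's image garbage collector. As far as I understand it, once disk usage passes the high threshold (85% by default) the kubelet tries to delete enough unused images to get back down to the low threshold (80% by default). It freed 0 bytes here, presumably because the space was taken by the 110g file rather than by images. A rough sketch of that arithmetic, based on my reading of the behavior; gc_bytes_to_free is a hypothetical name and the numbers in the usage note are made up:

```shell
# Hypothetical helper: gc_bytes_to_free <capacity-bytes> <available-bytes> [low-threshold-percent]
# Bytes that must be freed so that usage drops back to the low threshold.
gc_bytes_to_free() {
  capacity=$1
  available=$2
  low_pct=${3:-80}   # assumed default low threshold
  echo $(( capacity * (100 - low_pct) / 100 - available ))
}
```

With a capacity of 1000 bytes and 80 bytes available, gc_bytes_to_free 1000 80 gives 120: usage is 920 bytes, and the 80% target is 800 bytes.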

# Clean up
[root@kops-ec2 ~]# k delete deploy busybox
deployment.apps "busybox" deleted
[root@kops-ec2 ~]# ssh -i ~/.ssh/id_rsa ubuntu@$W2PIP rm -rf 110g-file
[root@kops-ec2 ~]#

Remediation process guide

https://learnk8s.io/troubleshooting-deployments

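Wrapping up

I have only just finished week 1,
but it feels like I was simply handed knowledge and know-how that is hard to pick up at work or through solo study.
I now understand why people join study groups, and I experienced the power of collective intelligence once again.
No more using a busy job as an excuse; I should put in more hours at night.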
