kubernetes shell-operator

cloud2000 · September 16, 2023

1. Overview


While looking into ks-installer (https://github.com/kubesphere/ks-installer), the installer for kubesphere (https://kubesphere.io), a Kubernetes management tool that runs on bare-metal (BM) or VM environments, I found that it uses shell-operator to handle kubesphere workloads as an operator, so this post analyzes how that works.

shell-operator (https://github.com/flant/shell-operator) describes itself as "A tool for running event-driven scripts in a Kubernetes cluster": it is a k8s operator that runs configured shell scripts in response to specific k8s events.

2. Hooks


A hook is a bash script or any other executable file that is run when a Kubernetes event is received.
When shell-operator first starts, it performs the following steps.

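  • It searches the hooks directory for hook files. The hooks directory can be set with the --hooks-dir option or the SHELL_OPERATOR_HOOKS_DIR environment variable (the default is /hooks).
  • Every file in the hooks directory that has execute permission is treated as a hook. The hooks found are sorted alphabetically by directory and file name.
  • Each hook is executed with the --config flag. The hook responds with its event bindings in YAML or JSON, which is how shell-operator obtains the hook configuration.
  • If the hook configuration loads successfully, the "main" work queue is filled with the onStartup hooks.
  • Next, the "main" queue is filled with the kubernetes hooks, each with a Synchronization binding context that contains the Kubernetes objects matching the hook configuration. This is how a hook receives the k8s objects that already exist.
  • After the hooks have run with the Synchronization binding context, shell-operator starts monitoring Kubernetes events according to the configured bindings.
    • Each monitor stores a snapshot.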
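The discovery step in the list above can be sketched in Python. This is only an illustration of "executable files, sorted alphabetically", not shell-operator's actual implementation (which is written in Go); find_hooks and make_file are hypothetical names introduced here.

```python
import os
import stat
import tempfile

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def find_hooks(hooks_dir):
    """Return executable files under hooks_dir as sorted relative paths."""
    hooks = []
    for root, _dirs, files in os.walk(hooks_dir):
        for name in files:
            path = os.path.join(root, name)
            # A file counts as a hook only if an execute bit is set.
            if os.stat(path).st_mode & EXEC_BITS:
                hooks.append(os.path.relpath(path, hooks_dir))
    return sorted(hooks)


def make_file(base, rel_path, mode):
    """Hypothetical helper: create an empty file with the given permissions."""
    path = os.path.join(base, rel_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w"):
        pass
    os.chmod(path, mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as hooks_dir:
        make_file(hooks_dir, "b/hook.sh", 0o755)
        make_file(hooks_dir, "a/hook.py", 0o755)
        make_file(hooks_dir, "a/README.md", 0o644)  # not executable, ignored
        print(find_hooks(hooks_dir))
```

On a POSIX system the demo lists a/hook.py before b/hook.sh and skips README.md, because that file has no execute bit.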

Below is a simple hook that receives the event emitted when a k8s pod is created and prints the pod name.

Create the /hooks/pods-hook.sh file

#!/usr/bin/env bash

if [[ $1 == "--config" ]] ; then   # when shell-operator runs the hook with --config, print the event binding configuration as YAML
  cat <<EOF
configVersion: v1
kubernetes:
- apiVersion: v1
  kind: Pod
  executeHookOnEvent:
  - Added
EOF
else                              # an event arrived with a binding context; implement the hook logic here
  type=$(jq -r '.[0].type' "$BINDING_CONTEXT_PATH")
  if [[ $type == "Event" ]] ; then
    podName=$(jq -r '.[0].object.metadata.name' "$BINDING_CONTEXT_PATH")
    echo "Pod '${podName}' added"
  fi
fi
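
Since a hook can be any executable, the same logic can also be sketched in Python. This is a minimal sketch with two assumptions: a Python 3 interpreter is available in the operator image, and the file pointed to by BINDING_CONTEXT_PATH is a JSON array whose "Event" entries carry the pod under "object", as the jq expressions in the bash hook imply. HOOK_CONFIG and added_pod_names are names introduced here for illustration. Unlike the bash version, it loops over every entry instead of reading only the first one.

```python
#!/usr/bin/env python3
import json
import os
import sys

# Same binding configuration as the bash hook, expressed as JSON.
HOOK_CONFIG = {
    "configVersion": "v1",
    "kubernetes": [
        {"apiVersion": "v1", "kind": "Pod", "executeHookOnEvent": ["Added"]}
    ],
}


def added_pod_names(contexts):
    """Collect pod names from every binding context of type "Event"."""
    names = []
    for ctx in contexts:
        if ctx.get("type") == "Event":
            names.append(ctx["object"]["metadata"]["name"])
    return names


def main(argv):
    # Configuration phase: print the bindings and exit.
    if len(argv) > 1 and argv[1] == "--config":
        print(json.dumps(HOOK_CONFIG))
        return
    # Event phase: read the binding context file and report added pods.
    with open(os.environ["BINDING_CONTEXT_PATH"]) as f:
        contexts = json.load(f)
    for name in added_pod_names(contexts):
        print(f"Pod '{name}' added")


if __name__ == "__main__":
    main(sys.argv)
```

Entries of type "Synchronization", which arrive once at startup, are skipped, which matches the behavior of the bash hook.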
  • Create the Dockerfile
FROM ghcr.io/flant/shell-operator:latest
ADD hooks /hooks
  • Build and push the image
$ docker build -t cloud2000/shell-operator:monitor-pods .
$ docker push cloud2000/shell-operator:monitor-pods

Contents of shell-operator-test.yaml

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: monitor-pods-acc
  namespace: example-monitor-pods

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitor-pods
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: monitor-pods
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitor-pods
subjects:
  - kind: ServiceAccount
    name: monitor-pods-acc
    namespace: example-monitor-pods

---
apiVersion: v1
kind: Pod
metadata:
  name: shell-operator
  namespace: example-monitor-pods
spec:
  containers:
  - name: shell-operator
    image: cloud2000/shell-operator:monitor-pods
    imagePullPolicy: Always
  serviceAccountName: monitor-pods-acc
  • Create the shell-operator-test pod, then create a busybox deployment as a test (the deployment is named busygox in the command and logs that follow)
$ kubectl -n example-monitor-pods apply -f shell-operator-test.yaml

$ kubectl create deploy busygox --image=busybox
  • shell-operator log output
// Initialization log at first startup
{"level":"info","msg":"shell-operator main-a16ab23b-2023.08.28_12:20:59","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Debug endpoint listen on /var/run/shell-operator/debug.socket","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Listen on 0.0.0.0:9115","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_kubernetes_client_request_latency_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_kubernetes_client_request_result_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Kubernetes client is configured successfully with 'out-of-cluster' config","operator.component":"KubernetesAPIClient","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Kubernetes client is configured successfully with 'out-of-cluster' config","operator.component":"KubernetesAPIClient","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_live_ticks","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_tasks_queue_action_duration_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric gauge shell_operator_tasks_queue_length","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric gauge shell_operator_kube_snapshot_objects","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_kube_jq_filter_duration_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_kube_event_duration_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_kubernetes_client_watch_errors_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric gauge shell_operator_hook_enable_kubernetes_bindings_seconds","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_hook_enable_kubernetes_bindings_errors_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric gauge shell_operator_hook_enable_kubernetes_bindings_success","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_hook_run_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_hook_run_user_cpu_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_hook_run_sys_cpu_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric gauge shell_operator_hook_run_max_rss_bytes","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_hook_run_errors_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_hook_run_allowed_errors_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_hook_run_success_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric counter shell_operator_task_wait_in_queue_seconds_total","operator.component":"metricStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Initialize hooks manager. Search for and load all hooks.","time":"2023-09-17T02:40:58Z"}
{"hook":"pods-hook.sh","level":"info","msg":"Load config from '/hooks/pods-hook.sh'","phase":"config","time":"2023-09-17T02:40:58Z"}
{"hook":"pods-hook.sh","level":"info","msg":"Loaded config: Watch k8s kinds: 'Pod'","phase":"config","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"start shell-operator","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"queue task EnableKubernetesBindings:::pods-hook.sh:EnableKubernetesBindings for hook pods-hook.sh","operator.component":"initMainQueue","time":"2023-09-17T02:40:58Z"}
{"binding":"","hook":"pods-hook.sh","level":"info","msg":"Enable kubernetes binding for hook","queue":"main","task":"EnableKubernetesBindings","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_kubernetes_client_rate_limiter_latency_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"binding":"","hook":"pods-hook.sh","level":"info","msg":"Kubernetes bindings for hook are enabled successfully, 1 tasks generated","queue":"main","task":"EnableKubernetesBindings","time":"2023-09-17T02:40:58Z"}
{"binding":"kubernetes","event":"kubernetes","hook":"pods-hook.sh","level":"info","msg":"Execute hook","queue":"main","task":"HookRun","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_hook_run_sys_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"level":"info","msg":"Create metric histogram shell_operator_hook_run_user_seconds","operator.component":"metricsStorage","time":"2023-09-17T02:40:58Z"}
{"binding":"kubernetes","event":"kubernetes","hook":"pods-hook.sh","level":"info","msg":"Hook executed successfully","queue":"main","task":"HookRun","time":"2023-09-17T02:40:58Z"}
{"binding":"kubernetes","event":"kubernetes","hook":"pods-hook.sh","level":"info","msg":"Unlock kubernetes.Event tasks","queue":"main","task":"HookRun","time":"2023-09-17T02:40:58Z"}

// Log when the busybox pod is created
$ kubectl -n example-monitor-pods logs -f shell-operator
{"binding":"kubernetes","event.id":"c08bf394-8a59-4f21-bb42-8826b637bee3","level":"info","msg":"queue task HookRun:main:kubernetes:pods-hook.sh:kubernetes","queue":"main","time":"2023-09-17T02:47:26Z"}
{"binding":"kubernetes","event":"kubernetes","hook":"pods-hook.sh","level":"info","msg":"Execute hook","queue":"main","task":"HookRun","time":"2023-09-17T02:47:27Z"}
{"binding":"kubernetes","event":"kubernetes","hook":"pods-hook.sh","level":"info","msg":"Pod 'busygox-7c9f7ddff9-xzhhw' added","output":"stdout","queue":"main","task":"HookRun","time":"2023-09-17T02:47:27Z"}
{"binding":"kubernetes","event":"kubernetes","hook":"pods-hook.sh","level":"info","msg":"Hook executed successfully","queue":"main","task":"HookRun","time":"2023-09-17T02:47:27Z"}
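
In the JSON log above, the line produced by the hook itself is the one that carries "output":"stdout". A small Python sketch, assuming that field is a reliable marker for hook output (inferred only from the log shown here), can pick those lines out of the stream; hook_stdout_messages is a hypothetical name.

```python
import json
import sys


def hook_stdout_messages(lines):
    """Return the "msg" of JSON log entries that carry hook stdout output."""
    messages = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip shell prompts, comments, and blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that only look like JSON
        if entry.get("output") == "stdout":
            messages.append(entry.get("msg", ""))
    return messages


if __name__ == "__main__":
    # Read log lines from standard input and print only the hook output.
    for msg in hook_stdout_messages(sys.stdin):
        print(msg)
```

Saved as, say, filter_hook_output.py (a made-up file name), it could be fed from a pipe: kubectl -n example-monitor-pods logs shell-operator | python3 filter_hook_output.py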

3. kubesphere installer


I was curious how the directories inside the ks-installer pod are laid out, so I slightly modified the ks-installer Dockerfile, built and ran it, then opened a bash shell in the pod and looked at the directory structure.

/hooks
└── kubesphere
    ├── installRunner.py
    └── schedule.sh
    
/kubesphere
├── config
├── installer
│   └── roles
│       ├── check-result
│       ├── common
│       ├── download
│       ├── edgeruntime
│       ├── gatekeeper
│       ├── ks-auditing
│       ├── ks-core
│       ├── ks-devops
│       ├── ks-events
│       ├── ks-istio
│       ├── ks-logging
│       ├── ks-migration
│       ├── ks-monitor
│       ├── ks-multicluster
│       ├── ks-network
│       ├── kubesphere-defaults
│       ├── metrics-server
│       ├── openpitrix
├── playbooks
│   ├── alerting.yaml
│   ├── auditing.yaml
│   ├── common.yaml
│   ├── devops.yaml
│   ├── edgeruntime.yaml
│   ├── events.yaml
│   ├── gatekeeper.yaml
│   ├── gitlab.yaml
│   ├── harbor.yaml
│   ├── ks-config.yaml
│   ├── ks-core.yaml
│   ├── ks-migration.yaml
│   ├── logging.yaml
│   ├── metering.yaml
│   ├── metrics_server.yaml
│   ├── monitoring.yaml
│   ├── multicluster.yaml
│   ├── network.yaml
│   ├── notification.yaml
│   ├── openpitrix.yaml
│   ├── preinstall.yaml
│   ├── result-info.yaml
│   ├── servicemesh.yaml
│   └── telemetry.yaml
└── results
    └── env
        ├── cmdline
        └── extravars    

Unfortunately, ks-installer uses shell-operator v1.0.0-beta.5 as its base image. As of September 2023 the latest version is v1.3.1, so the base image looks due for an update.

At first startup the log appears as plain text, as shown below; shell-operator versions after v1.0.0-beta.5 switched to JSON format.

root@utcl:/etc/kubernetes# kubectl -n kubesphere-system logs -f ks-installer-556bd6cd47-c2v95
2023-09-11T09:54:22Z INFO     : shell-operator latest
2023-09-11T09:54:22Z INFO     : Use temporary dir: /tmp/shell-operator
2023-09-11T09:54:22Z INFO     : Initialize hooks manager ...
2023-09-11T09:54:22Z INFO     : Search and load hooks ...
2023-09-11T09:54:22Z INFO     : Load hook config from '/hooks/kubesphere/installRunner.py'
2023-09-11T09:54:22Z INFO     : HTTP SERVER Listening on 0.0.0.0:9115
2023-09-11T09:54:22Z INFO     : Load hook config from '/hooks/kubesphere/schedule.sh'
2023-09-11T09:54:22Z INFO     : Initializing schedule manager ...
2023-09-11T09:54:22Z INFO     : KUBE Init Kubernetes client
2023-09-11T09:54:22Z INFO     : KUBE-INIT Kubernetes client is configured successfully
2023-09-11T09:54:22Z INFO     : MAIN: run main loop
2023-09-11T09:54:22Z INFO     : MAIN: add onStartup tasks
2023-09-11T09:54:22Z INFO     : Running schedule manager ...
2023-09-11T09:54:22Z INFO     : QUEUE add all HookRun@OnStartup
2023-09-11T09:54:22Z INFO     : MSTOR Create new metric shell_operator_live_ticks
2023-09-11T09:54:22Z INFO     : MSTOR Create new metric shell_operator_tasks_queue_length
2023-09-11T09:54:22Z INFO     : GVR for kind 'ClusterConfiguration' is installer.kubesphere.io/v1alpha1, Resource=clusterconfigurations
2023-09-11T09:54:22Z INFO     : EVENT Kube event '3d1e7ecd-543b-4e23-8367-77dcfa4e99bf'
2023-09-11T09:54:22Z INFO     : QUEUE add TASK_HOOK_RUN@KUBE_EVENTS kubesphere/installRunner.py
2023-09-11T09:54:25Z INFO     : TASK_RUN HookRun@KUBE_EVENTS kubesphere/installRunner.py
2023-09-11T09:54:25Z INFO     : Running hook 'kubesphere/installRunner.py' binding 'KUBE_EVENTS' ...
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that
the implicit localhost does not match 'all'

PLAY [localhost] ***************************************************************

TASK [download : Generating images list] ***************************************
skipping: [localhost]

On startup, the main() function of /hooks/kubesphere/installRunner.py is executed.
shell-operator first calls installRunner.py with the --config option to register the kubesphere installer's hook configuration, then calls it again without options for synchronization, at which point preInstallTasks() runs ansible-playbook.

The ks hook configuration. It reacts only to changes in the .spec field when the ClusterConfiguration CR is added or updated.

{
	"onKubernetesEvent": [{
		"name": "Monitor clusterconfiguration",
		"kind": "ClusterConfiguration",
		"event": [ "add", "update" ],
		"objectName": "ks-installer",
		"namespaceSelector": {
			"matchNames": ["kubesphere-system"]
		},
		"jqFilter": ".spec",
		"allowFailure": false
	}]
}
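
The effect of "jqFilter": ".spec" can be illustrated with a small Python sketch. It rests on an assumption drawn from how shell-operator's documentation describes jqFilter: an update triggers the hook only when the filtered part of the object has changed. The checksum comparison is purely illustrative and is not shell-operator's own code; filter_checksum and should_run_hook are hypothetical names.

```python
import hashlib
import json


def filter_checksum(obj, key="spec"):
    """Checksum of the filtered part of an object.

    Selecting the top-level "spec" field mimics the jq filter ".spec".
    """
    filtered = obj.get(key)
    payload = json.dumps(filtered, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def should_run_hook(old_obj, new_obj):
    """Run the hook on update only if the filtered part changed."""
    return filter_checksum(old_obj) != filter_checksum(new_obj)


if __name__ == "__main__":
    # Hypothetical ClusterConfiguration fragments, for illustration only.
    old = {"spec": {"devops": {"enabled": False}}, "status": {"phase": "a"}}
    status_only = {"spec": {"devops": {"enabled": False}}, "status": {"phase": "b"}}
    spec_change = {"spec": {"devops": {"enabled": True}}, "status": {"phase": "a"}}
    print(should_run_hook(old, status_only))
    print(should_run_hook(old, spec_change))
```

Under this model a status-only update of the ClusterConfiguration would not start a new installer run, while a change under .spec would.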

Whenever the watched ClusterConfiguration is updated afterwards, installRunner.py is invoked again and follows the same flow.

def main():
    global privateDataDir, playbookBasePath, configFile, statusFile
	
    # When shell-operator calls the hook with --config, return the hook configuration.
    if len(sys.argv) > 1 and sys.argv[1] == "--config":
        print(ks_hook)
        return

    if len(sys.argv) > 1 and sys.argv[1] == "--debug":
        privateDataDir = os.path.abspath('./results')
        playbookBasePath = os.path.abspath('./playbooks')
        configFile = os.path.abspath('./results/ks-config.json')
        statusFile = os.path.abspath('./results/ks-status.json')
        config.load_kube_config()
    else:
        config.load_incluster_config()

    if not os.path.exists(privateDataDir):
        os.makedirs(privateDataDir)

    api = client.CustomObjectsApi()
    # This part handles migration of the kubesphere ClusterConfiguration.
    generate_new_cluster_configuration(api)

    # Generate the configuration.
    generateConfig(api)

    # execute preInstall tasks
    # Now install the kubesphere workloads with ansible, according to the selected options.
    preInstallTasks()
    resultState = getResultInfo()
    resultInfo(resultState, api)


if __name__ == '__main__':
    main()

4. Issues

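  • The ansible script that decides HA for the kubesphere core components counts master nodes by matching "master" in the ROLES column of kubectl get node; this should be changed to "control-plane". On clusters whose nodes carry only the control-plane role (the default with recent kubeadm releases), grep master matches nothing and the count is 0.
  • Relevant excerpt from ks-installer/roles/ks-core/ks-core/tasks/main.yaml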

- name: KubeSphere | Getting Kubernetes master num
  shell: >
    {{ bin_dir }}/kubectl get node | awk '{if(NR>1){print $3}}' | grep master |wc -l
  register: masters
  failed_when: false

- name: KubeSphere | Setting master num
  set_fact:
    master_num: "{{ masters.stdout }}"
  failed_when: false

- name: KubeSphere | Override master num
  set_fact:
    master_num: "3"
  failed_when: false
  when:
    - master_num is defined and (master_num == "1" or master_num == "0")
    - enableHA is defined and enableHA

- name: KubeSphere | Setting enableHA
  set_fact:
    enableHA: >-
      {% if masters is defined and masters.stdout is defined and masters.stdout != "0" and masters.stdout != "1" %}true{% else %}false{% endif %}
  when:
    - enableHA is not defined
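
To make the proposed change concrete, here is a Python sketch of the counting step that accepts both role names. It is not a patch for ks-installer, only an illustration of the matching logic; count_control_plane_nodes and the sample kubectl get node output are hypothetical.

```python
def count_control_plane_nodes(kubectl_output):
    """Count nodes whose ROLES column contains "master" or "control-plane"."""
    count = 0
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        roles = fields[2].split(",")  # ROLES is the third column
        if "master" in roles or "control-plane" in roles:
            count += 1
    return count


# Hypothetical output of `kubectl get node` on a newer cluster.
SAMPLE_NEW = """
NAME    STATUS   ROLES           AGE   VERSION
node1   Ready    control-plane   10d   v1.27.3
node2   Ready    control-plane   10d   v1.27.3
node3   Ready    control-plane   10d   v1.27.3
node4   Ready    <none>          10d   v1.27.3
"""

if __name__ == "__main__":
    print(count_control_plane_nodes(SAMPLE_NEW))
```

With the sample above, the original grep master pipeline would report 0, while matching on either role name reports the three control-plane nodes.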