Building a Monitoring Batch - 2

zzery · March 20, 2022


Preliminary definitions

Usage

  • Write the monitoring targets in config.yml
  • Start the server, which then monitors the targets

Server information

  • healthcheck: /healthz
  • listen port: 38080

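Alert conditions

  • The connection fails (that is, all 3 attempts fail)
  • The response code differs from the expected status code
  • If no expected status code is defined, the response code is not 200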

Usage examples

config.yml example

urls:
  # test server 1
  - url: http://localhost:5051
    # name: "check server1" # if name is not defined, the url is used as the name
    status_code: 200
    slack_token: "TOKEN/TOKEN/TOKEN" # e.g. TOKEN/TOKEN/TOKEN
    scheduler: "@every 2s"

  # test server 2
  - name: "check server2"
    url: http://localhost:5052
    status_code: 200
    slack_token: "TOKEN/TOKEN/TOKEN" # e.g. TOKEN/TOKEN/TOKEN
    scheduler: "@every 2s"

  # test server 3 (nginx)
  - name: "nginx server"
    url: http://localhost:8080/no
    # status_code: 200 # if the status code is not defined, it defaults to 200
    slack_token: "TOKEN/TOKEN/TOKEN" # e.g. TOKEN/TOKEN/TOKEN
    scheduler: "@every 20m"

Monitoring example

❯ go run main.go
16:45:51 :: DEBUG  default server name defined -- http://localhost:5051
16:45:51 :: DEBUG  Initializing APIs
16:45:51 :: DEBUG  HTTP start :: listening port: 38080

 ┌───────────────────────────────────────────────────┐
 │                   Fiber v2.29.0                   │
 │              http://127.0.0.1:38080               │
 │      (bound on host 0.0.0.0 and port 38080)       │
 │                                                   │
 │ Handlers ............. 3  Processes ........... 1 │
 │ Prefork ....... Disabled  PID ............. 34086 │
 └───────────────────────────────────────────────────┘

16:45:53 :: DEBUG  [200] -- check server2
16:45:53 :: DEBUG  [200] -- http://localhost:5051
16:45:55 :: DEBUG  [200] -- check server2
16:45:55 :: DEBUG  [200] -- http://localhost:5051
# ...

# server 2 goes down
16:46:01 :: DEBUG  [200] -- http://localhost:5051
16:46:01 :: ERROR  1th Retry to check server2 -- 100ms
16:46:01 :: ERROR  2th Retry to check server2 -- 115.651925ms
16:46:01 :: ERROR  3th Retry to check server2 -- 129.090855ms
16:46:01 :: ERROR  Failed to Connect -- check server2 (http://localhost:5052)
16:46:01 :: NOTICE  Sending Message to Slack: {"text":"💥 [Connection Failed] check server2 -- http://localhost:5052"}

# server 1 goes down
16:46:13 :: DEBUG  [200] -- check server2
16:46:13 :: ERROR  1th Retry to http://localhost:5051 -- 100ms
16:46:13 :: ERROR  2th Retry to http://localhost:5051 -- 138.065718ms
16:46:13 :: ERROR  3th Retry to http://localhost:5051 -- 195.417452ms
16:46:13 :: ERROR  Failed to Connect -- http://localhost:5051 (http://localhost:5051)
16:46:13 :: NOTICE  Sending Message to Slack: {"text":"💥 [Connection Failed] http://localhost:5051 -- http://localhost:5051"}
16:46:15 :: DEBUG  [200] -- check server2
16:46:15 :: ERROR  1th Retry to http://localhost:5051 -- 100ms
16:46:15 :: ERROR  2th Retry to http://localhost:5051 -- 128.303415ms
16:46:15 :: DEBUG  [200] -- http://localhost:5051 # connection succeeded before the failure condition was reached
16:46:17 :: DEBUG  [200] -- http://localhost:5051
16:46:17 :: DEBUG  [200] -- check server2

# when the response code is not the expected one
16:57:25 :: ERROR  [404] -- nginx server (http://localhost:8080/no)
16:57:25 :: NOTICE  Sending Message to Slack: {"text":"💔 [404] nginx server -- http://localhost:8080/no"}
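
The Slack payloads in the log are plain `{"text": ...}` JSON objects. A sketch of how such a message could be built and posted is below. The names `slackMessage`, `buildPayload`, and `send` are hypothetical, and treating `slack_token` as the path suffix of a Slack incoming webhook URL is an assumption based on its `TOKEN/TOKEN/TOKEN` shape.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// slackMessage is the minimal body accepted by a Slack incoming webhook.
type slackMessage struct {
	Text string `json:"text"`
}

// buildPayload formats "<label> <name> -- <url>" the way the log lines show it.
func buildPayload(label, name, url string) ([]byte, error) {
	return json.Marshal(slackMessage{Text: fmt.Sprintf("%s %s -- %s", label, name, url)})
}

// send posts the payload to the given webhook URL and treats any non-200 reply as an error.
func send(client *http.Client, webhookURL string, payload []byte) error {
	resp, err := client.Post(webhookURL, "application/json", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("slack responded with %s", resp.Status)
	}
	return nil
}

func main() {
	payload, err := buildPayload("💥 [Connection Failed]", "check server2", "http://localhost:5052")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))

	// To actually deliver it (assumed URL layout, token taken from config.yml):
	// err = send(http.DefaultClient, "https://hooks.slack.com/services/"+token, payload)
}
```

Keeping payload construction separate from delivery makes the message format easy to check without calling Slack.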

Health check of the monitoring server

curl 0.0.0.0:38080/healthz
OK%

# log
17:00:32 :: DEBUG  [200] -- http://localhost:5051
17:00:32 :: DEBUG  [200] -- check server2
17:00:33 :: DEBUG  healthcheck -- 127.0.0.1 # << the health check shows up here

Running on k8s

08:16:41 :: DEBUG  default server name defined -- http://[IP]:5051
08:16:41 :: DEBUG  Initializing APIs
08:16:41 :: DEBUG  HTTP start :: listening port: 38080

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.29.0                   │ 
 │              http://127.0.0.1:38080               │ 
 │      (bound on host 0.0.0.0 and port 38080)       │ 
 │                                                   │ 
 │ Handlers ............. 3  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................. 1 │ 
 └───────────────────────────────────────────────────┘ 

08:16:51 :: DEBUG  [200] -- http://[IP]:5051
08:16:51 :: DEBUG  [200] -- check server2
08:17:00 :: INFO  healthcheck -- 10.1.0.1
08:17:00 :: INFO  healthcheck -- 10.1.0.1
08:17:01 :: DEBUG  [200] -- http://[IP]:5051
08:17:01 :: DEBUG  [200] -- check server2
08:17:10 :: INFO  healthcheck -- 10.1.0.1
08:17:11 :: ERROR  [404] -- nginx server (http://[IP]:8080/no)
08:17:11 :: NOTICE  Sending Message to Slack: {"text":"💔 [404] nginx server -- http://[IP]:8080/no"}

# ...

08:17:51 :: DEBUG  [200] -- http://[IP]:5051
08:17:53 :: ERROR  1th Retry to check server2 -- 100ms
08:17:55 :: ERROR  2th Retry to check server2 -- 129.310185ms
08:17:57 :: ERROR  3th Retry to check server2 -- 303.725402ms
08:17:57 :: ERROR  Failed to Connect -- check server2 (http://[IP]:5052)
08:17:57 :: NOTICE  Sending Message to Slack: {"text":"💥 [Connection Failed] check server2 -- http://[IP]:5052"}
08:18:00 :: INFO  healthcheck -- 10.1.0.1
08:18:01 :: DEBUG  [200] -- http://[IP]:5051
08:18:03 :: ERROR  1th Retry to check server2 -- 100ms
08:18:05 :: ERROR  2th Retry to check server2 -- 186.533501ms
08:18:07 :: ERROR  3th Retry to check server2 -- 309.015749ms
08:18:07 :: ERROR  Failed to Connect -- check server2 (http://[IP]:5052)
08:18:07 :: NOTICE  Sending Message to Slack: {"text":"💥 [Connection Failed] check server2 -- http://[IP]:5052"}
08:18:10 :: INFO  healthcheck -- 10.1.0.1

To do

  • Split the config out into a configMap