Definitions
Usage
- Write the monitoring targets in config.yml
- Start the server; monitoring begins once it is running
Server information
- healthcheck: /healthz
- listen port: 38080
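Alert conditions
- The connection fails (that is, all 3 attempts fail)
- The response code differs from the expected status code
- If no status code is defined, the response code is anything other than 200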
Examples
config.yml example
urls:
  - url: http://localhost:5051
    status_code: 200
    slack_token: "TOKEN/TOKEN/TOKEN"
    scheduler: "@every 2s"
  - name: "check server2"
    url: http://localhost:5052
    status_code: 200
    slack_token: "TOKEN/TOKEN/TOKEN"
    scheduler: "@every 2s"
  - name: "nginx server"
    url: http://localhost:8080/no
    slack_token: "TOKEN/TOKEN/TOKEN"
    scheduler: "@every 20m"
Monitoring example
❯ go run main.go
16:45:51 :: DEBUG default server name defined -- http://localhost:5051
16:45:51 :: DEBUG Initializing APIs
16:45:51 :: DEBUG HTTP start :: listening port: 38080
┌───────────────────────────────────────────────────┐
│ Fiber v2.29.0 │
│ http://127.0.0.1:38080 │
│ (bound on host 0.0.0.0 and port 38080) │
│ │
│ Handlers ............. 3 Processes ........... 1 │
│ Prefork ....... Disabled PID ............. 34086 │
└───────────────────────────────────────────────────┘
16:45:53 :: DEBUG [200] -- check server2
16:45:53 :: DEBUG [200] -- http://localhost:5051
16:45:55 :: DEBUG [200] -- check server2
16:45:55 :: DEBUG [200] -- http://localhost:5051
16:46:01 :: DEBUG [200] -- http://localhost:5051
16:46:01 :: ERROR 1th Retry to check server2 -- 100ms
16:46:01 :: ERROR 2th Retry to check server2 -- 115.651925ms
16:46:01 :: ERROR 3th Retry to check server2 -- 129.090855ms
16:46:01 :: ERROR Failed to Connect -- check server2 (http://localhost:5052)
16:46:01 :: NOTICE Sending Message to Slack: {"text":"💥 [Connection Failed] check server2 -- http://localhost:5052"}
16:46:13 :: DEBUG [200] -- check server2
16:46:13 :: ERROR 1th Retry to http://localhost:5051 -- 100ms
16:46:13 :: ERROR 2th Retry to http://localhost:5051 -- 138.065718ms
16:46:13 :: ERROR 3th Retry to http://localhost:5051 -- 195.417452ms
16:46:13 :: ERROR Failed to Connect -- http://localhost:5051 (http://localhost:5051)
16:46:13 :: NOTICE Sending Message to Slack: {"text":"💥 [Connection Failed] http://localhost:5051 -- http://localhost:5051"}
16:46:15 :: DEBUG [200] -- check server2
16:46:15 :: ERROR 1th Retry to http://localhost:5051 -- 100ms
16:46:15 :: ERROR 2th Retry to http://localhost:5051 -- 128.303415ms
16:46:15 :: DEBUG [200] -- http://localhost:5051
16:46:17 :: DEBUG [200] -- http://localhost:5051
16:46:17 :: DEBUG [200] -- check server2
16:57:25 :: ERROR [404] -- nginx server (http://localhost:8080/no)
16:57:25 :: NOTICE Sending Message to Slack: {"text":"💔 [404] nginx server -- http://localhost:8080/no"}
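The Slack payloads in the log above are plain JSON objects with a single text field. A minimal Go sketch that reproduces the two message shapes seen in the log; the type and function names are hypothetical, and how the tool delivers the payload to Slack is not shown in the log, so sending is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// slackMessage matches the {"text": "..."} payload seen in the log.
type slackMessage struct {
	Text string `json:"text"`
}

// connectionFailedPayload builds the message logged after all retries fail.
func connectionFailedPayload(name, url string) (string, error) {
	text := fmt.Sprintf("💥 [Connection Failed] %s -- %s", name, url)
	b, err := json.Marshal(slackMessage{Text: text})
	return string(b), err
}

// statusMismatchPayload builds the message logged for an unexpected status code.
func statusMismatchPayload(code int, name, url string) (string, error) {
	text := fmt.Sprintf("💔 [%d] %s -- %s", code, name, url)
	b, err := json.Marshal(slackMessage{Text: text})
	return string(b), err
}

func main() {
	payload, err := statusMismatchPayload(404, "nginx server", "http://localhost:8080/no")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(payload)
}
```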
Health check of the monitoring server
❯ curl 0.0.0.0:38080/healthz
OK%
17:00:32 :: DEBUG [200] -- http://localhost:5051
17:00:32 :: DEBUG [200] -- check server2
17:00:33 :: DEBUG healthcheck -- 127.0.0.1
When deployed on k8s
08:16:41 :: DEBUG default server name defined -- http://[IP]:5051
08:16:41 :: DEBUG Initializing APIs
08:16:41 :: DEBUG HTTP start :: listening port: 38080
┌───────────────────────────────────────────────────┐
│ Fiber v2.29.0 │
│ http://127.0.0.1:38080 │
│ (bound on host 0.0.0.0 and port 38080) │
│ │
│ Handlers ............. 3 Processes ........... 1 │
│ Prefork ....... Disabled PID ................. 1 │
└───────────────────────────────────────────────────┘
08:16:51 :: DEBUG [200] -- http://[IP]:5051
08:16:51 :: DEBUG [200] -- check server2
08:17:00 :: INFO healthcheck -- 10.1.0.1
08:17:00 :: INFO healthcheck -- 10.1.0.1
08:17:01 :: DEBUG [200] -- http://[IP]:5051
08:17:01 :: DEBUG [200] -- check server2
08:17:10 :: INFO healthcheck -- 10.1.0.1
08:17:11 :: ERROR [404] -- nginx server (http://[IP]:8080/no)
08:17:11 :: NOTICE Sending Message to Slack: {"text":"💔 [404] nginx server -- http://[IP]:8080/no"}
08:17:51 :: DEBUG [200] -- http://[IP]:5051
08:17:53 :: ERROR 1th Retry to check server2 -- 100ms
08:17:55 :: ERROR 2th Retry to check server2 -- 129.310185ms
08:17:57 :: ERROR 3th Retry to check server2 -- 303.725402ms
08:17:57 :: ERROR Failed to Connect -- check server2 (http://[IP]:5052)
08:17:57 :: NOTICE Sending Message to Slack: {"text":"💥 [Connection Failed] check server2 -- http://[IP]:5052"}
08:18:00 :: INFO healthcheck -- 10.1.0.1
08:18:01 :: DEBUG [200] -- http://[IP]:5051
08:18:03 :: ERROR 1th Retry to check server2 -- 100ms
08:18:05 :: ERROR 2th Retry to check server2 -- 186.533501ms
08:18:07 :: ERROR 3th Retry to check server2 -- 309.015749ms
08:18:07 :: ERROR Failed to Connect -- check server2 (http://[IP]:5052)
08:18:07 :: NOTICE Sending Message to Slack: {"text":"💥 [Connection Failed] check server2 -- http://[IP]:5052"}
08:18:10 :: INFO healthcheck -- 10.1.0.1
TODO