[MySQL] Registering Percona for MySQL (InnoDB Cluster) with Orchestrator

Dong yeong Kim · November 20, 2023

group replication mysql orchestrator


OS : Rocky Linux 8.8 64bit

MySQL : Percona for MySQL 8.0.34

InnoDB Cluster : 1(Primary), 2(Secondary) (Single Primary)

node1(DB, Router) : 10.64.70.21
node2(DB) : 10.64.70.22
node3(DB) : 10.64.70.23

Hello. Following the earlier post on registering a ReplicaSet with Orchestrator, this post registers an InnoDB Cluster built on Percona for MySQL with Orchestrator.

Since Orchestrator version 2.3, Group Replication can also be registered with Orchestrator. However, Failover and several other features are not supported, so registration is useful for monitoring only. For details, see the Orchestrator GitHub FAQ below.
https://github.com/openark/orchestrator/blob/master/docs/faq.md#does-orchestrator-support-mysql-group-replication

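To summarize:

Orchestrator recognizes the nodes of one Group Replication group as a single cluster.

Nodes within the cluster cannot be relocated (Role Change, etc.) the way they can with classic Replication.

Failed nodes (by status) are placed as separate members (standalone nodes).

With only these features, you might wonder whether registering an InnoDB Cluster with Orchestrator is worth it at all. It becomes valuable when you need to monitor several clusters in one place rather than a single cluster.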

Registering with Orchestrator

Let's look at the status of the Cluster built in the previous post.

 MySQL  localhost:33060+ ssl  JS > cl.status()
{
    "clusterName": "testCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "10.64.70.21:33061",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "10.64.70.21:33061": {
                "address": "10.64.70.21:33061",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.34"
            },
            "10.64.70.22:33061": {
                "address": "10.64.70.22:33061",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.34"
            },
            "10.64.70.23:33061": {
                "address": "10.64.70.23:33061",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.34"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "10.64.70.21:33061"
}

All nodes are healthy, so let's start Orchestrator and run Discover.

Starting Orchestrator

[root@vm2-1 orchestrator]# nohup ./orchestrator http > orc.log &
[1] 47265
[root@vm2-1 orchestrator]# nohup: ignoring input and redirecting stderr to stdout

[root@vm2-1 orchestrator]# ps -ef | grep orc
root       47265   46977  0 08:30 pts/1    00:00:00 ./orchestrator http
root       47272   46977  0 08:31 pts/1    00:00:00 grep --color=auto orc

[root@vm2-1 orchestrator]# tail -50 orc.log
2023-11-20 08:30:59 DEBUG Connected to orchestrator backend: orchestrator:?@tcp(10.64.70.11:3306)/orchestrator?timeout=1s&readTimeout=30s&rejectReadOnly=false&interpolateParams=true
2023-11-20 08:30:59 DEBUG Orchestrator pool SetMaxOpenConns: 128
2023-11-20 08:30:59 DEBUG Initializing orchestrator
2023-11-20 08:30:59 INFO Connecting to backend 10.64.70.11:3306: maxConnections: 128, maxIdleConns: 32
2023-11-20 08:30:59 INFO Starting Discovery
2023-11-20 08:30:59 INFO Registering endpoints
2023-11-20 08:30:59 INFO Starting HTTP listener on :3000
2023-11-20 08:30:59 INFO continuous discovery: setting up
2023-11-20 08:30:59 INFO continuous discovery: starting
2023-11-20 08:30:59 DEBUG Queue.startMonitoring(DEFAULT)
2023-11-20 08:31:01 DEBUG Waiting for 15 seconds to pass before running failure detection/recovery

Orchestrator registration

[root@vm2-1 orchestrator]# ./orchestrator -c discover -i 10.64.70.21:33061
2023-11-20 08:31:42 DEBUG Hostname unresolved yet: 10.64.70.21
2023-11-20 08:31:42 DEBUG Cache hostname resolve 10.64.70.21 as 10.64.70.21
2023-11-20 08:31:42 DEBUG Connected to orchestrator backend: orchestrator:?@tcp(10.64.70.11:3306)/orchestrator?timeout=1s&readTimeout=30s&rejectReadOnly=false&interpolateParams=true
2023-11-20 08:31:42 DEBUG Orchestrator pool SetMaxOpenConns: 128
2023-11-20 08:31:42 DEBUG Initializing orchestrator
2023-11-20 08:31:42 INFO Connecting to backend 10.64.70.11:3306: maxConnections: 128, maxIdleConns: 32

Now open the Orchestrator web UI to check.

Oddly, only the Primary node that was registered is monitored; the Secondary nodes are not.

Orchestrator Debugging

To be honest, I burned a lot of time and effort here.
There was one glimmer of hope: Percona maintains its own fork, Percona Orchestrator.
However, the example above was in fact already run with Percona Orchestrator.

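First, an InnoDB Cluster built on standard Oracle MySQL is registered as follows.

Compared with classic Replication, the difference is the crown icon in each node's title bar: it is dark for the Primary and faint for a Secondary.

I approached registering the Percona for MySQL InnoDB Cluster from the following angles.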

1. Did anything change when Percona for MySQL was forked?

Group Replication is not a new feature. It has existed since 5.7, and the release notes show no Group Replication-related differences.

2. How does Orchestrator detect classic Replication?

The following statement appears repeatedly in the General Log:

2023-11-20T08:40:54.289304-05:00	  302 Query	select
      		substring_index(host, ':', 1) as slave_hostname
      	from
      		information_schema.processlist
      	where
          command IN ('Binlog Dump', 'Binlog Dump GTID')

3. How does Orchestrator detect an InnoDB Cluster built on Oracle MySQL?

The following SQL statement is executed repeatedly:

2023-11-20T08:52:35.35717-05:00	  302 Query	
SELECT
MEMBER_ID,
MEMBER_HOST,
MEMBER_PORT,
MEMBER_STATE,
MEMBER_ROLE,
@@global.group_replication_group_name,
@@global.group_replication_single_primary_mode
FROM
performance_schema.replication_group_members
WHERE
MEMBER_STATE != 'OFFLINE'

4. Does the same SQL appear in the General Log of the InnoDB Cluster built on Percona for MySQL?

It does not.

5. Does that SQL statement return valid data on the Percona for MySQL InnoDB Cluster?

 MySQL  localhost:33060+ ssl  SQL > SELECT
                                 -> MEMBER_ID,
                                 -> MEMBER_HOST,
                                 -> MEMBER_PORT,
                                 -> MEMBER_STATE,
                                 -> MEMBER_ROLE,
                                 -> @@global.group_replication_group_name,
                                 -> @@global.group_replication_single_primary_mode
                                 -> FROM
                                 -> performance_schema.replication_group_members
                                 -> WHERE
                                 -> MEMBER_STATE != 'OFFLINE';
+--------------------------------------+-------------+-------------+--------------+-------------+---------------------------------------+------------------------------------------------+
| MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | @@global.group_replication_group_name | @@global.group_replication_single_primary_mode |
+--------------------------------------+-------------+-------------+--------------+-------------+---------------------------------------+------------------------------------------------+
| 17294ad0-76ff-11ee-9b9e-0800275ac39e | 10.64.70.22 |       33061 | ONLINE       | SECONDARY   | c89845f2-87a6-11ee-99fb-08002734b9ef  |                                              1 |
| b902f580-812c-11ee-a0f5-0800275193ca | 10.64.70.23 |       33061 | ONLINE       | SECONDARY   | c89845f2-87a6-11ee-99fb-08002734b9ef  |                                              1 |
| dc36f96f-763d-11ee-ab3d-08002734b9ef | 10.64.70.21 |       33061 | ONLINE       | PRIMARY     | c89845f2-87a6-11ee-99fb-08002734b9ef  |                                              1 |
+--------------------------------------+-------------+-------------+--------------+-------------+---------------------------------------+------------------------------------------------+

It returns the expected rows.
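To make the shape of that result concrete, here is a small sketch that models the three returned rows in Go and picks out the Primary. The `groupMember` type and `primaryOf` function are hypothetical names for illustration and are not part of Orchestrator; the row values are copied from the query output above.

```go
package main

import "fmt"

// groupMember holds the per-member columns of one row of
// performance_schema.replication_group_members selected by the query above.
type groupMember struct {
	ID    string
	Host  string
	Port  int
	State string
	Role  string
}

// primaryOf returns the ONLINE member whose role is PRIMARY, if there is one.
func primaryOf(members []groupMember) (groupMember, bool) {
	for _, m := range members {
		if m.State == "ONLINE" && m.Role == "PRIMARY" {
			return m, true
		}
	}
	return groupMember{}, false
}

func main() {
	members := []groupMember{
		{ID: "17294ad0-76ff-11ee-9b9e-0800275ac39e", Host: "10.64.70.22", Port: 33061, State: "ONLINE", Role: "SECONDARY"},
		{ID: "b902f580-812c-11ee-a0f5-0800275193ca", Host: "10.64.70.23", Port: 33061, State: "ONLINE", Role: "SECONDARY"},
		{ID: "dc36f96f-763d-11ee-ab3d-08002734b9ef", Host: "10.64.70.21", Port: 33061, State: "ONLINE", Role: "PRIMARY"},
	}
	if p, ok := primaryOf(members); ok {
		fmt.Printf("primary %s:%d, %d members in group\n", p.Host, p.Port, len(members))
	} else {
		fmt.Println("no ONLINE primary found")
	}
}
```

Every piece of information a monitoring tool needs to group the three nodes is present in these rows, so the data itself is not the reason the Secondaries are missing.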

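In summary:

The same SQL statement detects the group members on both Oracle MySQL and Percona for MySQL.
Replication and Group Replication are detected with different SQL statements.
Group Replication is implemented the same way in both distributions.
When a Percona for MySQL instance is registered with Orchestrator, that SQL statement is never issued.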
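For contrast with the Group Replication query, the classic Replication query from step 2 only needs the host part of each `host:port` value in `information_schema.processlist`. The sketch below mirrors what `substring_index(host, ':', 1)` does; `hostOnly` is a hypothetical helper, and the sample values are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// hostOnly mimics substring_index(host, ':', 1): it returns everything
// before the first ':' or the whole string when there is no ':'.
func hostOnly(processlistHost string) string {
	if i := strings.Index(processlistHost, ":"); i >= 0 {
		return processlistHost[:i]
	}
	return processlistHost
}

func main() {
	// Hypothetical processlist host values of connected replicas.
	for _, h := range []string{"10.64.70.22:51234", "10.64.70.23:40022", "localhost"} {
		fmt.Printf("%-20s -> %s\n", h, hostOnly(h))
	}
}
```

This kind of discovery depends on replicas holding a Binlog Dump connection to their source, which is why Group Replication members need the separate query against `performance_schema.replication_group_members`.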

Orchestrator Debugging(2)

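Grasping at the last straw, I took advantage of the project being open source and opened the Orchestrator source code.

I am used to Java, so this was my first time reading Go.

While reading through the source carefully, I found the function that handles Group Replication.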

(go/inst/instance_dao.go)
// PopulateGroupReplicationInformation obtains information about Group Replication  for this host as well as other hosts
// who are members of the same group (if any).
func PopulateGroupReplicationInformation(instance *Instance, db *sql.DB) error {
	// We exclude below hosts with state OFFLINE because they have joined no group yet, so there is no point in getting
	// any group replication information from them
	q := `
	SELECT
		MEMBER_ID,
		MEMBER_HOST,
		MEMBER_PORT,
		MEMBER_STATE,
		MEMBER_ROLE,
		@@global.group_replication_group_name,
		@@global.group_replication_single_primary_mode
	FROM
		performance_schema.replication_group_members
	WHERE
		MEMBER_STATE != 'OFFLINE'
	`
	rows, err := db.Query(q)
	if err != nil {
		_, grNotSupported := GroupReplicationNotSupportedErrors[err.(*mysql.MySQLError).Number]
		if grNotSupported {
			return nil // If GR is not supported by the instance, just exit
		} else {
			// If we got here, the query failed but not because the server does not support group replication. Let's
			// log the error
			return log.Error("There was an error trying to check group replication information for instance "+
				"%+v: %+v", instance.Key, err)
		}
	}

With this function identified, the next step is to find where it is called.

It is called from exactly one place, shown below.

	if instance.IsOracleMySQL() && !instance.IsSmallerMajorVersionByString("8.0") {
		err := PopulateGroupReplicationInformation(instance, db)
		if err != nil {
			goto Cleanup
		}
	}

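One question came up here.
The if condition calls the function only when the instance's IsOracleMySQL() returns true and IsSmallerMajorVersionByString reports that the version is not smaller than 8.0.

As you might expect, the most suspicious part is the following.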

    if instance.IsOracleMySQL() ...

To dig further, I looked up the definition of IsOracleMySQL.

// IsOracleMySQL checks whether this is an Oracle MySQL distribution
func (this *Instance) IsOracleMySQL() bool {
	if this.IsMariaDB() {
		return false
	}
	if this.IsPercona() {
		return false
	}
	if this.isMaxScale() {
		return false
	}
	if this.IsBinlogServer() {
		return false
	}
	return true
}

...



...


type Instance struct {
	Key                          InstanceKey
	InstanceAlias                string
	Uptime                       uint
	ServerID                     uint
	ServerUUID                   string
	Version                      string
	VersionComment               string
...


// IsPercona checks whether this is any version of Percona Server
func (this *Instance) IsPercona() bool {
	return strings.Contains(this.VersionComment, "Percona")
}


...

err = db.QueryRow("select @@global.hostname, ifnull(@@global.report_host, ''), @@global.server_id, @@global.version, @@global.version_comment, @@global.read_only, @@global.binlog_format, @@global.log_bin, @@global.log_slave_updates").Scan(
			&mysqlHostname, &mysqlReportHost, &instance.ServerID, &instance.Version, &instance.VersionComment, &instance.ReadOnly, &instance.Binlog_format, &instance.LogBinEnabled, &instance.LogReplicationUpdatesEnabled)
            
...

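In summary:

The query at the bottom reads @@global.version_comment and stores it in the VersionComment field.
In the function just above it, IsPercona() returns true if VersionComment contains "Percona".
The Instance is initialized with various pieces of information, and VersionComment is one of them.
The IsOracleMySQL() function above uses these checks to decide which DBMS the instance is.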
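The chain summarized above can be reproduced in isolation. This is a simplified sketch, not Orchestrator's code: the `instance` type is a hypothetical stand-in that models only `VersionComment`, the MariaDB, MaxScale and binlog server checks are omitted, and the two `version_comment` strings are illustrative values rather than output captured from the servers in this post.

```go
package main

import (
	"fmt"
	"strings"
)

// instance is a hypothetical, reduced stand-in for Orchestrator's Instance.
type instance struct {
	VersionComment string // filled from @@global.version_comment
}

// isPercona follows the same idea as IsPercona in the excerpt above.
func (i instance) isPercona() bool {
	return strings.Contains(i.VersionComment, "Percona")
}

// isOracleMySQL is simplified: only the Percona check is kept.
func (i instance) isOracleMySQL() bool {
	return !i.isPercona()
}

func main() {
	samples := []instance{
		{VersionComment: "MySQL Community Server - GPL"}, // illustrative value
		{VersionComment: "Percona Server (GPL)"},         // illustrative value
	}
	for _, s := range samples {
		fmt.Printf("%-32q isPercona=%-5t isOracleMySQL=%t\n",
			s.VersionComment, s.isPercona(), s.isOracleMySQL())
	}
}
```

Any server whose version comment contains "Percona" is therefore never treated as Oracle MySQL, regardless of its version number.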

The code change turned out to be very simple.

	if (instance.IsOracleMySQL() || instance.IsPercona()) && !instance.IsSmallerMajorVersionByString("8.0") {
		err := PopulateGroupReplicationInformation(instance, db)
		if err != nil {
			goto Cleanup
		}
	}

As shown above, the condition now accepts IsPercona() in addition to IsOracleMySQL().
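The effect of the change can be checked with a small truth-table style sketch. The names `gateOriginal`, `gatePatched`, `isSmallerMajorVersion` and `majorMinor` are hypothetical; in particular, the version comparison below is a simplified assumption about how a major.minor comparison might work and is not Orchestrator's IsSmallerMajorVersionByString implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor extracts the first two numeric parts of a version such as "8.0.34".
func majorMinor(v string) (int, int) {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0
	}
	major, _ := strconv.Atoi(parts[0])
	minor, _ := strconv.Atoi(parts[1])
	return major, minor
}

// isSmallerMajorVersion reports whether v is older than other, comparing major.minor only.
func isSmallerMajorVersion(v, other string) bool {
	aMaj, aMin := majorMinor(v)
	bMaj, bMin := majorMinor(other)
	if aMaj != bMaj {
		return aMaj < bMaj
	}
	return aMin < bMin
}

// gateOriginal mirrors the original condition: Oracle MySQL 8.0 or later only.
func gateOriginal(isOracle, isPercona bool, version string) bool {
	return isOracle && !isSmallerMajorVersion(version, "8.0")
}

// gatePatched mirrors the modified condition: Oracle MySQL or Percona, 8.0 or later.
func gatePatched(isOracle, isPercona bool, version string) bool {
	return (isOracle || isPercona) && !isSmallerMajorVersion(version, "8.0")
}

func main() {
	cases := []struct {
		name                string
		isOracle, isPercona bool
		version             string
	}{
		{"Oracle MySQL 8.0.34", true, false, "8.0.34"},
		{"Percona 8.0.34", false, true, "8.0.34"},
		{"Percona 5.7.44", false, true, "5.7.44"},
	}
	for _, c := range cases {
		fmt.Printf("%-20s original=%-5t patched=%t\n", c.name,
			gateOriginal(c.isOracle, c.isPercona, c.version),
			gatePatched(c.isOracle, c.isPercona, c.version))
	}
}
```

Only the Percona 8.0 row changes between the two gates, which is the behavior the one-line patch is meant to produce.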

After that, copy the modified source to the server, build it, and replace the existing binary with the newly built Orchestrator.

scp -r orchestrator 10.64.70.21:~/

...

README.md                                     100% 1061    93.2KB/s   00:00
fqdn_test_win.go                              100%  121    10.9KB/s   00:00
fqdn_posix.go                                 100%   89     6.1KB/s   00:00
errors.go                                     100%  942    71.3KB/s   00:00
modules.txt                                   100% 4840   182.5KB/s   00:00

[root@vm2-1 orchestrator]# ll
합계 124
-rw-r--r--.  1 root root 10330 10월 26 04:54 LICENSE
-rw-r--r--.  1 root root  4161 10월 26 04:54 README.md
-rw-r--r--.  1 root root     6 10월 26 04:54 RELEASE_VERSION
-rw-r--r--.  1 root root  1259 10월 26 04:54 Vagrantfile
drwxr-xr-x.  3 root root    17 10월 30 03:10 build
-rwxr-xr-x.  1 root root 12191 10월 26 04:54 build.sh
-rwxr-xr-x.  1 root root  1778 10월 26 04:54 bump_release_version_and_tag
drwxr-xr-x.  2 root root  4096 10월 26 04:54 conf
drwxr-xr-x.  3 root root  4096 10월 26 04:54 docker
drwxr-xr-x.  3 root root  4096 10월 26 04:54 docs
drwxr-xr-x.  4 root root    35 10월 26 04:54 etc
drwxr-xr-x. 21 root root  4096 10월 26 04:54 go
-rw-r--r--.  1 root root  2737 10월 26 04:54 go.mod
-rw-r--r--.  1 root root 21315 10월 26 04:54 go.sum
-rwxr-xr-x.  1 root root   468 10월 30 00:26 keepalived_control.sh
drwxr-xr-x.  7 root root    82 10월 26 04:54 resources
drwxr-xr-x.  2 root root    55 10월 26 04:54 run
drwxr-xr-x.  2 root root  4096 10월 26 04:54 script
-rw-r--r--.  1 root root 22074 10월 26 04:54 sur.txt
drwxr-xr-x.  4 root root    39 10월 26 04:54 tests
drwxr-xr-x.  2 root root  4096 10월 26 04:54 vagrant
drwxr-xr-x.  5 root root    77 10월 26 04:54 vendor
[root@vm2-1 orchestrator]# sh build.sh
[DEBUG] Building via go version go1.19.10 linux/amd64
build/bin/orchestrator
[DEBUG] binary copied to /tmp/orchestrator-release/orchestratorN2fgFg/orchestrator-cli/usr/bin
[DEBUG] binary copied to /tmp/orchestrator-release/orchestratorN2fgFg/orchestrator/usr/local/orchestrator
[DEBUG] orchestrator-client copied to orchestrator-client/
[DEBUG] Release version is 3.2.6 (c0a82c7244d367c3b2392dddfb9b37ef5d9c7eaa)
[DEBUG] Creating Linux Tar package
[DEBUG] Creating Distro full packages
Created package {:path=>"orchestrator-3.2.6-1.x86_64.rpm"}
epoch in Version is set {:epoch=>"1", :level=>:warn}
Created package {:path=>"orchestrator_3.2.6_amd64.deb"}
[DEBUG] Creating Distro cli packages
Created package {:path=>"orchestrator-cli-3.2.6-1.x86_64.rpm"}
epoch in Version is set {:epoch=>"1", :level=>:warn}
Created package {:path=>"orchestrator-cli_3.2.6_amd64.deb"}
[DEBUG] Creating Distro orchestrator-client packages
Created package {:path=>"orchestrator-client-3.2.6-1.x86_64.rpm"}
epoch in Version is set {:epoch=>"1", :level=>:warn}
Created package {:path=>"orchestrator-client_3.2.6_amd64.deb"}
[DEBUG] packeges:
[DEBUG] - orchestrator-3.2.6-1.x86_64.rpm
[DEBUG] - orchestrator-cli-3.2.6-1.x86_64.rpm
[DEBUG] - orchestrator-cli_3.2.6_amd64.deb
[DEBUG] - orchestrator-client-3.2.6-1.x86_64.rpm
[DEBUG] - orchestrator-client_3.2.6_amd64.deb
[DEBUG] - orchestrator_3.2.6_amd64.deb
[DEBUG] Done. Find releases in /tmp/orchestrator-release
[DEBUG] orchestrator build done; exit status is 0

The build output is in /tmp/orchestrator-release.

[root@vm2-1 orchestrator-release]# ll
합계 57424
lrwxrwxrwx. 1 root root       44 11월 20 09:14 build -> /tmp/orchestrator-release/orchestratorN2fgFg
-rw-r--r--. 1 root root 11930760 11월 20 09:14 orchestrator-3.2.6-1.x86_64.rpm
-rw-r--r--. 1 root root 11961065 11월 20 09:14 orchestrator-3.2.6-linux-amd64.tar.gz
-rw-r--r--. 1 root root 11437145 11월 20 09:14 orchestrator-cli-3.2.6-1.x86_64.rpm
-rw-r--r--. 1 root root 11469934 11월 20 09:14 orchestrator-cli_3.2.6_amd64.deb
-rw-r--r--. 1 root root    16049 11월 20 09:14 orchestrator-client-3.2.6-1.x86_64.rpm
-rw-r--r--. 1 root root    10712 11월 20 09:14 orchestrator-client_3.2.6_amd64.deb
drwx------. 6 root root       88 11월 20 09:14 orchestratorN2fgFg
-rw-r--r--. 1 root root 11963846 11월 20 09:14 orchestrator_3.2.6_amd64.deb

The build script automatically produces package files for several distributions (rpm, deb, tar.gz).
We only need the orchestrator executable, not the package files.

[root@vm2-1 orchestrator-release]# cd orchestrator5ZSrx2/
[root@vm2-1 orchestrator5ZSrx2]# ll
합계 0
drwxr-xr-x. 4 root root 28 11월 20 09:24 orchestrator
drwxr-xr-x. 3 root root 17 11월 20 09:24 orchestrator-cli
drwxr-xr-x. 3 root root 17 11월 20 09:24 orchestrator-client
drwxr-xr-x. 2 root root  6 11월 20 09:24 tmp
[root@vm2-1 orchestrator5ZSrx2]# cd orchestrator/usr/local/orchestrator/
[root@vm2-1 orchestrator]# ll
합계 20300
-rwxr-xr-x. 1 root root 20768848 11월 20 09:24 orchestrator
-rw-r--r--. 1 root root     5100 11월 20 09:23 orchestrator-sample-sqlite.conf.json
-rw-r--r--. 1 root root     5513 11월 20 09:23 orchestrator-sample.conf.json
drwxr-xr-x. 7 root root       82 11월 20 09:22 resources

The orchestrator executable is there. Move it into the existing Orchestrator directory, restart Orchestrator, and run Discover again.

[root@vm2-1 orchestrator]# mv orchestrator /usr/local/orchestrator/
mv: overwrite '/usr/local/orchestrator/orchestrator'? y
[root@vm2-1 orchestrator]# ll !$
ll /usr/local/orchestrator/
합계 42364
-rwxr-xr-x. 1 root root      599 10월 30 02:19 a.sh
-rwxr-xr-x. 1 root root      706 10월 30 00:48 keepalived_control.sh
-rw-r--r--. 1 root root    10665 11월 20 08:30 log.out
-rw-r--r--. 1 root root    45221 11월 20 09:15 orc.log
-rwxr-xr-x. 1 root root 20775952 11월 20 09:14 orchestrator
-rw-rw-r--. 1 root root     5100  6월  4  2020 orchestrator-sample-sqlite.conf.json
-rw-rw-r--. 1 root root     5513  5월 24  2021 orchestrator-sample.conf.json
-rw-r--r--. 1 root root     5547 11월  7 03:21 orchestrator.conf.json
-rw-r--r--. 1 root root     5547 11월  7 03:05 orchestrator.conf.json_bak
-rwxr-xr-x. 1 root root 20770816 10월 26 04:55 orchestrator_bak
drwxr-xr-x. 7 root root       82 10월 26 01:44 resources
-rw-r--r--. 1 root root      826 11월  9 02:37 switched.log
-rw-r--r--. 1 root root  1724364 11월  9 20:51 test.out

[root@vm2-1 orchestrator]# nohup ./orchestrator http > orc.log &
[1] 48032
[root@vm2-1 orchestrator]# nohup: ignoring input and redirecting stderr to stdout

[root@vm2-1 orchestrator]# ./orchestrator -c discover -i 10.64.70.21:33061
2023-11-20 09:26:28 DEBUG Hostname unresolved yet: 10.64.70.21
2023-11-20 09:26:28 DEBUG Cache hostname resolve 10.64.70.21 as 10.64.70.21
2023-11-20 09:26:28 DEBUG Connected to orchestrator backend: orchestrator:?@tcp(10.64.70.11:3306)/orchestrator?timeout=1s&readTimeout=30s&rejectReadOnly=false&interpolateParams=true
2023-11-20 09:26:28 DEBUG Orchestrator pool SetMaxOpenConns: 128
2023-11-20 09:26:28 DEBUG Initializing orchestrator
2023-11-20 09:26:28 INFO Connecting to backend 10.64.70.11:3306: maxConnections: 128, maxIdleConns: 32
2023-11-20 09:26:28 DEBUG Hostname unresolved yet: 10.64.70.22
2023-11-20 09:26:28 DEBUG Cache hostname resolve 10.64.70.22 as 10.64.70.22
2023-11-20 09:26:28 DEBUG Hostname unresolved yet: 10.64.70.23
2023-11-20 09:26:28 DEBUG Cache hostname resolve 10.64.70.23 as 10.64.70.23
10.64.70.21:33061