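LifeKeeper is an HA solution that originated at SteelEye and is now provided by SIOS Technology, which acquired SteelEye. Pacemaker is also an option; I don't know the history behind why LifeKeeper was adopted here, but it was used as shown below.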
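- High Availability (HA) means that a system or service which plays a critical role and must run continuously keeps operating without interruption.
- SteelEye is a software company focused on high-availability and disaster-recovery solutions.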
[root@test #1]# lcdstatus -q
LOCAL         TAG              ID                                 STATE  PRIO  PRIMARY
test#1.co.kr  Groupware_test   Groupware_test                     ISP    1     test#1.test.co.kr
test#1.co.kr  ip-xx.xxx.49.11  IP-xx.xxx.49.11                    ISP    1     test#1.test.co.kr
test#1.co.kr  /attach          /attach                            ISP    1     test#1.test.co.kr
test#1.co.kr  lvmlv20044       /dev/vg01/lvol03                   ISP    1     test#1.test.co.kr
test#1.co.kr  lvmvg17138       lvmvg-vg01                         ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  ISP    1     test#1.test.co.kr
test#1.co.kr  /mdata           /mdata                             ISP    1     test#1.test.co.kr
test#1.co.kr  lvmlv23329       /dev/vg01/lvol02                   ISP    1     test#1.test.co.kr
test#1.co.kr  lvmvg17138       lvmvg-vg01                         ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  ISP    1     test#1.test.co.kr
test#1.co.kr  /mindex          /mindex                            ISP    1     test#1.test.co.kr
test#1.co.kr  lvmlv17135       /dev/vg01/lvol04                   ISP    1     test#1.test.co.kr
test#1.co.kr  lvmvg17138       lvmvg-vg01                         ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  ISP    1     test#1.test.co.kr
test#1.co.kr  /app             /app                               ISP    1     test#1.test.co.kr
test#1.co.kr  lvmlv25749       /dev/vg01/lvol01                   ISP    1     test#1.test.co.kr
test#1.co.kr  lvmvg17138       lvmvg-vg01                         ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  ISP    1     test#1.test.co.kr
test#1.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  ISP    1     test#1.test.co.kr

MACHINE       NETWORK  ADDRESSES/DEVICE           STATE  PRIO
test#2.co.kr  TCP      192.168.0.15/192.168.0.16  ALIVE  1
test#2.co.kr  TCP      192.168.1.15/192.168.1.16  ALIVE  2

- Omitting the -q option prints more detailed output.

[root@test #2~]# lcdstatus -q
LOCAL         TAG              ID                                 STATE  PRIO  PRIMARY
test#2.co.kr  Groupware_test   Groupware_test                     OSU    10    test#1.test.co.kr
test#2.co.kr  ip-xx.xxx.49.11  IP-xx.xxx.49.11                    OSU    10    test#1.test.co.kr
test#2.co.kr  /app             /app                               OSU    10    test#1.test.co.kr
test#2.co.kr  lvmlv23408       /dev/vg01/lvol01                   OSU    10    test#1.test.co.kr
test#2.co.kr  lvmvg22781       lvmvg-vg01                         OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  OSU    10    test#1.test.co.kr
test#2.co.kr  /attach          /attach                            OSU    10    test#1.test.co.kr
test#2.co.kr  lvmlv23148       /dev/vg01/lvol03                   OSU    10    test#1.test.co.kr
test#2.co.kr  lvmvg22781       lvmvg-vg01                         OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  OSU    10    test#1.test.co.kr
test#2.co.kr  /mdata           /mdata                             OSU    10    test#1.test.co.kr
test#2.co.kr  lvmlv23273       /dev/vg01/lvol02                   OSU    10    test#1.test.co.kr
test#2.co.kr  lvmvg22781       lvmvg-vg01                         OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  OSU    10    test#1.test.co.kr
test#2.co.kr  /mindex          /mindex                            OSU    10    test#1.test.co.kr
test#2.co.kr  lvmlv22746       /dev/vg01/lvol04                   OSU    10    test#1.test.co.kr
test#2.co.kr  lvmvg22781       lvmvg-vg01                         OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17176        360060e80122d3a0050402d3a00000061  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17282        360060e80122d3a0050402d3a00000060  OSU    10    test#1.test.co.kr
test#2.co.kr  dmmp17379        360060e80122d3a0050402d3a00000062  OSU    10    test#1.test.co.kr

MACHINE       NETWORK  ADDRESSES/DEVICE           STATE  PRIO
test#1.co.kr  TCP      192.168.0.16/192.168.0.15  ALIVE  1
test#1.co.kr  TCP      192.168.1.16/192.168.1.15  ALIVE  2

Resource states:
- ISP: In-Service locally and Protected.
- ISU: In-Service, Unprotected.
- OSF: Out-of-Service, Failed.
- OSU: Out-of-Service, Unimpaired.

To check that the LifeKeeper daemons are running:

[root@test#1 test]# lktest
F S UID   PID   PPID  C CLS PRI NI  SZ    STIME TIME     CMD
4 S root  4665  4521  0 TS  39  -20 6740  Oct13 00:00:20 lcm
4 S root  4673  4517  0 TS  39  -20 30650 Oct13 00:00:11 ttymonlcm
4 S root  4678  4518  0 TS  29  -10 9963  Oct13 00:00:11 lcd

or

[root@test#1 test]# /etc/init.d/lifekeeper status
LifeKeeper is running
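When a node protects many resources, scanning the `lcdstatus -q` listing by eye gets tedious. The following is a minimal sketch, not part of LifeKeeper, that picks out resources that are not in service. It assumes each resource row consists of six whitespace-separated columns (LOCAL, TAG, ID, STATE, PRIO, PRIMARY) as in the transcript above; the function names `parse_resources` and `not_in_service` are hypothetical, and tags containing spaces would not be handled.

```python
# State codes as listed in the notes.
STATES = {
    "ISP": "In-Service, Protected",
    "ISU": "In-Service, Unprotected",
    "OSF": "Out-of-Service, Failed",
    "OSU": "Out-of-Service, Unimpaired",
}


def parse_resources(text):
    """Return one dict per resource row found in `lcdstatus -q` output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: resource rows have exactly six columns; header and
        # communication-path rows are skipped by the checks below.
        if len(fields) == 6 and fields[3] in STATES and fields[4].isdigit():
            local, tag, res_id, state, prio, primary = fields
            rows.append({
                "local": local, "tag": tag, "id": res_id,
                "state": state, "prio": int(prio), "primary": primary,
            })
    return rows


def not_in_service(rows):
    """Return the tags whose state is not ISP or ISU."""
    return [r["tag"] for r in rows if not r["state"].startswith("IS")]


sample = """\
LOCAL TAG ID STATE PRIO PRIMARY
test#1.co.kr /app /app ISP 1 test#1.test.co.kr
test#2.co.kr /app /app OSU 10 test#1.test.co.kr
MACHINE NETWORK ADDRESSES/DEVICE STATE PRIO
test#2.co.kr TCP 192.168.0.15/192.168.0.16 ALIVE 1
"""
rows = parse_resources(sample)
print(not_in_service(rows))
```

On a standby node every resource is expected to be OSU, so a non-empty result is only a problem on the node that should be active.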
Run the following on the server where you want to bring the resources into service.

perform_action -t tag -a action
ex)
perform_action -t test -a restore   (start)
perform_action -t test -a remove    (stop)
- tag: the identifier of each resource
- action: restore or remove

[root@test test]# date;perform_action -t test -a restore;date
Fri Oct 14 10:56:56 KST 2016
BEGIN restore of "dmmp17176"
Locking device "/dev/mapper/testvol03".
path sdc successfully registered for /dev/mapper/testvol03.
path sdf successfully registered for /dev/mapper/testvol03.
reserve /dev/mapper/testvol03.
Device "/dev/mapper/testvol03" successfully locked.
END successful restore of "dmmp17176"
BEGIN restore of "dmmp17282"
Locking device "/dev/mapper/testvol01".
path sde successfully registered for /dev/mapper/testvol01.
path sdb successfully registered for /dev/mapper/testvol01.
reserve /dev/mapper/testvol01.
Device "/dev/mapper/testvol01" successfully locked.
END successful restore of "dmmp17282"
BEGIN restore of "dmmp17379"
Locking device "/dev/mapper/gwvol02".
path sdg successfully registered for /dev/mapper/gwvol02.
path sdd successfully registered for /dev/mapper/gwvol02.
reserve /dev/mapper/gwvol02.
Device "/dev/mapper/gwvol02" successfully locked.
END successful restore of "dmmp17379"
BEGIN restore of "lvmvg17138" on server "test.co.kr"
END successful restore of "lvmvg17138" on server "test.co.kr"
BEGIN restore of "lvmlv20044" on server "test.co.kr"
END successful restore of "lvmlv20044" on server "test.co.kr"
BEGIN restore of /atest
"fsck"ing file system /atest
fsck.ext4 -y /dev/mapper/vg01-lvol03
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/vg01-lvol03: clean, 69/6553600 files, 459441/26214400 blocks
mounting file system /atest
mount -text4 -orw /dev/mapper/vg01-lvol03 /atest
File system /atest has been successfully mounted.
END successful restore of /atest
BEGIN restore of "lvmlv23329" on server "test.co.kr"
END successful restore of "lvmlv23329" on server "test.co.kr"
BEGIN restore of /mtest
"fsck"ing file system /mtest
fsck.ext4 -y /dev/mapper/vg01-lvol02
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/vg01-lvol02: clean, 19/6553600 files, 459358/26214400 blocks
mounting file system /mtest
mount -text4 -orw /dev/mapper/vg01-lvol02 /mtest
File system /mtest has been successfully mounted.
END successful restore of /mtest
BEGIN restore of "lvmlv17135" on server "test.co.kr"
END successful restore of "lvmlv17135" on server "test.co.kr"
BEGIN restore of /mindex
"fsck"ing file system /mindex
fsck.ext4 -y /dev/mapper/vg01-lvol04
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/vg01-lvol04: clean, 275/2621440 files, 212147/10482688 blocks
mounting file system /mindex
mount -text4 -orw /dev/mapper/vg01-lvol04 /mindex
File system /mindex has been successfully mounted.
END successful restore of /mindex
BEGIN restore of "lvmlv25749" on server "test.co.kr"
END successful restore of "lvmlv25749" on server "test.co.kr"
BEGIN restore of /app
"fsck"ing file system /app
fsck.ext4 -y /dev/mapper/vg01-lvol01
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/vg01-lvol01: clean, 50043/6553600 files, 1285782/26214400 blocks (check in 2 mounts)
mounting file system /app
mount -text4 -orw /dev/mapper/vg01-lvol01 /app
File system /app has been successfully mounted.
END successful restore of /app
BEGIN restore of "ip-00.00.49.11"
END successful restore of "ip-00.00.49.11"
BEGIN restore of "test"
BEGIN restore of test on server test.co.kr
sf_ladmd start.........................
cyrend start.........................
memcached start.........................
spamrd start.........................
tmtad start.........................
tremoted start.........................
tmss-routed start.........................
t4imapd start.........................
tpopd start.........................
dbproxy start.........................
tc start.........................
activemq start.........................
webadmin start.........................
webmail start.........................
searcher start.........................
notifier start.........................
END successful restore of test on server test.co.kr
END successful restore of "test"
Fri Oct 14 10:59:15 KST 2016
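The `perform_action -t tag -a action` call described above can be wrapped so that a mistyped action is rejected before anything runs. This is only a sketch: the helper names `build_perform_action` and `run_perform_action` are hypothetical, only the `-t` and `-a` options shown in these notes are used, and `run_perform_action` assumes `perform_action` is on the PATH of a LifeKeeper node.

```python
import subprocess

# The two actions mentioned in the notes.
VALID_ACTIONS = ("restore", "remove")


def build_perform_action(tag, action):
    """Build the argument list for `perform_action -t <tag> -a <action>`."""
    if not tag:
        raise ValueError("tag must not be empty")
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}, got {action!r}")
    return ["perform_action", "-t", tag, "-a", action]


def run_perform_action(tag, action):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_perform_action(tag, action), check=True)


# Only builds and prints the command line; nothing is executed here.
print(" ".join(build_perform_action("test", "restore")))
```

Passing the arguments as a list avoids shell quoting issues with tags such as `/app`.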
[root@test /]# lkstart
Starting LifeKeeper... [ OK ]
[root@test /]#
Message from syslogd@test at Oct 14 11:25:47 ...
lcdinit[19030]: EMERG:lcd.lcdchkseml:::011138:The LifeKeeper product on this system is using an evaluation license key which will expire at midnight on 12/16/16. To continue functioning beyond that time, a permanent license key must be obtained.

Need to confirm whether lkstop -f stops only the failover handling or only LifeKeeper itself!! In the transcripts below, lkstop -f brings down only the LifeKeeper services, while plain lkstop also runs remove on every resource before stopping.

[root@test ~]# lkstop -f
ok: down: /opt/LifeKeeper/etc/service/lkguiserver: 0s
ok: down: /opt/LifeKeeper/etc/service/steeleye-lighttpd: 1s
ok: down: /opt/LifeKeeper/etc/service/lkvmhad: 0s
ok: down: /opt/LifeKeeper/etc/service/lkscsid: 0s
ok: down: /opt/LifeKeeper/etc/service/lkcheck: 1s
ok: down: /opt/LifeKeeper/etc/service/lcd: 0s
ok: down: /opt/LifeKeeper/etc/service/ttymonlcm: 1s
ok: down: /opt/LifeKeeper/etc/service/lcm: 0s
LifeKeeper stopped [ OK ]
[root@test ~]# lcdstatus -q
LifeKeeper does not appear to be running.

[root@test test]# lkstop
ok: down: /opt/LifeKeeper/etc/service/lkguiserver: 0s
BEGIN remove of "test"
BEGIN remove of test on server test.co.kr
sf_ladmd stop.........................
spamrd stop...........................
tmtad stop............................
tremoted stop.........................
tmss-routed stop......................
t4imapd stop..........................
tpopd stop............................
dbproxy stop..........................
memcached stop........................
searcher stop.........................
Sending stop command to Solr running on port 8983 ... waiting 5 seconds to allow Jetty process 27031 to stop gracefully.
/root/test
notifier stop.........................
Shutting down openfire:
/root/test
webmail stop..........................
Using CATALINA_BASE: /opt/web/webmail
Using CATALINA_HOME: /opt/web/webmail
Using CATALINA_TMPDIR: /opt/web/webmail/temp
Using JRE_HOME: /opt/3rd/java/jre
Using CLASSPATH: /opt/web/webmail/bin/bootstrap.jar:/opt/web/webmail/bin/tomcat-juli.jar
Stopping ............
Killing 26908 ...
Tomcat is being shutdowned.
testadmin stop.........................
Using CATALINA_BASE: /opt/web/testadmin
Using CATALINA_HOME: /opt/web/testadmin
Using CATALINA_TMPDIR: /opt/web/testadmin/temp
Using JRE_HOME: /opt/3rd/java/jre
Using CLASSPATH: /opt/web/testadmin/bin/bootstrap.jar:/opt/web/testadmin/bin/tomcat-juli.jar
Stopping .........
Killing 25893 ...
Tomcat is being shutdowned.
cyrend stop...........................
INFO: Loading '/app/mq//bin/env'
INFO: Using java '/opt/3rd/java/bin/java'
INFO: Waiting at least 30 seconds for regular process termination of pid '25918' :
Java Runtime: Oracle Corporation 1.7.0_71 /app/3rd/jdk1.7.0_71/jre
Heap sizes: current=63488k free=62465k max=932352k
JVM args: -Xms64M -Xmx1G -Djava.util.logging.config.file=logging.properties -Djava.security.auth.login.config=/app/mq//conf/login.config -Dmq.classpath=/app/mq//conf:/app/mq//../lib/: -Dmq.home=/app/mq/ -Dmq.base=/app/mq/ -Dmq.conf=/app/mq//conf -Dmq.data=/app/mq//data
Extensions classpath: [/app/mq/lib,/app/mq/lib/camel,/app/mq/lib/optional,/app/mq/lib/web,/app/mq/lib/extra]
mq_HOME: /app/mq
mq_BASE: /app/mq
mq_CONF: /app/mq/conf
mq_DATA: /app/mq/data
Connecting to pid: 25918
Stopping broker: go .. TERMINATED
mq stop.........................
test stop.......................
2016-10-14 11:23:26,792 INFO - test 3.7.8, as of 20140409-005518 (Revision 24766 by jenkins-slave@sfo-c54-jenkins-slave-002.eur.ad.sag from 3.7.8)
2016-10-14 11:23:27,095 INFO - Successfully loaded base configuration from file at '/app/test/tc-config.xml'.
WARN: The log directory, '/app/log/test/server-logs', is already in use by another test process. Logging will proceed to the console only.
2016-10-14 11:23:27,127 INFO - There is only one test server instance in this configuration file (DO Cache Server); stopping it.
2016-10-14 11:23:27,128 INFO - Host: test.co.kr, port: 9520
/root/test
db stop...............................
END successful remove of test on server test.co.kr
END successful remove of "test"
BEGIN remove of "ip-00.00.00.11"
END successful remove of "ip-00.00.00.11"
BEGIN remove of /a_test
unmounting file system /a_test
file system /a_test successfully unmounted
END successful remove of /a_test
BEGIN remove of "lvmlv20044" on server "test.co.kr"
END successful remove of "lvmlv20044" on server "test.co.kr"
BEGIN remove of /mtest
unmounting file system /mtest
file system /mtest successfully unmounted
END successful remove of /mtest
BEGIN remove of "lvmlv23329" on server "test.co.kr"
END successful remove of "lvmlv23329" on server "test.co.kr"
BEGIN remove of /mindex
unmounting file system /mindex
file system /mindex successfully unmounted
END successful remove of /mindex
BEGIN remove of "lvmlv17135" on server "test.co.kr"
END successful remove of "lvmlv17135" on server "test.co.kr"
BEGIN remove of /app
unmounting file system /app
file system /app successfully unmounted
END successful remove of /app
BEGIN remove of "lvmlv25749" on server "test.co.kr"
END successful remove of "lvmlv25749" on server "test.co.kr"
BEGIN remove of "lvmvg17138" on server "test.co.kr"
0 logical volume(s) in volume group "vg01" now active
Volume group "vg01" successfully exported
END successful remove of "lvmvg17138" on server "test.co.kr"
BEGIN remove of "dmmp17176"
Flushing buffers on /dev/mapper/testvol03.
ioctl.pl -f /dev/mapper/testvol03
Unlocking device "/dev/mapper/testvol03".
Device "/dev/mapper/testvol03" successfully unlocked.
END successful remove of "dmmp17176"
BEGIN remove of "dmmp17282"
Flushing buffers on /dev/mapper/testvol01.
ioctl.pl -f /dev/mapper/testvol01
Unlocking device "/dev/mapper/testvol01".
Device "/dev/mapper/testvol01" successfully unlocked.
END successful remove of "dmmp17282"
BEGIN remove of "dmmp17379"
Flushing buffers on /dev/mapper/testvol02.
ioctl.pl -f /dev/mapper/testvol02
Unlocking device "/dev/mapper/testvol02".
Device "/dev/mapper/testvol02" successfully unlocked.
END successful remove of "dmmp17379"
ok: down: /opt/LifeKeeper/etc/service/steeleye-lighttpd: 0s
ok: down: /opt/LifeKeeper/etc/service/lkvmhad: 1s
ok: down: /opt/LifeKeeper/etc/service/lkscsid: 0s
ok: down: /opt/LifeKeeper/etc/service/lkcheck: 1s
ok: down: /opt/LifeKeeper/etc/service/lcd: 0s
ok: down: /opt/LifeKeeper/etc/service/ttymonlcm: 1s
ok: down: /opt/LifeKeeper/etc/service/lcm: 0s
LifeKeeper stopped [ OK ]
[root@test test]# lcdstatus -q
LifeKeeper does not appear to be running.
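The restore and remove transcripts in these notes pair every `BEGIN restore/remove of X` line with an `END successful restore/remove of X` line. A rough sketch for checking a captured log for steps that never reported success follows. It assumes one message per line and matches only the exact `END successful` wording seen above (how a failed step is worded is not shown in these notes); the function name `unfinished_actions` is hypothetical.

```python
import re

# Matches the BEGIN / END successful lines seen in the transcripts.
PATTERN = re.compile(r"^(BEGIN|END successful) (restore|remove) of (.+)$")


def unfinished_actions(log_text):
    """Return (action, target) pairs with a BEGIN but no 'END successful'."""
    pending = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if not match:
            continue
        kind, action, target = match.groups()
        key = (action, target)
        if kind == "BEGIN":
            pending.append(key)
        elif key in pending:
            pending.remove(key)
    return pending


sample_log = """\
BEGIN remove of "dmmp17176"
Flushing buffers on /dev/mapper/testvol03.
END successful remove of "dmmp17176"
BEGIN remove of /app
unmounting file system /app
"""
print(unfinished_actions(sample_log))
```

An empty list means every step that began also reported success; anything left over is worth checking in /var/log/lifekeeper.log.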
[root@test~]# lklicmgr
License File: 20161012.lic
Product                        Type  Expiry                 Other
LifeKeeper for Linux           Eval  16 Dec 2016 (28 days)
LifeKeeper for Linux           Eval  16 Dec 2016 (28 days)  Locale: Japan
Apache Recovery Kit            Eval  16 Dec 2016 (28 days)
DB2 Recovery Kit               Eval  16 Dec 2016 (28 days)
Device Mapper Multipath (1/2)  Eval  16 Dec 2016 (28 days)
Device Mapper Multipath (2/2)  Eval  16 Dec 2016 (28 days)
HDLM Recovery Kit (1/2)        Eval  16 Dec 2016 (28 days)
HDLM Recovery Kit (2/2)        Eval  16 Dec 2016 (28 days)
Informix Recovery Kit          Eval  16 Dec 2016 (28 days)
LVM Recovery Kit (1/2)         Eval  16 Dec 2016 (28 days)
LVM Recovery Kit (2/2)         Eval  16 Dec 2016 (28 days)
MD Recovery Kit (1/2)          Eval  16 Dec 2016 (28 days)
MD Recovery Kit (2/2)          Eval  16 Dec 2016 (28 days)
MQ Series Recovery Kit         Eval  16 Dec 2016 (28 days)
MySQL Recovery Kit             Eval  16 Dec 2016 (28 days)
NAS Recovery Kit               Eval  16 Dec 2016 (28 days)
NFS Recovery Kit (1/2)         Eval  16 Dec 2016 (28 days)
NFS Recovery Kit (2/2)         Eval  16 Dec 2016 (28 days)
Oracle Recovery Kit            Eval  16 Dec 2016 (28 days)
Oracle Listener Recovery Kit   Eval  16 Dec 2016 (28 days)
Postfix Recovery Kit           Eval  16 Dec 2016 (28 days)
Postgres Recovery Kit          Eval  16 Dec 2016 (28 days)
PowerPath Recovery Kit (1/2)   Eval  16 Dec 2016 (28 days)
PowerPath Recovery Kit (2/2)   Eval  16 Dec 2016 (28 days)
DataKeeper for Linux ARK       Eval  16 Dec 2016 (28 days)
Samba Recovery Kit             Eval  16 Dec 2016 (28 days)
SAP Recovery Kit               Eval  16 Dec 2016 (28 days)
SAP DB Recovery Kit            Eval  16 Dec 2016 (28 days)
SDD Recovery Kit (1/2)         Eval  16 Dec 2016 (28 days)
SDD Recovery Kit (2/2)         Eval  16 Dec 2016 (28 days)
DataKeeper for Linux ARK       Eval  16 Dec 2016 (28 days)
Multi-Site Cluster             Eval  16 Dec 2016 (28 days)
Sendmail Recovery Kit          Eval  16 Dec 2016 (28 days)
SPS Recovery Kit (1/2)         Eval  16 Dec 2016 (28 days)
SPS Recovery Kit (2/2)         Eval  16 Dec 2016 (28 days)
Sybase Recovery Kit            Eval  16 Dec 2016 (28 days)
XP Cluster Extension ARK       Eval  16 Dec 2016 (28 days)
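Since an evaluation key stops working at expiry, it can help to pull the remaining days out of the `lklicmgr` listing and warn early. The sketch below assumes one license per line in the form `<product> <type> <DD Mon YYYY> (<N> days)`, as in the listing above; the names `parse_licenses` and `expiring_within`, and the 30-day threshold in the usage line, are made up for illustration.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Product names may contain spaces and "(1/2)", so the product part is lazy.
ROW = re.compile(
    r"^(?P<product>.+?)\s+(?P<type>\S+)\s+"
    r"(?P<day>\d{1,2})\s+(?P<mon>[A-Z][a-z]{2})\s+(?P<year>\d{4})\s+"
    r"\((?P<left>\d+) days\)"
)


def parse_licenses(text):
    """Return (product, type, expiry_date, days_left) for each license row."""
    licenses = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match and match["mon"] in MONTHS:
            expiry = date(int(match["year"]), MONTHS[match["mon"]], int(match["day"]))
            licenses.append((match["product"], match["type"], expiry, int(match["left"])))
    return licenses


def expiring_within(licenses, days):
    """Return the product names with at most `days` days left."""
    return [product for product, _type, _expiry, left in licenses if left <= days]


sample = """\
License File: 20161012.lic
Product Type Expiry Other
LifeKeeper for Linux Eval 16 Dec 2016 (28 days) Locale: Japan
Device Mapper Multipath (1/2) Eval 16 Dec 2016 (28 days)
"""
licenses = parse_licenses(sample)
print(expiring_within(licenses, 30))
```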
[root@test/]# lksupport
Collecting info under /tmp/lksupport/test.co.kr
saving LifeKeeper status
saving LifeKeeper install log
saving steeleye-lighttpd information
saving LifeKeeper defaults file
saving md and DataKeeper for Linux data
saving LifeKeeper device_info files
saving ls -la /dev/disk/by-id
saving LifeKeeper SCSI kit files
saving LifeKeeper configuration information
saving host information
saving LifeKeeper licensing data
saving network data
saving installed package data
saving process information
saving LVM data (this may take minutes to complete)
saving /proc data
saving module configuration data
saving lsmod data
saving file system information
saving boot loader data
saving system timestamps
saving Device Mapper information
saving Device Mapper - multipath information
saving SELinux information
saving system information
collecting top information (10s)...
saving LifeKeeper event/policy configuration
saving LifeKeeper runit related data
saving core file information
saving locale related settings
saving up to 200000 lines of /var/log/messages
saving up to 200000 lines of /var/log/lifekeeper.log
saving logging configuration
Creating support file /tmp/lksupport/test.co.kr.lksupport.1610141140.tar.gz
=> The support file is created as /tmp/lksupport/<hostname>.lksupport.<date>.tar.gz
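After several `lksupport` runs the directory holds more than one archive. A small sketch for picking the newest one is shown here. It relies only on the `/tmp/lksupport` location and the `*.lksupport.*.tar.gz` naming visible in the output above; the function name `newest_support_archive` is hypothetical, and "newest" is decided by file modification time rather than by the timestamp in the file name.

```python
from pathlib import Path


def newest_support_archive(directory="/tmp/lksupport"):
    """Return the most recently modified support archive, or None if there is none."""
    base = Path(directory)
    if not base.is_dir():
        return None
    archives = sorted(
        base.glob("*.lksupport.*.tar.gz"),
        key=lambda path: path.stat().st_mtime,
    )
    return archives[-1] if archives else None


if __name__ == "__main__":
    print(newest_support_archive())
```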
Configure the Active and Standby servers identically.

1) Register the added disk with multipath
- # fdisk -l : check the added disk
- # service multipathd stop
- # scsi_id -g -u /dev/new_disk
- # vi /etc/multipath.conf
    wwid new_disk_wwid
- # service multipathd start

2) Partition the disk with fdisk
- # fdisk /dev/new_disk_multipath
  > Command (m for help): n <Enter>
  > Command action
  >    e   extended
  >    p   primary partition (1-4)
  > p <Enter>
  > Partition number (1-4): 1 <Enter>
  > First cylinder (1-11915, default 1): <Enter>
  > Using default value 1
  > Last cylinder, +cylinders or +size{K,M,G} (1-11915, default 11915): <Enter>
  > Using default value 11915
  > Command (m for help): t <Enter>
  > Selected partition 1
  > Hex code (type L to list codes): 8e <Enter>
  > Changed system type of partition 1 to 8e (Linux LVM)
  > Command (m for help): w <Enter>
  > The partition table has been altered!
  > Calling ioctl() to re-read partition table.
  > Syncing disks.
- # partprobe /dev/new_disk_multipath

3) Extend LVM
- # fdisk /dev/new_disk_multipath
- # vgextend vg_name /dev/new_disk_multipath*1 (partition 1 of the new multipath device)
- # vgs
- # lvextend -l 100 vg_name/lv_name (note: -l 100 sets the LV to exactly 100 extents; to grow into all free space in the VG, -l +100%FREE is the usual form)
- # lvs
- # resize2fs /dev/vg_name/lv_name

On the standby server, run pvscan, vgscan, and lvscan.
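The LVM part of the procedure above is easy to mistype, so one option is to generate the command lines first and review them before running anything. The sketch below only builds and prints the commands. It assumes the new partition is already visible as a device path, uses `-l +100%FREE` (an assumption that the goal is to consume all free extents, which differs from the `-l 100` written in the notes), and the function name `lvm_extend_plan` and the device path `/dev/mapper/mpathXp1` are hypothetical.

```python
import shlex


def lvm_extend_plan(vg, lv, pv_device):
    """Return the commands for step 3 as argument lists, in execution order."""
    return [
        # Add the new partition to the volume group.
        ["vgextend", vg, pv_device],
        # Assumption: grow the LV into all remaining free space in the VG.
        ["lvextend", "-l", "+100%FREE", f"{vg}/{lv}"],
        # Grow the ext file system to fill the enlarged LV.
        ["resize2fs", f"/dev/{vg}/{lv}"],
    ]


# Hypothetical names; print the plan for review instead of executing it.
for command in lvm_extend_plan("vg01", "lvol01", "/dev/mapper/mpathXp1"):
    print(shlex.join(command))
```

Keeping the plan as data makes it straightforward to paste the reviewed lines into a change ticket before touching the active node.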