Create a Quorum Disk
RHCS (Red Hat Cluster Suite) quorum disk

The last post, "RHCS I/O fencing", dealt with the split-brain situation, in which cluster members lose heartbeat communication and each believes it is entitled to write data to the shared storage.

Methods to deal with a split-brain situation:

1. Redundant heartbeat path
   Network port communication plus serial port communication.
2. I/O fencing
   The remaining nodes separate the failed node from its storage, either by shutting down or rebooting it through its power port, or by cutting off its storage port.
3. Quorum disk
   A quorum disk is a kind of I/O fencing, but the reboot is executed by the failed node's own quorum daemon. It also has an additional feature: contributing votes to the cluster. If you want the last standing node to keep a multi-node cluster running, a quorum disk appears to be the only solution.

RHCS (Red Hat Cluster Suite) quorum disk facts

- A shared block device (SCSI/iSCSI/FC...); the required device size is approximately 10 MiB
- Supports a maximum of 16 nodes; node IDs must be sequentially ordered
- The quorum disk can contribute votes. In a multi-node cluster, the quorum disk's votes allow the last standing node to keep the cluster running
- single node votes + 1 <= quorum disk votes < total node votes
- Failure of the shared quorum disk won't result in cluster failure, as long as quorum disk votes < total node votes
- Each node writes its own health information to its own region of the disk; health is determined by an external checking program such as "ping"

Set up quorum disk
- Initialise the quorum disk once, on any one node

    mkqdisk -c /dev/sdx -l myqdisk

Add quorum disk to cluster

Use luci or system-config-cluster to add the quorum disk. The following is the resulting XML file:

    <clusternodes>
        <clusternode name="station1.example.com" nodeid="1" votes="2">
            <fence/>
        </clusternode>
        <clusternode name="station2.example.com" nodeid="2" votes="2">
            <fence/>
        </clusternode>
        <clusternode name="station3.example.com" nodeid="3" votes="2">
            <fence/>
        </clusternode>
    </clusternodes>
- Expected votes = 9 = (total node votes + quorum disk votes) = (2+2+2+3)
<cman expected_votes="9"/>
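The vote arithmetic above can be checked with a short sketch. It assumes cman's usual majority rule (quorum = expected votes / 2, rounded down, plus 1), which matches the numbers in this post. The function names are hypothetical helpers written for illustration, not part of any RHCS tool, and the sizing check assumes every node has the same number of votes.

```python
def quorum_threshold(expected_votes):
    """Votes needed for quorum: a strict majority of the expected votes (assumed rule)."""
    return expected_votes // 2 + 1


def has_quorum(live_node_votes, qdisk_votes, expected_votes):
    """True if the surviving nodes plus the quorum disk reach the threshold."""
    return sum(live_node_votes) + qdisk_votes >= quorum_threshold(expected_votes)


def satisfies_sizing_rule(node_votes, qdisk_votes):
    """The post's rule: single node votes + 1 <= quorum disk votes < total node votes."""
    return max(node_votes) + 1 <= qdisk_votes < sum(node_votes)


node_votes = [2, 2, 2]                           # three nodes, 2 votes each
qdisk_votes = 3                                  # votes="3" on the quorum disk
expected_votes = sum(node_votes) + qdisk_votes   # 2+2+2+3

print("expected votes:", expected_votes)
print("quorum threshold:", quorum_threshold(expected_votes))
print("last node + qdisk keeps quorum:", has_quorum([2], qdisk_votes, expected_votes))
print("all nodes, qdisk failed, keeps quorum:", has_quorum(node_votes, 0, expected_votes))
print("sizing rule satisfied:", satisfies_sizing_rule(node_votes, qdisk_votes))
```

With a quorum disk of only 2 votes, the expected votes would be 8 and the threshold still 5, so a single node (2 + 2 = 4 votes) would lose quorum. That is why the rule asks for at least single node votes + 1.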
- The health check result is written to the quorum disk every 2 seconds (interval)
- If the health check fails for 5 cycles (tko), i.e. 10 (2*5) seconds, the node is rebooted by the quorum daemon
- Each heuristic check runs every 2 seconds and earns a score of 1 if the program's exit status is 0
    <quorumd interval="2" label="myqdisk" min_score="2" tko="5" votes="3">
        <heuristic interval="2" program="ping -c1 -t1 192.168.1.60" score="1"/>
        <heuristic interval="2" program="ping -c1 -t1 192.168.1.254" score="1"/>
    </quorumd>

Start quorum disk daemon

The daemon is also one of the daemons started automatically by cman.

    service qdiskd start

Check quorum disk information

    $ mkqdisk -L -d
    mkqdisk v0.6.0
    /dev/disk/by-id/scsi-1IET_00010002:
    /dev/disk/by-uuid/55fbf858-df75-493b-a764-5640be5a9b46:
    /dev/sdc:
        Magic: eb7a62c2
        Label: myqdisk
        Created: Sat May 7 05:56:35 2011
        Host: station2.example.com
        Kernel Sector Size: 512
        Recorded Sector Size: 512

    Status block for node 1
        Last updated by node 1
        Last updated on Sat May 7 15:09:37 2011
        State: Master
        Flags: 0000
        Score: 0/0
        Average Cycle speed: 0.001500 seconds
        Last Cycle speed: 0.000000 seconds
        Incarnation: 4dc4d1764dc4d176
    Status block for node 2
        Last updated by node 2
        Last updated on Sun May 8 01:09:38 2011
        State: Running
        Flags: 0000
        Score: 0/0
        Average Cycle speed: 0.001000 seconds
        Last Cycle speed: 0.000000 seconds
        Incarnation: 4dc55e164dc55e16
    Status block for node 3
        Last updated by node 3
        Last updated on Sat May 7 15:09:38 2011
        State: Running
        Flags: 0000
        Score: 0/0
        Average Cycle speed: 0.001500 seconds
        Last Cycle speed: 0.000000 seconds
        Incarnation: 4dc4d2f04dc4d2f0

The cluster is still running with last node standing

Please note that Total votes = Quorum = 5 = 2+3. If the quorum disk's votes were less than (node votes + 1), the cluster wouldn't have survived.

    $ cman_tool status
    ..
    Nodes: 1
    Expected votes: 9
    Quorum device votes: 3
    Total votes: 5
    Quorum: 5
    ..

Posted by Honglus at 5:12 PM