Hi all,

I deployed a two-node (physical) RHCS Pacemaker cluster on CentOS 6.5 x86_64 (fully up to date) with:

cman-3.0.12.1-59.el6_5.2.x86_64
pacemaker-1.1.10-14.el6_5.3.x86_64
pcs-0.9.90-2.el6.centos.3.noarch
qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64
qemu-kvm-tools-0.12.1.2-2.415.el6_5.10.x86_64
drbd-utils-8.9.0-1.el6.x86_64
drbd-udev-8.9.0-1.el6.x86_64
drbd-rgmanager-8.9.0-1.el6.x86_64
drbd-bash-completion-8.9.0-1.el6.x86_64
drbd-pacemaker-8.9.0-1.el6.x86_64
drbd-8.9.0-1.el6.x86_64
drbd-km-2.6.32_431.20.3.el6.x86_64-8.4.5-1.x86_64
kernel-2.6.32-431.20.3.el6.x86_64

The aim is to run KVM virtual machines backed by DRBD (8.4.5) in active/passive mode (no dual-primary, and therefore no live migration).
To err on the side of consistency over availability (and to pave the way for a possible dual-primary, live-migration-capable setup), I configured DRBD for resource-and-stonith fencing with rhcs_fence as the fence-peer handler (which is why I installed drbd-rgmanager), with the STONITH devices configured in Pacemaker (pcmk-redirect in cluster.conf).

The setup "almost" works: everything looks fine in "pcs status", "crm_mon -Arf1", "corosync-cfgtool -s" and "corosync-objctl | grep member". However, every time a resource promotion is needed (to Master, i.e. the node becoming DRBD primary), it either fails or fences the other node (the one supposed to become Slave, i.e. secondary) and only then succeeds.
This happens, for example, both on initial resource definition (when the first start is attempted) and when a node enters standby (when the cluster tries to move the resources automatically by stopping and restarting them).

I collected a full "pcs cluster report" and I can provide a CIB dump, but I will initially paste an excerpt from my configuration here, in case it turns out to be a simple configuration error that someone can spot on the fly ;> (hoping...)

Keep in mind that the setup has separate, redundant network connections for the LAN (1 Gbit/s LACP to the switches), Corosync (1 Gbit/s round-robin, back-to-back) and DRBD (10 Gbit/s round-robin, back-to-back), and that FQDNs are correctly resolved through /etc/hosts.

DRBD:

/etc/drbd.d/global_common.conf:

------------------------------------------------------------------------------------------------------

global {
    usage-count no;
}

common {
    protocol C;
    disk {
        on-io-error detach;
        fencing resource-and-stonith;
        disk-barrier no;
        disk-flushes no;
        al-extents 3389;
        c-plan-ahead 200;
        c-fill-target 15M;
        c-max-rate 100M;
        c-min-rate 10M;
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        csums-alg sha1;
        data-integrity-alg sha1;
        max-buffers 8000;
        max-epoch-size 8000;
        unplug-watermark 16;
        sndbuf-size 0;
        verify-alg sha1;
    }
    startup {
        wfc-timeout 300;
        outdated-wfc-timeout 80;
        degr-wfc-timeout 120;
    }
    handlers {
        fence-peer "/usr/lib/drbd/rhcs_fence";
    }
}

------------------------------------------------------------------------------------------------------

Sample DRBD resource (there are others, all similar):

/etc/drbd.d/dc_vm.res:

------------------------------------------------------------------------------------------------------

resource dc_vm {
    device /dev/drbd1;
    disk /dev/VolGroup00/dc_vm;
    meta-disk internal;
    on cluster1.verolengo.privatelan {
        address ipv4 172.16.200.1:7790;
    }
    on cluster2.verolengo.privatelan {
        address ipv4 172.16.200.2:7790;
    }
}

------------------------------------------------------------------------------------------------------
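For context, the usual drbd-utils sequence to sanity-check and initialise a resource defined like the one above (before Pacemaker takes it over) is roughly the following; this is just the standard DRBD 8.4 procedure, not necessarily exactly what I typed, and the node picked for the initial promotion is arbitrary:

# on both nodes: parse/verify the configuration and load it into the kernel
drbdadm dump dc_vm
drbdadm create-md dc_vm        # only once, on a brand-new backing device
drbdadm up dc_vm
drbdadm adjust dc_vm           # after later configuration changes

# on one node only, to kick off the very first synchronisation
drbdadm primary --force dc_vm
cat /proc/drbd                 # watch connection state and sync progress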
RHCS:

/etc/cluster/cluster.conf:

------------------------------------------------------------------------------------------------------

<?xml version="1.0"?>
<cluster name="vclu" config_version="14">
  <cman two_node="1" expected_votes="1" keyfile="/etc/corosync/authkey" transport="udpu" port="5405"/>
  <totem consensus="60000" join="6000" token="100000" token_retransmits_before_loss_const="20" rrp_mode="passive" secauth="on"/>
  <clusternodes>
    <clusternode name="cluster1.verolengo.privatelan" votes="1" nodeid="1">
      <altname name="clusterlan1.verolengo.privatelan" port="6405"/>
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="cluster1.verolengo.privatelan"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="cluster2.verolengo.privatelan" votes="1" nodeid="2">
      <altname name="clusterlan2.verolengo.privatelan" port="6405"/>
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="cluster2.verolengo.privatelan"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
  <fence_daemon clean_start="0" post_fail_delay="30" post_join_delay="30"/>
  <logging debug="on"/>
  <rm disabled="1">
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

------------------------------------------------------------------------------------------------------

Pacemaker:

PROPERTIES:

pcs property set default-resource-stickiness=100
pcs property set no-quorum-policy=ignore

STONITH:

pcs stonith create ilocluster1 fence_ilo2 action="off" delay="10" \
    ipaddr="ilocluster1.verolengo.privatelan" login="cluster2" passwd="test" power_wait="4" \
    pcmk_host_check="static-list" pcmk_host_list="cluster1.verolengo.privatelan" op monitor interval=60s
pcs stonith create ilocluster2 fence_ilo2 action="off" \
    ipaddr="ilocluster2.verolengo.privatelan" login="cluster1" passwd="test" power_wait="4" \
    pcmk_host_check="static-list" pcmk_host_list="cluster2.verolengo.privatelan" op monitor interval=60s
pcs stonith create pdu1 fence_apc action="off" \
    ipaddr="pdu1.verolengo.privatelan" login="cluster" passwd="test" \
    pcmk_host_map="cluster1.verolengo.privatelan:3,cluster1.verolengo.privatelan:4,cluster2.verolengo.privatelan:6,cluster2.verolengo.privatelan:7" \
    pcmk_host_check="static-list" pcmk_host_list="cluster1.verolengo.privatelan,cluster2.verolengo.privatelan" op monitor interval=60s

pcs stonith level add 1 cluster1.verolengo.privatelan ilocluster1
pcs stonith level add 2 cluster1.verolengo.privatelan pdu1
pcs stonith level add 1 cluster2.verolengo.privatelan ilocluster2
pcs stonith level add 2 cluster2.verolengo.privatelan pdu1

pcs property set stonith-enabled=true
pcs property set stonith-action=off
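The fencing path itself can also be exercised by hand, independently of any resource; a rough sketch (exact switches from memory, and the fence command really does power the target off, so only during a maintenance window). As far as I understand, the rhcs_fence handler ultimately goes through cman's fence_node, which fence_pcmk then redirects to Pacemaker, so that route can be tested directly too:

# which devices Pacemaker believes can fence each node
stonith_admin -l cluster1.verolengo.privatelan
stonith_admin -l cluster2.verolengo.privatelan

# fence a node through the configured topology (this really powers it off)
stonith_admin -F cluster2.verolengo.privatelan

# the same request routed the way the DRBD handler would trigger it
fence_node cluster2.verolengo.privatelan

# assuming default syslog settings, the resulting activity shows up here
grep -E "stonith-ng|fence_pcmk|rhcs_fence" /var/log/messages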
SAMPLE RESOURCE:

pcs cluster cib dc_cfg
pcs -f dc_cfg resource create DCVMDisk ocf:linbit:drbd \
    drbd_resource=dc_vm op monitor interval="31s" role="Master" \
    op monitor interval="29s" role="Slave" \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="180s"
pcs -f dc_cfg resource master DCVMDiskClone DCVMDisk \
    master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
    notify=true target-role=Started is-managed=true
pcs -f dc_cfg resource create DCVM ocf:heartbeat:VirtualDomain \
    config=/etc/libvirt/qemu/dc.xml migration_transport=tcp migration_network_suffix=-10g \
    hypervisor=qemu:///system meta allow-migrate=false target-role=Started is-managed=true \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s" \
    op monitor interval="60s" timeout="120s"
pcs -f dc_cfg constraint colocation add DCVM DCVMDiskClone INFINITY with-rsc-role=Master
pcs -f dc_cfg constraint order promote DCVMDiskClone then start DCVM
pcs -f dc_cfg constraint location DCVM prefers cluster2.verolengo.privatelan=50
pcs cluster cib-push dc_cfg
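After the push, the result can be double-checked with the usual read-only commands (subcommand spelling as in pcs 0.9.x, so adjust as needed):

pcs status                        # node and resource state
pcs constraint                    # colocation/order/location constraints as stored
pcs resource show DCVMDiskClone   # master/slave meta attributes actually in the CIB
crm_mon -Arf1                     # one-shot view with fail counts and operation history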
Since I know that pcs still has some rough edges, I installed crmsh too, but I have never actually used it.

Many thanks in advance for your attention.

Kind regards,
Giuseppe Ragusa