[DRBD-user] Promote secondary to primary fails in some situations

Nik Martin nik.martin at nfinausa.com
Mon Aug 6 20:39:26 CEST 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hello,

I am building a true no-SPOF network/storage cluster that consists of:

2 storage servers, named san01-n1 and san01-n2, each with the following config:

3 Ethernet interfaces: Management, Storage, and Xover (crossover)

The DRBD resource looks like:

resource rsdb1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        meta-disk internal;

        on san01-n1 {
                address 10.0.0.1:7789;  # use 10Gb Xover
        }

        on san01-n2 {
                address 10.0.0.2:7789;  # use 10Gb Xover
        }
}

As configured, drbd connects to its peer over a 10G crossover.
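
A quick sanity check that replication really uses the crossover addresses
above (illustrative commands only, not part of the cluster config):

cat /proc/drbd              # shows cs:Connected ro:Primary/Secondary etc.
netstat -tn | grep 7789     # replication TCP session should be 10.0.0.1 <-> 10.0.0.2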

Each SAN node connects on eth3 up to its own respective switch; the two
switches are stacked with dual 10G stacking cables.

All clients of this SAN also connect to both switches with bonded
Ethernet interfaces.  ANY component can fail, and the storage unit will
stay online.
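
For completeness, a minimal EL6-style sketch of one bonded client interface
(file path, address, slave names and bonding mode are all assumptions for
illustration, not copied from a real client):

# /etc/sysconfig/network-scripts/ifcfg-bond0  (hypothetical client config)
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.16.10.21
NETMASK=255.255.255.0
BONDING_OPTS="mode=active-backup miimon=100"

# each slave (eth0, eth1) then just points at the bond:
# DEVICE=eth0  ONBOOT=yes  BOOTPROTO=none  MASTER=bond0  SLAVE=yes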

First, my CIB:
node san01-n1
node san01-n2

primitive drbd_disk ocf:linbit:drbd \
	params drbd_resource="rsdb1" \
	op monitor interval="9s" role="Master" \
	op monitor interval="11s" role="Slave"

primitive ip_mgmt ocf:heartbeat:IPaddr2 \
	params ip="172.16.5.10" cidr_netmask="24" \
	op monitor interval="10s"

primitive ip_storage ocf:heartbeat:IPaddr2 \
	params ip="172.16.10.10" cidr_netmask="24" \
	op monitor interval="10s"

primitive lvm_nfs ocf:heartbeat:LVM \
	params volgrpname="vg_vmstore" \
	op monitor interval="10s" timeout="30s" depth="0" \
	op start interval="0" timeout="30s" \
	op stop interval="0" timeout="30s"

primitive res_iSCSILogicalUnit_1 ocf:heartbeat:iSCSILogicalUnit \
	params target_iqn="iqn.2012-01.com.nfinausa:san01" lun="1" path="/dev/vg_vmstore/lv_vmstore" \
	operations $id="res_iSCSILogicalUnit_1-operations" \
	op start interval="0" timeout="10" \
	op stop interval="0" timeout="10" \
	op monitor interval="10" timeout="10" start-delay="0"

primitive res_iSCSITarget_p_iscsitarget ocf:heartbeat:iSCSITarget \
	params implementation="tgt" iqn="iqn.2012-01.com.nfinausa:san01" tid="1" \
	operations $id="res_iSCSITarget_p_iscsitarget-operations" \
	op start interval="0" timeout="10" \
	op stop interval="0" timeout="10" \
	op monitor interval="10" timeout="10" start-delay="0"

group rg_vmstore lvm_nfs ip_storage ip_mgmt \
	res_iSCSITarget_p_iscsitarget res_iSCSILogicalUnit_1 \
	meta target-role="started"

ms ms_drbd_disk drbd_disk \
	meta master-max="1" master-node-max="1" clone-max="2" \
	clone-node-max="1" notify="true"

location cli-standby-rg_vmstore rg_vmstore \
	rule $id="cli-standby-rule-rg_vmstore" -inf: #uname eq san01-n2

colocation colo_drbd_with_lvm inf: rg_vmstore ms_drbd_disk:Master

order o_drbd_bef_nfs inf: ms_drbd_disk:promote rg_vmstore:start

property $id="cib-bootstrap-options" \
	dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
	cluster-infrastructure="openais" \
	expected-quorum-votes="2" \
	stonith-enabled="false" \
	no-quorum-policy="ignore" \
	last-lrm-refresh="1343691857"
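
For reference, in the healthy steady state (with the location constraint
above keeping rg_vmstore off san01-n2), crm_mon should show roughly the
following; this is an illustrative reconstruction, not captured output:

Online: [ san01-n1 san01-n2 ]

 Master/Slave Set: ms_drbd_disk [drbd_disk]
     Masters: [ san01-n1 ]
     Slaves: [ san01-n2 ]
 Resource Group: rg_vmstore
     lvm_nfs	(ocf::heartbeat:LVM):	Started san01-n1
     ip_storage	(ocf::heartbeat:IPaddr2):	Started san01-n1
     ip_mgmt	(ocf::heartbeat:IPaddr2):	Started san01-n1
     res_iSCSITarget_p_iscsitarget	(ocf::heartbeat:iSCSITarget):	Started san01-n1
     res_iSCSILogicalUnit_1	(ocf::heartbeat:iSCSILogicalUnit):	Started san01-n1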



DRBD is controlled by Pacemaker.

Also, in global_common.conf for DRBD, I have the fence-peer handlers
configured:

handlers {
	fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
	after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}

and:

disk {
	fencing resource-only;
	resync-rate 2000M;
	on-io-error detach;
	c-max-rate 2000M;
	# RAID battery-backed cache + SSD cache, so we disable flushes
	disk-flushes no;
	md-flushes no;
}
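
For context, when crm-fence-peer.sh actually fires it adds a temporary
location constraint roughly like the following (the constraint id and exact
form depend on the script version; the node name here is only an example):

location drbd-fence-by-handler-rsdb1 ms_drbd_disk \
	rule $role="Master" -inf: #uname ne san01-n1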

The failure mode that is not handled properly by this configuration is
the failure of the storage network interface on the node that is
currently Primary for DRBD.  DRBD communicates over the crossover
connection, so it stays Connected and Primary/Secondary when the storage
interface on the primary server dies.  What I see on the secondary unit
is an error promoting DRBD to primary, stating that there can only be
one primary.  So my question is: how do I demote the primary to
secondary when this failure mode occurs on either SAN node?  I don't
see any demotion logic in the DRBD resource agent.
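
In other words, what I would like Pacemaker to do automatically is the
manual role swap below (hypothetical commands; the services using
/dev/drbd0 would have to be stopped on the old primary first):

drbdadm secondary rsdb1     # on the node whose storage NIC failed (current Primary)
drbdadm primary rsdb1       # on the other node, once the peer is Secondary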

-- 
Regards,

Nik


