[DRBD-user] borked split-brain recovery

Matt Davidson mdavidson at allureglobal.com
Thu Oct 4 15:19:44 CEST 2012


Should I `cat /dev/zero > /dev/sdb1` before re-creating the metadata?

[root at openfiler1 ~]# drbdadm -- --force create-md vg0_drbd
pvs stderr:        /dev/sdb1: Added to device cache
pvs stderr:        /dev/sdb1: Skipping (regex)
pvs stderr:  Failed to read physical volume "/dev/sdb1"
pvs stderr:      Unlocking /var/lock/lvm/P_global

md_offset 362175492096
al_offset 362175459328
bm_offset 362164404224

Found LVM2 physical volume signature
   353676176 kB left usable by current configuration
Could not determine the size of the actually used data area.

Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
If you want me to do this, you need to zero out the first part
of the device (destroy the content).
You should be very sure that you mean it.
Operation refused.

Command 'drbdmeta 1 v08 /dev/sdb1 internal create-md --force' terminated with exit code 40
drbdadm create-md vg0_drbd: exited with code 40
[root at openfiler1 ~]#
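
For reference, the three offsets printed above can be cross-checked with plain shell arithmetic. This is only a sketch built on the numbers in this output. The variable names are made up for illustration, and reading the two gaps as the activity log and bitmap areas is an assumption about the internal meta data layout, not something drbdmeta states here.

```shell
#!/bin/sh
# Offsets (in bytes) copied from the create-md output above.
md_offset=362175492096
al_offset=362175459328
bm_offset=362164404224

# Gap between al_offset and md_offset (assumed: activity log area).
al_bytes=$((md_offset - al_offset))
# Gap between bm_offset and al_offset (assumed: bitmap area).
bm_bytes=$((al_offset - bm_offset))
# Everything below bm_offset, in kB, to compare with the "left usable" line.
usable_kb=$((bm_offset / 1024))

echo "al gap:          ${al_bytes} bytes"
echo "bm gap:          ${bm_bytes} bytes"
echo "below bm_offset: ${usable_kb} kB"
```

The last value works out to 353676176 kB, the same figure as the "left usable by current configuration" line, so the space below bm_offset is what DRBD would expose. The refusal itself comes from the LVM2 physical volume signature, whose used data area drbdmeta could not size.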

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Thursday, October 04, 2012 9:05 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] borked split-brain recovery

On Wed, Oct 03, 2012 at 01:08:40PM -0700, mdavidson at allureglobal.co wrote:
> 
> sorry, i meant to say, when telling openfiler1 to connect, openfiler2 
> is designated as sync target

I have no idea how you managed to get yourself into that situation, but I suggest you re-create the drbd meta data on the "bad" node, and have it sync up from there.

on openfiler1,
	drbdadm down vg0_drbd 
	drbdadm -- --force create-md vg0_drbd
	drbdadm up vg0_drbd
on openfiler2
	drbdadm adjust vg0_drbd
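
The per-node steps above could be kept in a small dry-run helper so that they are reviewed before anything is typed on a live node. This is a minimal sketch: `plan_for` and the `bad`/`good` role names are hypothetical, and the script only prints the commands from this message; it never runs drbdadm.

```shell
#!/bin/sh
# Dry-run helper: prints the commands for one node, executes nothing.
# "plan_for" and the bad/good role names are hypothetical.
RES="vg0_drbd"

plan_for() {
    case "$1" in
        bad)   # node whose meta data gets re-created (openfiler1 here)
            echo "drbdadm down $RES"
            echo "drbdadm -- --force create-md $RES"
            echo "drbdadm up $RES"
            ;;
        good)  # node that keeps its data (openfiler2 here)
            echo "drbdadm adjust $RES"
            ;;
        *)
            echo "usage: plan_for bad|good" >&2
            return 1
            ;;
    esac
}

echo "# on openfiler1"
plan_for bad
echo "# on openfiler2"
plan_for good
```

Printing rather than executing is deliberate: create-md with --force is destructive for the node it runs on, so the node and resource name should be double-checked first.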


Then have someone help you figure out what went wrong, and how to avoid that in the future...

	Lars


> mdavidson at allureglobal.co wrote:
> > 
> > in the middle of trying to manually recover from a split-brain, it 
> > seems i've created a little bit of a mess.  I'm using two openfiler 
> > machines with drbd as HA iscsi storage for a xenserver cluster as 
> > described here
> > http://www.howtoforge.com/installing-and-configuring-openfiler-with-drbd-and-heartbeat-p2 .
> > I've managed to get the cluster_metadata 
> > resource syncing properly, but the actual data resource is being 
> > fussy.  Openfiler2 is currently primary and seems to be working fine 
> > as all my vm's are currently online.  I'd like to keep openfiler2 as 
> > the primary, but when i tell openfiler1 to connect the system 
> > designates openfiler1 as the sync target.  I'm rather new to drbd so 
> > if there's any other info i need to post please let me know
> > 
> > Openfiler1 status:
> > [root at openfiler1 log]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs            ro                 ds                     p  mounted  fstype
> > 0:cluster_metadata  Connected     Secondary/Primary  UpToDate/UpToDate      C
> > 1:vg0_drbd          WFConnection  Secondary/Unknown  Inconsistent/DUnknown  C
> > [root at openfiler1 log]#
> > 
> > 
> > Openfiler2 status:
> > [root at openfiler2 ha.d]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs          ro                 ds                     p      mounted            fstype
> > 0:cluster_metadata  Connected   Primary/Secondary  UpToDate/UpToDate      C      /cluster_metadata  ext3
> > 1:vg0_drbd          StandAlone  Primary/Unknown    UpToDate/Inconsistent  r----
> > [root at openfiler2 ha.d]#
> > 
> > 
> > dmesg output from openfiler2:
> > [1287966.539911] block drbd1: Starting receiver thread (from drbd1_worker [3145])
> > [1287966.540030] block drbd1: receiver (re)started
> > [1287966.540047] block drbd1: conn( Unconnected -> WFConnection )
> > [1287966.639236] block drbd1: Handshake successful: Agreed network protocol version 91
> > [1287966.639246] block drbd1: conn( WFConnection -> WFReportParams )
> > [1287966.639282] block drbd1: Starting asender thread (from drbd1_receiver [12115])
> > [1287966.639419] block drbd1: data-integrity-alg: <not-used>
> > [1287966.639526] block drbd1: drbd_sync_handshake:
> > [1287966.639532] block drbd1: self 89867987176E42C7:0000000000000000:C1B7F3C81019781C:2516E370EEC0B159 bits:29285 flags:0
> > [1287966.639538] block drbd1: peer 80413839405F0B3A:89867987176E42C6:C1B7F3C81019781C:2516E370EEC0B159 bits:0 flags:0
> > [1287966.639542] block drbd1: uuid_compare()=-1 by rule 50
> > [1287966.639546] block drbd1: I shall become SyncTarget, but I am primary!
> > [1287966.639777] block drbd1: conn( WFReportParams -> Disconnecting )
> > [1287966.639785] block drbd1: error receiving ReportState, l: 4!
> > [1287966.640033] block drbd1: asender terminated
> > [1287966.640042] block drbd1: Terminating asender thread
> > [1287966.640209] block drbd1: Connection closed
> > [1287966.640217] block drbd1: conn( Disconnecting -> StandAlone )
> > [1287966.640234] block drbd1: receiver terminated
> > [1287966.640239] block drbd1: Terminating receiver thread
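
The "uuid_compare()=-1 by rule 50" line in the log above can be checked by hand. The sketch below assumes that each UUID set is ordered current:bitmap:history:history and that rule 50 compares the local current UUID with the peer's bitmap UUID while ignoring the lowest bit. That is my reading of the 8.3 handshake, not something the log itself states, and `mask_low_bit` is a hypothetical helper.

```shell
#!/bin/sh
# UUIDs copied from the drbd_sync_handshake lines in the dmesg output.
self_current=89867987176E42C7   # first field of the "self" line
peer_bitmap=89867987176E42C6    # second field of the "peer" line

# Clear the lowest bit of a 16-digit hex UUID without 64-bit arithmetic:
# keep the first 15 digits untouched and mask only the last one.
mask_low_bit() {
    head=${1%?}
    last=${1#"$head"}
    printf '%s%X\n' "$head" $(( 0x$last & ~1 ))
}

if [ "$(mask_low_bit "$self_current")" = "$(mask_low_bit "$peer_bitmap")" ]; then
    echo "peer bitmap UUID matches local current UUID: local node would be SyncTarget"
else
    echo "no match under this rule"
fi
```

With the values from this log the two masked UUIDs are identical, which is consistent with openfiler2 reporting "I shall become SyncTarget" and then refusing because it is primary.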
> > 
> > 
> 
> --
> View this message in context: http://old.nabble.com/borked-split-brain-recovery-tp34510920p34510927.html
> Sent from the DRBD - User mailing list archive at Nabble.com.
> 
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD(r) and LINBIT(r) are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed