[DRBD-user] borked split-brain recovery

Lars Ellenberg lars.ellenberg at linbit.com
Thu Oct 4 15:05:29 CEST 2012



On Wed, Oct 03, 2012 at 01:08:40PM -0700, mdavidson at allureglobal.co wrote:
> 
> Sorry, I meant to say: when telling openfiler1 to connect, openfiler2 is
> designated as sync target.

I have no idea how you managed to get yourself into that situation,
but I suggest you re-create the DRBD meta data on the "bad" node,
and have it sync up from there.

On openfiler1:
	drbdadm down vg0_drbd
	drbdadm -- --force create-md vg0_drbd
	drbdadm up vg0_drbd
On openfiler2:
	drbdadm adjust vg0_drbd
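
The sequence above could be wrapped in a small script, for example.
This is only a sketch: the run helper, the DRY_RUN switch and the two
function names are made up for illustration and are not part of DRBD;
the drbdadm invocations are exactly the ones listed above. By default
it only prints the commands, because create-md discards the existing
meta data on the node it runs on.

```shell
#!/bin/sh
# Sketch of the recovery sequence from this thread.
# Resource and host names are taken from the thread; run(), DRY_RUN and
# the function names are illustrative assumptions, not DRBD features.

RES="${RES:-vg0_drbd}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# For the node whose data is to be discarded (openfiler1 in this thread).
recover_bad_node() {
    run drbdadm down "$RES"
    run drbdadm -- --force create-md "$RES"   # re-creates the meta data
    run drbdadm up "$RES"
}

# For the node that holds the good data (openfiler2 in this thread).
reconnect_good_node() {
    run drbdadm adjust "$RES"
}

# Pick the role from the first argument; do nothing if none is given.
case "${1:-}" in
    bad)  recover_bad_node ;;
    good) reconnect_good_node ;;
esac
```

Saved as, say, recover.sh (a hypothetical name), "sh recover.sh bad" on
openfiler1 and "sh recover.sh good" on openfiler2 print the commands;
setting DRY_RUN=0 would actually execute them.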


Then have someone help you figure out what went wrong,
and how to avoid that in the future...

	Lars


> mdavidson at allureglobal.co wrote:
> > 
> > In the middle of trying to manually recover from a split-brain, it seems
> > I've created a bit of a mess.  I'm using two Openfiler machines
> > with DRBD as HA iSCSI storage for a XenServer cluster, as described here:
> > http://www.howtoforge.com/installing-and-configuring-openfiler-with-drbd-and-heartbeat-p2
> > I've managed to get the cluster_metadata resource syncing properly, but
> > the actual data resource is being fussy.  openfiler2 is currently primary
> > and seems to be working fine, as all my VMs are currently online.  I'd
> > like to keep openfiler2 as the primary, but when I tell openfiler1 to
> > connect, the system designates openfiler1 as the sync target.  I'm rather
> > new to DRBD, so if there's any other info I need to post, please let me know.
> > 
> > Openfiler1 status:
> > [root at openfiler1 log]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs            ro                 ds                     p  mounted  fstype
> > 0:cluster_metadata  Connected     Secondary/Primary  UpToDate/UpToDate      C
> > 1:vg0_drbd          WFConnection  Secondary/Unknown  Inconsistent/DUnknown  C
> > [root at openfiler1 log]# 
> > 
> > 
> > Openfiler2 status:
> > [root at openfiler2 ha.d]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs          ro                 ds                     p      mounted            fstype
> > 0:cluster_metadata  Connected   Primary/Secondary  UpToDate/UpToDate      C      /cluster_metadata  ext3
> > 1:vg0_drbd          StandAlone  Primary/Unknown    UpToDate/Inconsistent  r----
> > [root at openfiler2 ha.d]# 
> > 
> > 
> > dmesg output from openfiler2:
> > [1287966.539911] block drbd1: Starting receiver thread (from drbd1_worker [3145])
> > [1287966.540030] block drbd1: receiver (re)started
> > [1287966.540047] block drbd1: conn( Unconnected -> WFConnection ) 
> > [1287966.639236] block drbd1: Handshake successful: Agreed network protocol version 91
> > [1287966.639246] block drbd1: conn( WFConnection -> WFReportParams ) 
> > [1287966.639282] block drbd1: Starting asender thread (from drbd1_receiver [12115])
> > [1287966.639419] block drbd1: data-integrity-alg: <not-used>
> > [1287966.639526] block drbd1: drbd_sync_handshake:
> > [1287966.639532] block drbd1: self 89867987176E42C7:0000000000000000:C1B7F3C81019781C:2516E370EEC0B159 bits:29285 flags:0
> > [1287966.639538] block drbd1: peer 80413839405F0B3A:89867987176E42C6:C1B7F3C81019781C:2516E370EEC0B159 bits:0 flags:0
> > [1287966.639542] block drbd1: uuid_compare()=-1 by rule 50
> > [1287966.639546] block drbd1: I shall become SyncTarget, but I am primary!
> > [1287966.639777] block drbd1: conn( WFReportParams -> Disconnecting ) 
> > [1287966.639785] block drbd1: error receiving ReportState, l: 4!
> > [1287966.640033] block drbd1: asender terminated
> > [1287966.640042] block drbd1: Terminating asender thread
> > [1287966.640209] block drbd1: Connection closed
> > [1287966.640217] block drbd1: conn( Disconnecting -> StandAlone ) 
> > [1287966.640234] block drbd1: receiver terminated
> > [1287966.640239] block drbd1: Terminating receiver thread
> > 
> > 
> 

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


