Well, my boss just tried `drbdadm connect all` on openfiler2 and the nodes
are now syncing with 2 as source and 1 as target, so all is well in the
world again. Thanks guys!

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Thursday, October 04, 2012 9:05 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] borked split-brain recovery

On Wed, Oct 03, 2012 at 01:08:40PM -0700, mdavidson at allureglobal.co wrote:
>
> sorry, i meant to say, when telling openfiler1 to connect, openfiler2
> is designated as sync target

I have no idea how you managed to get yourself into that situation,
but I suggest to re-create the drbd meta data on the "bad" node,
and have it sync up from there.

on openfiler1,
    drbdadm down vg0_drbd
    drbdadm -- --force create-md vg0_drbd
    drbdadm up vg0_drbd

on openfiler2
    drbdadm adjust vg0_drbd

Then have someone help you figure out what went wrong,
and how to avoid that in the future...

    Lars

> mdavidson at allureglobal.co wrote:
> >
> > in the middle of trying to manually recover from a split-brain, it
> > seems i've created a little bit of a mess. I'm using two openfiler
> > machines with drbd as HA iscsi storage for a xenserver cluster as
> > described here
> > http://www.howtoforge.com/installing-and-configuring-openfiler-with-drbd-and-heartbeat-p2 .
> > I've managed to get the cluster_metadata resource syncing properly,
> > but the actual data resource is being fussy. Openfiler2 is currently
> > primary and seems to be working fine as all my vm's are currently
> > online. I'd like to keep openfiler2 as the primary, but when i tell
> > openfiler1 to connect the system designates openfiler1 as the sync
> > target. I'm rather new to drbd so if there's any other info i need
> > to post please let me know
> >
> > Openfiler1 status:
> > [root at openfiler1 log]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs            ro                 ds                     p  mounted  fstype
> > 0:cluster_metadata  Connected     Secondary/Primary  UpToDate/UpToDate      C
> > 1:vg0_drbd          WFConnection  Secondary/Unknown  Inconsistent/DUnknown  C
> > [root at openfiler1 log]#
> >
> >
> > Openfiler2 status:
> > [root at openfiler2 ha.d]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs          ro                 ds                     p      mounted            fstype
> > 0:cluster_metadata  Connected   Primary/Secondary  UpToDate/UpToDate      C      /cluster_metadata  ext3
> > 1:vg0_drbd          StandAlone  Primary/Unknown    UpToDate/Inconsistent  r----
> > [root at openfiler2 ha.d]#
> >
> >
> > dmesg output from openfiler2:
> > [1287966.539911] block drbd1: Starting receiver thread (from drbd1_worker [3145])
> > [1287966.540030] block drbd1: receiver (re)started
> > [1287966.540047] block drbd1: conn( Unconnected -> WFConnection )
> > [1287966.639236] block drbd1: Handshake successful: Agreed network protocol version 91
> > [1287966.639246] block drbd1: conn( WFConnection -> WFReportParams )
> > [1287966.639282] block drbd1: Starting asender thread (from drbd1_receiver [12115])
> > [1287966.639419] block drbd1: data-integrity-alg: <not-used>
> > [1287966.639526] block drbd1: drbd_sync_handshake:
> > [1287966.639532] block drbd1: self 89867987176E42C7:0000000000000000:C1B7F3C81019781C:2516E370EEC0B159 bits:29285 flags:0
> > [1287966.639538] block drbd1: peer 80413839405F0B3A:89867987176E42C6:C1B7F3C81019781C:2516E370EEC0B159 bits:0 flags:0
> > [1287966.639542] block drbd1: uuid_compare()=-1 by rule 50
> > [1287966.639546] block drbd1: I shall become SyncTarget, but I am primary!
> > [1287966.639777] block drbd1: conn( WFReportParams -> Disconnecting )
> > [1287966.639785] block drbd1: error receiving ReportState, l: 4!
> > [1287966.640033] block drbd1: asender terminated
> > [1287966.640042] block drbd1: Terminating asender thread
> > [1287966.640209] block drbd1: Connection closed
> > [1287966.640217] block drbd1: conn( Disconnecting -> StandAlone )
> > [1287966.640234] block drbd1: receiver terminated
> > [1287966.640239] block drbd1: Terminating receiver thread
> >
> >
> --
> View this message in context:
> http://old.nabble.com/borked-split-brain-recovery-tp34510920p34510927.html
> Sent from the DRBD - User mailing list archive at Nabble.com.
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD(r) and LINBIT(r) are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
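
A note for anyone puzzling over the "uuid_compare()=-1 by rule 50" line in
the dmesg output quoted in this thread: the small Python sketch below compares
the two generation-identifier sets from that log. It assumes the four
colon-separated fields are current, bitmap, and two history UUIDs, in that
order, and that the lowest bit of each is a flag to be masked off before
comparing. It also assumes rule 50 fires when the peer's bitmap UUID equals
the local current UUID. Those assumptions are one reading of DRBD 8.3, not
something stated in the thread, and the helper name parse_uuids is made up
for this example.

```python
# Sketch only: field order and the "rule 50" condition are assumptions
# about DRBD 8.3, not taken from the thread itself.

MASK = ~1 & 0xFFFFFFFFFFFFFFFF  # clear the lowest bit of a 64-bit UUID


def parse_uuids(line):
    """Return the four generation identifiers on a 'self'/'peer' log line as ints."""
    for token in line.split():
        parts = token.split(":")
        # The UUID set is the only token made of four 16-digit hex fields.
        if len(parts) == 4 and all(len(p) == 16 for p in parts):
            return [int(p, 16) for p in parts]
    raise ValueError("no UUID set found in line")


# The two lines logged on openfiler2 (so "self" is openfiler2).
self_line = ("block drbd1: self 89867987176E42C7:0000000000000000:"
             "C1B7F3C81019781C:2516E370EEC0B159 bits:29285 flags:0")
peer_line = ("block drbd1: peer 80413839405F0B3A:89867987176E42C6:"
             "C1B7F3C81019781C:2516E370EEC0B159 bits:0 flags:0")

self_current = parse_uuids(self_line)[0]  # assumed: field 0 is the current UUID
peer_bitmap = parse_uuids(peer_line)[1]   # assumed: field 1 is the bitmap UUID

# If the peer's bitmap UUID matches our current UUID, the peer's metadata
# claims a newer data generation that descends from ours.
peer_is_ahead = (self_current & MASK) == (peer_bitmap & MASK)
print(peer_is_ahead)
```

If that reading is right, openfiler1's metadata claimed a newer generation
descended from openfiler2's data, which would be consistent with openfiler2
logging "I shall become SyncTarget, but I am primary!" and with re-creating
the metadata on openfiler1 clearing the problem.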