Should I `cat /dev/zero > /dev/sdb1` before re-creating the metadata?

[root at openfiler1 ~]# drbdadm -- --force create-md vg0_drbd
pvs stderr: /dev/sdb1: Added to device cache
pvs stderr: /dev/sdb1: Skipping (regex)
pvs stderr: Failed to read physical volume "/dev/sdb1"
pvs stderr: Unlocking /var/lock/lvm/P_global
md_offset 362175492096
al_offset 362175459328
bm_offset 362164404224
Found LVM2 physical volume signature
353676176 kB left usable by current configuration
Could not determine the size of the actually used data area.

Device size would be truncated, which would corrupt data and result in
'access beyond end of device' errors.
If you want me to do this, you need to zero out the first part of the
device (destroy the content).
You should be very sure that you mean it.
Operation refused.

Command 'drbdmeta 1 v08 /dev/sdb1 internal create-md --force' terminated with exit code 40
drbdadm create-md vg0_drbd: exited with code 40
[root at openfiler1 ~]#

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Thursday, October 04, 2012 9:05 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] borked split-brain recovery

On Wed, Oct 03, 2012 at 01:08:40PM -0700, mdavidson at allureglobal.co wrote:
>
> sorry, i meant to say, when telling openfiler1 to connect, openfiler2
> is designated as sync target

I have no idea how you managed to get yourself into that situation,
but I suggest re-creating the DRBD meta data on the "bad" node,
and having it sync up from there.

On openfiler1:
	drbdadm down vg0_drbd
	drbdadm -- --force create-md vg0_drbd
	drbdadm up vg0_drbd

On openfiler2:
	drbdadm adjust vg0_drbd

Then have someone help you figure out what went wrong,
and how to avoid that in the future...

	Lars

> mdavidson at allureglobal.co wrote:
> >
> > In the middle of trying to manually recover from a split-brain, it
> > seems I've created a bit of a mess.
> > I'm using two Openfiler
> > machines with DRBD as HA iSCSI storage for a XenServer cluster, as
> > described here:
> > http://www.howtoforge.com/installing-and-configuring-openfiler-with-drbd-and-heartbeat-p2
> > I've managed to get the cluster_metadata resource syncing properly,
> > but the actual data resource is being fussy. Openfiler2 is currently
> > primary and seems to be working fine, as all my VMs are currently
> > online. I'd like to keep openfiler2 as the primary, but when I tell
> > openfiler1 to connect, the system designates openfiler1 as the sync
> > target. I'm rather new to DRBD, so if there's any other info I need
> > to post, please let me know.
> >
> > Openfiler1 status:
> > [root at openfiler1 log]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs            ro                 ds                     p  mounted  fstype
> > 0:cluster_metadata  Connected     Secondary/Primary  UpToDate/UpToDate      C
> > 1:vg0_drbd          WFConnection  Secondary/Unknown  Inconsistent/DUnknown  C
> > [root at openfiler1 log]#
> >
> > Openfiler2 status:
> > [root at openfiler2 ha.d]# service drbd status
> > drbd driver loaded OK; device status:
> > version: 8.3.7 (api:88/proto:86-91)
> > GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil at fat-tyre, 2010-01-13 17:17:27
> > m:res               cs          ro                 ds                     p  mounted            fstype
> > 0:cluster_metadata  Connected   Primary/Secondary  UpToDate/UpToDate      C  /cluster_metadata  ext3
> > 1:vg0_drbd          StandAlone  Primary/Unknown    UpToDate/Inconsistent  r----
> > [root at openfiler2 ha.d]#
> >
> > dmesg output from openfiler2:
> > [1287966.539911] block drbd1: Starting receiver thread (from drbd1_worker)
> > [1287966.540030] block drbd1: receiver (re)started
> > [1287966.540047] block drbd1: conn(
> > Unconnected -> WFConnection )
> > [1287966.639236] block drbd1: Handshake successful: Agreed network protocol version 91
> > [1287966.639246] block drbd1: conn( WFConnection -> WFReportParams )
> > [1287966.639282] block drbd1: Starting asender thread (from drbd1_receiver)
> > [1287966.639419] block drbd1: data-integrity-alg: <not-used>
> > [1287966.639526] block drbd1: drbd_sync_handshake:
> > [1287966.639532] block drbd1: self 89867987176E42C7:0000000000000000:C1B7F3C81019781C:2516E370EEC0B159 bits:29285 flags:0
> > [1287966.639538] block drbd1: peer 80413839405F0B3A:89867987176E42C6:C1B7F3C81019781C:2516E370EEC0B159 bits:0 flags:0
> > [1287966.639542] block drbd1: uuid_compare()=-1 by rule 50
> > [1287966.639546] block drbd1: I shall become SyncTarget, but I am primary!
> > [1287966.639777] block drbd1: conn( WFReportParams -> Disconnecting )
> > [1287966.639785] block drbd1: error receiving ReportState, l: 4!
> > [1287966.640033] block drbd1: asender terminated
> > [1287966.640042] block drbd1: Terminating asender thread
> > [1287966.640209] block drbd1: Connection closed
> > [1287966.640217] block drbd1: conn( Disconnecting -> StandAlone )
> > [1287966.640234] block drbd1: receiver terminated
> > [1287966.640239] block drbd1: Terminating receiver thread
>
> --
> View this message in context:
> http://old.nabble.com/borked-split-brain-recovery-tp34510920p34510927.html
> Sent from the DRBD - User mailing list archive at Nabble.com.
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD(r) and LINBIT(r) are registered trademarks of LINBIT, Austria.
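On the question asked at the top of the thread: streaming `/dev/zero` over the whole partition would work, but the drbdmeta refusal only asks that the first part of the device be destroyed, i.e. the stale LVM2 PV signature near the start. Below is a minimal sketch of Lars's procedure with that zeroing step added. The device (`/dev/sdb1`) and resource (`vg0_drbd`) names are taken from the thread; the 16 MiB count and the dry-run wrapper are assumptions added here for safety, not from the original messages:

```shell
#!/bin/sh
# Sketch of the metadata re-creation on the "bad" node (openfiler1).
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to
# actually run them -- this DESTROYS data on $DEV.
DEV=${DEV:-/dev/sdb1}
RES=${RES:-vg0_drbd}

run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then
		echo "+ $*"        # dry run: show the command only
	else
		"$@"               # live run: execute it
	fi
}

run drbdadm down "$RES"
# Zero only the first 16 MiB, enough to wipe the LVM2 PV signature
# that made drbdmeta refuse (assumed size, not from the thread):
run dd if=/dev/zero of="$DEV" bs=1M count=16
run drbdadm -- --force create-md "$RES"
run drbdadm up "$RES"
```

After this, `drbdadm adjust vg0_drbd` on the peer, as Lars says, should let openfiler1 come up Inconsistent and sync from openfiler2.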
__
please don't Cc me, but send to list -- I'm subscribed

_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
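For reference, the dmesg line "I shall become SyncTarget, but I am primary!" is DRBD declining to overwrite a node that is still Primary. The usual manual split-brain resolution in DRBD 8.3 is to demote the node whose changes you are willing to throw away and reconnect it with `--discard-my-data`. A sketch, reusing the resource name from the thread and the same dry-run wrapper (an assumption added here, not part of the documented procedure):

```shell
#!/bin/sh
# Standard DRBD 8.3 manual split-brain resolution, shown as a dry run.
# Deciding WHICH node's data to discard is the critical judgment call.
RES=${RES:-vg0_drbd}

run() {
	if [ "${DRY_RUN:-1}" = 1 ]; then
		echo "+ $*"        # dry run: show the command only
	else
		"$@"               # live run: execute it
	fi
}

# On the split-brain victim (here: openfiler1, whose data is discarded):
run drbdadm secondary "$RES"
run drbdadm -- --discard-my-data connect "$RES"

# On the survivor (openfiler2), only if it is StandAlone:
run drbdadm connect "$RES"
```

In this thread that path was blocked because openfiler1's metadata was already confused, which is why re-creating the metadata outright was the simpler way out.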