[Drbd-dev] DRBD8: Split-brain if primary and syncTarget
Philipp Reisner
philipp.reisner at linbit.com
Mon Mar 12 15:52:16 CET 2007
On Monday, 12 March 2007 15:28, Philipp Reisner wrote:
> On Thursday, 8 March 2007 23:21, Montrose, Ernest wrote:
> > Hi all,
> >
> > We are seeing an issue with split brain if one node is syncing as
> > syncTarget while being Primary.
> > To reproduce with two nodes, A and B:
> > * make B primary and the syncTarget
> > * Start a sync.
> > * ifdown eth1 to break communication
> > * ifup eth1.
> > * then, on the node that is StandAlone, run "drbdadm connect"
> > We get a split-brain.
> >
> > I think the problem is that if we are primary and we lose contact from
> > the other side we generate a new current UUID which causes a Split-Brain
> > next time we connect.
> > This only happens if we are the sync target and we are primary. Perhaps
> > we should not generate a UUID if we were syncing when the disconnect
> > happened. Below is a possible patch for this in after_state_ch():
>
> Hi Ernest,
>
> I think the current behaviour is correct.
>
> * When a node is SyncTarget it actually exposes the data of the sync
> source node to its applications. (And the applications can potentially
> see the data when the SyncTarget node is primary.)
>
> * When you disconnect such a node, it has to fall back to its local
> data set. == suddenly the applications see a different data set,
> and of course the apps might continue to modify this data set...
>
> * When you reconnect this, you have a split-brain situation. But you
> might let the automatic split-brain resolution handler solve the
> situation. Use some after-sb-?pri settings, and an rr-conflict of
> "violently", e.g.:
>
> after-sb-0pri discard-least-changes
> after-sb-1pri violently-as0p
> after-sb-2pri violently-as0p
> rr-conflict violently
>
> Then the resync should continue, since "violently" allows DRBD
> to change the data set that is seen on the Primary node again.
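
For reference, a minimal sketch of where the settings quoted above would
sit in drbd.conf. The resource name "r0" is hypothetical, and the placement
inside the "net" section is stated from memory of the DRBD 8 configuration
layout rather than taken from this thread, so check it against the
drbd.conf man page for your version:

```
# Sketch only: "r0" is a placeholder resource name, not from this thread.
resource r0 {
  net {
    # Automatic split-brain recovery policies, as quoted above
    after-sb-0pri discard-least-changes;
    after-sb-1pri violently-as0p;
    after-sb-2pri violently-as0p;
    # Allow a resync even if it changes data seen by a Primary node
    rr-conflict   violently;
  }
}
```

The "violently" policies can replace data underneath a running Primary, so
they are only appropriate where that is an accepted risk.
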
Hmmm. I just had a look at the code in drbd_sync_handshake(), and came
to the conclusion that the handling of the inconsistent disk state was
a bit obscure.
With the attached patch, the after-sb-?pri settings no longer have any
impact in such a situation. Only the "rr-conflict" setting
should influence the outcome...
If it works for you with that patch, I will commit it...
-phil
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, 1120 Vienna, Austria http://www.linbit.com :
-------------- next part --------------
A non-text attachment was scrubbed...
Name: look_at_inconsistent_first.diff
Type: text/x-diff
Size: 1599 bytes
Desc: not available
Url : http://lists.linbit.com/pipermail/drbd-dev/attachments/20070312/d19be3ca/look_at_inconsistent_first.bin