[DRBD-user] split-brain after trying to verify?

Michael Tokarev mjt at tls.msk.ru
Fri Oct 23 14:14:17 CEST 2009

Lars Ellenberg wrote:
> On Fri, Oct 23, 2009 at 03:21:49PM +0400, Michael Tokarev wrote:
>> Hello.
[]
> You adjusted network parameters (verify-alg), which we still cannot do
> while keeping the connection.
> So your first "adjust" to add the verify-alg had to
> disconnect ; then reconnect with new parameters.
> 
> Don't do that while both are Primary.

Aha.  Makes sense.  However, it looks quite fragile
this way.  Could it refuse, or at least warn, in such
situations?

>> I also don't have an idea what to do next, ie,
>> how to resolve the "conflict".  Restarting stuff
>> does not help.
> 
> There is a section about recovering from split brain
> in the DRBD User's Guide

I read and tried it yesterday.  But for some reason
it looked to me as if there was only one choice for
after-sb-2pri, namely disconnect.  This part:

  call-pri-lost-after-sb: Apply the recovery policies as
   specified in after-sb-0pri. If a split brain victim can
   be selected after applying these policies, invoke the
   pri-lost-after-sb handler on the victim node. This handler
   must be configured in the handlers section and is
   expected to forcibly remove the node from the cluster.
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now I see where it goes.

I've added:

     after-sb-0pri discard-zero-changes;
     after-sb-1pri discard-secondary;
     after-sb-2pri call-pri-lost-after-sb;

and it worked immediately.  So it appears that the data
was indeed exactly the same, but the changes were somewhere
else.
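
For anyone finding this in the archives, here is a minimal sketch of
where such settings would sit in drbd.conf, assuming DRBD 8.3 syntax.
The resource name r0 and the handler command are hypothetical
placeholders, and allow-two-primaries is only assumed because both
nodes were Primary.  The three after-sb-* lines are the ones quoted
above.

```
resource r0 {                  # "r0" is a placeholder resource name
  net {
    allow-two-primaries;       # assumed: both nodes were Primary
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri call-pri-lost-after-sb;
  }
  handlers {
    # Required by call-pri-lost-after-sb.  The script path is
    # hypothetical; the handler is expected to forcibly remove
    # the victim node from the cluster.
    pri-lost-after-sb "/usr/local/sbin/remove-node-from-cluster.sh";
  }
  # disk, syncer and "on <host>" sections omitted
}
```

Since these are network parameters, applying them to a connected
resource presumably means another disconnect and reconnect, which
should not be done while both nodes are Primary.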

I wonder why discard-zero-changes is not the default...

Thanks!

/mjt


