[DRBD-user] drbd split brain recovery - workaround?

Thu Sep 29 14:58:34 CEST 2005

Hi All,

This topic has been covered many times, on a good few mailing lists that
I've already found.
To quickly recap the issue in question (which I'm also suffering from):

2 hosts - Tara, and Inertia, using Linux HA in an active/passive
configuration.
Inertia is the primary, aka node 1.

I pull the plug on the single network cable connecting Inertia to the
switch.
DRBD notices the dropped link. Inertia is now Primary/Unknown. Tara is
Secondary/Unknown.
The serial connection is still up between the nodes, HB negotiates with
the ping nodes, and fails over to Tara.

The HA failover scripts change Inertia to Secondary/Unknown, and Tara to
Primary/Unknown.

Great. All is working as designed so far, and service continues....good
cluster. *pet pet*

Now, I plug the cable back into inertia, drbd notices, and prints:
Sep 30 00:08:06 tara kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Sep 30 00:08:06 tara kernel: drbd0: Connection established.
Sep 30 00:08:06 tara kernel: drbd0: I am(P): 1:00000002:00000001:00000060:0000001c:10
Sep 30 00:08:06 tara kernel: drbd0: Peer(S): 1:00000002:00000001:00000061:0000001b:00
Sep 30 00:08:06 tara kernel: drbd0: Current Primary shall become sync TARGET! Aborting to prevent data corruption.

So, once inertia has network access restored, it is unable to resync.
Now, I have already researched this and the drbd developers have
explained that this is not drbd's fault. The reason is that BOTH sides
have changed.
I assume the simple act of umounting the disk (as part of the failover)
on inertia is enough to count as a write, incrementing drbd's counters
on inertia.
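
To make the "both sides have changed" point visible, the two counter
strings from the log can be compared field by field. This is only a rough
sketch: the post does not say what each field means, so the hypothetical
helper diff_counters labels fields by position, not by name.

```shell
#!/bin/bash
# Sketch only: compare two colon-separated DRBD counter strings.
# Field meanings are NOT given in the post, so fields are numbered by position.
diff_counters() {
    mine=$1 peer=$2 i=1
    while [ -n "$mine" ]; do
        a=${mine%%:*}    # first remaining field of the local string
        b=${peer%%:*}    # first remaining field of the peer string
        [ "$a" = "$b" ] || echo "field $i: local=$a peer=$b"
        # Drop the field just compared; stop once no ':' remains.
        case $mine in *:*) mine=${mine#*:} ;; *) mine= ;; esac
        case $peer in *:*) peer=${peer#*:} ;; *) peer= ;; esac
        i=$((i + 1))
    done
}

# The "I am(P)" and "Peer(S)" strings exactly as logged on tara.
diff_counters "1:00000002:00000001:00000060:0000001c:10" \
              "1:00000002:00000001:00000061:0000001b:00"
```

For the logged strings this reports differences in fields 4, 5 and 6.
Field 4 is higher on the peer while field 5 is higher locally, which would
fit the explanation that each node changed on its own while disconnected.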

There have been various suggestions along the lines of stonithing
inertia, and that if inertia were to be restarted manually, the problem
will go away. Neither of these sits well with me.
I also understand from the road map that drbd 0.8 will have some options
to deal with exactly this situation.

With some exploring I found that if the Primary drbd host runs a simple
'drbdadm connect r0', the two will resync successfully.  However, if the
secondary runs the same command, it won't work.

So I wrote this script (be kind, this is my first ever bash script...)

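#!/bin/bash
# Re-issue "drbdadm connect" when /proc/drbd reports an Unknown peer.
drbdadm=/sbin/drbdadm

# The peer's role shows as Unknown while the link is down.
if grep -q Unknown /proc/drbd
then
    echo "we have a broken drbd connection"
    "$drbdadm" connect r0
fi

exit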

This was then added to cron on both machines, and set to run every 10
minutes, with the two servers offset from each other by 5 minutes. This
allows the system to work with either host running as the active node.
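
Since only the Primary's connect seemed to help, a variant could skip the
attempt on the Secondary. This is a minimal sketch, not tested against a
real cluster: the 'st:Primary/Unknown' pattern is my assumption about the
0.7 /proc/drbd layout, and the function names, the environment overrides
and the script path in the crontab comments are all hypothetical.

```shell
#!/bin/bash
# Sketch only: reconnect when this node is Primary and the peer is Unknown.
# Hypothetical crontab entries giving the 5-minute offset (path is made up):
#   node A:  */10 * * * *     /usr/local/sbin/drbd-reconnect
#   node B:  5-59/10 * * * *  /usr/local/sbin/drbd-reconnect
STATUS_FILE=${STATUS_FILE:-/proc/drbd}
RESOURCE=${RESOURCE:-r0}
DRBDADM=${DRBDADM:-/sbin/drbdadm}

# Succeeds if the given status file shows local role Primary, peer Unknown.
# ASSUMPTION: roles appear as "st:<local>/<peer>" in the status file.
primary_without_peer() {
    grep -q 'st:Primary/Unknown' "$1" 2>/dev/null
}

# Runs "drbdadm connect" only when the check above succeeds.
reconnect_if_needed() {
    if primary_without_peer "$STATUS_FILE"; then
        echo "Primary with unknown peer: reconnecting $RESOURCE"
        "$DRBDADM" connect "$RESOURCE"
    fi
}

reconnect_if_needed
```

Keeping the check in its own function means it can be tried against a
saved copy of /proc/drbd before the script goes anywhere near cron.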

For me... this has fixed my split brain problem. This still isn't
sophisticated enough to allow for a HA auto-fallback, but at least I
have data synced on both disks increasing my redundancy until a) I
switch back manually, or b) another failure takes out the failed over
node, in which case this will have saved my bacon.

Now, I might be misunderstanding exactly what drbdadm connect does. At
face value this appears to simply be initiating a connection with its
peer, and it seems to me that this is something that drbd should be able
to take care of itself internally.
I have my connect-int set to 10 seconds, but from what I'm seeing here,
they try *ONCE* and give up. A simple retry (in the "other" direction)
should allow for a successful resync?
As an intermediate step before 0.8, is this something that could be
implemented in 0.7.x? I know this doesn't cover all split brain
situations, but I'm sure this would help others with this situation?

Or am I missing something fundamental?

Thanks,
Jonathan Wheeler