[DRBD-user] strange split-brain problem

Felix Frank ff at mpexnet.de
Tue Dec 7 14:29:39 CET 2010

On 12/07/2010 02:09 PM, Klaus Darilion wrote:
> On 07.12.2010 12:08, Klaus Darilion wrote:
>>> You will need to resolve that, refer to
>>> http://www.drbd.org/users-guide/s-resolve-split-brain.html
>> I did that now and try to reproduce the problem.
> It happened again. Here is what I did:
> 1. on node B:
>   drbdadm secondary resource
>   drbdadm -- --discard-my-data connect resource
> 2. on node A
>   drbdadm connect resource
> Then everything was fine again. node A was primary, node B was secondary.
> Then I shut down node B.
> Then I rebooted node A.
> Then, one resource on node A came up as primary without problems.
> The second resource on node A (the one which I had to resolve from split
> brain) again did not come up:
> block drbd5: State change failed: Refusing to be Primary without at
> least one UpToDate disk
> Then I tried to force it to primary:
> # drbdadm -- --overwrite-data-of-peer primary cc-manager-templates-ha
> 5: State change failed: (-2) Refusing to be Primary without at least one
> UpToDate disk
> Any ideas what is happening here?

My money is on the activity log again. Here is what's happening:

Node A is primary and marks some extents as hot in its activity log.
Then you reboot it. It comes back up and finds the hot extents, so it
sets the corresponding bits in its QuickSync Bitmap and wants to sync
those extents from the secondary. Thus, your device is inconsistent,
which is why it refuses to become Primary, but that is no mistake on
your part.

The reasoning is that Node A won't quite trust the data on its physical
disk, so it relies on whatever Node B knows to be the last state of the
most recently written extents. Note that once it is connected to Node B,
Node A can and will become Primary even before the sync-back is complete.

Bring back Node B, then Node A should be able to sync itself back to sanity.
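
If you want to check the disk state before trying to promote again, a
small helper along these lines might do. It is only a sketch: it assumes
that `drbdadm dstate <resource>` prints a single "local/peer" pair such as
UpToDate/UpToDate, and the function names are made up for illustration,
they are not part of DRBD.

```shell
# Sketch only. Takes a "local/peer" disk-state string, as printed by
# `drbdadm dstate <resource>`, and succeeds if the local half is UpToDate.
local_disk_uptodate() {
    # Keep everything before the first "/" (the local disk state).
    local_state=${1%%/*}
    [ "$local_state" = "UpToDate" ]
}

# Hypothetical helper: print a short hint based on the local disk state.
explain_dstate() {
    if local_disk_uptodate "$1"; then
        echo "local disk UpToDate: promotion should be possible"
    else
        echo "local disk ${1%%/*}: connect the peer and let the resync finish"
    fi
}

# Usage on node A (resource name taken from this thread):
#   explain_dstate "$(drbdadm dstate cc-manager-templates-ha)"
```

The connection state and the resync progress can be watched with
`drbdadm cstate <resource>` or `cat /proc/drbd` while Node B catches
Node A up.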

