[DRBD-user] strange split-brain problem

Lars Ellenberg lars.ellenberg at linbit.com
Tue Dec 7 20:43:12 CET 2010


On Tue, Dec 07, 2010 at 08:36:01PM +0100, Klaus Darilion wrote:
> Hi Lars!

> >>Then some more reboots on node A and suddenly:
> >>
> >>block drbd5: State change failed: Refusing to be Primary without at
> >>least one UpToDate disk
> >>block drbd5:   state = { cs:WFConnection ro:Secondary/Unknown
> >>ds:Diskless/DUnknown r--- }
> >      ^^^^^^^^
> >
> >You failed to attach, you have not yet connected,
> >so DRBD refuses to become Primary: which data should it be Primary with?
> 
> But how can it be Secondary without any disk?

Oh the wonders of DRBD ;-)
Well, you told it to.
It's completely legal to tell a DRBD to connect to its peer
without having a local disk attached.  It's unusual, though.

> 
> >>Then the status on node A was:
> >>
> >>cc-manager-templates-ha  Connected Primary/Secondary
> >>Diskless/UpToDate A r----
> >
> >It was able to establish the connection,
> >and was going Primary with the data of the peer.
> 
> Is this a feature? How can it know that the peer's data is up to date
> when it cannot attach to the local disk?

You told it to.  DRBD typically does what it is told,
unless it happens to know better for sure
(and even then you can force it, usually).

If you tell it to connect without first attaching a local disk,
and you have no resource-level fencing mechanisms in place,
so the remote end simply assumes itself to be UpToDate,
that's your problem.
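
To make the two states quoted in this thread concrete, here is a minimal
Python sketch of the rule the kernel message hints at ("Refusing to be
Primary without at least one UpToDate disk"). It is a deliberate
simplification, not DRBD's actual state machine: the helper names
`parse_state` and `may_become_primary` are hypothetical, only the fully
`Connected` state is treated as connected, and the second example line is
the init-script status rewritten in the kernel-log `key:value` form.

```python
def parse_state(line):
    """Extract key:value fields such as cs:, ro:, ds: from a DRBD state line."""
    fields = {}
    for token in line.replace("{", " ").replace("}", " ").split():
        key, sep, value = token.partition(":")
        if sep:  # skip tokens without a colon, e.g. "state", "=", "r---"
            fields[key] = value
    return fields


def may_become_primary(line):
    """Simplified reading of the rule: some UpToDate disk must be reachable.

    Assumption: either the local disk is UpToDate, or the node is
    Connected and the peer's disk is UpToDate. Real DRBD checks more.
    """
    state = parse_state(line)
    local_disk, _, peer_disk = state.get("ds", "").partition("/")
    connected = state.get("cs") == "Connected"
    return local_disk == "UpToDate" or (connected and peer_disk == "UpToDate")


# Not attached, not connected: no data to be Primary with.
refused = "state = { cs:WFConnection ro:Secondary/Unknown ds:Diskless/DUnknown r--- }"
# Still diskless, but connected to a peer that claims UpToDate data.
allowed = "cs:Connected ro:Secondary/Secondary ds:Diskless/UpToDate r----"

print(may_become_primary(refused))
print(may_become_primary(allowed))
```

In the second case the diskless node goes Primary purely on the peer's
word that its data is UpToDate, which is why fencing matters here.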

> >>When I tried to manually attach the device I got error messages:
> >>"Split-Brain detected, dropping connection".
> >
> >Hm.  Ugly.
> >It should refuse the attach instead.
> >Did it just get the error message wrong,
> >or did it actually disconnect there?
> >What DRBD version would that be?
> 
> Ubuntu 10.04:
> # /etc/init.d/drbd status
> drbd driver loaded OK; device status:
> version: 8.3.7 (api:88/proto:86-91)
> GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by
> root at cc1-sbg, 2010-10-14 15:13:20

> >And, BTW, no.
> >Your /etc/hosts file has zero to do with how DRBD behaves.
> 
> At least I can reproduce the bad behavior when adding the bug to
> /etc/hosts. I think it has something to do with how I address the disk.
> The one volume which is working fine is configured with:
>   disk /dev/mapper/cc1--vienna-manager--disk--drbd
> 
> The other volume which causes the problems is configured with
>   disk /dev/cc1-vienna/cc-manager-templates-drbd
> which is a symlink to
>   /dev/mapper/cc1--vienna-cc--manager--templates--drbd
> 
> So, I have no idea why, but it seems that if /etc/hosts is broken,
> the symlinks are not available when DRBD starts. When I stop/start
> the DRBD service after booting up, DRBD attaches to the disks
> fine. Strange.

At best, changing stuff in /etc/hosts changes some timing during
your boot process. Which means it is still broken, because it is racy.
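
One way to observe such a race is to check, right before DRBD is started,
whether the configured backing device really exists yet. The following
Python sketch polls for a block device with a timeout. The function names
and the timeout values are hypothetical choices for illustration, and this
only detects the missing device; it does not fix the ordering of the boot
process itself.

```python
import os
import stat
import time


def is_block_device(path):
    """True if path (symlinks are followed) currently is a block device."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        # Path missing, or a dangling symlink: device is not there (yet).
        return False


def wait_for(path, timeout=30.0, interval=0.5, check=is_block_device):
    """Poll until check(path) succeeds; return False once timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Path taken from the configuration quoted above.
    dev = "/dev/cc1-vienna/cc-manager-templates-drbd"
    print("backing device present:", wait_for(dev, timeout=5.0))
```

If this reports the device as missing at boot but present later, the
symlink is simply created after DRBD tries to attach.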

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


