[DRBD-user] SOLVED: What factors decide: "split-brain detected"?

Rustedt, Florian Florian.Rustedt at smartnet.de
Tue Mar 10 15:41:16 CET 2009



Thanks for your detailed answer, now I understand it better ;)

I have solved this now: after recompiling all relevant kernels, I never got this failure again, so it can be considered "closed".

Kind regards, Florian 

> -----Original Message-----
> From: drbd-user-bounces at lists.linbit.com 
> [mailto:drbd-user-bounces at lists.linbit.com] On behalf of 
> Lars Ellenberg
> Sent: Friday, 6 March 2009 21:39
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] What factors decide: "split-brain detected"?
> 
> On Wed, Mar 04, 2009 at 09:23:50AM +0100, Rustedt, Florian wrote:
> > Hello list,
> > 
> > What exactly is the reason for DRBD (8.3.0) to detect
> > split-brain (on dual-primary)?
> > 
> > Parallel write-access?
> 
> no.
> that would log "conflicting write detected" or some such.
> 
> > Too short a delay between two write accesses on both sides,
> > although they are sequential?
> 
> no.
> 
> you are looking in the wrong direction.
> 
> 
> split-brain is a situation when nodes can not communicate.
> it can only be detected once they do communicate again.
> 
> simplifying some special cases,
> whenever DRBD is Primary without being able to communicate 
> with its peer, it generates a "uuid" (large "random" number) 
> to tag its "data generation". it keeps some history of former 
> such uuids.
> 
> during DRBD network handshake, the peers compare their set of 
> uuids (current, bitmap, history...).
> if one is a strict ancestor of the other (the "current" uuid 
> of one node is the "bitmap"-uuid of the other), that decides 
> the sync direction, as it is clear which one has the "better", 
> more recent, data.
> 
> if both nodes share some (all) former uuids, but both have a 
> new, different, "current" uuid, well, that is when 
> "split-brain" is detected: now they can determine that they 
> used to have the same dataset, but then lost communication, 
> and both proceeded to modify the data, independently.
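
[Editorial sketch] The comparison described above can be illustrated with
a small toy model in Python. This is only a sketch under simplifying
assumptions, not DRBD's actual handshake code: the names NodeUuids,
new_generation and compare are hypothetical, uuids are plain integers,
0 stands for "not set", and the many special cases the real algorithm
handles are left out.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class NodeUuids:
    """Toy model of one node's data-generation tags (hypothetical names)."""
    current: int
    bitmap: int = 0                      # 0 means "no bitmap uuid set"
    history: List[int] = field(default_factory=list)

    def new_generation(self) -> None:
        """Model a node being Primary while it cannot reach its peer."""
        if self.bitmap:
            self.history.insert(0, self.bitmap)   # keep older tags as history
        self.bitmap = self.current                # remember where we branched off
        self.current = secrets.randbits(63) + 1   # new non-zero "random" tag

    def ancestors(self) -> Set[int]:
        """All former generation tags this node still remembers."""
        return {u for u in [self.bitmap, *self.history] if u}


def compare(a: NodeUuids, b: NodeUuids) -> str:
    """Simplified handshake decision for nodes a and b."""
    if a.current == b.current:
        return "in sync"
    if b.current in a.ancestors():
        return "sync a -> b"     # b's data is a strict ancestor of a's
    if a.current in b.ancestors():
        return "sync b -> a"     # a's data is a strict ancestor of b's
    if a.ancestors() & b.ancestors():
        return "split-brain"     # shared past, diverged "current" uuids
    return "unrelated"


if __name__ == "__main__":
    a = NodeUuids(current=0x1000)
    b = NodeUuids(current=0x1000)
    print(compare(a, b))         # in sync
    a.new_generation()           # a was Primary while disconnected
    print(compare(a, b))         # sync a -> b
    b.new_generation()           # b was also Primary while disconnected
    print(compare(a, b))         # split-brain
```

In this model, split-brain is reported only after both nodes have started
a new generation from the same ancestor, which matches the point made
above: it is detected at reconnect, but caused earlier.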
> 
> there is much more detail about that uuid scheme and 
> algorithm in some of the papers/publications at drbd.org.
> 
> 
> your other posts indicate that you simply try to do xen 
> migration using DRBD as the xen image backing store.
> 
> and you seem to think that the migration causes the 
> split-brain, or the split-brain detection.  that is not so. 
> you are looking at the wrong end of the problem.
> 
> 
> whenever you see "split-brain detected", then you should go 
> back, and find when, where, and why, the "split-brain" was _caused_. 
> because there and then is the problem you should solve.
> 
> when and why does DRBD lose the connection?
> while being primary on both nodes?
> or is it made primary without being connected?
> 
> 
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> __
> please don't Cc me, but send to list   --   I'm subscribed