On Wed, Mar 04, 2009 at 09:23:50AM +0100, Rustedt, Florian wrote:
> Hello list,
>
> What exact is the reason for drbd(8.3.0) to detect split-brain( on dual-primary)?
>
> Parallel write-access?

no. that would log "conflicting write detected" or some such.

> Too short delay between two write-accesses on both sides, although they are sequential?

no. you are looking in the wrong direction.

split-brain is a situation in which the nodes cannot communicate.
it can only be detected once they do communicate again.

simplifying some special cases: whenever DRBD is Primary without being
able to communicate with its peer, it generates a "uuid" (a large "random"
number) to tag its "data generation". it keeps some history of former
such uuids.

during the DRBD network handshake, the peers compare their sets of uuids
(current, bitmap, history...).

if one is a strict ancestor of the other (the "current" uuid of one node
is the "bitmap" uuid of the other), that decides the sync direction, as it
is clear which one has the "better", more recent, data.

if both nodes share some (or all) former uuids, but both have a new,
different, "current" uuid, well, that is when "split-brain" is detected:
now they can determine that they used to have the same dataset, but then
lost communication, and both proceeded to modify the data independently.

there is much more detail about that uuid scheme and algorithm in some of
the papers/publications at drbd.org.

your other posts indicate that you simply try to do xen migration using
DRBD as the xen image backing store, and you seem to think that the
migration causes the split-brain, or the split-brain detection.
that is not so. you are looking at the wrong end of the problem.

whenever you see "split-brain detected", you should go back and find
when, where, and why the "split-brain" was _caused_,
because there and then is the problem you should solve.

when and why does DRBD lose the connection while being primary on both nodes?
or is it made primary without being connected?

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
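
For readers who want to see the uuid comparison from the reply above in concrete form, here is a rough Python sketch. It is not DRBD's actual code or data layout: the `UuidSet` type, the `compare` function, and the result strings are all hypothetical names chosen for illustration, and the logic covers only the simplified cases described in the message (strict ancestor via the "bitmap" uuid, and shared history with diverging "current" uuids).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UuidSet:
    """Hypothetical, simplified view of one node's data-generation uuids."""
    current: int
    bitmap: Optional[int] = None          # generation we diverged from, if any
    history: List[int] = field(default_factory=list)

    def past(self) -> set:
        """All former uuids this node remembers (bitmap plus history)."""
        former = set(self.history)
        if self.bitmap is not None:
            former.add(self.bitmap)
        return former


def compare(local: UuidSet, peer: UuidSet) -> str:
    """Classify the handshake outcome from the local node's point of view."""
    if local.current == peer.current:
        return "in-sync"
    # The peer still sits on the generation we started from: we are newer.
    if peer.current == local.bitmap:
        return "sync-to-peer"
    # We still sit on the generation the peer started from: the peer is newer.
    if local.current == peer.bitmap:
        return "sync-from-peer"
    # Common ancestry, but both sides created a new, different current uuid.
    if local.past() & peer.past():
        return "split-brain"
    return "unrelated"


if __name__ == "__main__":
    A, B, C = 0xA, 0xB, 0xC  # stand-ins for large random numbers

    # Only node1 was Primary while disconnected.
    node1 = UuidSet(current=B, bitmap=A)
    node2 = UuidSet(current=A)
    print(compare(node1, node2))

    # Both nodes were Primary while disconnected.
    node2 = UuidSet(current=C, bitmap=A)
    print(compare(node1, node2))
```

In this sketch the "split-brain" result appears only at comparison time, which matches the point of the message: the handshake merely detects a divergence that was caused earlier, when both nodes were Primary without a connection.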