I am replying to myself.

Re-reading the drbd documentation, I finally found the write quorum explanation. I also discovered why, using dual primary, I always get a split brain after a disconnection.

Now there are two things I do not understand:

- why single primary mode (master/slave) does not need a write quorum;
- how dual primary works.

This is how I think it works (mode C).

Good communication:
- server A receives a request to write a disk block;
- server A writes it to disk;
- A sends it to B;
- B writes it and sends an ack;
- A receives the ack and tells the upper layer that the write succeeded.

Bad communication:
- server A receives a request to write a disk block;
- server A writes it to disk;
- A sends it to B;
- COMMUNICATION FAILURE: B does not receive anything, nor can it reply;
- A times out and returns a write error to the upper layer.

I suppose that A and B are now blocked (they cannot complete writes) and that I (or the cluster manager) can decide to shut down one server.

My question is: I still do not understand what drbd really does in this situation. Is it like the above, or is it different?

The other question is: why does single primary mode (master/slave) not need a write quorum?

Thanks again for help!

2009/2/10 Mario Giammarco <mgiammarco at gmail.com>

> Hello,
> I am trying to build an iSCSI SAN using drbd in a dual primary
> configuration.
>
> I have read the drbd documentation and I have not fully understood how it
> handles split brain.
>
> My hardware is set up like this: two identical servers with RAID 6. Each one
> has 4 ethernet cards, configured as two trunks.
> Each trunk is connected to a hardware switch. The two switches are
> "intelligent", so each has its own IP address.
>
> My idea (correct me if I am wrong) is this: when one primary finds that it
> cannot talk to the other primary, it tries to ping the switches.
>
> If it cannot ping the switches, it means that all its ethernet cards or all
> switches are broken, so it shuts itself down.
> If it can ping the switches, it means that the other primary is broken, so it
> tries to stonith it.
>
> After reconnection it is clear that the primary that could not ping the
> switches must resync from the other one.
>
> Can you tell me whether I can customize the behaviour of drbd to follow
> these rules? (dopd? peer-outdater?)
> Can you tell me whether my rules are enough?
> Can you tell me whether drbd already implements a better strategy, so that
> my rules are stupid?
>
> Thank you in advance for any reply!
>
> Mario
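
To make the rule in the quoted message concrete, here is a minimal sketch of the decision each primary would take when it loses contact with its peer. It is only an illustration of the proposed rule, not drbd configuration or any drbd API. The names `Action` and `decide_on_peer_loss` are hypothetical, and "can ping the switches" is assumed to mean "at least one switch answers".

```python
from enum import Enum


class Action(Enum):
    NONE = "nothing to do"
    SHUT_DOWN_SELF = "shut down self"
    STONITH_PEER = "stonith peer"


def decide_on_peer_loss(peer_reachable, switches_reachable):
    """Apply the proposed rule on one node.

    peer_reachable: True if this node can still talk to the other primary.
    switches_reachable: one boolean per switch, True if the ping succeeded.
    Assumption: a single answering switch counts as "can ping the switches".
    """
    if peer_reachable:
        # The replication link works, so the rule does not apply.
        return Action.NONE
    if not any(switches_reachable):
        # No switch answers: our own cards or all switches are presumed broken.
        return Action.SHUT_DOWN_SELF
    # A switch answers but the peer does not: the peer is presumed broken.
    return Action.STONITH_PEER


if __name__ == "__main__":
    print(decide_on_peer_loss(False, [True, False]))
```

Note that if both nodes can ping a switch but cannot reach each other, this rule returns `STONITH_PEER` on both sides, so under these assumptions the rule alone does not pick a single survivor.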