--- On Mon, 3/9/09, GAUTIER Hervé <herve.gautier at thalesgroup.com> wrote:

Thanks for the reply. :)

> Martin Fick wrote:
> >
> > If node B goes down while node A is still primary,
> > should it not be possible to keep track of the fact
> > that node B is now outdated on node A?  This way,
> > if node A goes down while node B is still down,
> > when node A comes back up it should know that it
> > can safely proceed to primary without waiting for
> > node B to return.
> >
>
> How do you know that, while node A was down, node B
> hasn't been up and down several times???

You don't! :(  But if it has, you already have a split brain situation, and you are not likely to be making things worse (depending on your split brain resolution scenario).  At least with my proposal, the cluster manager can potentially be configured to never bring node B up without some form of manual override, since bringing it up automatically is what would cause the split brain in the first place.

The idea is that by keeping track of the down status of nodes on their peers, you can automate one extra scenario in your cluster, reducing the number of manual intervention steps and therefore, hopefully, also the opportunities for split brain (and downtime).  This leaves only the following scenarios where a node that comes up must wait for its peer to return before continuing normal operation:

1) Nodes A & B go down at exactly the same time.

2) Node B goes down, node A goes down, node B returns (or vice versa).

The only time you would want B to become primary here is if node A is going to be permanently down and you are forced to discard its more recent updates.  Whereas currently, any time a node comes up without its peer, split brain is a risk if it does not establish a connection to its peer before going primary.

Since the objective of drbd is, I assume, HA (not data protection like RAID, since it does not verify reads), it seems strange to make your system have two dependencies on cold starts (when both nodes go down).
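To make the bookkeeping concrete, here is a minimal sketch in Python of the durable "peer outdated" flag being proposed.  The flag-file approach, path handling, and function names are all hypothetical, for illustration only; they are not part of DRBD or of any cluster manager:

```python
import os

def record_peer_outdated(flag_path):
    """Called while this node is primary and its peer drops out:
    durably record that the peer's data is now behind ours."""
    with open(flag_path, "w") as f:
        f.write("outdated\n")
        f.flush()
        os.fsync(f.fileno())  # make the record survive our own crash

def clear_peer_outdated(flag_path):
    """Called once the peer has reconnected and fully resynced."""
    try:
        os.remove(flag_path)
    except FileNotFoundError:
        pass

def may_promote_without_peer(flag_path):
    """Boot-time check: it is only safe to go primary without the
    peer if we know the peer was already outdated when we last ran."""
    return os.path.exists(flag_path)
```

Because the flag is written and fsync'd while the surviving node is still primary, it survives that node's own later crash and lets it promote alone on reboot.  If both nodes go down at the same time, neither has the flag and both wait, matching the scenarios listed above.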
In this sense, a drbd cluster as typically configured today is less reliable (on boots) than a single node without drbd, since you can never safely automate starting the cluster with a single node!  However, if you keep track of your peer's failure, this restriction is potentially removed.  If node B suffers an HD failure and you are replacing its drive, do you want your cluster to require manual boot intervention if node A happens to go down for a minute?  That seems unnecessary to me; node A should be smart enough to reboot and continue on its own.

> > If the cluster was degraded when node A went down,
> > it should be able to continue to operate degraded
> > safely when node A comes back up, right?  Is there
> > anything wrong with this logic?  Are there
> > currently any mechanisms to do this?  Would you
> > consider implementing this in drbd?
> >
>
> I think it is a cluster matter, not a DRBD one.

Well, it certainly can be handled at the cluster level (and I plan on doing so), but why would drbd not want to store this extra important piece of information if possible?  Even if drbd does not use this info itself, why not store the fact that you are positive your peer is outdated (or that you are in split brain)?

-Martin
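The two asymmetric reboot cases discussed in the thread can be walked through with a toy two-node model.  Everything here (the class, its attributes, and method names) is invented for the example and does not reflect DRBD's internals:

```python
class Node:
    """Toy model of one cluster node; 'peer_outdated' stands in for
    the durable flag proposed in the thread (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.primary = False
        self.peer_outdated = False  # assumed to survive reboots

    def peer_lost(self):
        # Only a running primary that outlives its peer may
        # safely conclude that the peer's data is now outdated.
        if self.up and self.primary:
            self.peer_outdated = True

    def boot_alone(self):
        """Come up without the peer: promote only if we durably
        recorded the peer as outdated; otherwise wait for it."""
        self.up = True
        self.primary = self.peer_outdated
        return self.primary
```

For example: B fails while A is primary, A records the flag, A later reboots alone and may resume primary; B, booting alone without the flag, must wait rather than risk split brain.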