> What is the best way to manage a split-brain on a Master<>Master setup on
> DRBD 8.0 ?

Personally, I'd say "Manually" or "With extreme prejudice". Anything
else is likely to cause difficulty somewhere - and that's a generic
thing about split brain, not drbd specific.

> From what I understood, DRBD will see by itself if a node is up

DRBD will see if the node is up, and available via a network
connection. There's no magic "It's up, even though I can't talk to it
over the defined network" option. Support for a secondary network link
would be nice, but it's probably not worth the extra effort.

> will it automatically sync it 1-1 both ways ?

If they are both still in "primary" mode, then this isn't going to
work. There's no way to do a 2 way resync without understanding of the
higher level data - and even then you're likely to get conflicts. Take
the simple case of a single ext3 partition that got mounted on both
nodes as a result of a split brain - you could use something like rsync
to resync the filesystems, but that wouldn't necessarily be the right
thing to do anyway:

1) Edit file A on node 1 (A->A1), edit file B on node 2 (B->B2).
2) Now wait a while, and edit B on node 1 (B->B1), and A on node 2
   (A->A2).
3) Remerge, and rsync.
4) You are left with A2 & B1, which means that you've not got the
   correct data from either mirror, and probably nothing that makes any
   sense (think about them as config files, you've got half the config
   from each node).

> Some people claim that when node-2 comes back online it needs to resync all
> the data from node-01. Is this true, or is it smart enough to only sync
> the new files ?

Files are at a different level to the drbd device. Without teaching the
sync utility about all the available file systems (ext2, ext3,
reiserfs, jfs, xfs, gfs, ocfs, the list goes on) this couldn't happen.
And even then (see example above) it is probably not what you want.
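
The A/B example above can be played through in a few lines of Python.
This is only a sketch under stated assumptions: each node's filesystem
is modelled as a dict of filename -> (mtime, content), and the merge is
assumed to keep whichever copy of a file was modified most recently
(roughly what running rsync with --update in both directions would do).
The function name and the data are made up for illustration; nothing
here talks to drbd or rsync.

```python
# Hypothetical model: filename -> (modification time, content).

def newest_wins(node1, node2):
    """Merge two file trees, keeping the newest copy of each file."""
    merged = {}
    for name in sorted(set(node1) | set(node2)):
        candidates = [fs[name] for fs in (node1, node2) if name in fs]
        # Pick the copy with the largest modification time.
        merged[name] = max(candidates, key=lambda entry: entry[0])
    return merged

# Step 1 (t=1): A edited on node 1, B edited on node 2.
# Step 2 (t=2): B edited on node 1, A edited on node 2.
node1 = {"A": (1, "A1"), "B": (2, "B1")}
node2 = {"A": (2, "A2"), "B": (1, "B2")}

merged = newest_wins(node1, node2)
result = {name: content for name, (mtime, content) in merged.items()}
print(result)  # {'A': 'A2', 'B': 'B1'}
```

The merged tree matches neither node: it has node 2's A and node 1's B,
which is exactly the "half the config from each node" outcome.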
You could possibly get away with re-syncing any blocks that have
changed on either node to the secondary you pick (e.g. add together the
change list for both nodes, and push that) but I'm not sure that I'd
want to do it that way, even if in theory it would work... I'm too
scared that my data would be corrupted (although if that is what this
does, I'm happy to trust these guys - it's my code I don't trust).

> I think DRBD 8.0 has almost everything in this case you need, the only
> thing is a split-brain that you have to manage well.

With split brain and drbd, one of the two nodes is about to be told
that it's wrong. That its data is wrong, and that all that it thinks it
knows about the drbd device is wrong, and this tends to get ugly. The
only safe way (without lots of hooks into lots of applications) to do
this is to kill anything that's talking to the device, refresh the
device from the copy you want to use (or just make it available during
the resync), and then let things access it again. You can't pause and
re-allow, since that way the app could easily have cached data - it has
to be a kill.

Personally my belief is that the best option is to reboot the secondary
node - that way you guarantee that everything is reset to a known good
state - but I certainly accept that this is a little heavy for some
uses. I think it's configurable within the drbd config file - mine is
set to disconnect instead, and wait for me to deal with it - as I said,
I think manual intervention is the way to go at this point.

To put it another way, the only way to really deal with split brain is
to not let it develop in the first place - and this is something that's
been causing grief for clusters for a long time. In a 2 node
environment you basically have three options (that I can think of) for
how to avoid it:

1) Have a 3rd "thing" that you just use as a votekeeper.
   Advantage: This means that you've got 3 votes available, and
   therefore you can never have a 50/50 split.
   Disadvantage: You have extra complexity, and dependence on a 3rd
   device.

2) Weight one of the nodes as "more important".
   Advantage: Very simple to do, very easy to configure, "just works".
   Disadvantage: The other node cannot operate if the more important
   one is not available, without manual intervention.

3) STONITH (Shoot The Other Node In The Head).
   Advantage: It means that one node will be down if you ever end up in
   a split brain situation.
   Disadvantage: It kills one of the machines (fsck, etc) - and
   normally needs human intervention to bring it back. You can also, if
   you are unlucky, end up with both nodes dead (I have had this happen
   with Sun Cluster 2.1). Which is great for data consistency, but is a
   bit silly.

With any even number of "votes", you've got the possibility of a 50/50
split. With 4 or more you've got other options that would work
reasonably well - two nodes is often treated as a "special case".

Graham
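
The vote counting behind options 1 and 2 above can be sketched in
Python. This is an illustration only, assuming the usual "strict
majority" rule: a partition may keep running only if it holds more than
half of all configured votes. The function names and vote weights are
hypothetical and are not taken from drbd or any cluster manager.

```python
def has_quorum(votes_seen, total_votes):
    """True if this partition holds a strict majority of all votes."""
    return 2 * votes_seen > total_votes

def surviving_partitions(partition_votes):
    """Given the votes held by each partition after a split,
    return the vote counts of the partitions allowed to keep running."""
    total = sum(partition_votes)
    return [v for v in partition_votes if has_quorum(v, total)]

# Plain 2 node cluster, one vote each: a 50/50 split, nobody has quorum.
print(surviving_partitions([1, 1]))  # []

# Option 1: a 3rd votekeeper sides with one node (2 votes vs 1).
print(surviving_partitions([2, 1]))  # [2]

# Option 2: one node weighted as "more important" (2 votes vs 1).
# The heavier node survives alone; the lighter one never can.
print(has_quorum(2, 3), has_quorum(1, 3))  # True False

# Any even total can still split 50/50, e.g. 4 nodes split 2/2.
print(surviving_partitions([2, 2]))  # []
```

At most one partition can ever hold a strict majority, which is why an
odd number of votes rules out two sides both carrying on as primary.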