Hello,

I'm setting up a two-node active/active cluster with DRBD and OCFS2. When the nodes lose communication with one another, a split-brain happens, and both machines end up in the stand-alone/active state.

One possible way to minimize the time during which both nodes accept writes without synchronization would be to remount the filesystems read-only from the split-brain hook. This, however, won't prevent inconsistencies caused by writes that happen before the hook has a chance to run.

Is there a better way to avoid inconsistencies in a situation like this?

I was thinking that a new protocol that tries to sync each write with the peer before doing the local write could solve this. It would refuse the local write (and possibly set the device read-only -- can DRBD do that?) if the remote write isn't successful. Such a protocol would surely carry a performance penalty, but it could be worth it if that level of safety can be achieved.

Does this idea make sense? Is there a different solution to this problem that doesn't require something as radical as writing a new protocol?

Thanks in advance,
Andre
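
P.S. To make the proposed write ordering concrete, here is a minimal Python sketch of what I have in mind. It is only a toy model, not DRBD code: the names Peer, RemoteFirstDevice and PeerUnreachable are made up for illustration, and the "network" is just a flag on the peer object.

```python
class PeerUnreachable(Exception):
    """Raised when the remote node cannot confirm a write."""


class Peer:
    """Stand-in for the remote node; `reachable` simulates the network link."""

    def __init__(self):
        self.blocks = {}
        self.reachable = True

    def write(self, sector, data):
        if not self.reachable:
            raise PeerUnreachable("no acknowledgement from peer")
        self.blocks[sector] = data


class RemoteFirstDevice:
    """Hypothetical device that writes to the peer before writing locally."""

    def __init__(self, peer):
        self.peer = peer
        self.blocks = {}
        self.read_only = False

    def write(self, sector, data):
        if self.read_only:
            raise PermissionError("device is read-only")
        try:
            # Step 1: the remote write must succeed first.
            self.peer.write(sector, data)
        except PeerUnreachable:
            # Step 2: on failure, refuse the local write and go read-only.
            self.read_only = True
            raise PermissionError("peer write failed; device set read-only")
        # Step 3: only now is the local copy updated.
        self.blocks[sector] = data
```

In this model a node that loses its peer never accepts a write the peer has not seen. A real implementation would still have to deal with the case where the remote write lands but the local one then fails, which the sketch ignores.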