I have a shared/parallel filesystem on top of DRBD in dual-primary mode with Protocol C (using 8.3.11 right now). My question is about recovering after a network outage, where I have a 'resource-and-stonith' fence handler which panics both systems as soon as possible.

Even with Protocol C, can the bitmaps still have dirty bits set? (i.e., different writes on each local device which haven't been acknowledged to the shared filesystem because they haven't yet been written remotely?)

Maybe a more concrete example will make my question clearer:

- Nodes A and B (2-node cluster) are operating nominally in primary/primary mode (the shared filesystem provides locking and prevents simultaneous write access to the same blocks on the shared disk).
- Node A: write to the DRBD device, block 234567, is written locally, but the remote copy does not complete due to network failure.
- Node B: write to the DRBD device, block 876543, is written locally, but the remote copy does not complete due to network failure.
- Neither write completes, and neither returns successfully to the filesystem (Protocol C).
- The fencing handler is invoked, where I can suspend-io and/or panic both nodes (since neither one is reliable at this point).

If there is a chance of having unreplicated/unacknowledged writes on two different disks (those writes can't conflict, because the shared filesystem won't write to the same blocks on both nodes simultaneously), is there a resync option that will effectively 'revert' any unreplicated/unacknowledged writes?

I am considering writing a test for this and would like to know a bit more about what to expect before I do so.

Thanks,
Brian
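
For reference, the kind of setup described above would look roughly like the following resource definition. This is only a sketch assuming DRBD 8.3 configuration syntax; the resource name r0 and the handler script path are hypothetical placeholders, and the per-host sections are omitted.

```
resource r0 {                       # "r0" is a hypothetical resource name
  protocol C;                       # a write completes only after both nodes have it

  net {
    allow-two-primaries;            # dual-primary operation for the shared filesystem
  }

  disk {
    fencing resource-and-stonith;   # freeze I/O and call fence-peer on connection loss
  }

  handlers {
    # Hypothetical script path; this is where the suspend-io and/or panic
    # logic mentioned above would live.
    fence-peer "/usr/local/sbin/my-fence-handler.sh";
  }

  # "on <hostname> { ... }" sections for node A and node B omitted
}
```

With 'resource-and-stonith', I/O on the device stays frozen while the fence-peer handler runs, which is the window the question above is concerned with.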