I run a shared/parallel filesystem on top of DRBD in dual-primary mode with protocol C (currently version 8.3.11).<div><br></div><div>My question is about recovering after a network outage, where I have a 'resource-and-stonith' fence handler that panics both systems as soon as possible.</div>
<div><br></div><div>Even with protocol C, can the bitmaps still have dirty bits set? (i.e., different writes on each local device that have not been acknowledged to the shared filesystem because they have not yet been written remotely?)</div>
<div><br></div><div>Maybe a more concrete example will make my question clearer:</div><div>- Nodes A and B (a 2-node cluster) are operating nominally in primary/primary mode (the shared filesystem provides locking and prevents simultaneous write access to the same blocks on the shared disk).</div>
<div>- node A: write to drbd device, block 234567, written locally, but remote copy does not complete due to network failure</div><div>- node B: write to drbd device, block 876543, written locally, but remote copy does not complete due to network failure</div>
<div>- Neither write completes or returns successfully to the filesystem (protocol C).</div><div>- The fencing handler is invoked, where I can suspend-io and/or panic both nodes (since neither one is reliable at this point).</div>
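<div><br></div><div>To pin down the scenario I have in mind, here is a toy Python model of it. This is only a sketch of my own assumptions, not DRBD code: the Node class, the write() helper and the "dirty" set are hypothetical names, and whether DRBD really marks the block out of sync at this point is exactly what I am asking.</div>

```python
# Toy model only: hypothetical names, not DRBD's real data structures.
class Node:
    def __init__(self, name):
        self.name = name
        self.disk = {}      # block number -> data on the local backing device
        self.dirty = set()  # blocks written locally but never confirmed by the peer


def write(local, peer, block, data, link_up):
    """Protocol C style write: it completes only once both copies are on disk."""
    local.disk[block] = data
    if not link_up:
        # Assumption: the local copy exists and the block is marked out of sync.
        local.dirty.add(block)
        return False        # the filesystem never sees a completion
    peer.disk[block] = data
    return True


a, b = Node("A"), Node("B")

# One fully replicated write while the link is still up.
write(a, b, 1, "replicated", link_up=True)

# The two writes from my example, issued as the network fails.
ok_a = write(a, b, 234567, "from A", link_up=False)
ok_b = write(b, a, 876543, "from B", link_up=False)

# Blocks whose contents differ between the two backing devices.
all_blocks = set(a.disk) | set(b.disk)
diverged = sorted(k for k in all_blocks if a.disk.get(k) != b.disk.get(k))

print(ok_a, ok_b)
print(sorted(a.dirty), sorted(b.dirty))
print(diverged)
```

<div>In this model each node ends up with one block the other has never seen, and neither write was ever reported as complete to the filesystem.</div>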
<div><br></div><div>If there is a chance of having unreplicated/unacknowledged writes on two different disks (those writes can't conflict, because the shared filesystem won't write to the same blocks on both nodes simultaneously), is there a resync option that will effectively 'revert' any unreplicated/unacknowledged writes?</div>
<div><br></div><div>I am considering writing a test for this and would like to know a bit more about what to expect before I do so.</div><div><br></div><div>Thanks,</div><div>Brian</div>