[DRBD-user] Crashing (or power failure) using a disk controller with no battery on its cache

tracymtaylor tracymtaylor at comcast.net
Mon Sep 14 18:17:09 CEST 2009


Our 3ware controller is configured to respect FUA and barriers, but has no
battery.  We use protocol C. If we crash during a mirrored write, we don’t
know if the primary and secondary bits have both hit the platter, and we
depend on DRBD's resync mechanism to get them in sync when we come up
again.

After a crash, our DBMS guys are worried about the window between the time we
start to use the primary and the time the resync finishes.  Assuming the
same node is primary before and after the crash, is the primary always used
as the source of the resync?  Is there ever a time when the secondary is
used as a source? We don’t want to have a race where we come up, read the
primary before resync has completed and conclude that a TXN committed, and
then, as resync progresses, have that primary block be overwritten by an older
block from the secondary saying the TXN is not committed.
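
To make the worry concrete, here is a toy Python sketch of the two cases.
It is only a model of our concern under made-up names (the resync() helper,
the dicts standing in for disks, and block number 7 are all hypothetical);
it says nothing about how DRBD actually chooses the sync source.

```python
# Toy model only: two dicts stand in for the primary and secondary disks.
# The resync() helper and the block number are hypothetical, not DRBD internals.

def resync(source, target, out_of_sync_blocks):
    """Copy every block marked out of sync from source to target."""
    for block in out_of_sync_blocks:
        target[block] = source[block]

# Assumed state after the crash: the commit record reached the primary's
# platter but not the secondary's.
primary = {7: "TXN committed"}
secondary = {7: "TXN not committed"}

# The application reads the primary before resync has finished.
seen_by_app = primary[7]

# Case A: the primary is the sync source, so what the app saw stays true.
p, s = dict(primary), dict(secondary)
resync(source=p, target=s, out_of_sync_blocks=[7])
stable = (p[7] == seen_by_app and s[7] == seen_by_app)

# Case B (the one we fear): the secondary is the sync source, so the
# primary's block is replaced by the older one after the app has read it.
p, s = dict(primary), dict(secondary)
resync(source=s, target=p, out_of_sync_blocks=[7])
regressed = (p[7] != seen_by_app)
```

In case A both copies end up agreeing with what the application already
read; in case B the primary's block goes back to the older contents, which
is exactly the outcome we want to rule out.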

I guess if we crash again during resync and this time the primary is unable
to come up again, we may still have the situation where we told a customer we
committed this TXN, but now the only copy of the data is on the secondary, and
it's got a different opinion about whether this TXN committed.




