On Tue, Jul 07, 2009 at 09:08:43AM -0700, Mike Sweetser - Adhost wrote:
> Hello:
>
> We have two DRBD machines running RHEL 5.3 with DRBD 8.3.0. Recently,
> we had an outage that took the primary server in the cluster down,
> leaving it to failover using DRBD and Heartbeat. This was done with
> no issues.
> Assuming all this was done right, we ran into other issues - some
> people have complained that their files have "reverted" to a previous
> state.

How long ago is "previous"?
Several days? A few hours?

If only a few seconds, people forgot to fsync, and the block device
has never seen that particular write. Or there are volatile caches
involved, and you pretended to DRBD that they were non-volatile.

> We don't show any errors occurring in the synchronization of
> the files, and never saw any "oos" in the DRBD status.

There is only one way to "jump back in time" with DRBD:
you switch over to a node that has been disconnected for some time,
and you did not notice. Then you go online with that stale data.

So I assume, IFF you really jumped back in time, that it happened
during your failover, because for some reason the nodes had not been
replicating.

You need to fix your setup, monitor the system, and probably add
resource-level fencing (outdate a disconnected secondary if a primary
is still running).

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
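
The "monitor the system" advice above could be approached roughly as
follows. This is only a minimal sketch, not something from the original
post: it assumes the DRBD 8.x /proc/drbd layout, where each configured
minor has a status line carrying cs: (connection state) and ds: (disk
states), and the names find_unhealthy and main are hypothetical. It
reports every device that is not both Connected and UpToDate/UpToDate,
which is the condition under which a later failover could bring stale
data online.

```python
import re

# Assumed status-line layout from /proc/drbd on DRBD 8.x, e.g.:
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
# Older releases label the role field "st:" instead of "ro:".
STATUS_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+(?:ro|st):(\S+)\s+ds:(\S+)")


def find_unhealthy(proc_drbd_text):
    """Return (minor, cs, ds) for each device that is not Connected
    with both disks UpToDate. Lines that do not look like a status
    line (version header, counters, unconfigured minors) are skipped."""
    problems = []
    for line in proc_drbd_text.splitlines():
        match = STATUS_RE.match(line)
        if not match:
            continue
        minor, cs, _roles, ds = match.groups()
        if cs != "Connected" or ds != "UpToDate/UpToDate":
            problems.append((int(minor), cs, ds))
    return problems


def main(path="/proc/drbd"):
    """Print one line per unhealthy device; return 2 if any, else 0."""
    with open(path) as handle:
        problems = find_unhealthy(handle.read())
    for minor, cs, ds in problems:
        print("drbd%d: cs=%s ds=%s" % (minor, cs, ds))
    return 2 if problems else 0
```

Run from cron or wrapped as a monitoring plugin, the return value of
main() could serve as the exit status. Note that this check also flags
a device during a resync (for example cs:SyncSource), which is probably
what you want to know about before trusting a failover.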