Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Thu, May 21, 2009 at 09:33:43AM -0600, David.Livingstone at cn.ca wrote:
> It seems like the "out-of-sync" log messages are output because the data
> in those sectors was changing during the verification.
>
> Here's how I came up with that...
>
> For a /var/log/messages statement like this (notice the time stamp):
> May 20 14:30:19 wimpas2 kernel: drbd0: Out of sync: start=137739248,
> size=8 (sectors)
>
> We can run a "dd" command to peek at what data it's talking about on
> both servers:
> (on wimpas1): sudo dd if=/dev/mapper/VolGroup01-LogVol00 iflag=direct
> bs=512 skip=137739248 count=8 of=/tmp/wimpas1-drbd-oos
>
> (on wimpas2): sudo dd if=/dev/mapper/VolGroup01-LogVol00 iflag=direct
> bs=512 skip=137739248 count=8 of=/tmp/wimpas2-drbd-oos
>
> Comparing the two output files using "diff" showed they were the same,
> so that indicates replication worked properly.
>
> Looking inside the files showed they were polling logs with timestamps
> from the same time that the /var/log/messages statement was output:
>
> eg) (snipped for brevity, notice the time stamps 20th day, 14:30:16 -
> 14:30:22)
> time:20143016 REC fd:21 ff1216060100ef57000000000000f78f
> time:20143016 TRA fd:21 12ff14000100e0a6 size:8 dur:0 OK ...
> time:20143022 REC fd:21 ff0f160601002763000000000000f78f
> time:20143022 TRA fd:21 0fff1400010097c9 size:8 dur:0 OK
>
>
> So, the theory right now is that the "out-of-sync" messages were because
> the data in those sectors was changing during the verification and the
> "0 KB (0 bits) marked out-of-sync" means DRBD realized that.

please also see:
http://thread.gmane.org/gmane.linux.kernel.drbd.devel/790
http://thread.gmane.org/gmane.linux.network.drbd/14850

I'd suggest that "something" modified in-flight buffers,
then re-submitted them.

the drbd online-verify (as well as the syncer) is supposed to "lock"
the regions it currently compares against application IO,
so it should do the compare when no application IO is in-flight
(on that region).
but it may hit such a "transient" not-in-sync thingy.

iirc, a few "modify in-flight buffer" things have been tackled
in the upstream kernel during the "bio integrity" work in recent kernels.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
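
For anyone wanting to repeat the check from the quoted message on more than one "Out of sync" line, here is a rough sketch of how the dd commands could be generated from the log. It only prints commands and reads nothing from the device. The function name oos_to_dd, the DEV and HOST variables, and the output file naming are made up for illustration; the device path is simply the one from the quoted example and will differ on other setups.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): turn "Out of sync" kernel log
# lines into dd commands like the ones in the quoted message.
# It only prints the commands; nothing is read from the device here.

# Assumption: backing device path as in the quoted example.
DEV="${DEV:-/dev/mapper/VolGroup01-LogVol00}"
# Assumption: node name used to label the output file.
HOST="${HOST:-$(uname -n)}"

# Read log lines on stdin; for each line containing
#   "Out of sync: start=N, size=M"
# print one dd command that dumps those M sectors (512 bytes each,
# matching bs=512 in the quoted commands) starting at sector N.
oos_to_dd() {
    sed -n 's/.*Out of sync: start=\([0-9][0-9]*\), *size=\([0-9][0-9]*\).*/\1 \2/p' |
    while read -r start size; do
        printf 'dd if=%s iflag=direct bs=512 skip=%s count=%s of=/tmp/%s-drbd-oos-%s\n' \
            "$DEV" "$start" "$size" "$HOST" "$start"
    done
}
```

Usage would be something like `grep 'Out of sync' /var/log/messages | oos_to_dd` on each node, then running the printed commands with sudo and comparing the resulting files between the nodes (cmp or a checksum is probably more suitable than diff for raw sector dumps). As discussed above, identical files only tell you the sectors match now; they may still have differed at the moment the verify ran.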