<br><font size=2><tt>We confirmed that, for speed reasons, the circular log
buffers do </tt></font>
<br><font size=2><tt>not call fsync(). Losing some of this log data in a crash is not a concern.</tt></font>
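<br>
<br><font size=2><tt>For anyone following along, here is a minimal Python sketch of the pattern. It is not our daemons' actual code: the CircularLog class and its durable flag are hypothetical names made up for illustration. It overwrites a fixed-size file in place and only calls fsync() when asked to.</tt></font>
<pre>
```python
import os


class CircularLog:
    """Fixed-size log file that wraps around and overwrites its oldest bytes.

    Hypothetical illustration only; not taken from any real daemon.
    """

    def __init__(self, path, capacity, durable=False):
        self.capacity = capacity
        self.durable = durable  # True: fsync() after every record
        self.offset = 0         # next write position inside the file
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.ftruncate(self.fd, capacity)

    def write(self, record):
        # Keep only the tail if a record is larger than the whole buffer.
        data = record[-self.capacity:]
        # Part that fits before the end of the file.
        first = min(len(data), self.capacity - self.offset)
        os.pwrite(self.fd, data[:first], self.offset)
        if len(data) != first:
            # Wrap around and overwrite the oldest bytes at the start.
            os.pwrite(self.fd, data[first:], 0)
        self.offset = (self.offset + len(data)) % self.capacity
        if self.durable:
            # Force the data to stable storage; slower, but crash safe.
            os.fsync(self.fd)

    def close(self):
        os.close(self.fd)
```
</pre>
<br><font size=2><tt>With durable=False the writes stay in the page cache until the kernel writes them back, so the most recent records can be lost in a crash. durable=True trades throughput for that guarantee.</tt></font>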
<br>
<br><font size=2><tt>We have also confirmed, on subsequent verify runs, that
all blocks </tt></font>
<br><font size=2><tt>marked out-of-sync belong to these log files.</tt></font>
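<br>
<br><font size=2><tt>The check quoted below (take start= and size= from the kernel message, then read those sectors with dd on both nodes) can be sketched in Python as follows. The function names are hypothetical, and the sketch only builds the dd command string; it does not run anything. It assumes 512-byte sectors, as in the original dd commands.</tt></font>
<pre>
```python
import re
import shlex

# Matches kernel messages such as:
#   drbd0: Out of sync: start=137739248, size=8 (sectors)
OOS_RE = re.compile(
    r"(drbd\d+): Out of sync: start=(\d+),\s*size=(\d+) \(sectors\)"
)


def parse_out_of_sync(text):
    """Return (device, start_sector, size_in_sectors) for each message."""
    # Collapse whitespace so messages wrapped by a mail client still match.
    joined = " ".join(text.split())
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in OOS_RE.finditer(joined)
    ]


def dd_command(backing_device, start, size, out_path):
    """Build the dd command that dumps the reported sectors to a file."""
    args = [
        "dd",
        f"if={backing_device}",
        "iflag=direct",   # bypass the page cache, read from disk
        "bs=512",         # one block per sector (assumed sector size)
        f"skip={start}",
        f"count={size}",
        f"of={out_path}",
    ]
    return " ".join(shlex.quote(a) for a in args)
```
</pre>
<br><font size=2><tt>For example, feeding the May 20 14:30:19 message through parse_out_of_sync() and passing the result to dd_command() with /dev/mapper/VolGroup01-LogVol00 and /tmp/wimpas1-drbd-oos gives the same dd arguments as in the original post. The two dumps still have to be copied to one host and compared there.</tt></font>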
<br>
<br><font size=2><tt>Thanks<br>
</tt></font>
<br><font size=2><tt>> > On Thu, May 21, 2009 at 09:33:43AM -0600,
David.Livingstone@cn.ca wrote:<br>
> > > It seems like the "out-of-sync" log messages are output because the data<br>
> > > in those sectors was changing during the verification.<br>
> > ><br>
> > > Here's how I came up with that...<br>
> > ><br>
> > > For a /var/log/messages statement like this (notice the
time stamp):<br>
> > > May 20 14:30:19 wimpas2 kernel: drbd0: Out of sync: start=137739248,<br>
> > > size=8 (sectors)<br>
> > ><br>
> > > We can run a "dd" command to peek at what data
it's talking about on<br>
> > > both servers:<br>
> > > (on wimpas1): sudo dd if=/dev/mapper/VolGroup01-LogVol00
iflag=direct<br>
> > > bs=512 skip=137739248 count=8 of=/tmp/wimpas1-drbd-oos<br>
> > ><br>
> > > (on wimpas2): sudo dd if=/dev/mapper/VolGroup01-LogVol00
iflag=direct<br>
> > > bs=512 skip=137739248 count=8 of=/tmp/wimpas2-drbd-oos<br>
> > ><br>
> > > Comparing the two output files using "diff" showed
they were the same,<br>
> > > so that indicates replication worked properly.<br>
> > ><br>
> > > Looking inside the files showed they were polling logs with
timestamps<br>
> > > from the same time that the /var/log/messages statement
was output:<br>
> > ><br>
> > > eg) (snipped for brevity, notice the time stamps 20th day,
14:30:16 -<br>
> > > 14:30:22)<br>
> > > time:20143016 REC fd:21 ff1216060100ef57000000000000f78f<br>
> > > time:20143016 TRA fd:21 12ff14000100e0a6 size:8 dur:0 OK<br>
> > ...<br>
> > > time:20143022 REC fd:21 ff0f160601002763000000000000f78f<br>
> > > time:20143022 TRA fd:21 0fff1400010097c9 size:8 dur:0 OK<br>
> > ><br>
> > ><br>
> > > So, the theory right now is that the "out-of-sync" messages appeared because<br>
> > > the data in those sectors was changing during the verification, and the<br>
> > > "0 KB (0 bits) marked out-of-sync" means DRBD realized that.<br>
</tt></font>
<br><font size=2><tt>> > please also see:<br>
</tt></font>
<br><font size=2><tt>> > http://thread.gmane.org/gmane.linux.kernel.drbd.devel/790<br>
> > http://thread.gmane.org/gmane.linux.network.drbd/14850<br>
</tt></font>
<br><font size=2><tt>> Lars,<br>
</tt></font>
<br><font size=2><tt>> Thanks for the reply.<br>
</tt></font>
<br><font size=2><tt>> I've reviewed the links above (head is now spinning :).
With respect<br>
> to "crash safe" applications the out-of-sync disk portions
that<br>
> we looked at were poller and alarm daemon log files. They use
circular</tt></font>
<br><font size=2><tt>> logs, so they<br>
> would be overwriting a file they've created. We're currently
checking<br>
> whether or not they use fsync().</tt></font>
<br><font size=2><tt>> <br>
> As shown in the initial post we are using ext3.<br>
</tt></font>
<br><font size=2><tt>> Anything else we could be checking?<br>
</tt></font>
<br><font size=2><tt>> Thanks<br>
</tt></font>
<br><font size=2><tt>> ><br>
> > I'd suggest that "something" modified in-flight buffers,<br>
> > then re-submitted them.<br>
</tt></font>
<br><font size=2><tt>> > the drbd online-verify (as well as the syncer)
is supposed to "lock" the<br>
> > regions it currently compares against application IO, so it should
do<br>
> > the compare when no application IO is in-flight (on that region).<br>
> > but it may hit such a "transient" not-in-sync thingy.<br>
</tt></font>
<br><font size=2><tt>> > iirc, a few "modify in-flight buffer"
things have been tackled in the<br>
> > upstream kernel during the "bio integrity" work in
recent kernels.<br>
</tt></font>
<br><font size=2><tt>> > --<br>
> > : Lars Ellenberg<br>
> > : LINBIT | Your Way to High Availability<br>
> > : DRBD/HA support and consulting http://www.linbit.com<br>
</tt></font>
<br><font size=2><tt>> > DRBD&reg; and LINBIT&reg; are registered trademarks
of LINBIT, Austria.<br>
> > __<br>
> > please don't Cc me, but send to list -- I'm subscribed<br>
</tt></font>