[DRBD-user] Tracking down sources of corruption examined by drbdadm verify

Tue Apr 22 16:20:21 CEST 2008

On Fri, Apr 18, 2008 at 11:29 AM, Lars Ellenberg
<lars.ellenberg at linbit.com> wrote:
> On Thu, Apr 17, 2008 at 07:19:06PM +0200, Szeróvay Gergely wrote:
>  > >  > Any idea would help.
>  > >
>  > >  what file systems?
>  > >  what kernel version?
>  > >  what drbd protocol?
>  > >
>  > >  it is possible (I got this suspicion earlier, but could not prove it
>  > >  during local testing) that something submits a buffer to the block
>  > >  device stack, but then modifies this buffer while it is still in flight.
>  > >
>  > >  these snippets you show look suspiciously like block maps.  if the block
>  > >  offset also confirms that this is within some filesystem block map, then
>  > >  this is my working theory of what happens:
>  > >
>  > >  ext3 submits block to drbd
>  > >   drbd writes to local storage
>  > >   ext3 modifies the page, even though the bio is not yet completed
>  > >   drbd sends the (now modified) page over network
>  > >   drbd is notified of local completion
>  > >   drbd receives acknowledgement of remote completion
>  > >  original request completed.
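
The sequence above can be replayed as a small userspace sketch. It is
only an illustration of the suspected race, not DRBD or ext3 code: the
function name, the 4096-byte page size, and the pointer values are all
made up for the example.

```python
import struct

BLOCK_SIZE = 4096  # assumed page size, for illustration only


def simulate_in_flight_modification():
    """Replay the suspected race with plain byte buffers (hypothetical)."""
    # A page holding 64-bit "block pointers", like a filesystem block map.
    page = bytearray(BLOCK_SIZE)
    struct.pack_into("<Q", page, 0, 0x1111)

    # 1. The page is submitted; local storage captures its content now.
    local_disk = bytes(page)

    # 2. The submitter changes one pointer while the request is in flight.
    struct.pack_into("<Q", page, 0, 0x2222)

    # 3. The network send happens later and captures the modified page.
    remote_disk = bytes(page)

    # Offsets of the 8-byte words that differ between the two copies.
    diffs = [
        off
        for off in range(0, BLOCK_SIZE, 8)
        if local_disk[off:off + 8] != remote_disk[off:off + 8]
    ]
    return local_disk, remote_disk, diffs
```

In this sketch the two copies differ in exactly one 64-bit word, which
matches the kind of difference described in the hexdumps.
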
>  > >
>  > >  i ran into these things while testing the "data integrity" thing,
>  > >  i.e. "data-integrity-alg md5sum", where every now and then
>  > >  an ext3 on top of drbd would produce "wrong checksums",
>  > >  and the hexdump of the corresponding data payload always
>  > >  looked like a block map, and was different in just one 64bit "pointer".
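
A minimal sketch of why such a modification could show up as a "wrong
checksum". It assumes the digest is computed before the payload is
copied to the wire; that ordering, and the helper names
send_with_digest and receiver_accepts, are assumptions made for the
example and are not taken from the DRBD source.

```python
import hashlib


def send_with_digest(page, modify_between=None):
    """Hypothetical sender: digest first, payload copied afterwards."""
    digest = hashlib.md5(page).digest()
    if modify_between is not None:
        # The submitter touches the page while it is still in flight.
        modify_between(page)
    payload = bytes(page)
    return digest, payload


def receiver_accepts(digest, payload):
    """Hypothetical receiver: recompute the digest over what arrived."""
    return hashlib.md5(payload).digest() == digest
```

With an untouched page the receiver accepts the payload. If the
callback flips a single byte between the two steps, the digests no
longer match even though the network delivered every byte correctly.
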
>
>
>  > DRBD 8.2.5 with protocol "C"
>  >
>  > Kernel versions (kernels from kernel.org with Vserver patch):
>  > node "immortal": 2.6.21.6-vs2.2.0.3 32bit smp
>  > node "endless": 2.6.22.18-vs2.2.0.6 32bit smp (with new e1000 driver)
>  > node "infinity": 2.6.22.18-vs2.2.0.6 32bit smp (with new e1000 driver)
>  >
>  > I usually use ReiserFS with group quotas enabled. The DRBD device sits
>  > on top of LVM2 (and on software RAID1 in some cases).
>  >
>  > My system is often under heavy load, but I cannot find a connection
>  > between the out-of-sync (oos) blocks and the load. My most problematic
>  > volume contains a MySQL 5 database. I tried to stress it by moving big
>  > files to the volume, but oos blocks were not generated more frequently.
>  >
>  > I tried the crc32 data-integrity-alg on one of the most problematic
>  > volumes. It detected a few errors per day, but I think it is not a
>  > network error, because the network passes the tests cleanly and the
>  > full resyncs produced no corruption.
>
>  right. so my working hypothesis is that somehow reiserfs
>  modifies its buffers even while they are in flight.
>
>  because submission to local disk and tcp send over network happen at
>  different times, local disk and remote system see different data.
>
>  to verify that, we could
>   * memcpy the data which is submitted
>   * memcmp it just before it is completed
>  I could provide a patch to do so sometime next week.
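
The memcpy/memcmp idea can be sketched in userspace as follows. This is
not the proposed patch: the class TrackedRequest and its methods are
hypothetical names, and Python byte buffers stand in for the pages of a
real request.

```python
class TrackedRequest:
    """Hypothetical wrapper: snapshot on submit, compare at completion."""

    def __init__(self, buf):
        self.buf = buf              # live buffer, still owned by the submitter
        self.snapshot = bytes(buf)  # the "memcpy" taken at submission time

    def changed_in_flight(self):
        # The "memcmp" done just before the request is completed.
        return self.snapshot != bytes(self.buf)

    def first_difference(self):
        # Offset of the first byte that changed, or None if nothing did.
        live = bytes(self.buf)
        for offset, (old, new) in enumerate(zip(self.snapshot, live)):
            if old != new:
                return offset
        return None
```

A request whose buffer is left alone reports no change. If the
submitter writes to the buffer after the snapshot was taken,
changed_in_flight() returns True and first_difference() gives the
offset to look at in a hexdump.
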
>
>  if it is an option, change the filesystem of your "most problematic"
>  volumes to xfs, and see how it behaves then.
>
>  --
>
>
>  : Lars Ellenberg                           http://www.linbit.com :
>  : DRBD/HA support and consulting             sales at linbit.com :
>  : LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
>  : Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
>  __
>  please don't Cc me, but send to list -- I'm subscribed

I have an idea of how I could reproduce the oos blocks in our test
environment. I hope I can try it this week.
If I can reproduce them, I can run the tests with the patch and try the
XFS file system.

Do you think that if the secondary volume has oos blocks, the file
system on it is damaged?