[DRBD-user] Tracking down sources of corruption examined by drbdadm verify

Lars Ellenberg lars.ellenberg at linbit.com
Tue Apr 22 20:09:52 CEST 2008



On Tue, Apr 22, 2008 at 04:20:21PM +0200, Szeróvay Gergely wrote:
> On Fri, Apr 18, 2008 at 11:29 AM, Lars Ellenberg
> <lars.ellenberg at linbit.com> wrote:
> > On Thu, Apr 17, 2008 at 07:19:06PM +0200, Szeróvay Gergely wrote:
> >  > >  > Any idea would help.
> >  > >
> >  > >  what file systems?
> >  > >  what kernel version?
> >  > >  what drbd protocol?
> >  > >
> >  > >  it is possible (I got this suspicion earlier, but could not prove it
> >  > >  during local testing) that something submits a buffer to the block
> >  > >  device stack, but then modifies this buffer while it is still in flight.
> >  > >
> >  > >  these snippets you show look suspiciously like block maps.  if the block
> >  > >  offset also confirms that this is within some filesystem block map, then
> >  > >  this is my working theory of what happens:
> >  > >
> >  > >  ext3 submits block to drbd
> >  > >   drbd writes to local storage
> >  > >   ext3 modifies the page, even though the bio is not yet completed
> >  > >   drbd sends the (now modified) page over network
> >  > >   drbd is notified of local completion
> >  > >   drbd receives acknowledgement of remote completion
> >  > >  original request completed.
> >  > >
> >  > >  I ran into these things while testing the "data integrity" thing,
> >  > >  i.e. "data-integrity-alg md5sum", where every now and then
> >  > >  an ext3 on top of drbd would produce "wrong checksums",
> >  > >  and the hexdump of the corresponding data payload always
> >  > >  looked like a block map, and was different in just one 64bit "pointer".
> >
> >
> > > DRBD 8.2.5 with protocol "C"
> >  >
> >  > Kernel versions (kernels from kernel.org with Vserver patch):
> >  > node "immortal": 2.6.21.6-vs2.2.0.3 32bit smp
> >  > node "endless": 2.6.22.18-vs2.2.0.6 32bit smp (with new e1000 driver)
> >  > node "infinity": 2.6.22.18-vs2.2.0.6 32bit smp (with new e1000 driver)
> >  >
> >  > I use Reiserfs, usually with group quotas enabled. The DRBD device
> >  > sits on top of LVM2 (and on software RAID1 in some cases).
> >  >
> >  > My system is often under heavy load, but I cannot find a connection
> >  > between the oos blocks and the load. My most problematic volume
> >  > contains a MySQL 5 database. I tried to stress it by moving big
> >  > files onto the volume, but the oos blocks were not generated any
> >  > more frequently.
> >  >
> >  > I tried the crc32 data-integrity-alg on the most problematic volume;
> >  > it detected a few errors per day. I do not think it is a network
> >  > error, because the network passes its tests cleanly and full resyncs
> >  > produced no corruption.
> >
> >  right. so my working hypothesis is that reiserfs somehow
> >  modifies its buffers even while they are in flight.
> >
> >  because submission to local disk and tcp send over network happen at
> >  different times, local disk and remote system see different data.
> >
> >  to verify that, we could
> >   * memcopy the data which is submitted
> >   * memcmp it just before it is completed
> >  I could provide a patch to do so somewhen next week.
> >
> >  if it is an option, change the filesystem of your "most problematic"
> >  volumes to xfs, and see how it behaves then.
> 
> 
> I have an idea about how I could reproduce the ooses in our test
> environment.  I hope I can try it this week.  If I can reproduce it,
> I can run the tests with the patch and try the XFS file system.

doing both at the same time would not prove much.
the thing is, I have the suspicion that "something"
is modifying in-flight io pages.
further, the suspicion is some specific "user" (file system) does this.

so to gather more data points, collecting "circumstantial evidence",
you can try to reproduce the effect with a different "user" (e.g. xfs).

if the "oos" effect does not show up with xfs,
but is reproducible with the other file system,
I'd say suspicion confirmed.

you can do that now.

the other way is to try to _prove_ that someone does modify in-flight io
pages. that would be possible by instrumenting the device driver,
i.e. patching drbd to allocate an additional page per request,
do a copy on submit, and verify before completion.

I could code up such crude verification code, but it would be
debugging code only, just to be clear on that. And since this additional
copy/verification step would heavily alter the timing and caching
behaviour, adding this debug code may even make the observed effect
vanish.
we'll see.
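as a side note, for anyone who wants to repeat the crc32 experiment
mentioned above: the data-integrity-alg goes into the net section of
drbd.conf. a sketch (the resource name is a placeholder; the algorithm
must be a digest the kernel crypto API provides):

```
resource r0 {
  net {
    data-integrity-alg crc32c;
  }
}
```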

> What do you think: if the secondary volume has oos blocks, is the file
> system on it damaged?

difficult to say.

-- 
: commercial DRBD/HA support and consulting: sales at linbit.com :
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
