[DRBD-user] communication constantly terminated, always re-syncing

Lars Ellenberg lars.ellenberg at linbit.com
Tue Mar 8 17:25:15 CET 2011



On Tue, Mar 08, 2011 at 08:52:13AM -0500, Cory Coager wrote:
> On 03/04/2011 09:42 AM, Lars Ellenberg wrote:
> >I wrote: my best guess is your NIC hardware is broken.  These logs are
> >more complete than what you posted before, though, so you _may_
> >just have a pathological case of data buffers modified in flight.
> >
> >There have been a few threads on that, the most recent one there:
> >Sat, 26 Feb 2011 15:40:44 -0800
> >Re: [DRBD-user] Explained: Digest integrity check FAILED
> >http://www.mail-archive.com/drbd-user@lists.linbit.com/msg03373.html
> 
> OK, how do I prove to the vendor that the NIC hardware is broken?  Why
> do you think it's hardware and not a driver issue?

I did think that at first, because the logs you provided
were incomplete and were missing important hints.

I did correct myself above; you even quoted that part:
> >These logs are
> >more complete than what you posted before, though, so you _may_
> >just have a pathological case of data buffers modified in flight.
> >
> >There have been a few threads on that, the most recent one there:
> >Sat, 26 Feb 2011 15:40:44 -0800
> >Re: [DRBD-user] Explained: Digest integrity check FAILED
> >http://www.mail-archive.com/drbd-user@lists.linbit.com/msg03373.html

So, did you read that thread,
and the things it refers to?

In short: to be able to distinguish
	a) "buffers modified in flight, *by upper layers*",
from
	b) "data modified in flight by something else"
you need to upgrade to DRBD 8.3.10.
The symptoms will not change, but in case a) you get an additional
log message on the Primary saying so.

If for each such disconnect you get a corresponding log line
saying "Digest mismatch, buffer modified by upper layers during write:",
then a) is what is happening.

If you get some disconnects without that log message, something else is
going on, suggesting other sources of modification of in-flight data,
including, but not limited to, broken hardware.
*NOTE* again: you need DRBD 8.3.10 to get that log message at all.
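
A rough sketch of that check, in Python, for anyone who wants to automate
it. It only compares totals: it counts digest-related disconnects and the
"modified by upper layers" messages in whatever log lines you feed it, and
reports how many disconnects are left over. It does not pair events by
timestamp. The function name and the disconnect marker string are
assumptions (the marker is borrowed from the subject of the thread
referenced above); check both against your own logs before trusting the
numbers.

```python
# Message text quoted in the post above; DRBD 8.3.10 logs it on the Primary.
UPPER_LAYER_MSG = "Digest mismatch, buffer modified by upper layers during write:"

# ASSUMPTION: substring that identifies a digest-related disconnect in your
# logs. Taken from the subject of the referenced thread; adjust as needed.
DISCONNECT_MSG = "Digest integrity check FAILED"


def count_digest_events(log_lines, disconnect_msg=DISCONNECT_MSG):
    """Count disconnects and 'modified by upper layers' messages.

    log_lines is any iterable of strings, e.g. the kernel logs of both
    nodes concatenated. Hypothetical helper, not part of DRBD.
    """
    disconnects = 0
    upper_layer = 0
    for line in log_lines:
        if UPPER_LAYER_MSG in line:
            upper_layer += 1
        elif disconnect_msg in line:
            disconnects += 1
    return {
        "disconnects": disconnects,
        "upper_layer": upper_layer,
        # Disconnects with no Primary-side message to account for them:
        # these are the candidates for case b).
        "unexplained": max(disconnects - upper_layer, 0),
    }
```

Feed it the lines from the logs of both nodes, since the additional
message appears on the Primary. If "unexplained" stays at zero, that
points to a); anything above zero deserves a closer look.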

For some more background info, and suggestions on what to do:
read the thread I referred to, and the things it refers to.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


