[Drbd-dev] Troubleshooting digest failures?

Lars Ellenberg lars.ellenberg at linbit.com
Wed Aug 6 19:09:30 CEST 2008


On Wed, Aug 06, 2008 at 06:04:09PM +0200, Lars Ellenberg wrote:
> On Wed, Aug 06, 2008 at 08:56:25AM -0600, Gregor Mosheh wrote:
> > Hey guys.
> 
> Hello again.
> 
> Sorry for the joke, but I cannot help it.
> You know the story about "The hare and the hedgehog"?
> 
> > I've gotten no response from the user list,
> 
> now, that is not entirely true ;)
> 
> > so maybe it's time  
> > for a different tack debugging DRBD's innards...
> >
> > I've been having a problem which I describe here. The last posting is  
> > probably the most relevant.
> > http://www.gossamer-threads.com/lists/drbd/users/15119
> >
> > How would I go about debugging this?  Is there extra logging or
> > debugging which I can enable? Have any of you seen this before?
> 
> Anyways,
> apart from what I wrote in your thread, and the
> "What causes nodes to become out-of-sync?" thread,
>  http://www.gossamer-threads.com/lists/drbd/users/15081
> there is not much else I can say.
> 
> You said you have another cluster, not yet in production, where it did
> not occur so far, and you suggest it may be just the missing load that
> makes it "appear" healthy.
> 
> How about using it as test setup, and generate load on it,
> until you can provoke the symptom there, too?
> 
> To reverse that, if you cannot provoke the symptom there,
> I'd still point to hardware issues on the affected cluster.

also, please have a look at this thread, where I try to explain
why modifying in-flight data buffers would lead to these symptoms.
http://www.gossamer-threads.com/lists/drbd/users/15189
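
To illustrate the idea only, here is a toy model in Python. It is not
DRBD code. The names transmit() and digest_ok() are made up for this
sketch, and zlib.crc32 merely stands in for whatever digest algorithm is
configured. The point it shows: if the digest is taken first and the
owner of the buffer changes the data before it actually goes out, the
receiver sees a digest mismatch although neither network nor disk
corrupted anything.

```python
import zlib


def transmit(buf, modify_in_flight=False):
    """Toy sender: checksum the buffer first, put it on the wire afterwards.

    Returns (digest_sent, payload_received). Hypothetical helper, not DRBD API.
    """
    digest_sent = zlib.crc32(buf)      # step 1: sender computes the digest
    if modify_in_flight:
        buf[0] ^= 0x01                 # step 2: someone changes the buffer meanwhile
    payload_received = bytes(buf)      # step 3: the (possibly changed) data is sent
    return digest_sent, payload_received


def digest_ok(digest_sent, payload_received):
    """Toy receiver: recompute the digest over what actually arrived."""
    return zlib.crc32(payload_received) == digest_sent


sector = bytearray(b"A" * 512)
print(digest_ok(*transmit(bytearray(sector))))                         # True
print(digest_ok(*transmit(bytearray(sector), modify_in_flight=True)))  # False
```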

also, when online-verify reports the out-of-sync sectors,
please do the
 # dd iflag=direct if=/dev/whatever bs=512 \
	skip=sector-offset count=size \
	of=nodename.dump
 # diff -U0 <(xxd node0.dump) <(xxd node1.dump)
trick (explained in the "what causes nodes to become out of sync"
thread) to get a diff of the hexdumps, so we can tell whether there are
  single bit flips,
  multiple word data changes, or
  completely unrelated stuff
in the corresponding sectors on the different nodes.
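
If eyeballing the hexdump diff gets tedious, a rough first
classification of the two dumps could be scripted. The following is
only a sketch in Python: the function name classify_difference(), the
4-byte word size, and the "more than half of the bytes differ"
threshold for "unrelated" are all assumptions made for this example,
not anything DRBD defines.

```python
def classify_difference(a: bytes, b: bytes, word_size: int = 4) -> str:
    """Roughly bucket the difference between two equally sized sector dumps."""
    if len(a) != len(b):
        raise ValueError("dumps must have the same length")

    # Offsets of all bytes that differ between the two dumps.
    diff_offsets = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if not diff_offsets:
        return "identical"

    # Count differing bits via XOR of each differing byte pair.
    flipped_bits = sum(bin(a[i] ^ b[i]).count("1") for i in diff_offsets)
    if flipped_bits == 1:
        return "single bit flip"

    # Assumed threshold: unrelated data tends to differ almost everywhere.
    if len(diff_offsets) > len(a) // 2:
        return "completely unrelated"

    # Otherwise report how many (assumed 4-byte) words are touched.
    words = {i // word_size for i in diff_offsets}
    return f"changes in {len(words)} word(s)"
```

Feed it the contents of node0.dump and node1.dump, both read in binary
mode. The result is only a hint about where to look; the hexdump diff
remains the thing to post.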

-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :

