Re-8: [DRBD-user] drbd freezes completely!

Stefan Kerkemeier stefanke at micodat.com
Sat Nov 19 16:40:18 CET 2005


Sorry for my ignorance, I thought that by "UP" you meant an updated kernel.

I can also reproduce this bug with a non-SMP (uniprocessor) kernel! Yes, you're right, the bug's behaviour makes no sense at all, but it is happening as I described. I can also reproduce it (the sync source hangs) on the other node (exactly the same hardware setup) when syncing in the other direction. So I think a hardware issue isn't very likely, because the bug occurs on both nodes!

It has something to do with the network/disk load, because when I decrease the sync rate everything works. But the system only hangs with drbd. Under high network and disk load without drbd everything is fine (if there were a hardware problem, the system should freeze then too).

It is very strange ...


cheers
Stefan

-------- Original Message --------
Subject: Re: Re-6: [DRBD-user] drbd freezes completely! (19-Nov-2005 15:59)
From:    Philipp Reisner <philipp.reisner at linbit.com>
To:      drbd-user at lists.linbit.com

> > > > > > Any suggestions?
> > > > >
> > > > > > What does "freezes completely" mean?
> > > > >  - What is on the screen
> > > >
> > > > no error messages
> > > >
> > > > >  - does it respond to key strokes
> > > >
> > > > no
> > > >
> > > > >  - does it respond to pings
> > > >
> > > > no
> > > >
> > > > >  - does it toggle the keyboard leds when you press "Num-Lock" etc..
> > > >
> > > > no
> > > >
> > > > no log entries. Note that with nmi_watchdog=1 there is no additional
> > > > information available!
> > >
> > > Please try to reproduce the freeze with an UP kernel.
> > >
> >
> > As I already mentioned, I tried vanilla 2.6.13 without success.
> 
> UP stands for 'uniprocessor', as opposed to SMP, 'symmetric
> multiprocessing'. This does not say anything about vanilla or vendor
> kernels. In one of your posts you state that you run the SUSE SLES9
> SP2/2.6.5.-191-smp kernel. What I asked you to do is to run a kernel
> that was built for a single-CPU machine.
> 
> What I am trying to find out is whether your lockup is a lockup on a
> spinlock or on another synchronization primitive.
> 
> But your description so far makes no sense at all.
> 
> If it is on a semaphore/wait queue etc... it should respond to pings
> and key strokes.
> 
> If it is on a spinlock... it should OOPS when booted with "nmi_watchdog=1"
> 
> If it is on a spinlock, the lockup will simply go away when you run on only
> a single CPU (i.e. a UP kernel).
> 
> It looks as if your machine freezes due to some other reason, e.g. a "bug"
> on the PCI bus, etc. Actually, your observation that it does not lock
> up when it runs slower fits that pattern.
> 
> -Phil
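
The diagnostic reasoning quoted above can be summarised as a small decision function. This is only a rough sketch of that reasoning: the function name, parameter names and result strings are made up for illustration, and treating "responds to pings or key strokes" as a sign of a sleeping primitive is a simplification of what was said.

```python
def classify_lockup(responds_to_ping, responds_to_keys,
                    nmi_watchdog_oops, freezes_on_up_kernel):
    """Rough reading of the symptoms discussed in this thread (illustrative only)."""
    # A task stuck on a semaphore/wait queue sleeps, so the rest of the
    # machine should still answer pings and key strokes.
    if responds_to_ping or responds_to_keys:
        return "sleeping primitive"
    # A spinlock lockup should OOPS with nmi_watchdog=1 and should
    # go away on a uniprocessor (UP) kernel.
    if nmi_watchdog_oops and not freezes_on_up_kernel:
        return "spinlock"
    # Anything else does not fit either pattern.
    return "other cause"


# Symptoms reported in this thread: no ping, no key strokes,
# no OOPS from the NMI watchdog, and the freeze also occurs on a UP kernel.
print(classify_lockup(False, False, False, True))  # prints "other cause"
```

With the symptoms reported here the sketch ends up in the last branch, which matches the conclusion that neither the spinlock nor the semaphore/wait queue pattern fits.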






