[DRBD-user] What to gather when a lockup happens? FC6 + drbd 8.0.5

Jason Zhang jzhang at silver-peak.com
Wed Sep 12 09:50:10 CEST 2007


Thanks. Installed 8.0.6; so far so good, no lockups when the connection
breaks.

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com
[mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Tuesday, September 11, 2007 1:44 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] What to gather when a lockup happens? FC6 +
drbd 8.0.5

On Mon, Sep 10, 2007 at 02:45:29PM -0700, Jason Zhang wrote:
> I have an experimental setup, and the primary node seems to lock up on
> a few occasions, mostly when it is not connected to the secondary
> (manually disconnected or connection breakage). Once it enters that
> state, issuing a shutdown command won't reboot the machine; I had to
> power cycle it.

first, upgrade to 8.0.6, please.
and give your exact kernel version, too.

> My questions are:
> 
> 1. what information should I gather in this case?

you could do
 cat /proc/drbd
 ps -eo pid,state,wchan:40,comm | grep drbd
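
As a variation on the ps line above, here is a small sketch that keeps the header, every drbd thread, and also every task sitting in uninterruptible sleep (state D), since those are often the ones involved in a hang. The function name filter_blocked and the extra D-state filter are assumptions of this sketch, not part of the original advice.

```shell
# Hypothetical helper: reads the output of
#   ps -eo pid,state,wchan:40,comm
# on stdin and keeps only the interesting lines.
filter_blocked() {
    # NR == 1      : keep the ps header line
    # $2 == "D"    : task is in uninterruptible sleep
    # $NF ~ /drbd/ : command name mentions drbd
    awk 'NR == 1 || $2 == "D" || $NF ~ /drbd/'
}
```

Usage would be `ps -eo pid,state,wchan:40,comm | filter_blocked`, run on the node while it is in the bad state.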

or even
 echo 1 > /proc/sys/kernel/sysrq
 echo t > /proc/sysrq-trigger

then take the stack dumps from wherever your syslog puts the kernel
messages, put them into some file, throw away the uninteresting userland
processes, if possible reduce the list to only kernel related, vm, fs,
maybe network and blockdevice related things, _avoid linebreaks_,
gzip it, and post that.
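
A rough sketch of the "cut out the dump and gzip it" step, under a few assumptions: the kernel log has already been saved to a file (the path depends on your syslog setup), and the task dump starts at a line containing "SysRq : Show State" (newer kernels may print "sysrq: Show State"; the pattern below accepts both). The function name extract_task_dump is made up for this example. It does not do the manual pruning of userland processes, it only drops everything before the dump and leaves the lines unwrapped.

```shell
# Hypothetical helper.
#   $1 = file holding the saved kernel messages
#   $2 = name of the gzip file to create
extract_task_dump() {
    # Print every line from the first "Show State" marker onward,
    # unmodified (no re-wrapping), and compress the result.
    awk 'tolower($0) ~ /sysrq ?: show state/ { keep = 1 } keep' "$1" |
        gzip -c > "$2"
}
```

For example: `extract_task_dump /var/log/messages drbd-lockup.gz`, where both file names are placeholders.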

but since this may be A LOT of information which may be completely
useless, someone would need to really be motivated to look at that.

> 2. any software methods to kill/reload drbd besides physically
> pressing the reset/power buttons?

it depends on the nature of that "lockup".
mostly the answer is: no, you have to reset.
well, you can do
 echo s > /proc/sysrq-trigger
 echo u > /proc/sysrq-trigger
maybe stop all stoppable md-devices now, if you know a way to do so, then
 echo b > /proc/sysrq-trigger

and hope for the best, which may be slightly more convenient
if you don't have a network power switch, and remote hands are
cumbersome and costly...
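
The sync / remount read-only / reboot sequence above could be wrapped like this. It is only a sketch: the names sysrq, emergency_reboot, DRY_RUN, SYSRQ_TRIGGER and SYSRQ_DELAY are invented here, and the pause between steps is a guess at giving the sync and remount some time to finish. It defaults to dry-run mode, so nothing is written to /proc unless DRY_RUN=0 is set explicitly.

```shell
# All names below are hypothetical, chosen for this sketch.
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

sysrq() {
    # Send one SysRq command key; in dry-run mode (the default) only print it.
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would write '$1' to $SYSRQ_TRIGGER"
    else
        echo "$1" > "$SYSRQ_TRIGGER"
    fi
}

emergency_reboot() {
    sysrq s                      # emergency sync
    sleep "${SYSRQ_DELAY:-5}"
    sysrq u                      # emergency remount read-only
    sleep "${SYSRQ_DELAY:-5}"
    # stop any md devices here by hand, if you know how
    sysrq b                      # immediate reboot, no further sync
}
```

Run `emergency_reboot` once to see what it would do, then `DRY_RUN=0 emergency_reboot` as root to do it for real.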

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :