On Thursday 08 April 2004 11:53, Andreas Schultz wrote:
> On Wednesday 07 April 2004 17:29, Philipp Reisner wrote:
> > [...]
>
> > Could you try this patch ?
> >
> > - There is a good chance that it will "just work".
>
> It works, kind of.
>
> Normal operation appears to be ok: initial resync, normal work and so on.
>
> A problem occurred when I attempted to repair the underlying soft raid5
> array while a drbd resync was underway. The raid rebuild caused the drbd
> sync to stall. The drbd also did not recover after the raid rebuild was
> completed. Stopping the secondary (drbdadm down ...) worked ok, but
> stopping the primary failed:
>
> sdev01:~# drbdadm down db
> drbd0: worker terminated
>
> Child process does not terminate!
> Exiting.
>
> # tail /var/log/syslog
> Apr 8 10:57:26 sdev01 kernel: drbd0: Resync started as source (need to
> sync 40.
> Apr 8 11:09:20 sdev01 kernel: drbd0: meta connection shut down by peer.
> Apr 8 11:09:20 sdev01 kernel: drbd0: asender terminated
> Apr 8 11:10:21 sdev01 kernel: drbd0: worker terminated
>
> # ps xaw
>   PID TTY      STAT   TIME COMMAND
> 22377 ?        DW     0:14 [drbd0_receiver]
>   625 ttyS0    D      0:00 /sbin/drbdsetup /dev/drbd0 down
>
> backtrace for drbd0_receiver (sysrq-t):
> _drbd_process_ee
> drbd0_receive D 00000086 0 22377 1 625 9715 (L-TLB)
> c2441f08 00000046 00000003 00000086 00000001 00000000 f77e02a0 00000000
> c2440000 c2441f08 f8c4d347 00000001 d283346c d2833000 c2441ef4
> c2441eec c01fcf96 c1a0cbe0 000186a0 93e1f740 000f7a28 cb063388 d2833000
> 00000000 Call Trace:
> [<f8c4d347>] _drbd_process_ee+0x137/0x1d0 [drbd]
> [<c01fcf96>] generic_unplug_device+0x66/0x70
> [<f8c4d4ec>] drbd_get_ee+0x10c/0x280 [drbd]
> [<c011e270>] default_wake_function+0x0/0x20
> [<f8c4ef1a>] receive_DataRequest+0xca/0x290 [drbd]
> [<f8c4de5d>] drbd_recv_header+0x1d/0xe0 [drbd]
> [<f8c4ff52>] drbdd_init+0xb2/0x6c0 [drbd]
> [<c0124d24>] daemonize+0xa4/0xb0
> [<f8c472cd>] drbd_thread_setup+0x3d/0x60 [drbd]
> [<f8c47290>] drbd_thread_setup+0x0/0x60 [drbd]
> [<c0106005>] kernel_thread_helper+0x5/0x10
>
> sdev01:~# cat /proc/modules
> drbd 91616 1 - Live 0xf8c45000
>
> sdev01:/# addr2line -e /usr/src/kbuild-srv-dev/drivers/block/drbd/drbd.o
> 8347 /usr/src/linux-2.6.5-vs/drivers/block/drbd/drbd_receiver.c:350
>
> The above source code line looks a bit strange. I probably
> misunderstood something about how the real address/offset has to be
> calculated. Let me know if you need any additional information.
>

If I assume that it was blocked in _drbd_process_ee()...

_drbd_process_ee() tries to send ACKs. What happens if it can not get
them through ? Why is it in "D" state? Is it blocked on the send_mutex ?

The output of "netstat -t" would be nice in this situation.

-Philipp
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :
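
On the address/offset question raised in the quoted report: the value handed to addr2line (8347, which addr2line reads as hexadecimal) appears to be the call-trace address minus the module load address from /proc/modules. The sketch below only redoes that arithmetic with the numbers from the report. The helper name module_offset is made up for illustration, and it assumes that the module's code starts at the load address and that drbd.o is not relocated any further, which may not hold for every build.

```python
def module_offset(trace_addr: int, module_base: int) -> int:
    """Return the offset of a call-trace address from the module's load address.

    Hypothetical helper: it assumes the module text begins at module_base.
    """
    if trace_addr < module_base:
        raise ValueError("address lies below the module base")
    return trace_addr - module_base


# Values copied from the report above.
DRBD_BASE = 0xF8C45000   # /proc/modules: "drbd 91616 1 - Live 0xf8c45000"
DRBD_SIZE = 91616        # module size in bytes, from the same line
TRACE_ADDR = 0xF8C4D347  # [<f8c4d347>] _drbd_process_ee+0x137/0x1d0 [drbd]

offset = module_offset(TRACE_ADDR, DRBD_BASE)

# Sanity check: the offset has to fall inside the module.
assert offset < DRBD_SIZE

# The trace entry is _drbd_process_ee+0x137, so under the same assumption
# the function itself would start 0x137 bytes earlier.
function_start = offset - 0x137

print(hex(offset))          # 0x8347
print(hex(function_start))  # 0x8210
```

Under those assumptions the offset matches the 8347 used in the report. Call-trace entries are usually return addresses, so the line addr2line reports may be the statement following the call that actually blocked, which could explain why line 350 looks odd.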