Note: "permalinks" may not be as permanent as we would like;
direct links from old sources may well be a few messages off.
/ 2006-06-08 20:34:10 -0400
\ Maurice Volaski:
> It appears that kernel 2.6.17-rc5 under amd64 may be hanging I/O
> processes spontaneously at random.
>
> Our setup here uses drbd (ver. 0.7.19), which is a network RAID kernel
> module, and initially I was fscking (ext3) filesystems, which are on
> drbd devices, and the fsck just stopped spontaneously on the two of
> them.
>
> Today, I tried copying a directory on the command line with cp -a on
> this computer (on a drbd-managed device) and then in mid-copy I tried
> to abort the process with control-C. It did not abort. I tried killing
> it with kill and then with kill -9. It turns out that the process had
> died, but is still left in ps.
>
> Just by happenstance,
>
> Anyway, I tried bringing down the peer and bringing it back up and it
> stalled.
>
> I also had an emerge sync (i.e., the Gentoo update mechanism) going
> and it too got stuck, but it doesn't, or at least shouldn't affect
> drbd disks, implying this is not a drbd bug, but a kernel bug.
>

well. it says below that emerge hangs in drbd_al_begin_io,
so it is at least drbd related, too.

> >* what are the numbers in /proc/drbd
>
> This is how it appears long after the copy hang and I stopped and just
> restarted drbd on the peer. The output from the hanging computer:

thanks, that might help in debugging things.
I'll have a look, maybe I can find something in there.

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.
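
A process that ignores kill -9 but still shows up in ps, as described
above, is usually a task in uninterruptible sleep (state "D"). The
following is a minimal sketch, not taken from the thread, of one way to
list such tasks together with the kernel symbol they are waiting in
(for example drbd_al_begin_io). It assumes a Linux /proc that provides
per-process "stat" and "wchan" files; the helper names parse_stat and
blocked_tasks are made up for illustration, and wchan may be empty or
"0" depending on kernel configuration and permissions.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Yield (pid, comm, wchan) for tasks in uninterruptible sleep (D)."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited, or stat was unreadable
        if state != "D":
            continue
        try:
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip() or "?"
        except OSError:
            wchan = "?"  # wchan not available or not permitted
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(f"{pid:>7}  {comm:<16}  {wchan}")
```

Run on the hanging machine, this prints one line per blocked task; if
several unrelated commands (cp, fsck, emerge) all report the same wait
symbol, that points at a common culprit.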