> I just set up a drbd replication pair and shipped it to the colo. It
> was working great.

so you did local tests first, and all was working as expected?

> Then we upgraded the kernel to 2.6.15-1.1833_FC4smp. I rebuilt the
> kernel modules on one machine, rebooted, connected and became primary.
> Then I rebooted the other machine into 2.6.15 so I could build its new
> modules. While I was waiting for that, I started transferring some
> data onto the drbd partition (fs already there, had been working
> great) by untarring. It had gone for a little bit when tar hung. I
> couldn't kill it. I figured it would timeout. It didn't. Also no
> errors in /var/log/messages. The other machine couldn't connect at
> this point, so I figured I had hosed something and that I would just
> start with what should be a pristine copy of the data on the other
> machine and make the bad machine resync.
>
> I power-cycled both machines, made the good machine primary, ran an
> fsck on the fs, mounted it. The bad machine connected and started to
> resync. It all looked as if it was fine. Yay! So then i realized that
> I had the network throttling down way too low (it's a 20G partition)
> and I needed to restart the drbd. So I disconnected the secondary. And
> unmounted the filesystem on the primary. At least I tried to. The
> umount failed (I had modified one file) And now the primary machine is
> hung again.

> So I've rebooted the secondary system back into 2.6.11, and I'm going
> to power cycle the primary again. Any ideas as to what's going on?

your report is somewhat unspecific.

anyway, your last sentence, "machine is hung", might point to a
deadlock that can occur when stressing the box.
this possible deadlock is due to a bio_alloc(,GFP_KERNEL) in drbd
where it should have been GFP_NOIO. it was recognized and fixed
just after we released 0.7.17.

may I ask you to try again with recent drbd svn?

  svn co http://svn.drbd.org/drbd/branches/drbd-0.7

revision 2111 and greater should contain that fix.
there may be a 0.7.18 bugfix release because of that.

please report your findings.

thanks,

--
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.