Hi all,

I just set up a drbd replication pair and shipped it to the colo. It was 
working great.

Then we upgraded the kernel to 2.6.15-1.1833_FC4smp. I rebuilt the 
kernel modules on one machine, rebooted, connected and became primary. 
Then I rebooted the other machine into 2.6.15 so I could build its new 
modules. While I was waiting for that, I started transferring some data 
onto the drbd partition (fs already there, had been working great) by 
untarring. It had run for a little while when tar hung. I couldn't kill 
it. I figured it would time out, but it didn't. There were also no errors 
in /var/log/messages. The other machine couldn't connect at this point, so 
I figured I had hosed something and that I would just start with what 
should be a pristine copy of the data on the other machine and make the 
bad machine resync.

I power-cycled both machines, made the good machine primary, ran an fsck 
on the fs, mounted it. The bad machine connected and started to resync. 
It all looked as if it was fine. Yay! Then I realized that I had the 
network throttling set way too low (it's a 20G partition) and I needed 
to restart drbd. So I disconnected the secondary and unmounted the 
filesystem on the primary. At least I tried to. The umount failed (I had 
modified one file), and now the primary machine is hung again.

So I've rebooted the secondary system back into 2.6.11, and I'm going to 
power cycle the primary again. Any ideas as to what's going on?

Monty Taylor
Senior Consultant, MySQL, Inc.

