[DRBD-user] drbd 0.7.13 slow resync and panic with RedHat kernel 2.4.21-32.0.1.ELsmp

Diego Liziero diegoliz at carpidiem.it
Wed Sep 14 18:49:03 CEST 2005


Hello,
today we tried to update our drbd 0.6.x system to 0.7.13
using a free disk partition as meta-disk.

We followed all the update instructions and we got the first
5 drbd partitions in sync with the new 0.7.13 version.

While the 6th and last drbd partition was syncing, we first noticed a
slowdown.

The bitrate went down from 480Mbit/sec to about 60Mbit/sec.

The link between the 2 nodes of the cluster is a dedicated gigabit
ethernet link used only by drbd. We noticed and measured
this slowdown using iptraf.

The last partition is the biggest one (250G), and after 10% of the
resync process the primary node hung. The console was black and
the keyboard was not responding, so we had to press the reset button.
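
To put the two rates in perspective, here is a rough sketch of how long
a full resync of the 250G partition would take at each of them. It
assumes decimal units (250G = 250e9 bytes, 1 Mbit = 1e6 bits) and
ignores protocol overhead; the function names are made up for this
example and have nothing to do with drbd itself.

```python
def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a link rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


def resync_seconds(size_bytes: float, mbit_per_s: float) -> float:
    """Time to transfer size_bytes at a constant rate, in seconds."""
    bytes_per_s = mbit_per_s * 1e6 / 8.0
    return size_bytes / bytes_per_s


# Assumption: "250G" is taken as 250e9 bytes.
PARTITION_BYTES = 250e9

for rate in (480, 60):  # the two rates measured with iptraf
    hours = resync_seconds(PARTITION_BYTES, rate) / 3600
    print(f"{rate} Mbit/s = {mbit_to_mbyte_per_s(rate):.1f} MB/s, "
          f"full resync in about {hours:.1f} h")
```

Under these assumptions the slowdown is a factor of 8: about 60 MB/s
before and 7.5 MB/s after, so a full resync would stretch from a bit
over one hour to more than nine.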

We repeated this process several times with different versions
of the 2.4.21smp kernel, each time with a freshly recompiled
0.7.13 drbd module.

In all cases we got a system hang during the resync, sometimes
with a slowdown of the sync rate some minutes before the hang.

In one case we were able to see an Oops message on the console,
but unfortunately just the last lines were visible
(I remember something about tasker, irq and smp)
and shift-pageup was not working.

Our system is a cluster of 2 servers, each with 4 Xeon
processors and 7 GB of RAM.

The same kernel version works fine with drbd 0.6.12.

Any suggestions?

Regards,
Diego.
