[DRBD-user] degradation of performance -90% - v.0.7.5

Thu Sep 27 21:41:40 CEST 2012

Hello everyone,

I am running drbd on 2.4.x, 2.6.x and 3.x kernels; only one system gives me the following problem:

[ this is drbd 0.7.25 on 2.4.21-51.ELsmp (RHEL3, 32-bit), 16 GB RAM, 8 cores, etc. ]

Between 6 and 12 hours after starting drbd, the transmission rate drops from 70-90+ MB/s to about 5 MB/s.
At the moment, the only way to get the transmission speed back up without stopping/restarting the service is to run 'drbdadm adjust' on the secondary node after changing
max-buffers or max-epoch-size (changing the value by 1).

So when the slowdown occurs, I change max-buffers from 10000 to 10001 and run the adjust. Performance is perfect again for ~6 hours, sometimes a bit more.
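For illustration, the config edit behind this workaround could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `toggle_max_buffers` and the `base` parameter are made up for this example, it only rewrites the config text, and it assumes a Python 3 interpreter is available somewhere to run it (which may not be the case on the RHEL3 box itself).

```python
import re


def toggle_max_buffers(conf_text: str, base: int = 10000) -> str:
    """Return conf_text with max-buffers flipped between base and base + 1.

    Hypothetical helper: it mirrors the manual "change the value by 1" step
    and does not call drbdadm itself.
    """
    # Match only an active (uncommented) max-buffers line.
    pattern = re.compile(r"^([ \t]*max-buffers[ \t]+)(\d+)([ \t]*;)", re.MULTILINE)

    def flip(match):
        current = int(match.group(2))
        # Alternate between the two values so the setting never drifts upward.
        new_value = base + 1 if current == base else base
        return f"{match.group(1)}{new_value}{match.group(3)}"

    new_text, count = pattern.subn(flip, conf_text, count=1)
    if count == 0:
        raise ValueError("no active max-buffers setting found")
    return new_text
```

After writing the result back to the config file on the secondary node, 'drbdadm adjust export' would still have to be run by hand or from a wrapper, as described above.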

I spent days checking whether tweaking drbd, tcp or vm settings would change this behavior, with no luck.

my current config:

resource export {
  protocol C;
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
  startup {
    wfc-timeout 0; ## Infinite!
    degr-wfc-timeout 60; ## 1 minute.
  }
  disk {
    on-io-error detach;
  }
  net {
    # timeout 60;
    # connect-int 10;
    # ping-int 10;
    max-buffers 10001;
    max-epoch-size 8003;
  }
  syncer {
    rate 50M;
    group 1;
    al-extents 3389;
  }

  # PRIMARY - remote
  on yy.xx.com {
    device /dev/drbd0;
    disk /dev/emcpowerc1;
    address 10.15.1.100:7790;
    meta-disk /dev/emcpowerc2[0];
  }

  # SECONDARY - this box
  on zz.xx.com {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.15.11.103:7790;
    meta-disk /dev/sdb2[0];
  }

}

If I cannot find a solution, I might have to put the "adjust" in cron, which I would hate to do.

Unfortunately I >cannot< upgrade the kernel or the Linux release; any other kind of fix should be possible.

thanks for all your suggestions!!!!

Mike