[DRBD-user] DRBD stability issues

Steve Kieu msh.computing at gmail.com
Wed Nov 16 03:01:38 CET 2011


I am experimenting with DRBD and finding it unstable to the point of being
unusable. I see this in the dmesg log:

block drbd1: md_sync_timer expired! Worker calls drbd_md_sync

After a restart it works for a while, and then all of a sudden cat
/proc/drbd shows ProtocolError and the system hangs (mysql or any other
process reading from or writing to the DRBD partitions).
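
To catch the moment the connection state changes, the per-device status lines
of /proc/drbd can be polled and parsed. The following is a minimal sketch, not
a tested monitoring tool: the function names and the SAMPLE text are
hypothetical, and it assumes the usual 8.x status line layout of
"<minor>: cs:<state> ro:... ds:...".

```python
import re

# Matches per-device status lines such as
# " 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"
STATUS_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def connection_states(proc_text):
    """Return {minor: connection state} parsed from /proc/drbd text."""
    states = {}
    for line in proc_text.splitlines():
        match = STATUS_RE.match(line)
        if match:
            states[int(match.group(1))] = match.group(2)
    return states


def not_connected(states):
    """Return only the minors whose connection state is not Connected."""
    return {minor: cs for minor, cs in states.items() if cs != "Connected"}


# Hypothetical sample text for illustration; not taken from the real system.
SAMPLE = """\
version: 8.4.0 (api:1/proto:86-100)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 1: cs:ProtocolError ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
"""

if __name__ == "__main__":
    # On a live node, read the real file instead:
    #   with open("/proc/drbd") as f: text = f.read()
    print(not_connected(connection_states(SAMPLE)))
```

Logging the result with a timestamp every few seconds would make it easier to
line up the ProtocolError transition with the dmesg output.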

It is repeatable, and when it happens the network is not busy, the machine
load is nearly 0, and all other network connectivity is normal.

Googling shows that many users have the same problem, and one suggested
lowering the resync and sync rates. I did that (for 100Mbit ethernet I set
resync to 3M and the syncer rate to 40M; I set up two volumes). The problem
persists.
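
As a sanity check on those numbers, the arithmetic can be sketched as below.
This is only a rough estimate under stated assumptions: the 30% figure is the
rule of thumb from the DRBD User's Guide for the resync rate, the function
names are hypothetical, and disk throughput and protocol overhead are ignored.
A 100 Mbit/s link carries at most about 12.5 MByte/s, so a 3M resync rate is
in the expected range, while a 40M rate is more than the link itself can
deliver.

```python
def link_capacity_mbytes(link_mbit):
    """Convert a nominal link speed in Mbit/s to MByte/s (8 bits per byte)."""
    return link_mbit / 8.0


def suggested_resync_rate(link_mbit, fraction=0.3):
    """Rule-of-thumb resync rate in MByte/s.

    Assumption: about 30% of the available replication bandwidth,
    ignoring disk throughput and protocol overhead.
    """
    return link_capacity_mbytes(link_mbit) * fraction


if __name__ == "__main__":
    capacity = link_capacity_mbytes(100)
    print("link capacity  : %.2f MByte/s" % capacity)
    print("suggested rate : %.2f MByte/s" % suggested_resync_rate(100))
    # Compare the two configured rates against the raw link capacity.
    for configured in (3, 40):
        verdict = "within" if configured <= capacity else "above"
        print("%2dM is %s the link capacity" % (configured, verdict))
```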

Here is the short description of the system:

* CentOS 6 x86_64
* Kernel: vanilla kernel with vserver patch vs2. - compiled with HZ = 100
and SMP for 32 CPUs
* DRBD compiled from source, version 8.4.0 (including kernel module)
* DRBD built on top of LVM; here is the config:

resource r0 {

        on cosmos {
                volume 0 {
                        #device minor 0;
                        device /dev/drbd0;
                        meta-disk internal;
                        disk /dev/vs-resource1/mysqldata;
                }
                volume 1 {
                        device /dev/drbd1;
                        meta-disk internal;
                        disk /dev/vs-resource1/pgsqldata;
                }
        }

        on seaspray {
                volume 0 {
                        #device minor 0;
                        device /dev/drbd0;
                        meta-disk internal;
                        disk /dev/vg_seaspray/mysqldata;
                }
                volume 1 {
                        device /dev/drbd1;
                        meta-disk internal;
                        disk /dev/vg_seaspray/pgsqldata;
                }
        }

        startup {
                #become-primary-on both;
        }

        net {
                protocol C;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                #cram-hmac-alg sha1;
                #shared-secret "FooFunFactory";
        }
}



* DRBD runs in Primary/Secondary mode for now. The device is mounted into a
vserver instance, and mysql and postgres run from the vserver.
* iptables is set up to allow DRBD traffic; the problem happens even when
iptables is off.

* Network route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
                *                               U     0      0        0 eth0
                *                               U     0      0        0 eth1.503
                *                               U     0      0        0 dummy0
                *                               U     0      0        0
link-local      *                               U     1002   0        0 eth0
link-local      *                               U     1003   0        0 eth1
link-local      *                               U     1004   0        0
default                                         UG    0      0        0 eth0

I attach the dmesg here as well in case it helps with debugging. I would
like to get this fixed, so please help.

Many thanks,

Steve Kieu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg-drbd-error.gz
Type: application/x-gzip
Size: 22562 bytes
Desc: not available
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20111116/c649e398/attachment.bin>
