[DRBD-user] DRBD7/8 + kernel 2.6.22.1 + load + Primary power off = weird problem

Thu Jul 26 10:32:44 CEST 2007

Hello!

I'm testing behavior in different fail-over scenarios and ran into
a very weird problem.

I have two servers running kernel 2.6.22.1 built from source, with
the DRBD module built via make patch-kernel, tested with both
DRBD-8.0.4 and DRBD-0.7.24. The DRBD configuration is identical on
both nodes and follows below. Both nodes are up and running after
reboot, and the disks are in Secondary/Secondary and
UpToDate/UpToDate state. Then I do the following:

node1# drbdadm primary r0
node1# mount /dev/drbd0 /mnt
node1# dd if=/dev/urandom of=/mnt/randomfile bs=1M count=2048

In the middle of the write I power this node off and keep it off. Then,
on the second node, I run:

node2# drbdadm primary r0
(It works fine because the state was
cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown)
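
For anyone scripting the same promotion step, here is a minimal sketch
of how that state line could be checked first. It only parses a line in
the cs:/st:/ds: format quoted above; drbd_field and local_disk_uptodate
are hypothetical helper names, not part of the DRBD tools.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field (cs, st or ds) from a
# status line in the format quoted above.
drbd_field() {
    # $1 = field name, $2 = status line (split on whitespace on purpose)
    for token in $2; do
        case "$token" in
            "$1":*) printf '%s\n' "${token#*:}"; return 0 ;;
        esac
    done
    return 1
}

# Hypothetical check: the local disk state is the part of ds: before the slash.
local_disk_uptodate() {
    ds=$(drbd_field ds "$1") || return 1
    [ "${ds%%/*}" = "UpToDate" ]
}

line="cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown"
if local_disk_uptodate "$line"; then
    echo "local disk is UpToDate, promotion should be allowed"
fi
```

On a live node the line would come from /proc/drbd rather than from a
fixed string.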

And here I hit the problem. fsck shows:
node2# fsck.ext3  /dev/drbd0
e2fsck 1.40-WIP (14-Nov-2006)
/dev/drbd0: recovering journal

and it freezes (no disk activity). kill -9 doesn't work, reboot
doesn't work, and any attempt to run "sync" hangs as well. If I run
the command

dd if=/dev/drbd0 of=/dev/null bs=1M

before the fsck, it reads the whole disk without problems. But if I run
it at the same time as fsck, it freezes somewhere in the middle of the
process (I was able to read at least the first GB of the disk).
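
A process that ignores kill -9 is usually stuck in uninterruptible
sleep (state D), so listing those processes can help when reproducing
this. The sketch below is only an illustration: filter_dstate is a
hypothetical helper, procps ps is assumed, and the sample rows are made
up so the filter can be tried on a healthy machine.

```shell
#!/bin/sh
# Hypothetical helper: keep the header row plus every row whose STAT column
# starts with D (uninterruptible sleep). Expects "PID STAT WCHAN COMMAND".
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Live usage (procps ps assumed):
#   ps -eo pid,stat,wchan:32,comm | filter_dstate

# Canned sample for illustration only; PIDs and wait channels are invented.
sample='  PID STAT WCHAN                            COMMAND
 2301 Ss   -                                sshd
 2417 D+   example_wchan                    fsck.ext3
 2440 D    example_wchan                    sync'

printf '%s\n' "$sample" | filter_dstate
```

The WCHAN column of a real run shows where in the kernel each stuck
process is waiting.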

Then I reset the server and repeat the same steps on the second
node. The result is the same. After a reset I run fsck on the low-level
disk sdb1 and it works fine, without any delays. But if I then mount
the file system through the drbd0 device, at some point disk operations
get stuck again (I suspect when they touch a particular area of the disk).

I'm able to reproduce the problem easily. Without any load during the
"crash of a Primary", failover works as expected. I use matching
versions of the module and tools, of course. I'm going to try the same
with another kernel version, such as 2.6.18.8.

What am I doing wrong?

#------------------------------------------------
resource r0 {
    protocol               C;
    on node2 {
        device           /dev/drbd0;
        disk             /dev/sdb1;
        address          192.168.0.212:7788;
        meta-disk        /dev/sdb2 [0];
    }
    on node1 {
        device           /dev/drbd0;
        disk             /dev/sdb1;
        address          192.168.0.211:7788;
        meta-disk        /dev/sdb2 [0];
    }
    net {
        sndbuf-size       1m;
        ko-count          16;
        cram-hmac-alg    sha1;
        shared-secret    testtest;
        after-sb-0pri    discard-older-primary;
        after-sb-1pri    violently-as0p;
        after-sb-2pri    violently-as0p;
        rr-conflict      violently;
    }
    disk {
        on-io-error      pass_on;
    }
    syncer {
        rate             500M;
        al-extents       103;
    }
    startup {
        wfc-timeout        5;
        degr-wfc-timeout   5;
    }
    handlers {
        pri-on-incon-degr "echo DRBD pri-on-incon-degr | wall";
        pri-lost-after-sb "echo DRBD pri-lost-after-sb | wall";
        local-io-error   "echo DRBD local-io-error | wall";
    }
}
#------------------------------------------------

-- 
Igor