[DRBD-user] Dual-primary split-brain recovery after reboot

Pete Ashdown pashdown at xmission.com
Fri Jun 24 21:03:35 CEST 2011



On 06/24/2011 12:43 PM, Lars Ellenberg wrote:
>
> Once you know that, you can either fix it yourself,
> (and kindly tell the list what the problem was).
> Or come back here with that additional information,
> and ask for further assistance, and we will try to help fixing it, and
> possibly even find out why it "went wrong" in the first place.
>

Thank you Lars.  I have spent most of this week trying to get it to work,
so I am not merely looking for a "click here" solution.  I've tried
tweaking the after-sb-* settings and the init scripts, but the resource
always comes back in StandAlone (disconnected) mode.  I've also tried
forcing secondary upon shutdown, but that usually fails with the "held
open by someone" error.

Here is some additional information:

Both servers run Ubuntu 10.04 LTS with DRBD package version
2:8.3.7-1ubuntu2.1.

/etc/drbd.conf:
global { usage-count no; }
common { syncer { rate 1G; verify-alg crc32c; } }

resource r0 {
        protocol C;
        startup {
                wfc-timeout  15;
                degr-wfc-timeout 60;
                become-primary-on both;
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "secret";
                allow-two-primaries;
                after-sb-0pri discard-younger-primary;
                after-sb-1pri discard-secondary;
                after-sb-2pri call-pri-lost-after-sb;
        }
        on servera {
                device /dev/drbd0;
                disk /dev/md2;
                address 10.0.0.5:7788;
                meta-disk internal;
        }

        on serverb {
                device /dev/drbd0;
                disk /dev/md2;
                address 10.0.0.6:7788;
                meta-disk internal;
        }
}

Syslog upon reboot and during manual recovery: 
http://pastebin.com/Rq9Yyp4A  Note the "sync from peer node", which is
fine, but isn't automatically done by the init script.  I have to manually
run the following in a short time window after system reboot to recover
completely:
           drbdadm secondary all
           drbdadm -- --discard-my-data connect all
           drbdadm primary all
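
As a rough sketch only, the three commands above could be wrapped in one
shell function so the sequence stops at the first step that fails (for
example when "secondary" is refused because the device is held open).
The function name recover_split_brain and the DRBDADM override variable
are made up for this example; only the drbdadm invocations themselves
come from the commands above.

```shell
#!/bin/sh
# Sketch: run the manual recovery sequence, stopping at the first failure.
# DRBDADM and recover_split_brain are illustrative names, not part of DRBD.
DRBDADM="${DRBDADM:-drbdadm}"

recover_split_brain() {
    res="${1:-all}"    # resource name, defaults to "all"

    # Step 1: demote this node. Fails if the device is still held open.
    $DRBDADM secondary "$res" || return 1

    # Step 2: reconnect and throw away this node's changes.
    $DRBDADM -- --discard-my-data connect "$res" || return 2

    # Step 3: promote again so both nodes are primary.
    $DRBDADM primary "$res" || return 3
}
```

It would be called as "recover_split_brain all" on the node whose data
should be discarded.  The return code (1, 2 or 3) tells which step failed.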


I inserted a "cp /proc/drbd /tmp" in /etc/init.d/drbd right before the
"$DRBDADM sh-b-pri all # Become primary if configured" line and this is
what it shows:

version: 8.3.7 (api:88/proto:86-91)
GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root at serverb, 2011-06-17 12:50:42
 0: cs:Disconnecting ro:Secondary/Unknown ds:UpToDate/DUnknown C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
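
To check the connection state from a script rather than by copying the
whole file, the cs: field can be pulled out of /proc/drbd-style text.
This is a minimal sketch assuming the line layout shown above (minor
number followed by a colon, then space-separated key:value fields); the
function name drbd_cstate is made up for this example.

```shell
#!/bin/sh
# Sketch: print the connection state (cs: field) of one DRBD minor.
# Reads /proc/drbd-style text on stdin; drbd_cstate is an illustrative name.
drbd_cstate() {
    minor="${1:-0}"    # device minor number, defaults to 0
    awk -v m="$minor" '
        # The status line starts with "<minor>:" as its first field.
        $1 == (m ":") {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) {
                    sub(/^cs:/, "", $i)   # strip the "cs:" prefix
                    print $i
                    exit
                }
            }
        }'
}
```

For example, "drbd_cstate 0 < /proc/drbd" would print the state of
/dev/drbd0, and nothing at all if that minor is not listed.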

