Hi guys,
I have a pair of machines running a DRBD shared disk, and I also have
Heartbeat 2 installed. Unfortunately there's no documented example of
using hb_gui to set up a pair like this, so I'm largely winging it and
trying to avoid learning the massively complicated Heartbeat 2/CIB XML syntax!
My issue is this:
The pair is primary/primary; I'm simulating network downtime by pulling
out the crossover cable that connects them and watching the response.
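(As an aside, I can produce the same outage in software rather than touching the cabling — assuming the replication link is on a dedicated interface, eth1 in my case:

```
# Take the replication interface down to simulate link loss
# (eth1 is an assumption; substitute your crossover interface)
ip link set eth1 down

# ...watch /proc/drbd on both nodes...

# Bring it back up to simulate reconnection
ip link set eth1 up
```

The behaviour I describe below is the same either way.)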
DRBD notices the loss and switches to:
# cat /proc/drbd
version: 8.0.11 (api:86/proto:86)
GIT-hash: b3fe2bdfd3b9f7c2f923186883eb9e2a0d3a5b1b build by phil at mescal, 2008-02-12 11:56:43
0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
ns:15 nr:28684 dw:28701 dr:323 al:0 bm:14 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:14329 misses:7 starving:0 dirty:0 changed:7
act_log: used:0/257 hits:13 misses:0 starving:0 dirty:0 changed:0
^^ on both machines.
The problem is that upon reconnection of the crossover cable (and no
other intervention) both machines maintain the state:
# cat /proc/drbd
version: 8.0.11 (api:86/proto:86)
GIT-hash: b3fe2bdfd3b9f7c2f923186883eb9e2a0d3a5b1b build by phil at mescal, 2008-02-12 11:56:43
0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
ns:139 nr:28808 dw:29019 dr:262919 al:0 bm:14 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:14329 misses:7 starving:0 dirty:0 changed:7
act_log: used:0/257 hits:366 misses:0 starving:0 dirty:0 changed:0
i.e. they don't reconnect and resync.
Am I doing something wrong? Should DRBD do this automatically, or should
I be adding something to Heartbeat to deal with it?
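For now I can kick them back together by hand with something like the following (this is my understanding of the drbdadm 8.0 split-brain recovery procedure; "shared" is the resource name from my config, and I run the first two commands on whichever node's changes I'm willing to throw away):

```
# On the "victim" node: demote, then reconnect discarding local changes
drbdadm secondary shared
drbdadm -- --discard-my-data connect shared

# On the surviving node (also sitting in StandAlone): just reconnect
drbdadm connect shared
```

But obviously I'd rather the cluster sorted this out itself, hence the question.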
Thanks,
Henri
-----------------------------------------------------------
My config file:
#
# At most ONE global section is allowed.
# It must precede any resource section.
#
global {
usage-count yes;
}
common {
syncer { rate 60M; }
protocol C;
}
resource shared {
handlers {
# what should be done in case the node is primary, degraded
# (=no connection) and has inconsistent data.
# Action: Notify administrators, reboot the system
pri-on-incon-degr "/usr/local/bin/admin-notify 'DRBD: Node is primary and degraded (no network connection). Data is inconsistent. Rebooted.' ; /usr/$
# The node is currently primary, but lost the after-split-brain
# auto-recovery procedure. As a consequence it should go away.
# Action: Notify administrators, reboot the system
pri-lost-after-sb "/usr/local/bin/admin-notify 'DRBD: A split brain situation occurred. This node lost. Rebooted.' ; /usr/local/sbin/reboot-sane";
# In case you have set the on-io-error option to "call-local-io-error",
# this script will get executed in case of a local IO error. It is
# expected that this script will cause an immediate failover in the
# cluster.
# Action: Call a reboot-or-shutdown script. If the script was last
# called more than three times AND less than five minutes ago
local-io-error "/usr/local/bin/admin-notify 'DRBD: A local IO error occurred, rebooting.' ; /usr/local/sbin/reboot-sane";
# Commands to run in case we need to downgrade the peer's disk
# state to "Outdated". Should be implemented by the superior
# communication possibilities of our cluster manager.
# The provided script uses ssh, and is for demonstration/development
# purposes.
# outdate-peer "/usr/lib/drbd/outdate-peer.sh on amd 192.168.22.11 192.168.23.11 on alf 192.168.22.12 192.168.23.12";
#
# Update: Now there is a solution that relies on heartbeat's
# communication layers. You should really use this.
# outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
# The node is currently primary, but should become sync target
# after the negotiating phase. Alert someone about this incident.
#pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' root";
# Notify someone in case DRBD split brained.
split-brain "/usr/local/bin/admin-notify 'DRBD: A split-brain situation occurred and was resolved successfully.'";
}
startup {
# Wait for connection timeout.
# The init script blocks the boot process until the resources
# are connected. This is so when the cluster manager starts later,
# it does not see a resource with internal split-brain.
# In case you want to limit the wait time, do it here.
# Default is 0, which means unlimited. Unit is seconds.
#
# wfc-timeout 0;
# Wait for connection timeout if this node was a degraded cluster.
# In case a degraded cluster (= cluster with only one node left)
# is rebooted, this timeout value is used.
#
degr-wfc-timeout 120; # 2 minutes.
# In case you are using DRBD for GFS/OCFS2 you want the startup
# script to promote it to primary. Node names are also possible
# instead of "both".
become-primary-on both;
}
disk {
# if the lower level device reports io-error you have the choice of
# "pass_on" -> Report the io-error to the upper layers.
# Primary -> report it to the mounted file system.
# Secondary -> ignore it.
# "call-local-io-error"
#            -> Call the script configured by the name "local-io-error".
# "detach" -> The node drops its backing storage device, and
#             continues in diskless mode.
#
on-io-error detach;
}
net {
allow-two-primaries;
cram-hmac-alg "sha256";
shared-secret "secret";
after-sb-0pri discard-younger-primary;
after-sb-1pri discard-secondary;
after-sb-2pri call-pri-lost-after-sb;
rr-conflict call-pri-lost;
}
syncer {
al-extents 257;
}
on alpha {
device /dev/drbd0;
disk /dev/md4;
address 10.0.0.2:7788;
meta-disk /dev/md5[0];
# meta-disk is either 'internal' or '/dev/ice/name [idx]'
#
# You can use a single block device to store meta-data
# of multiple DRBD's.
# E.g. use meta-disk /dev/hde6[0]; and meta-disk /dev/hde6[1];
# for two different resources. In this case the meta-disk
# would need to be at least 256 MB in size.
#
# 'internal' means, that the last 128 MB of the lower device
# are used to store the meta-data.
# You must not give an index with 'internal'.
}
on beta {
device /dev/drbd0;
disk /dev/md4;
address 10.0.0.3:7788;
meta-disk /dev/md5[0];
}