[DRBD-user] Reboot either machine in the pair during active transfer and the other reboots

Henri Cook drbd at theplayboymansion.net
Sat Sep 6 19:58:09 CEST 2008


An off-list respondent suggested that I post my drbd.conf (below). I
don't have any reboot options in there (that are being used), but I may
be missing something.

Please note that reboot-sane (a custom reboot script) emails me, writes
to a file, waits 2 seconds, then reboots the system. It is NOT being
called in the situation I describe (that's what I originally thought was
happening).

#
# drbd.conf
#

skip {

}

global {
    # minor-count 64;
    # dialog-refresh 5; # 5 seconds
    usage-count yes;
}

common {
  syncer { rate 60M; }
  protocol C;
}

resource shared {

  handlers {
    pri-on-incon-degr "echo 'DRBD: Inconsistent - Rebooting.' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
    pri-lost-after-sb "echo 'DRBD: A split brain situation occured. This node lost. Rebooted' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
    local-io-error "echo 'DRBD: A local IO error occurred, rebooting.' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
    pri-lost "echo 'DRBD: Pri-lost, check log files.' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";

    # NB: The machine we're on now is torvil, the -p option here is
    # different on the other host
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5 -r shared -p dean";

    split-brain "echo 'DRBD: A split-brain situation occurred and was resolved successfully.' >> /var/log/drbd.log";
  }

  startup {
    wfc-timeout  15;

    degr-wfc-timeout 30;

    # In case you are using DRBD for GFS/OCFS2 you want that the
    # startup script promotes it to primary. Nodenames are also
    # possible instead of "both".
    become-primary-on both;
  }

  disk {
    on-io-error   detach;

    fencing resource-only;
  }

  net {

    # timeout       60;    #  6 seconds  (unit = 0.1 seconds)
    # connect-int   10;    # 10 seconds  (unit = 1 second)
    # ping-int      10;    # 10 seconds  (unit = 1 second)
    # ping-timeout   5;    # 500 ms (unit = 0.1 seconds)

    # max-buffers     2048;

    # unplug-watermark   128;

    # max-epoch-size  2048;

    ko-count 4;

    allow-two-primaries;

    cram-hmac-alg "sha256";
    shared-secret "w405FDS^%tngpDSFg^";

    after-sb-0pri discard-older-primary;
    after-sb-1pri discard-secondary;
    after-sb-2pri call-pri-lost-after-sb;

    rr-conflict call-pri-lost;

  }

  syncer {
    al-extents 257;
  }

  on torvil {
    device     /dev/drbd0;
    disk       /dev/md4;
    address    10.0.0.2:7788;
    meta-disk  /dev/md5[0];
  }

  on dean {
    device    /dev/drbd0;
    disk      /dev/md4;
    address   10.0.0.3:7788;
    meta-disk /dev/md5[0];
  }
}
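
One way to double-check that none of the handlers above is firing would
be to swap them, temporarily and on a test pair only, for log-only
versions. The sketch below reuses the handler names and the
/var/log/drbd.log path from the config above; the "handler fired"
message text is made up for illustration, and removing reboot-sane from
these handlers changes the failure behaviour, so this is a diagnostic
aid rather than a recommended setup. The outdate-peer line is left as it
was, since DRBD interprets that handler's exit code.

```
  handlers {
    # Diagnostic sketch only: each handler logs a distinct tag and does
    # NOT call reboot-sane. The message text is illustrative.
    pri-on-incon-degr "echo 'DRBD handler fired: pri-on-incon-degr' >> /var/log/drbd.log";
    pri-lost-after-sb "echo 'DRBD handler fired: pri-lost-after-sb' >> /var/log/drbd.log";
    local-io-error "echo 'DRBD handler fired: local-io-error' >> /var/log/drbd.log";
    pri-lost "echo 'DRBD handler fired: pri-lost' >> /var/log/drbd.log";

    # Unchanged from the original config.
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5 -r shared -p dean";

    split-brain "echo 'DRBD handler fired: split-brain' >> /var/log/drbd.log";
  }
```

If A still reboots with these in place and nothing new appears in
/var/log/drbd.log, the reboot is presumably coming from somewhere other
than these handlers.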



Henri Cook wrote:
> Please, can anyone help? This is severely affecting my setup
>
> If I start, say, an FTP file transfer to my drbd /shared directory on
> node A, then reboot node B (the other machine in the Primary-Primary
> configuration), DRBD on node A registers a NetworkFailure, which
> appears to trigger a reboot action. I can't find anywhere to define
> this behaviour, and I'd very much like to stop the reboot happening.
>
> So to confirm the behaviour: during a transfer onto /shared on A, if I
> reboot B, then as soon as A loses the connection to B, A reboots too,
> crippling the pair.
>
> Henri
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>   



