[DRBD-user] Failover Behavior in Server-Crash Scenario

Robinson, Eric eric.robinson at psmnv.com
Fri Dec 7 00:53:47 CET 2012


> >> Any concurrent log entries in your kernel log, from the drbd0 device?
> >>
> > 
> > 
> > In fact, there are...
> > 
> > Dec  6 13:51:17 ha09a kernel: d-con ha02_mysql: conn( Unconnected -> WFConnection )
> > Dec  6 13:51:19 ha09a root: drbd SA notify
> > Dec  6 13:51:19 ha09a crm_node[25546]:   notice: crm_add_logfile: Additional logging available in /var/log/corosync.log
> > Dec  6 13:51:19 ha09a crm_attribute[25547]:   notice: crm_add_logfile: Additional logging available in /var/log/corosync.log
> > Dec  6 13:51:20 ha09a root: drbd SA notify
> > Dec  6 13:51:20 ha09a crm_node[25577]:   notice: crm_add_logfile: Additional logging available in /var/log/corosync.log
> > Dec  6 13:51:20 ha09a crm_attribute[25578]:   notice: crm_add_logfile: Additional logging available in /var/log/corosync.log
> > Dec  6 13:51:21 ha09a crmd[3066]:   notice: process_lrm_event: LRM operation p_drbd0_notify_0 (call=500, rc=0, cib-update=0, confirmed=true) ok
> > Dec  6 13:51:21 ha09a crmd[3066]:   notice: process_lrm_event: LRM operation p_drbd1_notify_0 (call=502, rc=0, cib-update=0, confirmed=true) ok
> > Dec  6 13:51:22 ha09a root: drbd SA notify
> > Dec  6 13:51:23 ha09a root: drbd SA notify
> > Dec  6 13:51:24 ha09a crmd[3066]:   notice: process_lrm_event: LRM operation p_drbd0_notify_0 (call=506, rc=0, cib-update=0, confirmed=true) ok
> > Dec  6 13:51:24 ha09a crmd[3066]:   notice: process_lrm_event: LRM operation p_drbd1_notify_0 (call=508, rc=0, cib-update=0, confirmed=true) ok
> > Dec  6 13:51:25 ha09a root: drbd SA promote
> > Dec  6 13:51:25 ha09a kernel: d-con ha01_mysql: helper command: /sbin/drbdadm fence-peer ha01_mysql
> > Dec  6 13:51:25 ha09a kernel: d-con ha01_mysql: helper command: /sbin/drbdadm fence-peer ha01_mysql exit code 127 (0x7f00)
> > Dec  6 13:51:25 ha09a kernel: d-con ha01_mysql: fence-peer helper broken, returned 127
> 
> Your DRBD refuses to promote because it's unable to get a 
> meaningful response from the fence-peer handler. That in turn 
> is because it's failing with a "command not found" error. 
> (Try typing "foobarblatch; echo $?" in a shell.) Check your 
> "fence-peer" setting in the handlers section of your DRBD 
> config, and see whether it points to a non-existing script. 
> If that script does exist, examine whether it _invokes_ 
> something that doesn't.
> 
> Cheers,
> Florian
> 


It turns out that the fence-peer handler script does not exist. This is certainly because I copied the drbd.conf file from a previous cluster running drbd 8.3.12.
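
For anyone who wants to run the same check, here is a rough sketch of how the handler paths in a config file could be verified. The function name check_handlers is my own invention, not part of DRBD, and it assumes the handler scripts are referenced by absolute paths ending in ".sh"; anything else in a handler string is ignored. The last line simply reproduces the "command not found" exit status that Florian described.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DRBD): report every *.sh path
# referenced in a config file that is missing or not executable.
# Assumption: handler scripts are given as absolute paths ending in ".sh".
check_handlers() {
    conf="$1"
    # Drop commented-out lines, then extract each absolute *.sh path.
    grep -v '^[[:space:]]*#' "$conf" \
        | grep -o '/[^ ";]*\.sh' \
        | sort -u \
        | while read -r script; do
              [ -x "$script" ] || echo "missing or not executable: $script"
          done
}

# A shell reports exit status 127 when a command cannot be found,
# which is the code the fence-peer helper returned in the log above.
sh -c 'foobarblatch' 2>/dev/null; echo "exit code: $?"
```

Calling it as check_handlers /etc/drbd.conf (adjust the path to wherever your config lives) prints one line per script that cannot be executed, and prints nothing if every referenced script is present.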

I am now sure that there are other problems in the config file waiting to bite me. Following is what my drbd.conf file looks like. Please tell me if you see anywhere ELSE that I have shot myself in the foot.



# drbd.conf

global {
    usage-count no;
}

common {
  syncer {
    verify-alg sha1;
    rate 30M;
    al-extents 3389;
  }
}

resource ha01_mysql {
  protocol C;
  handlers {
    pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    # pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    # pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    # local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    # outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
    # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    #pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' root";
    # split-brain "echo split-brain. drbdadm -- --discard-my-data connect $DRBD_RESOURCE ? | mail -s 'DRBD Alert' admin at pmcipa.com";
    #out-of-sync "echo out-of-sync. drbdadm down $DRBD_RESOURCE. drbdadm ::::0 set-gi $DRBD_RESOURCE. drbdadm up $DRBD_RESOURCE. | mail -s 'DRBD Alert' root";
  }

  startup {
    wfc-timeout  0;          # infinite
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error   detach;
    fencing resource-only;
  }
  net {
    cram-hmac-alg "sha1";
    shared-secret "removed";
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  on ha09a {
    device     /dev/drbd0;
    disk       /dev/vg00/lv00;
    address    198.51.100.58:7788;
    meta-disk  internal;
  }
  on ha09b {
    device     /dev/drbd0;
    disk       /dev/vg00/lv00;
    address   198.51.100.59:7788;
    meta-disk  internal;
  }
}

resource ha02_mysql {
  protocol C;
  handlers {
    pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    # pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    # pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    # local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    # outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
    # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    #pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' root";
    # split-brain "echo split-brain. drbdadm -- --discard-my-data connect $DRBD_RESOURCE ? | mail -s 'DRBD Alert' admin at pmcipa.com";
    #out-of-sync "echo out-of-sync. drbdadm down $DRBD_RESOURCE. drbdadm ::::0 set-gi $DRBD_RESOURCE. drbdadm up $DRBD_RESOURCE. | mail -s 'DRBD Alert' root";
  }

  startup {
    wfc-timeout  0;          # infinite
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error   detach;
    fencing resource-only;
  }
  net {
    cram-hmac-alg "sha1";
    shared-secret "removed";
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  on ha09a {
    device     /dev/drbd1;
    disk       /dev/vg00/lv01;
    address    198.51.100.58:7789;
    meta-disk  internal;
  }
  on ha09b {
    device     /dev/drbd1;
    disk       /dev/vg00/lv01;
    address   198.51.100.59:7789;
    meta-disk  internal;
  }
}





