Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Below are my configs. The issue I'm experiencing is that when stonith reboots the primary, the secondary doesn't get promoted. I thought the handlers in drbd.conf were supposed to "handle" that. Does anybody know what I'm missing? I've been looking at the logs, but nothing stands out to me. The logging is pretty verbose; perhaps I could make it a little less verbose, but I don't know where those options are.

I had this working with this identical configuration on the same nodes, but with simple hostnames. To simulate a real-world change, I switched to FQDN-style hostnames, both on the nodes and in the configuration files below. This is a pair of "appliances" that run our proprietary software, and FQDNs are part of our requirements, so I, or someone I train, will have to configure this for each pair we deliver. The old HA setup was much simpler, but we have to keep moving forward.

/etc/drbd.conf

global {
    usage-count no;
}

common {
    protocol C;

    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm_unfence-peer.sh";
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    }

    startup {
    }

    disk {
        fencing resource-and-stonith;
        #on-io-error detach;
    }

    net {
        allow-two-primaries yes;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
}

resource mysql {
    protocol C;
    meta-disk internal;
    device /dev/drbd1;

    syncer {
        verify-alg sha1;
        rate 33M;
        csums-alg sha1;
    }

    on node1 {
        disk /dev/sda1;
        address 10.6.7.24:7789;
    }

    on awpnode2 {
        disk /dev/sda1;
        address 10.6.7.27:7789;
    }
}

/etc/corosync/corosync.conf

# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
    version: 2
    secauth: off
    threads: 0
    interface {
        member {
            memberaddr: 10.6.7.24
        }
        member {
            memberaddr: 10.6.7.27
        }
        ringnumber: 0
        bindnetaddr: 10.6.7.0
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

pcs config show

 Master/Slave Set: mysql_data_clone [mysql_data]
     Masters: [ node2 ]
     Slaves: [ node1i ]

pcs config show

Cluster Name: awpcluster
Corosync Nodes:
 node1 node2
Pacemaker Nodes:
 node1 node2

Resources:
 Master: mysql_data_clone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: mysql_data (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=mysql
   Operations: start interval=0s timeout=240 (mysql_data-start-interval-0s)
               promote interval=0s timeout=90 (mysql_data-promote-interval-0s)
               demote interval=0s timeout=90 (mysql_data-demote-interval-0s)
               stop interval=0s timeout=100 (mysql_data-stop-interval-0s)
               monitor interval=30s (mysql_data-monitor-interval-30s)

Stonith Devices:
 Resource: fence_node1_kvm (class=stonith type=fence_virsh)
  Attributes: pcmk_host_list=node1 ipaddr=10.6.7.10 action=reboot login=root passwd=password port=node1
  Operations: monitor interval=30s (fence_node1_kvm-monitor-interval-30s)
 Resource: fence_node2_kvm (class=stonith type=fence_virsh)
  Attributes: pcmk_host_list=node2 ipaddr=10.6.7.12 action=reboot login=root passwd=password port=awpnode2 delay=15
  Operations: monitor interval=30s (fence_awpnode2_kvm-monitor-interval-30s)
Fencing Levels:

Location Constraints:
  Resource: mysql_data_clone
    Constraint: drbd-fence-by-handler-mysql-mysql_data_clone
      Rule: score=-INFINITY role=Master (id:drbd-fence-by-handler-mysql-rule-mysql_data_clone)
        Expression: #uname ne cleardata-awpnode2.awarepoint.com (id:drbd-fence-by-handler-mysql-expr-mysql_data_clone)
Ordering Constraints:
Colocation Constraints:

Resources Defaults:
 resource-stickiness: 200
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.14-8.el6_8.1-70404b0
 have-watchdog: false
 last-lrm-refresh: 1472768147
 no-quorum-policy: ignore
 stonith-enabled: true
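For what it's worth, here is a minimal sketch of how the DRBD and Pacemaker state could be inspected on the surviving node after a fencing event, assuming the stock drbdadm, crm_mon and pcs tools (the resource name mysql comes from the drbd.conf above; the commands themselves are generic and not specific to this setup):

    # DRBD role and connection state on the survivor
    cat /proc/drbd
    drbdadm role mysql        # usually Secondary/Unknown while the peer is down

    # Did crm-fence-peer.sh leave a drbd-fence-by-handler-* location
    # constraint behind? A stale one will block promotion.
    pcs constraint --full

    # Overall cluster state, including failed resource actions
    crm_mon -1
    pcs status

A drbd-fence-by-handler-* rule like the one shown under Location Constraints above scores -INFINITY for the Master role on every node whose #uname does not match the listed name, so as long as it is present (i.e. until crm-unfence-peer.sh removes it after resync) the other node cannot be promoted.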
Thank you.

Neil Schneider
DevOps Engineer