<div dir="ltr"><div style="font-size:12.8px"><div>Hi<br><br></div>I am having a problem with a very simple Active/Passive cluster built with Pacemaker/Corosync using DRBD.<br><br></div><span style="font-size:12.8px">This is my configuration:</span><br style="font-size:12.8px"><div style="font-size:12.8px"><div><br>Cluster Name: kamcluster<br>Corosync Nodes:<br> kam1vs3 kam2vs3 <br>Pacemaker Nodes:<br> kam1vs3 kam2vs3 <br><br>Resources: <br> Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)<br>  Attributes: ip=10.0.1.206 cidr_netmask=32 <br>  Operations: start interval=0s timeout=20s (ClusterIP-start-interval-0s)<br>              stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)<br>              monitor interval=10s (ClusterIP-monitor-interval-10s)<br> Resource: ClusterIP2 (class=ocf provider=heartbeat type=IPaddr2)<br>  Attributes: ip=10.0.1.207 cidr_netmask=32 <br>  Operations: start interval=0s timeout=20s (ClusterIP2-start-interval-0s)<br>              stop interval=0s timeout=20s (ClusterIP2-stop-interval-0s)<br>              monitor interval=10s (ClusterIP2-monitor-interval-10s)<br> Resource: rtpproxycluster (class=systemd type=rtpproxy)<br>  Operations: monitor interval=10s (rtpproxycluster-monitor-interval-10s)<br>              stop interval=0s on-fail=fence (rtpproxycluster-stop-interval-0s)<br> Resource: kamailioetcfs (class=ocf provider=heartbeat type=Filesystem)<br>  Attributes: device=/dev/drbd1 directory=/etc/kamailio fstype=ext4 <br>  Operations: start interval=0s timeout=60 (kamailioetcfs-start-interval-0s)<br>              monitor interval=10s on-fail=fence (kamailioetcfs-monitor-interval-10s)<br>              stop interval=0s on-fail=fence (kamailioetcfs-stop-interval-0s)<br> Clone: fence_kam2_xvm-clone<br>  Meta Attrs: interleave=true clone-max=2 clone-node-max=1 <br>  Resource: fence_kam2_xvm (class=stonith type=fence_xvm)<br>   Attributes: port=tegamjg_kam2 pcmk_host_list=kam2vs3 <br>   Operations: monitor interval=60s (fence_kam2_xvm-monitor-interval-60s)<br> Master: kamailioetcclone<br>  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true <br>  Resource: kamailioetc (class=ocf provider=linbit type=drbd)<br>   Attributes: drbd_resource=kamailioetc <br>   Operations: start interval=0s timeout=240 (kamailioetc-start-interval-0s)<br>               promote interval=0s timeout=90 (kamailioetc-promote-interval-0s)<br>               demote interval=0s timeout=90 (kamailioetc-demote-interval-0s)<br>               stop interval=0s timeout=100 (kamailioetc-stop-interval-0s)<br>               monitor interval=10s (kamailioetc-monitor-interval-10s)<br> Resource: kamailiocluster (class=ocf provider=heartbeat type=kamailio)<br>  Attributes: listen_address=10.0.1.206 conffile=/etc/kamailio/kamailio.cfg pidfile=/var/run/kamailio.pid monitoring_ip=10.0.1.206 monitoring_ip2=10.0.1.207 port=5060 proto=udp kamctlrc=/etc/kamailio/kamctlrc <br>  Operations: start interval=0s timeout=60 (kamailiocluster-start-interval-0s)<br>              stop interval=0s on-fail=fence (kamailiocluster-stop-interval-0s)<br>              monitor interval=5s (kamailiocluster-monitor-interval-5s)<br> Clone: fence_kam1_xvm-clone<br>  Meta Attrs: interleave=true clone-max=2 clone-node-max=1 <br>  Resource: fence_kam1_xvm (class=stonith type=fence_xvm)<br>   Attributes: port=tegamjg_kam1 pcmk_host_list=kam1vs3 <br>   Operations: monitor interval=60s (fence_kam1_xvm-monitor-interval-60s)<br><br>Stonith Devices: <br>Fencing Levels: <br><br>Location 
Location Constraints:
  Resource: kamailiocluster
    Enabled on: kam1vs3 (score:INFINITY) (role: Started) (id:cli-prefer-kamailiocluster)
Ordering Constraints:
  start ClusterIP then start ClusterIP2 (kind:Mandatory) (id:order-ClusterIP-ClusterIP2-mandatory)
  start ClusterIP2 then start rtpproxycluster (kind:Mandatory) (id:order-ClusterIP2-rtpproxycluster-mandatory)
  start fence_kam2_xvm-clone then promote kamailioetcclone (kind:Mandatory) (id:order-fence_kam2_xvm-clone-kamailioetcclone-mandatory)
  promote kamailioetcclone then start kamailioetcfs (kind:Mandatory) (id:order-kamailioetcclone-kamailioetcfs-mandatory)
  start kamailioetcfs then start ClusterIP (kind:Mandatory) (id:order-kamailioetcfs-ClusterIP-mandatory)
  start rtpproxycluster then start kamailiocluster (kind:Mandatory) (id:order-rtpproxycluster-kamailiocluster-mandatory)
  start fence_kam1_xvm-clone then start fence_kam2_xvm-clone (kind:Mandatory) (id:order-fence_kam1_xvm-clone-fence_kam2_xvm-clone-mandatory)
Colocation Constraints:
  rtpproxycluster with ClusterIP2 (score:INFINITY) (id:colocation-rtpproxycluster-ClusterIP2-INFINITY)
  ClusterIP2 with ClusterIP (score:INFINITY) (id:colocation-ClusterIP2-ClusterIP-INFINITY)
  ClusterIP with kamailioetcfs (score:INFINITY) (id:colocation-ClusterIP-kamailioetcfs-INFINITY)
  kamailioetcfs with kamailioetcclone (score:INFINITY) (with-rsc-role:Master) (id:colocation-kamailioetcfs-kamailioetcclone-INFINITY)
  kamailioetcclone with fence_kam2_xvm-clone (score:INFINITY) (id:colocation-kamailioetcclone-fence_kam2_xvm-clone-INFINITY)
  kamailiocluster with rtpproxycluster (score:INFINITY) (id:colocation-kamailiocluster-rtpproxycluster-INFINITY)
  fence_kam2_xvm-clone with fence_kam1_xvm-clone (score:INFINITY) (id:colocation-fence_kam2_xvm-clone-fence_kam1_xvm-clone-INFINITY)

Resources Defaults:
 migration-threshold: 2
 failure-timeout: 10m
 resource-stickiness: 200
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: kamcluster
 dc-version: 1.1.13-10.el7_2.2-44eb2dd
 have-watchdog: false
 last-lrm-refresh: 1469123600
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-action: reboot
 stonith-enabled: true

The problem is that when only one node is online in Corosync and I start the other node so it rejoins the cluster, all my resources restart, and sometimes they even migrate to the rejoining node (starting with the DRBD promotion changing which node is Master and which is Slave), even though the first node is perfectly healthy and I have resource-stickiness=200 set as a default for every resource in the cluster.

I believe it has something to do with the promotion ordering constraint used for DRBD; a sketch of how I have been looking at it is below.
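In case it helps, this is roughly how I have been inspecting what the policy engine intends to do when the second node comes back (crm_simulate and pcs are the standard pacemaker/pcs tools; the stickiness value in the last command is just a number I picked for testing, not something I know to be the right fix):

    # show current placement scores on the surviving node
    crm_simulate -sL

    # simulate kam2vs3 coming back online and show the resulting transition,
    # i.e. which resources the cluster would restart or move
    crm_simulate -sL --node-up kam2vs3

    # experiment only: give the DRBD master/slave clone explicit stickiness
    # so the current Master is preferred over a freshly rejoined node
    # (1000 is an arbitrary test value)
    pcs resource meta kamailioetcclone resource-stickiness=1000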
Thank you very much in advance.

Regards,

Alejandro