<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Times New Roman; font-size: 12pt; color: #000000'><font size="3">Hi Felix,</font><div style="color: rgb(0, 0, 0); font-family: 'Times New Roman'; font-size: 12pt; "><br></div><div style="color: rgb(0, 0, 0); font-family: 'Times New Roman'; font-size: 12pt; ">Thanks for the suggestion. I have removed all of the target-* parameters, but the behavior is still present when failing over the DRBD device. I did notice the following lines in the log when attempting to fail over:</div><div id=""><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 lrmd: [1440]: info: RA output: (p_drbd_mount2:1:notify:stdout) drbdsetup 1 syncer --set-defaults --create-device --rate=1000M </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 crmd: [1443]: info: send_direct_ack: ACK'ing resource op p_drbd_mount2:1_notify_0 from 120:428:0:f84ff0aa-9a17-4b66-954d-8c3011a3441e: lrm_invoke-lrmd-1327957875-58</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 crmd: [1443]: info: process_lrm_event: LRM operation p_drbd_mount2:1_notify_0 (call=79, rc=0, cib-update=0, confirmed=true) ok</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.519415] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Outdated -> Inconsistent ) </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.519441] block drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.792404] block drbd1: conn( WFBitMapT -> WFSyncUUID ) </font></div><div><font 
face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.806756] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.809927] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.809934] block drbd1: conn( WFSyncUUID -> SyncTarget ) </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:15 node2 kernel: [ 1075.809942] block drbd1: Began resync as SyncTarget (will sync 838514728 KB [209628682 bits set]).</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 node2 kernel: [ 1076.060191] block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 node2 kernel: [ 1076.060200] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 node2 crmd: [1443]: info: do_lrm_rsc_op: Performing key=135:429:0:f84ff0aa-9a17-4b66-954d-8c3011a3441e op=p_drbd_mount2:1_notify_0 )</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 node2 lrmd: [1440]: info: rsc:p_drbd_mount2:1 notify[81] (pid 6052)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 node2 lrmd: [1440]: info: operation notify[81] on p_drbd_mount2:1 for client 1443: pid 6052 exited with return code 0</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 
node2 crmd: [1443]: info: send_direct_ack: ACK'ing resource op p_drbd_mount2:1_notify_0 from 135:429:0:f84ff0aa-9a17-4b66-954d-8c3011a3441e: lrm_invoke-lrmd-1327957876-59</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:16 node2 crmd: [1443]: info: process_lrm_event: LRM operation p_drbd_mount2:1_notify_0 (call=81, rc=0, cib-update=0, confirmed=true) ok</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:24 node2 attrd: [1442]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-p_drbd_mount2:1 (10)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:24 node2 attrd: [1442]: notice: attrd_perform_update: Sent update 217: master-p_drbd_mount2:1=10</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:38 node2 kernel: [ 1097.979353] block drbd0: role( Secondary -> Primary ) </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1"><br></font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 lrmd: [1440]: info: RA output: (p_drbd_mount1:0:monitor:stderr) lock on /var/lock/drbd-147-0 currently held by pid:6164</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1"><br></font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 mount2_attribute: [6214]: info: Invoked: mount2_attribute -N node2 -n master-p_drbd_mount1:0 -l reboot -D </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 attrd: [1442]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-p_drbd_mount1:0 (&lt;null&gt;)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 
15:11:55 node2 attrd: [1442]: notice: attrd_perform_update: Sent delete 221: node=039e53da-dce8-4fd7-84bc-7261682529e8, attr=master-p_drbd_mount1:0, id=&lt;n/a&gt;, set=(null), section=status</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 crmd: [1443]: info: process_lrm_event: LRM operation p_drbd_mount1:0_monitor_30000 (call=74, rc=7, cib-update=114, confirmed=false) not running</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 attrd: [1442]: notice: attrd_perform_update: Sent delete 223: node=039e53da-dce8-4fd7-84bc-7261682529e8, attr=master-p_drbd_mount1:0, id=&lt;n/a&gt;, set=(null), section=status</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 kernel: [ 1115.843923] block drbd1: role( Secondary -> Primary ) </font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:55 node2 crmd: [1443]: info: process_lrm_event: LRM operation p_drbd_mount2:1_monitor_30000 (call=24, rc=8, cib-update=115, confirmed=false) master</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 attrd: [1442]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-p_drbd_mount1:0 (1)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 crmd: [1443]: info: do_lrm_rsc_op: Performing key=108:434:0:f84ff0aa-9a17-4b66-954d-8c3011a3441e op=p_drbd_mount1:0_notify_0 )</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 lrmd: [1440]: info: rsc:p_drbd_mount1:0 notify[82] (pid 6228)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 attrd: [1442]: notice: attrd_perform_update: Sent update 226: 
fail-count-p_drbd_mount1:0=1</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 attrd: [1442]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-p_drbd_mount1:0 (1327957916)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 attrd: [1442]: notice: attrd_perform_update: Sent update 228: last-failure-p_drbd_mount1:0=1327957916</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 lrmd: [1440]: info: operation notify[82] on p_drbd_mount1:0 for client 1443: pid 6228 exited with return code 0</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 crmd: [1443]: info: send_direct_ack: ACK'ing resource op p_drbd_mount1:0_notify_0 from 108:434:0:f84ff0aa-9a17-4b66-954d-8c3011a3441e: lrm_invoke-lrmd-1327957916-60</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:56 node2 crmd: [1443]: info: process_lrm_event: LRM operation p_drbd_mount1:0_notify_0 (call=82, rc=0, cib-update=0, confirmed=true) ok</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:57 node2 attrd: [1442]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-p_drbd_mount2:1 (1)</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1" id="">Jan 30 15:11:57 node2 attrd: [1442]: notice: attrd_perform_update: Sent update 231: fail-count-p_drbd_mount2:1=1</font></div><div><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:57 node2 attrd: [1442]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-p_drbd_mount2:1 (1327957916)</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" 
size="1">Jan 30 15:11:57 node2 attrd: [1442]: notice: attrd_perform_update: Sent update 234: last-failure-p_drbd_mount2:1=1327957916</font></div><div id=""><br></div><div id="">I was unable to trace process 6164 (the one holding the lock), as it had already terminated by the time I looked for it. The node simply stays slave, even though it is the only online node. Cleaning up the ms_drbd_mount1 resource does not resolve the situation; however, I could set the DRBD device to primary manually using drbdadm primary mount1. After doing so, I could run cleanup ms_drbd_mount1, and the cluster resumed bringing up the rest of the resources successfully. </div><div id=""><br></div><div id="">I also noticed this in the log:</div><div id=""><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:59 node2 kernel: [ 1119.882808] block drbd0: role( Primary -> Secondary ) </font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:59 node2 lrmd: [1440]: info: RA output: (p_drbd_mount1:0:stop:stdout) </font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:59 node2 kernel: [ 1119.960543] block drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) </font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:59 node2 kernel: [ 1119.960589] block drbd0: short read expecting header on sock: r=-512</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:59 node2 kernel: [ 1119.968719] block drbd0: asender terminated</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:11:59 node2 kernel: [ 1119.968768] block drbd0: Terminating asender thread</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 
kernel: [ 1119.969386] block drbd0: Connection closed</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1" id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown">Jan 30 15:12:00 node2 kernel: [ 1119.969421] block drbd0: conn( Disconnecting -> StandAlone ) </font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 kernel: [ 1119.969467] block drbd0: receiver terminated</font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 kernel: [ 1119.969470] block drbd0: Terminating receiver thread</font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 kernel: [ 1119.969634] block drbd0: disk( UpToDate -> Diskless ) </font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 kernel: [ 1119.970702] block drbd0: drbd_bm_resize called with capacity == 0</font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 kernel: [ 1119.971970] block drbd0: worker terminated</font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 kernel: [ 1119.971973] block drbd0: Terminating worker thread</font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Jan 30 15:12:00 node2 lrmd: [1440]: info: RA output: (p_drbd_mount1:0:stop:stdout) </font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, 
sans-serif" size="1">Jan 30 15:12:01 node2 mount2_attribute: [6433]: info: Invoked: mount2_attribute -N node2 -n master-p_drbd_mount1:0 -l reboot -D </font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><br></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown">I'm not sure whether the "short read expecting header on sock" error is related to Pacemaker's failure to promote the resource, but it doesn't stop the drbdadm utility from promoting it manually. Is there a Pacemaker utility that gives more verbose information on why a resource fails to start?</div></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><br></div><div id="">What does the digit after the resource name indicate, e.g. the :1 in p_drbd_mount2:1 and p_drbd_mount2:1_stop_0 below?</div><div id=""><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1"> Master/Slave Set: ms_drbd_mount2 [p_drbd_mount2]</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1"> p_drbd_mount2:1 (ocf::linbit:drbd): Master node2 (unmanaged) FAILED</font></div><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1"> Slaves: [ node1 ]</font></div></div><div id=""><div id=""><font face="'courier new', courier, monaco, monospace, sans-serif" size="1">Failed actions:</font></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><font face="'courier new', courier, monaco, monospace, sans-serif" size="1" id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"> p_drbd_mount2:1_stop_0 (node=node2, call=94, rc=-2, status=Timed Out): unknown exec error</font></div></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><br></div>Thanks,</div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown"><br></div><div id="aeaoofnhgocdbnbeljkmbjdmhbcokfdb-mousedown">Andrew<br><hr id="zwchr" style="color: rgb(0, 0, 0); font-family: 'Times New Roman'; font-size: 12pt; "><div style="color: rgb(0, 0, 0); font-weight: normal; font-style: normal; text-decoration: none; font-family: 
Helvetica, Arial, sans-serif; font-size: 12pt; "><b>From: </b><span>"Felix Frank" &lt;<a href="mailto:ff@mpexnet.de">ff@mpexnet.de</a>&gt;</span><br><b>To: </b><span>"Andrew Martin" &lt;<a href="mailto:amartin@xes-inc.com">amartin@xes-inc.com</a>&gt;</span><br><b>Cc: </b><span>"drbd-user" &lt;<a href="mailto:drbd-user@lists.linbit.com">drbd-user@lists.linbit.com</a>&gt;</span><br><b>Sent: </b>Monday, January 30, 2012 3:07:08 AM<br><b>Subject: </b>Re: [DRBD-user] Removing DRBD Kernel Module Blocks<br><br>Hi,<br><br>On 01/29/2012 03:03 PM, Lars Ellenberg wrote:<br>>>> I don't think DRBD should attempt to become primary when<br>>>> > > you issue a stop command/<br>> And it does not. It is the *peer* that is "refusing" here,<br>> hinting to "self" that I should Outdate myself when voluntarily<br>> disconnecting from a peer in Primary role.<br><br>D'oh!<br><br>On 01/27/2012 11:19 PM, Andrew Martin wrote:<br>> However, shouldn't it have migrated already when that node went offline?<br>> How can I tell what is preventing the DRBD resource from being promoted?<br>...<br>> I've attached my configuration (as outputted by crm configure show).<br><br>Hum, difficult. 
I can't really tell the problem from what you posted.<br><br>From experience: Your config contains lots of "target-role" etc.<br>parameters (probably from issuing crm resource start / promote ...).<br>I have known some of these to prevent clean failovers under specific<br>circumstances.<br><br>You may want to try and get rid of all those target-* parameters as a<br>first try.<br><br>HTH,<br>Felix<br><br></div><br></div></div></body></html>