<div dir="ltr">Hello.<div>I have two Debian 9 servers with Corosync, Pacemaker, and DRBD configured. Everything worked well for a month.</div><div>After some server issues (including reboots), Pacemaker can no longer switch the DRBD node and logs the following errors:</div><div><br clear="all"><div>
<p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span style="font-family:'Lucida Console';font-size:9pt">Mar 16 06:25:11 [877] <a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>
lrmd: notice: operation_finished:
drbd_nfs_stop_0:3667:stderr [ 1: State change failed: (-12) Device is held open
by someone ]</span><br></p>
<p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span style="font-size:9pt;font-family:'Lucida Console'">Mar 16 06:25:11 [877]
<a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>
lrmd: notice: operation_finished: drbd_nfs_stop_0:3667:stderr
[ Command 'drbdsetup-84 secondary 1' terminated with exit code 11 ]<span></span></span></p>
<p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span style="font-size:9pt;font-family:'Lucida Console'">Mar 16 06:25:11 [877]
<a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>
lrmd: info: log_finished: finished -
rsc:drbd_nfs action:stop call_id:47 pid:3667 exit-code:1 exec-time:20002ms
queue-time:0ms<span></span></span></p>
<p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span style="font-size:9pt;font-family:'Lucida Console'">Mar 16 06:25:11 [880]
<a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>
crmd: error: process_lrm_event:
Result of stop operation for drbd_nfs on <a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>: Timed Out
| call=47 key=drbd_nfs_stop_0 timeout=20000ms<span></span></span></p>
<p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span style="font-size:9pt;font-family:'Lucida Console'">Mar 16 06:25:11 [880]
<a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>
crmd: notice: process_lrm_event:
nfs01-az-eus.tech-corps.com-drbd_nfs_stop_0:47 [ 1: State change failed: (-12)
Device is held open by someone\nCommand 'drbdsetup-84 secondary 1' terminated
with exit code 11\n1: State change failed: (-12) Device is held open by
someone\nCommand 'drbdsetup-84 secondary 1' terminated with exit code 11\n1:
State change failed: (-12) Device is held open by someone\nCommand
'drbdsetup-84 secondary 1' terminated with exit<span></span></span></p>
<br></div><div>I tried to resolve the issue with many recipes found via Google, but all attempts were unsuccessful. </div><div>I also have another two-node cluster with exactly the same configuration, and it works without any issues.</div><div><br></div><div>For now I have put the nodes into standby mode and started all services manually. </div><div>Could you please help me analyze and solve the problem?</div><div>Thanks</div><div><br></div><div>Here are my configuration files:</div><div>--- CRM CONFIG ---</div><div><div>crm configure show</div><div>node 171049224: <a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a> \</div><div> attributes standby=off</div><div>node 171049225: <a href="http://nfs02-az-eus.tech-corps.com">nfs02-az-eus.tech-corps.com</a> \</div><div> attributes standby=on</div><div>primitive drbd_nfs ocf:linbit:drbd \</div><div> params drbd_resource=nfs \</div><div> op monitor interval=29s role=Master \</div><div> op monitor interval=31s role=Slave</div><div>primitive fs_nfs Filesystem \</div><div> params device="/dev/drbd1" directory="/data" fstype=ext4 \</div><div> meta is-managed=true</div><div>primitive nfs lsb:nfs-kernel-server \</div><div> op monitor interval=5s</div><div>primitive nmbd lsb:nmbd \</div><div> op monitor interval=5s</div><div>primitive smbd lsb:smbd \</div><div> op monitor interval=5s</div><div>group NFS fs_nfs nfs nmbd smbd</div><div>ms ms_drbd_nfs drbd_nfs \</div><div> meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true</div><div>order fs-nfs-before-nfs inf: fs_nfs:start nfs:start</div><div>order fs-nfs-before-nmbd inf: fs_nfs:start nmbd:start</div><div>order fs-nfs-before-smbd inf: fs_nfs:start smbd:start</div><div>order ms-drbd-nfs-before-fs-nfs inf: ms_drbd_nfs:promote fs_nfs:start</div><div>colocation ms-drbd-nfs-with-ha inf: ms_drbd_nfs:Master NFS</div><div>order nmbd-before-smbd inf: nmbd:start smbd:start</div><div>property cib-bootstrap-options: \</div><div> have-watchdog=false 
\</div><div> dc-version=1.1.16-94ff4df \</div><div> cluster-infrastructure=corosync \</div><div> cluster-name=debian \</div><div> stonith-enabled=false \</div><div> no-quorum-policy=ignore</div><div><br></div></div><div><br></div><div><br></div><div>--- DRBD GLOBAL ---</div><div><div>cat /etc/drbd.d/global_common.conf | grep -v '#'</div><div><br></div><div>global {</div><div> usage-count no;</div><div>}</div><div><br></div><div>common {</div><div> protocol C;</div><div><br></div><div> handlers {</div><div><br></div><div> }</div><div><br></div><div> startup {</div><div> }</div><div><br></div><div> options {</div><div> }</div><div><br></div><div> disk {</div><div> }</div><div><br></div><div> net {</div><div> }</div><div>}</div></div><div><br></div><div><br></div><div>--- DRBD -RESOURCE ---</div><div><div>cat /etc/drbd.d/nfs.res | grep -v '#'</div><div>resource nfs{</div><div> meta-disk internal;</div><div> device /dev/drbd1;</div><div> syncer {</div><div> verify-alg sha1;</div><div> rate 100M;</div><div> }</div><div><br></div><div> net{</div><div> max-buffers 8000;</div><div> max-epoch-size 8000;</div><div> unplug-watermark 16;</div><div> sndbuf-size 0;</div><div> }</div><div><br></div><div> disk{</div><div> disk-barrier no;</div><div> disk-flushes no;</div><div> }</div><div><br></div><div> on <a href="http://nfs01-az-eus.tech-corps.com">nfs01-az-eus.tech-corps.com</a>{</div><div> disk /dev/sdc1;</div><div> address <a href="http://10.50.1.8:7789">10.50.1.8:7789</a>;</div><div> }</div><div><br></div><div> on <a href="http://nfs02-az-eus.tech-corps.com">nfs02-az-eus.tech-corps.com</a>{</div><div> disk /dev/sdc1;</div><div> address <a href="http://10.50.1.9:7789">10.50.1.9:7789</a>;</div><div> }</div><div>}</div></div><div><br></div><div><br></div><div><br></div><div><br></div>-- <br><div class="gmail_signature">Segey L</div>
</div></div>