<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi all,</p>
<p> I've set up a 3-node cluster (config below). Basically, nodes 1
and 2 use protocol C with <tt>resource-and-stonith</tt>
fencing. The node 1 -> 3 and 2 -> 3 connections use protocol A, and fencing
is '<tt>dont-care</tt>' (node 3 is not part of the cluster and would
only ever be promoted manually).</p>
<p> When I crash node 2 via '<tt>echo c > /proc/sysrq-trigger</tt>',
Pacemaker detects the fault and so does DRBD. DRBD invokes the
fence-handler as expected and all is good. However, I want to test
breaking just DRBD, so on node 2 I used '<tt>iptables -I INPUT -p
tcp -m tcp --dport 7788:7790 -j DROP</tt>' to interrupt DRBD
traffic. When this is done, the fence handler is not invoked. <br>
</p>
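<p> As a rough illustration of what that rule matches, here is a toy
model in Python. It is only a sketch of the match criteria in the rule
above (INPUT chain, TCP, destination port 7788 to 7790), not of how
netfilter works internally. The function name and the ephemeral port
number 51234 are made up for the example.</p>
<pre>
```python
# Toy model of: iptables -I INPUT -p tcp -m tcp --dport 7788:7790 -j DROP
# Only packets arriving at node 2 (the INPUT chain) are considered.

def dropped_on_input(proto, dport):
    """Return True if an inbound packet would match the DROP rule."""
    return proto == "tcp" and dport in range(7788, 7791)

# Node 1 opening a connection to node 2's DRBD port: destination port 7788.
print(dropped_on_input("tcp", 7788))   # True

# A reply to a connection that node 2 itself opened towards node 1.
# The destination port on node 2 is an ephemeral port (51234 is made up).
print(dropped_on_input("tcp", 51234))  # False
```
</pre>
<p> In other words, the rule as written only covers inbound packets
addressed to those ports on node 2. Whether that matters for what DRBD
did here is an assumption I have not confirmed.</p>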
<p> Details below:<br>
</p>
<p>==== [root@m3-a02n01 ~]# drbdadm dump<br>
<tt># /etc/drbd.conf</tt><tt><br>
</tt><tt>global {</tt><tt><br>
</tt><tt> usage-count yes;</tt><tt><br>
</tt><tt>}</tt><tt><br>
</tt><tt><br>
</tt><tt>common {</tt><tt><br>
</tt><tt> options {</tt><tt><br>
</tt><tt> auto-promote yes;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> net {</tt><tt><br>
</tt><tt> csums-alg md5;</tt><tt><br>
</tt><tt> data-integrity-alg md5;</tt><tt><br>
</tt><tt> allow-two-primaries no;</tt><tt><br>
</tt><tt> after-sb-0pri discard-zero-changes;</tt><tt><br>
</tt><tt> after-sb-1pri discard-secondary;</tt><tt><br>
</tt><tt> after-sb-2pri disconnect;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> disk {</tt><tt><br>
</tt><tt> disk-flushes no;</tt><tt><br>
</tt><tt> md-flushes no;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> handlers {</tt><tt><br>
</tt><tt> fence-peer /usr/sbin/fence_pacemaker;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt>}</tt><tt><br>
</tt><tt><br>
</tt><tt># resource srv01-c7_0 on m3-a02n01.alteeve.com: not
ignored, not stacked</tt><tt><br>
</tt><tt># defined at /etc/drbd.d/srv01-c7_0.res:2</tt><tt><br>
</tt><tt>resource srv01-c7_0 {</tt><tt><br>
</tt><tt> device /dev/drbd0 minor 0;</tt><tt><br>
</tt><tt> on m3-a02n01.alteeve.com {</tt><tt><br>
</tt><tt> node-id 0;</tt><tt><br>
</tt><tt> disk /dev/node01_vg0/srv01-c7;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> on m3-a02n02.alteeve.com {</tt><tt><br>
</tt><tt> node-id 1;</tt><tt><br>
</tt><tt> disk /dev/node02_vg0/srv01-c7;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> on m3-a02dr01.alteeve.com {</tt><tt><br>
</tt><tt> node-id 2;</tt><tt><br>
</tt><tt> disk /dev/dr01_vg0/srv01-c7;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> connection {</tt><tt><br>
</tt><tt> host m3-a02n01.alteeve.com
address ipv4 10.41.20.1:7788;</tt><tt><br>
</tt><tt> host m3-a02n02.alteeve.com
address ipv4 10.41.20.2:7788;</tt><tt><br>
</tt><tt> net {</tt><tt><br>
</tt><tt> protocol C;</tt><tt><br>
</tt><tt> fencing resource-and-stonith;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> connection {</tt><tt><br>
</tt><tt> host m3-a02n01.alteeve.com
address ipv4 10.41.20.1:7789;</tt><tt><br>
</tt><tt> host m3-a02dr01.alteeve.com
address ipv4 10.41.20.3:7789;</tt><tt><br>
</tt><tt> net {</tt><tt><br>
</tt><tt> protocol A;</tt><tt><br>
</tt><tt> fencing dont-care;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> connection {</tt><tt><br>
</tt><tt> host m3-a02n02.alteeve.com
address ipv4 10.41.20.2:7790;</tt><tt><br>
</tt><tt> host m3-a02dr01.alteeve.com
address ipv4 10.41.20.3:7790;</tt><tt><br>
</tt><tt> net {</tt><tt><br>
</tt><tt> protocol A;</tt><tt><br>
</tt><tt> fencing dont-care;</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt> }</tt><tt><br>
</tt><tt>}</tt><tt><br>
</tt>====<br>
</p>
<p>DRBD and Pacemaker status before the iptables break:</p>
<p>==== [root@m3-a02n01 ~]# drbdsetup status --verbose<br>
<tt>srv01-c7_0 node-id:0 role:Primary suspended:no</tt><tt><br>
</tt><tt> volume:0 minor:0 disk:UpToDate quorum:yes blocked:no</tt><tt><br>
</tt><tt> m3-a02dr01.alteeve.com node-id:2 connection:Connected
role:Secondary congested:no</tt><tt><br>
</tt><tt> volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no</tt><tt><br>
</tt><tt> m3-a02n02.alteeve.com node-id:1 connection:Connected
role:Secondary congested:no</tt><tt><br>
</tt><tt> volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no</tt><tt><br>
</tt>====<br>
</p>
<p>==== [root@m3-a02n01 ~]# pcs status<br>
<tt>Cluster name: m3-anvil-02</tt><tt><br>
</tt><tt>Stack: corosync</tt><tt><br>
</tt><tt>Current DC: m3-a02n01.alteeve.com (version
1.1.16-12.el7_4.7-94ff4df) - partition with quorum</tt><tt><br>
</tt><tt>Last updated: Sun Feb 11 06:21:21 2018</tt><tt><br>
</tt><tt>Last change: Sun Feb 11 02:35:25 2018 by root via
crm_resource on m3-a02n01.alteeve.com</tt><tt><br>
</tt><tt><br>
</tt><tt>2 nodes configured</tt><tt><br>
</tt><tt>7 resources configured</tt><tt><br>
</tt><tt><br>
</tt><tt>Online: [ m3-a02n01.alteeve.com m3-a02n02.alteeve.com ]</tt><tt><br>
</tt><tt><br>
</tt><tt>Full list of resources:</tt><tt><br>
</tt><tt><br>
</tt><tt> virsh_node1 (stonith:fence_virsh): Started
m3-a02n01.alteeve.com</tt><tt><br>
</tt><tt> virsh_node2 (stonith:fence_virsh): Started
m3-a02n02.alteeve.com</tt><tt><br>
</tt><tt> Clone Set: hypervisor-clone [hypervisor]</tt><tt><br>
</tt><tt> Started: [ m3-a02n01.alteeve.com
m3-a02n02.alteeve.com ]</tt><tt><br>
</tt><tt> Clone Set: drbd-clone [drbd]</tt><tt><br>
</tt><tt> Started: [ m3-a02n01.alteeve.com
m3-a02n02.alteeve.com ]</tt><tt><br>
</tt><tt> srv01-c7 (ocf::heartbeat:VirtualDomain): Started
m3-a02n01.alteeve.com</tt><tt><br>
</tt><tt><br>
</tt><tt>Daemon Status:</tt><tt><br>
</tt><tt> corosync: active/disabled</tt><tt><br>
</tt><tt> pacemaker: active/disabled</tt><tt><br>
</tt><tt> pcsd: active/enabled</tt><tt><br>
</tt>====</p>
<p> I then issued the iptables command on node 2. Journald output from node 1, then node 2:<br>
</p>
<p>====<br>
<tt>-- Logs begin at Sat 2018-02-10 17:51:59 GMT. --</tt><tt><br>
</tt><tt>Feb 11 06:20:18 m3-a02n01.alteeve.com crmd[2817]:
notice: State transition S_TRANSITION_ENGINE -> S_IDLE</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: PingAck did not arrive in
time.</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0: susp-io( no -> fencing)</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: conn( Connected ->
NetworkFailure ) peer( Secondary -> Unknown )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( UpToDate ->
DUnknown ) repl( Established -> Off )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: ack_receiver terminated</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: Terminating ack_recv thread</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02dr01.alteeve.com: Preparing remote state change
1400759070 (primary_nodes=1, weak_nodes=FFFFFFFFFFFFFFFA)</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02dr01.alteeve.com: Committing remote state
change 1400759070</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( DUnknown ->
Outdated )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0: new current UUID: 769A55B47EB143CD weak:
FFFFFFFFFFFFFFFA</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0: susp-io( fencing -> no)</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: Connection closed</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: conn( NetworkFailure ->
Unconnected )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: Restarting receiver thread</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: conn( Unconnected ->
Connecting )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: Handshake to peer 1
successful: Agreed network protocol version 112</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: Feature flags enabled on
protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: Starting ack_recv thread (from
drbd_r_srv01-c7 [3336])</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0: Preparing cluster-wide state change 140629015
(0->1 499/145)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0: State change 140629015: primary_nodes=1,
weak_nodes=FFFFFFFFFFFFFFF8</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0: Committing cluster-wide state change 140629015 (0ms)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: conn( Connecting ->
Connected ) peer( Unknown -> Secondary )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: drbd_sync_handshake:</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: self
769A55B47EB143CD:4CF0E17ADD9D1E0F:4161585F99D3837C:361856E4E3DE837C
bits:0 flags:120</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: peer
4CF0E17ADD9D1E0E:0000000000000000:4CF0E17ADD9D1E0E:4161585F99D3837C
bits:0 flags:120</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: uuid_compare()=2 by
rule 70</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: repl( Off ->
WFBitMapS )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: send bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression:
100.0%</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression:
100.0%</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: helper command:
/sbin/drbdadm before-resync-source</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: helper command:
/sbin/drbdadm before-resync-source exit code 0 (0x0)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( Outdated ->
Inconsistent ) repl( WFBitMapS -> SyncSource )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: Began resync as
SyncSource (will sync 0 KB [0 bits set]).</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: updated UUIDs
769A55B47EB143CD:0000000000000000:4CF0E17ADD9D1E0E:4161585F99D3837C</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: Resync done (total 1
sec; paused 0 sec; 0 K/sec)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: pdsk( Inconsistent
-> UpToDate ) repl( SyncSource -> Established )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm
unfence-peer</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n01.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm
unfence-peer exit code 0 (0x0)</tt><tt><br>
</tt>==== <br>
<tt>-- Logs begin at Sun 2018-02-11 06:18:20 GMT. --</tt><tt><br>
</tt><tt>Feb 11 06:20:30 m3-a02n02.alteeve.com sshd[1968]:
pam_unix(sshd:session): session opened for user root by (uid=0)</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: PingAck did not arrive in
time.</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: conn( Connected ->
NetworkFailure ) peer( Primary -> Unknown )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0: disk( UpToDate -> Consistent )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: pdsk( UpToDate ->
DUnknown ) repl( Established -> Off )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: ack_receiver terminated</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Terminating ack_recv thread</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0: Preparing cluster-wide state change 1400759070
(1->-1 0/0)</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0: State change 1400759070: primary_nodes=1,
weak_nodes=FFFFFFFFFFFFFFFA</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0: Committing cluster-wide state change 1400759070
(1ms)</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0: disk( Consistent -> Outdated )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Connection closed</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: conn( NetworkFailure ->
Unconnected )</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Restarting receiver thread</tt><tt><br>
</tt><tt>Feb 11 06:28:57 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: conn( Unconnected ->
Connecting )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Handshake to peer 0
successful: Agreed network protocol version 112</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Feature flags enabled on
protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Starting ack_recv thread (from
drbd_r_srv01-c7 [1885])</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Preparing remote state change
140629015 (primary_nodes=0, weak_nodes=0)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: Committing remote state change
140629015</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0 m3-a02n01.alteeve.com: conn( Connecting ->
Connected ) peer( Unknown -> Primary )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: drbd_sync_handshake:</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: self
4CF0E17ADD9D1E0E:0000000000000000:4CF0E17ADD9D1E0E:4161585F99D3837C
bits:0 flags:120</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: peer
769A55B47EB143CD:4CF0E17ADD9D1E0F:4161585F99D3837C:361856E4E3DE837C
bits:0 flags:120</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: uuid_compare()=-2 by
rule 50</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: pdsk( DUnknown ->
UpToDate ) repl( Off -> WFBitMapT )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression:
100.0%</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: send bitmap stats
[Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression:
100.0%</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: helper command:
/sbin/drbdadm before-resync-target</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: helper command:
/sbin/drbdadm before-resync-target exit code 0 (0x0)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0: disk( Outdated -> Inconsistent )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02dr01.alteeve.com: resync-susp( no ->
connection dependency )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: repl( WFBitMapT ->
SyncTarget )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: Began resync as
SyncTarget (will sync 0 KB [0 bits set]).</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: Resync done (total 1
sec; paused 0 sec; 0 K/sec)</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: updated UUIDs
769A55B47EB143CC:0000000000000000:4CF0E17ADD9D1E0E:4161585F99D3837C</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0: disk( Inconsistent -> UpToDate )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02dr01.alteeve.com: resync-susp(
connection dependency -> no )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: repl( SyncTarget ->
Established )</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: helper command:
/sbin/drbdadm after-resync-target</tt><tt><br>
</tt><tt>Feb 11 06:29:18 m3-a02n02.alteeve.com kernel: drbd
srv01-c7_0/0 drbd0 m3-a02n01.alteeve.com: helper command:
/sbin/drbdadm after-resync-target exit code 0 (0x0)</tt><tt><br>
</tt>====<br>
</p>
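<p> To double-check which handlers DRBD actually ran, the
'<tt>helper command:</tt>' lines in the journal can be listed. Below is
a minimal Python sketch, assuming the kernel messages look like the ones
above. The function name <tt>helpers_invoked</tt> is just something I
made up for this example, and the sample text is copied from the node 1
log.</p>
<pre>
```python
import re

# Matches the kernel's "helper command: /sbin/drbdadm NAME" messages.
HELPER_RE = re.compile(r"helper command: \S*drbdadm\s+([a-z-]+)")

def helpers_invoked(journal_text):
    """Return the distinct handler names DRBD ran, in first-seen order."""
    # Long lines may be wrapped, so collapse all whitespace first.
    flat = " ".join(journal_text.split())
    seen = []
    for name in HELPER_RE.findall(flat):
        if name not in seen:
            seen.append(name)
    return seen

# Sample lines taken from the node 1 journal above.
sample = """
kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm before-resync-source
kernel: drbd srv01-c7_0/0 drbd0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm before-resync-source exit code 0 (0x0)
kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm unfence-peer
kernel: drbd srv01-c7_0 m3-a02n02.alteeve.com: helper command: /sbin/drbdadm unfence-peer exit code 0 (0x0)
"""

names = helpers_invoked(sample)
print(names)
print("fence-peer" in names)
```
</pre>
<p> On the node 1 lines this lists <tt>before-resync-source</tt> and
<tt>unfence-peer</tt>, with no <tt>fence-peer</tt> entry, which matches
what I see by eye.</p>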
<p>DRBD status on both nodes after the iptables break:</p>
<p>==== [root@m3-a02n01 ~]# drbdsetup status --verbose<br>
<tt>srv01-c7_0 node-id:0 role:Primary suspended:no</tt><tt><br>
</tt><tt> volume:0 minor:0 disk:UpToDate quorum:yes blocked:no</tt><tt><br>
</tt><tt> m3-a02dr01.alteeve.com node-id:2 connection:Connected
role:Secondary congested:no</tt><tt><br>
</tt><tt> volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no</tt><tt><br>
</tt><tt> m3-a02n02.alteeve.com node-id:1 connection:Connected
role:Secondary congested:no</tt><tt><br>
</tt><tt> volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no</tt><tt><br>
</tt>==== [root@m3-a02n02 ~]# drbdsetup status --verbose<br>
<tt>srv01-c7_0 node-id:1 role:Secondary suspended:no</tt><tt><br>
</tt><tt> volume:0 minor:0 disk:UpToDate quorum:yes blocked:no</tt><tt><br>
</tt><tt> m3-a02dr01.alteeve.com node-id:2 connection:Connected
role:Secondary congested:no</tt><tt><br>
</tt><tt> volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no</tt><tt><br>
</tt><tt> m3-a02n01.alteeve.com node-id:0 connection:Connected
role:Primary congested:no</tt><tt><br>
</tt><tt> volume:0 replication:Established peer-disk:UpToDate
resync-suspended:no</tt><tt><br>
</tt>====</p>
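<p> For comparing the per-peer state across nodes without reading the
whole dump, a small Python sketch like the following could pull the
fields out. It assumes a single resource and the whitespace-separated
<tt>key:value</tt> layout shown above. <tt>peer_states</tt> is a
hypothetical name, not part of any DRBD tool.</p>
<pre>
```python
def peer_states(status_text):
    """Map each peer name to its key:value fields from drbdsetup status."""
    sections = []
    for token in status_text.split():
        if ":" not in token:
            # A bare word starts a new section: the resource, then each peer.
            sections.append((token, {}))
        elif sections:
            key, _, value = token.partition(":")
            sections[-1][1][key] = value
    # Skip the first section, which describes the local resource.
    return {name: fields for name, fields in sections[1:]}

# Sample copied from the node 1 status output above.
sample_status = """
srv01-c7_0 node-id:0 role:Primary suspended:no
  volume:0 minor:0 disk:UpToDate quorum:yes blocked:no
  m3-a02dr01.alteeve.com node-id:2 connection:Connected role:Secondary congested:no
    volume:0 replication:Established peer-disk:UpToDate resync-suspended:no
  m3-a02n02.alteeve.com node-id:1 connection:Connected role:Secondary congested:no
    volume:0 replication:Established peer-disk:UpToDate resync-suspended:no
"""

for name, fields in peer_states(sample_status).items():
    print(name, fields["connection"], fields["peer-disk"])
```
</pre>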
<p>The cluster still thinks all is well, too.</p>
<p>==== [root@m3-a02n01 ~]# pcs status<br>
<tt>Cluster name: m3-anvil-02</tt><tt><br>
</tt><tt>Stack: corosync</tt><tt><br>
</tt><tt>Current DC: m3-a02n01.alteeve.com (version
1.1.16-12.el7_4.7-94ff4df) - partition with quorum</tt><tt><br>
</tt><tt>Last updated: Sun Feb 11 06:33:48 2018</tt><tt><br>
</tt><tt>Last change: Sun Feb 11 02:35:25 2018 by root via
crm_resource on m3-a02n01.alteeve.com</tt><tt><br>
</tt><tt><br>
</tt><tt>2 nodes configured</tt><tt><br>
</tt><tt>7 resources configured</tt><tt><br>
</tt><tt><br>
</tt><tt>Online: [ m3-a02n01.alteeve.com m3-a02n02.alteeve.com ]</tt><tt><br>
</tt><tt><br>
</tt><tt>Full list of resources:</tt><tt><br>
</tt><tt><br>
</tt><tt> virsh_node1 (stonith:fence_virsh): Started
m3-a02n01.alteeve.com</tt><tt><br>
</tt><tt> virsh_node2 (stonith:fence_virsh): Started
m3-a02n02.alteeve.com</tt><tt><br>
</tt><tt> Clone Set: hypervisor-clone [hypervisor]</tt><tt><br>
</tt><tt> Started: [ m3-a02n01.alteeve.com
m3-a02n02.alteeve.com ]</tt><tt><br>
</tt><tt> Clone Set: drbd-clone [drbd]</tt><tt><br>
</tt><tt> Started: [ m3-a02n01.alteeve.com
m3-a02n02.alteeve.com ]</tt><tt><br>
</tt><tt> srv01-c7 (ocf::heartbeat:VirtualDomain): Started
m3-a02n01.alteeve.com</tt><tt><br>
</tt><tt><br>
</tt><tt>Daemon Status:</tt><tt><br>
</tt><tt> corosync: active/disabled</tt><tt><br>
</tt><tt> pacemaker: active/disabled</tt><tt><br>
</tt><tt> pcsd: active/enabled</tt><tt><br>
</tt>====</p>
<p>To verify the block is in place, I can't connect to node 2's DRBD port from node 1:</p>
<p>==== [root@m3-a02n01 ~]# telnet m3-a02n02.sn 7788<br>
<tt>Trying 10.41.20.2...</tt><tt><br>
</tt><tt>telnet: connect to address 10.41.20.2: Connection timed
out</tt><tt><br>
</tt>====</p>
<p> Did it somehow maintain the connection through node 3? <br>
</p>
<p> If not, then a) why didn't the fence-handler get invoked, and b)
why is it still showing connected?</p>
<p> If so, then is the connection between node 1 and 2 still
protocol C, even if the connections between 1 <-> 3 and 2
<-> 3 are protocol A?</p>
<p>Thanks!</p>
<pre class="moz-signature" cols="72">--
Digimer
Papers and Projects: <a class="moz-txt-link-freetext" href="https://alteeve.com/w/">https://alteeve.com/w/</a>
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould</pre>
</body>
</html>