[DRBD-user] Using discard-zero-changes and consensus with fencing

Federico Simoncelli federico.simoncelli at gmail.com
Thu Feb 26 10:42:20 CET 2009


Hi all,

I'm using DRBD 8.3.0 with fencing configured for my resources, and
I'd like to use discard-zero-changes for after-sb-0pri and consensus
for after-sb-1pri. The problem is that when one node fences the other,
it looks like they both create a new current UUID, so
discard-zero-changes doesn't work. Is my analysis correct? Is there
something wrong in my configuration?
Thank you for your help.

common {
  protocol C;

  net {
    after-sb-0pri discard-zero-changes;
    after-sb-1pri consensus;
    after-sb-2pri disconnect;
  }

  handlers {
    outdate-peer "/usr/lib/drbd/obliterate-peer.sh";
  }
}

resource mystorage {
  startup {
    become-primary-on both;
  }

  net {
    allow-two-primaries;
  }

  disk {
    fencing resource-and-stonith;
  }

  on node1 {
    device     /dev/drbd0;
    disk       /dev/md3;
    address    10.0.0.1:7788;
    flexible-meta-disk internal;
  }

  on node2 {
    device    /dev/drbd0;
    disk      /dev/md3;
    address   10.0.0.2:7788;
    flexible-meta-disk internal;
  }
}

# cat /proc/drbd
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by
federico at nethesis.it, 2009-02-20 15:23:22
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:8192 dw:8192 dr:236 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[ --- node1 --- ]
Feb 26 09:57:17 node1 kernel: drbd0: PingAck did not arrive in time.
Feb 26 09:57:17 node1 kernel: drbd0: peer( Primary -> Unknown ) conn(
Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 ->
1 )
Feb 26 09:57:17 node1 kernel: drbd0: asender terminated
Feb 26 09:57:17 node1 kernel: drbd0: Terminating asender thread
Feb 26 09:57:17 node1 kernel: drbd0: short read expecting header on sock: r=-512
Feb 26 09:57:17 node1 kernel: drbd0: Creating new current UUID
Feb 26 09:57:17 node1 /usr/lib/drbd/obliterate-peer.sh: Local node ID:
1 / Remote node: node2
Feb 26 09:57:17 node1 kernel: drbd0: Connection closed
Feb 26 09:57:17 node1 kernel: drbd0: helper command: /sbin/drbdadm
fence-peer minor-0
Feb 26 09:57:29 node1 openais[1922]: [TOTEM] The token was lost in the
OPERATIONAL state.
Feb 26 09:57:29 node1 openais[1922]: [TOTEM] Receive multicast socket
recv buffer size (288000 bytes).
Feb 26 09:57:29 node1 openais[1922]: [TOTEM] Transmit multicast socket
send buffer size (219136 bytes).
Feb 26 09:57:29 node1 openais[1922]: [TOTEM] entering GATHER state from 2.
Feb 26 09:57:30 node1 fence_node[3561]: Fence of "node2" was successful
Feb 26 09:57:30 node1 kernel: drbd0: helper command: /sbin/drbdadm
fence-peer minor-0 exit code 7 (0x700)
Feb 26 09:57:30 node1 kernel: drbd0: fence-peer helper returned 7
(peer was stonithed)
Feb 26 09:57:30 node1 kernel: drbd0: pdsk( DUnknown -> Outdated )
Feb 26 09:57:30 node1 kernel: drbd0: susp( 1 -> 0 )
Feb 26 09:57:30 node1 kernel: drbd0: conn( NetworkFailure -> Unconnected )
Feb 26 09:57:30 node1 kernel: drbd0: receiver terminated
Feb 26 09:57:30 node1 kernel: drbd0: Restarting receiver thread
Feb 26 09:57:30 node1 kernel: drbd0: receiver (re)started
Feb 26 09:57:30 node1 kernel: drbd0: conn( Unconnected -> WFConnection )

[ --- node2 --- ]
Feb 26 09:57:19 node2 kernel: drbd0: PingAck did not arrive in time.
Feb 26 09:57:19 node2 kernel: drbd0: peer( Primary -> Unknown ) conn(
Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 ->
1 )
Feb 26 09:57:19 node2 kernel: drbd0: asender terminated
Feb 26 09:57:19 node2 kernel: drbd0: Terminating asender thread
Feb 26 09:57:19 node2 kernel: drbd0: short read expecting header on sock: r=-512
Feb 26 09:57:19 node2 kernel: drbd0: Creating new current UUID
Feb 26 09:57:19 node2 kernel: drbd0: Connection closed
Feb 26 09:57:19 node2 kernel: drbd0: helper command: /sbin/drbdadm
fence-peer minor-0
Feb 26 09:57:20 node2 /usr/lib/drbd/obliterate-peer.sh: Local node ID:
2 / Remote node: node1
[ --- node2 fenced --- ]

[ --- boot node2 --- ]
Feb 26 09:58:32 node2 kernel: drbd0: drbd_sync_handshake:
Feb 26 09:58:32 node2 kernel: drbd0: self
CA3C3A2ADBB72A1E:32FD37CBFD965ABB:FB2B6776FC9D45FA:0002D187F14A403B
Feb 26 09:58:32 node2 kernel: drbd0: peer
09BF6602E2EFF0BD:32FD37CBFD965ABB:FB2B6776FC9D45FA:0002D187F14A403B
Feb 26 09:58:32 node2 kernel: drbd0: uuid_compare()=100 by rule 9
Feb 26 09:58:32 node2 kernel: drbd0: Split-Brain detected, dropping connection!
Feb 26 09:58:32 node2 kernel: drbd0: self
CA3C3A2ADBB72A1E:32FD37CBFD965ABB:FB2B6776FC9D45FA:0002D187F14A403B
Feb 26 09:58:32 node2 kernel: drbd0: peer
09BF6602E2EFF0BD:32FD37CBFD965ABB:FB2B6776FC9D45FA:0002D187F14A403B
Feb 26 09:58:32 node2 kernel: drbd0: helper command: /sbin/drbdadm
split-brain minor-0
Feb 26 09:58:32 node2 kernel: drbd0: helper command: /sbin/drbdadm
split-brain minor-0 exit code 0 (0x0)
Feb 26 09:58:32 node2 kernel: drbd0: conn( WFReportParams -> Disconnecting )
Feb 26 09:58:32 node2 kernel: drbd0: error receiving ReportState, l: 4!
Feb 26 09:58:32 node2 kernel: drbd0: asender terminated
Feb 26 09:58:32 node2 kernel: drbd0: Terminating asender thread
Feb 26 09:58:32 node2 kernel: drbd0: Connection closed
Feb 26 09:58:32 node2 kernel: drbd0: conn( Disconnecting -> StandAlone )
Feb 26 09:58:32 node2 kernel: drbd0: receiver terminated
Feb 26 09:58:32 node2 kernel: drbd0: Terminating receiver thread
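
To make the mismatch in the drbd_sync_handshake lines above easier to
see, here is a small Python sketch that compares the two UUID sets
field by field. It is only an illustration and not DRBD's actual
uuid_compare() logic. The field names (current, bitmap, history1,
history2) and their order are an assumption about how the log prints
the set, and the helper names parse_uuids and differing_fields are
made up for this example.

```python
# Simplified illustration only: this is NOT DRBD's uuid_compare() logic.
# The UUID sets are copied from the drbd_sync_handshake lines in the node2 log.
# Assumption: the fields are printed as current:bitmap:history1:history2.

FIELDS = ("current", "bitmap", "history1", "history2")


def parse_uuids(line):
    """Split a colon-separated UUID set into a dict keyed by field name."""
    return dict(zip(FIELDS, line.strip().split(":")))


def differing_fields(self_uuids, peer_uuids):
    """Return the names of the fields whose values differ between the nodes."""
    return [f for f in FIELDS if self_uuids[f] != peer_uuids[f]]


self_uuids = parse_uuids(
    "CA3C3A2ADBB72A1E:32FD37CBFD965ABB:FB2B6776FC9D45FA:0002D187F14A403B"
)
peer_uuids = parse_uuids(
    "09BF6602E2EFF0BD:32FD37CBFD965ABB:FB2B6776FC9D45FA:0002D187F14A403B"
)

diff = differing_fields(self_uuids, peer_uuids)
print(diff)  # ['current']
```

Only the first field differs between the two nodes, which matches the
"Creating new current UUID" message logged on both node1 and node2
right after the connection was lost.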

-- 
Federico.


