[DRBD-user] after-sb-0pri issue
Christian Affolter
c.affolter at stepping-stone.ch
Wed Jul 11 18:39:25 CEST 2007
Hi all!
I'm in the process of simulating/testing some split brain scenarios.
I've got two hosts (host-01 and host-02) running DRBD 8.0.4 on a 2.6.20
kernel. The meta-disk is set to 'internal' and the configuration files
are exactly the same on both hosts ;)
Now I'm testing the after-sb-0pri policy, which is set to
'discard-younger-primary' on both hosts.
I've executed the following steps to provoke a split brain situation
with zero primaries afterwards:
host-01: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
host-02: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
host-01 # drbdadm disconnect r0
host-01 # drbdadm secondary r0
host-02 # drbdadm primary r0
host-02 # drbdadm secondary r0
host-01 # drbdadm connect r0
This produces the expected split brain. DRBD detects it and syncs
from host-01, as it should:
host-01: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
host-01: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
host-02: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
host-02: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
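For anyone comparing these status lines by script rather than by eye, here is a minimal Python sketch that splits one line into its fields. In the DRBD 8.0 status format, cs is the connection state, st holds the local/peer roles and ds the local/peer disk states. The names DrbdStatus and parse_status are made up for this example and are not part of DRBD.

```python
from typing import NamedTuple


class DrbdStatus(NamedTuple):
    connection: str  # cs: connection state
    local_role: str  # st: role of this node (before the slash)
    peer_role: str   # st: role of the peer (after the slash)
    local_disk: str  # ds: disk state of this node
    peer_disk: str   # ds: disk state of the peer


def parse_status(line: str) -> DrbdStatus:
    """Parse the 'cs:... st:.../... ds:.../...' part of a status line."""
    # Tokens without a colon (protocol letter, flags) are ignored.
    fields = dict(tok.split(":", 1) for tok in line.split() if ":" in tok)
    local_role, peer_role = fields["st"].split("/")
    local_disk, peer_disk = fields["ds"].split("/")
    return DrbdStatus(fields["cs"], local_role, peer_role, local_disk, peer_disk)


status = parse_status(
    "host-01: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---"
)
print(status.local_role, status.peer_role)
```

A leading "host-01:" prefix, as used in this mail, is tolerated because only the cs, st and ds keys are looked up.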
However, if I start with host-02 as the initial primary and run the
same commands with the hosts swapped:
host-01: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
host-02: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
host-02 # drbdadm disconnect r0
host-02 # drbdadm secondary r0
host-01 # drbdadm primary r0
host-01 # drbdadm secondary r0
host-02 # drbdadm connect r0
then DRBD again detects the split brain correctly, but it still syncs
from host-01:
host-01: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
host-01: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
host-02: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
host-02: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
Shouldn't it have synced from host-02 in this case?
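To spell out the expectation behind that question, here is a small Python sketch that reads the role( Secondary -> Primary ) lines from syslog and names the host that was promoted first. It rests on my reading of 'discard-younger-primary', namely that the node promoted last is discarded and the sync comes from the node promoted earlier. It also assumes the clocks of both hosts are roughly in step. The helper names promotions and expected_sync_source are made up for this example, and this is only a way to state the expectation, not how DRBD itself decides.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Jul 11 17:15:12 host-02 drbd0: role( Secondary -> Primary )
PROMOTION = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) drbd\d+: role\( Secondary -> Primary \)"
)


def promotions(log_lines, year=2007):
    """Return {host: time of its last promotion to Primary}."""
    last = {}
    for line in log_lines:
        m = PROMOTION.match(line)
        if m:
            # Syslog stamps carry no year, so one is supplied (assumption).
            stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            last[m.group(2)] = stamp
    return last


def expected_sync_source(log_lines):
    """Host promoted earliest, i.e. the 'older' primary (hypothetical helper)."""
    last = promotions(log_lines)
    return min(last, key=last.get)


# Role changes of the second test run, taken from the logs in the PS.
second_run = [
    "Jul 11 17:15:12 host-02 drbd0: role( Secondary -> Primary )",
    "Jul 11 17:15:36 host-02 drbd0: role( Primary -> Secondary )",
    "Jul 11 17:15:41 host-01 drbd0: role( Secondary -> Primary )",
    "Jul 11 17:15:48 host-01 drbd0: role( Primary -> Secondary )",
]
print(expected_sync_source(second_run))
```

For the second run this names host-02, which is the sync source I expected, whereas DRBD picked host-01.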
Many thanks for clarifying this!
Regards Chris
PS: Below are the relevant log sections:
host-01 primary, host-02 secondary:
Jul 11 17:10:52 host-01 drbd0: role( Secondary -> Primary )
Jul 11 17:10:52 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:27 host-01 drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 17:11:27 host-01 drbd0: Creating new current UUID
Jul 11 17:11:27 host-01 drbd0: short read expecting header on sock: r=-512
Jul 11 17:11:27 host-01 drbd0: asender terminated
Jul 11 17:11:27 host-01 drbd0: tl_clear()
Jul 11 17:11:27 host-01 drbd0: Connection closed
Jul 11 17:11:27 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:27 host-01 drbd0: conn( Disconnecting -> StandAlone )
Jul 11 17:11:27 host-01 drbd0: receiver terminated
Jul 11 17:11:38 host-01 drbd0: role( Primary -> Secondary )
Jul 11 17:11:38 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:57 host-01 drbd0: conn( StandAlone -> Unconnected )
Jul 11 17:11:57 host-01 drbd0: receiver (re)started
Jul 11 17:11:57 host-01 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:11:57 host-01 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:11:57 host-01 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:11:57 host-01 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:11:57 host-01 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
Jul 11 17:11:57 host-01 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Jul 11 17:11:57 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:58 host-01 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
Jul 11 17:11:58 host-01 drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 17:11:58 host-01 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:11:58 host-01 drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 17:11:58 host-01 drbd0: Writing meta data super block now.
#######################################################################
Jul 11 17:10:52 host-02 drbd0: peer( Secondary -> Primary )
Jul 11 17:11:27 host-02 drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 17:11:27 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:27 host-02 drbd0: meta connection shut down by peer.
Jul 11 17:11:27 host-02 drbd0: asender terminated
Jul 11 17:11:27 host-02 drbd0: tl_clear()
Jul 11 17:11:27 host-02 drbd0: Connection closed
Jul 11 17:11:27 host-02 drbd0: conn( TearDown -> Unconnected )
Jul 11 17:11:27 host-02 drbd0: receiver terminated
Jul 11 17:11:27 host-02 drbd0: receiver (re)started
Jul 11 17:11:27 host-02 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:11:47 host-02 drbd0: role( Secondary -> Primary )
Jul 11 17:11:47 host-02 drbd0: Creating new current UUID
Jul 11 17:11:47 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:53 host-02 drbd0: role( Primary -> Secondary )
Jul 11 17:11:53 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:57 host-02 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:11:57 host-02 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:11:57 host-02 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:11:57 host-02 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
Jul 11 17:11:57 host-02 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 11 17:11:57 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:58 host-02 drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 17:11:58 host-02 drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Jul 11 17:11:58 host-02 drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 17:11:58 host-02 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:11:58 host-02 drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 17:11:58 host-02 drbd0: Writing meta data super block now.
host-02 primary, host-01 secondary:
Jul 11 17:15:12 host-02 drbd0: role( Secondary -> Primary )
Jul 11 17:15:12 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:29 host-02 drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 17:15:29 host-02 drbd0: Creating new current UUID
Jul 11 17:15:29 host-02 drbd0: short read expecting header on sock: r=-512
Jul 11 17:15:29 host-02 drbd0: asender terminated
Jul 11 17:15:29 host-02 drbd0: tl_clear()
Jul 11 17:15:29 host-02 drbd0: Connection closed
Jul 11 17:15:29 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:29 host-02 drbd0: conn( Disconnecting -> StandAlone )
Jul 11 17:15:29 host-02 drbd0: receiver terminated
Jul 11 17:15:36 host-02 drbd0: role( Primary -> Secondary )
Jul 11 17:15:36 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:54 host-02 drbd0: conn( StandAlone -> Unconnected )
Jul 11 17:15:54 host-02 drbd0: receiver (re)started
Jul 11 17:15:54 host-02 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:15:54 host-02 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:15:54 host-02 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:15:54 host-02 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:15:54 host-02 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
Jul 11 17:15:54 host-02 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 11 17:15:54 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:55 host-02 drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 17:15:55 host-02 drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Jul 11 17:15:55 host-02 drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 17:15:55 host-02 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:15:55 host-02 drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 17:15:55 host-02 drbd0: Writing meta data super block now.
########################################################################
Jul 11 17:15:11 host-01 drbd0: peer( Secondary -> Primary )
Jul 11 17:15:29 host-01 drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 17:15:29 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:29 host-01 drbd0: meta connection shut down by peer.
Jul 11 17:15:29 host-01 drbd0: asender terminated
Jul 11 17:15:29 host-01 drbd0: tl_clear()
Jul 11 17:15:29 host-01 drbd0: Connection closed
Jul 11 17:15:29 host-01 drbd0: conn( TearDown -> Unconnected )
Jul 11 17:15:29 host-01 drbd0: receiver terminated
Jul 11 17:15:29 host-01 drbd0: receiver (re)started
Jul 11 17:15:29 host-01 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:15:41 host-01 drbd0: role( Secondary -> Primary )
Jul 11 17:15:41 host-01 drbd0: Creating new current UUID
Jul 11 17:15:41 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:48 host-01 drbd0: role( Primary -> Secondary )
Jul 11 17:15:48 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:54 host-01 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:15:54 host-01 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:15:54 host-01 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:15:54 host-01 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
Jul 11 17:15:54 host-01 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Jul 11 17:15:55 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:55 host-01 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
Jul 11 17:15:55 host-01 drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 17:15:55 host-01 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:15:55 host-01 drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 17:15:55 host-01 drbd0: Writing meta data super block now.