Hi all!

I'm in the process of simulating/testing some split brain scenarios. I've got two hosts (host-01 and host-02) running DRBD 8.0.4 on a 2.6.20 kernel. The meta-disk is set to 'internal' and the configuration files are exactly the same on both hosts ;)

Now I'm testing the outcome of the after-sb-0pri situation, which is set to 'discard-younger-primary' (on both hosts). I've executed the following steps to provoke a split brain situation with zero primaries afterwards:

host-01: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
host-02: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---

host-01 # drbdadm disconnect r0
host-01 # drbdadm secondary r0
host-02 # drbdadm primary r0
host-02 # drbdadm secondary r0
host-01 # drbdadm connect r0

The above results in the expected split brain situation. DRBD successfully detects the split brain situation and syncs from host-01, as it should:

host-01: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
host-01: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---

host-02: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
host-02: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---

However, if I start with host-02 in primary initial state and execute the above commands vice versa:

host-01: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
host-02: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---

host-02 # drbdadm disconnect r0
host-02 # drbdadm secondary r0
host-01 # drbdadm primary r0
host-01 # drbdadm secondary r0
host-02 # drbdadm connect r0

then DRBD again correctly detects the split brain situation, but it still syncs from host-01?

host-01: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
host-01: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---

host-02: drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
host-02: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---

Shouldn't it have synced from host-02, in this case?

Many thanks for clarifying this!

Regards
Chris

PS: Below are the relevant log sections:

host-01 primary, host-02 secondary:

Jul 11 17:10:52 host-01 drbd0: role( Secondary -> Primary )
Jul 11 17:10:52 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:27 host-01 drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 17:11:27 host-01 drbd0: Creating new current UUID
Jul 11 17:11:27 host-01 drbd0: short read expecting header on sock: r=-512
Jul 11 17:11:27 host-01 drbd0: asender terminated
Jul 11 17:11:27 host-01 drbd0: tl_clear()
Jul 11 17:11:27 host-01 drbd0: Connection closed
Jul 11 17:11:27 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:27 host-01 drbd0: conn( Disconnecting -> StandAlone )
Jul 11 17:11:27 host-01 drbd0: receiver terminated
Jul 11 17:11:38 host-01 drbd0: role( Primary -> Secondary )
Jul 11 17:11:38 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:57 host-01 drbd0: conn( StandAlone -> Unconnected )
Jul 11 17:11:57 host-01 drbd0: receiver (re)started
Jul 11 17:11:57 host-01 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:11:57 host-01 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:11:57 host-01 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:11:57 host-01 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:11:57 host-01 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
Jul 11 17:11:57 host-01 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Jul 11 17:11:57 host-01 drbd0: Writing meta data super block now.
Jul 11 17:11:58 host-01 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
Jul 11 17:11:58 host-01 drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 17:11:58 host-01 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:11:58 host-01 drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 17:11:58 host-01 drbd0: Writing meta data super block now.
#######################################################################
Jul 11 17:10:52 host-02 drbd0: peer( Secondary -> Primary )
Jul 11 17:11:27 host-02 drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 17:11:27 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:27 host-02 drbd0: meta connection shut down by peer.
Jul 11 17:11:27 host-02 drbd0: asender terminated
Jul 11 17:11:27 host-02 drbd0: tl_clear()
Jul 11 17:11:27 host-02 drbd0: Connection closed
Jul 11 17:11:27 host-02 drbd0: conn( TearDown -> Unconnected )
Jul 11 17:11:27 host-02 drbd0: receiver terminated
Jul 11 17:11:27 host-02 drbd0: receiver (re)started
Jul 11 17:11:27 host-02 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:11:47 host-02 drbd0: role( Secondary -> Primary )
Jul 11 17:11:47 host-02 drbd0: Creating new current UUID
Jul 11 17:11:47 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:53 host-02 drbd0: role( Primary -> Secondary )
Jul 11 17:11:53 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:57 host-02 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:11:57 host-02 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:11:57 host-02 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:11:57 host-02 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
Jul 11 17:11:57 host-02 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 11 17:11:57 host-02 drbd0: Writing meta data super block now.
Jul 11 17:11:58 host-02 drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 17:11:58 host-02 drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Jul 11 17:11:58 host-02 drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 17:11:58 host-02 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:11:58 host-02 drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 17:11:58 host-02 drbd0: Writing meta data super block now.

host-02 primary, host-01 secondary:

Jul 11 17:15:12 host-02 drbd0: role( Secondary -> Primary )
Jul 11 17:15:12 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:29 host-02 drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 11 17:15:29 host-02 drbd0: Creating new current UUID
Jul 11 17:15:29 host-02 drbd0: short read expecting header on sock: r=-512
Jul 11 17:15:29 host-02 drbd0: asender terminated
Jul 11 17:15:29 host-02 drbd0: tl_clear()
Jul 11 17:15:29 host-02 drbd0: Connection closed
Jul 11 17:15:29 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:29 host-02 drbd0: conn( Disconnecting -> StandAlone )
Jul 11 17:15:29 host-02 drbd0: receiver terminated
Jul 11 17:15:36 host-02 drbd0: role( Primary -> Secondary )
Jul 11 17:15:36 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:54 host-02 drbd0: conn( StandAlone -> Unconnected )
Jul 11 17:15:54 host-02 drbd0: receiver (re)started
Jul 11 17:15:54 host-02 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:15:54 host-02 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:15:54 host-02 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:15:54 host-02 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:15:54 host-02 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
Jul 11 17:15:54 host-02 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 11 17:15:54 host-02 drbd0: Writing meta data super block now.
Jul 11 17:15:55 host-02 drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 11 17:15:55 host-02 drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
Jul 11 17:15:55 host-02 drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jul 11 17:15:55 host-02 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:15:55 host-02 drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jul 11 17:15:55 host-02 drbd0: Writing meta data super block now.
########################################################################
Jul 11 17:15:11 host-01 drbd0: peer( Secondary -> Primary )
Jul 11 17:15:29 host-01 drbd0: peer( Primary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Jul 11 17:15:29 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:29 host-01 drbd0: meta connection shut down by peer.
Jul 11 17:15:29 host-01 drbd0: asender terminated
Jul 11 17:15:29 host-01 drbd0: tl_clear()
Jul 11 17:15:29 host-01 drbd0: Connection closed
Jul 11 17:15:29 host-01 drbd0: conn( TearDown -> Unconnected )
Jul 11 17:15:29 host-01 drbd0: receiver terminated
Jul 11 17:15:29 host-01 drbd0: receiver (re)started
Jul 11 17:15:29 host-01 drbd0: conn( Unconnected -> WFConnection )
Jul 11 17:15:41 host-01 drbd0: role( Secondary -> Primary )
Jul 11 17:15:41 host-01 drbd0: Creating new current UUID
Jul 11 17:15:41 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:48 host-01 drbd0: role( Primary -> Secondary )
Jul 11 17:15:48 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:54 host-01 drbd0: conn( WFConnection -> WFReportParams )
Jul 11 17:15:54 host-01 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 11 17:15:54 host-01 drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 11 17:15:54 host-01 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
Jul 11 17:15:54 host-01 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Jul 11 17:15:55 host-01 drbd0: Writing meta data super block now.
Jul 11 17:15:55 host-01 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
Jul 11 17:15:55 host-01 drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Jul 11 17:15:55 host-01 drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Jul 11 17:15:55 host-01 drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jul 11 17:15:55 host-01 drbd0: Writing meta data super block now.
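
To make the two runs easier to compare, the logs can be reduced to the lines that look relevant to the after-sb-0pri decision: local role changes, "Creating new current UUID", and the split-brain verdict. Below is a minimal Python sketch of such a filter. It is not part of DRBD; the names LINE_RE, SAMPLE_LOG, key_events and sync_sources are hypothetical, the line layout is assumed to match the syslog excerpts in this mail, and the sort assumes all lines are from the same day. The sample is taken from the second run (host-02 primary first).

```python
import re

# Assumed layout, as in the excerpts above:
#   "Jul 11 17:15:41 host-01 drbd0: role( Secondary -> Primary )"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+drbd\d+:\s+(?P<msg>.*)$"
)

# Message prefixes kept by the filter (a selection made for this sketch).
KEY_PREFIXES = ("role(", "Creating new current UUID", "Split-Brain detected")

# A few lines copied from the second run in this mail.
SAMPLE_LOG = """\
Jul 11 17:15:12 host-02 drbd0: role( Secondary -> Primary )
Jul 11 17:15:29 host-02 drbd0: Creating new current UUID
Jul 11 17:15:36 host-02 drbd0: role( Primary -> Secondary )
Jul 11 17:15:54 host-02 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from peer node
Jul 11 17:15:41 host-01 drbd0: role( Secondary -> Primary )
Jul 11 17:15:41 host-01 drbd0: Creating new current UUID
Jul 11 17:15:48 host-01 drbd0: role( Primary -> Secondary )
Jul 11 17:15:54 host-01 drbd0: Split-Brain detected, 0 primaries, automatically solved. Sync from this node
"""


def key_events(log_text):
    """Return (timestamp, host, message) tuples for the key lines, in time order."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("msg").startswith(KEY_PREFIXES):
            events.append((match.group("ts"), match.group("host"), match.group("msg")))
    # Plain string sort on the timestamp; only valid within a single day.
    # sorted() is stable, so lines with equal timestamps keep their input order.
    return sorted(events, key=lambda event: event[0])


def sync_sources(events):
    """Return the hosts that logged 'Sync from this node'."""
    return [host for _, host, msg in events if "Sync from this node" in msg]


if __name__ == "__main__":
    merged = key_events(SAMPLE_LOG)
    for ts, host, msg in merged:
        print(ts, host, msg)
    print("sync source(s):", sync_sources(merged))
```

Feeding the full logs of both hosts into key_events() gives one merged timeline per run, so the order in which each node became primary and created its new current UUID can be read next to the node that ended up as SyncSource.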