Hi,

a kernel log with timestamps would be a lot more useful here.

> And on my secondary:
> <snip>
> block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )

Now you're good.

> block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
> block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
> block drbd1: role( Secondary -> Primary )
> block drbd1: role( Primary -> Secondary )

Did you do that by hand? If not, who or what does this?
I don't think a node is supposed to do this.

> block drbd1: peer( Secondary -> Primary )
> block drbd1: PingAck did not arrive in time.
> block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )

Network failure. Correlate this to other logs.

> block drbd1: asender terminated
> block drbd1: Terminating asender thread
> block drbd1: short read expecting header on sock: r=-512
> block drbd1: Connection closed
> block drbd1: conn( NetworkFailure -> Unconnected )
> block drbd1: receiver terminated
> block drbd1: Restarting receiver thread
> block drbd1: receiver (re)started
> block drbd1: conn( Unconnected -> WFConnection )
> block drbd1: Handshake successful: Agreed network protocol version 94
> block drbd1: conn( WFConnection -> WFReportParams )
> block drbd1: Starting asender thread (from drbd1_receiver [3538])
> block drbd1: data-integrity-alg: <not-used>
> block drbd1: drbd_sync_handshake:
> block drbd1: self B0A76171352A5A3C:0000000000000000:28B6F22789E39014:C23C1E60BAF36299 bits:0 flags:0
> block drbd1: peer 5A7A31FAC38B4C31:B0A76171352A5A3D:28B6F22789E39014:C23C1E60BAF36299 bits:0 flags:0
> block drbd1: uuid_compare()=-1 by rule 50
> block drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> block drbd1: conn( WFBitMapT -> WFSyncUUID )
> block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
> block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
> block drbd1: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
> block drbd1: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
> block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
> block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )

You're good once more...

> block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
> block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
> block drbd1: Connected in w_make_resync_request
> block drbd1: PingAck did not arrive in time.
> block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )

...and once again.

> block drbd1: asender terminated
> block drbd1: Terminating asender thread
> block drbd1: short read expecting header on sock: r=-512
> block drbd1: Connection closed
> block drbd1: conn( NetworkFailure -> Unconnected )
> block drbd1: receiver terminated
> block drbd1: Restarting receiver thread
> block drbd1: receiver (re)started
> block drbd1: role( Secondary -> Primary )

And this should not have happened. Your secondary is going primary while
unconnected. Hence splitbrain.

Again - when and why does it do that? You need to try and find out about
that.

Cheers,
Felix
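
To help find where the promotions happen, here is a rough sketch (not part of DRBD or its tools) that walks through kernel log lines and reports every "role( ... -> Primary )" that occurs while the last connection state seen was something other than Connected. The function name find_unconnected_promotions is made up for illustration, and the sketch assumes a single device (drbd1) and the message format quoted above, with each log message on one line. With timestamps in the log, the reported lines can then be correlated with whatever else ran at that moment.

```python
import re

# Matches state transitions such as "conn( Connected -> NetworkFailure )".
TRANSITION = re.compile(r"(\w+)\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def find_unconnected_promotions(lines):
    """Return (line_number, line) for each promotion to Primary that happens
    while the last known connection state was not Connected.

    Hypothetical helper: assumes one DRBD device per log and the message
    format shown in the quoted log. Promotions seen before any conn( ... )
    transition are not reported, since the connection state is unknown.
    """
    conn = None  # last connection state seen, None until the first one
    hits = []
    for number, line in enumerate(lines, start=1):
        # Map each field on this line (role, peer, conn, disk, pdsk) to its new state.
        changes = {field: new for field, _old, new in TRANSITION.findall(line)}
        if "conn" in changes:
            conn = changes["conn"]
        if changes.get("role") == "Primary" and conn not in (None, "Connected"):
            hits.append((number, line.strip()))
    return hits


# A few lines taken from the log quoted above, unwrapped to one message per line.
sample = [
    "block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )",
    "block drbd1: role( Secondary -> Primary )",
    "block drbd1: role( Primary -> Secondary )",
    "block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )",
    "block drbd1: conn( NetworkFailure -> Unconnected )",
    "block drbd1: role( Secondary -> Primary )",
]

for number, line in find_unconnected_promotions(sample):
    print(number, line)
```

On the sample this reports only the last line: the first promotion happens while Connected, the second one after the connection has dropped to Unconnected.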