Forgot to include the /proc/drbd outputs from the machines.

Primary:
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by argus at docidtxt03, 2009-03-09 18:04:20
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r---
    ns:0 nr:0 dw:94815205 dr:14861358 al:19160 bm:18136 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:18911216

Secondary:
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by argus at docidtxt04, 2009-03-04 16:23:58
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown A r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:14 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:55544

Anyone have thoughts on this? I really don't want to have to resync the
secondary. It takes about three days on this setup.

Thanks,
Jeff Orr
Attributor Corporation

Jeff Orr wrote:
> So I was trying to upgrade the primary in one of our DRBD pairs last
> night. The cluster manager moved the disk and virtual IP to the slave,
> which subsequently crashed with a kernel panic (in XFS). I moved the
> DRBD mount and virtual IP back to the primary, so services are up. But
> now I am presented with this message on attempting to reconnect to the
> secondary:
>
> drbd0: I shall become SyncTarget, but I am primary!
>
> I tried forcing the secondary to discard its data with
> "-- --discard-my-data secondary", as well as
> "-- --overwrite-data-of-peer primary" on the primary, but DRBD is flatly
> refusing to force the secondary to become SyncTarget. I would like to
> discard the last 24hr or so of changes on the secondary, but not resync
> the entire 6TB disk. Any ideas on how to proceed?
>
> Here are the dmesg logs from primary:
>
> drbd0: conn( StandAlone -> Unconnected )
> drbd0: Starting receiver thread (from drbd0_worker [6728])
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
> drbd0: Handshake successful: Agreed network protocol version 89
> drbd0: conn( WFConnection -> WFReportParams )
> drbd0: Starting asender thread (from drbd0_receiver [12436])
> drbd0: data-integrity-alg: md5
> drbd0: drbd_sync_handshake:
> drbd0: self 4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3
> drbd0: peer 4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD
> drbd0: uuid_compare()=-1 by rule 4
> drbd0: I shall become SyncTarget, but I am primary!
> drbd0: conn( WFReportParams -> Disconnecting )
> drbd0: error receiving ReportState, l: 4!
> drbd0: asender terminated
> drbd0: Terminating asender thread
> drbd0: Connection closed
> drbd0: conn( Disconnecting -> StandAlone )
> drbd0: receiver terminated
> drbd0: Terminating receiver thread
> drbd0: conn( StandAlone -> Unconnected )
> drbd0: Starting receiver thread (from drbd0_worker [6728])
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
> drbd0: Handshake successful: Agreed network protocol version 89
> drbd0: conn( WFConnection -> WFReportParams )
> drbd0: Starting asender thread (from drbd0_receiver [12485])
> drbd0: data-integrity-alg: md5
> drbd0: drbd_sync_handshake:
> drbd0: self 4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3
> drbd0: peer 4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD
> drbd0: uuid_compare()=-1 by rule 4
> drbd0: I shall become SyncTarget, but I am primary!
> drbd0: conn( WFReportParams -> Disconnecting )
> drbd0: error receiving ReportState, l: 4!
> drbd0: asender terminated
> drbd0: Terminating asender thread
> drbd0: Connection closed
> drbd0: conn( Disconnecting -> StandAlone )
> drbd0: receiver terminated
> drbd0: Terminating receiver thread
>
> and from secondary:
>
> drbd0: conn( StandAlone -> Unconnected )
> drbd0: Starting receiver thread (from drbd0_worker [3584])
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
> drbd0: Handshake successful: Agreed network protocol version 89
> drbd0: conn( WFConnection -> WFReportParams )
> drbd0: Starting asender thread (from drbd0_receiver [8657])
> drbd0: data-integrity-alg: md5
> drbd0: drbd_sync_handshake:
> drbd0: self 4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD
> drbd0: peer 4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3
> drbd0: uuid_compare()=1 by rule 4
> drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
> drbd0: meta connection shut down by peer.
> drbd0: peer( Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> drbd0: asender terminated
> drbd0: Terminating asender thread
> drbd0: sock_sendmsg returned -32
> drbd0: short sent ReportBitMap size=4096 sent=0
> drbd0: Connection closed
> drbd0: conn( NetworkFailure -> Unconnected )
> drbd0: receiver terminated
> drbd0: Restarting receiver thread
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
>
> Both machines are CentOS 2.6.18-92.1.13.el5 64-bit. DRBD is 8.3.0.
>
> Thanks in advance.
>
> Jeff Orr
> Attributor Corporation
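
A side note on the /proc/drbd output: as far as I understand the DRBD 8.3 documentation, the `oos` field is the amount of out-of-sync data in KiB, so it gives a rough idea of how much a bitmap-based resync would have to transfer compared with the full 6TB. Below is a minimal Python sketch for reading it from a status line. The helper names `parse_status` and `oos_gib` are made up for illustration, and the KiB unit is an assumption worth checking against your DRBD version.

```python
import re

def parse_status(text):
    """Collect the key:value fields of one /proc/drbd device entry into a dict."""
    # Matches tokens such as "cs:StandAlone" or "oos:18911216"; the bare
    # device prefix "0:" is skipped because no value follows the colon.
    return dict(re.findall(r"(\w+):(\S+)", text))

def oos_gib(text):
    """Return the out-of-sync amount in GiB, assuming oos is reported in KiB."""
    return int(parse_status(text)["oos"]) / (1024 ** 2)

# Status lines copied from the /proc/drbd output in this message.
primary = ("0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r--- "
           "ns:0 nr:0 dw:94815205 dr:14861358 al:19160 bm:18136 lo:0 pe:0 "
           "ua:0 ap:0 ep:1 wo:b oos:18911216")
secondary = ("0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown A r--- "
             "ns:0 nr:0 dw:0 dr:0 al:0 bm:14 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:55544")

if __name__ == "__main__":
    for name, line in (("primary", primary), ("secondary", secondary)):
        fields = parse_status(line)
        print(f"{name}: cs={fields['cs']} ro={fields['ro']} "
              f"oos={oos_gib(line):.2f} GiB")
```

If the KiB assumption holds, the primary has roughly 18 GiB marked out of sync and the secondary well under 1 GiB, both far less than a full resync of the device.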