[DRBD-user] Cannot force node to be Primary Connected after kernel panic

Jeff Orr jeff at attributor.com
Wed Mar 11 19:11:09 CET 2009

I forgot to include the /proc/drbd output from both machines.

Primary:
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by argus at docidtxt03, 2009-03-09 18:04:20
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r---
    ns:0 nr:0 dw:94815205 dr:14861358 al:19160 bm:18136 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:18911216

Secondary:
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by argus at docidtxt04, 2009-03-04 16:23:58
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown A r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:14 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:55544
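
For readers comparing the two status blocks, the following is a minimal Python sketch that pulls the key:value fields out of each device entry and converts the oos (out of sync) counter to MiB. It assumes the oos value is reported in KiB, which matches my understanding of DRBD 8.3 but should be checked against the documentation for the version in use. The function names parse_fields and oos_mib are made up for illustration and are not part of any DRBD tool.

```python
import re

# Device entries copied from the /proc/drbd output of each node.
PRIMARY = (
    " 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r---\n"
    "    ns:0 nr:0 dw:94815205 dr:14861358 al:19160 bm:18136 lo:0 pe:0 ua:0"
    " ap:0 ep:1 wo:b oos:18911216\n"
)
SECONDARY = (
    " 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown A r---\n"
    "    ns:0 nr:0 dw:0 dr:0 al:0 bm:14 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:55544\n"
)


def parse_fields(entry):
    """Collect the key:value tokens of one device entry into a dict."""
    return dict(re.findall(r"(\w+):(\S+)", entry))


def oos_mib(entry):
    """Out-of-sync amount in MiB, assuming the oos counter is in KiB."""
    return int(parse_fields(entry)["oos"]) / 1024


for name, entry in (("primary", PRIMARY), ("secondary", SECONDARY)):
    fields = parse_fields(entry)
    print(f"{name}: cs={fields['cs']} ro={fields['ro']} oos={oos_mib(entry):.1f} MiB")
```

Under that KiB assumption, the primary has roughly 18 GiB marked out of sync and the secondary roughly 54 MiB, both small compared with the full size of the device.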

Anyone have thoughts on this? I really don't want to have to resync the
secondary. It takes about three days on this setup.

Thanks,
Jeff Orr
Attributor Corporation

Jeff Orr wrote:
> So I was trying to upgrade the primary in one of our DRBD pairs last
> night. The cluster manager moved the disk and virtual IP to the slave,
> which subsequently crashed with a kernel panic (in XFS). I moved the
> DRBD mount and virtual IP back to the primary, so services are up. But
> now I am presented with this message on attempting to reconnect to the
> secondary:
>
> drbd0: I shall become SyncTarget, but I am primary!
>
> I tried forcing the secondary to discard its data with
> "-- --discard-my-data secondary", as well as
> "-- --overwrite-data-of-peer primary" on the primary, but DRBD is flatly
> refusing to force the secondary to become SyncTarget. I would like to
> discard the last 24 hours or so of changes on the secondary, but not
> resync the entire 6 TB disk. Any ideas on how to proceed?
>
> Here are the dmesg logs from primary:
>
> drbd0: conn( StandAlone -> Unconnected )
> drbd0: Starting receiver thread (from drbd0_worker [6728])
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
> drbd0: Handshake successful: Agreed network protocol version 89
> drbd0: conn( WFConnection -> WFReportParams )
> drbd0: Starting asender thread (from drbd0_receiver [12436])
> drbd0: data-integrity-alg: md5
> drbd0: drbd_sync_handshake:
> drbd0: self 4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3
> drbd0: peer 4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD
> drbd0: uuid_compare()=-1 by rule 4
> drbd0: I shall become SyncTarget, but I am primary!
> drbd0: conn( WFReportParams -> Disconnecting )
> drbd0: error receiving ReportState, l: 4!
> drbd0: asender terminated
> drbd0: Terminating asender thread
> drbd0: Connection closed
> drbd0: conn( Disconnecting -> StandAlone )
> drbd0: receiver terminated
> drbd0: Terminating receiver thread
> drbd0: conn( StandAlone -> Unconnected )
> drbd0: Starting receiver thread (from drbd0_worker [6728])
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
> drbd0: Handshake successful: Agreed network protocol version 89
> drbd0: conn( WFConnection -> WFReportParams )
> drbd0: Starting asender thread (from drbd0_receiver [12485])
> drbd0: data-integrity-alg: md5
> drbd0: drbd_sync_handshake:
> drbd0: self 4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3
> drbd0: peer 4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD
> drbd0: uuid_compare()=-1 by rule 4
> drbd0: I shall become SyncTarget, but I am primary!
> drbd0: conn( WFReportParams -> Disconnecting )
> drbd0: error receiving ReportState, l: 4!
> drbd0: asender terminated
> drbd0: Terminating asender thread
> drbd0: Connection closed
> drbd0: conn( Disconnecting -> StandAlone )
> drbd0: receiver terminated
> drbd0: Terminating receiver thread
>
> and from secondary:
> drbd0: conn( StandAlone -> Unconnected )
> drbd0: Starting receiver thread (from drbd0_worker [3584])
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
> drbd0: Handshake successful: Agreed network protocol version 89
> drbd0: conn( WFConnection -> WFReportParams )
> drbd0: Starting asender thread (from drbd0_receiver [8657])
> drbd0: data-integrity-alg: md5
> drbd0: drbd_sync_handshake:
> drbd0: self 4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD
> drbd0: peer 4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3
> drbd0: uuid_compare()=1 by rule 4
> drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
> drbd0: meta connection shut down by peer.
> drbd0: peer( Primary -> Unknown ) conn( WFBitMapS -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> drbd0: asender terminated
> drbd0: Terminating asender thread
> drbd0: sock_sendmsg returned -32
> drbd0: short sent ReportBitMap size=4096 sent=0
> drbd0: Connection closed
> drbd0: conn( NetworkFailure -> Unconnected )
> drbd0: receiver terminated
> drbd0: Restarting receiver thread
> drbd0: receiver (re)started
> drbd0: conn( Unconnected -> WFConnection )
>
> Both machines run CentOS with kernel 2.6.18-92.1.13.el5 (64-bit). DRBD is 8.3.0.
>
> Thanks in advance.
>
> Jeff Orr
> Attributor Corporation
>
>   
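
The "self" and "peer" UUID sets in the handshake logs can be compared mechanically. The Python sketch below does so under two assumptions that are mine rather than the poster's: that the four fields are printed in the order current, bitmap, older history, oldest history, and that the lowest bit of a UUID is a flag rather than part of the data generation identifier. The helper names are hypothetical and the sketch does not reproduce DRBD's own uuid_compare() logic.

```python
# UUID sets copied from the primary's dmesg log in the quoted message.
SELF_UUIDS = "4E2A0149690C1915:0CE9A5964B14BF3A:AC676DD67EC523BD:8435486C64177EB3"
PEER_UUIDS = "4E2A0149690C1914:0000000000000000:0CE9A5964B14BF3A:AC676DD67EC523BD"

# Assumed field order in the log line (not stated in the original message).
SLOTS = ("current", "bitmap", "history1", "history2")


def parse_uuids(line):
    """Split a colon-separated UUID line into a dict of integers."""
    return dict(zip(SLOTS, (int(part, 16) for part in line.split(":"))))


def same_generation(a, b):
    """Compare two UUIDs while ignoring bit 0 (assumed to be a flag bit)."""
    return (a & ~1) == (b & ~1)


me = parse_uuids(SELF_UUIDS)
peer = parse_uuids(PEER_UUIDS)

print("current UUIDs match:", same_generation(me["current"], peer["current"]))
print("peer bitmap UUID is zero:", peer["bitmap"] == 0)
print("my bitmap UUID equals peer's first history UUID:",
      same_generation(me["bitmap"], peer["history1"]))
```

If those assumptions hold, the two nodes report the same current UUID apart from the lowest bit, which would fit a handshake decision made on the basis of node state rather than on diverging data generations.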