[DRBD-user] Power off caused "Unknown" status.

guohuai li guohuai_li at hotmail.com
Sat Apr 11 04:39:40 CEST 2009


Hi all,

On both machines, I stopped drbd.

On machine A, I issued the commands below:

linux-10:~ # modprobe drbd
linux-10:~ # drbdadm attach r2
linux-10:~ # drbdadm syncer r2
linux-10:~ # drbdadm connect r2
linux-10:~ # 

Then I issued the same commands on machine B. The logs on machine B are attached below.

They show "Split-Brain detected".

How can I avoid this problem? And how do I restore the resource to a proper state?

Thanks for your help.

Best regards,

Edward


+++++++++++++++++++++++++++++++++++++++++++++++

linux-10:~ # cat  /proc/drbd
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root at linux-10, 2009-02-18 16:33:17

 2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8
linux-10:~ # 
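
The device line in the /proc/drbd output above packs several `key:value` pairs into one row, with local and peer values separated by a slash. As a rough sketch only, the line can be split into named fields like this. The function name `parse_drbd_status` is made up for illustration, and the code assumes the 8.3-style layout shown above rather than any documented format guarantee.

```python
import re

def parse_drbd_status(line):
    """Split one /proc/drbd device line into a dict of its fields."""
    m = re.match(r"\s*(\d+):\s+(.*)", line)
    if m is None:
        raise ValueError("not a DRBD device status line: %r" % line)
    fields = {"minor": int(m.group(1))}
    for token in m.group(2).split():
        if ":" not in token:
            continue  # skip the trailing flags column, e.g. "r---"
        key, value = token.split(":", 1)
        if "/" in value:
            # local/peer pairs such as ro:Secondary/Unknown
            local, peer = value.split("/", 1)
            fields[key] = {"local": local, "peer": peer}
        else:
            fields[key] = value
    return fields

status = parse_drbd_status(
    " 2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r---"
)
print(status["cs"], status["ro"]["peer"], status["ds"]["peer"])
```

For the line above this reports a StandAlone connection state with both the peer role and the peer disk state unknown, which matches the symptom in the subject line.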

 

+++++++++++++++/var/log/messages+++++++++++++++++++++++++

Apr 11 10:33:05 linux-10 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Apr 11 10:33:05 linux-10 kernel: drbd0: short read expecting header on sock: r=-512
Apr 11 10:33:05 linux-10 kernel: drbd0: asender terminated
Apr 11 10:33:05 linux-10 kernel: drbd0: Terminating asender thread
Apr 11 10:33:05 linux-10 kernel: drbd0: Connection closed
Apr 11 10:33:05 linux-10 kernel: drbd0: conn( Disconnecting -> StandAlone )
Apr 11 10:33:05 linux-10 kernel: drbd0: receiver terminated
Apr 11 10:33:05 linux-10 kernel: drbd0: Terminating receiver thread
Apr 11 10:33:05 linux-10 kernel: drbd0: disk( UpToDate -> Diskless )
Apr 11 10:33:05 linux-10 kernel: drbd0: drbd_bm_resize called with capacity == 0
Apr 11 10:33:05 linux-10 kernel: drbd0: worker terminated
Apr 11 10:33:05 linux-10 kernel: drbd0: Terminating worker thread
Apr 11 10:33:05 linux-10 kernel: drbd1: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Apr 11 10:33:05 linux-10 kernel: drbd1: short read expecting header on sock: r=-512
Apr 11 10:33:05 linux-10 kernel: drbd1: asender terminated
Apr 11 10:33:05 linux-10 kernel: drbd1: Terminating asender thread
Apr 11 10:33:05 linux-10 kernel: drbd1: Connection closed
Apr 11 10:33:05 linux-10 kernel: drbd1: conn( Disconnecting -> StandAlone )
Apr 11 10:33:05 linux-10 kernel: drbd1: receiver terminated
Apr 11 10:33:05 linux-10 kernel: drbd1: Terminating receiver thread
Apr 11 10:33:05 linux-10 kernel: drbd1: disk( UpToDate -> Diskless )
Apr 11 10:33:05 linux-10 kernel: drbd1: drbd_bm_resize called with capacity == 0
Apr 11 10:33:05 linux-10 kernel: drbd1: worker terminated

Apr 11 10:33:05 linux-10 kernel: drbd1: Terminating worker thread
Apr 11 10:33:05 linux-10 kernel: drbd2: role( Primary -> Secondary )
Apr 11 10:33:05 linux-10 kernel: drbd2: disk( UpToDate -> Diskless )
Apr 11 10:33:05 linux-10 kernel: drbd2: drbd_bm_resize called with capacity == 0
Apr 11 10:33:05 linux-10 kernel: drbd2: worker terminated
Apr 11 10:33:05 linux-10 kernel: drbd2: Terminating worker thread
Apr 11 10:33:05 linux-10 kernel: drbd: module cleanup done.
Apr 11 10:34:48 linux-10 kernel: drbd: module not supported by Novell, setting U taint flag.
Apr 11 10:34:48 linux-10 kernel: drbd: initialised. Version: 8.3.0 (api:88/proto:86-89)
Apr 11 10:34:48 linux-10 kernel: drbd: GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root at linux-10, 2009-02-18 16:33:17
Apr 11 10:34:48 linux-10 kernel: drbd: registered as block device major 147
Apr 11 10:34:48 linux-10 kernel: drbd: minor_table @ 0xffff8100740720c0
Apr 11 10:35:13 linux-10 kernel: drbd2: disk( Diskless -> Attaching )
Apr 11 10:35:13 linux-10 kernel: drbd2: Starting worker thread (from cqueue/5 [470])
Apr 11 10:35:13 linux-10 kernel: drbd2: No usable activity log found.
Apr 11 10:35:13 linux-10 kernel: drbd2: Method to ensure write ordering: barrier
Apr 11 10:35:13 linux-10 kernel: drbd2: max_segment_size ( = BIO size ) = 32768
Apr 11 10:35:13 linux-10 kernel: drbd2: drbd_bm_resize called with capacity == 1011928
Apr 11 10:35:13 linux-10 kernel: drbd2: resync bitmap: bits=126491 words=1977
Apr 11 10:35:13 linux-10 kernel: drbd2: size = 494 MB (505964 KB)
Apr 11 10:35:13 linux-10 kernel: drbd2: recounting of set bits took additional 0 jiffies
Apr 11 10:35:13 linux-10 kernel: drbd2: 8 KB (2 bits) marked out-of-sync by on disk bit-map.
Apr 11 10:35:13 linux-10 kernel: drbd2: disk( Attaching -> UpToDate )
Apr 11 10:35:21 linux-10 kernel: drbd2: conn( StandAlone -> Unconnected )
Apr 11 10:35:21 linux-10 kernel: drbd2: Starting receiver thread (from drbd2_worker [27928])
Apr 11 10:35:21 linux-10 kernel: drbd2: receiver (re)started
Apr 11 10:35:21 linux-10 kernel: drbd2: conn( Unconnected -> WFConnection )
Apr 11 10:35:21 linux-10 kernel: drbd2: Handshake successful: Agreed network protocol version 89
Apr 11 10:35:21 linux-10 kernel: drbd2: conn( WFConnection -> WFReportParams )
Apr 11 10:35:21 linux-10 kernel: drbd2: Starting asender thread (from drbd2_receiver [27938])
Apr 11 10:35:21 linux-10 kernel: drbd2: data-integrity-alg: <not-used>
Apr 11 10:35:21 linux-10 kernel: drbd2: drbd_sync_handshake:
Apr 11 10:35:21 linux-10 kernel: drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780
Apr 11 10:35:21 linux-10 kernel: drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780
Apr 11 10:35:21 linux-10 kernel: drbd2: uuid_compare()=100 by rule 9
Apr 11 10:35:21 linux-10 kernel: drbd2: Split-Brain detected, dropping connection!
Apr 11 10:35:21 linux-10 kernel: drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780
Apr 11 10:35:21 linux-10 kernel: drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780
Apr 11 10:35:21 linux-10 kernel: drbd2: helper command: /sbin/drbdadm split-brain minor-2
Apr 11 10:35:21 linux-10 kernel: drbd2: helper command: /sbin/drbdadm split-brain minor-2 exit code 0 (0x0)
Apr 11 10:35:21 linux-10 kernel: drbd2: conn( WFReportParams -> Disconnecting )
Apr 11 10:35:21 linux-10 kernel: drbd2: error receiving ReportState, l: 4!
Apr 11 10:35:21 linux-10 kernel: drbd2: asender terminated
Apr 11 10:35:21 linux-10 kernel: drbd2: Terminating asender thread
Apr 11 10:35:21 linux-10 kernel: drbd2: Connection closed
Apr 11 10:35:21 linux-10 kernel: drbd2: conn( Disconnecting -> StandAlone )
Apr 11 10:35:21 linux-10 kernel: drbd2: receiver terminated
Apr 11 10:35:21 linux-10 kernel: drbd2: Terminating receiver thread
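
The `self` and `peer` lines logged just before "Split-Brain detected" can be compared field by field. The sketch below rests on two assumptions that the log itself does not state: that the four colon-separated values are ordered current, bitmap, then two history UUIDs, and that the lowest bit of each value is a flag rather than part of the identifier. The helper names `parse_uuids` and `same_generation` are hypothetical.

```python
def parse_uuids(log_line):
    """Return the four UUIDs from a 'self'/'peer' log line as integers."""
    hex_part = log_line.split()[-1]
    return [int(u, 16) for u in hex_part.split(":")]

def same_generation(a, b):
    """Compare two UUIDs while ignoring bit 0 (assumed to be a flag bit)."""
    return (a & ~1) == (b & ~1)

self_uuids = parse_uuids(
    "drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780"
)
peer_uuids = parse_uuids(
    "drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780"
)

# Assumed field order: index 0 = current UUID, index 1 = bitmap UUID.
current_match = same_generation(self_uuids[0], peer_uuids[0])
bitmap_match = same_generation(self_uuids[1], peer_uuids[1])
print(current_match, bitmap_match)
```

Under those assumptions the first values differ while the second values agree. That would be consistent with both nodes having started a new data generation from the same point while disconnected, although the log only confirms that `uuid_compare()` returned 100 by rule 9.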

 

 

++++++++++++++ below is the result of "dmesg" +++++++++++++++++++++

drbd1: self 694BC1146C7A0476:37A938321F1078C5:8E8AEF3010DC95A9:527AC0DE800282AB
drbd1: peer 37A938321F1078C4:0000000000000000:8E8AEF3010DC95A8:527AC0DE800282AB
drbd0: drbd_sync_handshake:
drbd0: self 6DB905A9822E817A:B777D655F75A1ABF:AE2C3E2935A148AF:3E19EFA99A907A2F
drbd1: uuid_compare()=1 by rule 7
drbd0: peer B777D655F75A1ABE:0000000000000000:AE2C3E2935A148AE:3E19EFA99A907A2F
drbd0: uuid_compare()=1 by rule 7
drbd2: drbd_sync_handshake:
drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780
drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780
drbd2: uuid_compare()=100 by rule 9
drbd2: Split-Brain detected, dropping connection!
drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780
drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780
drbd2: helper command: /sbin/drbdadm split-brain minor-2
drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate ) 
drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate ) 
drbd2: meta connection shut down by peer.
drbd2: conn( WFReportParams -> NetworkFailure ) 
drbd2: asender terminated
drbd2: Terminating asender thread
drbd2: helper command: /sbin/drbdadm split-brain minor-2 exit code 0 (0x0)
drbd2: conn( NetworkFailure -> Disconnecting ) 
drbd2: error receiving ReportState, l: 4!
drbd2: Connection closed
drbd2: conn( Disconnecting -> StandAlone ) 
drbd2: receiver terminated
drbd2: Terminating receiver thread
drbd1: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent ) 
drbd1: Began resync as SyncSource (will sync 4 KB [1 bits set]).
drbd1: Resync done (total 1 sec; paused 0 sec; 4 K/sec)
drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) 
drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent ) 
drbd0: Began resync as SyncSource (will sync 4 KB [1 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 4 K/sec)
drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) 
drbd2: role( Secondary -> Primary ) 
drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) 
drbd0: short read expecting header on sock: r=-512
drbd0: asender terminated
drbd0: Terminating asender thread
drbd0: Connection closed
drbd0: conn( Disconnecting -> StandAlone ) 
drbd0: receiver terminated
drbd0: Terminating receiver thread
drbd0: disk( UpToDate -> Diskless ) 
drbd0: drbd_bm_resize called with capacity == 0
drbd0: worker terminated
drbd0: Terminating worker thread
drbd1: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) 
drbd1: short read expecting header on sock: r=-512
drbd1: asender terminated
drbd1: Terminating asender thread
drbd1: Connection closed
drbd1: conn( Disconnecting -> StandAlone ) 
drbd1: receiver terminated
drbd1: Terminating receiver thread
drbd1: disk( UpToDate -> Diskless ) 
drbd1: drbd_bm_resize called with capacity == 0
drbd1: worker terminated
drbd1: Terminating worker thread
drbd2: role( Primary -> Secondary ) 
drbd2: disk( UpToDate -> Diskless ) 
drbd2: drbd_bm_resize called with capacity == 0
drbd2: worker terminated
drbd2: Terminating worker thread
drbd: module cleanup done.
drbd: module not supported by Novell, setting U taint flag.
drbd: initialised. Version: 8.3.0 (api:88/proto:86-89)
drbd: GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root at linux-10, 2009-02-18 16:33:17
drbd: registered as block device major 147
drbd: minor_table @ 0xffff8100740720c0
drbd2: disk( Diskless -> Attaching ) 
drbd2: Starting worker thread (from cqueue/5 [470])
drbd2: No usable activity log found.
drbd2: Method to ensure write ordering: barrier
drbd2: max_segment_size ( = BIO size ) = 32768
drbd2: drbd_bm_resize called with capacity == 1011928
drbd2: resync bitmap: bits=126491 words=1977
drbd2: size = 494 MB (505964 KB)
drbd2: recounting of set bits took additional 0 jiffies
drbd2: 8 KB (2 bits) marked out-of-sync by on disk bit-map.
drbd2: disk( Attaching -> UpToDate ) 
drbd2: conn( StandAlone -> Unconnected ) 
drbd2: Starting receiver thread (from drbd2_worker [27928])
drbd2: receiver (re)started
drbd2: conn( Unconnected -> WFConnection ) 
drbd2: Handshake successful: Agreed network protocol version 89
drbd2: conn( WFConnection -> WFReportParams ) 
drbd2: Starting asender thread (from drbd2_receiver [27938])
drbd2: data-integrity-alg: <not-used>
drbd2: drbd_sync_handshake:
drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780
drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780
drbd2: uuid_compare()=100 by rule 9
drbd2: Split-Brain detected, dropping connection!
drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780
drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780
drbd2: helper command: /sbin/drbdadm split-brain minor-2
drbd2: helper command: /sbin/drbdadm split-brain minor-2 exit code 0 (0x0)
drbd2: conn( WFReportParams -> Disconnecting ) 
drbd2: error receiving ReportState, l: 4!
drbd2: asender terminated
drbd2: Terminating asender thread
drbd2: Connection closed
drbd2: conn( Disconnecting -> StandAlone ) 
drbd2: receiver terminated
drbd2: Terminating receiver thread
linux-10:~ # 
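
Since the dmesg output interleaves three devices, it can help to reduce it to the last reported state per device. A minimal sketch, assuming every state change is logged in the `field( Old -> New )` form seen above; the function name `final_states` is made up for illustration, and the sample lines are copied from the output above.

```python
import re

# Matches state changes such as "conn( Disconnecting -> StandAlone )".
TRANSITION = re.compile(r"(\w+)\( (\w+) -> (\w+) \)")

def final_states(lines):
    """Track the last reported value of each state field per device."""
    states = {}
    for line in lines:
        m = re.search(r"(drbd\d+): (.*)", line)
        if m is None:
            continue  # e.g. "drbd: module cleanup done." has no minor number
        device, rest = m.groups()
        for field, _old, new in TRANSITION.findall(rest):
            states.setdefault(device, {})[field] = new
    return states

sample = [
    "drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) ",
    "drbd2: conn( WFConnection -> WFReportParams ) ",
    "drbd2: Split-Brain detected, dropping connection!",
    "drbd2: conn( WFReportParams -> Disconnecting ) ",
    "drbd2: conn( Disconnecting -> StandAlone ) ",
]
print(final_states(sample))
```

The same function accepts the timestamped /var/log/messages lines, because it searches for the `drbdN:` prefix anywhere in the line.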
 
> Date: Sat, 11 Apr 2009 01:52:22 +0200
> From: r.bhatia at ipax.at
> To: guohuai_li at hotmail.com
> CC: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] Power off caused "Unknown" status.
> 
> On 10.04.2009 02:30, guohuai li wrote:
> 
> > On machine B:
> >
> > 2: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r---
> > ns:0 nr:0 dw:4 dr:100 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12
> >
> > On machine A:
> >
> > 2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r---
> > ns:9 nr:4 dw:10 dr:200 al:1 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8
> 
> did you try "drbdadm connect r2" on both nodes? what does dmesg say?
> what do the logs say?
> 
> cheers,
> raoul
> -- 
> ____________________________________________________________________
> DI (FH) Raoul Bhatia M.Sc. email. r.bhatia at ipax.at
> Technischer Leiter
> 
> IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
> Barawitzkagasse 10/2/2/11 email. office at ipax.at
> 1190 Wien tel. +43 1 3670030
> FN 277995t HG Wien fax. +43 1 3670030 15
> ____________________________________________________________________
