<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
</style>
</head>
<body class='hmmessage'>
Hi, all<BR>
<BR>
On both machines, I stopped drbd.<BR>
<BR>
On machine A, I issued the commands below:<BR>
<BR>linux-10:~ # modprobe drbd<BR>linux-10:~ # drbdadm attach r2<BR>linux-10:~ # drbdadm syncer r2<BR>linux-10:~ # drbdadm connect r2<BR>linux-10:~ # <BR>
<BR>
Then I issued the same commands on machine B. The logs from machine B are included below.<BR>
<BR>
It shows "Split-Brain detected".<BR>
<BR>
How can I avoid this problem? And how can I restore r2 to a proper state?<BR>
<BR>
Thanks for your help.<BR>
<BR>
Best regards,<BR>
Edward<BR>
<BR>
+++++++++++++++++++++++++++++++++++++++++++++++<BR>
linux-10:~ # cat /proc/drbd<BR>version: 8.3.0 (api:88/proto:86-89)<BR>GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by <A href="mailto:root@linux-10">root@linux-10</A>, 2009-02-18 16:33:17<BR>
2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r---<BR> ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8<BR>linux-10:~ # <BR>
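<BR>
For anyone who wants to check for this state from a script, here is a minimal Python sketch. It is not part of DRBD, and the helper names parse_status and is_standalone are hypothetical. It only splits the device line shown above into its fields and flags cs:StandAlone; it assumes the line has the same layout as in this output.<BR>
<pre>
```python
import re

# Sample device line copied from the /proc/drbd output above.
STATUS_LINE = "2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r---"


def parse_status(line):
    """Split a /proc/drbd device line into its minor number and key:value fields."""
    minor, _, rest = line.strip().partition(":")
    # Collect every "key:value" pair; the trailing flags (r---) have no colon
    # and are ignored by this sketch.
    fields = dict(re.findall(r"(\w+):(\S+)", rest))
    return int(minor), fields


def is_standalone(fields):
    """Return True when the connection state field is cs:StandAlone."""
    return fields.get("cs") == "StandAlone"


minor, fields = parse_status(STATUS_LINE)
print(minor, fields["cs"], fields["ro"], fields["ds"])
print("standalone:", is_standalone(fields))
```
</pre>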
<BR>
+++++++++++++++/var/log/messages+++++++++++++++++++++++++<BR>
Apr 11 10:33:05 linux-10 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: short read expecting header on sock: r=-512<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: asender terminated<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: Terminating asender thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: Connection closed<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: conn( Disconnecting -> StandAlone )<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: receiver terminated<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: Terminating receiver thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: disk( UpToDate -> Diskless )<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: drbd_bm_resize called with capacity == 0<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: worker terminated<BR>Apr 11 10:33:05 linux-10 kernel: drbd0: Terminating worker thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: short read expecting header on sock: r=-512<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: asender terminated<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: Terminating asender thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: Connection closed<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: conn( Disconnecting -> StandAlone )<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: receiver terminated<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: Terminating receiver thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: disk( UpToDate -> Diskless )<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: drbd_bm_resize called with capacity == 0<BR>Apr 11 10:33:05 linux-10 kernel: drbd1: worker terminated<BR>
Apr 11 10:33:05 linux-10 kernel: drbd1: Terminating worker thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd2: role( Primary -> Secondary )<BR>Apr 11 10:33:05 linux-10 kernel: drbd2: disk( UpToDate -> Diskless )<BR>Apr 11 10:33:05 linux-10 kernel: drbd2: drbd_bm_resize called with capacity == 0<BR>Apr 11 10:33:05 linux-10 kernel: drbd2: worker terminated<BR>Apr 11 10:33:05 linux-10 kernel: drbd2: Terminating worker thread<BR>Apr 11 10:33:05 linux-10 kernel: drbd: module cleanup done.<BR>Apr 11 10:34:48 linux-10 kernel: drbd: module not supported by Novell, setting U taint flag.<BR>Apr 11 10:34:48 linux-10 kernel: drbd: initialised. Version: 8.3.0 (api:88/proto:86-89)<BR>Apr 11 10:34:48 linux-10 kernel: drbd: GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by <A href="mailto:root@linux-10">root@linux-10</A>, 2009-02-18 16:33:17<BR>Apr 11 10:34:48 linux-10 kernel: drbd: registered as block device major 147<BR>Apr 11 10:34:48 linux-10 kernel: drbd: minor_table @ 0xffff8100740720c0<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: disk( Diskless -> Attaching )<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: Starting worker thread (from cqueue/5 [470])<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: No usable activity log found.<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: Method to ensure write ordering: barrier<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: max_segment_size ( = BIO size ) = 32768<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: drbd_bm_resize called with capacity == 1011928<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: resync bitmap: bits=126491 words=1977<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: size = 494 MB (505964 KB)<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: recounting of set bits took additional 0 jiffies<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: 8 KB (2 bits) marked out-of-sync by on disk bit-map.<BR>Apr 11 10:35:13 linux-10 kernel: drbd2: disk( Attaching -> UpToDate )<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: conn( StandAlone -> Unconnected )<BR>Apr 11 10:35:21 
linux-10 kernel: drbd2: Starting receiver thread (from drbd2_worker [27928])<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: receiver (re)started<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: conn( Unconnected -> WFConnection )<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: Handshake successful: Agreed network protocol version 89<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: conn( WFConnection -> WFReportParams )<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: Starting asender thread (from drbd2_receiver [27938])<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: data-integrity-alg: <not-used><BR>Apr 11 10:35:21 linux-10 kernel: drbd2: drbd_sync_handshake:<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: uuid_compare()=100 by rule 9<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: Split-Brain detected, dropping connection!<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: helper command: /sbin/drbdadm split-brain minor-2<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: helper command: /sbin/drbdadm split-brain minor-2 exit code 0 (0x0)<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: conn( WFReportParams -> Disconnecting )<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: error receiving ReportState, l: 4!<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: asender terminated<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: Terminating asender thread<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: Connection closed<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: conn( Disconnecting -> StandAlone )<BR>Apr 11 10:35:21 linux-10 kernel: drbd2: receiver terminated<BR>Apr 11 10:35:21 linux-10 kernel: 
drbd2: Terminating receiver thread<BR>
<BR>
<BR>
++++++++++++++ below is the result of "dmesg" +++++++++++++++++++++<BR>
drbd1: self 694BC1146C7A0476:37A938321F1078C5:8E8AEF3010DC95A9:527AC0DE800282AB<BR>drbd1: peer 37A938321F1078C4:0000000000000000:8E8AEF3010DC95A8:527AC0DE800282AB<BR>drbd0: drbd_sync_handshake:<BR>drbd0: self 6DB905A9822E817A:B777D655F75A1ABF:AE2C3E2935A148AF:3E19EFA99A907A2F<BR>drbd1: uuid_compare()=1 by rule 7<BR>drbd0: peer B777D655F75A1ABE:0000000000000000:AE2C3E2935A148AE:3E19EFA99A907A2F<BR>drbd0: uuid_compare()=1 by rule 7<BR>drbd2: drbd_sync_handshake:<BR>drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780<BR>drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780<BR>drbd2: uuid_compare()=100 by rule 9<BR>drbd2: Split-Brain detected, dropping connection!<BR>drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780<BR>drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780<BR>drbd2: helper command: /sbin/drbdadm split-brain minor-2<BR>drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate ) <BR>drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate ) <BR>drbd2: meta connection shut down by peer.<BR>drbd2: conn( WFReportParams -> NetworkFailure ) <BR>drbd2: asender terminated<BR>drbd2: Terminating asender thread<BR>drbd2: helper command: /sbin/drbdadm split-brain minor-2 exit code 0 (0x0)<BR>drbd2: conn( NetworkFailure -> Disconnecting ) <BR>drbd2: error receiving ReportState, l: 4!<BR>drbd2: Connection closed<BR>drbd2: conn( Disconnecting -> StandAlone ) <BR>drbd2: receiver terminated<BR>drbd2: Terminating receiver thread<BR>drbd1: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent ) <BR>drbd1: Began resync as SyncSource (will sync 4 KB [1 bits set]).<BR>drbd1: Resync done (total 1 sec; paused 0 sec; 4 K/sec)<BR>drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) <BR>drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent ) 
<BR>drbd0: Began resync as SyncSource (will sync 4 KB [1 bits set]).<BR>drbd0: Resync done (total 1 sec; paused 0 sec; 4 K/sec)<BR>drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) <BR>drbd2: role( Secondary -> Primary ) <BR>drbd0: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) <BR>drbd0: short read expecting header on sock: r=-512<BR>drbd0: asender terminated<BR>drbd0: Terminating asender thread<BR>drbd0: Connection closed<BR>drbd0: conn( Disconnecting -> StandAlone ) <BR>drbd0: receiver terminated<BR>drbd0: Terminating receiver thread<BR>drbd0: disk( UpToDate -> Diskless ) <BR>drbd0: drbd_bm_resize called with capacity == 0<BR>drbd0: worker terminated<BR>drbd0: Terminating worker thread<BR>drbd1: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) <BR>drbd1: short read expecting header on sock: r=-512<BR>drbd1: asender terminated<BR>drbd1: Terminating asender thread<BR>drbd1: Connection closed<BR>drbd1: conn( Disconnecting -> StandAlone ) <BR>drbd1: receiver terminated<BR>drbd1: Terminating receiver thread<BR>drbd1: disk( UpToDate -> Diskless ) <BR>drbd1: drbd_bm_resize called with capacity == 0<BR>drbd1: worker terminated<BR>drbd1: Terminating worker thread<BR>drbd2: role( Primary -> Secondary ) <BR>drbd2: disk( UpToDate -> Diskless ) <BR>drbd2: drbd_bm_resize called with capacity == 0<BR>drbd2: worker terminated<BR>drbd2: Terminating worker thread<BR>drbd: module cleanup done.<BR>drbd: module not supported by Novell, setting U taint flag.<BR>drbd: initialised. 
Version: 8.3.0 (api:88/proto:86-89)<BR>drbd: GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by <A href="mailto:root@linux-10">root@linux-10</A>, 2009-02-18 16:33:17<BR>drbd: registered as block device major 147<BR>drbd: minor_table @ 0xffff8100740720c0<BR>drbd2: disk( Diskless -> Attaching ) <BR>drbd2: Starting worker thread (from cqueue/5 [470])<BR>drbd2: No usable activity log found.<BR>drbd2: Method to ensure write ordering: barrier<BR>drbd2: max_segment_size ( = BIO size ) = 32768<BR>drbd2: drbd_bm_resize called with capacity == 1011928<BR>drbd2: resync bitmap: bits=126491 words=1977<BR>drbd2: size = 494 MB (505964 KB)<BR>drbd2: recounting of set bits took additional 0 jiffies<BR>drbd2: 8 KB (2 bits) marked out-of-sync by on disk bit-map.<BR>drbd2: disk( Attaching -> UpToDate ) <BR>drbd2: conn( StandAlone -> Unconnected ) <BR>drbd2: Starting receiver thread (from drbd2_worker [27928])<BR>drbd2: receiver (re)started<BR>drbd2: conn( Unconnected -> WFConnection ) <BR>drbd2: Handshake successful: Agreed network protocol version 89<BR>drbd2: conn( WFConnection -> WFReportParams ) <BR>drbd2: Starting asender thread (from drbd2_receiver [27938])<BR>drbd2: data-integrity-alg: <not-used><BR>drbd2: drbd_sync_handshake:<BR>drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780<BR>drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780<BR>drbd2: uuid_compare()=100 by rule 9<BR>drbd2: Split-Brain detected, dropping connection!<BR>drbd2: self 5EB8F04153EED616:96A1102B3AB64E7E:9B3B0DF6A3761D4B:7009CE72C95D4780<BR>drbd2: peer 9AF90754B0369F26:96A1102B3AB64E7F:9B3B0DF6A3761D4A:7009CE72C95D4780<BR>drbd2: helper command: /sbin/drbdadm split-brain minor-2<BR>drbd2: helper command: /sbin/drbdadm split-brain minor-2 exit code 0 (0x0)<BR>drbd2: conn( WFReportParams -> Disconnecting ) <BR>drbd2: error receiving ReportState, l: 4!<BR>drbd2: asender terminated<BR>drbd2: Terminating asender thread<BR>drbd2: Connection 
closed<BR>drbd2: conn( Disconnecting -> StandAlone ) <BR>drbd2: receiver terminated<BR>drbd2: Terminating receiver thread<BR>linux-10:~ # <BR> <BR>> Date: Sat, 11 Apr 2009 01:52:22 +0200<BR>> From: r.bhatia@ipax.at<BR>> To: guohuai_li@hotmail.com<BR>> CC: drbd-user@lists.linbit.com<BR>> Subject: Re: [DRBD-user] Power off caused "Unknown" status.<BR>> <BR>> On 10.04.2009 02:30, guohuai li wrote:<BR>> <BR>> > On machine B:<BR>> ><BR>> > 2: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r---<BR>> > ns:0 nr:0 dw:4 dr:100 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12<BR>> ><BR>> > On machine A:<BR>> ><BR>> > 2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r---<BR>> > ns:9 nr:4 dw:10 dr:200 al:1 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8<BR>> <BR>> did you try "drbdadm connect r2" on both nodes? what does dmesg say?<BR>> what do the logs say?<BR>> <BR>> cheers,<BR>> raoul<BR>> -- <BR>> ____________________________________________________________________<BR>> DI (FH) Raoul Bhatia M.Sc. email. r.bhatia@ipax.at<BR>> Technischer Leiter<BR>> <BR>> IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at<BR>> Barawitzkagasse 10/2/2/11 email. office@ipax.at<BR>> 1190 Wien tel. +43 1 3670030<BR>> FN 277995t HG Wien fax. +43 1 3670030 15<BR>> ____________________________________________________________________<BR></body>
</html>