[DRBD-user] servers out of sync

Marcel Kraan marcel at kraan.net
Sun May 13 15:25:11 CEST 2012


I can't get the two nodes back in sync.
Both of them are now StandAlone, even though I can ping each one.

I have run out of options.

[root@kvmstorage1 drbd.d]# cat /proc/drbd
version: 8.3.12 (api:88/proto:86-96)
GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:412 dr:9926 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:280

[root@kvmstorage2 drbd.d]# cat /proc/drbd
version: 8.3.12 (api:88/proto:86-96)
GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:264
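
The two status dumps above can also be compared mechanically. What follows is only a minimal sketch in Python, not part of DRBD: the helper names parse_drbd_status and both_standalone are made up for illustration, and the regular expression assumes the DRBD 8.3 /proc/drbd layout shown above (a device line carrying cs:, ro: and ds: fields).

```python
import re

# Matches a device line as printed by /proc/drbd above, for example:
# " 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+ro:(?P<ro>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_drbd_status(text):
    """Return {minor: {"cs": ..., "ro": ..., "ds": ...}} for every device line."""
    devices = {}
    for line in text.splitlines():
        m = STATUS_RE.match(line)
        if m:
            devices[int(m.group("minor"))] = {
                "cs": m.group("cs"),
                "ro": m.group("ro"),
                "ds": m.group("ds"),
            }
    return devices


def both_standalone(status_a, status_b, minor=0):
    """True when both nodes report cs:StandAlone for the given minor."""
    return (
        status_a[minor]["cs"] == "StandAlone"
        and status_b[minor]["cs"] == "StandAlone"
    )
```

Fed the output of cat /proc/drbd from each node, both_standalone() would return True for the state shown here, i.e. neither side is trying to connect to the other.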




/var/log/messages from both servers, starting with kvmstorage2 (currently Secondary):

[root@kvmstorage2 drbd.d]# service drbd restart
Stopping all DRBD resources: May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( UpToDate -> Failed ) 
May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( Failed -> Diskless ) 
May 13 15:14:13 kvmstorage2 kernel: block drbd0: drbd_bm_resize called with capacity == 0
May 13 15:14:13 kvmstorage2 kernel: block drbd0: worker terminated
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Terminating worker thread
May 13 15:14:13 kvmstorage2 kernel: drbd: module cleanup done.
.
Starting DRBD resources: May 13 15:14:13 kvmstorage2 kernel: drbd: initialized. Version: 8.3.12 (api:88/proto:86-96)
May 13 15:14:13 kvmstorage2 kernel: drbd: GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
May 13 15:14:13 kvmstorage2 kernel: drbd: registered as block device major 147
May 13 15:14:13 kvmstorage2 kernel: drbd: minor_table @ 0xffff88020f7257c0
[ d(main) May 13 15:14:13 kvmstorage2 kernel: block drbd0: Starting worker thread (from cqueue [1344])
May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( Diskless -> Attaching ) 
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Found 6 transactions (34 active extents) in activity log.
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Method to ensure write ordering: barrier
May 13 15:14:13 kvmstorage2 kernel: block drbd0: max BIO size = 131072
May 13 15:14:13 kvmstorage2 kernel: block drbd0: drbd_bm_resize called with capacity == 6920386232
May 13 15:14:13 kvmstorage2 kernel: block drbd0: resync bitmap: bits=865048279 words=13516380 pages=26400
May 13 15:14:13 kvmstorage2 kernel: block drbd0: size = 3300 GB (3460193116 KB)
May 13 15:14:13 kvmstorage2 kernel: block drbd0: bitmap READ of 26400 pages took 198 jiffies
May 13 15:14:13 kvmstorage2 kernel: block drbd0: recounting of set bits took additional 90 jiffies
May 13 15:14:13 kvmstorage2 kernel: block drbd0: 264 KB (66 bits) marked out-of-sync by on disk bit-map.
May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( Attaching -> UpToDate ) 
May 13 15:14:13 kvmstorage2 kernel: block drbd0: attached to UUIDs C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99
n(main) May 13 15:14:13 kvmstorage2 kernel: block drbd0: conn( StandAlone -> Unconnected ) 
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Starting receiver thread (from drbd0_worker [6484])
May 13 15:14:13 kvmstorage2 kernel: block drbd0: receiver (re)started
May 13 15:14:13 kvmstorage2 kernel: block drbd0: conn( Unconnected -> WFConnection ) 
]May 13 15:14:14 kvmstorage2 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
May 13 15:14:14 kvmstorage2 kernel: block drbd0: conn( WFConnection -> WFReportParams ) 
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Starting asender thread (from drbd0_receiver [6494])
May 13 15:14:14 kvmstorage2 kernel: block drbd0: data-integrity-alg: <not-used>
May 13 15:14:14 kvmstorage2 kernel: block drbd0: drbd_sync_handshake:
May 13 15:14:14 kvmstorage2 kernel: block drbd0: self C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: peer E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:70 flags:0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: uuid_compare()=100 by rule 90
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: meta connection shut down by peer.
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 13 15:14:14 kvmstorage2 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) 
May 13 15:14:14 kvmstorage2 kernel: block drbd0: error receiving ReportState, l: 4!
May 13 15:14:14 kvmstorage2 kernel: block drbd0: asender terminated
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Terminating asender thread
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Connection closed
May 13 15:14:14 kvmstorage2 kernel: block drbd0: conn( Disconnecting -> StandAlone ) 
May 13 15:14:14 kvmstorage2 kernel: block drbd0: receiver terminated
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Terminating receiver thread



Second server, kvmstorage1 (currently Primary):

[root@kvmstorage1 drbd.d]# service drbd restart
Stopping all DRBD resources: umount: /datastore: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
/dev/drbd0: State change failed: (-12) Device is held open by someone
May 13 15:16:22 kvmstorage1 kernel: block drbd0: State change failed: Device is held open by someone
May 13 15:16:22 kvmstorage1 kernel: block drbd0:   state = { cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----- }
May 13 15:16:22 kvmstorage1 kernel: block drbd0:  wanted = { cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----- }
ERROR: Module drbd is in use
.
Starting DRBD resources: [ n(main) May 13 15:16:22 kvmstorage1 kernel: block drbd0: conn( StandAlone -> Unconnected ) 
May 13 15:16:22 kvmstorage1 kernel: block drbd0: Starting receiver thread (from drbd0_worker [1441])
May 13 15:16:22 kvmstorage1 kernel: block drbd0: receiver (re)started
May 13 15:16:22 kvmstorage1 kernel: block drbd0: conn( Unconnected -> WFConnection ) 
]..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 0 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'drbd'; 0 sec -> wait forever)  
(I had to restart DRBD on the second node at this point.)
 To abort waiting enter 'yes' [  54]:May 13 15:17:16 kvmstorage1 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
May 13 15:17:16 kvmstorage1 kernel: block drbd0: conn( WFConnection -> WFReportParams ) 
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Starting asender thread (from drbd0_receiver [7458])
May 13 15:17:16 kvmstorage1 kernel: block drbd0: data-integrity-alg: <not-used>
May 13 15:17:16 kvmstorage1 kernel: block drbd0: drbd_sync_handshake:
May 13 15:17:16 kvmstorage1 kernel: block drbd0: self E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:70 flags:0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: peer C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: uuid_compare()=100 by rule 90
May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0

May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 13 15:17:16 kvmstorage1 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) 
May 13 15:17:16 kvmstorage1 kernel: block drbd0: error receiving ReportState, l: 4!
May 13 15:17:16 kvmstorage1 kernel: block drbd0: asender terminated
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Terminating asender thread
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Connection closed
May 13 15:17:16 kvmstorage1 kernel: block drbd0: conn( Disconnecting -> StandAlone ) 
May 13 15:17:16 kvmstorage1 kernel: block drbd0: receiver terminated
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Terminating receiver thread
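
To pull the relevant facts out of logs like the two above without reading every line, something along these lines might help. It is a rough sketch under assumptions: summarize_handshake is a hypothetical helper, not a DRBD tool, and the patterns only cover the syslog line format and the "self"/"peer" UUID lines exactly as they appear in this post.

```python
import re

# Syslog line as seen above: "<Mon> <day> <time> <host> kernel: block drbdN: <msg>".
# search() is used instead of match() because init-script output is sometimes
# interleaved in front of the timestamp on the console.
LINE_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) kernel: "
    r"block (?P<dev>drbd\d+): (?P<msg>.*)"
)
# UUID report printed after "drbd_sync_handshake:".
UUID_RE = re.compile(r"^(?P<who>self|peer) (?P<uuids>[0-9A-F:]+) bits:(?P<bits>\d+)")


def summarize_handshake(log_text):
    """Collect split-brain events and the self/peer UUID sets reported per host."""
    summary = {"split_brain": [], "uuids": {}}
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        msg = m.group("msg")
        if msg.startswith("Split-Brain detected"):
            summary["split_brain"].append((m.group("host"), m.group("ts")))
        u = UUID_RE.match(msg)
        if u:
            summary["uuids"][(m.group("host"), u.group("who"))] = {
                "uuids": u.group("uuids").split(":"),
                "bits": int(u.group("bits")),
            }
    return summary
```

Run over the concatenated logs from both nodes, this would list one split-brain event per host and make it easy to see that the first UUID reported as "self" on one node differs from the first UUID reported as "self" on the other.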

