[DRBD-user] DRBD sync starts over again and again

Harald Rinker harald.rinker at unitedprint.com
Mon Dec 24 23:55:45 CET 2007



Hello list,

I have set up drbd8 on Debian etch, at first with one node, and have now
added the second one.

Now I get the messages below in /var/log/syslog, and cat /proc/drbd
sometimes shows

k641
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@k641, 2007-12-24 10:57:36
0: cs:WFBitMapT st:Secondary/Primary ds:Inconsistent/UpToDate A r---
ns:0 nr:385901745 dw:385901745 dr:0 al:0 bm:30624 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:24671925 misses:35664 starving:0 dirty:0 
changed:35664
act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

and sometimes
k641
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@k641, 2007-12-24 10:57:36
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate A r---
ns:0 nr:386266492 dw:386265627 dr:0 al:0 bm:30664 lo:29 pe:0 ua:28 ap:0
[>...................] sync'ed: 0.1% (1179695/1179704)M
finish: 34:57:14 speed: 9,376 (9,376) K/sec
resync: used:2/31 hits:24695008 misses:35714 starving:0 dirty:0 
changed:35714
act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

On the other node the state changes every 2 seconds, and there is heavy
load on the network:

k713:~# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@k713, 2007-12-10 15:38:16
0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent A r---
ns:387478872 nr:0 dw:763638 dr:402280172 al:3772 bm:31358 lo:1 pe:167 
ua:162 ap:1
[>...................] sync'ed: 0.1% (1178493/1178507)M
finish: 22:38:58 speed: 14,720 (14,720) K/sec
resync: used:1/31 hits:24440115 misses:35137 starving:0 dirty:0 
changed:35137
act_log: used:1/127 hits:348332 misses:4159 starving:0 dirty:387 
changed:3772


k713:~# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@k713, 2007-12-10 15:38:16
0: cs:WFConnection st:Primary/Unknown ds:UpToDate/Inconsistent A r---
ns:388863328 nr:0 dw:766849 dr:403709068 al:3810 bm:31502 lo:1 pe:0 ua:0 
ap:1
resync: used:0/31 hits:24527251 misses:35294 starving:0 dirty:0 
changed:35294
act_log: used:1/127 hits:349654 misses:4197 starving:0 dirty:387 
changed:3810


-snip
Writing meta data super block now.
Dec 24 23:39:36 k713 kernel: drbd0: sock_sendmsg returned -104
Dec 24 23:39:36 k713 kernel: drbd0: peer( Secondary -> Unknown ) conn( 
SyncSource -> BrokenPipe )
Dec 24 23:39:36 k713 kernel: drbd0: meta connection shut down by peer.
Dec 24 23:39:36 k713 kernel: drbd0: asender terminated
Dec 24 23:39:36 k713 kernel: drbd0: tl_clear()
Dec 24 23:39:36 k713 kernel: drbd0: Connection closed
Dec 24 23:39:36 k713 kernel: drbd0: Writing meta data super block now.
Dec 24 23:39:36 k713 kernel: drbd0: conn( BrokenPipe -> Unconnected )
Dec 24 23:39:36 k713 kernel: drbd0: receiver terminated
Dec 24 23:39:36 k713 kernel: drbd0: receiver (re)started
Dec 24 23:39:36 k713 kernel: drbd0: conn( Unconnected -> WFConnection )
Dec 24 23:39:36 k713 kernel: drbd0: conn( WFConnection -> WFReportParams )
Dec 24 23:39:36 k713 kernel: drbd0: Handshake successful: DRBD Network 
Protocol version 86
Dec 24 23:39:36 k713 kernel: drbd0: Peer authenticated using 32 bytes of 
'sha256' HMAC
Dec 24 23:39:36 k713 kernel: drbd0: Becoming sync source due to disk states.
Dec 24 23:39:36 k713 kernel: drbd0: peer( Unknown -> Secondary ) conn( 
WFReportParams -> WFBitMapS )
Dec 24 23:39:37 k713 kernel: drbd0: Writing meta data super block now.
Dec 24 23:39:37 k713 kernel: drbd0: conn( WFBitMapS -> SyncSource )
Dec 24 23:39:37 k713 kernel: drbd0: Began resync as SyncSource (will 
sync 1212785644 KB [303196411 bits set]).
Dec 24 23:39:37 k713 kernel: drbd0: Writing meta data super block now.
Dec 24 23:39:41 k713 kernel: drbd0: _drbd_send_page: size=4096 len=4096 
sent=-104
Dec 24 23:39:41 k713 kernel: drbd0: drbd_send_block() failed
Dec 24 23:39:41 k713 kernel: drbd0: peer( Secondary -> Unknown ) conn( 
SyncSource -> NetworkFailure )
Dec 24 23:39:41 k713 kernel: drbd0: meta connection shut down by peer.
Dec 24 23:39:41 k713 kernel: drbd0: asender terminated
Dec 24 23:39:41 k713 kernel: drbd0: tl_clear()
Dec 24 23:39:41 k713 kernel: drbd0: Connection closed
Dec 24 23:39:41 k713 kernel: drbd0: Writing meta data super block now.
Dec 24 23:39:41 k713 kernel: drbd0: conn( NetworkFailure -> Unconnected )
Dec 24 23:39:41 k713 kernel: drbd0: receiver terminated
Dec 24 23:39:41 k713 kernel: drbd0: receiver (re)started
Dec 24 23:39:41 k713 kernel: drbd0: conn( Unconnected -> WFConnection )
Dec 24 23:39:41 k713 kernel: drbd0: conn( WFConnection -> WFReportParams )
Dec 24 23:39:41 k713 kernel: drbd0: Handshake successful: DRBD Network 
Protocol version 86
Dec 24 23:39:41 k713 kernel: drbd0: Peer authenticated using 32 bytes of 
'sha256' HMAC
Dec 24 23:39:41 k713 kernel: drbd0: Becoming sync source due to disk states.
Dec 24 23:39:41 k713 kernel: drbd0: peer( Unknown -> Secondary ) conn( 
WFReportParams -> WFBitMapS )
Dec 24 23:39:42 k713 kernel: drbd0: Writing meta data super block now.
Dec 24 23:39:42 k713 kernel: drbd0: conn( WFBitMapS -> SyncSource )
Dec 24 23:39:42 k713 kernel: drbd0: Began resync as SyncSource (will 
sync 1212638412 KB [303159603 bits set]).
Dec 24 23:39:42 k713 kernel: drbd0: Writing meta data super block now.
-snip
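
To see how often the connection drops and the resync restarts, the
conn( ... ) transitions in a log excerpt like the one above can be
tallied. This is only a minimal sketch, assuming the excerpt is available
as plain text; the function name count_conn_transitions is made up for
illustration and is not part of DRBD. It joins the lines first, because
the mail client has wrapped some log entries.

```python
import re
from collections import Counter

# Matches the connection-state part of a DRBD 8.0 kernel log line,
# e.g. "conn( SyncSource -> BrokenPipe )". The peer( ... ) part is ignored.
CONN_RE = re.compile(r"conn\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def count_conn_transitions(log_text):
    """Return a Counter of (old_state, new_state) pairs found in log_text."""
    # Collapse all whitespace (including the line wraps added by the
    # mail client) so that a wrapped entry is matched as one piece.
    joined = " ".join(log_text.split())
    return Counter(CONN_RE.findall(joined))


# A few lines copied from the k713 log above (hypothetical selection).
sample = """\
Dec 24 23:39:36 k713 kernel: drbd0: peer( Secondary -> Unknown ) conn(
SyncSource -> BrokenPipe )
Dec 24 23:39:36 k713 kernel: drbd0: conn( BrokenPipe -> Unconnected )
Dec 24 23:39:37 k713 kernel: drbd0: conn( WFBitMapS -> SyncSource )
Dec 24 23:39:41 k713 kernel: drbd0: peer( Secondary -> Unknown ) conn(
SyncSource -> NetworkFailure )
Dec 24 23:39:42 k713 kernel: drbd0: conn( WFBitMapS -> SyncSource )
"""

transitions = count_conn_transitions(sample)
for (old, new), n in sorted(transitions.items()):
    print(f"{old} -> {new}: {n}")
```

Every "WFBitMapS -> SyncSource" entry marks one fresh start of the
resync, so its count shows how many times the sync began again within
the excerpt.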

How can I resolve this? I have tested this configuration in VMware and it
works fine there.

I can't believe that this is really a network error, because ping is OK
and the load on the interfaces is ~ 40 MB/sec.

Greets Harry


