[DRBD-user] repeated resync/fail/resync...

Steve Thompson smt at vgersoft.com
Sat Jan 8 16:07:17 CET 2011


CentOS 5.5, x86_64, drbd 8.3.8.

After running fine for a month, this started a couple of days ago on only 
one of three drbd volumes (the other two being fine):

Jan  8 09:54:53 tiger kernel: block drbd11: Began resync as SyncSource (will sync 1882624060 KB [470656015 bits set]).
Jan  8 09:54:57 tiger kernel: block drbd11: sock_recvmsg returned -104
Jan  8 09:54:57 tiger kernel: block drbd11: peer( Secondary -> Unknown ) conn( SyncSource -> NetworkFailure )
Jan  8 09:54:57 tiger kernel: block drbd11: asender terminated
Jan  8 09:54:57 tiger kernel: block drbd11: Terminating asender thread
Jan  8 09:54:57 tiger kernel: block drbd11: sock was shut down by peer
Jan  8 09:54:57 tiger kernel: block drbd11: short read expecting header on sock: r=0
Jan  8 09:54:57 tiger kernel: block drbd11: drbd_send_block/ack() failed
Jan  8 09:54:57 tiger kernel: block drbd11: Connection closed
Jan  8 09:54:57 tiger kernel: block drbd11: conn( NetworkFailure -> Unconnected )
Jan  8 09:54:57 tiger kernel: block drbd11: receiver terminated
Jan  8 09:54:57 tiger kernel: block drbd11: Restarting receiver thread
Jan  8 09:54:57 tiger kernel: block drbd11: receiver (re)started
Jan  8 09:54:57 tiger kernel: block drbd11: conn( Unconnected -> WFConnection )
Jan  8 09:54:57 tiger kernel: block drbd11: Handshake successful: Agreed network protocol version 94
Jan  8 09:54:57 tiger kernel: block drbd11: Peer authenticated using 20 bytes of 'sha1' HMAC
Jan  8 09:54:57 tiger kernel: block drbd11: conn( WFConnection -> WFReportParams )
Jan  8 09:54:57 tiger kernel: block drbd11: Starting asender thread (from drbd11_receiver [4148])
Jan  8 09:54:57 tiger kernel: block drbd11: data-integrity-alg: sha1
Jan  8 09:54:57 tiger kernel: block drbd11: drbd_sync_handshake:
Jan  8 09:54:57 tiger kernel: block drbd11: self 59D4BF73D5EC82B7:0CC8A25C011E7E57:1FF443C818621FD3:D319C97BC05715A4 bits:470570058 flags:0
Jan  8 09:54:57 tiger kernel: block drbd11: peer 0CC8A25C011E7E56:0000000000000000:3AEE8F0D28607364:01DF314E083700DD bits:470569999 flags:0
Jan  8 09:54:57 tiger kernel: block drbd11: uuid_compare()=1 by rule 70
Jan  8 09:54:57 tiger kernel: block drbd11: Becoming sync source due to disk states.
Jan  8 09:54:57 tiger kernel: block drbd11: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
Jan  8 09:54:58 tiger kernel: block drbd11: conn( WFBitMapS -> SyncSource )

This sequence repeats continuously: the resync gets to 0.1% and then 
starts over. The replication link is a point-to-point pair of bonded GbE 
interfaces and is fully operational. All other hardware is fine, and has 
been all along. Nothing has been changed. Can anyone give me a clue as to 
why this is happening?
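
To quantify the loop, the log can be tallied per device. The following is 
only a rough sketch, not a diagnostic tool: the function name summarize() 
and the SAMPLE text are made up for illustration, the regular expressions 
assume the exact message wording shown in the excerpt above, and a real 
run would read the syslog file instead of the embedded sample.

```python
import re
from collections import defaultdict

# A few lines copied from the excerpt above; a real run would read the
# syslog file instead (its location is system-specific).
SAMPLE = """\
Jan  8 09:54:53 tiger kernel: block drbd11: Began resync as SyncSource (will sync 1882624060 KB [470656015 bits set]).
Jan  8 09:54:57 tiger kernel: block drbd11: sock_recvmsg returned -104
Jan  8 09:54:57 tiger kernel: block drbd11: peer( Secondary -> Unknown ) conn( SyncSource -> NetworkFailure )
Jan  8 09:54:57 tiger kernel: block drbd11: conn( NetworkFailure -> Unconnected )
Jan  8 09:54:58 tiger kernel: block drbd11: conn( WFBitMapS -> SyncSource )
"""

# Patterns assume the message wording seen in the excerpt.
RESYNC_RE = re.compile(
    r"block (drbd\d+): Began resync as \w+ "
    r"\(will sync (\d+) KB \[(\d+) bits set\]\)"
)
CONN_RE = re.compile(r"block (drbd\d+): .*conn\( (\w+) -> (\w+) \)")
RECV_RE = re.compile(r"block (drbd\d+): sock_recvmsg returned (-?\d+)")


def summarize(lines):
    """Count resync starts and network failures per DRBD device."""
    stats = defaultdict(lambda: {
        "resync_starts": 0,
        "network_failures": 0,
        "recv_errors": [],      # raw return codes; -104 is -ECONNRESET on Linux
        "kb_per_bit": set(),    # KB covered by one out-of-sync bitmap bit
    })
    for line in lines:
        m = RESYNC_RE.search(line)
        if m:
            dev, kb, bits = m.group(1), int(m.group(2)), int(m.group(3))
            stats[dev]["resync_starts"] += 1
            if bits:
                stats[dev]["kb_per_bit"].add(kb / bits)
            continue
        m = RECV_RE.search(line)
        if m:
            stats[m.group(1)]["recv_errors"].append(int(m.group(2)))
            continue
        m = CONN_RE.search(line)
        if m and m.group(3) == "NetworkFailure":
            stats[m.group(1)]["network_failures"] += 1
    return dict(stats)


if __name__ == "__main__":
    for dev, info in summarize(SAMPLE.splitlines()).items():
        print(dev, info)
```

If resync_starts and network_failures climb in lockstep, every resync 
attempt is ending in a dropped connection rather than being cancelled 
locally. On Linux, errno 104 is ECONNRESET, which matches the "sock was 
shut down by peer" message, so the peer's log for the same timestamps is 
probably the next place to look.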

Steve