[DRBD-user] Delayed Handshake start on DRBD version 8.4

Vivek Singh vivek.singh.kr at gmail.com
Mon Dec 30 15:56:17 CET 2019


Hi,

I have found a strange behavior. Earlier I was working on DRBD version
8.3.8 and all remains good for "Digest integrity check FAILED" but with
recent upgrade on 8.4 started to see delay in handshake process for around
7-8 sec which is causing temporarily split-brain type situation. I am very
new to DRBD so please help me find this issue.Though I am planning to avoid
using  data-integrity-alg: md5 to avoid this error. But still need to
understand what has changed in DRBD that caused the issue.
I also tried updating the drbd.conf :
fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
 AS : fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t *1*";
Still no success.  Any help or pointer is appreciated.

Following is the log sequence:

On version 8.4.11-1

Dec 25 14:29:58 t1 kernel: block drbd0: Digest integrity check FAILED:
212623024s +4096
Dec 25 14:29:58 t1 kernel: drbd r0: error receiving Data, e: -5 l: 4112!
Dec 25 14:29:58 t1 kernel: drbd r0: peer( Primary -> Unknown ) conn(
Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
Dec 25 14:30:06 t1 kernel: drbd r0: Handshake successful: Agreed network
protocol version 101
Dec 25 14:30:06 t1 kernel: drbd r0: Feature flags enabled on protocol
level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Dec 25 14:30:06 t1 kernel: drbd r0: Peer authenticated using 20 bytes HMAC
Dec 25 14:30:06 t1 kernel: drbd r0: conn( WFConnection -> WFReportParams )
Dec 25 14:30:06 t1 kernel: drbd r0: Starting ack_recv thread (from
drbd_r_r0 [46248])
Dec 25 14:30:06 t1 kernel: block drbd0: drbd_sync_handshake:
Dec 25 14:30:06 t1 kernel: block drbd0: self
0238C8E28ABB978C:0000000000000000:9B86A2A3B57BABE6:9B85A2A3B57BABE7 bits:0
flags:0
Dec 25 14:30:06 t1 kernel: block drbd0: peer
134DF8CBE55C5001:0238C8E28ABB978D:9B86A2A3B57BABE7:9B85A2A3B57BABE7
bits:526 flags:0
Dec 25 14:30:06 t1 kernel: block drbd0: uuid_compare()=-1 by rule 50
Dec 25 14:30:06 t1 kernel: block drbd0: peer( Unknown -> Primary ) conn(
WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Dec 25 14:30:06 t1 kernel: block drbd0: receive bitmap stats
[Bytes(packets)]: plain 0(0), RLE 1215(1), total 1215; compression: 100.0%
Dec 25 14:30:06 t1 kernel: block drbd0: send bitmap stats [Bytes(packets)]:
plain 0(0), RLE 1215(1), total 1215; compression: 100.0%
Dec 25 14:30:06 t1 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )
Dec 25 14:30:06 t1 kernel: block drbd0: updated sync uuid
0239C8E28ABB978C:0000000000000000:9B86A2A3B57BABE6:9B85A2A3B57BABE7
Dec 25 14:30:06 t1 kernel: block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0
Dec 25 14:30:06 t1 kernel: block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0 exit code 0 (0x0)
Dec 25 14:30:06 t1 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget )
disk( Outdated -> Inconsistent )
Dec 25 14:30:06 t1 kernel: block drbd0: Began resync as SyncTarget (will
sync 2104 KB [526 bits set]).



On Version 8.3.8.1

Dec 2 06:23:18 t1 kernel: block drbd0: Digest integrity check FAILED.
Dec 2 06:23:18 t1 kernel: block drbd0: error receiving Data, l: 4136!
Dec 2 06:23:18 t1 kernel: block drbd0: peer( Primary -> Unknown ) conn(
Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
Dec 2 06:23:18 t1 kernel: block drbd0: asender terminated
Dec 2 06:23:18 t1 kernel: block drbd0: Terminating asender thread
Dec 2 06:23:18 t1 kernel: block drbd0: Connection closed
Dec 2 06:23:18 t1 kernel: block drbd0: conn( ProtocolError -> Unconnected )
Dec 2 06:23:18 t1 kernel: block drbd0: receiver terminated
Dec 2 06:23:18 t1 kernel: block drbd0: Restarting receiver thread
Dec 2 06:23:18 t1 kernel: block drbd0: receiver (re)started
Dec 2 06:23:18 t1 kernel: block drbd0: conn( Unconnected -> WFConnection )
Dec 2 06:23:19 t1 kernel: block drbd0: Handshake successful: Agreed network
protocol version 94
Dec 2 06:23:19 t1 kernel: block drbd0: Peer authenticated using 20 bytes of
'sha1' HMAC
Dec 2 06:23:19 t1 kernel: block drbd0: conn( WFConnection -> WFReportParams
)
Dec 2 06:23:19 t1 kernel: block drbd0: Starting asender thread (from
drbd0_receiver [1909])
Dec 2 06:23:19 t1 kernel: block drbd0: data-integrity-alg: md5
Dec 2 06:23:19 t1 kernel: block drbd0: drbd_sync_handshake:
Dec 2 06:23:19 t1 kernel: block drbd0: self
296E21EFF323E278:0000000000000000:404AAC56D3649880:9D5477E5D21C6D23 bits:0
flags:0
Dec 2 06:23:19 t1 kernel: block drbd0: peer
2DFCAA9A02C8F155:296E21EFF323E279:404AAC56D3649880:9D5477E5D21C6D23
bits:129 flags:0
Dec 2 06:23:19 t1 kernel: block drbd0: uuid_compare()=-1 by rule 50
Dec 2 06:23:19 t1 kernel: block drbd0: peer( Unknown -> Primary ) conn(
WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Dec 2 06:23:19 t1 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )
Dec 2 06:23:19 t1 kernel: block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0
Dec 2 06:23:19 t1 kernel: block drbd0: helper command: /sbin/drbdadm
before-resync-target minor-0 exit code 0 (0x0)
Dec 2 06:23:19 t1 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget )
disk( UpToDate -> Inconsistent )
Dec 2 06:23:19 t1 kernel: block drbd0: Began resync as SyncTarget (will
sync 516 KB [129 bits set]).
Dec 2 06:23:19 t1 kernel: block drbd0: Resync done (total 1 sec; paused 0
sec; 516 K/sec)
Dec 2 06:23:19 t1 kernel: block drbd0: conn( SyncTarget -> Connected )
disk( Inconsistent -> UpToDate )
Dec 2 06:23:19 t1 kernel: block drbd0: helper command: /sbin/drbdadm
after-resync-target minor-0
Dec 2 06:23:19 t1 kernel: block drbd0: helper command: /sbin/drbdadm
after-resync-target minor-0 exit code 0 (0x0)


Thanks
Vivek
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20191230/37219c4f/attachment.htm>


More information about the drbd-user mailing list