[DRBD-user] drbd resyncing entire device after each reboot

Digimer lists at alteeve.ca
Sat Oct 6 06:02:58 CEST 2018


On 2018-10-05 04:02 PM, Hanspeter Kunz wrote:
> Hi there,
> 
> I see a strange behavior on a freshly set up pair of machines (debian
> stretch, drbd 8.4.7): 
> 
> After each reboot, the whole drbd device is resynced from scratch, even
> if both drbd devices report UpToDate before the reboot. I have never
> experienced this on my other drbd installations.
> 
> I just rebooted the secondary machine. After starting drbd, syslog gives
> me the following information on that machine:
> 
> Oct  5 21:36:43 claire drbd[3578]: Starting DRBD resources:[
> Oct  5 21:36:43 claire drbd[3578]:      create res: nfs
> Oct  5 21:36:43 claire drbd[3578]:    prepare disk: nfs
> Oct  5 21:36:43 claire kernel: [  379.663592] drbd nfs: Starting worker thread (from drbdsetup-84 [3596])
> Oct  5 21:36:43 claire kernel: [  379.664004] block drbd0: disk( Diskless -> Attaching ) 
> Oct  5 21:36:43 claire kernel: [  379.664629] drbd nfs: Method to ensure write ordering: flush
> Oct  5 21:36:43 claire kernel: [  379.664634] block drbd0: max BIO size = 1048576
> Oct  5 21:36:43 claire kernel: [  379.664642] block drbd0: drbd_bm_resize called with capacity == 53685452728
> Oct  5 21:36:43 claire kernel: [  379.875816] block drbd0: resync bitmap: bits=6710681591 words=104854400 pages=204794
> Oct  5 21:36:43 claire kernel: [  379.875819] block drbd0: size = 25 TB (26842726364 KB)
> Oct  5 21:36:44 claire drbd[3578]:     adjust disk: nfs
> Oct  5 21:36:44 claire kernel: [  381.510770] block drbd0: recounting of set bits took additional 32 jiffies
> Oct  5 21:36:44 claire kernel: [  381.510772] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> Oct  5 21:36:44 claire kernel: [  381.510778] block drbd0: disk( Attaching -> UpToDate ) 
> Oct  5 21:36:44 claire kernel: [  381.510789] block drbd0: attached to UUIDs 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
> Oct  5 21:36:44 claire drbd[3578]:      adjust net: nfs
> Oct  5 21:36:44 claire drbd[3578]: ]
> Oct  5 21:36:44 claire kernel: [  381.516705] drbd nfs: conn( StandAlone -> Unconnected ) 
> Oct  5 21:36:44 claire kernel: [  381.516756] drbd nfs: Starting receiver thread (from drbd_w_nfs [3598])
> Oct  5 21:36:44 claire kernel: [  381.516823] drbd nfs: receiver (re)started
> Oct  5 21:36:44 claire kernel: [  381.516883] drbd nfs: conn( Unconnected -> WFConnection ) 
> Oct  5 21:36:45 claire kernel: [  382.250879] drbd nfs: Handshake successful: Agreed network protocol version 101
> Oct  5 21:36:45 claire kernel: [  382.250884] drbd nfs: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
> Oct  5 21:36:45 claire kernel: [  382.251202] drbd nfs: Peer authenticated using 20 bytes HMAC
> Oct  5 21:36:45 claire kernel: [  382.251307] drbd nfs: conn( WFConnection -> WFReportParams ) 
> Oct  5 21:36:45 claire kernel: [  382.251366] drbd nfs: Starting ack_recv thread (from drbd_r_nfs [3607])
> Oct  5 21:36:45 claire kernel: [  382.310672] block drbd0: drbd_sync_handshake:
> Oct  5 21:36:45 claire kernel: [  382.310680] block drbd0: self 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7 bits:0 flags:0
> Oct  5 21:36:45 claire kernel: [  382.310687] block drbd0: peer 06D17ADE18B89143:0000000000000005:B6D88D552E97D8B7:B6D78D552E97D8B7 bits:0 flags:0
> Oct  5 21:36:45 claire kernel: [  382.310691] block drbd0: uuid_compare()=-2 by rule 20
> Oct  5 21:36:45 claire kernel: [  382.310696] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
> Oct  5 21:36:47 claire kernel: [  383.728620] block drbd0: bitmap WRITE of 204794 pages took 1228 ms
> Oct  5 21:36:47 claire kernel: [  383.728626] block drbd0: 25 TB (6710681591 bits) marked out-of-sync by on disk bit-map.
> Oct  5 21:36:47 claire kernel: [  383.728693] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate ) 
> Oct  5 21:36:47 claire drbd[3578]: WARN: stdin/stdout is not a TTY; using /dev/console.
> Oct  5 21:36:47 claire systemd[1]: Started LSB: Control DRBD resources..
> Oct  5 21:36:47 claire kernel: [  384.049775] block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
> Oct  5 21:36:47 claire kernel: [  384.145044] block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
> Oct  5 21:36:47 claire kernel: [  384.145049] block drbd0: conn( WFBitMapT -> WFSyncUUID ) 
> Oct  5 21:36:47 claire kernel: [  384.275789] block drbd0: updated sync uuid 0001000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
> Oct  5 21:36:47 claire kernel: [  384.275945] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> Oct  5 21:36:47 claire kernel: [  384.279872] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> Oct  5 21:36:47 claire kernel: [  384.279905] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
> Oct  5 21:36:47 claire kernel: [  384.279949] block drbd0: Began resync as SyncTarget (will sync 26842726364 KB [6710681591 bits set]).
> 
> Probably the explanation is simple, I just do not see it. 
> 
> If you need the configuration (although it should be identical to
> similar drbd configs which are working without problems) I am happy to
> provide it.
> 
> Best, and many thanks if anybody can shed some light on this,
> Hp

Can you share your config? Are you using thin LVM?

Also, 8.4.7 is _ancient_. There have been a great many bug fixes since
then, which may or may not be related. In any case, updating is
_strongly_ recommended.
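
For reading handshake logs like the one quoted above, here is a minimal
sketch that pulls the self/peer identifiers out of the drbd_sync_handshake
lines so they can be compared side by side. It assumes the four
colon-separated fields are the current UUID, the bitmap UUID, and two
history UUIDs (the usual 8.4 ordering), and that each bitmap bit covers
4 KiB, which is what the bits/size figures in the log imply. The names
parse_handshake and bitmap_kib are hypothetical helpers, not part of any
DRBD tool.

```python
import re

# Matches the "self"/"peer" lines that drbd_sync_handshake writes to the
# kernel log. Field order (current:bitmap:history:history) is an assumption.
HANDSHAKE_RE = re.compile(
    r"block (?P<dev>drbd\d+): (?P<side>self|peer) "
    r"(?P<uuids>[0-9A-Fa-f]{16}(?::[0-9A-Fa-f]{16}){3}) "
    r"bits:(?P<bits>\d+) flags:(?P<flags>[0-9A-Fa-f]+)"
)


def parse_handshake(log_text):
    """Return {"self": {...}, "peer": {...}} from handshake log lines.

    If the log contains several handshakes, the last one wins.
    """
    result = {}
    for m in HANDSHAKE_RE.finditer(log_text):
        current, bitmap, hist1, hist2 = m.group("uuids").split(":")
        result[m.group("side")] = {
            "device": m.group("dev"),
            "current": current,
            "bitmap": bitmap,
            "history": [hist1, hist2],
            "bits": int(m.group("bits")),
        }
    return result


def bitmap_kib(bits, kib_per_bit=4):
    """Size in KiB covered by `bits` bitmap bits (4 KiB per bit assumed)."""
    return bits * kib_per_bit


# The two handshake lines from the quoted log, without the syslog prefix.
sample = """\
block drbd0: self 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7 bits:0 flags:0
block drbd0: peer 06D17ADE18B89143:0000000000000005:B6D88D552E97D8B7:B6D78D552E97D8B7 bits:0 flags:0
"""

if __name__ == "__main__":
    for side, info in parse_handshake(sample).items():
        print(side, "current:", info["current"], "bitmap:", info["bitmap"])
    print("full bitmap covers", bitmap_kib(6710681591), "KiB")
```

On the quoted log this shows the rebooted node presenting a current UUID
of 0000000000000004 while the peer presents 06D17ADE18B89143, and the
bitmap size works out to the 26842726364 KB that the kernel reports.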

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

