Verify consistently fails after rebooting secondary node

Tim Westbrook Tim_Westbrook at selinc.com
Tue Dec 24 20:01:57 CET 2024


Hello

We are observing the following issue with resync after a reboot.

After rebooting a secondary node (in a 2- or 3-node cluster), the
secondary successfully reconnects to the primary and reports UpToDate, but
when a verify is launched on the rebooted secondary node, it reports
out-of-sync blocks.

If an "invalidate --reset-bitmap=no" is issued for the resource on the secondary
node, the resulting resync completes quickly and the next verify succeeds with
no out-of-sync blocks.
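
For anyone trying to reproduce this, the verify / invalidate / re-verify sequence above can be sketched roughly as follows. This is only an illustration, not the script we run: the function names are made up for this example, the resource name "persist" is taken from our setup, and it assumes the out-of-sync counters and the Verify replication state appear in the output of "drbdsetup status --verbose --statistics".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, resource name is from this report.
RES="${RES:-persist}"

# Sum every "out-of-sync:N" field found on stdin (assumed status output format).
sum_out_of_sync() {
    tr -s ' \t' '\n' |
        awk -F: '$1 == "out-of-sync" { total += $2 } END { print total + 0 }'
}

# Current out-of-sync total for the resource, across all peers and volumes.
current_oos() {
    drbdsetup status "$RES" --verbose --statistics | sum_out_of_sync
}

# Start an online verify and poll until no peer device is in a Verify state.
# Assumption: a running verify shows up as replication:VerifyS or VerifyT.
run_verify() {
    drbdadm verify "$RES" || return 1
    sleep 5
    while drbdsetup status "$RES" --verbose | grep -q 'replication:Verify'; do
        sleep 5
    done
}

# Run on the rebooted secondary: verify, and if blocks are out of sync,
# resync only those blocks and verify again.
check_and_repair() {
    run_verify || return 1
    if [ "$(current_oos)" -gt 0 ]; then
        # Keep the bitmap that verify just populated instead of marking
        # the whole device out of sync.
        drbdadm invalidate --reset-bitmap=no "$RES" || return 1
        # Waiting for the resync to finish is omitted here for brevity.
        run_verify || return 1
    fi
    current_oos
}
```

Calling check_and_repair on the rebooted secondary prints the remaining out-of-sync total; in our case that is 0 after the invalidate.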

This was initially detected when we promoted a backup node and it came up with
disk corruption. We traced this to the reboot occurring before the promotion. 

 Versions

The attached logs are from version 9.2.12 of the driver on the 5.15.173 kernel,
but we have also observed this issue with the 9.2.4 driver on the 5.15.166 kernel.

We have not seen the problem with version 9.2.4 of the driver on the 5.15.151 kernel.


 Attachments

initsyncandverify_noreboot.txt - drbd logs from the system prior to the reboot,
including a verify before the reboot

verify_after_invalidate_no_reset.txt - drbd logs after the reboot, showing the
initial failed verify, then the invalidate, then a successful verify

dynamic.res - drbd conf file - note use of separate metadata disk - we also 


 Secondary Bring Up

Secondary nodes bring up the drbd "persist" resource as follows:

"""
  da up all || true
  da secondary persist || true
  da disconnect persist || true
  da -- --discard-my-data connect persist || true
"""
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dynamic.res
Type: application/octet-stream
Size: 1442 bytes
Desc: dynamic.res
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20241224/eda12e52/attachment-0001.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: initsyncandverify_noreboot.txt
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20241224/eda12e52/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: verify_after_invalidate_no_reset.txt
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20241224/eda12e52/attachment-0003.txt>

