Verify consistently fails after rebooting secondary node
Tim Westbrook
Tim_Westbrook at selinc.com
Tue Dec 24 20:01:57 CET 2024
Hello
We are seeing a problem with resync after a reboot.
After a secondary node is rebooted (in a 2- or 3-node cluster), it
successfully reconnects to the primary and reports UpToDate. However,
when a verify is then launched on the rebooted secondary, it reports
out-of-sync blocks.
If an "invalidate --reset-bitmap=no" is then issued for the resource on that
secondary node, the resulting resync completes quickly and the next verify
succeeds with no out-of-sync blocks.
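For anyone trying to reproduce this, the check described above can be sketched roughly as follows. This is only a minimal sketch, not our exact tooling: the helper name oos_total is made up for illustration, and it assumes the status text contains out-of-sync:N fields such as those printed by "drbdsetup status --verbose --statistics". The DRBD commands are left as comments so the snippet can be sourced on a machine without DRBD.

```shell
#!/bin/sh
# Sum every "out-of-sync:N" field found on stdin.
# oos_total is a hypothetical helper name, not part of drbd-utils.
oos_total() {
    # Put each whitespace-separated field on its own line, then add up
    # the numeric value of every field whose key is exactly "out-of-sync".
    tr ' \t' '\n\n' | awk -F: '$1 == "out-of-sync" { sum += $2 } END { print sum + 0 }'
}

# Intended use on the rebooted secondary (assumed command forms):
#   drbdadm verify persist
#   ... wait for the verify to finish ...
#   drbdsetup status persist --verbose --statistics | oos_total
# A non-zero result corresponds to the failed verify; after issuing
# "invalidate --reset-bitmap=no" on the resource, the same check
# should print 0 following the next verify.
```

Summing over all fields keeps the sketch independent of how many peers the resource has, at the cost of not saying which peer is out of sync.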
This was initially detected when we promoted a backup node and it came up with
disk corruption. We traced the corruption to a reboot that occurred before the
promotion.
Versions
The attached logs are from version 9.2.12 of the driver on the 5.15.173 kernel,
but we have also observed this issue with the 9.2.4 driver on the 5.15.166 kernel.
We have not seen the problem with the 9.2.4 driver on the 5.15.151 kernel.
Attachments
initsyncandverify_noreboot.txt - drbd logs from the system prior to reboot, including
the verify before reboot
verify_after_invalidate_no_reset.txt - drbd logs after reboot, showing the initial failed
verify, then the invalidate, then the successful verify
dynamic.res - drbd conf file - note use of separate metadata disk - we also
Secondary Bring Up
Secondary nodes enable the drbd "persist" resource as follows:
"""
da up all || true
da secondary persist || true
da disconnect persist || true
da -- --discard-my-data connect persist || true
"""
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dynamic.res
Type: application/octet-stream
Size: 1442 bytes
Desc: dynamic.res
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20241224/eda12e52/attachment-0001.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: initsyncandverify_noreboot.txt
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20241224/eda12e52/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: verify_after_invalidate_no_reset.txt
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20241224/eda12e52/attachment-0003.txt>