Issue with Both Diskful Nodes Being Outdated in DRBD9
范锐
buaafanrui at qq.com
Tue Dec 3 14:20:22 CET 2024
Hello! I am currently working with a three-node DRBD9 setup, where A and B are diskful nodes and C is a diskless node. In a certain scenario, I have observed that both diskful nodes (A and B) can end up in the "outdated" state. Below are the steps to reproduce this issue:
1. The primary node is initially on B.
2. The connection between B and A is severed (using iptables in my experiment), leaving A in the "outdated" state. Some data is then written to B.
3. The connection between B and A is restored, making A "Inconsistent" and syncing the latest data from B.
4. During the sync process, B is demoted from primary, and A is promoted to primary.
5. While the sync process is ongoing, the connection between A and B is severed again, leaving B in the "outdated" state.
6. The connection is restored, and the synchronization process completes. However, both A and B are now in the "outdated" state and remain so even after a restart.
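For readers who want to follow the sequence at a glance, the steps above can be written down as a small Python table and checked mechanically. This is only a bookkeeping sketch of the states reported in this post, not a model of DRBD's internals. The names Step, TIMELINE and both helper functions are made up for illustration, and any state marked "assumed" is not stated explicitly above.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    number: int    # step number from the list above
    action: str    # what was done in that step
    primary: str   # node holding the primary role after the step
    disk_a: str    # disk state of node A after the step
    disk_b: str    # disk state of node B after the step


# States as reported in the steps above; "assumed" marks values that the
# report does not state explicitly and that are filled in for illustration.
TIMELINE = [
    Step(1, "B is primary", "B", "UpToDate", "UpToDate"),               # both disks assumed
    Step(2, "A-B link cut, data written to B", "B", "Outdated", "UpToDate"),  # B assumed
    Step(3, "link restored, A syncs from B", "B", "Inconsistent", "UpToDate"),  # B assumed
    Step(4, "B demoted, A promoted during sync", "A", "Inconsistent", "UpToDate"),  # B assumed
    Step(5, "link cut again during sync", "A", "Inconsistent", "Outdated"),  # A assumed
    Step(6, "link restored, sync completes", "A", "Outdated", "Outdated"),  # primary assumed
]


def first_step_without_uptodate(timeline: List[Step]) -> Optional[int]:
    """Return the first step after which no diskful node is UpToDate."""
    for step in timeline:
        if "UpToDate" not in (step.disk_a, step.disk_b):
            return step.number
    return None


def steps_with_inconsistent_primary(timeline: List[Step]) -> List[int]:
    """Return the steps after which the primary node's own disk is Inconsistent."""
    hits = []
    for step in timeline:
        disk = step.disk_a if step.primary == "A" else step.disk_b
        if disk == "Inconsistent":
            hits.append(step.number)
    return hits


if __name__ == "__main__":
    print("first step with no UpToDate diskful node:",
          first_step_without_uptodate(TIMELINE))
    print("steps with an Inconsistent primary:",
          steps_with_inconsistent_primary(TIMELINE))
```

Under these assumptions, the table suggests that the last UpToDate copy is lost as soon as the link is cut in step 5, while the primary role already sits on an Inconsistent disk from step 4 onward.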
I am using DRBD 9.2.8 and have reproduced this issue multiple times with the same result. After analyzing the behavior, I believe the root cause is that DRBD allows an "Inconsistent" node to be promoted to primary, provided it has a stable connection to an "UpToDate" node. However, this can lead to the following issue:
When an "Inconsistent" node (while syncing) becomes primary and then its connection to another node is severed, the other "UpToDate" node becomes "outdated." Once the connection is restored and synchronization completes, both nodes end up in the "outdated" state.
I have the following questions:
1. Is it possible to configure DRBD to disallow the promotion of an "Inconsistent" node to primary? This would help avoid this issue.
2. If both diskful nodes are in the "outdated" state, is it guaranteed that their data is consistent? If so, would it be safe to use the --force option to promote one of the nodes to primary and resolve the situation?
3. Can nodes in the "Inconsistent" or "Outdated" state participate in voting? Based on my understanding of distributed systems like etcd, unhealthy nodes are not allowed to vote or become leaders.
I would greatly appreciate your guidance on these issues. Thank you in advance for your time and support, and I look forward to your reply.
Best regards,
Rui