[DRBD-user] Failing to migrate two DRBD nodes

Marc Richter drbd at zoosau.de
Wed Jan 12 13:26:08 CET 2011



Hi there.

I'm still failing to replace an HA node and hope someone can help me.
I'm trying the following:
I have two nodes which serve as an HA NAS and are connected by DRBD. We
have bought new hardware and installed a new version of the Linux
distribution on these new devices. Since the complete initial sync
takes a long time and makes the devices nearly unusable for 5 hours, I'm
planning to do the following:

1) Remove the secondary node from the cluster by issuing "drbdadm
disconnect r0".

2) Connect the first new node to this removed node and let the
initial resync run without affecting the live node.

3) When they have synced, disconnect these two nodes and reconnect the
new one to the still productive old one, so that the changes which
occurred during the hours the initial rebuild was running get synced
as well.

4) When they have synced, remove the old one from the cluster and have
the now synced new one serving all requests.

5) Connect the second old node to the second new one and again let
them do the initial sync.

6) After this is done, reconnect the second new node to the first.

7) Done, with only a very short downtime.

I previously set up 4 virtual machines (2 old ones, 2 new ones) which
used _exactly_ the same versions of the OS that the real nodes are
running, and copied the DRBD configuration exactly from the real
nodes (changing only the IPs).
This way I could simulate this plan and complete it successfully. All
went well so far.

Now I'm trying the same with the real nodes, but I'm stuck at step 3).
I've successfully synced the secondary old node with the first new one.
But when I try to connect this new node to the other (productive) old
node, I get the following in the logs: "Unrelated data, aborting".
I googled for this, but don't understand what is happening here:

I extracted the UUID from the secondary old node (SyncSource) by issuing
"drbdadm show-gi r0" and got
"1F583294AF81AF78:0000000000000000:F2997C7C2F263DF4:0F765342B3D081E2:1:1:0:0:0:0:0".
This is exactly the same string I get when I issue the same command on
the SyncTarget (first new node I connected to this old one).
When I issue this on the currently productive old one I'm trying to
connect to, I get
"C37BCA37822F0D6D:0F765342B3D081E3:2DD3D7AEBE3F7A75:8C5762D1E5ABE255:1:1:1:0:0:0"

As far as I understand all of this, DRBD refuses to connect the two
nodes because the UUIDs (the part
"1F583294AF81AF78:0000000000000000:F2997C7C2F263DF4:0F765342B3D081E2"
without the trailing state bits) differ (correct?).
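
To make the comparison concrete, here is a small Python sketch that splits
the two "show-gi" strings quoted above and lists the UUID values both nodes
have in common. The helper names (parse_gi, shared_uuids) are made up for
this illustration. Treating the first four colon-separated fields as the
UUID part follows the description above. Ignoring the lowest bit of each
UUID is an assumption about how DRBD compares them, not something this
thread confirms. The sketch only surfaces overlaps. It does not reproduce
DRBD's own decision about whether two data sets are related.

```python
def parse_gi(gi):
    """Split a show-gi string into (UUID values, trailing flag fields).

    Assumption: the first four colon-separated fields are the UUIDs and
    everything after them is state flags.
    """
    fields = gi.strip().strip('"').split(":")
    uuids = [int(field, 16) for field in fields[:4]]
    flags = fields[4:]
    return uuids, flags


def shared_uuids(gi_a, gi_b, ignore_low_bit=True):
    """Return the non-zero UUIDs of gi_a that also appear in gi_b.

    Assumption: with ignore_low_bit=True the lowest bit of every UUID is
    masked out before comparing, on the guess that it is a flag bit.
    """
    def norm(value):
        return value & ~1 if ignore_low_bit else value

    uuids_a, _ = parse_gi(gi_a)
    uuids_b, _ = parse_gi(gi_b)
    known_b = {norm(u) for u in uuids_b if u != 0}
    return ["%016X" % norm(u) for u in uuids_a if u != 0 and norm(u) in known_b]


# The two strings exactly as quoted in this message.
OLD_SECONDARY = "1F583294AF81AF78:0000000000000000:F2997C7C2F263DF4:0F765342B3D081E2:1:1:0:0:0:0:0"
OLD_PRIMARY = "C37BCA37822F0D6D:0F765342B3D081E3:2DD3D7AEBE3F7A75:8C5762D1E5ABE255:1:1:1:0:0:0"

if __name__ == "__main__":
    print("exact matches:      ", shared_uuids(OLD_SECONDARY, OLD_PRIMARY, ignore_low_bit=False))
    print("ignoring lowest bit:", shared_uuids(OLD_SECONDARY, OLD_PRIMARY))
```

Compared exactly, the two nodes share no UUID at all. With the lowest bit
masked, 0F765342B3D081E2 (fourth field on the old secondary) lines up with
0F765342B3D081E3 (second field on the productive node), so the two do seem
to have some common history, just not in the current UUID.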

But how can this be? The sync source I used was connected to the other
node, and the whole setup worked perfectly for a long time.
And what can I do now to connect these two nodes?
Can / should I change part of the UUID string on one of the nodes, or
something similar? And if so, how?

Thanks very much in advance.

Best regards,
Marc


