Hi there,

I'm currently running a 2-node cluster on Ubuntu 8.04 with DRBD. We bought faster hardware and installed the new nodes with Ubuntu 10.04.

The two old nodes are currently under heavy load, so we'd like to keep the downtime as short as possible. The last time we synced the data with rsync (~700 GB with many small files), even a differential rsync took about two hours, because rsync took very long to build its indexes. Connecting one new node directly to the primary is also very expensive, since a full resync drags the performance down more than we can afford if we want to keep the services up and running.

So our plan was:

1) Disconnect the (old) secondary live node.
2) Connect this node to one of the new ones and wait until the main resync is done.
3) Connect the synced new node to the primary, so that only the changed data is re-synced, and wait until this is done.
4) Promote the new node to primary.
5) Disconnect the former primary.
6) Connect the second new node to the now disconnected old primary and again wait until they have done the initial sync.
7) Connect the second new node to the already live new node and be done.

This sounds quite complicated, but to us it seems to be the best way to keep the service up as much as possible.

The plan works fine up to step 2). The former secondary syncs everything to the new node, but when we try to connect the new node to the old primary, it doesn't work. The logs complain about:

http://pastebin.com/AKthcx7b

So, a classic split brain :P

Can somebody tell me:

1) Do you think that plan is ok?
2) Why does a split brain happen here?
3) How to do it right then (without having to get EVERYTHING from the live node)?

Thanks in advance!

Best regards,
Marc
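
To make the connection order in the seven-step plan easier to follow, here is a minimal Python sketch that replays it as a list of link changes. It only models which node is paired with which. It does not talk to DRBD and does not explain the split brain. The node labels (old_primary, old_secondary, new_a, new_b) and the function name replay are hypothetical, the two "implied" disconnects are an assumption about what the plan needs between steps, and the one-peer-per-node rule is an assumption about a plain two-node DRBD resource.

```python
# Hypothetical labels for the four machines; not taken from the real setup.
PLAN = [
    ("disconnect", "old_primary", "old_secondary"),  # step 1
    ("connect", "old_secondary", "new_a"),           # step 2 (full resync)
    ("disconnect", "old_secondary", "new_a"),        # implied before step 3 (assumption)
    ("connect", "old_primary", "new_a"),             # step 3 (only changed data expected)
    # step 4: promote new_a to primary; a role change, no link change
    ("disconnect", "old_primary", "new_a"),          # step 5
    ("connect", "old_primary", "new_b"),             # step 6 (full resync)
    ("disconnect", "old_primary", "new_b"),          # implied before step 7 (assumption)
    ("connect", "new_a", "new_b"),                   # step 7
]


def replay(plan, initial_links):
    """Replay link changes and return the set of links after every action.

    Assumption: each node may have at most one peer at any time.
    Raises ValueError if an action breaks that rule or removes a missing link.
    """
    links = {frozenset(link) for link in initial_links}
    history = []
    for action, a, b in plan:
        pair = frozenset((a, b))
        if action == "connect":
            for node in (a, b):
                if any(node in link for link in links):
                    raise ValueError(f"{node} already has a peer")
            links.add(pair)
        elif action == "disconnect":
            if pair not in links:
                raise ValueError(f"{a} and {b} are not connected")
            links.remove(pair)
        else:
            raise ValueError(f"unknown action: {action}")
        history.append(sorted(tuple(sorted(link)) for link in links))
    return history


if __name__ == "__main__":
    steps = replay(PLAN, [("old_primary", "old_secondary")])
    for (action, a, b), state in zip(PLAN, steps):
        print(f"{action:10s} {a} <-> {b}: links now {state}")
```

Replaying the plan this way shows that the old primary runs without any peer between step 1 and step 3, and that the services have no redundant copy until the step 3 resync finishes.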