I have a two-node cluster on which I have to upgrade the OS. To test my upgrade procedure, I set up a quick test environment on virtual machines. The nodes run Fedora 8, drbd-8.0.8, drbdlinks, and heartbeat 2.1.2. I will attempt to upgrade the nodes to RH5, drbd-8.0.14, and heartbeat-2.1.4, one system at a time.

I had both machines running and everything seemed OK. At this point Node A was primary and was running all services. I tried to stop heartbeat on Node A so that Node B could take over everything. When I did this, Node A spontaneously rebooted. At the same time Node B took over all resources and seemed to be functioning properly, but the drbd status showed a StandAlone connection state on Node B. When Node A came back up, it was in the WFConnection state and in Secondary mode. This seemed to indicate a split-brain situation as described in the manual, although split brain was not mentioned in the logs.

So I followed the steps outlined in "Manual split brain recovery".

On Node A:
> drbdadm secondary
> drbdadm -- --discard-my-data connect all

On Node B:
> drbdadm connect all

The logs on Node B kept saying:
kernel: drbd0: I shall become SyncTarget, but I am primary!

I could never get the nodes to reconnect. I finally shut down both nodes and brought up Node A first and then Node B. Everything seemed to connect fine and it is syncing.

So I was curious whether there was something else I could have done instead of the reboot, in case this ever happens on a production system. The good thing is that it seems no data was lost.