On Mon, Aug 29, 2005 at 02:17:13PM -0400, Musard, Kris wrote:
> I recently experienced some data loss with drbd. I had a resource
> "r0" which lost its connection about a month ago and went unnoticed.
> This past weekend something caused both machines to reboot. They are
> both running heartbeat. The machine with the older data started
> heartbeat first and became primary. A sync occurred causing the older
> data to be copied to the other node. In order to prevent this from
> happening in the future I have put a script in place to monitor the
> status of drbd and notify me when resources are not connected. I also
> set the "on-disconnect reconnect" parameter for all of my resources.
> My question is what would have caused the older data to look newer to
> drbd and cause the incorrect re-sync, and what additional steps can be
> taken to prevent this from happening in the future?

At least on Debian GNU/Linux (I have not tried DRBD with the other major
distributions, although this is probably a common trait), the DRBD
initialization script waits for its partner to come online, or a manual
administrative override at the console to bypass this wait, before it
proceeds. This process is also blocking, which means Heartbeat and the
other processes that are started after DRBD won't start until DRBD is
done starting.

This ensures that both machines are able to talk and agree on who has
"newer" data, and how they should synchronize before other things like
Heartbeat start and declare one or the other as primary.

AFAIK, as soon as both DRBD nodes have begun talking and know who has
what, it shouldn't matter who gets flagged as primary and gets mounted
somewhere.

 --> Jijo

-- 
Federico Sevilla III : jijo.free.net.ph : When we speak of free software
GNU/Linux Specialist : GnuPG 0x93B746BE : we refer to freedom, not price.
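
For reference, the kind of monitoring check mentioned in the question could look roughly like the sketch below. It is only an illustration, not the poster's actual script: it assumes the status text follows the /proc/drbd layout, where each device line carries a "cs:" (connection state) field, and the function name and the sample text are made up for the example. It reports every device whose state is anything other than "Connected", which also includes resync states.

```python
import re

# A device line is assumed to look like:
#   " 0: cs:Connected st:Primary/Secondary ld:Consistent"
# Only the minor number and the cs: field are used here.
_STATUS_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def disconnected_resources(proc_drbd_text):
    """Return (minor, connection_state) pairs for devices not in state 'Connected'."""
    problems = []
    for line in proc_drbd_text.splitlines():
        match = _STATUS_LINE.match(line)
        if match and match.group(2) != "Connected":
            problems.append((int(match.group(1)), match.group(2)))
    return problems


# Illustrative sample text, NOT captured from a real system.
SAMPLE = """\
version: (illustrative sample)

 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
 1: cs:WFConnection st:Primary/Unknown ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
"""

if __name__ == "__main__":
    for minor, state in disconnected_resources(SAMPLE):
        print("drbd%d is not connected (cs:%s)" % (minor, state))
```

In real use the text would be read from /proc/drbd on each node, for example from cron, and any non-empty result mailed to the administrator.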