Christophe Zwecker wrote:
> node1 is primary with mounted fs
> node2 is secondary
>
> nod1 goes down (only network failure),

"Only" network failure? Which network? In many cases, a network failure
alone is worse than one box completely failing, because it can cause
"split brain" if you're not careful.

What connections do you have for Heartbeat to use? (A serial heartbeat
is always a good idea if you can have it.) As many redundant paths as
possible is good. (Typical might be 3: replication (crossover) network
between the DRBD machines, "normal" network and serial heartbeat.)

> heartbeat unmounts the drbd fs on
> node1. node 2 takes over and mounts the drbd volume.

And what happens to node1 here? Are you sure that Heartbeat stops the
DRBD services? My guess is that you have a single network connection
for both DRBD and Heartbeat, in which case DRBD will still be primary
on node1.

> node1 comes backup, mounts drbd volume and the change aint there because:
> Sep 15 13:47:03 mw-test-n2 kernel: drbd0: Current Primary shall become
> sync TARGET! Aborting to prevent data corruption.

DRBD is doing the right thing here. Either your nodes weren't really
synchronised before the failure, or you had a split brain where DRBD
was primary on both machines.

This situation can only be resolved manually, i.e. by a human telling
DRBD which machine has the latest data. (Something like
"drbdadm XXX invalidate_remote --do-what-I-say" on the "good" machine.)

Tim
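
P.S. To make the reasoning above concrete, here is a toy model in Python of the decision two nodes face when they reconnect after a network-only failure. This is only a sketch of the idea, not DRBD's actual algorithm (which compares its own metadata): the `NodeState` class, the `wrote_while_split` flag and the `reconnect_decision` function are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of one node at the moment the link returns."""
    name: str
    is_primary: bool          # role held when the nodes reconnect
    wrote_while_split: bool   # accepted writes while the peer was unreachable


def reconnect_decision(a: NodeState, b: NodeState) -> str:
    """Toy decision rule; an assumption for illustration, not DRBD's logic."""
    # Both sides changed data independently: no automatic winner exists.
    if a.wrote_while_split and b.wrote_while_split:
        return "split brain: manual resolution needed"
    # Neither side changed anything: the copies are still identical.
    if not a.wrote_while_split and not b.wrote_while_split:
        return "in sync: nothing to do"
    # Exactly one side has newer data; it would be the sync source.
    source, target = (a, b) if a.wrote_while_split else (b, a)
    # Overwriting a node that is still primary could corrupt a mounted
    # filesystem, so the only safe choice is to refuse.
    if target.is_primary:
        return f"abort: primary {target.name} would become sync target"
    return f"resync: {source.name} -> {target.name}"


# node1 lost its network but stayed primary; node2 took over and wrote.
print(reconnect_decision(
    NodeState("node1", is_primary=True, wrote_while_split=False),
    NodeState("node2", is_primary=True, wrote_while_split=True),
))
```

The "abort" branch corresponds in spirit to the "Current Primary shall become sync TARGET" message quoted above: when the side that would be overwritten is still primary, refusing is the only safe answer, and a human has to say which copy is good.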