<span class="gmail_quote">2007/6/11, N.J. van der Horn (Nico) <<a href="mailto:nico@vanderhorn.nl">nico@vanderhorn.nl</a>>:</span><blockquote class="gmail_quote" style="margin-top: 0; margin-right: 0; margin-bottom: 0; margin-left: 0; margin-left: 0.80ex; border-left-color: #cccccc; border-left-width: 1px; border-left-style: solid; padding-left: 1ex">
Hallo DRBD-meisters und lovers !<br><br>As far as i am aware of, i never had any real<br>problems using DRBD, but that changed a couple of days ago.<br>Both nodes suddenly have status "StandAlone" and<br>"messages" shows that i am blessed with "Split-Brain".
<br><br>I suspect I forgot to change the state<br>of node foc1 to Secondary before starting Heartbeat.<br>There is no other clue coming up into my mind to<br>explain what caused this situation.... grinzz<br><br>On both nodes fsck is happy, even with "fsck -n"
<br>(readonly) on the physical device after stopping DRBD.<br>I can mount (did that 1-at-a-time) both sides and<br>my data looks about the same (no real comparison made).<br><br>The cluster is a test-setup in my lab, the data
<br>has no real value, but i like to understand what's wrong.<br><br>Thanks in advance for your valued answers.<br><br>Nico van der Horn<br><br><br>Questions:<br>----------<br>1. how can i determine the real cause of the split-brain ?
</blockquote><div><br>A Split-Brain is usually caused by a loss of communication between the two nodes: the Secondary node is promoted to Primary while the other node stays in the Primary state (each node thinks its peer is dead, so both end up Primary).
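<br><br>Since the trigger is usually visible in the logs shortly before the event, here is a minimal sketch for pulling out the lines that precede each Split-Brain message. The function name is made up for illustration, and the default log path /var/log/messages and the search string "split-brain" are assumptions based on what you reported seeing in "messages"; adjust them for your distribution.

```shell
# Hypothetical helper: show the log lines leading up to each split-brain message.
# $1: log file to search (assumed default: /var/log/messages)
# $2: number of lines of context to print before each match (default: 20)
split_brain_context() {
    log_file="${1:-/var/log/messages}"
    before="${2:-20}"
    # -i: ignore case, -B N: also print the N lines before each matching line
    grep -i -B "$before" 'split-brain' "$log_file"
}
```

Run it on both nodes (for example "split_brain_context /var/log/messages 50") and compare the timestamps of whatever shows up just before the match.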
<br><br>Check your syslog to see what happened before the Split-Brain occurred (eth0 link down, etc.)</div><br><blockquote class="gmail_quote" style="margin-top: 0; margin-right: 0; margin-bottom: 0; margin-left: 0; margin-left: 0.80ex; border-left-color: #cccccc; border-left-width: 1px; border-left-style: solid; padding-left: 1ex">
2. how to correct the situation ?</blockquote><div><br>You have to decide which node you want to "sacrifice", and tell it to discard its data.<br><br>Run the following command on the node whose data you want to discard:
<br><br> root@bad-data# drbdadm -- --discard-my-data connect all<br><br>Then simply connect the other machine:<br><br> root@good-data# drbdadm connect all</div><br>The nodes will then start to resynchronize, transferring data from the good-data node (which will appear as SyncSource in /proc/drbd) to the bad-data node (SyncTarget).
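<br><br>To check which role each node took in the resync, you can read the connection state from /proc/drbd. The following is only a sketch: the function name is made up, and it assumes that each device line in /proc/drbd starts with the device number followed by a "cs:" field, so check it against the output on your DRBD version.

```shell
# Hypothetical helper: print "<device> <connection state>" for each DRBD device.
# $1: status file to read (default: /proc/drbd; pass another file to try the parsing)
drbd_conn_states() {
    status_file="${1:-/proc/drbd}"
    # Assumed device line layout: " 0: cs:SyncTarget st:Secondary/Primary ..."
    # Keep only the device number and the value of the cs: field.
    sed -n 's/^ *\([0-9][0-9]*\): cs:\([A-Za-z]*\).*/\1 \2/p' "$status_file"
}
```

On the bad-data node this should report SyncTarget while the resync is running, and SyncSource on the good-data node. If a node still reports StandAlone, it has not reconnected yet.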
<br><br>Regards,<br>Jérôme Augé<br>