[DRBD-user] Split brain problem.

Digimer linux at alteeve.com
Mon Dec 5 02:21:58 CET 2011


On 12/04/2011 04:15 PM, Ivan Pavlenko wrote:
>         handlers {
>                 pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
>         }

You need to configure DRBD to use fencing. The best way to do this when
using a Red Hat cluster is via Lon's "obliterate-peer.sh" script. You
can download a copy this way:

wget -c https://alteeve.com/files/an-cluster/sbin/obliterate-peer.sh \
     -O /sbin/obliterate-peer.sh
chmod a+x /sbin/obliterate-peer.sh

Then add this to your DRBD configuration:

handlers {
        fence-peer              "/sbin/obliterate-peer.sh";
}
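
As a rough sketch of how this fits together: in DRBD 8.3, the
fence-peer handler is only called when a fencing policy other than the
default "dont-care" is set in the disk section. The fragment below
assumes a resource named "r0" (a placeholder, use your own resource
name) and the "resource-and-stonith" policy. Treat it as an
illustration under those assumptions, not a drop-in config.

```
# ASSUMPTION: "r0" is a placeholder resource name.
resource r0 {
        disk {
                # Freeze I/O and call the fence-peer handler when the
                # peer is lost, rather than carrying on regardless.
                fencing         resource-and-stonith;
        }
        handlers {
                fence-peer      "/sbin/obliterate-peer.sh";
        }
}
```

The same two settings could go in the "common" section instead if every
resource should be fenced the same way.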

> Here are my answers to your questions:
> 
> 1) This is definitely a split brain, not a network problem. As I showed
> in my previous message, I can ping the members of the cluster and their
> firewalls are open. When I use telnet and a sniffer, I see the nodes try
> to establish a network connection, but they only send reject packets.

Indeed.

> Dec  2 10:04:00 infplsm018 <kern.alert> kernel: block drbd1: Split-Brain
> detected but unresolved, dropping connection!

You will need to manually recover from this split brain. See:

http://www.drbd.org/users-guide/s-resolve-split-brain.html
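
For reference, the manual recovery described in that guide looks
roughly like the following in DRBD 8.3 syntax. This is a minimal
sketch, assuming a resource named "r0" (a placeholder) and that you
have already decided which node's changes get thrown away. It prints
the commands by default instead of running them.

```shell
#!/bin/sh
# Sketch of manual split-brain recovery (DRBD 8.3 command syntax).
# ASSUMPTION: the resource is called "r0"; substitute your own name.
RES="r0"

# Dry run by default: commands are echoed, not executed.
# Export DRBDADM=drbdadm before running to execute them for real.
DRBDADM="${DRBDADM:-echo drbdadm}"

# Run on the node whose changes will be discarded (the "victim").
victim_steps() {
    $DRBDADM secondary "$RES"
    $DRBDADM -- --discard-my-data connect "$RES"
}

# Run on the node whose data is kept (the "survivor"),
# and only if it is sitting in StandAlone.
survivor_steps() {
    $DRBDADM connect "$RES"
}
```

Call victim_steps on the node being discarded first, then
survivor_steps on the other node if it has not reconnected by itself.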

> 3) And here is my /etc/cluster/cluster.conf file
> 
> <fencedevice agent="fence_null" name="nullfence"/>
> <fencedevice agent="fence_manual" name="manfence"/>

These are not effective or supported. You need to use real fence
devices. This is especially true when using shared storage in a cluster.
What caused your split-brain in this case is largely irrelevant until
you have proper fencing.

Once you have this set up, tested and working, the next time DRBD would
have split-brained, it will fence the peer instead. At that point, you
will need to sort out what is breaking your cluster. That is another
thread, though.

-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron


