On 12/04/2011 04:15 PM, Ivan Pavlenko wrote:
> handlers {
> pri-on-incon-degr
> "/usr/lib/drbd/notify-pri-on-incon-degr.sh;
> /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ;
> reboot -f";
> pri-lost-after-sb
> "/usr/lib/drbd/notify-pri-lost-after-sb.sh;
> /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ;
> reboot -f";
> local-io-error "/usr/lib/drbd/notify-io-error.sh;
> /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger
> ; halt -f";
> }

You need to configure DRBD to use fencing. The best way to do this when
using a Red Hat cluster is via Lon's "obliterate-peer.sh" script. You can
download a copy this way;

wget -c https://alteeve.com/files/an-cluster/sbin/obliterate-peer.sh -O /sbin/obliterate-peer.sh
chmod a+x /sbin/obliterate-peer.sh

Then add this;

handlers {
        fence-peer "/sbin/obliterate-peer.sh";
}

> Here my answers on your questions:
>
> 1) There is definitely split brain not a network problem. I demonstrated
> at my previous message I can ping members of the cluster and they have
> open firewall. When I use telnet and sniffer I see nodes try to estimate
> network connection, but they send reject pockets only.

Indeed.

> Dec 2 10:04:00 infplsm018 <kern.alert> kernel: block drbd1: Split-Brain
> detected but unresolved, dropping connection!

You will need to manually recover from this split brain. See;

http://www.drbd.org/users-guide/s-resolve-split-brain.html

> 3) And here my /etc/cluster/cluster.conf file
>
> <fencedevice agent="fence_null" name="nullfence"/>
> <fencedevice agent="fence_manual" name="manfence"/>

These are not effective or supported. You need to use real fence devices.
This is exceedingly so when using shared storage in a cluster.

What caused your split-brain in this case is largely meaningless without
proper fencing. Once you have this set up, tested and working, then the
next time DRBD would have split-brain'ed, it'll instead fence. At that
point, you then need to sort out what is breaking your cluster. That is
another thread though.

--
Digimer
E-Mail: digimer at alteeve.com
Freenode handle: digimer
Papers and Projects: http://alteeve.com
Node Assassin: http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron
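
For the manual split-brain recovery mentioned above, here is a rough sketch of the usual steps as described in the DRBD 8.3 User's Guide page linked in the reply. The resource name "r0", the DRY_RUN switch and the function names are assumptions made for illustration and do not come from this thread. The option placement shown is the 8.3 form; on DRBD 8.4 it should be "drbdadm connect --discard-my-data <resource>" instead, so check the guide for the installed version. By default the sketch only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch of manual DRBD split-brain recovery (8.3-style syntax).
# RES and DRY_RUN are illustrative assumptions, not from the thread.

RES="${RES:-r0}"          # hypothetical resource name; use your own
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run on the node whose changes will be DISCARDED (the split-brain victim).
recover_victim() {
    run drbdadm secondary "$RES"
    run drbdadm -- --discard-my-data connect "$RES"
}

# Run on the node whose data is KEPT, only if it is in StandAlone state.
recover_survivor() {
    run drbdadm connect "$RES"
}
```

To use it, source the file on each node and call recover_victim on the node that should lose its changes, then recover_survivor on the other one. Picking the victim is a decision about which data to throw away, so the sketch leaves that choice to the admin on purpose.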