Hi all, Digimer,

Thank you again for your answer, I really appreciate it!

Unfortunately, I've tried to fix the split brain manually several times. It doesn't work.

# drbdadm disconnect r0
[root@infplsm017 ~]# drbdadm secondary r0
1: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 1 secondary' terminated with exit code 11
# drbdadm -- --discard-my-data connect r0
1: Failure: (123) --discard-my-data not allowed when primary.
Command 'drbdsetup 1 net 10.10.24.10:7789 10.10.24.11:7789 C --set-defaults --create-device --ping-timeout=20 --after-sb-2pri=disconnect --after-sb-1pri=discard-secondary --after-sb-0pri=discard-zero-changes --allow-two-primaries --discard-my-data' terminated with exit code 10
#

I guess I need to stop the cluster daemons, don't I?

Thank you again,
Ivan

On 12/05/2011 12:21 PM, Digimer wrote:
> On 12/04/2011 04:15 PM, Ivan Pavlenko wrote:
>> handlers {
>>     pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b> /proc/sysrq-trigger ; reboot -f";
>>     pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b> /proc/sysrq-trigger ; reboot -f";
>>     local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o> /proc/sysrq-trigger ; halt -f";
>> }
>
> You need to configure DRBD to use fencing. The best way to do this when
> using a Red Hat cluster is via Lon's "obliterate-peer.sh" script. You
> can download a copy this way:
>
> wget -c https://alteeve.com/files/an-cluster/sbin/obliterate-peer.sh -O /sbin/obliterate-peer.sh
> chmod a+x /sbin/obliterate-peer.sh
>
> Then add this:
>
> handlers {
>     fence-peer "/sbin/obliterate-peer.sh";
> }
>
>> Here are my answers to your questions:
>>
>> 1) This is definitely a split brain, not a network problem. I demonstrated
>> in my previous message that I can ping the members of the cluster and that
>> their firewalls are open. When I use telnet and a sniffer, I see the nodes
>> try to establish a network connection, but they send only reject packets.
>
> Indeed.
>
>> Dec 2 10:04:00 infplsm018<kern.alert> kernel: block drbd1: Split-Brain detected but unresolved, dropping connection!
>
> You will need to manually recover from this split brain. See:
>
> http://www.drbd.org/users-guide/s-resolve-split-brain.html
>
>> 3) And here is my /etc/cluster/cluster.conf file:
>>
>> <fencedevice agent="fence_null" name="nullfence"/>
>> <fencedevice agent="fence_manual" name="manfence"/>
>
> These are not effective or supported. You need to use real fence
> devices. This is especially true when using shared storage in a cluster.
> What caused your split-brain in this case is largely meaningless without
> proper fencing.
>
> Once you have this set up, tested and working, then the next time DRBD
> would have split-brained, it will fence instead. At that point, you
> need to sort out what is breaking your cluster. That is another thread
> though.
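
For reference, the two errors in the output are linked: the node cannot be demoted while something holds the device open, and --discard-my-data is refused as long as the node is still primary. The order of operations from the user's guide linked in the quoted reply can be sketched roughly as the script below. This is only a sketch under assumptions, not a tested procedure. The resource name r0 and minor number 1 come from the output in this message; the device path /dev/drbd1, the service list in HOLDERS, and the helper names run, recover_victim and recover_survivor are hypothetical and need adjusting to the actual cluster. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: manual DRBD split-brain recovery, dry run by default.
# Assumed, not taken from the thread: the DEV path and the HOLDERS service list.
RES="${RES:-r0}"                             # resource name from the output above
DEV="${DEV:-/dev/drbd1}"                     # assumed device path for minor 1
HOLDERS="${HOLDERS:-rgmanager gfs2 clvmd}"   # hypothetical services holding DEV open
DRY_RUN="${DRY_RUN:-1}"                      # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the node whose changes will be thrown away.
recover_victim() {
    # Show what is holding the device open (informational, may find nothing).
    run fuser -vm "$DEV" || true
    # Release the device first; demotion fails while it is held open.
    for svc in $HOLDERS; do
        run service "$svc" stop || return 1
    done
    # Ignore a failure here in case the resource is already disconnected.
    run drbdadm disconnect "$RES" || true
    # The node must be secondary before its data can be discarded.
    run drbdadm secondary "$RES" || return 1
    run drbdadm -- --discard-my-data connect "$RES"
}

# Run on the node whose data is kept, if it has also dropped the connection.
recover_survivor() {
    run drbdadm connect "$RES"
}

case "${1:-}" in
    victim)   recover_victim ;;
    survivor) recover_survivor ;;
    *)        echo "usage: $0 victim|survivor (set DRY_RUN=0 to execute)" ;;
esac
```

Running it as "sh recover.sh victim" (the file name is arbitrary) on the node to be overwritten prints the planned commands; nothing is executed unless DRY_RUN=0 is set. None of this replaces real fencing, which is still needed to keep the split brain from recurring.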