Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
If you can use a dedicated, ideally direct, network connection for DRBD rather than the interface that carries all your other traffic, that would increase the stability of your solution. Configure two rings in the pacemaker cluster (the rings are defined in the corosync configuration) and set the primary ring to use the direct link.

If the normal network has problems, there is a good chance that you cannot reach the stonith device either, so relying on stonith alone is a bad idea. Stonith can also have unwanted side effects if the fenced node is powered back on while the former master is down as well (a cascading power failure or the like).

You want to make sure the resource is fenced on both nodes before the node is fenced. Otherwise the fenced node has no way of knowing that its data is outdated. I have not checked lately how this is implemented if the policy is set as suggested. So be sure to test that the former slave cannot become master if the former master has failed in the meantime and no sync has taken place.

Best regards,

Robert Köppl
Customer Support & Projects, Teamleader IT Support
KNAPP Systemintegration GmbH
www.KNAPP.com

From: Digimer <lists at alteeve.ca>
To: violeta mateiu <mateiu.violeta at gmail.com>, drbd-user at lists.linbit.com
Date: 09.04.2015 14:32
Subject: Re: [DRBD-user] Drbd split brain after network failure
Sent by: drbd-user-bounces at lists.linbit.com

Fencing would prevent this. Configure (and test!) stonith in pacemaker, then hook DRBD into it using the crm-fence-peer.sh and crm-unfence-peer.sh handlers. You also need to set the fencing policy to 'resource-and-stonith;'. This way, instead of assuming the other peer is dead, it will fence and be sure.

On 09/04/15 06:04 AM, violeta mateiu wrote:
> Hello,
>
> I have configured a two-node Active/Passive (host1, host2) cluster on
> Fedora 20 with pacemaker, corosync and drbd.
>
> Every time I unplug a network cable from the master host, DRBD goes
> into a split-brain state. After I reconnect, DRBD reports StandAlone
> on one host and WFConnection on the other host.
>
> I can't fix this issue. Even though I declared automatic split-brain
> policies in the DRBD configuration file, my DRBD never recovers after a
> network failure.
>
> My resource configuration is as follows:
>
> Resources:
>  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=192.168.100.94 cidr_netmask=32
>   Operations: monitor interval=30s (ClusterIP-monitor-interval-30s)
>  Resource: WebSite (class=ocf provider=heartbeat type=apache)
>   Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
>   Operations: monitor interval=1min (WebSite-monitor-interval-1min)
>  Master: WebDataClone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>   Resource: WebData (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=www
>    Operations: monitor interval=60s (WebData-monitor-interval-60s)
>  Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd/by-res/www directory=/var/www/html fstype=ext4
>  Master: SQLDataClone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>   Resource: SQLData (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=sql
>    Operations: monitor interval=60s (SQLData-monitor-interval-60s)
>  Resource: SQLFS (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd4 directory=/var/lib/mysql/data fstype=ext4
>  Resource: appServer (class=ocf provider=heartbeat type=anything)
>   Attributes: binfile=/home/myApp/myAppV2_17 workdir=/home/myApp/ logfile=/home/myApp/logFile.log errlogfile=/home/myApp/errlogFile.log cmdline_options=/home/myApp/config.cfg
>   Operations: monitor interval=120s (appServer-monitor-interval-120s)
>  Resource: MySQL (class=ocf provider=heartbeat type=mysql)
>   Attributes: binary=/usr/sbin/mysqld config=/var/lib/mysql/my.cnf datadir=/var/lib/mysql/data pid=/var/run/mysqld/mysqld.pid socket=/var/lib/mysql/mysql.sock
>   Operations: monitor interval=60s (MySQL-monitor-interval-60s)
>
> I have two DRBD partitions configured, one for the MySQL data files and
> one for the Apache files.
>
> The DRBD configuration is as follows:
>
> global {
>     usage-count yes;
> }
> common {
>     protocol C;
> }
> resource sql {
>     meta-disk internal;
>     device /dev/drbd4;
>
>     syncer {
>         verify-alg sha1;
>     }
>
>     net {
>         allow-two-primaries;
>
>         after-sb-0pri discard-zero-changes;
>         after-sb-1pri discard-secondary;
>
>         after-sb-2pri disconnect;
>     }
>     on host1 {
>         disk /dev/SQL/TestSQL;
>         address 192.168.100.92:7789;
>     }
>     on host2 {
>         disk /dev/SQL/TestSQL;
>         address 192.168.100.93:7789;
>     }
> }
> resource www {
>     meta-disk internal;
>     device /dev/drbd3;
>     syncer {
>         verify-alg sha1;
>     }
>
>     net {
>         allow-two-primaries;
>
>         after-sb-0pri discard-zero-changes;
>         after-sb-1pri discard-secondary;
>
>         after-sb-2pri disconnect;
>     }
>     on host1 {
>         disk /dev/WEB/TestWEB;
>         address 192.168.100.92:7799;
>     }
>     on host2 {
>         disk /dev/WEB/TestWEB;
>         address 192.168.100.93:7799;
>     }
> }
>
> Thank you,
> Violeta
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?
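
For illustration, the fencing hook-up Digimer describes could look roughly like the fragment below for the "sql" resource from the original post. This is only a sketch under assumptions: it uses the DRBD 8.x layout, where the fencing policy is set in the disk section (DRBD 9 moved this option to the net section), and it assumes the handler scripts are installed under /usr/lib/drbd/, which can differ between distributions. The remaining settings are meant to stay as posted, and stonith still has to be configured and tested in pacemaker first.

```conf
resource sql {
    disk {
        # Assumption: DRBD 8.x syntax; in DRBD 9 this option lives in "net".
        fencing resource-and-stonith;
    }
    handlers {
        # Assumed install path; check where your distribution places these scripts.
        # Called when the peer is lost: adds a pacemaker constraint that
        # keeps the outdated peer from being promoted.
        fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
        # Called on the sync target after resync finishes: removes that constraint.
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    # meta-disk, device, syncer, net and the "on host1"/"on host2"
    # sections stay as in the original post.
}
```

The same disk and handlers sections would be needed for the "www" resource, or they could be placed once in the common section so that both resources inherit them.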