[DRBD-user] Drbd split brain after network failure

Digimer lists at alteeve.ca
Thu Apr 9 14:32:40 CEST 2015


Fencing would prevent this. Configure (and test!) stonith in pacemaker,
then hook DRBD into it using the crm-fence-peer.sh and
crm-unfence-peer.sh handlers. You also need to set the fencing policy to
'resource-and-stonith'. This way, instead of assuming the other peer is
dead, DRBD will have it fenced and be sure.
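
As a rough sketch of the DRBD side, assuming DRBD 8.4 and that the
handler scripts are installed under /usr/lib/drbd/ (the path varies by
distribution, so check where your package puts them), the additions to
each resource would look something like this. Other DRBD versions may
place these options differently.

    resource sql {
     disk {
      # freeze I/O and call the fence-peer handler when the peer is lost
      fencing resource-and-stonith;
     }
     handlers {
      # assumed install path; adjust to match your system
      fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
      after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
     }
     # ... existing device, meta-disk, net and "on" sections unchanged ...
    }

The fence handler places a constraint in pacemaker that keeps the
outdated peer from being promoted, and the unfence handler removes it
once the resync has finished. After editing the file on both nodes,
'drbdadm adjust all' should apply the change.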

On 09/04/15 06:04 AM, violeta mateiu wrote:
> Hello,
> 
> I have configured a two-node Active/Passive (host1, host2) cluster on
> Fedora 20 with pacemaker, corosync, and drbd.
> 
> Every time I unplug a network cable from the master host, DRBD goes
> into a split-brain state. After I reconnect, DRBD reports StandAlone
> on one host and WFConnection on the other host.
> 
> I can't fix this issue. Even though I declared automatic split-brain
> recovery policies in the DRBD configuration file, DRBD never recovers
> after a network failure.
> 
> My resource configuration is as follows:
> 
> Resources:
>  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=192.168.100.94 cidr_netmask=32
>   Operations: monitor interval=30s (ClusterIP-monitor-interval-30s)
>  Resource: WebSite (class=ocf provider=heartbeat type=apache)
>   Attributes: configfile=/etc/httpd/conf/httpd.conf
> statusurl=http://localhost/server-status
>   Operations: monitor interval=1min (WebSite-monitor-interval-1min)
>  Master: WebDataClone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
> clone-node-max=1 notify=true
>   Resource: WebData (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=www
>    Operations: monitor interval=60s (WebData-monitor-interval-60s)
>  Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd/by-res/www directory=/var/www/html
> fstype=ext4
>  Master: SQLDataClone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
> clone-node-max=1 notify=true
>   Resource: SQLData (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=sql
>    Operations: monitor interval=60s (SQLData-monitor-interval-60s)
>  Resource: SQLFS (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd4 directory=/var/lib/mysql/data fstype=ext4
>  Resource: appServer (class=ocf provider=heartbeat type=anything)
>   Attributes: binfile=/home/myApp/myAppV2_17 workdir=/home/myApp/
> logfile=/home/myApp/logFile.log errlogfile=/home/myApp/errlogFile.log
> cmdline_options=/home/myApp/config.cfg
>   Operations: monitor interval=120s (appServer-monitor-interval-120s)
>  Resource: MySQL (class=ocf provider=heartbeat type=mysql)
>   Attributes: binary=/usr/sbin/mysqld config=/var/lib/mysql/my.cnf
> datadir=/var/lib/mysql/data pid=/var/run/mysqld/mysqld.pid
> socket=/var/lib/mysql/mysql.sock
>   Operations: monitor interval=60s (MySQL-monitor-interval-60s)
> 
> I have two DRBD partitions configured: one for MySQL data files and one
> for Apache files.
> 
> DRBD configuration is as follows:
> 
> global {
>  usage-count yes;
> }
> common {
>  protocol C;
> }
> resource sql {
>  meta-disk internal;
>  device /dev/drbd4;
>  
>  syncer {
>   verify-alg sha1;
>  }
> 
>  net {
>    allow-two-primaries;
> 
>     after-sb-0pri discard-zero-changes;
>     after-sb-1pri discard-secondary;
> 
>    after-sb-2pri disconnect;
>  }
>  on host1 {
>   disk /dev/SQL/TestSQL;
>   address 192.168.100.92:7789;
>  }
>  on host2 {
>   disk /dev/SQL/TestSQL;
>   address 192.168.100.93:7789;
>  }
> }
> resource www {
>  meta-disk internal;
>  device /dev/drbd3;
>  syncer {
>   verify-alg sha1;
>  }
>  
>  net {
>    allow-two-primaries;
> 
>  after-sb-0pri discard-zero-changes;
>  after-sb-1pri discard-secondary;
> 
>    after-sb-2pri disconnect;
>  }
>  on host1 {
>   disk /dev/WEB/TestWEB;
>   address 192.168.100.92:7799;
>  }
>  on host2 {
>   disk /dev/WEB/TestWEB;
>   address 192.168.100.93:7799;
>  }
> }
> 
> Thank you,
> Violeta


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?