[DRBD-user] Help my drbd + heartbeat problems

Bruno Medico - EvoluServices bruno.medico at evoluservices.com
Tue Sep 24 17:24:40 CEST 2013

Guys, I'm in big trouble here.

Heartbeat keeps moving the resources from one server to the other in a loop.

My configuration files follow.

root@servidor-01:~# cat /etc/drbd.conf
global {
        usage-count no;
}

resource r0 {

        protocol C;

        startup {
                wfc-timeout 30;
                degr-wfc-timeout 120;
        }

        disk {
                on-io-error detach;
        }

        syncer {
                rate 600M;
        }

        on servidor-01 {
                device          /dev/drbd0;
                disk            /dev/mapper/DRBD-APP;
                address         172.16.1.1:7788;
                meta-disk       internal;
        }

        on servidor-02 {
                device          /dev/drbd0;
                disk            /dev/mapper/DRBD-APP;
                address         172.16.1.2:7788;
                meta-disk       internal;
        }
}

resource r1 {

        protocol C;

        startup {
                wfc-timeout 30;
                degr-wfc-timeout 120;
        }

        disk {
                on-io-error detach;
        }

        syncer {
                rate 600M;
        }

        on servidor-01 {
                device          /dev/drbd1;
                disk            /dev/mapper/ORACLE-DB;
                address         172.16.1.1:7789;
                meta-disk       internal;
        }

        on servidor-02 {
                device          /dev/drbd1;
                disk            /dev/mapper/ORACLE-DB;
                address         172.16.1.2:7789;
                meta-disk       internal;
        }

}

root@servidor-01:~# cat /etc/ha.d/ha.cf
logfacility local0
keepalive 2
deadtime 20
warntime 10
initdead 60
ucast eth1 172.16.1.2
auto_failback off
node Servidor-01
node Servidor-02
ping 192.168.1.1 192.168.1.5
apiauth pingd uid=hacluster
respawn hacluster /usr/lib/heartbeat/pingd -m 100 -d 5s
#respawn hacluster /usr/lib/heartbeat/ipfail
deadping 20
debug 0
debugfile /var/log/ha-debug
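
One thing that may be worth ruling out: the `node` lines above are spelled `Servidor-01` and `Servidor-02`, while the shell prompt shows `servidor-01`. Heartbeat expects node names to match `uname -n`; whether the difference in case matters in this setup is an assumption, not a confirmed cause of the loop. A minimal sketch that compares a host name against the `node` entries (the function name `check_node_name` is hypothetical):

```shell
#!/bin/sh
# Compare a host name against the "node" entries of an ha.cf file.
# Prints "exact", "case-differs" or "missing".
# Note: plain sh has no portable "local", so these variables are global.
check_node_name() {
    ha_cf=$1
    host=$2
    result=missing
    host_lc=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
    # Collect every name that follows a "node" keyword in the file.
    for name in $(awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }' "$ha_cf"); do
        name_lc=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$name" = "$host" ]; then
            result=exact
            break
        elif [ "$name_lc" = "$host_lc" ]; then
            result=case-differs
        fi
    done
    echo "$result"
}
```

It could be run on each node as `check_node_name /etc/ha.d/ha.cf "$(uname -n)"`.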

root@servidor-01:~# cat /etc/ha.d/haresources
servidor-01 10.10.10.28 drbddisk::r0 Filesystem::/dev/drbd0::/VBI::xfs drbddisk::r1 Filesystem::/dev/drbd1::/usr/lib/oracle::xfs oracle-xe MacFloating ArPing mon
root@servidor-01:~#

root@servidor-01:/etc/ha.d/resource.d# cat MacFloating
#!/bin/bash


#
# Process
case $1 in
        start)

        sudo macchanger -m 2a:45:aa:a6:aa:7d eth0

        #sudo arping -c 3 -I eth0:0 10.10.10.28

        ;;

        stop)

        sudo macchanger -m f6:8b:0d:1f:2a:e5 eth0


        #sudo arping -c 3 -I eth0 10.10.10.26

        ;;
        status)
        ifconfig
                ;;

        *)
                echo "Usage: $0 start|stop|status"
                exit 2
                ;;
esac
exit 0

root@servidor-01:/etc/ha.d/resource.d# cat ArPing
#!/bin/bash

ARPING=`which arping`
SUDO=`which sudo`

#
# Process
case $1 in
        start)
        sleep 2

        $SUDO $ARPING -c 2 -I eth0 10.10.10.26
        $SUDO $ARPING -c 2 -I eth0:0 10.10.10.28

        ;;

        stop)

        sleep 2
        $SUDO $ARPING -c 2 -I eth0 10.10.10.26

        ;;
        status)
        ifconfig
                ;;

        *)
                echo "Usage: $0 start|stop|status"
                exit 2
                ;;
esac
exit 0

=====================
Logs

Log server 01

Sep 24 12:21:51 servidor-01 heartbeat: [1681]: info: servidor-02 wants to
go standby [all]
Sep 24 12:22:01 servidor-01 heartbeat: [1681]: info: standby: acquire [all]
resources from servidor-02
Sep 24 12:22:01 servidor-01 heartbeat: [20239]: info: acquire all HA
resources (standby).
ResourceManager[20253]: 2013/09/24_12:22:02 info: Acquiring resource group:
servidor-01 10.10.10.28 drbddisk::r0 Filesystem::/dev/drbd0::/VBI::xfs
drbddisk::r1 Filesystem::/dev/drbd1::/usr/lib/oracle::xfs oracle-xe
MacFloating ArPing mon
IPaddr[20280]:  2013/09/24_12:22:02 INFO:  Resource is stopped
ResourceManager[20253]: 2013/09/24_12:22:02 info: Running
/etc/ha.d/resource.d/IPaddr 10.10.10.28 start
IPaddr[20338]:  2013/09/24_12:22:02 INFO: Using calculated nic for
10.10.10.28: eth0
IPaddr[20338]:  2013/09/24_12:22:02 INFO: Using calculated netmask for
10.10.10.28: 255.255.255.192
IPaddr[20338]:  2013/09/24_12:22:02 INFO: eval ifconfig eth0:0 10.10.10.28
netmask 255.255.255.192 broadcast 10.10.10.63
IPaddr[20326]:  2013/09/24_12:22:02 INFO:  Success
INFO:  Success
ResourceManager[20253]: 2013/09/24_12:22:02 info: Running
/etc/ha.d/resource.d/drbddisk r0 start
Filesystem[20476]:      2013/09/24_12:22:02 INFO:  Resource is stopped
ResourceManager[20253]: 2013/09/24_12:22:02 info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /VBI xfs start
Filesystem[20552]:      2013/09/24_12:22:02 INFO: Running start for
/dev/drbd0 on /VBI
FATAL: Module scsi_hostadapter not found.
Filesystem[20546]:      2013/09/24_12:22:02 INFO:  Success
INFO:  Success
ResourceManager[20253]: 2013/09/24_12:22:02 info: Running
/etc/ha.d/resource.d/drbddisk r1 start
Filesystem[20658]:      2013/09/24_12:22:02 INFO:  Resource is stopped
ResourceManager[20253]: 2013/09/24_12:22:02 info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /usr/lib/oracle xfs start
Filesystem[20734]:      2013/09/24_12:22:02 INFO: Running start for
/dev/drbd1 on /usr/lib/oracle
FATAL: Module scsi_hostadapter not found.
Filesystem[20728]:      2013/09/24_12:22:04 INFO:  Success
INFO:  Success
ResourceManager[20253]: 2013/09/24_12:22:04 info: Running
/etc/ha.d/resource.d/oracle-xe  start
Starting Oracle Net Listener.
Starting Oracle Database 10g Express Edition Instance.

ResourceManager[20253]: 2013/09/24_12:22:13 info: Running
/etc/ha.d/resource.d/MacFloating  start
Current MAC: f6:8b:0d:1f:2a:e5 (unknown)
Faked MAC:   2a:45:aa:a6:aa:7d (unknown)
ResourceManager[20253]: 2013/09/24_12:22:13 info: Running
/etc/ha.d/resource.d/ArPing  start
ARPING 10.10.10.26

--- 10.10.10.26 statistics ---
2 packets transmitted, 0 packets received, 100% unanswered (0 extra)
ARPING 10.10.10.28

--- 10.10.10.28 statistics ---
2 packets transmitted, 0 packets received, 100% unanswered (0 extra)
ResourceManager[20253]: 2013/09/24_12:22:20 info: Running
/etc/ha.d/resource.d/mon  start
Starting mon daemon : mon.
Sep 24 12:22:21 servidor-01 heartbeat: [20239]: info: all HA resource
acquisition completed (standby).
Sep 24 12:22:21 servidor-01 heartbeat: [1681]: info: Standby resource
acquisition done [all].
Sep 24 12:22:21 servidor-01 heartbeat: [1681]: info: remote resource
transition completed.

Log Server 02

Sep 24 12:22:48 servidor-02 heartbeat: [1680]: info: standby: acquire [all]
resources from servidor-01
Sep 24 12:22:48 servidor-02 heartbeat: [1642]: info: acquire all HA
resources (standby).
ResourceManager[1656]:  2013/09/24_12:22:48 info: Acquiring resource group:
servidor-01 10.10.10.28 drbddisk::r0 Filesystem::/dev/drbd0::/VBI::xfs
drbddisk::r1 Filesystem::/dev/drbd1::/usr/lib/oracle::xfs oracle-xe
MacFloating ArPing mon
IPaddr[1685]:   2013/09/24_12:22:48 INFO:  Resource is stopped
ResourceManager[1656]:  2013/09/24_12:22:49 info: Running
/etc/ha.d/resource.d/IPaddr 10.10.10.28 start
IPaddr[1751]:   2013/09/24_12:22:49 INFO: Using calculated nic for
10.10.10.28: eth0
IPaddr[1751]:   2013/09/24_12:22:49 INFO: Using calculated netmask for
10.10.10.28: 255.255.255.192
IPaddr[1751]:   2013/09/24_12:22:49 INFO: eval ifconfig eth0:0 10.10.10.28
netmask 255.255.255.192 broadcast 10.10.10.63
IPaddr[1738]:   2013/09/24_12:22:49 INFO:  Success
INFO:  Success
ResourceManager[1656]:  2013/09/24_12:22:49 info: Running
/etc/ha.d/resource.d/drbddisk r0 start
Filesystem[1901]:       2013/09/24_12:22:49 INFO:  Resource is stopped
ResourceManager[1656]:  2013/09/24_12:22:49 info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /VBI xfs start
Filesystem[1977]:       2013/09/24_12:22:49 INFO: Running start for
/dev/drbd0 on /VBI
FATAL: Module scsi_hostadapter not found.
Filesystem[1971]:       2013/09/24_12:22:49 INFO:  Success
INFO:  Success
ResourceManager[1656]:  2013/09/24_12:22:49 info: Running
/etc/ha.d/resource.d/drbddisk r1 start
Filesystem[2084]:       2013/09/24_12:22:49 INFO:  Resource is stopped
ResourceManager[1656]:  2013/09/24_12:22:49 info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /usr/lib/oracle xfs start
Filesystem[2159]:       2013/09/24_12:22:49 INFO: Running start for
/dev/drbd1 on /usr/lib/oracle
FATAL: Module scsi_hostadapter not found.
Filesystem[2153]:       2013/09/24_12:22:51 INFO:  Success
INFO:  Success
ResourceManager[1656]:  2013/09/24_12:22:51 info: Running
/etc/ha.d/resource.d/oracle-xe  start
Starting Oracle Net Listener.
Starting Oracle Database 10g Express Edition Instance.

ResourceManager[1656]:  2013/09/24_12:23:00 info: Running
/etc/ha.d/resource.d/MacFloating  start
Current MAC: 78:31:2f:bd:7c:2c (unknown)
Faked MAC:   2a:45:aa:a6:aa:7d (unknown)
ResourceManager[1656]:  2013/09/24_12:23:00 info: Running
/etc/ha.d/resource.d/ArPing  start
ARPING 10.10.10.27

--- 10.10.10.27 statistics ---
2 packets transmitted, 0 packets received, 100% unanswered (0 extra)
ARPING 10.10.10.28

--- 10.10.10.28 statistics ---
2 packets transmitted, 0 packets received, 100% unanswered (0 extra)
ResourceManager[1656]:  2013/09/24_12:23:07 info: Running
/etc/ha.d/resource.d/mon  start
Starting mon daemon : mon.
Sep 24 12:23:08 servidor-02 heartbeat: [1642]: info: all HA resource
acquisition completed (standby).
Sep 24 12:23:08 servidor-02 heartbeat: [1680]: info: Standby resource
acquisition done [all].
Sep 24 12:23:08 servidor-02 heartbeat: [1680]: info: remote resource
transition completed.
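
To make the loop easier to see, the standby and takeover events can be pulled out of the logs on both nodes and lined up by time. This is only a sketch: it assumes the heartbeat messages are read from the log file itself, with one message per line (the excerpts above were wrapped by the mail client), and the function name `takeover_times` is hypothetical.

```shell
#!/bin/sh
# Print the time, the logging host and the message for each heartbeat
# standby/takeover event. Reads log lines on stdin; the patterns are taken
# from the messages quoted above.
takeover_times() {
    grep -E 'heartbeat: \[[0-9]+\]: info: (standby: acquire|.* wants to)' |
        awk '{
            # $3 = time, $4 = host, message text starts at field 8
            printf "%s %s", $3, $4
            for (i = 8; i <= NF; i++) printf " %s", $i
            printf "\n"
        }'
}
```

It would be called as `takeover_times < LOGFILE` on each node, where `LOGFILE` stands for whichever file syslog writes the `local0` facility to on that system.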


--
Bruno Medico • Network Administrator
(11) 3014-8628

"Because of the blacksmith, the nail was lost; because of the nail, the
horseshoe was lost; because of the horseshoe, the horse was lost; because of
the horse, the messenger was lost; because of the messenger, the letter was
lost; because of the letter, the war was lost."