[DRBD-user] Split-Brain issue

Alexandre N n.alexinfo at gmail.com
Thu Jan 22 18:16:43 CET 2009



Hi

I'm glad to be a newcomer on the DRBD mailing list.
I have recently been testing a virtual appliance (XVS) to create a virtual
SAN in an HA infrastructure.
Everything seems to work correctly, but the problem comes when I disconnect
the Ethernet link of one of the nodes: a split brain is then detected. I
chose dual-primary mode for the replication, but no synchronization
runs afterwards.
This is my drbd.conf:

[root@lvs-node1 ~]# cat /etc/drbd.conf
global {
        usage-count no;
}

common {
        syncer {
                rate 100M;
        }

        handlers {
                outdate-peer "/usr/lib/heartbeat/outdate-peer.sh";
                split-brain "/usr/lib/drbd/notify.sh";
        }
}

resource vmfs-0 {
        protocol C;

        startup {
                become-primary-on both;
                degr-wfc-timeout 120;
        }

        net {
                after-sb-0pri discard-zero-changes;
                after-sb-1pri consensus;
                after-sb-2pri disconnect;
                allow-two-primaries;
        }

        disk {
                on-io-error pass_on;
#                fencing resource-only;
        }

        on lvs-node1.xtravirt.com {
                device     /dev/drbd0;
                disk       /dev/sdb;
                address    10.1.14.130:7788;
                meta-disk  internal;
        }

        on lvs-node2.xtravirt.com {
                device     /dev/drbd0;
                disk       /dev/sdb;
                address    10.1.14.132:7788;
                meta-disk  internal;
        }
}

This is my ha.cf:

[root@lvs-node1 ~]# cat /etc/ha.d/ha.cf
use_logd yes
ucast eth0 10.1.14.132
node lvs-node1.xtravirt.com
node lvs-node2.xtravirt.com
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd uid=hacluster gid=haclient
crm on
watchdog /dev/watchdog
uuidfrom nodename

And here is part of my messages log:
Jan 22 16:47:01 lvs-node1 kernel: drbd0: self 5D269E2327F99655:ADCF23B2F549C030:B69B7F2C654BAE11:60C98125A255DAAF
Jan 22 16:47:01 lvs-node1 kernel: drbd0: Split-Brain detected, dropping connection!
Jan 22 16:47:01 lvs-node1 kernel: drbd0: data-integrity-alg: <not-used>
Jan 22 16:47:01 lvs-node1 kernel: drbd0: Starting asender thread (from drbd0_receiver [1538])
Jan 22 16:47:01 lvs-node1 kernel: drbd0: conn( WFConnection -> WFReportParams )
Jan 22 16:47:01 lvs-node1 kernel: drbd0: Handshake successful: Agreed network protocol version 88
Jan 22 16:46:54 lvs-node1 kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 22 16:46:54 lvs-node1 kernel: drbd0: receiver (re)started
Jan 22 16:46:54 lvs-node1 kernel: drbd0: Starting receiver thread (from drbd0_worker [1516])
Jan 22 16:46:54 lvs-node1 kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 22 16:46:54 lvs-node1 kernel: drbd0: role( Secondary -> Primary )
Jan 22 16:46:54 lvs-node1 kernel: drbd0: Terminating receiver thread
Jan 22 16:46:54 lvs-node1 kernel: drbd0: receiver terminated
Jan 22 16:46:54 lvs-node1 kernel: drbd0: conn( Disconnecting -> StandAlone )
Jan 22 16:46:54 lvs-node1 kernel: drbd0: Connection closed
Jan 22 16:46:54 lvs-node1 kernel: drbd0: tl_clear()
Jan 22 16:46:54 lvs-node1 kernel: drbd0: Terminating asender thread
Jan 22 16:46:54 lvs-node1 kernel: drbd0: asender terminated
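
To make the sequence of state changes in the excerpt above easier to follow, the conn( ... ) and role( ... ) entries can be pulled out mechanically. This is only a minimal sketch, assuming the kernel lines keep the "conn( A -> B )" / "role( A -> B )" form shown here; the names TRANSITION and transitions are hypothetical helpers, not part of DRBD.

```python
import re

# Matches DRBD state-change fragments such as "conn( StandAlone -> Unconnected )".
# Assumption: the log format is exactly the one shown in the excerpt.
TRANSITION = re.compile(r"(conn|role)\( (\w+) -> (\w+) \)")

def transitions(lines):
    """Return (kind, old_state, new_state) for every state change found."""
    found = []
    for line in lines:
        for kind, old, new in TRANSITION.findall(line):
            found.append((kind, old, new))
    return found

# Three lines copied from the excerpt, in the order they appear there.
sample = [
    "Jan 22 16:47:01 lvs-node1 kernel: drbd0: conn( WFConnection -> WFReportParams )",
    "Jan 22 16:46:54 lvs-node1 kernel: drbd0: role( Secondary -> Primary )",
    "Jan 22 16:46:54 lvs-node1 kernel: drbd0: Connection closed",
]

for kind, old, new in transitions(sample):
    print(kind, old, "->", new)
```

Lines without a state change, such as "Connection closed", are simply skipped.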

I just want an automatic resynchronization to start when a split brain is
detected, but it doesn't work with the "discard-zero-changes" option.
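
One detail that may be relevant here: as the DRBD documentation describes it, the after-sb-0pri policy is only consulted when neither node is Primary at the moment the split brain is detected; with one or two Primaries, after-sb-1pri or after-sb-2pri is used instead. The sketch below is only a reading aid under that assumption, using the values from the net section of the drbd.conf above; AFTER_SB and policy_in_effect are hypothetical names, not part of DRBD.

```python
# Policies copied from the net section of the drbd.conf in this post.
AFTER_SB = {
    0: ("after-sb-0pri", "discard-zero-changes"),
    1: ("after-sb-1pri", "consensus"),
    2: ("after-sb-2pri", "disconnect"),
}

def policy_in_effect(roles):
    """Select the after-sb-* setting by counting nodes that are Primary.

    Hypothetical helper: it models only the selection rule, not DRBD itself.
    """
    primaries = sum(1 for role in roles if role == "Primary")
    return AFTER_SB[primaries]

# Both nodes Primary, as "become-primary-on both" would produce.
print(policy_in_effect(["Primary", "Primary"]))
# Neither node Primary: the only case where discard-zero-changes is consulted.
print(policy_in_effect(["Secondary", "Secondary"]))
```

With both nodes Primary this selects after-sb-2pri, which is set to disconnect in this configuration. That would be consistent with the connection being dropped in the log instead of a resynchronization starting.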

If you have any advice on this issue, please let me know.
Thanks

Best Regards,

Alexandre


