[DRBD-user] Problems with DRBD and OCFS2 in Active-Active Mode

Dzianis Kahanovich mahatma3 at bspu.unibel.by
Wed Jun 30 15:25:53 CEST 2010



Fernando de Lima e Silva wrote:

> Does anybody work with OCFS2 and two primary nodes? How can I set up an
> automatic policy for split-brain situations?
> 
> I'm using DRBD to replicate /var/www on 2 web servers. I use it only for
> load balancing; I don't need HA.

Yes, I use OCFS2 in dual-primary mode.
RTFM first. My setup is non-standard, though: I want split brains to be
resolved automatically in every normal case. I use:

===
net{
# both split-brained nodes are secondary (double boot).
# The manual suggests discard-zero-changes.
after-sb-0pri discard-least-changes;

# one of the split-brained nodes is primary (single reboot):
# discard the rebooted node. This is the normal HA setting (also per the manual):
#after-sb-1pri discard-secondary;
# but I use this instead (sometimes not HA): discard-least-changes here too
after-sb-1pri violently-as0p;

# both nodes are primary (after a temporary disconnect).
# This is the standard setting (manual), but it will not resolve the SB after a disconnect:
#after-sb-2pri disconnect;
# I use this instead (use it only together with the options that follow):
after-sb-2pri violently-as0p;

# This is REQUIRED if you use violently-as0p.
# It means: if a primary node would become SyncTarget
# (after a disconnect, etc.), the "pri-lost" handler (reboot)
# is called to put this node into the secondary state.
rr-conflict call-pri-lost;
}

handlers{
# after an sb-2pri (or 1pri) resolution, "reboot" is called
# on the SyncTarget (least changes) node.
# It is invoked via rr-conflict; without it you may end up with a broken disk.
pri-lost "reboot";

# additional protection against a primary SyncTarget:
# if a resync onto a busy primary is attempted, the non-zero exit code prevents the sync
before-resync-target "/sbin/drbdsetup $DRBD_MINOR secondary";
}
===
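
The rule "violently-as0p requires rr-conflict call-pri-lost" is easy to forget when editing the config later. A minimal sketch of a check for it follows; the function name check_violently is hypothetical (it is not a DRBD tool), and the comment stripping is naive: it assumes "#" never appears inside a quoted string.

```shell
#!/bin/sh
# Sketch only: check_violently is a hypothetical helper, not part of DRBD.
# Reads a drbd.conf-style text on stdin.
# Succeeds unless violently-as0p is used without "rr-conflict call-pri-lost".
check_violently() {
    # Strip comments so that commented-out options are ignored.
    conf=$(sed 's/#.*//')
    if printf '%s\n' "$conf" | grep -q 'violently-as0p'; then
        # violently-as0p is active: the rr-conflict setting must be active too.
        printf '%s\n' "$conf" |
            grep -Eq 'rr-conflict[[:space:]]+call-pri-lost' || return 1
    fi
    return 0
}

# Possible use (assuming the config lives in /etc/drbd.conf):
#   check_violently < /etc/drbd.conf || echo "rr-conflict call-pri-lost missing"
```

This only looks at the text of one file; it does not follow include statements and does not check that the option sits in the right section.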

This config applies the after-sb-0pri policy in every SB case, but if the
SyncTarget node is primary, the rr-conflict/pri-lost mechanism reboots that
node to allow syncing. According to the manual, making a primary node
SyncTarget may destroy data.

In your boot script you must also wait for the sync to finish (if the node is
"SyncTarget") before going primary. The standard "wait-con-int" does not wait
for the sync to complete. A simple approach is to run this before
"drbdadm sh-b-pri all":

drbdsetup 0 wait-sync --wait-after-sb --outdated-wfc-timeout=18000 --degr-wfc-timeout=18000 --wfc-timeout=18000

(big timeouts, 5 min or more, so that the node is surely either synced or
alone; note that 18000 is 5 hours if the unit is seconds), but additional
diagnostics are recommended.
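
As one example of such additional diagnostics, a boot script could look at the connection state before promoting. A minimal sketch follows; the helper names drbd_cs and is_sync_target are hypothetical, and it assumes the 8.3-style /proc/drbd layout in which each device line starts with the minor number followed by "cs:<state>".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of DRBD.

# Print the connection state (the cs: field) of the given minor,
# reading /proc/drbd-style text from stdin. Prints nothing if the
# minor is not listed.
drbd_cs() {
    sed -n "s/^ *$1: cs:\([A-Za-z]*\).*/\1/p"
}

# Succeed if the given connection state means this node is the target
# of a running or paused resync, i.e. it must not go primary yet.
is_sync_target() {
    case "$1" in
        SyncTarget|PausedSyncT) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use in a boot script, before "drbdadm sh-b-pri all":
#   state=$(drbd_cs 0 < /proc/drbd)
#   if is_sync_target "$state"; then
#       echo "drbd0 is still $state, not going primary" >&2
#       exit 1
#   fi
```

Only the two resync-target states are treated as blocking here; other transitional states are not covered by this sketch.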

My Gentoo init script (and full config) can be found here:
http://raw.googlecode.com/svn/trunk/sys-cluster/drbd/files/

In experiments with 8.3.7 I was able to "become primary" on a SyncTarget node
without trouble, but I do not recommend doing so.

I could construct some situations in which data is destroyed with this setup,
but in the normal case (and with backups) it works and is safe.


