[DRBD-user] Default Split Brain Behaviour

Lew ls at redgrid.net
Tue Jan 25 00:36:03 CET 2011



Thanks for the reply, Felix.
The configuration is included below and a log extract is attached, as requested.

> Hi,
> 
> On 01/24/2011 01:20 AM, Lew wrote:
> > I've encountered some unexpected behavior with a split-brain instance.
> > From what has occurred, it seems that the default behavior is to roll
> > back and discard changes.
> >
> > Recently, in my sandpit, I've been manually disconnecting resources as
> > an ad hoc way of maintaining a snapshot for rollback.
> > This way, if I'm happy with the changes, I can reconnect and within
> > seconds we're fully synced again.
> >
> > I moved a server yesterday and discovered, after a drbdadm connect all,
> > that one of the resources had split-brained and discarded a few days'
> > worth of work, rolling back to the point in time when the resource was
> > first disconnected.
> >
> > What's interesting to me is that the disconnected secondary node had
> > never been set primary, so how did we end up in split brain?
> > I also do not understand why only this resource split-brained, when
> > others in seemingly identical configurations and states did not.
> >
> > I expect I'll need to explicitly prohibit this behavior in a global net
> > section covering after-sb-0pri etc.
> > I still don't understand why discard and roll back was chosen as the
> > default behavior; from my experience, I contend it should not be.
> 
> It's not.

OK, seems to me something smells then.

> > Looks to me like a few days' work is lost, but if anyone knows of a way
> > to recover from a rollback-and-discard scenario, I'd be very happy to
> > find out.
> 
> Please share pertinent logs and drbd configuration.

Config
------
resource x2 {
    protocol A;

    syncer {
        rate 100M;
    }

    on emlsurit-v4 {
        device              /dev/drbd9;
        disk                /dev/r50lvm/emlsurit-x2-drbd;
        address             192.168.254.100:7799;
        flexible-meta-disk  internal;
    }

    on emlsurit-v5 {
        device     /dev/drbd9;
        disk       /dev/r50lvm/emlsurit-x2-drbd;
        address    192.168.254.101:7799;
        meta-disk  internal;
    }
}

Global Config (comments removed)
-------------
global {
    usage-count yes;
}

common {
    protocol A;

    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh";
        echo o > /proc/sysrq-trigger ; halt -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh";
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
    }

    startup {
    }

    disk {
        no-disk-flushes;
        no-md-flushes;
    }

    net {
    }

    syncer {
    }
}
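
For reference, the net section above is empty, so the after-sb-* policies
are left at whatever the installed DRBD version uses as defaults. A minimal
sketch of spelling them out explicitly is shown here, assuming DRBD 8.3-style
syntax. As far as I can tell from the drbd.conf documentation, "disconnect" is
already the default for all three options, so this would only make the intent
visible rather than change behavior; treat it as an illustration, not a
tested fix for the problem described in this thread.

```
common {
    net {
        # Assumption: DRBD 8.3 option names and policy values.
        # "disconnect" means no automatic resolution: on detecting
        # split brain the nodes drop the connection and wait for
        # manual recovery, so neither side's data is discarded.
        after-sb-0pri disconnect;   # neither node is primary
        after-sb-1pri disconnect;   # exactly one node is primary
        after-sb-2pri disconnect;   # both nodes are primary
    }
}
```

With all three policies set this way, a detected split brain should leave
both nodes standing alone until an administrator decides which side's
changes to keep.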

Message log extract (a bit long to post): see attached.

Cheers & thanks,

Lew

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: drbd_msgs
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20110125/e534249a/attachment.asc>

