[DRBD-user] Not able to test Automatic split brain recovery policies

Dan Barker dbarker at visioncomm.net
Thu Apr 11 14:27:46 CEST 2013

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


> -----Original Message-----
> From: Shailesh Vaidya [mailto:shailesh_vaidya at persistent.co.in]
> Sent: Thursday, April 11, 2013 1:50 AM
> To: Digimer
> Cc: Dan Barker; drbd-user at lists.linbit.com
> Subject: RE: [DRBD-user] Not able to test Automatic split brain recovery
> policies
> 
> Hi Digimer,
> 
> Thanks for the help and the explanation. I will try out the fencing option.
> 
> However, I would like to validate whether what I am testing for
> split-brain is correct. Also, what could be done for simple split-brain
> auto-recovery through configuration alone, without fencing?
> 

There is no "simple split-brain" recovery. Split Brain only occurs after an error of some sort causing two different nodes to write to the same resource while disconnected. Anything other than manual recovery of files or blocks will lose data. In many cases, it's not even possible to determine what data is being lost or how to recover it. You just have to pick the lesser of two evils and move forward, honoring the writes to one node and discarding the writes done on the other. Most applications and file systems react poorly to having writes of theirs discarded.

Any effort spent automating recovery from a split brain would be better spent identifying how your configuration created the split brain in the first place; usually the cause is dual primary without sufficient controls in place to prevent it.

ymmv

Dan

> Regards,
> Shailesh Vaidya
> 
> 
> -----Original Message-----
> From: Digimer [mailto:lists at alteeve.ca]
> Sent: Wednesday, April 10, 2013 11:17 PM
> To: Shailesh Vaidya
> Cc: Dan Barker; drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] Not able to test Automatic split brain recovery
> policies
> 
> I've not done fencing in DRBD alone, so I am unable to offer specific
> suggestions. I can speak generally to what you need, though.
> 
> You can set DRBD's fencing policy to 'resource-and-stonith'. What this
> does is tell DRBD "When you lose your peer, block IO and call a fence
> against it". The fence action reaches out (usually via IPMI or managed
> PDU) and forces the peer offline. After that, the surviving node will
> proceed. This way, at no time will both nodes be operating in StandAlone
> and Primary.
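> 
> As a rough sketch, the relevant pieces of drbd.conf would look something
> like this (the handler path is only an example; on 8.3 the fencing policy
> goes in the disk section):
> 
>   disk {
>     fencing resource-and-stonith;   # block IO and fence the peer on loss
>   }
> 
>   handlers {
>     # script that actually reaches out and kills the peer
>     fence-peer "/usr/local/sbin/my-fence-handler.sh";
>   }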
> 
> You will want to set a delay so that one of the nodes has a head start
> when trying to fence the other. This way, in your test, when the
> communication breaks but the nodes are still up, you remove the risk of
> both nodes being fenced. What this does is say "when you want to fence
> node 1, wait 15 seconds before doing so; when you want to fence node 2,
> don't wait and fence immediately". Thus, when it's a break in
> communications, you can predict which node will win the fence.
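> 
> Purely as an illustration (the real fence script path is hypothetical),
> that head start could be a tiny wrapper used as the fence handler:
> 
>   #!/bin/bash
>   # Give drbd1 the head start: drbd2 waits before fencing, so if both
>   # nodes race to fence each other, drbd1 wins.
>   case "$(uname -n)" in
>       drbd1) ;;                # drbd1 fencing drbd2: go immediately
>       drbd2) sleep 15 ;;       # drbd2 fencing drbd1: wait 15 seconds
>   esac
>   exec /usr/local/sbin/real-fence-handler "$@"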
> 
> When a node really fails, it will obviously not try to fence, being dead,
> so the healthy node will always win the fence and then take over.
> 
> How you actually fence the peer will depend on what options you have
> available. You need a script that will actually do the work of reaching
> out and killing the peer. As I mentioned, this is usually done via IPMI
> (or branded out-of-band interfaces like iLO, DRAC, RSA, etc.) or by using
> managed PDUs, like the APC AP7900. To do this, you need to have a script
> that reads certain environment variables set by DRBD, executes the
> request and then returns an appropriate exit code based on success or
> failure.
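> 
> A minimal skeleton of such a handler might look like this (the power-off
> command is a placeholder; check your version's drbd.conf man page for the
> exact environment variables and exit codes):
> 
>   #!/bin/bash
>   # DRBD sets DRBD_RESOURCE, DRBD_MINOR and DRBD_PEER for handlers.
>   logger -t fence-handler "fencing $DRBD_PEER for $DRBD_RESOURCE"
>   if power_off_somehow "$DRBD_PEER"; then
>       exit 7    # 7 = peer was successfully stonithed
>   else
>       exit 1    # non-success: DRBD treats the fence attempt as failed
>   fi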
> 
> I wrote such a fence handler called "rhcs_fence" (based on
> obliterate-peer.sh) which handles fencing by passing the request up to
> rhcs. You should be able to fairly easily adapt it to work with your
> setup.
> 
> https://github.com/digimer/rhcs_fence
> 
> Hope this helps clarify things.
> 
> digimer
> 
> On 04/10/2013 01:22 PM, Shailesh Vaidya wrote:
> > Hi Don,
> >
> > Yup, 8.3.8 is quite old, but I need to work with it for now.
> >
> > I am not using fencing, nor Pacemaker or RHCS.
> >
> > What I observed is that after the split-brain the connection gets
> > dropped and both nodes become unknown to each other. I am not sure
> > whether this is an issue with my test procedure itself.
> >
> > Do I need to make any additional configuration?
> >
> > Thanks,
> > Shailesh Vaidya.
> >
> > ________________________________________
> > From: Digimer [lists at alteeve.ca]
> > Sent: Wednesday, April 10, 2013 8:11 PM
> > To: Shailesh Vaidya
> > Cc: Dan Barker; drbd-user at lists.linbit.com
> > Subject: Re: [DRBD-user] Not able to test Automatic split brain recovery
> policies
> >
> > To your immediate problem:
> >
> > If you had configured fencing, drbd would not split-brain. Are you
> > using pacemaker or RHCS?
> >
> > Secondly, 8.3.8 is very, very old. Upgrading to a newer 8.3.x version
> > would be a good idea.
> >
> > Back to split-brain; DRBD declares a split-brain as soon as both nodes
> > are StandAlone and Primary. To recover, you need to tell DRBD which
> > node to consider "good" and then drop the changes on the peer and let
> > the good node sync to the other node.
> >
> > On 04/10/2013 08:08 AM, Shailesh Vaidya wrote:
> >> I have followed the same procedure (disable the Ethernet card, etc.)
> >> and afterwards the DRBD status on both nodes is:
> >>
> >> [root at drbd1 ~]# cat /proc/drbd
> >>
> >> version: 8.3.8 (api:88/proto:86-94)
> >>
> >> GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
> >> mockbuild at builder10.centos.org, 2010-06-04 08:04:09
> >>
> >> 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----
> >>
> >>       ns:4 nr:0 dw:12 dr:82 al:1 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
> >> oos:4
> >>
> >> [root at drbd1 ~]#
> >>
> >> [root at drbd2 ~]# cat /proc/drbd
> >>
> >> version: 8.3.8 (api:88/proto:86-94)
> >>
> >> GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
> >> mockbuild at builder10.centos.org, 2010-06-04 08:04:09
> >>
> >> 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
> >>
> >>       ns:0 nr:4 dw:56 dr:42 al:1 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b
> >> oos:4
> >>
> >> [root at drbd2 ~]#
> >>
> >> /var/log/messages shows
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm initial-split-brain minor-0
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: Split-Brain detected
> >> but unresolved, dropping connection!
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm split-brain minor-0
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: conn( WFReportParams
> >> -> Disconnecting )
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: error receiving
> >> ReportState, l: 4!
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: asender terminated
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: Terminating asender
> >> thread
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: Connection closed
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: conn( Disconnecting ->
> >> StandAlone )
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: receiver terminated
> >>
> >> Apr 10 07:51:35 localhost kernel: block drbd0: Terminating receiver
> >> thread
> >>
> >> Now if I do 'drbdadm connect r0' on both the machines then,
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: uuid_compare()=100 by
> >> rule 90
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm initial-split-brain minor-0
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: Split-Brain detected,
> >> 1 primaries, automatically solved. Sync from peer node
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: peer( Unknown ->
> >> Primary
> >> ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: conn( WFBitMapT ->
> >> WFSyncUUID )
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm before-resync-target minor-0
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: helper command:
> >> /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: conn( WFSyncUUID ->
> >> SyncTarget ) disk( UpToDate -> Inconsistent )
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: Began resync as
> >> SyncTarget (will sync 4 KB [1 bits set]).
> >>
> >> Apr 10 07:56:37 localhost kernel: block drbd0: Resync done (total 1
> >> sec; paused 0 sec; 4 K/sec)
> >>
> >> Regards,
> >>
> >> Shailesh Vaidya
> >>
> >> From: drbd-user-bounces at lists.linbit.com
> >> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Dan Barker
> >> Sent: Wednesday, April 10, 2013 5:16 PM
> >> To: drbd-user at lists.linbit.com
> >> Subject: Re: [DRBD-user] Not able to test Automatic split brain
> >> recovery policies
> >>
> >> You don't show the status of the nodes, but I imagine you have two
> >> primary nodes. There is no handler specified for two primary nodes.
> >> Did you have two primary, disconnected nodes?
> >>
> >> It shouldn't be possible to create split brain without writing on
> >> both nodes.
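> >>
> >> If you really do end up Primary/Primary while disconnected, the knob
> >> that covers it is after-sb-2pri, e.g. (illustrative only):
> >>
> >> net {
> >>     after-sb-0pri discard-zero-changes;
> >>     after-sb-1pri discard-secondary;
> >>     after-sb-2pri disconnect;   # or call-pri-lost-after-sb
> >> }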
> >>
> >> Dan
> >>
> >> From: drbd-user-bounces at lists.linbit.com
> >> [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Shailesh
> >> Vaidya
> >> Sent: Wednesday, April 10, 2013 1:58 AM
> >> To: drbd-user at lists.linbit.com
> >> Subject: [DRBD-user] Not able to test Automatic split brain
> >> recovery policies
> >>
> >> Hello,
> >>
> >> I am using DRBD 8.3.8
> >>
> >> I have configured Automatic split brain recovery policies as below in
> >> /etc/drbd.conf
> >>
> >> net {
> >>
> >> max-buffers     2048;
> >>
> >> ko-count 4;
> >>
> >> after-sb-0pri discard-zero-changes;
> >>
> >> after-sb-1pri discard-secondary;
> >>
> >> }
> >>
> >> Both my machines are virtual machines, so they are not connected with
> >> an actual back-to-back connection. To reproduce split-brain, I am using
> >> the procedure below:
> >>
> >> 1. On the Primary, disable the Ethernet card from 'Virtual Machine
> >> properties'.
> >>
> >> 2. Wait for the Secondary to start switching over, then re-enable the
> >> Ethernet card on the Primary.
> >>
> >> The log shows me that a split-brain occurred; however, it shows the
> >> connection being dropped.
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: uuid_compare()=100 by rule
> >> 90
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: helper command:
> >> /sbin/drbdadm initial-split-brain minor-0
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: helper command:
> >> /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: Split-Brain detected but
> >> unresolved, dropping connection!
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: helper command:
> >> /sbin/drbdadm split-brain minor-0
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: helper command:
> >> /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> >>
> >> Apr  9 10:30:15 drbd1 kernel: block drbd0: conn( WFReportParams ->
> >> Disconnecting )
> >>
> >> Full DRBD conf file
> >>
> >> [root at drbd1 ~]# cat /etc/drbd.conf
> >>
> >> global {
> >>
> >> usage-count no;
> >>
> >> }
> >>
> >> resource r0 {
> >>
> >> protocol C;
> >>
> >> #incon-degr-cmd "echo !DRBD! pri on incon-degr | wall ; sleep 60 ;
> >> halt -f";
> >>
> >> on drbd1 {
> >>
> >> device     /dev/drbd0;
> >>
> >> disk       /dev/sda3;
> >>
> >> address    10.55.199.51:7789;
> >>
> >> meta-disk  internal;
> >>
> >> }
> >>
> >> on drbd2 {
> >>
> >> device    /dev/drbd0;
> >>
> >> disk      /dev/sda3;
> >>
> >> address   10.55.199.52:7789;
> >>
> >> meta-disk internal;
> >>
> >> }
> >>
> >> disk {
> >>
> >> on-io-error   detach;
> >>
> >> }
> >>
> >> net {
> >>
> >> max-buffers     2048;
> >>
> >> ko-count 4;
> >>
> >> after-sb-0pri discard-zero-changes;
> >>
> >> after-sb-1pri discard-secondary;
> >>
> >> }
> >>
> >> syncer {
> >>
> >> rate 25M;
> >>
> >> al-extents 257; # must be a prime number
> >>
> >> }
> >>
> >> startup {
> >>
> >> wfc-timeout  20;
> >>
> >> degr-wfc-timeout 120;    # 2 minutes.
> >>
> >> }
> >>
> >> }
> >>
> >> [root at drbd1 ~]#
> >>
> >> Is this a configuration issue, or is my testing procedure not proper?
> >>
> >> Regards,
> >>
> >> Shailesh Vaidya


