[DRBD-user] Secondary node io-error

Tue Oct 16 13:46:38 CEST 2012

On Sun, Oct 14, 2012 at 01:41:43PM +0000, Velayutham, Prakash wrote:
> On Oct 10, 2012, at 10:01 PM, Velayutham, Prakash wrote:
> 
> > On Oct 10, 2012, at 5:01 AM, Lars Ellenberg wrote:
> > 
> >> On Wed, Oct 10, 2012 at 03:42:02AM +0000, Velayutham, Prakash wrote:
> >>> 
> >>> On Oct 8, 2012, at 9:19 AM, Velayutham, Prakash wrote:
> >>> 
> >>>> On Oct 8, 2012, at 4:55 AM, Lars Ellenberg wrote:
> >>>> 
> >>>>> On Sat, Oct 06, 2012 at 01:08:43PM +0000, Velayutham, Prakash wrote:
> >>>>>> Hi,
> >>>>>> 
> >>>>>> I recently got a DRBD (8.4.2-2) cluster up (still testing). It seems to work nicely with Pacemaker CRM in several scenarios I have tested. Here is my config.
> >>>>>> 
> >>>>>> global {
> >>>>>>             usage-count     yes;
> >>>>>> }
> >>>>>> 
> >>>>>> common {
> >>>>>>     handlers {
> >>>>>>             outdate-peer    /usr/lib/drbd/crm-fence-peer.sh;
> >>>>>>             fence-peer      /usr/lib/drbd/crm-fence-peer.sh;

Uhm, outdate-peer is a deprecated synonym for fence-peer...

> >>>>>>             after-resync-target     /usr/lib/drbd/crm-unfence-peer.sh;
> >>>>>>             local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";

You realize that "echo o ..." may never be reached
if one of the earlier commands blocks?

You realize that any method of "self fencing" relies on cooperation of
the node itself, and if the node is already "damaged", that may simply fail?
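
A minimal sketch of one way to reduce the first risk, assuming the
handler is moved into a wrapper script: each notification step runs in
the background with a time limit, so a blocked notifier cannot keep the
power-off from being reached. The names run_bounded, emergency_poweroff
and DRY_RUN are hypothetical, not part of DRBD. This does nothing about
the second problem: a sufficiently damaged node may still fail to run
any of it.

```shell
#!/bin/sh
# Hypothetical wrapper for the local-io-error handler (names are illustrative).

# run_bounded SECONDS CMD...: start CMD in the background and wait at most
# SECONDS for it, then carry on regardless of whether it finished.
run_bounded() {
    limit=$1; shift
    "$@" &
    pid=$!
    waited=0
    while [ "$waited" -lt "$limit" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        waited=$((waited + 1))
    done
    # Best effort: a process stuck in uninterruptible I/O ignores even SIGKILL.
    kill -9 "$pid" 2>/dev/null
    return 0
}

# emergency_poweroff: the last step of the original handler. DRY_RUN defaults
# to 1 so that trying the sketch does not power anything off.
emergency_poweroff() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would power off now"
        return 0
    fi
    echo o > /proc/sysrq-trigger
    halt -f
}

# Intended use inside the wrapper (commented out here):
#   run_bounded 5 /usr/lib/drbd/notify-io-error.sh
#   run_bounded 5 /usr/lib/drbd/notify-emergency-shutdown.sh
#   DRY_RUN=0 emergency_poweroff
```

The time limit of 5 seconds is arbitrary. The design choice is only that
no step before the power-off is waited on without a bound.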

> >>>>>>             split-brain "/usr/lib/drbd/notify-split-brain.sh root";
> >>>>>>     }
> >>>>>> 
> >>>>>>     startup {
> >>>>>>             degr-wfc-timeout        0;
> >>>>>>     }
> >>>>>> 
> >>>>>>     net {
> >>>>>>             shared-secret   1QP69G4kWDslx2TMiaEStI6bwaGH5y8d;
> >>>>>>             after-sb-0pri discard-zero-changes;
> >>>>>>             after-sb-1pri discard-secondary;
> >>>>>>             after-sb-2pri disconnect;
> >>>>>>     }
> >>>>>> 
> >>>>>>     disk {
> >>>>>>             on-io-error     call-local-io-error;
> >>>>>>             fencing resource-and-stonith;
> >>>>>>     }
> >>>>>> 
> >>>>>> }
> >>>>>> 
> >>>>>> The io-error handler only gets called when the primary node has a disk
> >>>>>> issue. I have not seen the secondary node call the "local-io-error"
> >>>>>> handler when it had disk access issues. Is this by design?
> >>>>> 
> >>>>> No.
> >>>>> 
> >>>>> "Works for me", though.
> >>>>> 
> >>>>> Can you please double check?
> >>>>> And if in fact you can reproduce, tell us how, including logs?
> >> 
> >>>> If I disable all the FC ports in the fiber switch just for the
> >>>> primary node, the node fences, reboots and comes up, as I would
> >>>> expect. With the exact same config, if I disable the FC ports just
> >>>> for the secondary node, the node just sits there and it even shows
> >>>> up as Secondary in /proc/drbd.
> >> 
> >>>> That sounds odd and sounds like the
> >>>> config should be "diskless", but it is "call-local-io-error".
> >> 
> >> Huh? What has "config" to do with things,
> >> and what exactly is "config diskless"?

There was a question, but no answer yet...

> >>>> Which logs are you wanting me to share?
> >> 
> >> Those that show DRBD detecting an IO error,
> >> but not calling the io-error handler.

Again, where are those logs?

> >>>> Thanks,
> >>>> Prakash
> >>> 
> >>> Just wanted to add this. I repeated my test again and get the exact
> >>> same results again. Here is /proc/drbd of the primary (bmimysqlt3) and
> >>> secondary (bmimysqlt4) before the secondary's disk is cut off
> >>> (disabling the fiber switch port that the secondary is connected to)
> >>> 
> >>> [root@bmimysqlt3 ~]# cat /proc/drbd
> >>> version: 8.4.2 (api:1/proto:86-101)
> >>> GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@bmimysqlt3.chmcres.cchmc.org, 2012-10-02 00:02:32
> >>> 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
> >>>   ns:184 nr:0 dw:160 dr:14317 al:6 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> >>> 
> >>> [root@bmimysqlt4 ~]# cat /proc/drbd
> >>> version: 8.4.2 (api:1/proto:86-101)
> >>> GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@bmimysqlt3.chmcres.cchmc.org, 2012-10-02 00:02:32
> >>> 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
> >>>   ns:0 nr:184 dw:184 dr:0 al:0 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> >>> 
> >>> Here is /proc/drbd of primary and secondary about 5 minutes after the disk is cut off.
> >>> 
> >>> [root@bmimysqlt3 ~]# cat /proc/drbd
> >>> version: 8.4.2 (api:1/proto:86-101)
> >>> GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@bmimysqlt3.chmcres.cchmc.org, 2012-10-02 00:02:32
> >>> 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
> >>>   ns:184 nr:0 dw:160 dr:14317 al:6 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> >> 
> >> No additional writes.
> >> 
> >>> [root@bmimysqlt4 ~]# cat /proc/drbd
> >>> version: 8.4.2 (api:1/proto:86-101)
> >>> GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@bmimysqlt3.chmcres.cchmc.org, 2012-10-02 00:02:32
> >>> 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
> >>>   ns:0 nr:184 dw:184 dr:0 al:0 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> >> 
> >> Nothing transferred, nothing written, nothing changed.
> >> 
> >>> As you can see, there is absolutely nothing there to suggest that the
> >>> secondary even noticed the io-error.
> >>> 
> >>> I can't understand what is going on.
> >> 
> >> Do you realize that you need to do IO to get (and then be able to notice) IO errors?
> >> 
> >> Cheers,
> >> 
> >> 	Lars
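
The point about needing I/O can be checked against the counters shown
in the /proc/drbd output: on the Secondary, nr (network receive) and dw
(disk write) only move when the Primary writes. A small sketch for
reading one counter, where the function name drbd_counter is
hypothetical and only the first matching field is reported (so it fits
a single-resource setup like the one above):

```shell
# drbd_counter FIELD: print the value of FIELD (for example nr, dw or oos)
# from /proc/drbd-style text on stdin. Only the first match is printed.
drbd_counter() {
    tr ' ' '\n' | awk -F: -v f="$1" '!seen && $1 == f { print $2; seen = 1 }'
}

# Possible use on the Secondary (commented out, needs a running DRBD):
#   before=$(drbd_counter dw < /proc/drbd)
#   ... write to the file system on the Primary and sync it ...
#   after=$(drbd_counter dw < /proc/drbd)
#   [ "$before" != "$after" ] && echo "Secondary did disk writes"
```

If dw does not change between the two reads, the Secondary was never
asked to write, and so had no chance to see an I/O error.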
> > 
> > Wow, feeling like an idiot now. Sorry for the false alarm. I just
> > got confused because the primary node got fenced right away without
> > any sort of manual write operation from me, but the secondary did
> > not exhibit that same behavior.
> > 
> > Thanks,
> > Prakash
> 
> However, I have hit the snag again, in a different scenario.

You are of course sure you always hit "that same snag".
Because always the behavior is "the same": it "does not work".

Maybe you can be more specific, and also gather some logs?
If necessary, from a serial console / logging terminal server,
though preferably from syslog including proper time stamps.

We need logs to determine what actually happened.

From both peers.
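
A minimal sketch of collecting such logs, to be run on each peer. The
function name drbd_log_lines, the log path and the match pattern are
all assumptions; syslog locations differ between distributions, and the
pattern may need widening for your kernel's messages.

```shell
# drbd_log_lines FILE...: print syslog lines that mention DRBD or a
# block-layer I/O error. The pattern is an assumption, adjust as needed.
drbd_log_lines() {
    grep -iE 'drbd|I/O error' "$@"
}

# Possible use on each peer (commented out; the log path is an assumption):
#   drbd_log_lines /var/log/messages > "/tmp/drbd-$(hostname).log"
```

Keeping the original syslog time stamps in the output makes it possible
to line up the events from both nodes.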

> /etc/drbd.d/global_common.conf:
> 
> global {
> 		usage-count	yes;
> }
> 
> common {
> 	startup {
> 		degr-wfc-timeout	0;
> 	}
> 
> 	net {
> 		cram-hmac-alg	sha1;
> 		shared-secret	xxxxxx;
> 	}
> 
> 	disk {
> 		on-io-error	call-local-io-error;
> 	}
> 
> }
> 
> /etc/drbd.d/mysql1.res:
> 
> resource mysql1 {
> 
> 	handlers {
> 		fence-peer	/usr/lib/drbd/crm-fence-peer.sh;
> 		local-io-error	"/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
> 		split-brain	"/usr/lib/drbd/notify-split-brain.sh root";
> 		after-resync-target	/usr/lib/drbd/crm-unfence-peer.sh;
> 	}
> 
> 	net {
> 		after-sb-0pri	discard-zero-changes;
> 		after-sb-1pri	discard-secondary;

Of course you are automating data loss here.  Which is ok, as long
as you are aware of it and your usage scenario requires it.

> 	}
> 
> 	disk {
> 		fencing	resource-and-stonith;
> 	}
> 
> 	on node1 {
> 		volume 0 {
> 			device		/dev/drbd1;
> 			disk		/dev/mapper/mysql_data1;
> 			flexible-meta-disk	internal;
> 		}
> 		address		x.x.x.x:7788;
> 	}
> 	on node2 {
> 		volume 0 {
> 			device		/dev/drbd1;
> 			disk		/dev/mapper/mysql_data1;
> 			flexible-meta-disk	internal;
> 		}
> 		address		x.x.x.x:7788;
> 	}
> }
> 
> 
> /etc/drbd.d/mysql2.res:
> 
> resource mysql2 {
> 
> 	handlers {
> 		fence-peer	/usr/lib/drbd/crm-fence-peer.sh;
> 		local-io-error	"/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
> 		split-brain	"/usr/lib/drbd/notify-split-brain.sh root";
> 		after-resync-target	/usr/lib/drbd/crm-unfence-peer.sh;
> 	}
> 
> 	net {
> 		after-sb-0pri	discard-zero-changes;
> 		after-sb-1pri	discard-secondary;
> 	}
> 
> 	disk {
> 		fencing	resource-and-stonith;
> 	}
> 
> 	on node1 {
> 		volume 0 {
> 			device		/dev/drbd2;
> 			disk		/dev/mapper/mysql_data2;
> 			flexible-meta-disk	internal;
> 		}
> 		address		x.x.x.x:7789;
> 	}
> 	on node2 {
> 		volume 0 {
> 			device		/dev/drbd2;
> 			disk		/dev/mapper/mysql_data2;
> 			flexible-meta-disk	internal;
> 		}
> 		address		x.x.x.x:7789;
> 	}
> }
> 
> So there are 2 resources (mysql1, mysql2)
> 
> mysql1 is primary on node1, secondary on node2. There is an ext4 file system on this DRBD volume (/fs1).
> mysql2 is primary on node2, secondary on node1. There is an ext4 file system on this DRBD volume (/fs2).
> 
> If I disable the FC ports on node1, it reboots instantaneously. No
> need to create any IO for this to happen. mysql1 gets promoted to
> primary on node2 and all is fine.
> 
> But, if I disable the FC ports on node2, nothing happens. Even if I do
> "mkdir /fs2/newdir", the command just hangs. I would expect that
> command to create the necessary IO error and reboot the node. Any
> thoughts?

You should consider a support contract and DRBD training.
We will help you get it working the way you need,
which will save you some time and trouble.

Cheers,

	Lars

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed