[DRBD-user] Secondary node io-error

Mon Oct 8 15:19:01 CEST 2012

On Oct 8, 2012, at 4:55 AM, Lars Ellenberg wrote:

> On Sat, Oct 06, 2012 at 01:08:43PM +0000, Velayutham, Prakash wrote:
>> Hi,
>> 
>> I recently got a DRBD (8.4.2-2) cluster up (still testing). It seems to work nicely with Pacemaker CRM in several scenarios I have tested. Here is my config.
>> 
>> global {
>>                usage-count     yes;
>> }
>> 
>> common {
>>        handlers {
>>                outdate-peer    /usr/lib/drbd/crm-fence-peer.sh;
>>                fence-peer      /usr/lib/drbd/crm-fence-peer.sh;
>>                after-resync-target     /usr/lib/drbd/crm-unfence-peer.sh;
>>                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
>>                split-brain "/usr/lib/drbd/notify-split-brain.sh root";
>>        }
>> 
>>        startup {
>>                degr-wfc-timeout        0;
>>        }
>> 
>>        net {
>>                shared-secret   1QP69G4kWDslx2TMiaEStI6bwaGH5y8d;
>>                after-sb-0pri discard-zero-changes;
>>                after-sb-1pri discard-secondary;
>>                after-sb-2pri disconnect;
>>        }
>> 
>>        disk {
>>                on-io-error     call-local-io-error;
>>                fencing resource-and-stonith;
>>        }
>> 
>> }
>> 
>> The io-error handler only gets called when the primary node has a disk
>> issue. I have not seen the secondary node call the "local-io-error"
>> handler when it had disk access issues. Is this by design?
> 
> No.
> 
> "Works for me", though.
> 
> Can you please double check?
> And if in fact you can reproduce, tell us how, including logs?
> 
> 
> Thanks,
> 
> -- 
> : Lars Ellenberg

Hi Lars,

If I disable all the FC ports on the fibre switch for just the primary node, the node fences, reboots and comes back up, as I would expect. With the exact same config, if I disable the FC ports for just the secondary node, the node just sits there and still shows up as Secondary in /proc/drbd. That seems odd: it behaves as if on-io-error were configured to go "diskless", but it is set to "call-local-io-error".
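For anyone trying to reproduce this, the role (ro:) field alone does not say whether DRBD has noticed the disk failure; the disk state (ds:) field on the same /proc/drbd line is probably more telling. Below is a rough sketch of a helper that pulls out the local disk state for one minor. The function name local_disk_state is made up for illustration, and it assumes the usual 8.4-style line layout ("0: cs:... ro:... ds:Local/Peer ..."); it is not taken from the setup described here.

```shell
#!/bin/sh
# Hypothetical helper: print the local disk state (left-hand side of the
# ds: field) for a given DRBD minor. Reads /proc/drbd-formatted text on stdin.
# Assumes 8.4-style lines such as:
#    0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
local_disk_state() {
    minor="$1"
    awk -v m="$minor:" '
        $1 == m {
            # Scan the fields of the matching minor line for the ds: entry.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ds:/) {
                    sub(/^ds:/, "", $i)
                    split($i, parts, "/")   # parts[1] = local, parts[2] = peer
                    print parts[1]
                    exit
                }
            }
        }
    '
}

# Usage on a live node (not run here):
#   local_disk_state 0 < /proc/drbd
```

If the secondary still reports UpToDate here after its FC ports are disabled, that would suggest DRBD has not yet seen a failed I/O on the backing device, which may be worth ruling out before looking at the handler itself.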

Here is the full config.

/etc/drbd.conf

## generated by drbd-gui

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

/etc/drbd.d/global_common.conf:

## generated by drbd-gui

global {
	usage-count	yes;
}

common {
	handlers {
		fence-peer	/usr/lib/drbd/crm-fence-peer.sh;
		after-resync-target	/usr/lib/drbd/crm-unfence-peer.sh;
		local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
		split-brain "/usr/lib/drbd/notify-split-brain.sh root";
	}

	startup {
		degr-wfc-timeout	0;
	}

	net {
		shared-secret	1QP69G4kWDslx2TMiaEStI6bwaGH5y8d;
		after-sb-0pri discard-zero-changes;
		after-sb-1pri discard-secondary;
		after-sb-2pri disconnect;
	}

	disk {
		on-io-error	call-local-io-error;
		fencing	resource-and-stonith;
	}

}

/etc/drbd.d/mysql1.res:

resource mysql1 {
	net {
		cram-hmac-alg	sha1;
	}

	on bmimysqlt3.x.x.x {
		volume 0 {
			device		/dev/drbd0;
			disk		/dev/mapper/mysql_data1;
			flexible-meta-disk	internal;
		}
		address		x.x.x.x:7788;
	}
	on bmimysqlt4.x.x.x {
		volume 0 {
			device		/dev/drbd0;
			disk		/dev/mapper/mysql_data1;
			flexible-meta-disk	internal;
		}
		address		x.x.x.x:7788;
	}
}

Which logs would you like me to share?

Thanks,
Prakash