Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Tue, Jun 01, 2010 at 05:47:34PM +0200, Florian Haas wrote:
> On 2010-06-01 15:17, Caspar Smit wrote:
> > I tested the following situation:
> >
> > - I pulled out BOTH sdb and sdc on node 1.
> > - Did a write action on the iscsitarget from a client machine.
> > - DRBD (on-io-error detach) detached the primary node 1 and became:
> >   Primary/Secondary - Diskless/UpToDate
> > - The performance degraded significantly after it became diskless.
> >
> > Is it possible for the Linbit OCF RA script to detect that a low-level
> > I/O error occurred on the DRBD backing storage (md0)?
> >
> > What I would like is that, in case of a low-level failure (software
> > RAID failure), a failover takes place and node 2 becomes the primary,
> > because right now that doesn't happen.
>
> It doesn't hurt to peruse the documentation. :)
>
> http://www.drbd.org/users-guide/s-handling-disk-errors.html
> http://www.drbd.org/users-guide/s-configure-io-error-behavior.html
>
> If you use a local-io-error handler that simply does
> "echo o > /proc/sysrq-trigger", then in case of an I/O error the node
> will remove itself from the cluster and Pacemaker will initiate failover.

That may be a bit brutal, but it should of course work.

The less brutal way would use the preference scores. The monitor action
of ocf:linbit:drbd adjusts its master score based on various things,
including the local disk state. If the local disk state is not UpToDate,
the master score will be <= 10; if the local disk is UpToDate and we are
connected, it will be 10000.

So unless you have other scores that carry more weight, I expect the PE
to decide that moving the resources over would be a good idea.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
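
[Editor's note: as a sketch only, the "brutal" handler approach described
above might look like the following drbd.conf fragment. The resource name
r0 and the md0 backing device are placeholders taken from this thread, not
a tested configuration; see the linked users-guide pages for the
authoritative syntax.]

    resource r0 {
      disk {
        # Detach from the backing device (here: software RAID md0)
        # on a lower-level I/O error, as Caspar already configured.
        on-io-error detach;
      }
      handlers {
        # On a local I/O error, power the node off via sysrq so that
        # Pacemaker promotes the peer. Brutal, but effective.
        local-io-error "echo o > /proc/sysrq-trigger";
      }
      disk { ... }
    }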