[DRBD-user] Using LVM with DRBD/RHEL 5

Lars Ellenberg lars.ellenberg at linbit.com
Fri May 27 12:10:40 CEST 2011



On Fri, May 27, 2011 at 06:53:16PM +0900, Junko IKEDA wrote:
> Hi,
> 
> > If with direct io, you do not get larger requests than 4k in the
> > "virtual" layers, your (in kernel) device mapper and/or DRBD are
> > too old.
> >
> > If they don't even get merged into larger requests in the "real"
> > device queue, then there is something wrong there as well.
> 
> ok, RHEL 5.2 is pretty old,
> so we have to try this with RHEL 5.6 + DRBD 8.4.0, right?

No, 8.3.10 should allow for 128k bios, as I have demonstrated.  If it
does not, then you need to upgrade your kernel (device mapper), or
figure out what else in the stack prevents the requests from being merged.
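One way to see what request sizes the block layer on a node will allow is to read the queue limits from sysfs. A read-only sketch (device names vary per system; values are in KiB):

```shell
#!/bin/sh
# Read-only diagnostic: print the request-size limits the block layer
# advertises for each block device.
# max_sectors_kb is the current per-request limit; max_hw_sectors_kb
# is what the hardware could do.
for q in /sys/block/*/queue; do
  [ -e "$q/max_sectors_kb" ] || continue
  printf '%s: max_sectors_kb=%s max_hw_sectors_kb=%s\n' \
    "${q%/queue}" \
    "$(cat "$q/max_sectors_kb")" \
    "$(cat "$q/max_hw_sectors_kb")"
done
```

To see whether requests actually get merged under load, watch avgrq-sz in `iostat -x 1` on the backing device while writing with direct I/O; avgrq-sz is in 512-byte sectors, so 128k requests show up as values around 256.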

> Will DRBD 8.4.0 be released in June?

I certainly hope so.

As a .0 release it may still have some issues, though,
so as always, even though we obviously do our own regression testing,
I'd recommend careful testing in your own lab.

> >> there is no I/O error but Pacemaker detects it as DRBD's error(monitor
> >> Timed out).
> >
> > What exactly is timing out,
> > and what is the time out?
> 
> This problem arose at the customer's site,
> and we are now asking them for the logs.
> If we can get them, we'll post them here.
> 
> From what I've gathered,
> they ran "mke2fs" while DRBD and Pacemaker were running:
> # mke2fs -F -j /dev/vg3/lv0
> 
> After that, crm_mon showed that the LVM RA had failed.
> see attached.


The LVM monitor action timed out.
That's quite different from a "DRBD error" ;-)

Yes, we have reports that the LVM monitor may sometimes time out
when there is other I/O load on the system.
This seems to be an issue with I/O scheduling,
and not directly related to DRBD, but we are still investigating.

As it does not gain you much information, at this point we recommend
disabling LVM RA monitoring, or allowing very generous timeouts
(minutes rather than seconds).

You monitor the Filesystem RA and/or other services using the LV anyway,
so when those are OK, the LV is OK as well, right?
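For example, raising the monitor timeout could look like this in the crm shell (a sketch only; the resource name and volume group are guesses based on the failure list below, and the exact values should match your environment):

```
primitive res_lvm_vg0 ocf:heartbeat:LVM \
        params volgrpname="vg0" \
        op start timeout="120s" \
        op stop timeout="120s" \
        op monitor interval="120s" timeout="300s"
```

Or simply delete the monitor op from the LVM primitive to disable monitoring altogether.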

> Failed actions:
>     res_portunblock_0_stop_0 (node=node02.tyo.**********.co.jp, call=32, rc=1, status=complete): unknown error
>     res_lvm_vg0_monitor_120000 (node=node01.tyo.**********.co.jp, call=70, rc=-2, status=Timed Out): unknown exec error
>     res_lvm_vg2_monitor_0 (node=node01.tyo.**********.co.jp, call=218, rc=-2, status=Timed Out): unknown exec error
>     res_lvm_vg2_stop_0 (node=node01.tyo.**********.co.jp, call=233, rc=-2, status=Timed Out): unknown exec error
>     res_lvm_vg1_monitor_0 (node=node01.tyo.**********.co.jp, call=250, rc=-2, status=Timed Out): unknown exec error


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


