Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Fri, Feb 22, 2008 at 01:20:45PM +0100, Anders Henke wrote:
> Hi,
>
> I'm using Kernel 2.6.24.2 with DRBD 8.2.5
> (9faf052fdae5ef0c61b4d03890e2d2eab550610c) on top of an LVM2 device (LV):
>
> 	device    /dev/drbd0;
> 	disk      /dev/vg/drbd;
> 	meta-disk internal;
>
> ... which leads to "flooding" the kernel logs on the secondary:
>
> kernel: [  167.434201] drbd0: local disk flush failed with status -5
> kernel: [  167.964981] drbd0: local disk flush failed with status -5
> kernel: [  168.250102] drbd0: local disk flush failed with status -5
> kernel: [  168.345999] drbd0: local disk flush failed with status -5
> kernel: [  168.522441] drbd0: local disk flush failed with status -5
> kernel: [  168.666767] drbd0: local disk flush failed with status -5
> kernel: [  168.731338] drbd0: local disk flush failed with status -5
>
> After moving the lower device from /dev/vg/drbd to /dev/sda5,
> the message completely disappeared.
>
> DRBD tries to flush the metadata with write barriers enabled, but an
> LVM LV doesn't support write barriers - which produces this message.
> However, DRBD does correctly check for EOPNOTSUPP to detect this
> situation:
>
> drbd-8.2.5/drbd/drbd_receiver.c:
> [...]
> 	/* BarrierAck may imply that the corresponding extent is dropped from
> 	 * the activity log, which means it would not be resynced in case the
> 	 * Primary crashes now.
> 	 * Just waiting for write_completion is not enough,
> 	 * better flush to make sure it is all on stable storage. */
> 	if (!test_bit(LL_DEV_NO_FLUSH, &mdev->flags) && inc_local(mdev)) {
> 		rv = blkdev_issue_flush(mdev->bc->backing_bdev, NULL);
> 		dec_local(mdev);
> 		if (rv == -EOPNOTSUPP) /* don't try again */
> 			set_bit(LL_DEV_NO_FLUSH, &mdev->flags);    [mark]
> 		if (rv)
> 			ERR("local disk flush failed with status %d\n", rv);
> 	}
> [...]
>
> XFS users are accustomed to devices without barrier support through the
> message "Disabling barriers, not supported by the underlying device"
> when trying to mount such a device:
>
> [ 2724.092649] Filesystem "drbd0": Disabling barriers, not supported by the underlying device
>
> However, for XFS, this message usually appears only once (during
> mount), and if you don't care about write barriers, you can also
> disable write barrier support altogether by passing the mount option
> "nobarrier" to XFS (which is also recommended for e.g. RAID devices
> with battery-backed write caches).

Right. See [mark] above: if the underlying device does not support it,
it is expected to return EOPNOTSUPP, in which case we remember that it
does not support this and do not try again (if I understand the kernel
block API correctly, that is).

Unfortunately it returns EIO, so we DO try again.

> Right now, device mapper devices (like dmraid or lvm2) and
> multipath-enabled devices don't support write barriers. MD devices
> support write barriers only when RAID1 (mirroring) is used and all
> underlying devices have write barrier support as well. Some other
> drivers, like ide-disk, don't seem to support write barriers either.

This is not a barrier, though: it is a flush operation, which is not
exactly the same thing.
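To make the distinction concrete, here is a minimal sketch against the
2.6.24-era block API of the two different operations involved; the
function names are invented for illustration, this is not actual DRBD
or XFS code:

	#include <linux/bio.h>
	#include <linux/blkdev.h>
	#include <linux/fs.h>

	/* 1) What DRBD does after a BarrierAck: ask the lower device to
	 *    flush its write cache.  As observed in this thread, dm
	 *    completes this with -EIO rather than -EOPNOTSUPP, so the
	 *    "don't try again" check at [mark] never fires. */
	static int sketch_flush_lower_dev(struct block_device *bdev)
	{
		return blkdev_issue_flush(bdev, NULL);
	}

	/* 2) What XFS effectively does: submit a write bio tagged as a
	 *    barrier.  dm fails such bios with -EOPNOTSUPP (see the
	 *    dm_request() quoted below), which XFS sees in its
	 *    completion path and uses to disable barriers for good. */
	static void sketch_submit_barrier_write(struct bio *bio)
	{
		submit_bio(WRITE_BARRIER, bio);
	}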
> According to dm_request() from linux-2.6.24.2/drivers/md/dm.c, barrier
> requests aren't being forwarded and should return an EOPNOTSUPP (which
> DRBD would catch and consequently set LL_DEV_NO_FLUSH):
>
> static int dm_request(struct request_queue *q, struct bio *bio)
> {
> 	int r = -EIO;
> 	int rw = bio_data_dir(bio);
> 	struct mapped_device *md = q->queuedata;
>
> 	/*
> 	 * There is no use in forwarding any barrier request since we can't
> 	 * guarantee it is (or can be) handled by the targets correctly.
> 	 */
> 	if (unlikely(bio_barrier(bio))) {
> 		bio_endio(bio, -EOPNOTSUPP);
> 		return 0;
> 	}
> [...]
>
> (Notice the -EIO default value: it's the 'status -5' from DRBD's
> message.)
>
> I'm still a little puzzled why XFS does see that the lvm device isn't
> capable of barriers (checking for QUEUE_ORDERED_NONE queueing), while
> DRBD (correctly checking for EOPNOTSUPP upon blkdev_issue_flush())
> doesn't detect this as well.

Because XFS is submitting a barrier request, whereas DRBD in this
specific code path does a blkdev_issue_flush().

> My suggestion is to add a check for barrier support to DRBD similar to
> the one the XFS guys use, and set LL_DEV_NO_FLUSH (and maybe also
> MD_NO_BARRIER) accordingly; or to check why DRBD doesn't catch an
> EOPNOTSUPP to disable the barrier flushes.

DRBD DOES catch the EOPNOTSUPP, both for blkdev_issue_flush and for
BIO_RW_BARRIER. The lvm implementation of blkdev_issue_flush in your
kernel apparently returns EIO, though, which is intentionally not
caught.

But, yes, we should probably rate-limit that message anyway, and/or
keep a failure count per drbd device and disable the flush after 10
failures regardless of failure type (rough sketch below).

-- 
: Lars Ellenberg                           http://www.linbit.com :
: DRBD/HA support and consulting            sales at linbit.com  :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.
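A rough sketch of that suggested mitigation, against the same
2.6.24-era API; the struct and function names here are invented for
illustration, this is not actual DRBD code:

	#include <linux/blkdev.h>
	#include <linux/kernel.h>

	#define FLUSH_FAIL_LIMIT 10	/* "disable it after 10 failures" */

	struct sketch_dev {
		struct block_device *backing_bdev;
		unsigned int flush_failures;	/* per-device failure count */
		int flush_disabled;
	};

	static void sketch_md_flush(struct sketch_dev *dev)
	{
		int rv;

		if (dev->flush_disabled)
			return;

		rv = blkdev_issue_flush(dev->backing_bdev, NULL);
		if (rv == 0) {
			dev->flush_failures = 0;
			return;
		}

		/* -EOPNOTSUPP disables immediately; any other error (such
		 * as the -EIO seen from dm in this thread) counts toward
		 * the limit. */
		if (rv == -EOPNOTSUPP ||
		    ++dev->flush_failures >= FLUSH_FAIL_LIMIT)
			dev->flush_disabled = 1;

		if (printk_ratelimit())
			printk(KERN_ERR "drbd: local disk flush failed "
			       "with status %d\n", rv);
	}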