[DRBD-user] "local disk flush failed with status -5" on LVM

Lars Ellenberg lars.ellenberg at linbit.com
Tue May 13 16:11:11 CEST 2008


On Tue, May 13, 2008 at 10:58:29AM +0200, Iustin Pop wrote:
> On Tue, May 13, 2008 at 10:50:22AM +0200, Lars Ellenberg wrote:
> > On Sat, May 10, 2008 at 12:28:00PM +0200, Iustin Pop wrote:
> > > Philipp Reisner wrote:
> > > > On Sunday, 4 May 2008 02:19:12, Wolfgang Denk wrote:
> > > > > Hi,
> > > > >
> > > > > I'm trying to run DRBD on top of a LV, and get flooded with above
> > > > > error messages. I know this has been discussed before, see threads
> > > > > starting at
> > > > > http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html
> > > > > and
> > > > > http://lists.linbit.com/pipermail/drbd-user/2008-February/008519.html
> > > > >
> > > > > When this was discussed in February, it sounded (at least to me) as if
> > > > > a fix was on the way, see
> > > > > http://lists.linbit.com/pipermail/drbd-user/2008-February/008692.html
> > > > >
> > > > > However, even top of tree from the git repo still shows the same
> > > > > behaviour.
> > > > >
> > > > > Am I missing something, or is this usage mode so exotic that nobody
> > > > > cares?
> > > > >
> > > > 
> > > > Hi Wolfgang,
> > > > 
> > > > That is actually a kernel bug, I think in 2.6.24. It was fixed later; I do
> > > > not know by heart in which "sucker" release. I guess it is fixed in 2.6.25.
> > > > 
> > > > Starting with 8.0.12 we offer a workaround for this in DRBD (and 8.2.6 
> > > > when I finally find the time to finish it):
> > > > 
> > > >   Add no-disk-flushes and no-md-flushes to your disk config.
> > > 
> > > Because this happens not only with LVM, but with any I/O subsystem that
> > > returns wrong error codes from flushes (e.g. broken SCSI drivers or
> > > controllers, I think), would it be a sane thing to disable barriers
> > > automatically after a certain number of errors?
> > > 
> > > (Looking at the barrier flush code I see that only drbd_receiver.c
> > > has code for auto-disabling in case of EOPNOTSUPP, but drbd_actlog.c and
> > > drbd_bitmap.c don't; maybe these should have it too).
> > 
> > hm?
> > I think we do have a retry-and-disable-barriers in those places too.
> 
> I must be wrong then; I'm looking at the drbd 8.0 git tree, and I see in
> drbd_bitmap.c:
> 
>         if (rw == WRITE) {
>                 /* swap back endianness */
>                 bm_lel_to_cpu(b);
>                 /* flush bitmap to stable storage */
>                 if (!test_bit(MD_NO_BARRIER,&mdev->flags))
>                         blkdev_issue_flush(mdev->bc->md_bdev, NULL);
> 
> (around line 745). This just issues the flush, with no retry/disable in place
> (it uses the same blkdev_issue_flush as drbd_receiver.c, and there's no check
> of the return value).
> 
> What am I missing here? Wrong git tree?

grep for set_bit MD_NO_BARRIER

> > > The reason I propose this is because with many deployments on different
> > > machines it would be better to leave it always enabled at startup and
> > > allow it to auto-disable if it sees EOPNOTSUPP
> > 
> > that is the way we do it.
> > 
> > > or too many other errors.
> > 
> > and that is what we don't.
> 
> Would it make sense to do it if no blkdev_issue_flush is ever successful?
> 
> > > And people can't always track latest upstream kernel...
> > 
> > if they are stuck with a kernel where DRBD spits out too much
> > noise due to barrier requests throwing IO errors,
> > then they have to disable use of barriers in the drbd config.
> 
> Ok, let me explain some more. If you have deployments on the order of hundreds
> of machines, with various types of controllers, it would be easier to let the
> config always have barriers enabled and rely on auto-disable if *no single
> flush is ever successful*.

buy a support contract,
have a script parse log files and auto-adjust them,
or send a patch.
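
For illustration only, the two policies discussed in this thread can be
sketched in user-space C: disable flushing on the first EOPNOTSUPP, and
(the proposal above) also disable it once a configurable number of flushes
have failed without a single one ever succeeding. This is not DRBD code.
struct flush_policy, flush_policy_init, flush_with_policy, stub_flush and
max_failures are all hypothetical names invented for this sketch; the only
facts taken from the thread are the EOPNOTSUPP check and the "status -5"
(-EIO) failures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; NOT a DRBD structure. */
struct flush_policy {
    bool disabled;             /* plays the role of a "no barrier" flag */
    unsigned int successes;    /* flushes that returned 0 */
    unsigned int failures;     /* flushes that returned an error */
    unsigned int max_failures; /* give up after this many failures
                                  if no flush has ever succeeded */
};

/* A flush callback returns 0 on success or a negative errno value. */
typedef int (*flush_fn)(void *dev);

void flush_policy_init(struct flush_policy *p, unsigned int max_failures)
{
    assert(p != NULL);
    p->disabled = false;
    p->successes = 0;
    p->failures = 0;
    p->max_failures = max_failures;
}

/* Issue a flush unless flushing has been disabled.
 * Returns the callback's result, or 0 if flushing is disabled. */
int flush_with_policy(struct flush_policy *p, flush_fn do_flush, void *dev)
{
    assert(p != NULL && do_flush != NULL);

    if (p->disabled)
        return 0;

    int rv = do_flush(dev);
    if (rv == 0) {
        p->successes++;
        return 0;
    }

    p->failures++;
    if (rv == -EOPNOTSUPP) {
        /* The device says flushes are unsupported: stop trying. */
        p->disabled = true;
    } else if (p->successes == 0 && p->failures >= p->max_failures) {
        /* Proposed heuristic: no flush has ever worked on this device. */
        p->disabled = true;
    }
    return rv;
}

/* Stub for experimentation: the "device" is just an int holding the
 * status the fake flush should return. */
int stub_flush(void *dev)
{
    return *(int *)dev;
}
```

With max_failures set to 3 and a stub that always returns -EIO, the third
failed flush sets disabled and later calls return 0 without touching the
device. A device that has completed even one flush successfully is never
auto-disabled by the second rule, so a transient I/O error does not
silently turn flushing off.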

-- 
: Lars Ellenberg                           http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com :
: LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
__
please don't Cc me, but send to list -- I'm subscribed


