[DRBD-user] [LVM2 + DRBD + Xen + DRBD 8.0] error on dom0 (the physical server) and on domU (the virtual machine)

Lars Ellenberg lars.ellenberg at linbit.com
Thu Aug 16 15:10:46 CEST 2007

On Thu, Aug 16, 2007 at 10:35:25AM +0200, Maxim Doucet wrote:
> Sorry for the duplicate, a problem with my email client.
> Here, is a forwarded email from the xen-users mailing list where someone
> encountered the same problem.
> A workaround is given, and further testing is done so I can only
> recommend to read it.
> The forwarded mail
> (http://lists.xensource.com/archives/html/xen-users/2007-08/msg00375.html) 
> :
> > On Tue, 14 Aug 2007, Maxim Doucet wrote:
> >
> >> I experience the following error messages when launching the virtual
> >> machine :
> >> *On dom0 : the physical server* (messages coming from dmesg) :
> >> drbd0: bio would need to, but cannot, be split:
> >> (vcnt=2,idx=0,size=2048,sector=126353855)
> >> drbd0: bio would need to, but cannot, be split:
> >> (vcnt=2,idx=0,size=2048,sector=126353855)
> >
> > We are using a nearly identical configuration and experienced the same
> > problem just today:
> >
> > LVM2 on DRBD under Xen 3.0.3 w/ DRBD 8.0.4 Using CentOS5 on x86_64
> > dom0 kernel 2.6.18-8.1.8-el5xen
> >
> > The virtual machine is an FC6 x86_64 PV guest and gave similar guest
> > errors.
> >
> > The workaround we are using is to change
> >
> > disk = [ 'phy:/dev/vg-drbd/vm0,xvda,w' ]
> >    to
> > disk = [ 'tap:aio:/dev/vg-drbd/vm0,xvda,w' ]
> >
> > This treats the underlying backing image as a file.  This may have
> > some performance loss since it is not using direct device IO, but as
> > far as I can tell it is stable.  Or at least, phy: fails miserably,
> > where tap:aio: works fine!
> >
> > This seems to indicate that it's not an LVM+DRBD or Xen+LVM problem,
> > but rather a Xen+LVM+DRBD using phy: problem.  I tested to see if Xen
> > liked running LVM on a loopback device and loading a VM off it using
> > phy: (see below).  It worked fine, which makes me think this is more
> > of a drbd issue than a Xen or LVM issue.

the "problem", as I see it, is that the xen virtual block device layer
makes wrong assumptions: it creates its own bios, maybe even
respecting "max-segment-size", but apparently completely ignoring the
"bdev_merge_bvec_fn". If you want to add a page to a bio, you have to
use bio_add_page. you must not just assume that, because the device has
a max_segment_size of 32k, it will accept a bio containing a bvec
of 4 pages at every offset. this is not true. it may have offsets where
it can only accept a single page (and then even has to split that page
internally into two bios).
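
To make the offset restriction concrete, here is a minimal user-space
sketch in C. It is not kernel code and is not taken from DRBD or Xen.
It assumes the boundary is 32k, that is 64 sectors of 512 bytes,
matching the max_segment_size mentioned above; bytes_until_boundary is
a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/*
 * Hypothetical model, not a kernel API: how many bytes a bio starting
 * at `sector` can hold before it would cross the next boundary, where
 * boundaries lie every `boundary_sectors` sectors.
 */
static unsigned bytes_until_boundary(uint64_t sector, unsigned boundary_sectors)
{
    assert(boundary_sectors > 0);
    /* position of the start sector inside its boundary window */
    unsigned offset = (unsigned)(sector % boundary_sectors);
    return (boundary_sectors - offset) * SECTOR_SIZE;
}
```

With the values from the dmesg line quoted above,
bytes_until_boundary(126353855, 64) gives 512, because that sector sits
at offset 63 within its 64-sector window. The bio in that message is
2048 bytes, so under this assumption it would cross the boundary.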

we have seen this also on md raid5, or md raid0, *no DRBD involved*.

most drivers/devices do not have any special offset limitation,
but raid5 or raid0 have their chunk size (raid1 does not, also linear
does not, apart from the device borders).

interestingly, when you have a devicemapper on top of some other device
with a merge_bvec_fn, devicemapper will announce a max segment size of
4K only, which should mask this away. you could try to verify this by
using a dm linear mapping on top of drbd.

I did not read xen code, but I assume that it basically does
   b = bio_alloc(..., 4);
   b->bi_io_vec[0].page = page0; offset...; len...;
   b->bi_io_vec[1].page = page1; ...
   b->bi_io_vec[2].page = page2; ...
   b->bi_io_vec[3].page = page3; ...

where it should do
  b = bio_alloc(...);
  if (!b) whatever;
  initialize bio with target block device etc.
  until all pages to be submitted are submitted:
      if (bio_add_page(b,page,len,offset) != len) {
          /* the device refused this page at this offset */
          submit current bio;
          b = bio_alloc(...);
          if (!b) whatever;
          initialize bio with target block device etc.
          retry bio_add_page for the same page;
      }
  submit the last bio;
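
The loop above can be tried out in user space with a toy model. This is
only a sketch under stated assumptions: toy_bio, toy_bio_add and
toy_submit_pages are hypothetical names, not kernel or Xen functions,
and the acceptance rule (a boundary every boundary_sectors sectors, an
empty bio always accepts one segment) is a simplification of what a
merge_bvec_fn may decide.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Toy stand-in for struct bio: only start sector and size are modelled. */
struct toy_bio {
    uint64_t sector; /* first sector of the bio */
    unsigned size;   /* bytes added so far */
};

/*
 * Toy stand-in for bio_add_page(): returns len if the segment was
 * added, 0 if adding it would make the bio cross a boundary.
 * An empty bio always accepts one segment (assumption of this model).
 */
static unsigned toy_bio_add(struct toy_bio *b, unsigned len,
                            unsigned boundary_sectors)
{
    uint64_t boundary_bytes = (uint64_t)boundary_sectors * SECTOR_SIZE;
    uint64_t start = (b->sector % boundary_sectors) * SECTOR_SIZE;

    if (b->size != 0 && start + b->size + len > boundary_bytes)
        return 0;
    b->size += len;
    return len;
}

/* Runs the add/submit/retry loop and returns how many bios were submitted. */
static unsigned toy_submit_pages(uint64_t sector, unsigned npages,
                                 unsigned len, unsigned boundary_sectors)
{
    struct toy_bio b = { sector, 0 };
    unsigned submitted = 0;
    unsigned i = 0;

    assert(boundary_sectors > 0 && len > 0 && len % SECTOR_SIZE == 0);
    while (i < npages) {
        if (toy_bio_add(&b, len, boundary_sectors) == len) {
            i++; /* page accepted, move on to the next one */
            continue;
        }
        /* refused: "submit" the current bio, start a fresh one right
         * behind it, and retry the same page on the next iteration */
        submitted++;
        b.sector += b.size / SECTOR_SIZE;
        b.size = 0;
    }
    if (b.size != 0)
        submitted++; /* submit the last, partially filled bio */
    return submitted;
}
```

In this model, four 4k pages starting at sector 0 with a 64-sector
boundary go out as one bio, while the same four pages starting at
sector 56 go out as two: the first page fills the window up to the
boundary, the second page is refused, and the remaining three pages
are sent in a new bio.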

but maybe I misunderstand something,
so please correct me if I'm wrong.

: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
please use the "List-Reply" function of your email client.
