Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi,

How did you turn off the cache? And how much did that influence
performance?

I'll try it, but I'd rather find a solution that doesn't involve
turning off the cache.

Wiebe

----- Original Message -----
> From: "Marcel Kraan" <marcel at kraan.net>
> To: "Wiebe Cazemier" <wiebe at halfgaar.net>
> Cc: drbd-user at lists.linbit.com
> Sent: Thursday, 31 May, 2012 7:58:10 AM
> Subject: Re: [DRBD-user] Xen DomU on DRBD device: barrier errors
>
> Hello Wiebe,
>
> I had that too. But when I turned off the cache on the storage and on
> the other VM clients, the error was gone. So: no cache on the VMs,
> and it worked very well.
>
> marcel
>
> On 30 May 2012, at 17:13, Wiebe Cazemier wrote:
>
> > Hi,
> >
> > I'm testing setting up a Xen DomU with DRBD storage for easy
> > failover. Most of the time, immediately after booting the DomU, I
> > get an I/O error:
> >
> > [ 3.153370] EXT3-fs (xvda2): using internal journal
> > [ 3.277115] ip_tables: (C) 2000-2006 Netfilter Core Team
> > [ 3.336014] nf_conntrack version 0.5.0 (3899 buckets, 15596 max)
> > [ 3.515604] init: failsafe main process (397) killed by TERM signal
> > [ 3.801589] blkfront: barrier: write xvda2 op failed
> > [ 3.801597] blkfront: xvda2: barrier or flush: disabled
> > [ 3.801611] end_request: I/O error, dev xvda2, sector 52171168
> > [ 3.801630] end_request: I/O error, dev xvda2, sector 52171168
> > [ 3.801642] Buffer I/O error on device xvda2, logical block 6521396
> > [ 3.801652] lost page write due to I/O error on xvda2
> > [ 3.801755] Aborting journal on device xvda2.
> > [ 3.804415] EXT3-fs (xvda2): error: ext3_journal_start_sb: Detected aborted journal
> > [ 3.804434] EXT3-fs (xvda2): error: remounting filesystem read-only
> > [ 3.814754] journal commit I/O error
> > [ 6.973831] init: udev-fallback-graphics main process (538) terminated with status 1
> > [ 6.992267] init: plymouth-splash main process (546) terminated with status 1
> >
> > The drbdsetup manpage says that LVM (which I use) doesn't support
> > barriers (better known as "tagged command queuing" or "native
> > command queuing"), so I configured the DRBD device not to use
> > barriers. This can be seen in /proc/drbd ("wo:f", meaning flush,
> > the next method DRBD chooses after barrier):
> >
> > 3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
> >    ns:2160152 nr:520204 dw:2680344 dr:2678107 al:3549 bm:9183 lo:0
> >    pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> >
> > And on the other host:
> >
> > 3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
> >    ns:0 nr:2160152 dw:2160152 dr:0 al:0 bm:8052 lo:0 pe:0 ua:0
> >    ap:0 ep:1 wo:f oos:0
> >
> > I also enabled the disable_sendpage option, as per the DRBD docs:
> >
> > cat /sys/module/drbd/parameters/disable_sendpage
> > Y
> >
> > I also tried adding barriers=0 to fstab as a mount option. Still
> > it says:
> >
> > [ 58.603896] blkfront: barrier: write xvda2 op failed
> > [ 58.603903] blkfront: xvda2: barrier or flush: disabled
> >
> > I don't even know if ext3 has a nobarrier option, but it does seem
> > to work. However, because only one of my storage systems is
> > battery-backed, disabling barriers would not be smart.
> >
> > Why does it still complain about barriers when I've disabled them?
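[Editor's note: ext3 does accept a barrier mount option, but it is
spelled barrier=0/barrier=1 rather than barriers=0, which may be why
the fstab change above seemed to have no effect. A minimal sketch of
such an fstab entry, assuming the root filesystem on xvda2 from the
log above:

    # /etc/fstab -- barrier=0 disables ext3 write barriers (sketch)
    /dev/xvda2  /  ext3  defaults,barrier=0  0  1

As the poster notes, running without barriers is only advisable when
the storage has a battery-backed cache.]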
> > Both hosts are:
> >
> > Debian: 6.0.4
> > uname -a: Linux 2.6.32-5-xen-amd64
> > drbd: 8.3.7
> > Xen: 4.0.1
> >
> > Guest:
> >
> > Ubuntu 12.04 LTS
> > uname -a: Linux 3.2.0-24-generic pvops
> >
> > drbd resource:
> >
> > resource drbdvm
> > {
> >   meta-disk internal;
> >   device /dev/drbd3;
> >
> >   startup
> >   {
> >     # The timeout value when the last known state of the other
> >     # side was available. 0 means infinite.
> >     wfc-timeout 0;
> >
> >     # Timeout value when the last known state was disconnected.
> >     # 0 means infinite.
> >     degr-wfc-timeout 180;
> >   }
> >
> >   syncer
> >   {
> >     # This is recommended only for low-bandwidth lines, to send
> >     # only those blocks which really have changed.
> >     #csums-alg md5;
> >
> >     # Set to about half your net speed.
> >     rate 60M;
> >
> >     # It seems this option moved to the 'net' section in drbd 8.4
> >     # (a later release than Debian currently has).
> >     verify-alg md5;
> >   }
> >
> >   net
> >   {
> >     # The manpage says this is recommended only in pre-production
> >     # (because of its performance cost), to determine whether your
> >     # LAN card has a TCP checksum offloading bug.
> >     #data-integrity-alg md5;
> >   }
> >
> >   disk
> >   {
> >     # Detach makes the device work over-the-network-only after the
> >     # underlying disk fails. Detach is not the default for
> >     # historical reasons, but is recommended by the docs. However,
> >     # the Debian defaults in drbd.conf suggest the machine will
> >     # reboot in that event...
> >     on-io-error detach;
> >
> >     # LVM doesn't support barriers, so disable them; DRBD will
> >     # revert to flush. Check wo: in /proc/drbd. If you don't
> >     # disable them, you get I/O errors.
> >     no-disk-barrier;
> >   }
> >
> >   on host1
> >   {
> >     # universe is a VG
> >     disk /dev/universe/drbdvm-disk;
> >     address 10.0.0.1:7792;
> >   }
> >
> >   on host2
> >   {
> >     # universe is a VG
> >     disk /dev/universe/drbdvm-disk;
> >     address 10.0.0.2:7792;
> >   }
> > }
> >
> > In my test setup, the primary host's storage is a 9650SE SATA-II
> > PCIe RAID controller with battery backup; the secondary's is
> > software RAID1.
> >
> > Isn't DRBD+Xen widely used? With these problems, it's not going to
> > work.
> >
> > Any help welcome.
> >
> > _______________________________________________
> > drbd-user mailing list
> > drbd-user at lists.linbit.com
> > http://lists.linbit.com/mailman/listinfo/drbd-user
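[Editor's note: for reference, the barrier-related knobs in an
8.3-era drbd.conf disk section are no-disk-barrier, no-disk-flushes
and no-md-flushes; each disables one write-ordering method, and DRBD
falls back along barrier -> flush -> drain -> none. A sketch of a
disk section that, unlike the config above, also disables flushes,
which is only reasonable where the controller cache is
battery-backed:

    disk
    {
        on-io-error detach;
        no-disk-barrier;   # fall back from barrier to flush (wo:f)
        no-disk-flushes;   # fall back from flush to drain (wo:d);
                           # assumes a battery-backed write cache
    }

These options only govern how DRBD writes to its own backing device;
they do not change what the guest's blkfront negotiates with blkback
one layer up, which may be why the DomU still logs barrier errors.]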