Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
re. previous messages on this topic:

It's absolutely amazing what mounting volumes with "noatime" set will do
to reduce i/o wait times! Took a while to figure this out, though.
(A minimal fstab sketch is in the P.S. at the bottom of this message.)

Miles

>> I wrote:
>>> I've been doing some experimenting to see how far I can push some old
>>> hardware into a virtualized environment - partially to see how much use
>>> I can get out of the hardware, and partially to learn more about the
>>> behavior of, and interactions between, software RAID, LVM, DRBD, and Xen.
>>>
>>> What I'm finding is that it's really easy to get into a state where one
>>> of my VMs is spending all of its time in i/o wait (95%+). Other times,
>>> everything behaves fine.
>>>
>> Bart Coninckx replied:
>>> Test the low-level storage with bonnie++ by bringing DRBD down first and
>>> having it run on the RAID6. If it hits below 110 MB/sec, that is your
>>> bottleneck. If it is above, you might want to replace the sync NICs with
>>> a bond. This will give you about 180 MB/sec in mode 0. Then test with
>>> bonnie++ on top of the active DRBD resource.
>>>
>> and Michael Iverson wrote:
>>> Your read performance is going to be limited by your RAID selection.
>>> Be prepared to experiment and document the performance of various
>>> different nodes.
>>>
>>> With a 1G interconnect, write performance will be dictated by network
>>> speed. You'll want jumbo frames at a minimum, and might have to mess
>>> with buffer sizes. Keep in mind that latency is just as important as
>>> throughput.
>> <snip>
>>> However, I think you'll need to install a benchmark like iozone, and
>>> spend a lot of time doing before/after comparisons.
>>
>> And to summarize the configuration again:
>>> - two machines, 4 disk drives each, two 1G ethernet ports (1 each to
>>>   the outside world, 1 each as a cross-connect)
>>> - each machine runs Xen 3 on top of Debian Lenny (the basic install)
>>> - very basic Dom0s - just running the hypervisor and i/o (including
>>>   disk management)
>>> ---- software RAID6 (md)
>>> ---- LVM
>>> ---- DRBD
>>> ---- heartbeat to provide some failure migration
>>> - each Xen VM uses 2 DRBD volumes - one for root, one for swap
>>> - one of the VMs has a third volume, used for backup copies of files
>>>
>> First off, thanks for the suggestions guys!
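
[Aside, for anyone trying to reproduce this from the archives: Bart's
baseline test and Michael's jumbo-frame suggestion quoted above boil down
to something like the commands below. This is only a sketch - the DRBD
resource name (r0), the mount points, and the cross-connect interface
(eth1) are placeholder assumptions, not details taken from these posts.]

    # take DRBD out of the picture so bonnie++ hits the raw RAID6/LVM stack
    drbdadm down r0

    # run bonnie++ against a filesystem on the underlying array;
    # -s should be at least twice the machine's RAM to defeat caching
    bonnie++ -d /mnt/raidtest -s 4096 -u nobody

    # enable jumbo frames on the DRBD cross-connect (on both machines)
    ip link set dev eth1 mtu 9000

    # then bring the resource back and repeat the test on top of DRBD
    drbdadm up r0
    bonnie++ -d /mnt/drbdtest -s 4096 -u nobody
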
>>
>> What I've tried so far, which leaves me just a bit confused:
>>
>> TEST 1
>> - machine 1: running a mail server, in a DomU, on DRBD root and swap
>>   volumes, on LVs, on raid6 (md)
>> --- baseline operation, disk wait seems to vary from 0% to about 25%
>>     while running mail
>> --- note: when this was a non-virtualized machine, running on a RAID-1
>>     volume, I never saw disk waits
>> - machine 2: just running a Dom0, DRBD is mirroring volumes from machine 1
>> --- Dom0's root and swap are directly on raid6 md volumes
>> --- installed bonnie++ into Dom0, ran it
>> --- different tests showed a range of speeds from around 50MB/sec to
>>     80MB/sec (not blindingly fast)
>>
>> TEST 2
>> - same as above, but TURNED OFF DRBD on machine 2
>> -- some improvement, but not a lot - one test went from 80MB/sec to
>>    90MB/sec
>>
>> TEST 3
>> - turned DRBD back on on machine 2
>> - added a domU to machine 2
>> - ran bonnie++ inside the domU
>> -- reported test speeds dropped to 23MB/sec to 54MB/sec, depending on
>>    the test
>> -- I saw up to 30MB/sec of traffic on the cross-connect ethernet
>>    (vnstat) - nothing approaching the 1G theoretical limit
>>
>> TEST 4
>> - started a 2nd domU on machine 2
>> - re-ran the test (inside the other domU)
>> - reported speeds dropped marginally (20M - 50M)
>>
>> TEST 5
>> - moved to machine 1 (the one running the mail server), left one domU
>>   running on the other machine
>> - while the mail server was running in its domU, ran bonnie++ in dom0
>> -- reported speeds from 31M to 44M
>> -- interestingly, saw nothing above 1MB/sec on the cross-connect, even
>>    though dom0 has priority
>>
>> TEST 6
>> - again, on the mail server machine
>> - started a 2nd domU, ran bonnie++ in the 2nd domU
>> --- reported speeds of 23M up to 72M; up to 30M/sec on the cross-connect
>> --- what was noticeable was that the mail server's i/o wait time (top)
>>     moved up from 5-25% to more like 25-50%
>>
>> TEST 7
>> - as above, but ran bonnie++ in the same domU as the mail server
>> - reported speeds dropped to 34M-60M depending on the test
>> - most noticeable: started seeing i/o wait time pushing up to 90%,
>>   highest during the "writing intelligently" and "reading intelligently"
>>   tests
>>
>> OTHER DATA POINTS
>> - when running basic mail and list service, the domU runs at about 25%
>>   i/o wait as reported by top
>> - when I start a tar job, i/o wait jumps up to the 70-90% range
>> - i/o wait seems to drop just slightly if the tar job is reading from
>>   one DRBD volume and writing to another (somewhat counterintuitive, as
>>   it would seem that there's more complexity involved)
>>
>> Overall, I'm really not sure what to make of this. It seems like:
>> - there's a 40-50% drop in disk throughput when I add LVM, DRBD, and a
>>   domU on top of raid6
>> - the network is never particularly loaded
>> - lots of disk i/o pushes a lot of cpu cycles into i/o wait - BUT...
>>   it's not clear what's going on during those wait cycles
>>
>> I'm starting to wonder if this is more a function of the hypervisor
>> and/or memory/caching issues than the underlying disk stack. Any
>> reactions, thoughts, diagnostic suggestions?
>>
>> Thanks again,
>>
>> Miles Fidelman
>>
>> --
>> In theory, there is no difference between theory and practice.
>> In<fnord> practice, there is. .... Yogi Berra
>>
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
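
P.S. The noatime change itself is just a mount option, so the sketch below
is all there is to it. The device name and filesystem type here are
generic examples (a typical Lenny domU root), not copied from an actual
fstab:

    # /etc/fstab - add noatime to the options of the busy filesystems
    /dev/xvda1   /   ext3   defaults,noatime   0   1

    # or apply it to a running system without a reboot:
    mount -o remount,noatime /

    # to see whether i/o wait actually drops, watch the wait column:
    vmstat 5          # "wa" column
    iostat -x 5       # from the sysstat package
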
--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra