Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I wrote:

> I've been doing some experimenting to see how far I can push some old
> hardware into a virtualized environment - partially to see how much use
> I can get out of the hardware, and partially to learn more about the
> behavior of, and interactions between, software RAID, LVM, DRBD, and Xen.
>
> What I'm finding is that it's really easy to get into a state where one
> of my VMs is spending all of its time in i/o wait (95%+). Other times,
> everything behaves fine.

Bart Coninckx replied:

> Test the low-level storage with bonnie++ by bringing DRBD down first and
> having it run on the RAID6. If it hits below 110 MB/sec, that is your
> bottleneck. If it's above, you might want to replace the sync NICs with a
> bond. This will give you about 180 MB/sec in mode 0. Then test with
> bonnie++ on top of the active DRBD resource.

and Michael Iverson wrote:

> Your read performance is going to be limited by your RAID selection.
> Be prepared to experiment and document the performance of various
> different nodes.
>
> With a 1G interconnect, write performance will be dictated by network
> speed. You'll want jumbo frames at a minimum, and might have to mess
> with buffer sizes. Keep in mind that latency is just as important as
> throughput.

<snip>

> However, I think you'll need to install a benchmark like iozone, and
> spend a lot of time doing before/after comparisons.

And to summarize the configuration again:

> - two machines, 4 disk drives each, two 1G ethernet ports (1 each to the
>   outside world, 1 each as a cross-connect)
> - each machine runs Xen 3 on top of Debian Lenny (the basic install)
> - very basic Dom0s - just running the hypervisor and i/o (including disk
>   management)
>   ---- software RAID6 (md)
>   ---- LVM
>   ---- DRBD
>   ---- heartbeat to provide some failure migration
> - each Xen VM uses 2 DRBD volumes - one for root, one for swap
> - one of the VMs has a third volume, used for backup copies of files

First off, thanks for the suggestions, guys!
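(In case it helps in interpreting the numbers below: by "testing a layer" I
mean putting a scratch filesystem at that point in the stack and pointing
bonnie++ at it - roughly like this. The resource and volume names here are
placeholders, not my real config, and the DRBD steps assume a scratch
resource you don't mind fully resyncing afterwards.)

    # test the RAID6 + LVM layer, with DRBD out of the picture
    drbdadm down scratch            # take the scratch DRBD resource offline
    mkfs.ext3 /dev/vg0/scratch      # throwaway filesystem on the backing LV
    mount /dev/vg0/scratch /mnt/test
    bonnie++ -d /mnt/test -u root
    umount /mnt/test

    # repeat on top of the active DRBD resource
    drbdadm up scratch
    drbdadm primary scratch
    mkfs.ext3 /dev/drbd1            # device minor number is a placeholder
    mount /dev/drbd1 /mnt/test
    bonnie++ -d /mnt/test -u root
    umount /mnt/test
    # since the backing LV was written to behind DRBD's back, force a full
    # resync of this side afterwards:  drbdadm invalidate scratch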
What I've tried so far, which leaves me just a bit confused:

TEST 1
- machine 1: running a mail server, in a DomU, on DRBD root and swap
  volumes, on LVs, on RAID6 (md)
  --- baseline operation: disk wait seems to vary from 0% to about 25%
      while running mail
  --- note: when this was a non-virtualized machine, running on a RAID-1
      volume, I never saw disk waits
- machine 2: just running a Dom0; DRBD is mirroring volumes from machine 1
  --- Dom0's root and swap are directly on RAID6 md volumes
  --- installed bonnie++ into Dom0 and ran it
  --- different tests showed a range of speeds from around 50 MB/sec to
      80 MB/sec (not blindingly fast)

TEST 2
- same as above, but TURNED OFF DRBD on machine 2
  -- some improvement, but not a lot - one test went from 80 MB/sec to
     90 MB/sec

TEST 3
- turned DRBD back on on machine 2
- added a domU to machine 2
- ran bonnie++ inside the domU
  -- reported test speeds dropped to 23 MB/sec to 54 MB/sec, depending on
     the test
  -- I saw up to 30 MB/sec of traffic on the cross-connect ethernet
     (vnstat) - nothing approaching the 1G theoretical limit

TEST 4
- started a 2nd domU on machine 2
- re-ran the test (inside the other domU)
- reported speeds dropped marginally (20 MB/sec - 50 MB/sec)

TEST 5
- moved to machine 1 (the one running the mail server), left one domU
  running on the other machine
- while the mail server was running in its domU, ran bonnie++ in dom0
  -- reported speeds from 31 MB/sec to 44 MB/sec
  -- interestingly, saw nothing above 1 MB/sec on the cross-connect, even
     though dom0 has priority

TEST 6
- again, on the mail server machine - started a 2nd domU, ran bonnie++ in
  the 2nd domU
  --- reported speeds of 23 MB/sec up to 72 MB/sec; up to 30 MB/sec on the
      cross-connect
  --- what was noticeable was that the mail server's i/o wait time (top)
      moved up from 5-25% to more like 25-50%

TEST 7
- as above, but ran bonnie++ in the same domU as the mail server
- reported speeds dropped to 34-60 MB/sec depending on the test
- most noticeable: started seeing i/o wait time pushing up to 90%, highest
  during the "writing intelligently" and "reading intelligently" tests

OTHER DATA POINTS
- when running basic mail and list service, the domU runs at about 25% i/o
  wait as reported by top
- when I start a tar job, i/o wait jumps up to the 70-90% range
- i/o wait seems to drop just slightly if the tar job is reading from one
  DRBD volume and writing to another (somewhat counterintuitive, as it
  would seem that there's more complexity involved)

Overall, I'm really not sure what to make of this. It seems like:
- there's a 40-50% drop in disk throughput when I add LVM, DRBD, and a
  domU on top of RAID6
- the network is never particularly loaded
- lots of disk i/o pushes a lot of cpu cycles into i/o wait
- BUT... it's not clear what's going on during those wait cycles

I'm starting to wonder if this is more a function of the hypervisor and/or
memory/caching issues than the underlying disk stack.

Any reactions, thoughts, diagnostic suggestions?

Thanks again,

Miles Fidelman

-- 
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra
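P.S. If specific numbers would help, something along these lines is easy
enough to collect in dom0 on the next pass (so far it's mostly been top
and vnstat; the intervals and flags below are just a first guess):

    iostat -x 5        # per-device utilization and wait times (sysstat)
    vmstat 5           # run/blocked queues, swap activity, wa column
    xentop -b -d 5     # per-domain CPU and VBD read/write counts, batch mode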