Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Saturday 05 June 2010 16:27:18 Miles Fidelman wrote:
> Hi Folks,
>
> I've been doing some experimenting to see how far I can push some old
> hardware into a virtualized environment - partially to see how much use
> I can get out of the hardware, and partially to learn more about the
> behavior of, and interactions between, software RAID, LVM, DRBD, and Xen.
>
> What I'm finding is that it's really easy to get into a state where one
> of my VMs is spending all of its time in i/o wait (95%+). Other times,
> everything behaves fine.
>
> So... I'm curious about where the bottlenecks are.
>
> What I'm running:
> - two machines, 4 disk drives each, two 1G ethernet ports (1 each to the
>   outside world, 1 each as a cross-connect)
> - each machine runs Xen 3 on top of Debian Lenny (the basic install)
> - very basic Dom0s - just running the hypervisor and i/o (including disk
>   management)
>   ---- software RAID6 (md)
>   ---- LVM
>   ---- DRBD
>   ---- heartbeat to provide some failure migration
> - each Xen VM uses 2 DRBD volumes - one for root, one for swap
> - one of the VMs has a third volume, used for backup copies of files
>
> What I'd like to dig into:
> - Dom0 plus one DomU running on each box
> - only one of the DomUs is doing very much - and it's running about 90%
>   idle, the rest split between user cycles and wait cycles
> - start a disk-intensive job on the DomU (e.g., tar a bunch of files on
>   the root LV, put them on the backup LV)
> - i/o WAIT goes through the roof
>
> It's pretty clear that this configuration generates a lot of complicated
> disk activity. Since DRBD is at the top of the disk stack, I figure
> this list is a good place to ask the question:
>
> Any suggestions on how to track down where the delays are creeping in,
> what might be tunable, and any good references on these issues?
>
> Thanks very much,
>
> Miles Fidelman

I believe software RAID is not a good basis for this kind of setup, and neither is RAID6; RAID10 or RAID50 would be a better choice.
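Before benchmarking, it can help to confirm on which host the wait cycles actually accumulate. A minimal sketch (my own suggestion, not from the thread): read the aggregate CPU counters from /proc/stat on Linux; for a per-device view you would use `iostat -x` from the sysstat package instead.

```shell
# Read the system-wide CPU counters from the first line of /proc/stat.
# Field order: cpu user nice system idle iowait irq softirq ...
read -r _ user nice system idle iowait _ < /proc/stat
total=$((user + nice + system + idle + iowait))
echo "iowait ticks so far: $iowait of $total"
# Sample twice a few seconds apart and diff the counters to get a rate;
# run this in both the DomU and the Dom0 to see which side is waiting.
```

The same counters are what top and vmstat report as "wa"; the point of reading them in both domains is that high iowait in the DomU with a quiet Dom0 points at the virtual block device path rather than the physical disks.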
Test the low-level storage with bonnie++ first: bring DRBD down and run the benchmark directly on the RAID6 array. If it comes in below 110 MB/sec, that is your bottleneck. If it is above, you might want to replace the sync NICs with a bond; in mode 0 this will give you about 180 MB/sec. Then test with bonnie++ again on top of the active DRBD resource.

Good luck,
Bart
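For context, Bart's 110 MB/sec and 180 MB/sec figures line up with gigabit Ethernet line rate. A back-of-envelope check (the overhead factors below are rough assumptions of mine, not measurements from the thread):

```python
# Sanity-check the throughput figures against 1 Gb/s line rate.

GBIT = 1_000_000_000           # 1 Gb/s, bits per second
raw_mb_s = GBIT / 8 / 1e6      # theoretical ceiling: 125 MB/s

# Ethernet framing + TCP + DRBD protocol overhead eats roughly 10%
# in practice, which is where a ~110 MB/s single-NIC ceiling comes from.
practical = raw_mb_s * 0.89    # roughly 111 MB/s

# balance-rr (mode 0) bonding of two NICs does not double throughput;
# packet reordering costs some of it, hence the ~180 MB/s figure.
bonded = raw_mb_s * 2 * 0.72   # roughly 180 MB/s

print(round(raw_mb_s), round(practical), round(bonded))
```

The practical upshot: if the RAID6 alone benchmarks well above ~110 MB/sec, a single gigabit sync link is the limiting factor for DRBD replication, and bonding the cross-connect is the cheapest way to raise that ceiling.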