[DRBD-user] tuning?

Lee Riemer lriemer at bestline.net
Sun Jun 6 23:14:22 CEST 2010

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Don't forget that in RAID 5 or 6 you're also writing parity along with
the actual data.
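
With 4 disks in RAID6, only 2 of them hold data in any given stripe, so
a full-stripe sequential write gets at most half the raw spindle
bandwidth, and a small random write turns into a read-modify-write of
the data block plus both parity blocks. As a rough sketch (the ~70
MB/sec per-spindle figure is an assumption, not something measured
here):

  sequential ceiling:   4 spindles x ~70 MB/sec x (2 data / 4 total) ~ 140 MB/sec
  small random writes:  raw IOPS / 6   (read data + 2 parity, write data + 2 parity)

and that's before LVM, DRBD or Xen add their own overhead.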



On Jun 5, 2010, at 5:34 PM, Miles Fidelman  
<mfidelman at meetinghouse.net> wrote:

> I wrote:
>> I've been doing some experimenting to see how far I can push some old
>> hardware into a virtualized environment - partially to see how much  
>> use
>> I can get out of the hardware, and partially to learn more about the
>> behavior of, and interactions between, software RAID, LVM, DRBD,  
>> and Xen.
>>
>> What I'm finding is that it's really easy to get into a state where  
>> one
>> of my VMs is spending all of its time in i/o wait (95%+).  Other  
>> times,
>> everything behaves fine.
>>
> Bart Coninckx replied:
>> Test the low-level storage with bonnie++ by bringing DRBD down
>> first and running it directly on the RAID6. If it hits below
>> 110 MB/sec, that is your bottleneck. If it's above that, you might
>> want to replace the sync NICs with a bond, which will give you
>> about 180 MB/sec in mode 0. Then test with bonnie++ on top of the
>> active DRBD resource.
>>
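
For the baseline test Bart describes, something like the following
should do it (the resource name, scratch LV and mount point are
placeholders for whatever your setup uses):

  drbdadm down r0                           # take the DRBD resource offline
  mkfs.ext3 /dev/vg0/scratch                # scratch LV sitting on the RAID6
  mount /dev/vg0/scratch /mnt/test
  bonnie++ -d /mnt/test -s 4096 -u nobody   # -s roughly 2x RAM so the page
                                            # cache doesn't flatter the numbers

"Mode 0" for the bond is the bonding driver's balance-rr mode, which
stripes packets across both NICs.
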
> and Michael Iverson wrote:
>> Your read performance is going to be limited by your RAID
>> selection. Be prepared to experiment and document the performance
>> of the various nodes.
>>
>> With a 1G interconnect, write performance will be dictated by  
>> network speed. You'll want jumbo frames at a minimum, and might  
>> have to mess with buffer sizes. Keep in mind that latency is just  
>> as important as throughput.
> <snip>
>> However, I think you'll need to install a benchmark like iozone,  
>> and spend a lot of time doing before/after comparisons.
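
If you want to try the jumbo-frame and buffer suggestions, something
along these lines (eth1 as the cross-connect interface and the peer
address are assumptions; both ends, and any switch in between, have to
agree on the MTU):

  ifconfig eth1 mtu 9000                 # or: ip link set eth1 mtu 9000
  sysctl -w net.core.rmem_max=8388608    # raise the socket buffer ceilings
  sysctl -w net.core.wmem_max=8388608
  ping -M do -s 8972 192.168.10.2        # example peer cross-connect IP;
                                         # 8972 + 28 bytes of headers = 9000
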
> And to summarize the configuration again:
>> - two machines, 4 disk drives each, two 1G ethernet ports (1 each  
>> to the
>> outside world, 1 each as a cross-connect)
>> - each machine runs Xen 3 on top of Debian Lenny (the basic install)
>> - very basic Dom0s - just running the hypervisor and i/o (including  
>> disk
>> management)
>> ---- software RAID6 (md)
>> ---- LVM
>> ---- DRBD
>> ---- heartbeat to provide some failure migration
>> - each Xen VM uses 2 DRBD volumes - one for root, one for swap
>> - one of the VMs has a third volume, used for backup copies of files
>>
>>
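
Given that stack, it helps to benchmark each layer separately so you
can see where the throughput actually goes. A rough sketch (device
names are placeholders; read-only tests, run from dom0):

  dd if=/dev/md0 of=/dev/null bs=1M count=4096 iflag=direct         # raw RAID6
  dd if=/dev/vg0/somelv of=/dev/null bs=1M count=4096 iflag=direct  # through LVM
  dd if=/dev/drbd0 of=/dev/null bs=1M count=4096 iflag=direct       # through DRBD

Whichever step shows the big drop is the layer worth tuning first.
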
> First off, thanks for the suggestions guys!
>
> What I've tried so far, which leaves me just a bit confused:
>
> TEST 1
> - machine 1: running a mail server, in a DomU, on DRBD root and swap  
> volumes, on LVs, on raid6 (md)
> --- baseline operation, disk wait seems to vary from 0% to about 25%  
> while running mail
> --- note: when this was a non-virtualized machine, running on a  
> RAID-1 volume, never saw disk waits
> - machine 2: just running a Dom0, DRBD is mirroring volumes from  
> machine 1
> --- Dom0's root and swap are directly on raid6 md volumes
> --- installed bonnie++ into Dom0, ran it
> --- different tests showed a range of speeds from around 50MB/sec to  
> 80MB/sec (not blindingly fast)
>
> TEST 2
> - same as above, but TURNED OFF DRBD on machine 2
> -- some improvement, but not a lot - one test went from 80MB/sec to  
> 90MB/sec
>
> TEST 3
> - turned DRBD back on on machine 2
> - added a domU to machine 2
> - ran bonnie++ inside the domU
> -- reported test speeds dropped to 23-54 MB/sec, depending on
> the test
> -- I saw up to 30MB/sec of traffic on the cross-connect ethernet  
> (vnstat) - nothing approaching the 1G theoretical limit
>
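
It might be worth measuring the cross-connect on its own, independent
of DRBD, to see what the raw link delivers (iperf isn't installed by
default but is packaged in Lenny; addresses are examples):

  iperf -s                        # on machine 1, listening
  iperf -c 192.168.10.1           # on machine 2: raw TCP throughput of the link
  ping -c 100 192.168.10.1        # round-trip latency on the cross-connect

If you're running DRBD protocol C, every write waits for the peer's
acknowledgement, so latency on that link shows up directly as write
slowness even when throughput looks fine.
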
> TEST 4
> - started a 2nd domU on machine 2
> - re-ran the test (inside the other domU)
> - reported speeds dropped marginally (20-50 MB/sec)
>
> TEST 5
> - moved to machine 1 (the one running the mail server), left one
> domU running on the other machine
> - while the mail server was running in its domU, ran bonnie++ in dom0
> -- reported speeds from 31 to 44 MB/sec
> -- interestingly, saw nothing above 1MB/sec on the cross-connect,  
> even though dom0 has priority
>
> TEST 6
> - again, on the mail server machine
> - started a 2nd domU, ran bonnie++ in the 2nd domU
> --- reported speeds of 23 MB/sec up to 72 MB/sec; up to 30 MB/sec on
> the cross-connect
> --- what was noticeable was that the mail server's i/o wait time  
> (top) moved up from 5-25% to more like 25-50%
>
> TEST 7
> - as above, but ran bonnie++ in the same domU as the mail server
> - reported speeds dropped to 34-60 MB/sec depending on the test
> - most noticeable: started seeing i/o wait time pushing up to 90%,  
> highest during the "writing intelligently" and "reading  
> intelligently" tests
>
> OTHER DATA POINTS
> - when running basic mail and list service, the domU runs at about  
> 25% i/o wait as reported by top
> - when I start a tar job, i/o wait jumps up to the 70-90% range
> - i/o wait seems to drop just slightly if the tar job is reading  
> from one DRBD volume and writing to another (somewhat  
> counterintuitive as it would seem that there's more complexity  
> involved)
>
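
To see what those wait cycles actually correspond to, per-device
statistics in dom0 tell you more than top's aggregate iowait. A few
things worth watching while a test runs (iostat comes from the sysstat
package):

  iostat -x 2       # await and %util per disk, per md device, per drbd device
  vmstat 2          # compare the 'wa' and 'b' columns in dom0 vs. inside a domU
  cat /proc/drbd    # pe (pending) and ua (unacked) counters show whether DRBD
                    # is waiting on the peer rather than on the local disks
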
> Overall, I'm really not sure what to make of this.  It seems like:
> - there's a 40-50% drop in disk throughput when I add LVM, DRBD, and  
> a domU on top of raid6
> - the network is never particularly loaded
> - lots of disk i/o pushes a lot of cpu cycles into i/o wait - BUT...  
> it's not clear what's going on during those wait cycles
>
> I'm starting to wonder if this is more a function of the hypervisor  
> and/or memory/caching issues than the underlying disk stack.  Any  
> reactions, thoughts, diagnostic suggestions?
>
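
On the DRBD side there are a few knobs worth experimenting with once
you know where the bottleneck is. The snippet below is only a starting
point for testing, not a recommendation: syntax is for DRBD 8.3, the
resource name is a placeholder, and the no-disk-* options are only
safe with a battery-backed write cache.

  resource r0 {
    net {
      max-buffers     8000;     # more in-flight buffers on the replication socket
      max-epoch-size  8000;
      sndbuf-size     512k;
    }
    syncer {
      rate        100M;         # caps background resync, not normal replication
      al-extents  1801;         # bigger activity log, fewer metadata updates
    }
    disk {
      no-disk-barrier;          # ONLY with battery-backed cache
      no-disk-flushes;          # ditto
    }
  }

On the Xen side, it may also be worth fixing dom0's memory with
dom0_mem= on the hypervisor command line, so ballooning doesn't eat
the cache that the whole storage stack in dom0 depends on.
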
> Thanks again,
>
> Miles Fidelman
>
>
> -- 
> In theory, there is no difference between theory and practice.
> In<fnord>  practice, there is.   .... Yogi Berra
>
>


