[DRBD-user] Re: slow disk throughput

William Francis wfrancis at gmail.com
Mon Jun 16 22:51:29 CEST 2008


I should have also mentioned this is on Ubuntu Hardy:

uname -a:
Linux d242 2.6.24-16-server #1 SMP Thu Apr 10 13:58:00 UTC 2008 i686 GNU/Linux

version: 8.0.11 (api:86/proto:86)
GIT-hash: b3fe2bdfd3b9f7c2f923186883eb9e2a0d3a5b1b build by phil at mescal, 2008-02-12 11:56:43
 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:7484500 dw:7484500 dr:0 al:0 bm:165 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:147017 misses:165 starving:0 dirty:0 changed:165
        act_log: used:0/1009 hits:0 misses:0 starving:0 dirty:0 changed:0



William Francis wrote:
>
>
> I'm trying to tune DRBD running between two new machines with very
> fast RAID10 disks. I understand that there's a performance hit when
> using DRBD, but what I'm seeing seems unusually high compared with
> what I've read elsewhere.
>
> The end use is a mail server for about 100 people, which means lots
> of small reads and writes. I don't have a good test for that
> workload, so I'm using dd to test basic throughput, and vmstat shows
> the DRBD partition reaching only about 10-15% of raw-disk
> performance. When we tried to bring the mail server up, we very
> often saw iowait above 90%.
>
> Raw throughput on the same RAID10 array, non-drbd partition:
>
> time dd if=/dev/zero of=/tmp/delete/out.file bs=1M count=5000
>
> root at d243:/opt/delete# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  1  2    264 2458700  24396 1526716    0    0     8 169552 1141  195  1 35 41 23
>  1  2    264 2276340  24584 1703764    0    0    12 182552 1213  308  0 37 12 50
>  1  3    264 2117804  24752 1860468    0    0     8 177404 1109 1115  0 39 10 51
>
>
> Notice the 170K+ blocks/s of write throughput in the bo column.
>
> Same dd command, this time onto the DRBD partition:
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  1  1    304 157968  15292 3802460    0    0     0  8622 6641 8026  0  4 80 16
>  0  1    304 158256  15296 3802456    0    0     0  7208 10241 11711  0  6 30 64
>  0  0    304 159156  15024 3801788    0    0     4  9238 1293 1073  0 16 63 21
>  0  1    304 157912  15032 3803036    0    0     0 12273 8828 10401  0  6 86  8
>  0  1    304 159208  15044 3801588    0    0     0 12278 8964 9651  0  9 64 27
>
>
> Now it manages only about 8K-12K blocks per second, though it will
> do 25K for a little while before settling down to this speed.
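One caveat worth noting about the dd runs above: without conv=fdatasync (or oflag=direct), dd reports the speed of writing into the page cache and the kernel flushes later, so short runs can look faster than the disk really is. A generic re-test sketch (the target path here is just an example, not a path from this setup):

```shell
# Time a 256 MiB write and include the final flush in the timing, so
# the page cache cannot inflate the result. Point TARGET at the
# filesystem under test (raw partition vs. DRBD-backed partition).
TARGET=/tmp/ddtest.out
dd if=/dev/zero of="$TARGET" bs=1M count=256 conv=fdatasync
rm -f "$TARGET"
```

Comparing raw and DRBD partitions with the same flushed dd run gives a fairer ratio than eyeballing vmstat mid-run.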
>
> The network link is gigabit Ethernet with ping times of about
> 0.060 ms (no "crossover" cable).
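Since protocol C waits for each write to reach the peer, the link itself puts a hard ceiling on replicated write throughput. A quick back-of-envelope check (plain arithmetic, ignoring TCP and DRBD protocol overhead):

```shell
# Gigabit Ethernet moves at most 10^9 bits/s; divide by 8 for bytes.
# Even ignoring all protocol overhead, replication tops out around
# 125 MB/s -- below the ~170 MB/s the local RAID10 sustains, but far
# above the ~10 MB/s seen on the DRBD partition, so the wire's raw
# capacity is not the bottleneck here.
echo $(( 1000000000 / 8 / 1000000 ))   # -> 125 (MB/s, rough ceiling)
```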
>
> The file system is ext3 with 4K blocks. If I stop drbd on the slave
> machine I get about a 20% performance increase. Should I expect more?
>
> Here's my drbd.conf - thanks for any ideas
>
> Will
>
>
>
> global {
>    usage-count yes;
> }
> common {
>  syncer { rate 100M; }
> }
> resource drbd0 {
>  protocol C;
>  handlers {
>    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>    outdate-peer "/usr/sbin/drbd-peer-outdater";
>  }
>  startup {
>  }
>  disk {
>    on-io-error   detach;
>  }
>  net {
>    max-buffers     2048;
>    unplug-watermark   2048;
>    max-epoch-size  2048;
>    after-sb-0pri discard-younger-primary;
>    after-sb-1pri consensus;
>    after-sb-2pri disconnect;
>    rr-conflict disconnect;
>  }
>  syncer {
>    rate 50M;
>    al-extents 1009;
>  }
>  on d242 {
>    device     /dev/drbd0;
>    disk       /dev/sda3;
>    address    10.2.8.17:7788;
>    meta-disk  internal;
>  }
>  on d243 {
>    device    /dev/drbd0;
>    disk      /dev/sda3;
>    address   10.2.8.18:7788;
>    meta-disk internal;
>  }
> }
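For what it's worth, the usual DRBD throughput-tuning suggestions for streaming writes involve larger network buffers and epochs than the 2048 used above. The values below are only a starting point for experimentation, not something verified on this hardware; check drbd.conf(5) for the options your version actually supports before applying them:

```
net {
  sndbuf-size     512k;  # larger TCP send buffer (untested suggestion)
  max-buffers     8000;  # more receive buffers on the secondary
  max-epoch-size  8000;  # allow larger write epochs
}
```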
>



