Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi,

On Fri, 2011-07-01 at 16:44 -0500, Zev Weiss wrote:
> (Re-sending since my prior attempt via gmane doesn't appear to have
> worked; apologies if this is duplicated.)
>
> Hi,
>
> I'm seeing similar problems with massive underperformance on writes on
> my system. I'm running locally-compiled DRBD [version: 8.3.7
> (api:88/proto:86-91), GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917]
> on RHEL 5. Read performance is fine, but I'm getting write throughput
> of about 4MB/s, with latency around 13ms (as measured with a script
> similar to the one in the user's guide).
>
> I haven't yet tried tuning various parameters in drbd.conf as described
> in the available performance-optimization guides, but it's so slow I
> have to think there must be some more fundamental problem at play here
> than a lack of optimization (i.e. the defaults shouldn't be *that* bad).
>
> And for what it's worth, I've ruled out the network link as a possible
> bottleneck -- it's giving me 1.97Gbps throughput in both directions
> according to iperf (a bonded pair of back-to-back GbE ports, 9k MTU).
>
> Anyone have any suggestions or advice?

We had similar problems, but recently I found a setup which gives good
write performance. The problem is that I don't know exactly what was
wrong with my initial setup and which config change gave the performance
boost.

We have ext4 cluster resources on clustered LVM, and KVM virtual
machines running directly on a DRBD device (qemu-kvm ..,cache=none).

The dd runs below were done inside a virtual machine (2 GB RAM and 2
virtual cores) on the ext4 filesystem /var/spool/imap. As soon as I fire
up the dd, replication runs at maximum network speed (about 119 MB/s).

[root@cmail imap]# dd if=/dev/zero of=test.img bs=1024k count=10k
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 82.1332 seconds, 131 MB/s

[root@cmail imap]# dd if=/dev/zero of=test.img bs=1024k count=1k oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 10.9007 seconds, 98.5 MB/s

The tests below were run on an ext4 filesystem which sits on top of a
clustered LVM volume:

/dev/mapper/vg_cdata-data on /data type ext4 (rw,noatime,nodiratime)

[root@cnode1 data]# dd if=/dev/zero of=test.img bs=1024k count=1k oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 10.3417 seconds, 104 MB/s

[root@cnode1 data]# dd if=/dev/zero of=test.img bs=64k count=10k oflag=direct
10240+0 records in
10240+0 records out
671088640 bytes (671 MB) copied, 9.44331 seconds, 71.1 MB/s

As you can see, the 1 Gb interconnect is pretty busy during such a dd
run. I expect write performance of about 300 MB/s or more once we get a
pair of 10 Gb cards dedicated to DRBD.

Below you can see the current DRBD and sysctl settings. Compared to our
initial setup we changed the hardware RAID5 to RAID50, which doubled the
raw write performance. We also changed sndbuf-size from a fixed 512k to
0 (auto-tuning) and adjusted the sysctl TCP tuning settings; I can't
remember the initial values.

We also had one hardware problem: because of a failed RAID controller
battery the controller's write cache was disabled. Raw RAID50 write
performance was only about 175 MB/s on one node; after the battery was
replaced it is about 450 MB/s.

This is my first DRBD cluster installation, which we started two weeks
ago. I'm sure DRBD needs a few more settings to handle problems, but
performance seems good.
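For the latency side (the 13 ms mentioned in the original post), a
minimal check along the lines of the script in the user's guide is a run
of small synchronous writes; the average write latency is roughly the
elapsed time divided by the number of writes. This is only a sketch,
run on the same mount point as the throughput tests above, and the file
name is just an example:

[root@cmail imap]# dd if=/dev/zero of=latency-test.img bs=512 count=1000 oflag=direct
[root@cmail imap]# rm -f latency-test.img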
br,
Ulrich

cluster hardware:
-----------------
pair of HP DL380 G7
single Xeon E5649 (6-core CPU)
12 GB RAM
6x300 GB RAID50 (about 450 MB/s write performance)
1 Gb cluster interconnect (will be replaced by 10 Gb)

cluster software:
-----------------
CentOS 5.6 x86_64
drbd83 from the CentOS Extras repository (drbd83-8.3.8-1.el5.centos)
RHEL Cluster Suite

drbd performance settings:
--------------------------
/etc/drbd.d/global_common.conf

...
common {
        ...
        disk {
                no-disk-barrier;
                no-disk-flushes;
                on-io-error detach;
        }
        net {
                max-buffers 8000;
                max-epoch-size 8000;
                sndbuf-size 0;
        }
        syncer {
                rate 50M;
                al-extents 3389;
                verify-alg md5;
        }
        ...
}

drbd device which runs the kvm virtual machine (/etc/drbd.d/cmail.res):
-----------------------------------------------------------------------
resource cmail {
        device /dev/drbd2;
        meta-disk internal;

        on cnode1.obvsg.at {
                disk /dev/vg_cnode1/cmail;
                address 10.0.0.161:7791;
        }
        on cnode2.obvsg.at {
                disk /dev/vg_cnode2/cmail;
                address 10.0.0.162:7791;
        }

        net {
                allow-two-primaries;
        }
        startup {
                become-primary-on both;
        }
}

sysctl settings:
----------------
cat >> /etc/sysctl.conf <<EOF
# http://fasterdata.es.net/fasterdata/host-tuning/linux/
# increase TCP max buffer size settable using setsockopt()
# 16 MB with a few parallel streams is recommended for most 10G paths
# 32 MB might be needed for some very long end-to-end 10G or 40G paths
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# increase Linux autotuning TCP buffer limits
# min, default, and max number of bytes to use
# (only change the 3rd value, and make it 16 MB or more)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# recommended to increase this for 10G NICs
net.core.netdev_max_backlog = 30000
# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
# reduce dirty watermarks to start background (and foreground)
# writeback early. Reduces the chance of resource starvation.
vm.dirty_ratio = 10
vm.dirty_background_ratio = 3
EOF

--
Ulrich Leodolter <ulrich.leodolter@obvsg.at>
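A rough sketch of how changed settings like the above can be activated
on a running cluster (assuming the resource name cmail from the config
above; standard drbdadm and sysctl usage, run on both nodes):

# reload the sysctl values without a reboot
sysctl -p

# sanity-check the edited DRBD config, then apply the changed
# options to the running resource without taking it down
drbdadm dump cmail
drbdadm adjust cmail

# watch connection state, role and resync progress
cat /proc/drbd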