Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi again,

Problem solved. There is a setting called storsave for newer 3ware
RAID controllers. This was set to "balanced", which - among other
things - provides a write journal for the disk cache to prevent data
loss in case of power failure. Setting it to "perform" boosted write
performance from ~55MB/s to ~155MB/s.

Looking carefully at the output of atop and comparing the writes/s to
the backing device with the writes/s to the disk made me feel that it
had to have something to do with the RAID/disk caching settings. atop
is a wonderful tool! The manual
http://www.3ware.com/support/UserDocs/UsrGuide-9.5.2.pdf has more
details on this setting.

I do wonder, though, why this performance bottleneck (storsave=balance)
only applies to writes on the DRBD device. Writes directly to the local
backing device are fast (~300MB/s). Are different syncs or write calls
used when writing locally compared to when DRBD writes to the disk of
the secondary node? Or is this due to the network latency?

Warm regards and thanks for the good work!

Tom
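
In case someone else stumbles over this in the archives: the storsave
policy can be read and changed at runtime with tw_cli. The controller
and unit numbers below (/c0/u0) are just an example and will differ on
other systems - please double-check against the manual linked above:

  # show the current storsave policy of unit 0 on controller 0
  tw_cli /c0/u0 show storsave

  # trade safety for speed; without a BBU this risks data loss on
  # power failure
  tw_cli /c0/u0 set storsave=perform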

On Wednesday 02 01 2013 17:51:22 Tom Fernandes wrote:
> Hi Florian,
>
> Thanks for your reply. I was out of office for some time, so here are
> my observations...
>
> On Wednesday 02 01 2013 16:29:17 you wrote:
> > On Tue, Dec 18, 2012 at 10:58 AM, Tom Fernandes <anyaddress at gmx.net>
> > wrote:
> > > ------------------------------- DRBD -----------------------------------------
> > > tom at hydra04 [1526]:~$ sudo drbdadm dump
> > > # /etc/drbd.conf
> > > common {
> > >     protocol C;
> > >     syncer {
> > >         rate 150M;
> > >     }
> > > }
> > >
> > > # resource leela on hydra04: not ignored, not stacked
> > > resource leela {
> > >     on hydra04 {
> > >         device     minor 0;
> > >         disk       /dev/vg0/leela;
> > >         address    ipv4 10.0.0.1:7788;
> > >         meta-disk  internal;
> > >     }
> > >     on hydra05 {
> > >         device     minor 0;
> > >         disk       /dev/vg0/leela;
> > >         address    ipv4 10.0.0.2:7788;
> > >         meta-disk  internal;
> > >     }
> > > }
> >
> > If that configuration is indeed "similar" to the one on the other
> > cluster (the one where you're apparently writing to DRBD at 200
> > MB/s), I'd be duly surprised. Indeed I'd consider it quite unlikely
> > for _any_ DRBD 8.3 cluster to hit that throughput unless you tweaked
> > at least al-extents, max-buffers and max-epoch-size, and possibly
> > also sndbuf-size and rcvbuf-size, and set no-disk-flushes and
> > no-md-flushes (assuming you run on flash or battery-backed write
> > cache).
>
> I compared the DRBD configuration of the fast and the slow cluster
> again with drbdadm dump. They are the same. Both configurations have
> just the defaults. No modifications of the parameters you mentioned
> above.
>
> To be on the safe side I re-ran the benchmarks with a 2048MB dd file
> (as we have big RAID caches). On the fast cluster I have 1024MB of
> flash-backed cache, on the slow cluster it's 512MB (without BBU). When
> doing the tests on the fast cluster I observed nr and dw in /proc/drbd
> on the secondary node to be sure that the data is really getting
> synced.
>
> The fast cluster consists of HP servers. The slow cluster is different
> hardware (it's rented from our provider and may be no-name hardware).
> But they have the same amount of RAM, the same number of threads, both
> have SAS drives and both have a RAID6 configured.
>
> The fast cluster gives ~176MB/s write performance (not 200MB/s as I
> mentioned before - I wasn't accurate when I wrote that - sorry). The
> slow cluster gives ~55MB/s write performance. The speed on the slow
> cluster stays roughly the same whether I use protocol C or A. On the
> fast cluster the speed increases from ~176MB/s to ~187MB/s when
> switching from protocol C to protocol A.
>
> > So I'd suggest that you refer back to your "fast" cluster and see if
> > perhaps you forgot to copy over your /etc/drbd.d/global_common.conf.
>
> I checked. Both configs are the same.
>
> > You may also need to switch your I/O scheduler from cfq to deadline
> > on your backing devices, if you haven't already done so.
>
> I switched from cfq to deadline on the slow cluster. There was a
> performance increase from ~55MB/s to ~58MB/s.
>
> > And finally, for a round-robin bonded network link, upping the
> > net.ipv4.tcp_reordering sysctl to perhaps 30 or so would also be
> > wise.
>
> I tried setting it to 30 on the slow cluster, but performance didn't
> really change.
>
> I did not feel it makes sense to tweak the DRBD configuration on the
> slow cluster, as the fast cluster has the same DRBD configuration but
> gives more than 3x better performance.
>
> I'll try with 8.4 tomorrow. Let's see if that makes a difference.
>
> Is there any more information I can provide?
>
> Warm regards,
>
> Tom Fernandes
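
PS, also for the archives: the two tweaks from the quoted mail were
applied on the slow cluster roughly as shown below (sdb only stands in
for the actual backing device, adjust to your system):

  # switch the I/O scheduler of the backing device from cfq to deadline
  echo deadline > /sys/block/sdb/queue/scheduler

  # allow more out-of-order segments on the round-robin bonded link
  sysctl -w net.ipv4.tcp_reordering=30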