Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi Gordan & Lars,

Thanks for your reply. I set DRBD's sync rate to something ridiculous just to
ensure it wasn't part of the problem; the bs=512 count=1000 tests I've been
doing only use 5-6 Mbit/s. I guess I won't try no-disk-drain then ;P

The nodes are connected at 100 Mbit via a crossover cable, and because it's
VMware they are AMD pcnet32 interfaces, which as far as I can tell don't
support interrupt-coalescing tuning. DRBD has no trouble pushing exactly
10 MB/sec while syncing or writing large files.

64 bytes from 192.168.0.40: icmp_seq=7 ttl=64 time=0.462 ms
64 bytes from 192.168.0.40: icmp_seq=8 ttl=64 time=0.484 ms
64 bytes from 192.168.0.40: icmp_seq=9 ttl=64 time=0.467 ms

RTT isn't fantastic in this 2 x VMware sample -- I didn't know it was this bad,
actually -- but it's definitely better than 2 ms. I'll see what I can do about
setting up 2 x GigE systems I can put Linux on natively to test with. I had
hoped I could just see the result of no-disk-flushes straight away, evaluate
it, and put it on our live servers; it seems nothing can go the easy way for me.

On our live environment it's 2 x GigE Intel NICs directly connected, and I see
0.160 ms average RTT there. File creation on that environment is also very
slow, though harder to test because I cannot risk damaging the filesystem.
I've only recently started looking into tuning rx-usecs, which the e1000
module says it supports, but I've already run into problems there too --
though this bit is probably for another place, another time:

# ethtool -c eth1
Coalesce parameters for eth0:
Cannot get device coalesce settings: Operation not supported

Gordan suggested trying an alternative (e.g. an NFS mount). I used rw,async to
better simulate no-disk-flushes, as rw,sync was 10x slower for small files.

# NFS mount:
# dd if=/dev/zero of=/root/x/testfile bs=512k count=1000 oflag=direct
524288000 bytes (524 MB) copied, 53.8223 seconds, 9.7 MB/s
# dd if=/dev/zero of=/root/x/testfile bs=512 count=1000 oflag=direct
512000 bytes (512 kB) copied, 1.17283 seconds, 437 kB/s
# dd if=/dev/zero of=/root/x/testfile bs=512 count=25000 oflag=direct
12800000 bytes (13 MB) copied, 26.5869 seconds, 481 kB/s

# bonnie++ -d /root/x/ -n 25:1024:0:10 -u nobody -s 512M
Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ocfstest2      512M  9714  43  9830   2  5336   9  8933  50 10461   7  1023  14
                    ------Sequential Create------ --------Random Create--------
files:max           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                     /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
25:1024:0/10          258   9   407  10   648  10   257   9   381  14   565  11

Do you think this 0.400 ms latency is the reason why I see no change when I
use the no-disk-flushes option?

Andy..

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Thursday, 19 March 2009 5:14 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] no-disk-flushes ineffective?

On Wed, Mar 18, 2009 at 09:46:53AM +0000, Gordan Bobic wrote:
> Can you post the contents of your /proc/drbd?
> You might also want to add "no-disk-drain", and see if that helps.

No.
Don't use no-disk-drain, especially if you already use all the other
"no-disk-whatever" options. Because that will in most real-life setups cause
write reordering where it is not allowed, which means journalling file system
or database assumptions are violated.
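(For reference, these are all flags in the disk section of drbd.conf; a rough
sketch below, assuming DRBD 8.3 option names and an example resource name "r0"
-- check drbd.conf(5) for your version:)

resource r0 {
  disk {
    no-disk-barrier;   # skip write barriers on the backing device
    no-disk-flushes;   # skip cache flushes for data writes
    no-md-flushes;     # skip cache flushes for meta-data writes
    # no-disk-drain;   # leave this one out -- see above
  }
}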
If you run into data consistency issues, lose files or transactions after
failover or crash on DRBD with no-disk-drain, don't blame DRBD.
Similarly, if you run on volatile caches, don't blame DRBD.

> > #dd if=/dev/zero of=/dev/sdb bs=512k count=1000 oflag=direct
> > 524288000 bytes (524 MB) copied, 15.0304 seconds, 34.9 MB/s

so your backend storage is capable of 35 MB/s throughput,
which is just about ok for a single disk.

> > # dd if=/dev/zero of=/dev/sdb bs=512 count=1000 oflag=direct
> > 512000 bytes (512 kB) copied, 0.351227 seconds, 1.5 MB/s

and ~0.35 ms latency, which means there are caches involved.
Hopefully they are non-volatile, but probably it's just a single low-end
IDE disk with its volatile on-disk cache enabled?

> > ##### without flushes:
> > #single node
> > # dd if=/dev/zero of=/dev/drbd0 bs=512k count=1000 oflag=direct
> > 524288000 bytes (524 MB) copied, 15.0428 seconds, 34.9 MB/s
> >
> > # dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
> > 512000 bytes (512 kB) copied, 0.367788 seconds, 1.4 MB/s

yep, about the same. ok.

> > #dual node
> > # dd if=/dev/zero of=/dev/drbd0 bs=512k count=1000 oflag=direct
> > 524288000 bytes (524 MB) copied, 51.0372 seconds, 10.3 MB/s

so your network is a 100 MBit connection?

> > # dd if=/dev/zero of=/dev/drbd0 bs=512 count=1000 oflag=direct
> > 512000 bytes (512 kB) copied, 2.03025 seconds, 252 kB/s

with an approximate RTT of 2 ms?

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

__
please don't Cc me, but send to list -- I'm subscribed
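(A quick way to check that guess about the volatile on-disk cache -- a sketch
only, assuming an ATA/IDE disk at /dev/sdb as in the dd tests above and that
hdparm is installed:)

# show whether the drive's volatile write cache is currently enabled
hdparm -W /dev/sdb

# disable it, then re-run the bs=512 oflag=direct dd to see the real latency
hdparm -W 0 /dev/sdb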