Hello Lars,

On Mon, 3 Mar 2008 13:41:54 +0100 Lars Ellenberg wrote:

[...]
>
> 50GB test size by far exceeds the
> "activity log size" you configured,
> which covers 1801 * 4M = about 7G only.
>
Ah! Finally the clue to which part of the protocol/system was peeing in
my HA soup. ^^
I forced the test size to 6GB (as the current AL size is decent for real
world scenarios and won't fit 50GB anyways), which of course makes reads
cache bound, but delivers more realistic numbers for writes.

> so you get constant meta data transactions,
> which are synchronous sector writes including barriers.
>
> you can see this in the "al:" numbers increasing,
> as well as the "hit/misses/changed" ratio in the act_log line.
>
Makes perfect sense now.

> synchronous writes don't have the best latency characteristics
> with md raid5.
>
Yeah, but a RAID10 is just too... wasteful. ^^
Maybe if we ever need a real high performance cluster I'll go for that
case with the 24 hotswap SATA drives and RAID10. And if somebody says
SUN Thumper, I shall have them whipped with my checkbook. :-p

[...]
>
> suggestions for this setup:
> sndbuf-size 1M # or even more, if you try with bonding.
> max-buffers 8000 # or more
> max-epoch-size # equal to max-buffers
>
I cranked those to 2M sndbuf and 16k max-buffers and epoch.

> unplug-watermark
> # try it to be equal to max-buffers,
> # or half of it,
> # or something like that.
> #
> # also try the opposite:
> # make it small, 64, 128, 800.
>
This made the most impact and for the record the LSI (Fusion MPT) SAS
1068E controller likes it BIG.
Here are the numbers for the above values and an unplug-watermark of 8k
(half maxbuf) and 16k respectively:
---
drbd0, ext3, 256MB journal, 2MB sndbuf, 16k maxbuf/epoch, 8k unplug
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
borg00a       6000M           75230  21 70554  21         2083225 100  1727   1
---
drbd0, ext3, 256MB journal, 2MB sndbuf, 16k maxbuf/epoch, 16k unplug
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
borg00a       6000M           86819  24 72069  20         2062711 100 +++++ +++
---

This is close enough to the actual speed of the backing device to make
me happy, and while I will try bonding tomorrow I think it won't have
much of a further impact.

> try different io schedulers for your physical drives.
> I still like deadline, because it is simple, and the few parameters
> it has are straightforward to tune. also try setting read-ahead
> smallish on your physical devices, and largish on the md.
>
Sage advice and I will toy with that, now that I got sufficient oomph
from drbd itself. ^^

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                NOC
chibi at gol.com        Global OnLine Japan/Fusion Network Services
http://www.gol.com/
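
The activity-log arithmetic quoted at the top of this message (1801 * 4M = about 7G) can be checked with a few lines of Python. This is only a sketch: it assumes each activity-log extent covers 4 MiB, as the quoted "4M" suggests, and that 1801 is the configured number of extents. The function names are made up for illustration and are not part of DRBD.

```python
# Assumption: one activity-log extent covers 4 MiB (the "4M" in the quote).
AL_EXTENT_MIB = 4


def al_coverage_gib(al_extents: int) -> float:
    """Area covered by the activity log, in GiB (hypothetical helper)."""
    return al_extents * AL_EXTENT_MIB / 1024


def fits_in_al(test_size_gib: float, al_extents: int) -> bool:
    """True if a test of the given size stays inside the covered area."""
    return test_size_gib <= al_coverage_gib(al_extents)


if __name__ == "__main__":
    extents = 1801  # value quoted in the message
    print(f"AL coverage: {al_coverage_gib(extents):.2f} GiB")
    for size_gib in (50, 6):
        verdict = "fits" if fits_in_al(size_gib, extents) else "exceeds the AL"
        print(f"{size_gib} GiB test: {verdict}")
```

A 50 GiB test keeps touching extents outside the covered area, which matches the constant meta data transactions described in the quote, while a 6 GiB test stays within it.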