Hello Folks,

On 11-04-2008 15:19, "Florian Haas" <florian.haas at linbit.com> wrote:

>> On Thursday 10 April 2008 17:32:42 Florian Haas wrote:
>>
>>> No, it's DRBD doing its grey voodoo magic. :-) You simply witnessed the
>>> effects of "cold" vs. "hot" activity log.
>>
>> Cool! (pun intended).
>>
>>> May I guess you are using the CFQ I/O scheduler?
>>> What's /sys/block/sda/queue/scheduler say?
>>
>> I'm currently using noop because of the hardware RAID underneath[1]. I've
>> also tried the deadline scheduler, since it's better for database loads
>> according to the Linux docs. This didn't improve anything, although
>> noop's only a bit faster. Noop's results are consistent while deadline
>> fluctuates a bit more.
>>
>> The way I've tested this is keeping a 0.5s watch firing a SELECT COUNT(*)
>> query into the database while a serial INSERT script is running on the
>> background. All of this is running while switching/tuning schedulers. The
>> setup causing the fastest increment in the COUNT(*) result wins.
>>
>> I haven't tested CFQ that well, but without tuning CFQ it's performance is
>> worse, which is to be expected. Is this scheduler a potential winner when
>> tuned correctly?
>
> No, not really. I was actually asking because most people tend to use CFQ
> these days since it's the default in recent kernels. I was going to
> suggest noop or deadline, but you've tried that already.
>
> And, I assume you do have your write cache enabled and set to write back.
>
>> The weird thing is, when I disconnect the secondary DRBD node the
>> increment becomes a few hundred times faster. When the second node
>> reconnects after a few minutes it's sync is _very_ fast (a few seconds).
>> The performance drops back again after the reconnect.
>
> Um, this is just a wild guess, but I do remember having observed similar
> symptoms after enabling Jumbo frames on one of my test systems. I never
> found a reasonable explanation for this -- if someone else has, please
> share -- but latency dropped for a few writes, then surged dramatically
> and never improved. Can you duplicate your tests with a standard-issue MTU
> of 1500?

I saw the same behaviour using RAID5/RAID6 with internal metadata.
We already discussed this here a few months ago, and I think Lars explained
it as a "bitmap sync writes problem with the raid parity calculation".

Try changing your RAID level to 0/10, or move the internal metadata
somewhere else.

For those interested, I'm going to try an i-RAM for the metadata.

Any thoughts?

--
matteo
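
For anyone repeating the scheduler comparison discussed above, here is a minimal Python sketch for checking which I/O scheduler is active. It relies on the sysfs convention that the active scheduler is shown in square brackets (for example "noop deadline [cfq]"). The helper names `active_scheduler` and `read_scheduler` are made up for illustration, and the default device "sda" is only an assumption taken from the path quoted in the thread.

```python
import re


def active_scheduler(sysfs_text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def read_scheduler(device="sda"):
    """Read the active scheduler for a block device (device name is an assumption)."""
    # Path as quoted in the thread: /sys/block/sda/queue/scheduler
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return active_scheduler(f.read())
```

Calling `read_scheduler()` before and after each test run makes it easier to confirm that a scheduler switch actually took effect on the device under test.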