[DRBD-user] Secondary node saturates RAID array

Joris van Rooij jorrizza at wasda.nl
Fri Apr 11 12:13:31 CEST 2008


On Thursday 10 April 2008 17:32:42 Florian Haas wrote:

> No, it's DRBD doing its grey voodoo magic. :-) You simply witnessed the
> effects of "cold" vs. "hot" activity log.

Cool! (pun intended).

> May I guess you are using the CFQ I/O scheduler?
> What's /sys/block/sda/queue/scheduler say?

I'm currently using noop because of the hardware RAID underneath[1]. I've also 
tried the deadline scheduler, since according to the Linux docs it's better 
suited to database loads. That didn't improve anything: noop is only slightly 
faster, but its results are consistent, whereas deadline fluctuates a bit more.
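
For reference, switching schedulers on the fly is just a matter of the sysfs 
interface (sda being the array here; the first command lists the available 
schedulers with the active one in brackets):

# cat /sys/block/sda/queue/scheduler
# echo deadline > /sys/block/sda/queue/scheduler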

The way I've tested this is by keeping a 0.5s watch firing a SELECT COUNT(*) 
query at the database while a serial INSERT script runs in the background 
(see the sketch below). I switch and tune schedulers while all of this is 
running; the setup that makes the COUNT(*) result increase fastest wins.
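
In shell terms the test boils down to something like this (the mysql client, 
database, table and column names are just placeholders for whatever setup is 
in use):

# terminal 1: poll the row count every 0.5s
watch -n 0.5 'mysql -e "SELECT COUNT(*) FROM testtable" testdb'

# terminal 2: serial INSERT load
while true; do
    mysql -e "INSERT INTO testtable (payload) VALUES (1)" testdb
done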

I haven't tested CFQ that thoroughly, but without tuning its performance is 
worse, which is to be expected. Is this scheduler a potential winner when 
tuned correctly?

The weird thing is, when I disconnect the secondary DRBD node the increment 
becomes a few hundred times faster. When the secondary reconnects after a 
few minutes its resync is _very_ fast (a few seconds), but performance drops 
back again right after the reconnect.
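
(For completeness, disconnecting and reconnecting is done with drbdadm; "r0" 
below is a placeholder for the actual resource name:)

# drbdadm disconnect r0
  ... run the test for a few minutes ...
# drbdadm connect r0
# cat /proc/drbd        (to watch the resync progress)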

It seems the sync system call takes a lot longer when DRBD is connected. Maybe 
this test is a little misleading again, but syncing 1M shouldn't take seconds, 
right? This is done on an idle system, by the way.

Disconnected:
# time dd if=/dev/zero of=/mnt/test/tempfile bs=1M count=1; time sync
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00224756 s, 467 MB/s

real    0m0.004s
user    0m0.000s
sys     0m0.008s

real    0m0.005s
user    0m0.000s
sys     0m0.000s

Connected:
# time dd if=/dev/zero of=/mnt/test/tempfile bs=1M count=1; time sync
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00228889 s, 458 MB/s

real    0m0.004s
user    0m0.000s
sys     0m0.004s

real    0m1.859s
user    0m0.000s
sys     0m0.000s

Repeating this test simply gives the same results over and over again. A 2s 
penalty on every sync() call would explain a lot of these problems.
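
For what it's worth, repeating it is just the same dd/sync pair in a loop:

# for i in $(seq 5); do
>   dd if=/dev/zero of=/mnt/test/tempfile bs=1M count=1
>   time sync
> done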

Thanks again for your help.

[1]  
http://www.fishpool.org/post/2008/03/31/Optimizing-Linux-I/O-on-hardware-RAID

-- 
Greetings,
Joris


