[DRBD-user] Secondary node saturises RAID array

Thu Apr 10 17:32:42 CEST 2008

Joris,

On Thursday 10 April 2008 17:22:36 Joris van Rooij wrote:
> Here are the results using the suggestions (al-extents is 3389):
>
> Local filesystem:
> # for try in 1 2 3; do echo $try; dd if=/dev/zero of=/tmp/testfile bs=1G count=1 oflag=dsync; sleep 10; done
> 1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 7.28003 s, 147 MB/s
> 2
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 5.68388 s, 189 MB/s
> 3
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 7.28915 s, 147 MB/s
>
> DRBD disconnected:
> # for try in 1 2 3; do echo $try; dd if=/dev/zero of=/mnt/test/testfile bs=1G count=1 oflag=dsync; sleep 10; done
> 1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 7.00449 s, 153 MB/s
> 2
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 6.10559 s, 176 MB/s
> 3
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 7.6466 s, 140 MB/s
>
> DRBD connected:
> # for try in 1 2 3; do echo $try; dd if=/dev/zero of=/mnt/test/testfile bs=1G count=1 oflag=dsync; sleep 10; done
> 1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 71.8806 s, 14.9 MB/s
> 2
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 70.4868 s, 15.2 MB/s
> 3
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 25.8466 s, 41.5 MB/s
>
> Because of that last peak I ran it again:
> # for try in 1 2 3; do echo $try; dd if=/dev/zero of=/mnt/test/testfile bs=1G count=1 oflag=dsync; sleep 10; done
> 1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 24.642 s, 43.6 MB/s
> 2
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 25.9565 s, 41.4 MB/s
> 3
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB) copied, 23.8683 s, 45.0 MB/s
>
> So it needs some time to get going. I've set the unplug-watermark to 16,
> which had no significant effect. Increasing it in regular intervals all the
> way up to 16000 didn't change much. Neither did tasksetting drbdN_*
> processes. I have to reboot the entire machine in order to get the slow
> startup result back. I guess it's safe to say it's the Sun STK RAID device
> doing its black voodoo magic.

No, it's DRBD doing its grey voodoo magic. :-) You simply witnessed the 
effects of "cold" vs. "hot" activity log.
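
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes that one activity log extent covers 4 MiB of the backing device and
that the test file is laid out contiguously; the variable names are only
illustrative. The al-extents value and the write size are taken from the
quoted test.

```shell
# Assumption: one activity-log extent covers 4 MiB of the backing device.
al_extent_mib=4
al_extents=3389      # al-extents value reported in the quoted message
write_mib=1024       # one dd run with bs=1G count=1

# Size of the area that can be "hot" (active in the log) at the same time.
hot_area_mib=$((al_extents * al_extent_mib))

# Extents a single contiguous 1 GiB write has to activate when starting cold.
extents_touched=$((write_mib / al_extent_mib))

echo "hot area: ${hot_area_mib} MiB"
echo "extents touched by one run: ${extents_touched}"
```

On a cold log, each of those extents needs a meta data update before writes
to it can proceed. Once they are hot, rewriting the same file no longer pays
that cost, which would fit the jump from ~15 MB/s to ~42 MB/s.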

> The throughput is quite reasonable now. But the problem still exists, I'm
> afraid. MySQL has its InnoDB storage on top of one of the DRBD devices.
> SELECT statements are fast as f*ck, but INSERTs start locking when executed
> in fast salvos. Simple serial INSERT queries take ~0.2 seconds apiece on
> an idle system. Again, the secondary node gets saturated quickly. I've
> already mounted the partition using the noatime flag. This caused the speed
> to increase, but not as dramatically as I hoped it would.
>
> Where should I look for the solution? Is it DRBD or MySQL/InnoDB's locking
> behaviour? It's not so much the throughput but some latency that's causing
> the slowdown.
>
> It's not the network, that's for sure.. I think.

May I guess you are using the CFQ I/O scheduler?
What does /sys/block/sda/queue/scheduler say?
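
A minimal sketch for checking this, assuming sda is the device backing DRBD
(substitute your own). The kernel lists the available schedulers in that file
and puts the active one in square brackets; the helper name active_scheduler
is made up for this example.

```shell
# Print the active I/O scheduler, i.e. the name shown in square brackets.
# Reads the contents of a queue/scheduler file on stdin.
active_scheduler() {
    # Example input: "noop anticipatory deadline [cfq]"
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

dev=sda   # assumption: the device backing DRBD; adjust as needed
if [ -r "/sys/block/$dev/queue/scheduler" ]; then
    active_scheduler < "/sys/block/$dev/queue/scheduler"
fi

# To try a different scheduler at runtime (as root), write its name back:
#   echo deadline > /sys/block/sda/queue/scheduler
```

Whether switching actually helps in this setup is something to measure, not
a given. The change does not survive a reboot unless it is also set at boot
time.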

Cheers,
Florian

-- 
: Florian G. Haas
: LINBIT Information Technologies GmbH
: Vivenotgasse 48, A-1120 Vienna, Austria

When replying, there is no need to CC my personal address.
I monitor the list on a daily basis. Thank you.