[DRBD-user] Weird DRBD performance problem

Roof, Morey R. MRoof at admin.nmt.edu
Sat Feb 4 10:14:21 CET 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi Lars and Everyone else,
 
I put the no-disk-drain in hoping it might help me find the problem; I leave it out in production.
 
The run with the various dd's with different commands was interesting:
 
san1:/dev # dd if=/dev/zero of=/dev/sdb2 bs=1M count=32768 oflag=direct
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 38.9945 s, 881 MB/s
san1:/dev # dd if=/dev/zero of=/dev/sdb2 bs=1M count=32768 oflag=dsync
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 88.8506 s, 387 MB/s
san1:/dev # dd if=/dev/zero of=/dev/sdb2 bs=1M count=32768 conv=fsync
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 72.9384 s, 471 MB/s
 
Not sure why direct would be so much faster than the others.  Something to look into.  Anyone have some thoughts on that one?
 
Since I'm trying to figure out what is going on, I decided to put the DRBD metadata on a RAM disk and see what happens.  I did two sets of runs in the same format as above.
 
Here are the runs with the metadata on the regular RAID system:
 
san1:/dev/drbd/by-res # dd if=/dev/zero of=r0 bs=1M count=32768 oflag=direct
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 53.1025 s, 647 MB/s
san1:/dev/drbd/by-res # dd if=/dev/zero of=r0 bs=1M count=32768 oflag=dsync
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 147.199 s, 233 MB/s
san1:/dev/drbd/by-res # dd if=/dev/zero of=r0 bs=1M count=32768 conv=fsync
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 156.27 s, 220 MB/s
 
Here are the runs with the metadata on the RAM disk:
 
san1:/dev/drbd/by-res # dd if=/dev/zero of=r0 bs=1M count=32768 oflag=direct
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 53.6152 s, 641 MB/s
san1:/dev/drbd/by-res # dd if=/dev/zero of=r0 bs=1M count=32768 oflag=dsync
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 129.425 s, 265 MB/s
san1:/dev/drbd/by-res # dd if=/dev/zero of=r0 bs=1M count=32768 conv=fsync
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 148.076 s, 232 MB/s
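
In case anyone wants to reproduce the RAM disk metadata test, something like the
following should do it (a sketch, not my exact commands; it assumes the brd module
and that the resource can be taken down for a moment):

san1:~ # modprobe brd rd_nr=1 rd_size=1048576    # one 1 GiB ram disk -> /dev/ram0
san1:~ # drbdadm down r0
   (point flexible-meta-disk at /dev/ram0 in drbd.conf)
san1:~ # drbdadm create-md r0                    # fresh external metadata on the ram disk
san1:~ # drbdadm up r0
san1:~ # drbdadm primary r0                      # may need the force/overwrite option,
                                                 # since the new metadata starts out Inconsistent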
 
I ran the tests about 10 times each and they came out to about the same numbers (+/- 15 MB/s).  Basically, whether the metadata sits on the disk or on something extremely fast makes no difference in this setup.  The odd thing is that the DRBD overhead is just really high on this system for some reason, and I can't find the answer.
 
The firmware between the RAID controllers isn't the same.  The big difference is that these servers use the Dell H700 card while the other servers use the LSI MegaRAID.  Both cards are really made by LSI, but Dell ships its own firmware; they all use the same driver.
 
I also don't understand why the transactions per second, as seen by iostat, are so high on the drbd device, yet by the time the I/O reaches the backing store the transaction count drops.  Also, when I run dd directly against the backing store, its transactions per second are not high and look just like they do for the backing device during the drbd test run.  For some reason only the drbd device shows a huge number of transactions.
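
For reference, the extended iostat output should show whether the requests really are
tiny at the drbd layer and only get merged further down (a sketch; avgrq-sz is the
average request size in sectors, wrqm/s the write requests merged per second):

san1:~ # iostat -x -m 1

and then compare the sdb and drbd0 lines while the dd is running.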
 
Anyone have some more thoughts about where the overhead hit might be coming from?
 
thanks
 
 

________________________________

From: drbd-user-bounces at lists.linbit.com on behalf of Lars Ellenberg
Sent: Thu 2/2/2012 7:11 AM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Weird DRBD performance problem



On Wed, Feb 01, 2012 at 06:04:18PM -0700, Roof, Morey R. wrote:
> Hi Everyone,
> 
> I have a DRBD performance problem that has got me completely confused.
> I'm hoping that someone can help with this one, as my other servers that
> use the same type of RAID cards and DRBD don't have this problem.
> 
> For the hardware, I have two Dell R515 servers with the H700 card,
> basically an LSI Megaraid based card, and running SLES 11 SP1.  This
> problem shows up on drbd 8.3.11, 8.3.12, and 8.4.1 but I haven't
> tested other versions.
> 
> here is the simple config I made based on the servers that don't have
> any issues:
> 
> global {
>         # We don't want to be bothered by the usage count numbers
>         usage-count no;
> }
> common {
>         protocol C;
>         net {
>                 cram-hmac-alg           md5;
>                 shared-secret           "P4ss";
>         }
> }
> resource r0 {
>         on san1 {
>                 device                  /dev/drbd0;
>                 disk                    /dev/disk/by-id/scsi-36782bcb0698b6300167badae13f2884d-part2;
>                 address                 10.60.60.1:63000;
>                 flexible-meta-disk      /dev/disk/by-id/scsi-36782bcb0698b6300167badae13f2884d-part1;
>         }
>         on san2 {
>                 device                  /dev/drbd0;
>                 disk                    /dev/disk/by-id/scsi-36782bcb0698b6e00167bb1d107a77a47-part2;
>                 address                 10.60.60.2:63000;
>                 flexible-meta-disk      /dev/disk/by-id/scsi-36782bcb0698b6e00167bb1d107a77a47-part1;
>         }
>         startup {
>                 wfc-timeout             5;
>         }
>         syncer {
>                 rate                    50M;
>                 cpu-mask                4;
>         }
>         disk {
>                 on-io-error             detach;
>                 no-disk-barrier;
>                 no-disk-flushes;
>                 no-disk-drain;


Will people please STOP using no-disk-drain.  On most hardware, it does
not provide measurable performance gain, but may risk data integrity
because of potential violation of write-after-write dependencies!
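
(Dropping just that line, the disk section from the posted config becomes:

        disk {
                on-io-error             detach;
                no-disk-barrier;
                no-disk-flushes;
                no-md-flushes;
        }

and note that even the remaining no-*-barrier/flushes options are only safe with a
healthy battery- or flash-backed controller write cache.)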

>                 no-md-flushes;
>         }
> }
> 
> version: 8.3.11 (api:88/proto:86-96)
> GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by phil at fat-tyre, 2011-06-29 11:37:11
>  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----s
>     ns:0 nr:0 dw:8501248 dr:551 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:n oos:3397375600
> 
> So, when I'm running just with one server and no replication the performance hit with DRBD is huge.  The backing device shows a throughput of:
> ----
> san1:~ # dd if=/dev/zero of=/dev/disk/by-id/scsi-36782bcb0698b6300167badae13f2884d-part2 bs=1M count=16384

Hope you are not writing to the page cache only?
Add oflag=direct, or oflag=dsync, or conv=fsync, or combinations thereof.
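
For example, against whichever device you are measuring:

  dd if=/dev/zero of=/dev/drbd0 bs=1M count=16384 oflag=direct
  dd if=/dev/zero of=/dev/drbd0 bs=1M count=16384 oflag=dsync
  dd if=/dev/zero of=/dev/drbd0 bs=1M count=16384 conv=fsync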

> san1:~ # dd if=/dev/zero of=/dev/drbd/by-res/r0 bs=1M count=16384
> 16384+0 records in
> 16384+0 records out
> 17179869184 bytes (17 GB) copied, 93.457 s, 184 MB/s

See if moving the drbd meta data to raid 1 helps.
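
(Moving it is just a matter of pointing the external metadata at a different block
device and recreating it; /dev/md0 below is only a placeholder for whatever small
RAID 1 volume is available:

        flexible-meta-disk      /dev/md0;

then drbdadm create-md r0 and a detach/attach, or down/up, of the resource.)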

> -------
> 
> using iostat I see part of the problem:
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.08    0.00   16.76    0.00    0.00   83.17
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sda               0.00         0.00         0.00          0          0
> sdb           20565.00         0.00       360.00          0        719
> drbd0         737449.50         0.00       360.08          0        720
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.07    0.00   28.87    1.37    0.00   69.69
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sda               1.50         0.00         0.01          0          0
> sdb           57859.50         0.00       177.22          0        354
> drbd0         362787.00         0.00       177.14          0        354
> 
> the drbd device is showing a TPS about 10x - 20x of the backing store.
> When I do this on my other servers I don't see anything like it.  The
> working servers are also running the same kernel and drbd versions.

The rest of the IO stack is the same as well, including driver,
firmware, settings, health of controller cache battery?
Not implying anything, that's just something to check...
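
(On these LSI-based controllers something along the following lines shows the cache
policy and battery state; the exact tool name and path depend on how it is packaged:

  MegaCli64 -LDGetProp -Cache -LAll -aAll      # cache policy per logical drive
  MegaCli64 -AdpBbuCmd -GetBbuStatus -aAll     # BBU state, charge, learn cycle

On the Dell H700 the same information should also be available through OMSA,
e.g. omreport storage vdisk and omreport storage battery.)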

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

