[DRBD-user] Re: performance issues DRBD8.3.1 on Serveraid 8k

Robert.Koeppl at knapp.com
Fri Jul 17 13:38:44 CEST 2009

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Thanks a lot, that solved the problem.
Best regards,
 
Robert Köppl
System Administration




Lars Ellenberg <lars.ellenberg at linbit.com>
Sent by: drbd-user-bounces at lists.linbit.com
17.07.2009 11:11

To: drbd-user at lists.linbit.com
Cc:
Subject: Re: [DRBD-user] performance issues DRBD8.3.1 on Serveraid 8k

On Fri, Jul 17, 2009 at 10:23:15AM +0200, Robert.Koeppl at knapp.com wrote:
> Good morning!

Hey there.

Solution below ;)

> I am experiencing some troubling performance issues on one of my 
> clusters.
> Hardware: 
> IBM 3650, 16GB RAM, 2x quad-core Xeon 5450 at 3GHz
> ServeRAID 8k, 6x 2.5" SAS 136GB 10kRPM hard disks dedicated to DRBD as RAID 10
> 256MB cache, read-ahead and write cache activated. Stripe size 256 KB
> Interlink over two Intel 1Gbit optical NICs, bonded in mode 1
> 
> OS:
> SLES 10 SP2, 64 bit, Kernel 2.6.16.60-33-smp x86_64
> DRBD 8.3.1 compiled from source on these machines.
> 
> Oracle 10.2.0.4 running 2 different SIDs at the same time
> 
> There are 17 DRBD devices running on top of LVM; the LVM resides on 
> /dev/sdb, which is the RAID 10 array mentioned above.
> 
> The large number of devices results from the wish of our DBA to have each 
> folder on a different filesystem and synced independently. Although this 
> is far from optimal from a performance point of view, it is fast enough on 
> our other systems that have similar setups.
> 
> As long as DRBD is running standalone or waiting for connection, the system 
> runs fine. 
> iostat -x of the underlying device gives the following:
> 
> Linux 2.6.16.60-0.33-smp (k1327kc1)     16.07.2009
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,76    0,00    0,20    0,64    0,00   98,40
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdb              38,13    37,31   17,20   16,22   985,46  1124,75    63,13     0,65   19,34   3,00  10,04
> 
> iostat -x of the drbd devices gives
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,12    0,00    0,06    0,00    0,00   99,81

> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,37    0,00    0,12    0,00    0,00   99,50

> sometimes peak values are a bit higher, but well within reasonable 
> boundaries, which means await somewhere up to 30 or 40 ms
> 
> if DRBD is connected this changes dramatically:
> 
> this is the master side:
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,12    0,00    0,12   24,81    0,00   74,94

> drbd5             0,00     0,00    9,50    5,00   112,00    64,00    12,14     3,62  356,69  66,62  96,60
> drbd6             0,00     0,00    0,50    3,00    16,00    48,00    18,29     3,22 1417,71 217,71  76,20
> drbd7             0,00     0,00    0,50    3,00    16,00    48,00    18,29     3,50 1497,71 225,71  79,00
> drbd8             0,00     0,00    0,00    1,50     0,00     6,00     4,00     0,72  482,67 381,33  57,20
> drbd9             0,00     0,00    0,00    1,50     0,00     6,00     4,00     0,78  520,00 400,00  60,00
> drbd15            0,00     0,00    0,00    0,50     0,00     4,00     8,00     2,34 7988,00 1256,00  62,80
> drbd16            0,00     0,00    0,00    1,50     0,00    12,00     8,00     1,18  900,00 610,67  91,60

> this is on the slave node:
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,19    0,00    0,00    0,13    0,00   99,69
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdb               0,00    13,00    0,00   14,00     0,00   272,50    19,46     5,49  483,14  71,29  99,80
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1,19    0,00    1,56    5,87    0,00   91,38
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdb               0,00     6,00    0,00   10,50     0,00   148,00    14,10     4,94  343,81  92,19  96,80
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,37    0,00    4,80    4,86    0,00   89,96
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
> sdb               0,00    40,80    0,00   14,43     0,00   514,93    35,69     3,10  307,31  54,07  78,01
> 
> 
> This renders the system completely useless.
> 
> Here is the drbd.conf:
> 
> global {usage-count no;}
> resource r0 {
> handlers {
>         outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
>         pri-on-incon-degr "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>     }
>   protocol C;
> 
>   startup {
>     wfc-timeout 0; degr-wfc-timeout 120;    # 2 minutes.
>   }
> 
>   disk {

add here:

                 no-disk-barrier;

>         no-disk-flushes;
>         no-md-flushes;
>         fencing resource-only;
>     on-io-error   detach;
>   }
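
Assembled with that addition, the disk section would read roughly like
this (just a sketch put together from the options quoted above):

  disk {
        no-disk-barrier;
        no-disk-flushes;
        no-md-flushes;
        fencing resource-only;
        on-io-error   detach;
  }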

btw, you may want to simplify your drbd.conf file
by using the "common {}" section.

see also e.g.:
http://thread.gmane.org/gmane.linux.network.drbd/17545/focus=17585
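
A minimal sketch of what that could look like for your config (the common
section is inherited by every resource, so the 17 per-resource sections
then only need to keep their device, disk, address and meta-disk
statements):

common {
  protocol C;

  startup {
    wfc-timeout 0; degr-wfc-timeout 120;
  }

  disk {
    no-disk-barrier;
    no-disk-flushes;
    no-md-flushes;
    fencing resource-only;
    on-io-error detach;
  }

  handlers {
    outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
  }
}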

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
