Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Fri, Jul 17, 2009 at 10:23:15AM +0200, Robert.Koeppl at knapp.com wrote:
> Good morning!
Hey there.
Solution below ;)
> I am experiencing some troubling performance issues on one of my clusters.
> Hardware:
> IBM 3650 16GB RAM, 2xQuadcore Xeon 5450 at 3GHz
> ServeRaid 8k. 6x 2.5" SAS 136GB 10kRPM hard disks dedicated to DRBD as
> RAID 10
> 256MB cache, readahead and write cache activated. Stripe size 256 KB
> Interlink over two intel 1Gbit optical NICs, bonded in mode 1
>
> OS:
> SLES 10 SP2, 64 bit, Kernel 2.6.16.60-33-smp x86_64
> DRBD 8.3.1 compiled from source on those machines.
>
> Oracle 10.2.0.4 running 2 different SIDs at the same time
>
> There are 17 DRBD devices running on top of LVM; the LVM resides on
> /dev/sdb, which is the RAID10 array mentioned above.
>
> The large number of devices results from the wish of our DBA to have each
> folder on a different filesystem and synced independently. Although this
> is far from optimal from a performance view, it is fast enough on our other
> systems that have similar setups.
>
> As long as DRBD is running standalone or waiting for connection, the system
> runs fine.
> iostat -x of the underlying device gives the following:
>
> Linux 2.6.16.60-0.33-smp (k1327kc1) 16.07.2009
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,76 0,00 0,20 0,64 0,00 98,40
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
> avgqu-sz await svctm %util
> sdb 38,13 37,31 17,20 16,22 985,46 1124,75 63,13
> 0,65 19,34 3,00 10,04
>
> iostat -x of the drbd devices gives
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,12 0,00 0,06 0,00 0,00 99,81
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,37 0,00 0,12 0,00 0,00 99,50
> Sometimes peak values are a bit higher, but well within reasonable
> boundaries, which means await somewhere up to 30 or 40 ms.
>
> If DRBD is connected, this changes dramatically:
>
> This is the master side:
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,12 0,00 0,12 24,81 0,00 74,94
> drbd5 0,00 0,00 9,50 5,00 112,00 64,00 12,14
> 3,62 356,69 66,62 96,60
> drbd6 0,00 0,00 0,50 3,00 16,00 48,00 18,29
> 3,22 1417,71 217,71 76,20
> drbd7 0,00 0,00 0,50 3,00 16,00 48,00 18,29
> 3,50 1497,71 225,71 79,00
> drbd8 0,00 0,00 0,00 1,50 0,00 6,00 4,00
> 0,72 482,67 381,33 57,20
> drbd9 0,00 0,00 0,00 1,50 0,00 6,00 4,00
> 0,78 520,00 400,00 60,00
> drbd15 0,00 0,00 0,00 0,50 0,00 4,00 8,00
> 2,34 7988,00 1256,00 62,80
> drbd16 0,00 0,00 0,00 1,50 0,00 12,00 8,00
> 1,18 900,00 610,67 91,60
> This is on the slave node:
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,19 0,00 0,00 0,13 0,00 99,69
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
> avgqu-sz await svctm %util
> sdb 0,00 13,00 0,00 14,00 0,00 272,50 19,46
> 5,49 483,14 71,29 99,80
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 1,19 0,00 1,56 5,87 0,00 91,38
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
> avgqu-sz await svctm %util
> sdb 0,00 6,00 0,00 10,50 0,00 148,00 14,10
> 4,94 343,81 92,19 96,80
>
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,37 0,00 4,80 4,86 0,00 89,96
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
> avgqu-sz await svctm %util
> sdb 0,00 40,80 0,00 14,43 0,00 514,93 35,69
> 3,10 307,31 54,07 78,01
>
>
> This renders the system completely useless.
>
> Here is the drbd.conf:
>
> global {usage-count no;}
> resource r0 {
> handlers {
> outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
> pri-on-incon-degr "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
> }
> protocol C;
>
> startup {
> wfc-timeout 0; degr-wfc-timeout 120; # 2 minutes.
> }
>
> disk {
add here:
no-disk-barrier;
> no-disk-flushes;
> no-md-flushes;
> fencing resource-only;
> on-io-error detach;
> }
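For clarity, here is how that disk section would read with the suggested
no-disk-barrier added (same options as in your conf, DRBD 8.3 syntax):

```
disk {
    no-disk-barrier;     # the added option: skip write barriers
    no-disk-flushes;
    no-md-flushes;
    fencing resource-only;
    on-io-error detach;
}
```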
btw, you may want to simplify your drbd.conf file
by using the "common {}" section.
see also e.g.:
http://thread.gmane.org/gmane.linux.network.drbd/17545/focus=17585
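A rough sketch of what that could look like, based on the options shown in
your conf (the per-resource specifics below are placeholders, not your
actual device/address settings):

```
global { usage-count no; }

common {
  protocol C;
  handlers {
    outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
  }
  startup { wfc-timeout 0; degr-wfc-timeout 120; }
  disk {
    no-disk-barrier;
    no-disk-flushes;
    no-md-flushes;
    fencing resource-only;
    on-io-error detach;
  }
}

resource r0 {
  # only the per-resource specifics remain here:
  # device, disk, address, meta-disk for each host
}
```

With 17 resources, that saves you from repeating the same handler, startup,
and disk options 17 times.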
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed