Thanks a lot, that solved the problem.

Mit freundlichen Grüßen / Best regards,

Robert Köppl
System Administration


Lars Ellenberg <lars.ellenberg@linbit.com>
Sent by: drbd-user-bounces@lists.linbit.com
Date: 17.07.2009 11:11
To: drbd-user@lists.linbit.com
Subject: Re: [DRBD-user] performance issues DRBD8.3.1 on Serveraid 8k

On Fri, Jul 17, 2009 at 10:23:15AM +0200, Robert.Koeppl@knapp.com wrote:
> Good morning!

Hey there.

Solution below ;)

> I am experiencing some troubling performance issues on one of my clusters.
> Hardware:
> IBM 3650, 16 GB RAM, 2x quad-core Xeon 5450 @ 3 GHz
> ServeRAID 8k. 6x 2,5" SAS 136 GB 10k RPM hard disks dedicated to DRBD as
> RAID 10
> 256 MB cache, read-ahead and write cache activated. Stripe size 256 KB
> Interlink over two Intel 1 Gbit optical NICs, bonded in mode 1
>
> OS:
> SLES 10 SP2, 64 bit, kernel 2.6.16.60-33-smp x86_64
> DRBD 8.3.1 compiled from source on these machines.
>
> Oracle 10.2.0.4 running 2 different SIDs at the same time
>
> There are 17 DRBD devices running on top of LVM; the LVM resides on
> /dev/sdb, which is the RAID 10 array mentioned above.
>
> The large number of devices results from the wish of our DBA to have each
> folder on a different filesystem and synced independently. Although this
> is far from optimal from a performance point of view, it is fast enough on
> our other systems that have similar setups.
>
> As long as DRBD is running standalone or waiting for connection, the
> system runs fine.
> iostat -x of the underlying device gives the following:
>
> Linux 2.6.16.60-0.33-smp (k1327kc1)     16.07.2009
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,76    0,00    0,20    0,64    0,00   98,40
>
> Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz    await  svctm  %util
> sdb       38,13   37,31  17,20  16,22  985,46  1124,75     63,13      0,65    19,34   3,00  10,04
>
> iostat -x of the drbd devices gives
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,12    0,00    0,06    0,00    0,00   99,81

> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,37    0,00    0,12    0,00    0,00   99,50

> Sometimes peak values are a bit higher, but well within reasonable
> boundaries, which means await somewhere up to 30 or 40 ms.
>
> If DRBD is connected, this changes dramatically:
>
> This is the master side:
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,12    0,00    0,12   24,81    0,00   74,94

> drbd5      0,00    0,00   9,50   5,00  112,00    64,00     12,14      3,62   356,69   66,62  96,60
> drbd6      0,00    0,00   0,50   3,00   16,00    48,00     18,29      3,22  1417,71  217,71  76,20
> drbd7      0,00    0,00   0,50   3,00   16,00    48,00     18,29      3,50  1497,71  225,71  79,00
> drbd8      0,00    0,00   0,00   1,50    0,00     6,00      4,00      0,72   482,67  381,33  57,20
> drbd9      0,00    0,00   0,00   1,50    0,00     6,00      4,00      0,78   520,00  400,00  60,00
> drbd15     0,00    0,00   0,00   0,50    0,00     4,00      8,00      2,34  7988,00 1256,00  62,80
> drbd16     0,00    0,00   0,00   1,50    0,00    12,00      8,00      1,18   900,00  610,67  91,60

> This is on the slave node:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,19    0,00    0,00    0,13    0,00   99,69
>
> Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
> sdb        0,00   13,00   0,00  14,00    0,00   272,50     19,46      5,49  483,14  71,29  99,80
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1,19    0,00    1,56    5,87    0,00   91,38
>
> Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
> sdb        0,00    6,00   0,00  10,50    0,00   148,00     14,10      4,94  343,81  92,19  96,80
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0,37    0,00    4,80    4,86    0,00   89,96
>
> Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
> sdb        0,00   40,80   0,00  14,43    0,00   514,93     35,69      3,10  307,31  54,07  78,01
>
>
> This renders the system completely useless.
>
> Here is the drbd.conf:
>
> global { usage-count no; }
> resource r0 {
>     handlers {
>         outdate-peer "/usr/lib64/heartbeat/drbd-peer-outdater";
>         pri-on-incon-degr "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>     }
>     protocol C;
>
>     startup {
>         wfc-timeout 0; degr-wfc-timeout 120;  # 2 minutes.
>     }
>
>     disk {

add here:

        no-disk-barrier;

>         no-disk-flushes;
>         no-md-flushes;
>         fencing resource-only;
>         on-io-error detach;
>     }
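
For reference, a sketch of the resulting disk section with that option added.
Note that disabling barriers and flushes like this is generally only advisable
when the controller's write cache is battery-backed; that this holds for the
ServeRAID 8k setup described above is an assumption, not something stated in
the original report:

    disk {
        no-disk-barrier;        # do not use write barriers on the backing device
        no-disk-flushes;        # do not force cache flushes for data writes
        no-md-flushes;          # do not force cache flushes for meta-data writes
        fencing resource-only;
        on-io-error detach;
    }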

btw, you may want to simplify your drbd.conf file
by using the "common {}" section.

see also, e.g.:
http://thread.gmane.org/gmane.linux.network.drbd/17545/focus=17585

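A rough sketch of how such a drbd.conf could be laid out for this setup, with
the settings shared by all 17 resources pulled into common {} and only the
per-resource bits left in each resource section. The LV names, minor numbers,
peer hostname and IP addresses below are made-up placeholders, not taken from
the actual config:

    global { usage-count no; }

    common {
        protocol C;
        startup { wfc-timeout 0; degr-wfc-timeout 120; }
        disk {
            no-disk-barrier;
            no-disk-flushes;
            no-md-flushes;
            fencing resource-only;
            on-io-error detach;
        }
        # handlers, net and syncer settings shared by all resources
        # can live here as well
    }

    resource r0 {
        # only what differs per resource remains here
        on k1327kc1 {
            device    /dev/drbd0;
            disk      /dev/vg_oracle/lv_r0;   # placeholder LV name
            address   10.0.0.1:7788;          # placeholder address
            meta-disk internal;
        }
        on peer-node {                        # placeholder peer hostname
            device    /dev/drbd0;
            disk      /dev/vg_oracle/lv_r0;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }
    # r1 through r16 analogous, each with its own LV, minor and port
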
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user