[DRBD-user] Slower disk throughput on DRBD partition

Pascal BERTON (EURIALYS) pascal.berton at eurialys.fr
Wed Feb 1 19:00:26 CET 2012

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Frederic,

 

Paravirtual SCSI is supposed to be fully efficient past a given level of
IOPS; below that level of activity, an LSI SAS adapter used to have the
reputation of performing better (that was with v4, I don't have any update
regarding v5, but I suspect it's still the case). However, since both disks
are plugged into the same adapter and you issue the same test command for
both disks, this can't explain what you're seeing. Therefore it seems like
your problem is effectively within your VM, and not "around" it. Let's keep
searching, then :)

 

Best regards,

 

Pascal.

 

From: Frederic DeMarcy [mailto:fred.demarcy.ml at gmail.com]
Sent: Wednesday, February 1, 2012 17:20
To: Pascal BERTON (EURIALYS)
Cc: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Slower disk throughput on DRBD partition

 

Hi Pascal

1) Both vdisks for /dev/sda and /dev/sdb are on the same datastore, which
spans the entire RAID5 array capacity (7 HDs + 1 spare), minus the space used
by the ESXi installation.
2) HD1 (/dev/sda) is SCSI (0:1) and HD2 (/dev/sdb) is SCSI (0:2). Both
initialized with Thick Provisioning Eager Zeroed. The SCSI controller type
is paravirtual.

Fred

On Wed, Feb 1, 2012 at 2:13 PM, Pascal BERTON (EURIALYS)
<pascal.berton at eurialys.fr> wrote:

Frederic,

Let's take care of the virtualisation layer, which might induce significant
side effects. Are sda and sdb:
1) vdisk files located on the same datastore?
2) vdisks plugged into the same virtual SCSI interface? What type of SCSI
interface?

Best regards,

Pascal.

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com
[mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Frederic DeMarcy
Sent: Wednesday, February 1, 2012 13:05
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Slower disk throughput on DRBD partition


Hi

Note 1:
Scientific Linux 6.1 with kernel 2.6.32-220.4.1.el6.x86_64
DRBD 8.4.1 compiled from source

Note 2:
server1 and server2 are two VMware VMs on top of ESXi 5. However, they reside
on different physical 2U servers.
The specs for the 2U servers are identical:
 - HP DL380 G7 (2U)
 - 2 x Six Core Intel Xeon X5680 (3.33GHz)
 - 24GB RAM
 - 8 x 146 GB SAS HDs (7 in RAID5 + 1 spare)
 - Smart Array P410i with 512MB BBWC

Note 3:
I've tested the network throughput with iperf which yields close to 1Gb/s
[root at server1 ~]# iperf -c 192.168.111.11 -f g
------------------------------------------------------------
Client connecting to 192.168.111.11, TCP port 5001
TCP window size: 0.00 GByte (default)
------------------------------------------------------------
[  3] local 192.168.111.10 port 54330 connected with 192.168.111.11 port
5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes  0.94 Gbits/sec

[root at server2 ~]# iperf -s -f g
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 0.00 GByte (default)
------------------------------------------------------------
[  4] local 192.168.111.11 port 5001 connected with 192.168.111.10 port
54330
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.10 GBytes  0.94 Gbits/sec

Scp'ing a large file from server1 to server2 yields ~57 MB/s, but I guess
that's due to the encryption overhead.
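
To rule out the scp cipher as the limiting factor, a plain TCP copy through
netcat should show the raw link speed. Sketch only: port 5002 is arbitrary,
and depending on the netcat variant the listen syntax is either "nc -l 5002"
or "nc -l -p 5002"; dd prints the transfer rate when it finishes.

# On server2 (receiver), discard whatever arrives:
[root at server2 ~]# nc -l 5002 > /dev/null

# On server1 (sender), push 4GB of zeros over plain TCP:
[root at server1 ~]# dd if=/dev/zero bs=1M count=4096 | nc 192.168.111.11 5002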

Note 4:
MySQL was not running.



Base DRBD config:
resource mysql {
 startup {
   wfc-timeout 3;
   degr-wfc-timeout 2;
   outdated-wfc-timeout 1;
 }
 net {
   protocol C;
   verify-alg sha1;
   csums-alg sha1;
   data-integrity-alg sha1;
   cram-hmac-alg sha1;
   shared-secret "MySecret123";
 }
 on server1 {
   device    /dev/drbd0;
   disk      /dev/sdb;
   address   192.168.111.10:7789;
   meta-disk internal;
 }
 on server2 {
   device    /dev/drbd0;
   disk      /dev/sdb;
   address   192.168.111.11:7789;
   meta-disk internal;
 }
}
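
For reference, a net section I could try next, based on the throughput
tuning knobs I've seen in the DRBD documentation (max-buffers,
max-epoch-size, sndbuf-size). The values below are the usual illustrative
examples, not something I've measured, and dropping data-integrity-alg is
only a guess at this point (the docs describe it as a testing aid that
checksums every packet on the wire):

 net {
   protocol C;
   verify-alg sha1;
   csums-alg sha1;
   # data-integrity-alg sha1;  # per-packet checksumming; CPU cost on every write
   cram-hmac-alg sha1;
   shared-secret "MySecret123";
   max-buffers 8000;           # allow more requests in flight towards the peer
   max-epoch-size 8000;
   sndbuf-size 512k;           # larger TCP send buffer for the replication link
 }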


After any change in the /etc/drbd.d/mysql.res file I issued a "drbdadm
adjust mysql" on both nodes.

Test #1
DRBD partition on primary (secondary node disabled)
Using Base DRBD config
# dd if=/dev/zero of=/var/lib/mysql/TMP/disk-test.xxx bs=1M count=4096
oflag=direct
Throughput ~ 420MB/s

Test #2
DRBD partition on primary (secondary node enabled)
Using Base DRBD config
# dd if=/dev/zero of=/var/lib/mysql/TMP/disk-test.xxx bs=1M count=4096
oflag=direct
Throughput ~ 61MB/s
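
Something I haven't done yet: as a cross-check that server2's local storage
isn't the bottleneck once replication is on, the same dd could be run on
server2 against a scratch file on its sda-backed root filesystem (the path
below is just an example):

[root at server2 ~]# dd if=/dev/zero of=/root/disk-test.xxx bs=1M count=4096 oflag=direct
[root at server2 ~]# rm -f /root/disk-test.xxx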

Test #3
DRBD partition on primary (secondary node enabled)
Using Base DRBD config with:
 Protocol B;
# dd if=/dev/zero of=/var/lib/mysql/TMP/disk-test.xxx bs=1M count=4096
oflag=direct
Throughput ~ 68MB/s

Test #4
DRBD partition on primary (secondary node enabled)
Using Base DRBD config with:
 Protocol A;
# dd if=/dev/zero of=/var/lib/mysql/TMP/disk-test.xxx bs=1M count=4096
oflag=direct
Throughput ~ 94MB/s

Test #5
DRBD partition on primary (secondary node enabled)
Using Base DRBD config with:
 disk {
   disk-barrier no;
   disk-flushes no;
   md-flushes no;
 }
# dd if=/dev/zero of=/var/lib/mysql/TMP/disk-test.xxx bs=1M count=4096
oflag=direct
Disk throughput ~ 62MB/s

No real difference from Test #2. Also, cat /proc/drbd still shows wo:b in
both cases, so I'm not even sure these disk {...} parameters have been taken
into account...

Test #6
DRBD partition on primary (secondary node enabled)
Using Base DRBD config with:
 Protocol B;
 disk {
   disk-barrier no;
   disk-flushes no;
   md-flushes no;
 }
# dd if=/dev/zero of=/var/lib/mysql/TMP/disk-test.xxx bs=1M count=4096
oflag=direct
Disk throughput ~ 68MB/s

No real difference from Test #3. Also, cat /proc/drbd still shows wo:b in
both cases, so I'm not even sure these disk {...} parameters have been taken
into account...
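
In case "drbdadm adjust" doesn't re-apply the disk section on a live resource
(I'm not sure whether it does), a full restart of the resource might be worth
a try before drawing conclusions. Sketch; down on both nodes first, then up
on both, then promote:

# On both nodes:
[root at server1 ~]# drbdadm down mysql
[root at server2 ~]# drbdadm down mysql

# Bring the resource back up on both nodes:
[root at server1 ~]# drbdadm up mysql
[root at server2 ~]# drbdadm up mysql

# Promote server1 again and re-check the write-ordering flag:
[root at server1 ~]# drbdadm primary mysql
[root at server1 ~]# cat /proc/drbd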


What else can I try?
Is it worth trying DRBD 8.3.x?

Thx.

Fred






On 1 Feb 2012, at 08:35, James Harper wrote:

>> Hi
>>
>> I've configured DRBD with a view to use it with MySQL (and later on
>> Pacemaker + Corosync) in a 2 nodes primary/secondary
>> (master/slave) setup.
>>
>> ...
>>
>> No replication over the 1Gb/s crossover cable is taking place since the
>> secondary node is down yet there's x2 lower disk performance.
>>
>> I've tried to add:
>>  disk {
>>    disk-barrier no;
>>    disk-flushes no;
>>    md-flushes no;
>>  }
>> to the config but it didn't seem to change anything.
>>
>> Am I missing something here?
>> On another note is 8.4.1 the right version to use?
>>
>
> If you can do it just for testing, try changing to protocol B with one
> primary and one secondary and see how that impacts your performance, both
> with barrier/flushes on and off. I'm not sure if it will help but if
> protocol B makes things faster then it might hint as to where to start
> looking...
>
> James

_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user

 
