[DRBD-user] Slower disk throughput on DRBD partition

Fri Feb 3 16:08:32 CET 2012

Hi Lars

Thanks for your answer.

I reran a batch of tests using DRBD 8.3.12 and DRBD 8.4.1.

- Scientific Linux 64 bits with Kernel 2.6.32-220.4.1.el6.x86_64
- DRBD 8.3.12 and 8.4.1 recompiled from source using
  ./configure
  make rpm
  make km-rpm
- Using the XFS filesystem (mkfs.xfs /dev/drbd0)
- 1Gb/s crossover link between server1 and server2
- Both server1 and server2 are configured the same way with 2 disks /dev/sda (all usual linux partitions) and /dev/sdb
  used only for DRBD
- Both servers are VMs on top of VMware ESXi 5. /dev/sda and /dev/sdb are carved out of the same datastore. They use
  the same paravirtual SCSI adaptor.

DRBD 8.3.12
===========
/etc/drbd.d/mysql.res
resource mysql {
  startup {
    wfc-timeout 3;
    degr-wfc-timeout 2;
    outdated-wfc-timeout 1;
  }
  syncer {
    verify-alg sha1;
    #Only interesting for WAN setups where you need to sync via a rather thin connection.
    #csums-alg sha1;
    rate 33M;
    al-extents 3389;
  }
  net {
    #Only if you have to assume there is buggy hardware on the way between your nodes ... like NICs pretending csums are ok while they are not!
    #data-integrity-alg sha1;
    cram-hmac-alg sha1;
    shared-secret "MySecret123";
    max-buffers 8000;
    max-epoch-size 8000;
    unplug-watermark 16;
    sndbuf-size 512k;
    sndbuf-size 0;
  }
  disk {
    no-disk-barrier;
    no-disk-flushes;
    no-md-flushes;
  }
  on server1 {
    device    /dev/drbd0;
    disk      /dev/sdb;
    address   192.168.111.10:7789;
    meta-disk internal;
  }
  on server2 {
    device    /dev/drbd0;
    disk      /dev/sdb;
    address   192.168.111.11:7789;
    meta-disk internal;
  }
}

DRBD 8.4.1
==========
/etc/drbd.d/mysql.res
resource mysql {
  startup {
    wfc-timeout 3;
    degr-wfc-timeout 2;
    outdated-wfc-timeout 1;
  }
  net {
    protocol C;
    verify-alg sha1;
    #Only interesting for WAN setups where you need to sync via a rather thin connection.
    #csums-alg sha1;
    #Only if you have to assume there is buggy hardware on the way between your nodes ... like NICs pretending csums are ok while they are not!
    #data-integrity-alg sha1;
    cram-hmac-alg sha1;
    shared-secret "MySecret123";
  }
  disk {
    resync-rate 33M;
    no-disk-barrier;
    no-disk-flushes;
    no-md-flushes;
  }
  on server1 {
    device    /dev/drbd0;
    disk      /dev/sdb;
    address   192.168.111.10:7789;
    meta-disk internal;
  }
  on server2 {
    device    /dev/drbd0;
    disk      /dev/sdb;
    address   192.168.111.11:7789;
    meta-disk internal;
  }
}

RESULTS
=======
To gauge the impact of DRBD, I first measured the baseline disk throughput by writing a large 4GB file to /home/userxxx,
located on /dev/sda. I then created a partition /dev/sdb1 formatted with XFS. Note that all partitions on /dev/sda are
Ext4, so I only took this measurement to get a ballpark baseline figure.

Test #1
  Note: MySQL is not running
  writing on /dev/sda
  dd if=/dev/zero of=/home/userxxx/disk-test.xxx bs=1M count=4096 oflag=direct
  415 MB/s (Ext4)

  writing on /dev/sdb
  dd if=/dev/zero of=/var/lib/mysql/disk-test.xxx bs=1M count=4096 oflag=direct
  415 MB/s (XFS)

Test #2
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with only error logs. Write 10000 rows with script running on same host.
     ;general_log=1
     ;general_log-file=/var/lib/mysql/var/log/mysql/general.log
     ;slow_query_log=1
     ;slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     ;innodb_flush_log_at_trx_commit=1
     ;binlog-format=MIXED
     ;sync_binlog=1
     ;log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 3s

Test #3
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with bin logs, sync, tx, etc... Write 10000 rows with script running on same host.
     general_log=1
     general_log-file=/var/lib/mysql/var/log/mysql/general.log
     slow_query_log=1
     slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     innodb_flush_log_at_trx_commit=1
     binlog-format=MIXED
     sync_binlog=1
     log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 6s
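
As a rough reading of the Test #2 and Test #3 baseline timings above, the total run time can be turned into an average
cost per row. This is only a sketch: it assumes the 10000 rows are written one after another by a single client (the
script itself is not shown here), and the function name per_row_ms is purely illustrative.

```python
def per_row_ms(total_seconds, rows=10000):
    """Average wall-clock time per row, in milliseconds."""
    return total_seconds * 1000 / rows


# Baseline timings without DRBD, copied from Test #2 and Test #3 above.
baseline = {"Test #2 (error log only)": 3, "Test #3 (binlog, sync, etc.)": 6}

for name, seconds in baseline.items():
    print(f"{name}: ~{per_row_ms(seconds):.1f} ms per row")
```

If each row is committed synchronously, any latency added to a single write would be paid 10000 times over, which is
worth keeping in mind when reading the DRBD timings.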

Adding DRBD.

DRBD 8.3.12
~~~~~~~~~~~
!With Secondary DRBD node down!
Test #1
  writing on /dev/sdb
  dd if=/dev/zero of=/var/lib/mysql/disk-test.xxx bs=1M count=4096 oflag=direct
  194 MB/s (XFS)     ~ x2 performance drop from ~ 415 MB/s

Test #2
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with only error logs. Write 10000 rows with script running on same host.
     ;general_log=1
     ;general_log-file=/var/lib/mysql/var/log/mysql/general.log
     ;slow_query_log=1
     ;slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     ;innodb_flush_log_at_trx_commit=1
     ;binlog-format=MIXED
     ;sync_binlog=1
     ;log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 3s          ~ on par

Test #3
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with bin logs, sync, tx, etc... Write 10000 rows with script running on same host.
     general_log=1
     general_log-file=/var/lib/mysql/var/log/mysql/general.log
     slow_query_log=1
     slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     innodb_flush_log_at_trx_commit=1
     binlog-format=MIXED
     sync_binlog=1
     log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 6s          ~ on par

!With Secondary DRBD node up!
Test #1
  writing on /dev/sdb
  dd if=/dev/zero of=/var/lib/mysql/disk-test.xxx bs=1M count=4096 oflag=direct
  89 MB/s (XFS)     ~ x4.5 performance drop from ~ 415 MB/s

Test #2
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with only error logs. Write 10000 rows with script running on same host.
     ;general_log=1
     ;general_log-file=/var/lib/mysql/var/log/mysql/general.log
     ;slow_query_log=1
     ;slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     ;innodb_flush_log_at_trx_commit=1
     ;binlog-format=MIXED
     ;sync_binlog=1
     ;log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 8s           ~ x2.5 performance drop from ~ 3s

Test #3
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with bin logs, sync, tx, etc... Write 10000 rows with script running on same host.
     general_log=1
     general_log-file=/var/lib/mysql/var/log/mysql/general.log
     slow_query_log=1
     slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     innodb_flush_log_at_trx_commit=1
     binlog-format=MIXED
     sync_binlog=1
     log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 25s           ~ x4 performance drop from ~ 6s

Test #4 (mysqlslap)
mysqlslap --concurrency=1,5,10,20,30,50 --iterations=1 --engine=innodb --auto-generate-sql --auto-generate-sql-load-type=write --number-of-queries=10000 --password=<root_password>

Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 24.737 seconds
    Minimum number of seconds to run all queries: 24.737 seconds
    Maximum number of seconds to run all queries: 24.737 seconds
    Number of clients running queries: 1
    Average number of queries per client: 10000

Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 10.321 seconds
    Minimum number of seconds to run all queries: 10.321 seconds
    Maximum number of seconds to run all queries: 10.321 seconds
    Number of clients running queries: 5
    Average number of queries per client: 2000

Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 9.752 seconds
    Minimum number of seconds to run all queries: 9.752 seconds
    Maximum number of seconds to run all queries: 9.752 seconds
    Number of clients running queries: 10
    Average number of queries per client: 1000

Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 9.491 seconds
    Minimum number of seconds to run all queries: 9.491 seconds
    Maximum number of seconds to run all queries: 9.491 seconds
    Number of clients running queries: 20
    Average number of queries per client: 500

Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 9.827 seconds
    Minimum number of seconds to run all queries: 9.827 seconds
    Maximum number of seconds to run all queries: 9.827 seconds
    Number of clients running queries: 30
    Average number of queries per client: 333

Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 9.938 seconds
    Minimum number of seconds to run all queries: 9.938 seconds
    Maximum number of seconds to run all queries: 9.938 seconds
    Number of clients running queries: 50
    Average number of queries per client: 200
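
To put the Test #1 dd figures for 8.3.12 in context, the sketch below recomputes the slowdown against the 415 MB/s
baseline and compares the replicated figure with the raw capacity of the 1Gb/s crossover link. It assumes decimal units
(1 Gb/s = 1000 Mb/s, 8 bits per byte) and ignores TCP/IP and DRBD overhead, so the link figure is an upper bound. The
function names are illustrative only.

```python
BASELINE_MB_S = 415  # dd on /dev/sdb without DRBD (Test #1 baseline)


def slowdown(measured_mb_s, baseline_mb_s=BASELINE_MB_S):
    """How many times slower the measured throughput is than the baseline."""
    return baseline_mb_s / measured_mb_s


def link_ceiling_mb_s(link_gbit_s=1):
    """Raw link capacity in MB/s, ignoring all protocol overhead."""
    return link_gbit_s * 1000 / 8


# DRBD 8.3.12 dd results from Test #1 above.
for label, mb_s in [("secondary down", 194), ("secondary up", 89)]:
    print(f"{label}: {mb_s} MB/s, x{slowdown(mb_s):.1f} slower than baseline")

ceiling = link_ceiling_mb_s()
print(f"link ceiling: {ceiling:.0f} MB/s, "
      f"secondary-up run reached {89 / ceiling:.0%} of it")
```

With the secondary connected, sustained writes cannot be replicated faster than the link carries them, so a figure
somewhat below 125 MB/s should be expected there. The drop to 194 MB/s with the secondary down is not explained by the
link.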

DRBD 8.4.1
~~~~~~~~~~
!With Secondary DRBD node down!
Test #1
  writing on /dev/sdb
  dd if=/dev/zero of=/var/lib/mysql/disk-test.xxx bs=1M count=4096 oflag=direct
  194 MB/s (XFS)     ~ x2 performance drop from ~ 415 MB/s

Test #2
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with only error logs. Write 10000 rows with script running on same host.
     ;general_log=1
     ;general_log-file=/var/lib/mysql/var/log/mysql/general.log
     ;slow_query_log=1
     ;slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     ;innodb_flush_log_at_trx_commit=1
     ;binlog-format=MIXED
     ;sync_binlog=1
     ;log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 24s          ~ x8 performance drop from ~ 3s
                      ~ x8 performance drop from 8.3.12

Test #3
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with bin logs, sync, tx, etc... Write 10000 rows with script running on same host.
     general_log=1
     general_log-file=/var/lib/mysql/var/log/mysql/general.log
     slow_query_log=1
     slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     innodb_flush_log_at_trx_commit=1
     binlog-format=MIXED
     sync_binlog=1
     log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 1m19s          ~ x13 performance drop from ~ 6s
                        ~ x13 performance drop from 8.3.12

!With Secondary DRBD node up!
Test #1
  writing on /dev/sdb
  dd if=/dev/zero of=/var/lib/mysql/disk-test.xxx bs=1M count=4096 oflag=direct
  102 MB/s (XFS)     ~ x4 performance drop from ~ 415 MB/s

Test #2
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with only error logs. Write 10000 rows with script running on same host.
     ;general_log=1
     ;general_log-file=/var/lib/mysql/var/log/mysql/general.log
     ;slow_query_log=1
     ;slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     ;innodb_flush_log_at_trx_commit=1
     ;binlog-format=MIXED
     ;sync_binlog=1
     ;log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 58s           ~ x20 performance drop from ~ 3s
                       ~ x7 performance drop from 8.3.12

Test #3
  MySQL is running with /var/lib/mysql on /dev/sdb.
  MySQL setup with bin logs, sync, tx, etc... Write 10000 rows with script running on same host.
     general_log=1
     general_log-file=/var/lib/mysql/var/log/mysql/general.log
     slow_query_log=1
     slow_query_log_file=/var/lib/mysql/var/log/mysql/slow-query.log
     innodb_flush_log_at_trx_commit=1
     binlog-format=MIXED
     sync_binlog=1
     log-bin=/var/lib/mysql/var/log/mysql/binlog
  Time ~ 2m50s           ~ x28 performance drop from ~ 6s
                         ~ x7 performance drop from 8.3.12
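
The drop factors quoted for Test #2 and Test #3 can be recomputed from the timings. The sketch below is one way to do
it, assuming durations are written as in this mail ("25s", "1m19s"); the helper names to_seconds and factor are
illustrative, and the figures are simply copied from the results above.

```python
import re


def to_seconds(text):
    """Convert a duration such as '25s' or '1m19s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds)


def factor(slow, fast):
    """How many times longer `slow` took than `fast`."""
    return to_seconds(slow) / to_seconds(fast)


BASELINE = {"Test #2": "3s", "Test #3": "6s"}  # without DRBD

# (DRBD version, secondary state) -> timings reported above
RUNS = {
    ("8.3.12", "down"): {"Test #2": "3s", "Test #3": "6s"},
    ("8.3.12", "up"): {"Test #2": "8s", "Test #3": "25s"},
    ("8.4.1", "down"): {"Test #2": "24s", "Test #3": "1m19s"},
    ("8.4.1", "up"): {"Test #2": "58s", "Test #3": "2m50s"},
}

for (version, state), timings in RUNS.items():
    for test, duration in timings.items():
        vs_base = factor(duration, BASELINE[test])
        vs_83 = factor(duration, RUNS[("8.3.12", state)][test])
        print(f"{version} secondary {state} {test}: {duration}, "
              f"x{vs_base:.1f} vs baseline, x{vs_83:.1f} vs 8.3.12")
```

The rounded factors given in the results (x8, x13, x20, x28, x7) are approximations of these ratios.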

Test #4 (mysqlslap)
mysqlslap --concurrency=1,5,10,20,30,50 --iterations=1 --engine=innodb --auto-generate-sql --auto-generate-sql-load-type=write --number-of-queries=10000 --password=<root_password>

Benchmark
	Running for engine innodb
	Average number of seconds to run all queries: 169.607 seconds
	Minimum number of seconds to run all queries: 169.607 seconds
	Maximum number of seconds to run all queries: 169.607 seconds
	Number of clients running queries: 1
	Average number of queries per client: 10000

Benchmark
	Running for engine innodb
	Average number of seconds to run all queries: 60.247 seconds
	Minimum number of seconds to run all queries: 60.247 seconds
	Maximum number of seconds to run all queries: 60.247 seconds
	Number of clients running queries: 5
	Average number of queries per client: 2000

Benchmark
	Running for engine innodb
	Average number of seconds to run all queries: 51.656 seconds
	Minimum number of seconds to run all queries: 51.656 seconds
	Maximum number of seconds to run all queries: 51.656 seconds
	Number of clients running queries: 10
	Average number of queries per client: 1000

Benchmark
	Running for engine innodb
	Average number of seconds to run all queries: 52.587 seconds
	Minimum number of seconds to run all queries: 52.587 seconds
	Maximum number of seconds to run all queries: 52.587 seconds
	Number of clients running queries: 20
	Average number of queries per client: 500

Benchmark
	Running for engine innodb
	Average number of seconds to run all queries: 50.869 seconds
	Minimum number of seconds to run all queries: 50.869 seconds
	Maximum number of seconds to run all queries: 50.869 seconds
	Number of clients running queries: 30
	Average number of queries per client: 333

Benchmark
	Running for engine innodb
	Average number of seconds to run all queries: 52.766 seconds
	Minimum number of seconds to run all queries: 52.766 seconds
	Maximum number of seconds to run all queries: 52.766 seconds
	Number of clients running queries: 50
	Average number of queries per client: 200
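
For a side-by-side view of the two mysqlslap runs, the timings can be converted to queries per second. This is a sketch
only: it treats every run as 10000 queries in total, which is what the "queries per client" lines suggest (30 clients x
333 is slightly fewer), and the names used are illustrative.

```python
QUERIES = 10000  # total per run, as suggested by clients x queries per client

# concurrency -> seconds to run all queries, copied from the output above
SLAP_8_3_12 = {1: 24.737, 5: 10.321, 10: 9.752, 20: 9.491, 30: 9.827, 50: 9.938}
SLAP_8_4_1 = {1: 169.607, 5: 60.247, 10: 51.656, 20: 52.587, 30: 50.869, 50: 52.766}


def queries_per_second(seconds, queries=QUERIES):
    """Throughput of one mysqlslap run."""
    return queries / seconds


def compare(old, new):
    """Return {concurrency: (old q/s, new q/s, new time / old time)}."""
    return {
        clients: (
            queries_per_second(old[clients]),
            queries_per_second(new[clients]),
            new[clients] / old[clients],
        )
        for clients in old
    }


for clients, (qps_old, qps_new, ratio) in compare(SLAP_8_3_12, SLAP_8_4_1).items():
    print(f"{clients:>2} clients: {qps_old:7.1f} q/s (8.3.12) vs "
          f"{qps_new:6.1f} q/s (8.4.1), 8.4.1 takes x{ratio:.1f} as long")
```

Going by these figures, neither version gains much beyond roughly 5 to 10 clients, and 8.4.1 takes about 5 to 7 times as
long as 8.3.12 at every concurrency level.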

Huge performance drop compared to 8.3.12 as well.

Why is there such a big difference between 8.3.12 and 8.4.1?

Thank you
Fred

On 2 Feb 2012, at 14:31, Lars Ellenberg wrote:

> On Thu, Feb 02, 2012 at 02:24:08PM +0000, Frederic DeMarcy wrote:
>> Hi Lars
>> 
>> I'm using 8.4.1
>> 
>> [root@server1 ~]# service drbd status
>> drbd driver loaded OK; device status:
>> version: 8.4.1 (api:1/proto:86-100)
>> 
> 
> Is that so.
> Apologies, I seem to have mixed a few threads here.
> 
> In that case, invalidate-remote when in StandAlone should be allowed.
> Strange.
> 
> Maybe you should downgrade ;-)
> and see where things go.
> 
> In fact, my guess is moving meta data to RAID1
> will be most useful in your case.
> 
>>>> 
>>>> Is the result the same if you execute a "drbdadm invalidate-remote
>>>> mysql" on the primary before doing the "single node" test? .... that
>>>> would disable activity log updates ...
>>> 
>>> ... with recent enough versions of DRBD.
>>> Only the OP seems to be using 8.3.7.
>>> Maybe you should consider an upgrade?
> 
> 
> -- 
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> __
> please don't Cc me, but send to list   --   I'm subscribed