[DRBD-user] drbd 0.7.4 half the speed of 0.6.13

Sat Sep 18 01:08:52 CEST 2004

I've tried both versions on the same hosts with the same drives (15K SCSI
Ultra160). Nothing changed except the DRBD version (and the config, of course).

version: 0.6.13 (api:64/proto:62)

0: cs:SyncingAll st:Primary/Secondary ns:76445216 nr:0 dw:0 dr:76445216 pe:157 ua:0
        [=>..................] sync'ed:  6.7% (65350/70001)M
        finish: 0:16:09h speed: 73,575 (73,281) K/sec

version: 0.7.4 (api:76/proto:74)
SVN Revision: 1537M build by root at cluster3-1, 2004-09-17 14:58:25
 0: cs:SyncSource st:Secondary/Secondary ld:Consistent
    ns:855892 nr:0 dw:0 dr:856000 al:0 bm:52 lo:27 pe:0 ua:27 ap:0
        [>...................] sync'ed:  1.3% (69043/69879)M
        finish: 0:36:49 speed: 31,896 (27,608) K/sec
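
To put a number on the gap, here is a rough sketch that pulls the current
sync speed out of the two "speed:" lines above and compares them. The helper
name sync_speed_kps is made up for illustration and is not part of DRBD; it
only assumes the "speed: N (M) K/sec" layout shown in the output above.

```python
import re

def sync_speed_kps(status_line: str) -> int:
    """Return the current sync speed (K/sec) from a /proc/drbd 'speed:' line.

    Hypothetical helper: it only relies on the 'speed: 73,575 (...) K/sec'
    layout seen in the pasted output.
    """
    match = re.search(r"speed:\s*([\d,]+)", status_line)
    if match is None:
        raise ValueError("no 'speed:' field found")
    # Drop the thousands separators before converting.
    return int(match.group(1).replace(",", ""))

# The two lines exactly as pasted above.
old_line = "finish: 0:16:09h speed: 73,575 (73,281) K/sec"   # 0.6.13
new_line = "finish: 0:36:49 speed: 31,896 (27,608) K/sec"    # 0.7.4

old_kps = sync_speed_kps(old_line)
new_kps = sync_speed_kps(new_line)
ratio = new_kps / old_kps

print(f"0.6.13: {old_kps} K/sec, 0.7.4: {new_kps} K/sec, ratio {ratio:.2f}")
```

So at the moment these snapshots were taken, 0.7.4 was syncing at a bit
under half the rate of 0.6.13.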

I've tried a plethora of different max-buffers / max-epoch-size values,
starting with 2048/2048 and ending up at 256/256, as this improved
performance from about 20MB/sec to about 28MB/sec. That is still nowhere
near the 73MB/sec I can reliably obtain using 0.6.13.

Now, I have noted that load avg on 0.7.4 is quite a bit lower than it is
under 0.6.13. But unless I'm mistaken sync-nice is deprecated. I'd
gladly give up some cpu cycles to increase throughput. Is there another
way to do this in 0.7.4?

Please help! I'd like to be able to take advantage of the added benefits
of 0.7.x, but this is for a DB server that is already I/O constrained...
I can't afford to take this kind of performance hit.

Thanks,

Josh

Below are the resource config sections for each version.

0.6.13
-------
resource db0 {

  protocol = C
  fsckcmd  = /bin/true

  load-only

  disk {
    do-panic
    disk-size = 71681998
  }

  net {
    sync-min    = 5M
    sync-nice   = -18  # if synchronization is high priority for you
    sync-max    = 75M
    tl-size     = 2048
    timeout     = 60
    connect-int = 10
    ping-int    = 10
  }

#/dev/nb1 - /drbd/d01
  on cluster3-1 {
    device  = /dev/nb0
    disk    = /dev/sdb
    address = 10.10.10.1
    port    = 7789
  }

  on cluster3-2 {
    device  = /dev/nb0
    disk    = /dev/sdb
    address = 10.10.10.2
    port    = 7789
  }
}

0.7.4
------
resource raw0 {

  protocol C;

  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

  startup {
    # wfc-timeout  0;
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error   detach;
  }

  net {
    timeout       60;    #  6 seconds  (unit = 0.1 seconds)
    connect-int   10;    # 10 seconds  (unit = 1 second)
    ping-int      10;    # 10 seconds  (unit = 1 second)

    #sndbuf-size            128;
    max-buffers     256;
    max-epoch-size  256;

    ko-count 4;

    on-disconnect stand_alone;
  }

  syncer {

    rate 100M;
    group 1;

    al-extents 257;
  }

  on cluster3-1 {
    device     /dev/drbd0;
    disk       /dev/sdb;
    address    10.10.10.1:7788;
    meta-disk  internal;

  }

  on cluster3-2 {
    device    /dev/drbd0;
    disk      /dev/sdb;
    address   10.10.10.2:7788;
    meta-disk internal;
  }
}