[DRBD-user] huge performance problems, drbd8 sync painfully slow

Digimer lists at alteeve.ca
Fri Mar 29 16:28:23 CET 2019


Sync performance is not DRBD performance. In fact, the faster sync is
forced to go, the more it takes away from the performance of the
applications using DRBD. By default, DRBD 8.4+ watches the load coming in
from above and dynamically adjusts the sync rate, speeding up when load
is light and slowing down when load is high.

Let's say, for simple math, your DRBD resource has a max sustained write
speed of 100 MB/sec. If sync is running at 35 MB/sec, you've got only 65
MB/sec left for your applications. If you force sync to 90 MB/sec, only
10 MB/sec is available for your applications. (Obviously this is
simplistic and doesn't touch on IOPS, but you get the idea, I hope.)
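
To make the arithmetic above concrete, here is a minimal Python sketch of the same simplistic model, where sync and application writes share one fixed budget. The function name `application_bandwidth` is made up for illustration, and the model deliberately ignores IOPS, latency, and the network.

```python
def application_bandwidth(total_mb_s: float, sync_mb_s: float) -> float:
    """Return the write bandwidth (MB/sec) left over for applications.

    Simplistic model: sync traffic and application writes share one fixed
    sustained-write budget. IOPS, latency, and network limits are ignored.
    """
    if total_mb_s < 0 or sync_mb_s < 0:
        raise ValueError("rates must be non-negative")
    # Sync cannot leave the applications with less than nothing.
    return max(total_mb_s - sync_mb_s, 0.0)


if __name__ == "__main__":
    # The two cases from the text: sync at 35 MB/sec and at 90 MB/sec.
    for sync_rate in (35, 90):
        left = application_bandwidth(100, sync_rate)
        print(f"sync at {sync_rate} MB/sec leaves {left:.0f} MB/sec for applications")
```

With the numbers from the example, this gives 65 MB/sec and 10 MB/sec respectively.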

If you want to see what DRBD write performance is like in the real
world, pause the sync (or wait for it to finish), then try running write
tests. If that's slow, then we have a real problem.
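
As a rough sketch of such a write test, the Python script below writes a file, forces it to disk with fsync, and reports the throughput. It assumes the sync has already been paused (for example with `drbdadm pause-sync <resource>`, resumed later with `drbdadm resume-sync <resource>`) and that `TEST_DIR` points at a directory on the filesystem backed by the DRBD device. `TEST_DIR`, the file name, and `measure_write_mb_s` are hypothetical names for illustration. A dedicated tool like fio or dd will give more thorough numbers.

```python
import os
import tempfile
import time


def measure_write_mb_s(path: str, total_mb: int = 64, block_kb: int = 1024) -> float:
    """Write about total_mb MiB to path, fsync it, and return MiB/sec."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    written_mb = blocks * block_kb / 1024
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        # Force the data down to the device, not just into the page cache.
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return written_mb / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Assumption: TEST_DIR is a directory on the DRBD-backed filesystem.
    # Falls back to the system temp directory, which only tests local disk.
    test_dir = os.environ.get("TEST_DIR", tempfile.gettempdir())
    test_file = os.path.join(test_dir, "drbd_write_test.bin")
    try:
        print(f"{measure_write_mb_s(test_file):.1f} MB/sec")
    finally:
        if os.path.exists(test_file):
            os.remove(test_file)
```

The script writes to a regular file rather than to /dev/drbd0 itself, since writing to the raw device would destroy the data on it.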

In the meantime, I'd highly recommend simplifying your drbd config.
Premature optimization never is.

Digimer

On 2019-03-29 4:54 a.m., Juergen Sauer wrote:
> Hi!
> 
> I've made a setup for a 2 node cluster using drbd8.
> 
> The hardware is two servers, each with an 8-core CPU and a 10GbE cluster link.
> # hdparm -tT /dev/md0p2
> 
> /dev/md0p2:
>  Timing cached reads:   11750 MB in  1.99 seconds = 5894.71 MB/sec
>  Timing buffered disk reads: 806 MB in  3.04 seconds = 265.02 MB/sec
> 
> The raw hard disk device reaches >250 MB/sec.
> The network for DRBD is 10GbE.
> Copying a huge VM file via NFS to another partition on the md0 device
> reaches about 200 MB/sec.
> 
> Syncing Result is:
> cat /proc/drbd
> version: 8.4.11 (api:1/proto:86-101)
> srcversion: C27D50EE6C67ED861348AA6
>  0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
>     ns:3127296 nr:0 dw:0 dr:3127296 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:7222256
>         [=====>..............] sync'ed: 30.3% (7052/10104)M
>         finish: 0:03:11 speed: 37,784 (37,676) K/sec
> 
> I see only 37 MB/sec instead of expected 150 ... 250 MB/sec.
> 
> This is less than 19% of the expected performance of 200 MB/sec.
> 
> I have this setup:
> ---x----x---x-----X---
> [root@cl1 drbd.d]# more cluster.res
> resource laf {
>          disk {
>              c-plan-ahead  15;
>              c-fill-target 24;
>              c-min-rate   150M;
>              c-max-rate   720M;
>              disk-barrier no;
>              disk-flushes no;
>              al-extents 3389;
>              }
> 
> net {
>         protocol C;
>         # max-epoch-size          20000;
>         sndbuf-size 36k;
>         sndbuf-size            1024k;
>         rcvbuf-size            2048k;
>         unplug-watermark 24;
>     }
> 
> on l1i {
>     device     /dev/drbd0;
>     disk       /dev/md0p2;
>     address    192.168.254.11:7778;
>     meta-disk  /dev/sda4;
>   }
> 
> on l2i {
>     device     /dev/drbd0;
>     disk       /dev/md0p2;
>     address    192.168.254.21:7778;
>     meta-disk  /dev/sda4;
>   }
> }
> ---x----x---x-----X---
> [root@l2i drbd.d]# more global_common.conf
> global {
>         usage-count no;
> }
> 
> common {
>         protocol C;
>         handlers {
>                 pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
> 
>                 split-brain "/usr/lib/drbd/notify-split-brain.sh root";
>         }
> 
>         startup {
>                 become-primary-on both;
>                 wfc-timeout      60;
>                 degr-wfc-timeout 60;
>         }
> 
>         disk {
>              on-io-error   detach;
>              disk-barrier no;
>              disk-flushes no;
>              al-extents 3389;
>              no-disk-flushes ;
>              no-disk-barrier ;
>              no-md-flushes ;
>              c-plan-ahead  128;
>              c-fill-target 256M;
>              c-min-rate    250M;
>              c-max-rate    275M;
>         }
> 
>         net {
>                 allow-two-primaries;
>                 after-sb-0pri discard-zero-changes;
>                 after-sb-1pri discard-secondary;
>                 after-sb-2pri disconnect;
>                 sndbuf-size            1024k;
>                 rcvbuf-size            2048k;
>                 max-epoch-size         20000;
>                 max-buffers           131072;
>                 unplug-watermark 24;
>                 }
> 
>         syncer {
> 
>                 rate 1120M;
>          }
> }
> ---x----x---x-----X---
> 
> 
> Any ideas why drbd8 is so painfully slow?
> Any hints?
> 
> 
> TIA
> with kind regards
> Jürgen Sauer
> 
> 
> 


-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

