[DRBD-user] Load high on primary node while doing backup on secondary

Irwin Nemetz inemetz at hotmail.com
Thu May 1 01:12:08 CEST 2014

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Lars,
Thank you for responding; it was quite helpful. I don't know why I thought I needed a temporary DRBD resource to get at the snapshot. I do indeed see the VM img files when I simply mount the snapshot.

What can I do to reduce the effect of the backup on the primary node? Would it make sense to run this before the copy starts:

/sbin/drbdadm net-options --protocol=A zapp

or perhaps to increase sndbuf-size or al-extents?
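
Concretely, I was thinking of wrapping the copy roughly like this (untested sketch; I assume we normally stay on protocol C, that the protocol can be changed on the fly while connected, and that the command has to be issued on both nodes):

/sbin/drbdadm net-options --protocol=A zapp   # relax replication just before the copy starts
./backup-zapp.sh                              # the backup script from my earlier mail
/sbin/drbdadm net-options --protocol=C zapp   # back to fully synchronous replication afterwards
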
Irwin

> Date: Wed, 30 Apr 2014 22:31:17 +0200
> From: lars.ellenberg at linbit.com
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] Load high on primary node while doing backup on secondary
> 
> On Wed, Apr 23, 2014 at 10:16:24AM -0700, Irwin Nemetz wrote:
> > I have a two node cluster. There are 3 mail servers running as KVM virtual
> > machines on one node. The 3 VMs sit on top of a DRBD device on an LVM volume
> > which replicates to the passive 2nd node.
> > 
> > Hardware: 2x16 core AMD processors, 128gb memory, 5 3tb sas drives in a raid5
> 
> You likely won't get "terrific" performance out of a few large drives in RAID 5.
> 
> > The drbd replication is over a crossover cable.
> > 
> > version: 8.4.4 (api:1/proto:86-101)
> > GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil at Build64R6, 2013-10-14 15:33:06
> 
> > resource zapp
> > {
> >   startup {
> >     wfc-timeout 10;
> >     outdated-wfc-timeout 10;
> >     degr-wfc-timeout 10;
> >   }
> >   disk {
> >     on-io-error detach; 
> >     rate 40M;
> >     al-extents 3389;
> >   }
> >   net {
> >    verify-alg sha1;
> >    max-buffers 8000;
> >    max-epoch-size 8000;
> >    sndbuf-size 512k;
> >    cram-hmac-alg sha1;
> >    shared-secret sync_disk;
> >    data-integrity-alg sha1;
> 
> Don't enable data-integrity in production.  It will just burn your
> cycles and limit your throughput to however fast your core can crunch sha1.
> 
> That's a *diagnostic* feature.
> It does not really do anything for the integrity of your data,
> it just happens to help to *detect* when integrity *may* be compromised.
> We should have named that
> "burn-cpu-cycles-and-calculate-extra-checksums-for-diagnostic-purposes".
> 
> 
> >   }
> >   on nodea.cluster.dns {
> >    device /dev/drbd1;
> >    disk /dev/virtimages/zapp;
> >    address 10.88.88.171:7787;
> >    meta-disk internal;
> >   }
> >   on nodeb.cluster.dns {
> >    device /dev/drbd1;
> >    disk /dev/virtimages/zapp;
> >    address 10.88.88.172:7787;
> >    meta-disk internal;
> >   }
> > }
> 
> You could probably do a lot of tuning ...
> 
> > I am trying to do a backup of the VMs nightly. They are about 2.7TB each.
> > I create a snapshot on the backup node, mount it and then do a copy to a
> > NAS backup storage device. The NAS is on its own network.
> > 
> > Here's the script:
> > 
> > [root at nodeb ~]# cat backup-zapp.sh
> > #!/bin/bash
> > 
> > date
> > cat > /etc/drbd.d/snap.res <<EOF
> 
> *OUCH*
> 
> Why would you do that?
> 
> There is no point in automatically creating a throwaway DRBD resource
> just to access a snapshot taken below another DRBD device.
> 
> Just use that snapshot directly.
> 
> Or am I missing something?
> 
> > /sbin/lvcreate -L500G -s -n snap-zapp /dev/virtimages/zapp
> > 
> > /sbin/drbdadm up snap
> 
> No need. Really.
> 
> > sleep 2
> > /sbin/drbdadm primary snap
> > mount -t ext4 /dev/drbd99 /mnt/zapp
> 
> instead, this should do all you need:
> mount -t ext4 -o ro /dev/virtimages/snap-zapp /mnt/zapp
> 
> > cd /rackstation/images
> > mv -vf zapp.img zapp.img.-1
> > mv -vf zapp-opt.img zapp-opt.img.-1
> > cp -av /mnt/zapp/*.img /rackstation/images
> > umount /mnt/zapp
> > /sbin/drbdadm down snap
> > rm -f /etc/drbd.d/snap.res
> > /sbin/lvremove -f /dev/virtimages/snap-zapp
> > date
> > 
> > About halfway through the copy, the copy starts stuttering (network traffic
> > stops and starts) and the load on the primary machine and on the virtual
> > machine being copied shoots through the roof.
> 
> Maybe the snapshot is filling up, the disks are slow, and everything that
> is written on the primary now has to be written on the secondary *twice*
> (once to the origin volume and once copied into the snapshot's
> copy-on-write exception store), while you are hammering that secondary IO
> subsystem with reads...
> Of course that impacts the Primary, as soon as the secondary RAID 5
> can no longer keep up with the load you throw at it.
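> 
> A quick way to check that theory while the backup runs (sketch; exact
> column names depend on your lvm2/sysstat versions):
> 
> cat /proc/drbd     # on the primary: a climbing "pe:" count means writes are waiting on the peer
> lvs virtimages     # on the secondary: watch the snapshot's fill percentage
> iostat -xm 5       # on the secondary: %util pinned near 100 means the RAID 5 is saturated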
> 
> > I am at a loss to explain this, since it's dealing with a snapshot of a
> > volume on a replicated node. The only reasonable explanation I can think
> > of is that the drbd replication is being blocked by something and this is
> > causing the disk on the primary node to become unresponsive.
> 
> Yep.
> See above.
> 
> You could try to use rsync --bwlimit instead of cp; that will reduce the
> read load, but it will also prolong the lifetime of the snapshot,
> so it may or may not actually help.
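> 
> Putting that together with the direct mount above, the backup could shrink
> to something like this (sketch; --bwlimit is in KBytes/s and the value is a
> placeholder, tune it to what your secondary can sustain):
> 
> #!/bin/bash
> date
> /sbin/lvcreate -L500G -s -n snap-zapp /dev/virtimages/zapp
> mount -t ext4 -o ro /dev/virtimages/snap-zapp /mnt/zapp
> # keep your mv rotation of the old images here if you still want it
> rsync -av --bwlimit=50000 /mnt/zapp/*.img /rackstation/images/
> umount /mnt/zapp
> /sbin/lvremove -f /dev/virtimages/snap-zapp
> date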
> 
> Or maybe you just need to defrag your VM images...
> possibly they are fragmented, and what you see is the secondary IO
> subsystem hitting the IOPS limit while trying to seek through all the VM
> image fragments...
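> 
> (filefrag from e2fsprogs, run against the mounted snapshot, will tell you --
> sketch, path taken from your script:)
> 
> filefrag /mnt/zapp/zapp.img   # a huge extent count relative to the file size means heavy fragmentation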
> 
> Or use a "stacked" DRBD setup,
> and disconnect the third node.
> Or, if you can live with reduced redundancy during the backup,
> disconnect the secondary for that time.
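> 
> For the disconnect variant, roughly (sketch; resource name from your config,
> run on the secondary around the backup window):
> 
> /sbin/drbdadm disconnect zapp   # stop replication for the duration of the backup
> # ... take the snapshot and run the copy ...
> /sbin/drbdadm connect zapp      # reconnect; DRBD resyncs only the blocks that changed meanwhile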
> 
> Or add a dedicated PV for the snapshot "exception store",
> or add a non-volatile cache to your RAID controller,
> or a number of other options.
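> 
> The dedicated-PV variant would look roughly like this (sketch; /dev/sdX1 is
> a placeholder for whichever disk you would add):
> 
> vgextend virtimages /dev/sdX1                                          # add the new PV to the existing VG first
> /sbin/lvcreate -L500G -s -n snap-zapp /dev/virtimages/zapp /dev/sdX1   # put the snapshot's COW store on it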
> 
> Thing is, if you stress the secondary IO subsystem enough,
> that *will* impact the (write performance on the) primary.
> 
> -- 
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 