[DRBD-user] csums-alg seems not working on my cluster....

Dan Barker dbarker at visioncomm.net
Thu Sep 5 23:44:08 CEST 2013


That's a very difficult way to go about setting up internal metadata. Normally, we just create the metadata on the raw device (/dev/sdb) and then create the filesystem on the DRBD device (/dev/sql_data1). No math!
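
A minimal sketch of that simpler sequence, assuming the resource name sqldata and the device /dev/drbd_sqldata from the configuration quoted further down. The helper name plan_setup is made up for illustration, and it only prints the commands so they can be reviewed and then run by hand as root:

```shell
#!/bin/sh
# Print (do not run) the usual internal-metadata setup for one resource.
# plan_setup is a hypothetical helper; the resource and device names are
# taken from the configuration posted in this thread.
plan_setup() {
    res=$1
    dev=$2
    echo "# on both nodes:"
    echo "drbdadm create-md $res"
    echo "drbdadm up $res"
    echo "# on the node that should become primary:"
    echo "drbdadm primary --force $res"
    echo "mkfs.ext3 -j -m 0 -b 4096 $dev"
}

plan_setup sqldata /dev/drbd_sqldata
```

With internal metadata, create-md reserves its space at the end of the backing device, and the filesystem made on the DRBD device is sized to what remains, so no resize2fs arithmetic should be needed.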

You do not appear to have specified a syncer rate. I thought the default was much higher than 2040K, but that is the target for the sync operation. Why not raise the sync rate to some reasonable percentage of the available bandwidth (nearly all of it for the initial sync, maybe 30% thereafter)? You say "low" without defining it. The displays appear to be constrained by the syncer rate.
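
As a rough illustration of the percentage idea, here is a small sketch that turns a link speed and a share into a rate in KiB/s. The function name resync_rate_k and the 10 Mbit/s link are assumptions made for the example, not figures from this thread. In 8.4 the fixed rate is, as far as I know, the resync-rate option in the disk section. The quoted config also sets c-plan-ahead together with c-max-rate 2M, which may be where the "want: 2,040 K/sec" figure comes from.

```shell
#!/bin/sh
# resync_rate_k LINK_MBIT PERCENT
# Prints a rate in KiB/s equal to PERCENT of a link of LINK_MBIT Mbit/s.
# Hypothetical helper, integer arithmetic only (results are rounded down).
resync_rate_k() {
    link_mbit=$1
    percent=$2
    # Mbit/s -> bytes/s -> KiB/s, then take the requested share.
    echo $(( link_mbit * 1000000 / 8 / 1024 * percent / 100 ))
}

# Example: 30% of an assumed 10 Mbit/s link, written as a config line.
echo "resync-rate $(resync_rate_k 10 30)K;"
```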

Also, you don't have to do a full sync on initially empty disks. That's in the doc under clear-bitmap and/or new-current-uuid.

You can modify the syncer rate while running, or change it in the config files and then "adjust" the resources.
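
A sketch of how the last two suggestions might map onto drbdadm, assuming the resource name sqldata from the quoted config and the 8.4 command syntax as I understand it (check drbdadm(8) for the installed version). The helper names are hypothetical, and the script only prints the commands rather than running them:

```shell
#!/bin/sh
# Print (do not run) drbdadm commands for the two suggestions above.
# Helper names are hypothetical; syntax is assumed to be the DRBD 8.4 form.

# Skip the initial full sync on freshly created, identical (empty) disks.
# Intended to be run on one node while both sides are connected and
# still Inconsistent, before any data has been written.
plan_skip_initial_sync() {
    echo "drbdadm new-current-uuid --clear-bitmap $1"
}

# Temporarily force a fixed resync rate (disabling the dynamic controller),
# then return to whatever the config files say.
plan_temporary_rate() {
    echo "drbdadm disk-options --c-plan-ahead=0 --resync-rate=$2 $1"
    echo "drbdadm adjust $1"
}

plan_skip_initial_sync sqldata
plan_temporary_rate sqldata 1M
```

The 1M rate is only a placeholder; the value that makes sense depends on the actual link.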

Dan

From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lafaille Christophe
Sent: Thursday, September 05, 2013 9:37 AM
To: drbd-user at lists.linbit.com
Subject: [DRBD-user] csums-alg seems not working on my cluster....

Hi All,

I need to use a very low-bandwidth network between two machines running DRBD, and I am trying to use csums-alg/verify-alg.

But I get the same duration with or without csums-alg!

Execution with csums-alg:
[root at sms246105 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root at rh63_build, 2013-01-10 09:57:53

 1: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:512 dw:512 dr:147968 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4904384
        [>....................] sync'ed:  3.0% (4788/4932)M
        finish: 0:15:17 speed: 5,332 (5,284) want: 2,040 K/sec

Execution without csums-alg:
[root at sms246105 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root at rh63_build, 2013-01-10 09:57:53

 1: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:53760 dw:53760 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4819904
        [>....................] sync'ed:  1.2% (4704/4756)M
        finish: 0:14:52 speed: 5,376 (5,376) want: 2,040 K/sec
I don't know where the problem is. Is csums-alg usable only in a more recent version of DRBD (like 8.4.3 or 8.4.4)?
I built the DRBD packages from source; perhaps I need to specify a build option to enable csums-alg (I'll check this)?
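
One way to compare the two runs on something other than duration is to look at the counters in the /proc/drbd output above: nr is the data received over the network and dr the data read from the local disk, both in KiB. The sketch below pulls a counter out of such a line; the function name drbd_counter is made up for illustration. The two snapshots were taken at different points in the sync, so they are not directly comparable, but a small nr next to a large dr in the csums-alg run may indicate that checksums are being compared and most blocks are not actually transferred.

```shell
#!/bin/sh
# drbd_counter NAME
# Reads /proc/drbd-style text on stdin and prints the value of the
# counter NAME (for example nr or dr) from the first field that matches.
# Hypothetical helper for illustration.
drbd_counter() {
    awk -v key="$1" '{
        for (i = 1; i <= NF; i++) {
            n = split($i, kv, ":")
            if (n == 2 && kv[1] == key) { print kv[2]; exit }
        }
    }'
}

# Counter lines copied from the two runs shown above.
with_csums='    ns:0 nr:512 dw:512 dr:147968 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4904384'
without_csums='    ns:0 nr:53760 dw:53760 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4819904'

echo "nr with csums-alg:    $(echo "$with_csums" | drbd_counter nr)"
echo "nr without csums-alg: $(echo "$without_csums" | drbd_counter nr)"
```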

I've put csums-alg in the "net" section, but on some web pages I found a "syncer" section containing csums-alg (which seems to be no longer available in 8.4.x versions).
==> Which is the right place?

On both machines, I run this sequence:
# /etc/init.d/drbd stop
# delete all partitions on /dev/sdb and create a 5GB partition with fdisk (for my tests; the real size is around 300GB)
# partprobe /dev/sdb
# dd if=/dev/zero of=/dev/sdb1 bs=4096   ==> to initialize disk content
# mkfs.ext3 -j -m 0 -b 4096 /dev/sdb1
# PARTSIZE=`sfdisk -s /dev/sdb1 | xargs -i echo "{} 1024 / 1024 / p" | dc`
# NEWSIZE=$[${PARTSIZE}-2]
# resize2fs /dev/sdb1 ${NEWSIZE}G
# e2fsck -f /dev/sdb1
# /etc/init.d/drbd start
# /sbin/drbdadm create-md sqldata
# /sbin/drbdadm up sqldata
On one machine: # /sbin/drbdadm --force primary sqldata
The file  /etc/drbd.d/sqldata.res :
resource sqldata {
    device     /dev/drbd_sqldata minor 1;
    disk       /dev/sdb1;
    meta-disk  internal;
    on sms246104 {
        address 135.117.246.104:7788;
    }
    on sms246105 {
        address 135.117.246.105:7788;
    }
}
The file /etc/drbd.d/global_common.conf :
global {
    usage-count yes;
    dialog-refresh 1;
    minor-count 5;
}
common {
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    }

    startup {
        wfc-timeout 15;
    }

    options {
    }

    disk {
        on-io-error detach;
        c-plan-ahead 20;
        c-fill-target 50k;
        c-min-rate 250k;
        c-max-rate 2M;
    }

    net {
        timeout 60;
        ping-int 6;
        after-sb-0pri discard-younger-primary;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
        ping-timeout 60;
        protocol C;
        cram-hmac-alg sha1;
        shared-secret "TestHA";
        csums-alg sha1;
        verify-alg sha1;
    }
}

Traces in /var/log/kern.log :
Sep  5 13:03:16 sms246104 kernel: events: mcg drbd: 2
Sep  5 13:03:16 sms246104 kernel: drbd: initialized. Version: 8.4.2 (api:1/proto:86-101)
Sep  5 13:03:16 sms246104 kernel: drbd: GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root at rh63_build, 2013-01-10 09:57:53
Sep  5 13:03:16 sms246104 kernel: drbd: registered as block device major 147
Sep  5 13:03:16 sms246104 kernel: d-con sqldata: Starting worker thread (from drbdsetup [21781])
Sep  5 13:03:16 sms246104 kernel: block drbd1: disk( Diskless -> Attaching )
Sep  5 13:03:16 sms246104 kernel: d-con sqldata: Method to ensure write ordering: flush
Sep  5 13:03:16 sms246104 kernel: block drbd1: max BIO size = 524288
Sep  5 13:03:16 sms246104 kernel: block drbd1: drbd_bm_resize called with capacity == 10486656
Sep  5 13:03:16 sms246104 kernel: block drbd1: resync bitmap: bits=1310832 words=40964 pages=41
Sep  5 13:03:16 sms246104 kernel: block drbd1: size = 5120 MB (5243328 KB)
Sep  5 13:03:16 sms246104 kernel: block drbd1: bitmap READ of 41 pages took 1 jiffies
Sep  5 13:03:16 sms246104 kernel: block drbd1: recounting of set bits took additional 0 jiffies
Sep  5 13:03:16 sms246104 kernel: block drbd1: 5051 MB (1293168 bits) marked out-of-sync by on disk bit-map.
Sep  5 13:03:16 sms246104 kernel: block drbd1: disk( Attaching -> UpToDate ) pdsk( DUnknown -> Outdated )
Sep  5 13:03:16 sms246104 kernel: block drbd1: attached to UUIDs 646C4E1151078FBF:DAB8D60C65E253A0:0000000000000004:0000000000000000
Sep  5 13:03:16 sms246104 kernel: d-con sqldata: conn( StandAlone -> Unconnected )
Sep  5 13:03:16 sms246104 kernel: d-con sqldata: Starting receiver thread (from drbd_w_sqldata [21782])
Sep  5 13:03:16 sms246104 kernel: d-con sqldata: receiver (re)started
Sep  5 13:03:16 sms246104 kernel: d-con sqldata: conn( Unconnected -> WFConnection )
Sep  5 13:03:37 sms246104 kernel: d-con sqldata: Handshake successful: Agreed network protocol version 101
Sep  5 13:03:37 sms246104 kernel: d-con sqldata: Peer authenticated using 20 bytes HMAC
Sep  5 13:03:37 sms246104 kernel: d-con sqldata: conn( WFConnection -> WFReportParams )
Sep  5 13:03:37 sms246104 kernel: d-con sqldata: Starting asender thread (from drbd_r_sqldata [21786])
Sep  5 13:03:37 sms246104 kernel: block drbd1: drbd_sync_handshake:
Sep  5 13:03:37 sms246104 kernel: block drbd1: self 646C4E1151078FBE:DAB8D60C65E253A0:0000000000000004:0000000000000000 bits:1293168 flags:0
Sep  5 13:03:37 sms246104 kernel: block drbd1: peer DAB8D60C65E253A0:0000000000000000:0000000000000000:0000000000000000 bits:1293168 flags:0
Sep  5 13:03:37 sms246104 kernel: block drbd1: uuid_compare()=1 by rule 70
Sep  5 13:03:37 sms246104 kernel: block drbd1: Becoming sync source due to disk states.
Sep  5 13:03:37 sms246104 kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( Outdated -> Inconsistent )
Sep  5 13:03:37 sms246104 kernel: block drbd1: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 25(1), total 25; compression: 100.0%
Sep  5 13:03:37 sms246104 kernel: block drbd1: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 25(1), total 25; compression: 100.0%
Sep  5 13:03:37 sms246104 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1
Sep  5 13:03:37 sms246104 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
Sep  5 13:03:37 sms246104 kernel: block drbd1: conn( WFBitMapS -> SyncSource )
Sep  5 13:03:37 sms246104 kernel: block drbd1: Began resync as SyncSource (will sync 5172672 KB [1293168 bits set]).
Sep  5 13:03:37 sms246104 kernel: block drbd1: updated sync UUID 646C4E1151078FBE:DAB9D60C65E253A0:DAB8D60C65E253A0:0000000000000004

I hope someone will be able to help me...

Perhaps I'm totally wrong about how csums-alg is used!

Regards



