[DRBD-user] DRBD 8.2.6 - reason for full resync?!

Andrei Neagoe anne at imc.nl
Tue Jun 24 14:12:35 CEST 2008



Thanks a lot for the clarification. That was exactly the case: from my 
reading of the docs I thought it was enough to run drbdadm adjust all 
on each node, regardless of the node's state (primary or secondary). 
It's now clear how I must proceed with the testing.
What still puzzles me is that only one resource needed a full 
resynchronization since, as I said, I'm running lvm2 on top of both 
(with drbd0 and drbd1 as physical volumes).
Another thing is the speed, which at the moment is, let's say, 
satisfactory. However, I found a thread in the linbit archive where a 
user with a very similar setup and testing scheme was getting ~37 MB/s 
over a fiber link between 2 datacenters and, when connected via 
crossover cable, a transfer rate of almost 80 MB/s. You can view the 
thread here: 
http://archives.free.net.ph/message/20080523.225430.9ba8ceac.en.html
Testing both the network and writes to the external storage box directly 
shows that neither of these is the bottleneck:

    ------------------------------------------------------------
    Client connecting to 10.0.0.20, TCP port 5001
    TCP window size: 0.02 MByte (default)
    ------------------------------------------------------------
    [  3] local 10.0.0.10 port 39353 connected with 10.0.0.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec  1125 MBytes    113 MBytes/sec
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec  1125 MBytes    112 MBytes/sec
    ------------------------------------------------------------
    [root@erebus testing]# dd if=/dev/zero of=test.dat bs=1G count=1 oflag=dsync
    1+0 records in
    1+0 records out
    1073741824 bytes (1.1 GB) copied, 10.321 seconds, 104 MB/s

Note that in the above test a different device is mounted on /testing 
(just another logical drive on the storage box). For reference, the 
storage box is an IBM DS 3200 connected to the machine through two SAS 
HBAs (for redundancy only, no load balancing).
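
To put the figures side by side, here is a small sketch that recomputes 
the throughput from the byte counts and times dd reported (the direct 
write above and the earlier run through DRBD) and converts the iperf 
result to the same unit. It assumes dd's "MB" means 10^6 bytes and 
iperf's "MBytes" means 2^20 bytes; the helper names mb_per_s and 
mib_to_mb are made up for this illustration.

```shell
#!/bin/sh
# Throughput in MB/s, where 1 MB = 10^6 bytes (the unit GNU dd prints).
# mb_per_s is an illustrative helper, not part of any tool.
mb_per_s() {
    # $1 = bytes copied, $2 = elapsed seconds
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Convert iperf's MBytes/sec (assumed to be 2^20 bytes) to MB/s.
mib_to_mb() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m * 1048576 / 1000000 }'
}

drbd=$(mb_per_s 1073741824 37.1114)       # dd through DRBD
local_disk=$(mb_per_s 1073741824 10.321)  # dd to the storage box directly
net=$(mib_to_mb 112)                      # iperf over the replication link

echo "DRBD: $drbd MB/s, local disk: $local_disk MB/s, network: $net MB/s"
```

Under those assumptions, the write through DRBD reaches a bit more than a 
quarter of what either the link or the local storage delivers on its own.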

So at the moment I'm also pretty stuck with performance tuning as I 
don't know what else I could try.

Thanks,
Andrei Neagoe.

Lars Ellenberg wrote:
> On Tue, Jun 24, 2008 at 12:27:30PM +0200, Andrei Neagoe wrote:
>   
>> Hi,
>>
>> I was trying today to play with drbd's settings and benchmark the results in
>> order to obtain the best performance.
>> Here is my test setup:
>> 2 identical machines with sas storage boxes. Each machine has two 2TB device
>> (in my case /dev/sdb and /dev/sdc) that I mirror over drbd and on top of them
>> there's LVM set up. The nodes share a gbit link dedicated for drbd traffic.
>> After the initial sync which took something around 20 hours to finish, I've
>> created the LVM volume and formatted using ext3 FS. Then I started to play
>> around with params like al-extents, unplug-watermark, maxbuffers, max-epoch by
>> changing the values and doing a drbdadm adjust all on each node (of course
>> after copying the config file accordingly). In the beginning it went pretty
>> well, maximum value attained by dd test over drbd was 28.9 MB/s:
>>
>> [root@erebus testing]# dd if=/dev/zero of=test.dat bs=1G count=1 oflag=dsync
>> 1+0 records in
>> 1+0 records out
>> 1073741824 bytes (1.1 GB) copied, 37.1114 seconds, 28.9 MB/s
>>
>> The configuration used is described in the end. After a couple more tests, I
>> noticed a big impact on performance, getting around 19-20 MB/s so I checked
>> /proc/drbd to see what's going on. Surprisingly, it was doing a full resync on
>> one of the disks. Problem is, I don't understand why, as normally it should
>> only resync discrepancies.
>>     
>
> if you change anything in the config file that changes "disk"
> parameters (like on-io-error, size, fencing, use-bmbv, ...),
> which causes drbdadm adjust to think it needs to detach/attach, and you
> do that while being primary, you get a full sync.
>
> this is unfortunate, and there should probably
> be a dialog to warn you about it.
>
> if you detach a Primary, then reattach, it will receive a full sync.
> you need to make it secondary first, if you want to avoid that.
> detaching, then reattaching a secondary will only receive an
> "incremental" resync, which typically is a few KB or nothing at all,
> depending on the timing.
>
> if this is not what happened for you, read the kernel log,
> typically drbd tells you why a resync was necessary.
>
>
> --
> : Lars Ellenberg                           http://www.linbit.com :
> : DRBD/HA support and consulting             sales at linbit.com :
> : LINBIT Information Technologies GmbH      Tel +43-1-8178292-0  :
> : Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
> __
> please don't Cc me, but send to list -- I'm subscribed
>   
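
The procedure Lars describes (make the node secondary before anything 
that may detach/attach the disk) could be sketched as the wrapper below. 
This is only a sketch under assumptions: the function name safe_adjust 
and the resource name r0 are hypothetical, and DRBDADM is a variable 
introduced here so the commands can be swapped out or printed instead of 
run. It relies on "drbdadm state", which in the 8.2 series prints the 
local/peer roles (later releases call this "drbdadm role").

```shell
#!/bin/sh
# Sketch: avoid running "drbdadm adjust" while Primary, since a
# detach/attach on a Primary leads to a full sync.
# DRBDADM and safe_adjust are illustrative names.
DRBDADM=${DRBDADM:-drbdadm}

safe_adjust() {
    res=$1
    # "drbdadm state" prints local/peer roles, e.g. "Primary/Secondary".
    role=$($DRBDADM state "$res") || return 1
    case $role in
        Primary/*)
            # Whatever uses the device (filesystem, LVM volume group)
            # has to be stopped first, or the demotion will fail.
            $DRBDADM secondary "$res" || return 1
            $DRBDADM adjust "$res" || return 1
            $DRBDADM primary "$res"
            ;;
        *)
            # Already Secondary: a reattach only needs an incremental resync.
            $DRBDADM adjust "$res"
            ;;
    esac
}
```

It would be called as "safe_adjust r0" on the node being changed. To see 
beforehand whether an adjust would detach/attach at all, 
"drbdadm -d adjust r0" (dry run) should print the commands without 
executing them. If a resync still starts, the kernel log 
(for example "dmesg | grep -i drbd") is the place to look for the reason.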
