[DRBD-user] c-min-rate priority

Ben RUBSON ben.rubson at gmail.com
Mon Jun 8 22:24:11 CEST 2015

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Thank you for your answer, and don't worry, we understand you clearly!

What I want is the following:
When the secondary node is outdated, I want it to become up to date as soon as possible, even while new blocks keep arriving on the primary node (which of course also need to be replicated).
My priority is to catch up on the replication delay.
So, when the secondary node is outdated, I want DRBD to reduce / delay application IOs on the primary node, so that the replication link is almost entirely dedicated to catching up, making the secondary node up to date as soon as possible.
Once the secondary node is up to date, the replication link is fully available again for new blocks, and the application data rate can then reach the link bandwidth again.

This is why I tried to use:
c-plan-ahead 0;
resync-rate 680M; --> my replication link can go up to 680M
c-min-rate 400M; --> I want at least 400M to catch up on the delay

However, it does not work; the application write data rate keeps reaching 600/680M even while the secondary node is outdated and we have replication delay to catch up on.
I would have expected the application data rate to be reduced to 280M (680 - 400), leaving 400M of bandwidth to catch up on the delay.
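
For reference, here is the same fragment again with comments giving my understanding of each option (the c-min-rate comment is my assumption, and may well be the wrong part):

disk {
    c-plan-ahead 0;      # disables the dynamic resync-rate controller
    resync-rate  680M;   # fixed resync rate used instead (my link maximum)
    c-min-rate   400M;   # rate I assume is reserved for the resync,
                         # leaving 680M - 400M = 280M for application writes
}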

I may be wrong in my settings / in my understanding of how DRBD works in this specific case.

Thank you again,

Ben



> On 8 June 2015 at 21:00, A.Rubio <aurusa at etsii.upv.es> wrote:
> 
> Hello. 
> 
> I didn't reply to your email because:
> 
> I'm a DRBD user. I'm not on the DRBD team.
> 
> My English is poor.
> 
> And I don't understand what your problem is, because in the email you sent, the DRBD sync is approx. 680MB/s.
> 
> Is the RAID 10 figure of 800MB/s a theoretical or a tested value?
> 
> Did you test the link connection? What is the real speed of your connection? I think that 680MB/s is good. The theoretical value on a 10Gb/s link is 1280MB/s, but the real value depends.
> 
> You can test the link connection speed with
> 
> http://eggdrop.ch/projects/bwbench/
> 
> What is it you want?
> 
> More speed in sync?
> 
> 
> On 08/06/15 at 16:47, Ben RUBSON wrote:
>> Hello,
>> 
>> I am sorry to ask again, but could you help me with this, please?
>> I really don't know how to go further, or whether the behavior I would like is supported by DRBD at all...
>> 
>> DRBD team?
>> Any support would really be appreciated.
>> 
>> Thank you again,
>> 
>> Best regards,
>> 
>> Ben
>> 
>> 
>> 
>> 2015-05-28 17:05 GMT+02:00 Ben RUBSON <ben.rubson at gmail.com>:
>> So,
>> 
>> I played for hours with the dynamic resync rate controller.
>> Here are my settings:
>> 
>> c-plan-ahead 10; // 10ms between my 2 nodes, but a minimum of 1 second is recommended here: https://blogs.linbit.com/p/128/drbd-sync-rate-controller/
>> resync-rate 680M; //mainly ignored
>> c-min-rate 400M;
>> c-max-rate 680M;
>> c-fill-target 20M; //my BDP is 6.8M, guides say to use a value between 1x and 3x BDP
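>> 
>> As a side note, the c-fill-target value follows the usual bandwidth-delay-product rule of thumb:
>> 
>> BDP           = 680 MB/s x 0.010 s = 6.8 MB
>> c-fill-target = 20M, i.e. roughly 3 x BDP, at the top of the 1x-3x range the guides suggest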
>> 
>> Resync can reach up to 680M when there are no application IOs on the source.
>> However, as soon as there are application IOs (writes with dd in my tests), resync slows down to a few MB/s...
>> I played with c-plan-ahead and c-fill-target without success.
>> I also tested c-delay-target.
>> I tried to set unplug-watermark to 16.
>> My IO scheduler is already the deadline one...
>> 
>> Well, I'm a little bit lost; I can't manage to get a minimum resync rate of 400M when there are application IOs...
>> 
>> Here, Lars says:
>> http://lists.linbit.com/pipermail/drbd-user/2011-August/016739.html
>> The dynamic resync rate controller basically tries to use as much as c-max-rate bandwidth, but will automatically throttle, if
>> - application IO on the device is detected (READ or WRITE), AND the estimated current resync speed is above c-min-rate
>> - the amount of in-flight resync requests exceeds c-fill-target
>> 
>> However, does DRBD throttle application IOs when the resync rate is lower than c-min-rate?
>> According to my tests, I'm not so sure.
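>> 
>> Reading those two conditions literally (this is only my interpretation of Lars' description), they annotate onto the controller options like this:
>> 
>> disk {
>>     c-max-rate    680M;  # the controller tries to use up to this much bandwidth
>>     c-min-rate    400M;  # resync is throttled in favour of application IO only
>>                          # while the estimated resync speed is still above this
>>     c-fill-target 20M;   # resync also backs off once this much resync data
>>                          # is in flight
>>     # Both conditions describe slowing the resync down; neither one mentions
>>     # slowing application IO down, hence my doubt above.
>> }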
>> 
>> 
>> 
>> 2015-05-26 15:06 GMT+02:00 A.Rubio <aurusa at etsii.upv.es>:
>> Have you tested these values?
>> 
>> https://drbd.linbit.com/users-guide/s-throughput-tuning.html
>> 
>> 
>> On 26/05/15 at 13:16, Ben RUBSON wrote:
>>> The RAID controller is OK, yes.
>>> 
>>> Here is a 4-step example of the issue:
>>> 
>>> 
>>> 
>>> ### 1 - initial state:
>>> 
>>> Source :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 0
>>> - eth1 incoming MB/s : 0
>>> - eth1 outgoing MB/s : 0
>>> Target :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 0
>>> - eth1 incoming MB/s : 0
>>> - eth1 outgoing MB/s : 0
>>> 
>>> 
>>> 
>>> ### 2 - run dd if=/dev/zero of=bigfile:
>>> 
>>> Source :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 670
>>> - eth1 incoming MB/s : 1
>>> - eth1 outgoing MB/s : 670
>>> Target :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 670
>>> - eth1 incoming MB/s : 670
>>> - eth1 outgoing MB/s : 1
>>> 
>>> 
>>> 
>>> ### 3 - disable the link between the 2 nodes:
>>> 
>>> Source :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 670
>>> - eth1 incoming MB/s : 0
>>> - eth1 outgoing MB/s : 0
>>> Target :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 0
>>> - eth1 incoming MB/s : 0
>>> - eth1 outgoing MB/s : 0
>>> 
>>> 
>>> 
>>> ### 4 - re-enable the link between the 2 nodes:
>>> 
>>> Source :
>>> - sdb read MB/s      : ~20
>>> - sdb write MB/s     : ~670
>>> - eth1 incoming MB/s : 1
>>> - eth1 outgoing MB/s : 670
>>> Target :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 670
>>> - eth1 incoming MB/s : 670
>>> - eth1 outgoing MB/s : 1
>>> DRBD resource :
>>>  1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
>>>     ns:62950732 nr:1143320132 dw:1206271712 dr:1379744185 al:9869 bm:6499 lo:2 pe:681 ua:1 ap:0 ep:1 wo:d oos:11883000
>>>     [>...................] sync'ed:  6.9% (11604/12448)M
>>>     finish: 0:34:22 speed: 5,756 (6,568) want: 696,320 K/sec
>>> 
>>> 
>>> 
>>> ### values I would have expected in step 4:
>>> 
>>> Source :
>>> - sdb read MB/s      : ~400 (because of c-min-rate 400M)
>>> - sdb write MB/s     : ~370
>>> - eth1 incoming MB/s : 1
>>> - eth1 outgoing MB/s : 670
>>> Target :
>>> - sdb read MB/s      : 0
>>> - sdb write MB/s     : 670
>>> - eth1 incoming MB/s : 670
>>> - eth1 outgoing MB/s : 1
>>> 
>>> Why is resync totally ignored while the application (dd here in the example) still consumes all the available IOs / bandwidth?
>>> 
>>> 
>>> 
>>> 2015-05-25 16:50 GMT+02:00 A.Rubio <aurusa at etsii.upv.es>:
>>> Are the cache and I/O settings in the RAID controller optimal? Write-back, write-through, cache enabled, direct I/O, ...
>>> 
>>> On 25/05/15 at 11:50, Ben RUBSON wrote:
>>> The link between nodes is a 10Gb/s link.
>>> The DRBD resource is a RAID-10 array which is able to resync at up to 800M (as you can see I have lowered it to 680M in my configuration file).
>>> 
>>> The "issue" here seems to be a prioritization "issue" between application IOs and resync IOs.
>>> Perhaps I miss-configured something ?
>>> Goal is to have resync rate up to 680M, with a minimum of 400M, even if there are application IOs.
>>> This in order to be sure to complete the resync even if there are a lot of write IOs from the application.
>>> 
>>> With my simple test below, this is not the case: dd still writes at its best throughput, pushing the resync rate down so much that it can't reach 400M at all.
>>> 
>>> Thank you !
>>> 
>>> On 25 May 2015 at 11:18, A.Rubio <aurusa at etsii.upv.es> wrote:
>>> 
>>> What is the link between the nodes? 1Gb/s? 10Gb/s? ...
>>> 
>>> What are the hard disks? SATA 7200rpm? 10000rpm? SAS?
>>> SSD? ...
>>> 
>>> 400M to 680M with a 10Gb/s link and 15,000 rpm SAS disks is OK, but with less ...
>>> 
>>> On 12 Apr 2014 at 17:23, Ben RUBSON <ben.rubson at gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> Let's assume the following configuration:
>>> disk {
>>>     c-plan-ahead 0;
>>>     resync-rate 680M;
>>>     c-min-rate 400M;
>>> }
>>> 
>>> Both nodes are up to date, and on the primary I have a test IO burst running, using dd.
>>> 
>>> I then cut the replication link for a few minutes so that the secondary node falls several GB behind the primary node.
>>> 
>>> I then re-enable the replication link.
>>> What I expect here, according to the configuration, is that the secondary node will fetch the missing GB at a throughput of 400 MB/s.
>>> DRBD should then prefer resync IOs over application (dd here) IOs.
>>> 
>>> However, it does not seem to work.
>>> dd still writes at its best throughput, while reads are made from the primary disk at between 30 and 60 MB/s to complete the resync.
>>> Of course this is not the expected behaviour.
>>> 
>>> Did I miss something?
