[DRBD-user] How long does it take OOS to clear

G C gcstang at gmail.com
Tue Oct 22 15:58:46 CEST 2019


I didn't explain myself well enough. What I meant was: would a stop/start,
verify or some other command cause the OOS to be pushed to the secondary?
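
One way to tell whether any given command actually changes anything is to read the oos counter before and after running it. The following is a minimal sketch, assuming the DRBD 8.4 /proc/drbd layout, where each device has a status line starting with its minor number and a counters line ending in `oos:<KiB>`. The sample text and its numbers are made up for illustration, and `parse_oos` / `read_oos` are hypothetical helper names, not part of DRBD.

```python
import re

# Made-up sample in the DRBD 8.4 /proc/drbd layout (values are illustrative).
SAMPLE = """\
version: 8.4.10 (api:1/proto:86-101)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate A r-----
    ns:1048576 nr:0 dw:1048576 dr:2048 al:12 bm:0 lo:0 pe:3 ua:0 ap:0 ep:1 wo:f oos:51200
"""


def parse_oos(status_text):
    """Return {minor: oos in KiB} for every device found in status_text."""
    result = {}
    minor = None
    for line in status_text.splitlines():
        # A device block starts with "<minor>: cs:<connection state> ..."
        head = re.match(r"\s*(\d+):\s+cs:", line)
        if head:
            minor = int(head.group(1))
        # The counters line carries the out-of-sync amount as "oos:<number>"
        found = re.search(r"\boos:(\d+)", line)
        if found and minor is not None:
            result[minor] = int(found.group(1))
    return result


def read_oos(path="/proc/drbd"):
    """Read the live status file; only works on a host running DRBD 8.x."""
    with open(path) as handle:
        return parse_oos(handle.read())


print(parse_oos(SAMPLE))
```

Calling `read_oos()` a few seconds apart on the primary would show whether the counter is draining, flat, or growing.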

The link speed when performing a verify is about 75MB/s. I would think that
would be fast enough, and the disk is a cloud block volume, which is
potentially much faster.
It just seems that once bytes are in OOS they're not pushed again
until something else happens. Maybe it's some idle state or something else?
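
For a rough sense of how long a backlog should take to clear, the Protocol A explanation quoted further down suggests a simple rate model: OOS grows while the write rate exceeds what the link can carry, and drains at the spare bandwidth otherwise. The sketch below is only that model, not how DRBD schedules resync internally. It takes the 75MB/s figure above as the usable replication bandwidth, which is an assumption, and the write rates and backlog size are hypothetical.

```python
def oos_after(oos_mb, write_mb_s, link_mb_s, seconds):
    """Backlog in MB after `seconds` of steady load, never below zero."""
    return max(0.0, oos_mb + (write_mb_s - link_mb_s) * seconds)


def seconds_to_clear(oos_mb, write_mb_s, link_mb_s):
    """Seconds until the backlog drains, or None if it never does."""
    if oos_mb <= 0:
        return 0.0
    spare = link_mb_s - write_mb_s  # bandwidth left over for catching up
    if spare <= 0:
        return None  # writes arrive at least as fast as the link can send
    return oos_mb / spare


LINK_MB_S = 75  # figure from the post; assumed to be usable bandwidth

# Hypothetical 900 MB backlog with 60 MB/s of ongoing writes
print(seconds_to_clear(900, 60, LINK_MB_S))

# Same backlog, but writes match or exceed the link: it never clears
print(seconds_to_clear(900, 80, LINK_MB_S))
```

Under this model a backlog that stays flat means the sustained write rate is roughly equal to the effective replication bandwidth, whatever the nominal link speed is.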


On Mon, Oct 21, 2019 at 7:52 PM Digimer <lists at alteeve.ca> wrote:

> Without knowing your setup or what is limiting you, I can suggest two
> options:
>
> 1. Faster hardware (link speed / peer disk)
> 2. Switch to Protocol C
>
> digimer
>
> On 2019-10-21 4:36 p.m., G C wrote:
> > Is there anything that will force the OOS to push what is out of sync?
> >
> >
> > On Mon, Oct 21, 2019 at 11:00 AM Digimer <lists at alteeve.ca> wrote:
> >
> >     Tuning is quite instance-specific. I would always suggest starting by
> >     commenting out all tuning, see how it behaves, then tune. Premature
> >     optimization never is.
> >
> >     digimer
> >
> >     On 2019-10-21 1:31 p.m., G C wrote:
> >     > Would changing any of these values help, or is it the actual
> >     > speed between the two nodes that needs to be increased?
> >     >
> >     > disk {
> >     >         on-io-error detach;
> >     >         c-plan-ahead 10;
> >     >         c-fill-target 24M;
> >     >         c-min-rate 80M;
> >     >         c-max-rate 720M;
> >     > }
> >     > net {
> >     >         protocol A;
> >     >         max-buffers 36k;
> >     >         sndbuf-size 1024k;
> >     >         rcvbuf-size 2048k;
> >     > }
> >     >
> >     > Thank you
> >     >
> >     >
> >     >
> >     > On Mon, Oct 21, 2019 at 10:10 AM Digimer <lists at alteeve.ca> wrote:
> >     >
> >     >     I assumed it wasn't paused, but that confirms it.
> >     >
> >     >     Protocol A allows for out of sync to grow. It says "when the
> >     >     data is in the network buffer to send to the peer, consider
> >     >     the write complete". As such, data that hasn't made it over
> >     >     to the peer causes oos to climb. If you have a steady write
> >     >     rate that is faster than your transmit bandwidth, then seeing
> >     >     fairly steady OOS makes sense.
> >     >
> >     >     To "fix" it, you need to increase the connection speed to the
> >     >     peer node. Or, less likely, if the peer's disk is slower than
> >     >     the bandwidth connecting it, speed up the disk write speed.
> >     >
> >     >     In either case, what you are seeing is not a surprise, and
> >     >     it's not a problem with DRBD. The only other option is to use
> >     >     protocol C, so that a write isn't complete until it reaches
> >     >     the peer, but that will slow down the write performance of
> >     >     the primary node to be whatever speed you have to the peer.
> >     >     That's likely unacceptable.
> >     >
> >     >     In short, you have a hardware/resource issue.
> >     >
> >     >     digimer
> >     >
> >     >     On 2019-10-21 12:19 p.m., G C wrote:
> >     >     > version: 8.4.10
> >     >     > Ran the resume-sync all and received:
> >     >     > 0: Failure: (135) Sync-pause flag is already cleared
> >     >     > Command 'drbdsetup-84 resume-sync 0' terminated with exit code 10
> >     >     >
> >     >     > Protocol used is 'A', our systems are running on a cloud
> >     >     > environment.
> >     >     >
> >     >     >
> >     >     >
> >     >     >
> >     >     > On Mon, Oct 21, 2019 at 9:09 AM Digimer <lists at alteeve.ca> wrote:
> >     >     >
> >     >     >     8.9.2 is the utils version. What is the kernel module
> >     >     >     version (8.3.x/8.4.x/9.0.x)?
> >     >     >
> >     >     >     It's possible something paused sync, but I doubt it. You
> >     >     >     can try 'drbdadm resume-sync all'. The oos number should
> >     >     >     change constantly: any time a block changes it should go
> >     >     >     up, and every time a block syncs it should go down.
> >     >     >
> >     >     >     What protocol are you using? A, B or C?
> >     >     >
> >     >     >     digimer
> >     >
> >     >
> >     >     --
> >     >     Digimer
> >     >     Papers and Projects: https://alteeve.com/w/
> >     >     "I am, somehow, less interested in the weight and convolutions
> >     >     of Einstein’s brain than in the near certainty that people of
> >     >     equal talent have lived and died in cotton fields and
> >     >     sweatshops." - Stephen Jay Gould
> >     >