[DRBD-user] the timing of restarting thread

Lars Ellenberg lars.ellenberg at linbit.com
Sun Jul 25 16:41:25 CEST 2010


On Sun, Jul 25, 2010 at 01:38:56AM +0900, Junko IKEDA wrote:
> Hi,
> >> > DRBD has _two_ tcp sessions per device,
> >> > one end will have a "random high port",
> >> > the other end the configured port.
> >>
> >> Are these two sessions for "data" and "meta" socket as you mentioned below?
> >> I think I want to simulate the blocking of "meta" socket.
> >
> > Ah.  Why?
> > Please step back a bit and suggest which _real world_ scenario
> > you have in mind. What is it that you are trying to prove or analyse?
> >
> > Apart from sniffing the traffic, there is no easy way to
> > determine which is which just from looking at it.
> I want to reproduce the following situation.
> Primary can send "data" to Secondary,
> but only "meta data" is dropped unfortunately.

Again, please suggest a real world failure scenario.
Did you experience any strange replication problems,
or are you just "fantasizing" about esoteric failure modes?

> It might be an unrealistic worry...

If sockets fail in some detectable (by tcp) fashion,
(RST, icmp unreachable or similar),
both sockets are dropped.

Replication is only reestablished once both sockets
have successfully been reestablished.

If sockets fail in some "strange" way (no RST, no icmp,
just a black hole), periodic in-protocol DRBD Ping packets
(on the meta socket) would no longer be answered,
and again both sockets are dropped.

If the meta data socket is still ok (DRBD Pings are still answered in a
timely fashion), but there is no progress on the data socket,
read about ko-count.
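
As a rough illustration of the three cases above, here is a minimal
Python sketch. It is not DRBD code: the function, its parameters and
the Action names are all made up for illustration, and the real
detection happens inside the kernel module.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical labels, not DRBD identifiers.
    KEEP = "keep both sockets"
    DROP_BOTH = "drop both sockets, reconnect once both can be reestablished"
    KO_COUNT = "data socket stalled, ko-count decides"


def classify(tcp_error: bool, ping_answered: bool, data_progress: bool) -> Action:
    """Map the observed connection state to the behaviour described above."""
    # Case 1: TCP itself notices the failure (RST, ICMP unreachable or similar).
    if tcp_error:
        return Action.DROP_BOTH
    # Case 2: black hole; DRBD Pings on the meta socket are no longer answered.
    if not ping_answered:
        return Action.DROP_BOTH
    # Case 3: meta socket is fine, but there is no progress on the data socket.
    if not data_progress:
        return Action.KO_COUNT
    return Action.KEEP
```

Note that there is no branch in which only one of the two sockets is
dropped and reopened.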

> >>DRBD cannot replicate the data if the "data" socket is blocked,
> >>and DRBD reopens a new socket if the "meta" socket is blocked.
> >>Is that right?
> >
> >No.
> >If one of the sockets is detected to not work,
> >both are dropped, and eventually reestablished.
> ok, that means,
> in my previous test, where I may _by chance_ have blocked the "data" socket,
> the socket should be eventually reestablished.
> Is there any special delay for only the "data" socket?

read about ko-count.
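
A back-of-the-envelope sketch of the delay in question, assuming the
semantics described in the drbd.conf man page: "timeout" is given in
tenths of a second, a peer that fails to complete a single write
request for ko-count times the timeout is expelled, and a ko-count of
0 disables the feature. The function name and the example values are
made up for illustration; check the man page of your DRBD version
for the actual defaults.

```python
from typing import Optional


def seconds_until_expelled(ko_count: int, timeout_tenths: int) -> Optional[float]:
    """Rough time a stalled data socket is tolerated before the peer is expelled.

    Assumes the peer is expelled after ko_count * timeout, with timeout
    given in units of 0.1 seconds, and that ko_count == 0 means "disabled".
    """
    if ko_count == 0:
        return None  # feature disabled, no ko-count based expulsion
    return ko_count * timeout_tenths / 10.0


# Example values only, not a recommendation:
print(seconds_until_expelled(ko_count=7, timeout_tenths=60))  # 42.0
```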

> Does "delay_prove" have some relation?

It was probe, not prove,
and it is totally unrelated.

That was an attempt to do auto-throttling of the resyncer
to have less impact on application IO
while utilizing as much as possible of "idle bandwidth".

This has been reverted since, as it did not meet expectations.
The "auto-throttling" feature is being implemented differently,
and is expected to be released with 8.3.9.

It has absolutely nothing to do with connection problems.

: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed
