[DRBD-user] Question about the 'ping' mechanism

Iustin Pop iustin at google.com
Tue Mar 17 10:27:21 CET 2009



On Tue, Mar 17, 2009 at 10:15:36AM +0100, Lars Ellenberg wrote:
> On Tue, Mar 17, 2009 at 09:57:48AM +0100, Iustin Pop wrote:
> > Hi,
> > 
> > I've searched the docs (and skimmed the sources) but I can't find a
> > definitive answer about the following. Note that I'm using DRBD 8.0.12,
> > not the latest version.
> > 
> > The man page for drbdsetup makes it sound like both ends of a DRBD pair
> > will ping each other when there is no activity. However, in practice, on
> > an unused device only one of the two TCP connections used by the DRBD
> > pair sees pings; the other one sees no traffic at all.
> 
> right.
> "DRBD Pings" are only sent via the "meta connection",
> not the "data connection".

Aha, now I understand why there are two TCP connections (and why the
meta connections are torn down nicely, but not the data ones).
Thanks.

> > (This would not usually be a problem, except that if one has iptables
> > and conntrack enabled on the machine, and the device is not used for
> > a long enough time, the second TCP connection will be forgotten by the
> > conntrack module.)
> 
> Ouch.
> 
> hm. are there tcp keepalive packets?

There would be, if only DRBD enabled them; but a quick "git grep
SO_KEEPALIVE" returns nothing.

> or should we do our own in-DRBD-protocol keepalive as well?

If it were possible to enable pings on the data connection as well,
that would be better I think, since DRBD could then detect whether the
actual remote DRBD thread is responding correctly (and not just its TCP
stack). I also don't know how DRBD would react to its connection being
shut down by the TCP stack while idle (as opposed to a send error during
activity).

But the TCP keepalive is indeed a quicker and smaller change (just
setsockopt(..., SO_KEEPALIVE)) than changing the DRBD data protocol. Up
to you whether you think this could be a common enough occurrence that it
needs handling by DRBD.
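
To illustrate what the setsockopt() route amounts to, here is a minimal
userspace sketch in Python. It is not DRBD code (DRBD would have to do
the equivalent on its in-kernel sockets); the helper names and the timer
values are assumptions chosen for illustration, and the per-socket timer
options are only applied where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Enable TCP keepalive on sock (hypothetical helper, not a DRBD API).

    idle_s, interval_s and probes are illustrative values, not
    recommendations.
    """
    # The one-line change discussed above: let the TCP stack probe idle peers.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket timers. These option names are platform-specific
    # (present on Linux), so each one is set only if this platform has it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def keepalive_enabled(sock):
    """Return True if SO_KEEPALIVE is currently set on sock."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

Note that this only proves the remote TCP stack is alive; it says nothing
about whether the remote DRBD thread is responding, which is the
advantage of a protocol-level ping.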

Anyway, for us I now know how to proceed (since this is the expected
behaviour).

thanks!
iustin


