Hello,

We have a highly available cluster using DRBD over an ARP-monitored bonded interface. Due to the system design, the bonded interface can take as long as 5 seconds to fail over. If a bonding fail-over happens while the DRBD link is idle, a situation sometimes occurs where a DRBD ping is sent but never received by the secondary (the packet is lost at the TCP layer), so DRBD declares a Network Failure and fail-over starts even though the link comes back up in the meantime.

From drbd.conf:

Net options:
  timeout          = 8.0 sec
  connect-int      = 10 sec (default)
  ping-int         = 10 sec (default)
  max-epoch-size   = 2048 (default)
  max-buffers      = 2048 (default)
  unplug-watermark = 128 (default)
  sndbuf-size      = 2097152
  ko-count         = 2

Syncer options:
  rate       = 8192 KB/sec
  group      = 0 (default)
  al-extents = 127 (default)

When a bonding fail-over occurs while DRBD is active due to disk I/O, the Network Failure never seems to happen, because write requests are retransmitted when a write-ack is not received within the timeout period.

Is there a way to have the DRBD ping retried a number of times before the link is assumed broken?

Thanks in advance for any feedback.

Alex

PS Truly sorry if this appears again, but my first post did not appear on the mailing list.
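For reference, the settings above correspond roughly to the following drbd.conf fragment (DRBD 8.x syntax; note that `timeout` and `ping-timeout` are specified in tenths of a second, while `ping-int` and `connect-int` are in whole seconds). The `ping-timeout` line is a hypothetical illustration of stretching the ping tolerance past the 5-second bonding fail-over window, not a tested recommendation for this setup:

```
resource r0 {
  net {
    timeout          80;       # 8.0 s, in tenths of a second
    connect-int      10;       # seconds (default)
    ping-int         10;       # seconds (default)
    ping-timeout     60;       # hypothetical: 6.0 s (tenths of a second),
                               # raised above the 5 s bonding fail-over window
    ko-count         2;
    sndbuf-size      2097152;  # 2 MiB
    max-epoch-size   2048;     # default
    max-buffers      2048;     # default
    unplug-watermark 128;      # default
  }
  syncer {
    rate       8192K;          # 8192 KB/sec
    al-extents 127;            # default
  }
}
```

There appears to be no per-ping retry count in drbd.conf; `ping-timeout` (how long DRBD waits for the answer to a keep-alive packet) seems to be the closest knob, so whether raising it alone covers an idle-link outage like this is exactly the question above.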