[DRBD-user] [0.7.23] reconnect problem after link loss
Lars Ellenberg
lars.ellenberg at linbit.com
Tue Apr 24 12:32:31 CEST 2007
On Tue, Apr 24, 2007 at 12:10:37PM +0200, Lukasz Engel wrote:
> I have 2 machines running drbd 0.7.23 (self compiled) with 5 configured
> drbdX resources (and heartbeat running above),
> drbd uses a direct cross-over cable for synchronization. Kernel 2.6.19.2
> (vendor kernel - trustix 3) UP.
>
> Today I disconnected and reconnected the direct cable, and after that 2 of 5
> drbds were failing to reconnect:
> drbd0,2,4 successfully connected
> drbd1 on secondary blocked in NetworkFailure state (WFConnection on
> primary)
> drbd3 was retrying to reconnect, but could not succeed (always went to
> BrokenPipe after WFReportParams)
this should not happen.
it is known to happen sometimes anyways.
it is some sort of race condition.
the scheme to avoid it is heavily dependent on timeouts.
> drbdadm down/up for both failed devices helped
that is the recommended workaround to solve this behaviour.
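
as a rough illustration of that workaround, here is a minimal Python sketch that
looks for devices sitting in one of the two states reported above
(NetworkFailure, BrokenPipe) and builds the matching drbdadm down/up command
pairs. it is only a sketch under assumptions: the status line shape
(" 1: cs:NetworkFailure ...") is assumed from how /proc/drbd typically looks
and may differ on your version, the helper names are made up for this example,
and the minor-to-resource mapping is hypothetical and has to come from your own
drbd.conf. a device that keeps cycling through states (like drbd3 above) can
be missed by a single snapshot.

```python
import re

# Connection states reported as stuck in the original post.
STUCK_STATES = {"NetworkFailure", "BrokenPipe"}

# Assumed status line shape: " 1: cs:NetworkFailure st:Secondary/Unknown ..."
STATUS_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")


def stuck_minors(proc_drbd_text):
    """Return {minor: connection_state} for devices in a stuck state."""
    stuck = {}
    for line in proc_drbd_text.splitlines():
        match = STATUS_LINE.match(line)
        if match and match.group(2) in STUCK_STATES:
            stuck[int(match.group(1))] = match.group(2)
    return stuck


def workaround_commands(stuck, resource_names):
    """Build drbdadm down/up command pairs for each stuck device.

    resource_names maps minor number -> resource name from drbd.conf
    (hypothetical mapping; supply your own).
    """
    commands = []
    for minor in sorted(stuck):
        resource = resource_names[minor]
        commands.append(["drbdadm", "down", resource])
        commands.append(["drbdadm", "up", resource])
    return commands
```

the functions only build the command lists and run nothing; review them and
execute them by hand (or via subprocess) on the affected node.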
--
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.