[DRBD-user] [0.7.23] reconnect problem after link loss

Lars Ellenberg lars.ellenberg at linbit.com
Tue Apr 24 12:32:31 CEST 2007


On Tue, Apr 24, 2007 at 12:10:37PM +0200, Lukasz Engel wrote:
> I have 2 machines running drbd 0.7.23 (self compiled) with 5 
> drbdX resources configured (and heartbeat running above),
> drbd uses direct cross-over cable for synchronization. Kernel 2.6.19.2 
> (vendor kernel - trustix 3) UP.
> 
> Today I disconnected and reconnected the direct cable and after that 2 of 5 
> drbds were failing to reconnect:
> drbd0,2,4 successfully connected
> drbd1 on secondary blocked in NetworkFailure state (WFConnection on 
> primary)
> drbd3 was retrying to reconnect, but could not succeed (always went to 
> BrokenPipe after WFReportParams)

this should not happen.
it is known to happen sometimes anyway.
it is some sort of race condition.
the scheme to avoid it is heavily dependent on timeouts.

> drbdadm down/up for both failed devices helped

that is the recommended workaround for this behaviour.
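
a minimal sketch of that workaround as a shell helper. the function name
bounce_resources, the DRBDADM variable and the resource names r1 and r3 are
assumptions for illustration only; use the resource names from your own
drbd.conf. run it on the node where the resource is stuck. "down" will
typically refuse on a Primary whose device is still in use.

```shell
#!/bin/sh
# Bounce (down, then up) each named DRBD resource.
# DRBDADM is a hypothetical override: set DRBDADM="echo drbdadm"
# to print the commands instead of running them (dry run).
DRBDADM=${DRBDADM:-drbdadm}

bounce_resources() {
    for res in "$@"; do
        # stop at the first failure so a half-bounced resource is noticed
        $DRBDADM down "$res" || return 1
        $DRBDADM up "$res" || return 1
    done
}

# Example with placeholder resource names:
#   bounce_resources r1 r3
```

afterwards, /proc/drbd should show whether the devices reconnected.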

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


