[DRBD-user] [0.7.23] reconnect problem after link loss
Lukasz Engel
lukasz.engel at softax.com.pl
Tue Apr 24 17:25:09 CEST 2007
Lars Ellenberg wrote:
> On Tue, Apr 24, 2007 at 02:14:58PM +0200, Lukasz Engel wrote:
>
>>> On Tue, Apr 24, 2007 at 12:10:37PM +0200, Lukasz Engel wrote:
>>>
>>>
>>>> I have 2 machines running drbd 0.7.23 (self-compiled) with 5 configured
>>>> drbdX resources (and heartbeat running on top);
>>>> drbd uses a direct cross-over cable for synchronization. Kernel 2.6.19.2
>>>> (vendor kernel - trustix 3) UP.
>>>>
>>>> Today I disconnected and reconnected the direct cable, and after that 2 of 5
>>>> drbds were failing to reconnect:
>>>> drbd0,2,4 successfully connected
>>>> drbd1 on secondary blocked in NetworkFailure state (WFConnection on
>>>> primary)
>>>> drbd3 was retrying to reconnect, but could not succeed (always went to
>>>> BrokenPipe after WFReportParams)
>>>>
>>>>
>>> this should not happen.
>>> it is known to happen sometimes anyways.
>>> it is some sort of race condition.
>>>
>>> the scheme to avoid it is heavily dependent on timeouts.
>>>
>>>
>> Any chance of a fix?
>> (If it helps, I should be able to disconnect my drbd link sometimes to
>> run some tests...)
>>
>
> I remember similar symptoms from a long time ago,
> when we spent a long time debugging this.
> We thought we had fixed it.
> You see the same symptoms again.
> It may be a different problem, or it may be that our "fix" back then
> only made it less likely to occur.
>
> Since I can not reproduce it, I can not debug it.
> If you can track down _why_ it happens, great.
> I'm happy to fix it then.
>
Any hints on how to debug the problem?
This is my production environment, but I think I can add some debug output
(printk's) to the drbd code (good question: where?). If the problem
appeared more than once, it is highly probable that it will appear again (I may
"help" by playing with the drbd eth cable...).
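
Besides printk's, the state transitions could probably be captured from user
space as well. Below is a minimal Python sketch, not anything taken from the
drbd sources. It assumes the per-device lines in /proc/drbd look like
" 1: cs:WFConnection st:Primary/Unknown ld:Consistent"; the function names
(parse_states, diff_states, watch) are made up for illustration. Polling can
miss very short-lived states, so this would complement the kernel log rather
than replace it.

```python
import re
import time

# Assumed shape of a device line in /proc/drbd (drbd 0.7):
#   " 1: cs:WFConnection st:Primary/Unknown ld:Consistent"
STATE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)", re.MULTILINE)


def parse_states(text):
    """Return {minor: connection_state} for every device line in text."""
    return {int(m.group(1)): m.group(2) for m in STATE_RE.finditer(text)}


def diff_states(old, new):
    """Return (minor, old_state, new_state) for each device whose state changed."""
    return [
        (minor, old.get(minor), state)
        for minor, state in sorted(new.items())
        if old.get(minor) != state
    ]


def watch(path="/proc/drbd", interval=0.2, rounds=None):
    """Poll path and print a timestamped line for each state change.

    rounds=None polls until interrupted; a number limits the poll count.
    """
    previous = {}
    count = 0
    while rounds is None or count < rounds:
        with open(path) as handle:
            current = parse_states(handle.read())
        for minor, old_state, new_state in diff_states(previous, current):
            stamp = time.strftime("%H:%M:%S")
            print("%s drbd%d: %s -> %s" % (stamp, minor, old_state, new_state))
        previous = current
        count += 1
        time.sleep(interval)
```

Running watch() on both nodes while unplugging the cable should give a
timeline of connection states per device that can be matched against the
kernel log.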
[I am resending with correct (subscribed) address, another copy probably
is already waiting for moderator...]
--
Lukasz Engel