[Drbd-dev] DRBD8: disconnecting while already disconnecting can
hang the receiver
Montrose, Ernest
Ernest.Montrose at stratus.com
Tue Nov 27 16:06:25 CET 2007
Phil,
Interesting... With a delay at the end of drbd_disconnect() it happens
every time for me. I delay for 30 seconds there and quickly issue the
disconnect within that window.
I added this at the very end of drbd_disconnect:
    if (os.conn == TearDown && ns.conn == Unconnected && mdev->minor == 11)
    {
        INFO("drbd_disconnect: ##5# EM-- Done but waiting 30 seconds######\n");
        set_current_state(TASK_INTERRUPTIBLE);
        schedule_timeout(HZ * 30);
        INFO("drbd_disconnect: ##5# EM-- Done ##### waiting 30 seconds######\n");
    }
Notice the mdev->minor == 11 check: you can change the 11 to the minor
of whichever device you are doing the disconnect on. Once you see the
message "done waiting", issue the local disconnect. Put the instrumented
driver on one side (the side that will do the last disconnect).
BTW, I agree that your spin on the patch is less intrusive. I will test
that and let you know.
EM--
-----Original Message-----
From: Philipp Reisner [mailto:philipp.reisner at linbit.com]
Sent: Tuesday, November 27, 2007 9:53 AM
To: drbd-dev at linbit.com
Cc: Montrose, Ernest
Subject: Re: [Drbd-dev] DRBD8: disconnecting while already disconnecting
can hang the receiver
On Tuesday 27 November 2007 14:06:46 Montrose, Ernest wrote:
> Phil,
> I looked at my notes... To reproduce this you can fake the condition
> this way:
> * Issue a disconnect on node0 for r5.
> * Locally on node1 we will get into drbd_receiver.c:drbd_disconnect()
> and while there in drbd_disconnect() (Put a small delay there or
> something); issue a "drbdsetup /dev/drbd5 disconnect".
>
> This last drbdsetup will time out with " No response from the DRBD
> driver! Is the module loaded?"
> But the driver will be waiting forever in
> drbd_nl.c:drbd_nl_disconnect().
>
Yes. This is what I tested. I had a delay in drbd_disconnect().
I did not manage to get it into trouble.
BTW, while looking at the patch, I would have done it like this:
@@ -589,7 +589,8 @@ STATIC int is_valid_state_transition(drbd_dev* mdev,drbd_state_t ns,drbd_state_t
     if( (ns.conn == StartingSyncT || ns.conn == StartingSyncS ) &&
         os.conn > Connected) rv=SS_ResyncRunning;
-    if( ns.conn == Disconnecting && os.conn == StandAlone)
+    if ( ns.conn == Disconnecting &&
+         ( os.conn == StandAlone || os.conn == TearDown ) )
         rv=SS_AlreadyStandAlone;
     if( ns.disk > Attaching && os.disk == Diskless)
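
To show what the changed condition does in isolation, here is a minimal
standalone sketch. Only the state names and the condition itself come from
the diff above; the enum ordering, the numeric return values and the helper
names check_old(), check_new() and demo_transition_checks() are assumptions
made up for illustration and are not the driver's actual definitions.

```c
#include <assert.h>

/* Connection states named in this thread. The ordering and values are
 * placeholders (assumption), not the driver's real enum. */
enum conn_state {
    StandAlone,
    Disconnecting,
    Unconnected,
    TearDown,
    Connected
};

/* Return codes named in the diff. The numeric values are placeholders
 * (assumption), chosen only so the two results are distinguishable. */
enum set_state_rv {
    SS_Success = 1,
    SS_AlreadyStandAlone = -1
};

/* Condition as it stood before the change: only StandAlone is rejected
 * when a transition to Disconnecting is requested. */
enum set_state_rv check_old(enum conn_state os, enum conn_state ns)
{
    if (ns == Disconnecting && os == StandAlone)
        return SS_AlreadyStandAlone;
    return SS_Success;
}

/* Condition after the change: a request to go to Disconnecting while the
 * old state is TearDown is rejected as well. */
enum set_state_rv check_new(enum conn_state os, enum conn_state ns)
{
    if (ns == Disconnecting && (os == StandAlone || os == TearDown))
        return SS_AlreadyStandAlone;
    return SS_Success;
}

/* Small self-check that exercises both variants. */
void demo_transition_checks(void)
{
    /* Old condition lets TearDown -> Disconnecting through. */
    assert(check_old(TearDown, Disconnecting) == SS_Success);
    /* New condition rejects it, the same way as StandAlone. */
    assert(check_new(TearDown, Disconnecting) == SS_AlreadyStandAlone);
    assert(check_new(StandAlone, Disconnecting) == SS_AlreadyStandAlone);
    /* Other old states are unaffected by this particular condition. */
    assert(check_new(Connected, Disconnecting) == SS_Success);
}
```

In this sketch a second disconnect request that arrives while the old state
is TearDown is turned away immediately by check_new(), whereas check_old()
lets it proceed. The real function performs more checks than this one
condition.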
-Phil
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, 1120 Vienna, Austria http://www.linbit.com :