[Drbd-dev] Avoid nested sleeping on TCP connect
Andreas Osterburg
andreas.osterburg at digide.net
Mon Feb 20 17:58:16 CET 2017
Thanks for your investigation.
I didn't use a loop because the old behaviour was to leave the function with -EAGAIN
on timeout or interrupt. There is just one difference: when an event arrives on the socket
but no TCP connection has been established, the function returns before the timeout elapses.
That is effectively the same as an interrupt, so I didn't handle it specially.
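
To make that difference concrete, here is a minimal userspace sketch in C. It is not DRBD or kernel code: `struct wake_event`, `wait_once()` and `wait_looped()` are hypothetical names invented for this illustration, and each wake-up on the listener's wait queue is modelled as "jiffies slept" plus a flag saying whether a connection is established by then. The single-shot variant mirrors the behaviour described above (an event without a connection returns -EAGAIN early); the looped variant keeps waiting for the remaining timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one wake-up on the listener's wait queue. */
struct wake_event {
	long elapsed;    /* jiffies slept before this wake-up */
	int  path_ready; /* non-zero if a connection is established by then */
};

/*
 * Single-shot wait: sleep once; a wake-up without an established
 * connection is treated like a timeout or interrupt (-EAGAIN).
 * Returns the remaining timeout (> 0) on success.
 */
static long wait_once(const struct wake_event *ev, size_t n, long timeo)
{
	assert(timeo > 0);
	if (n == 0 || ev[0].elapsed >= timeo)
		return -EAGAIN;           /* timeout elapsed first */
	if (!ev[0].path_ready)
		return -EAGAIN;           /* early exit: event, but no connection */
	return timeo - ev[0].elapsed;
}

/*
 * Looped wait: after a wake-up without a connection, go back to sleep
 * for whatever is left of the timeout.
 */
static long wait_looped(const struct wake_event *ev, size_t n, long timeo)
{
	assert(timeo > 0);
	for (size_t i = 0; i < n; i++) {
		if (ev[i].elapsed >= timeo)
			break;                /* timeout elapsed during this sleep */
		timeo -= ev[i].elapsed;
		if (ev[i].path_ready)
			return timeo;         /* connection established in time */
	}
	return -EAGAIN;
}
```

With an event sequence of "wake after 10 without a connection, then wake after 20 with one" and a timeout of 100, `wait_once()` gives up with -EAGAIN on the first event, while `wait_looped()` succeeds with 70 left. In both cases the caller sees only -EAGAIN or success, which is why I treated the early exit like an interrupt.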
Thanks,
Andreas Osterburg
On 20.02.2017 at 15:07, Lars Ellenberg wrote:
> On Mon, Feb 20, 2017 at 11:54:45AM +0100, Andreas Osterburg wrote:
>> Recent Linux kernels (since 3.19) emit a warning when kernel code calls a
>> blocking operation while already preparing to sleep (nested sleeping).
>> CONFIG_DEBUG_ATOMIC_SLEEP must be enabled to see it.
>> The drbd_transport_tcp module is affected and always triggers a warning
>> on the first connect:
>> [ 6187.934573] WARNING: CPU: 33 PID: 17430 at ../kernel/sched/core.c:7963 __might_sleep+0x76/0x80()
>> [ 6187.934580] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810c2dce>] prepare_to_wait_event+0x5e/0xf0
>
>> [ 6187.934926] [<ffffffff810a30b6>] __might_sleep+0x76/0x80
>> [ 6187.934936] [<ffffffff8160984c>] mutex_lock+0x1c/0x38
>> [ 6187.934981] [<ffffffffa05ba8f0>] dtt_wait_connect_cond+0x20/0xa0 [drbd_transport_tcp]
>> [ 6187.935017] [<ffffffffa05bb3ce>] dtt_wait_for_connect.constprop.10+0x29e/0x440 [drbd_transport_tcp]
>> [ 6187.935033] [<ffffffffa05bbde7>] dtt_connect+0x247/0x7b7 [drbd_transport_tcp]
>> [ 6187.935072] [<ffffffffa05300e1>] drbd_receiver+0x171/0x680 [drbd]
>
>> I fixed this; the patch is attached to this mail. If it is OK, someone should apply it.
>
> Looks almost correct (the loop is missing).
> I don't yet see a real problem with this particular code;
> even just annotating that "this is ok" so the warning goes away
> would be "legal" (sched_annotate_sleep() before mutex_lock()).
>
> We are discussing whether to replace the mutex_lock
> with a mutex_trylock, or even with a spinlock.
> Either way, a real fix should be in "soon".
>
> Thanks,
>
> Lars
>
>> --- drbd/drbd_transport_tcp.c 2016-12-06 16:20:39.000000000 +0100
>> +++ drbd/drbd_transport_tcp.c 2017-02-20 11:23:46.794979063 +0100
>> @@ -568,6 +568,7 @@
>> struct drbd_path *drbd_path2;
>> struct dtt_listener *listener = container_of(drbd_listener, struct dtt_listener, listener);
>> struct dtt_path *path = NULL;
>> + DEFINE_WAIT_FUNC(wait_connect, woken_wake_function);
>>
>> rcu_read_lock();
>> nc = rcu_dereference(transport->net_conf);
>> @@ -582,9 +583,15 @@
>> timeo += (prandom_u32() & 1) ? timeo / 7 : -timeo / 7; /* 28.5% random jitter */
>>
>> retry:
>> - timeo = wait_event_interruptible_timeout(listener->wait,
>> - (path = dtt_wait_connect_cond(transport)),
>> - timeo);
>> + add_wait_queue(&listener->wait, &wait_connect);
>> + path = dtt_wait_connect_cond(transport);
>> + if(!path) {
>> + wait_woken(&wait_connect, TASK_INTERRUPTIBLE, timeo);
>> + path = dtt_wait_connect_cond(transport);
>> + if(!path) timeo = 0;
>> + }
>> + remove_wait_queue(&listener->wait, &wait_connect);
>> +
>> if (timeo <= 0)
>> return -EAGAIN;
>
>
--
Andreas Osterburg
IT Software GmbH & Data Security Elbe KG Tel.: +49 (391) 509609-55
Lorenzweg 42 - Haus 3, D-39124 Magdeburg Fax : +49 (391) 509609-56
Managing director: Jens Henning Amtsgericht Stendal, HRA 22588
Certified to ISO 9001:2008