[Drbd-dev] wait_event called while nested inside another wait_event, using the same wait queue
David Butterfield
dab21774 at gmail.com
Mon Jul 8 19:53:55 CEST 2019
change_cluster_wide_state() calls wait_event, nested inside the condition-test of another
wait_event call, using *the same wait queue* for both the outer and the inner wait calls.
I don't understand the intended kernel model precisely enough to know if this is a subtle bug,
or if it's really guaranteed OK to do this. But it's pretty suspicious if it hasn't been
thought through carefully, so I thought I'd mention it.
The outer wait on resource->state_wait occurs when drbd_set_role() calls stable_state_change()
to wait for the condition of a successful change_role(). change_role() is called for the
condition-test and calls change_cluster_wide_state() which issues the inner (nested) wait on
resource->state_wait, waiting for cluster_wide_reply_ready().
Possibly it could work or fail depending on changes to the kernel wait implementation.
For example, suppose the underlying wait implementation held a wait-queue lock during the
condition check. Calling wait_event nested inside the condition-test of another wait_event
on the same wait_queue would lead to a recursive lock-acquisition attempt. (On the other
hand that may only prove that the implementation can't hold such a lock.)
The nested usage does not seem to have any bad effects (now that my wait implementation no
longer holds a wait-queue lock during the condition-test).
diff --git a/drbd/drbd_state.c b/drbd/drbd_state.c
index c3371e18..409b9afc 100644
--- a/drbd/drbd_state.c
+++ b/drbd/drbd_state.c
@@ -4195,6 +4195,18 @@ change_cluster_wide_state(bool (*change)(struct change_context *, enum change_ph
&request, reach_immediately);
have_peers = rv == SS_CW_SUCCESS;
if (have_peers) {
+ // We are here as the change_state macro from inside the condition of
+ // wait_event_interruptible((resource)->state_wait,
+ // (rv = (change_state)) != SS_IN_TRANSIENT_STATE);
+ //
+ //XXX Is it really kosher to now use that same resource->state_wait
+ // in another wait_event while we are nested within the first one?
+ //
+ // In drbd_set_role():
+ // rv = stable_state_change(resource,
+ // change_role(resource, role, flags, with_force, &err_str));
+ // where change_role() calls this function.
+ //
if (wait_event_timeout(resource->state_wait,
cluster_wide_reply_ready(resource),
twopc_timeout(resource)))
More information about the drbd-dev
mailing list