[Drbd-dev] stonith-related regression introduced around kernel 3.13, with 3.15.3 still affected

Lars Ellenberg lars.ellenberg at linbit.com
Thu Jul 3 15:54:40 CEST 2014


On Thu, Jul 03, 2014 at 03:44:17PM +0200, Lars Ellenberg wrote:
> On Thu, Jul 03, 2014 at 03:07:18PM +0200, Mariusz Mazur wrote:
> > My setup is two nodes with drbd double master, corosync, pacemaker, clvmd, xen 
> > 4.4.0.
> > 
> > Here's what happens when I reboot -f one of the nodes and the surviving node 
> > is kernel 3.12.23 or earlier (oldest tested was 3.6.something):
> 
> Yep, someone changed the in-kernel kthread API
> to use wait_for_completion_killable()
> where it used to be wait_for_completion().
> 
> Which has some bad interactions with how DRBD handles things.
> This is being fixed.

Would you please try this patch:

diff --git a/drbd/drbd_nl.c b/drbd/drbd_nl.c
index 9e6adaa..88f480c 100644
--- a/drbd/drbd_nl.c
+++ b/drbd/drbd_nl.c
@@ -586,6 +586,7 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
 	struct task_struct *opa;
 
 	kref_get(&connection->kref);
+	flush_pending_signals();
 	opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
 	if (IS_ERR(opa)) {
 		drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
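
For anyone who wants to see the effect outside the kernel, here is a rough
userspace analogy, not DRBD or kernel code. It assumes the failure mode is
that a signal still pending on the calling thread makes the killable wait
inside kthread creation give up early, and that the caller then sees -EINTR
(my assumption, not verified against every affected kernel). The names
killable_wait_sim(), flush_pending_sim() and start_worker_sim() are
hypothetical stand-ins, and SIGUSR1 merely plays the role of the stale signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* handy for checking the two return values */
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Stand-in for a killable wait: refuses to wait if SIGUSR1 is already pending. */
static int killable_wait_sim(void)
{
	sigset_t pending;

	sigemptyset(&pending);
	sigpending(&pending);
	return sigismember(&pending, SIGUSR1) == 1 ? -EINTR : 0;
}

/* Stand-in for flush_pending_signals(): consume any pending SIGUSR1. */
static void flush_pending_sim(void)
{
	sigset_t set;
	struct timespec zero = { 0, 0 };

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	/* A zero timeout polls: returns the signal number, or -1 if none is pending. */
	while (sigtimedwait(&set, NULL, &zero) > 0)
		;
}

/* Returns 0 if the simulated thread start succeeds, -EINTR otherwise. */
static int start_worker_sim(int flush_first)
{
	sigset_t block, old;
	int ret;

	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, &old);

	raise(SIGUSR1);			/* leave a stale signal pending on the caller */
	if (flush_first)
		flush_pending_sim();	/* the step the patch adds */
	ret = killable_wait_sim();

	flush_pending_sim();		/* clean up before restoring the mask */
	sigprocmask(SIG_SETMASK, &old, NULL);
	return ret;
}
```

Calling start_worker_sim(0) models the unpatched path and start_worker_sim(1)
the patched one. Only the second gets past the simulated wait.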


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

