Hi,

we have the impression this unveils a bug in DRBD. It can be triggered
when all of the following hold: a resource has multiple volumes AND
ko-count >= 1 AND a write request triggers the timeout
(ko-count * timeout). In that case a wrong state transition confuses
DRBD's state handling.

The fix:

diff --git a/drbd/drbd_req.c b/drbd/drbd_req.c
index 7cd9e14..b7df80e 100644
--- a/drbd/drbd_req.c
+++ b/drbd/drbd_req.c
@@ -1733,7 +1733,7 @@ void request_timer_fn(unsigned long data)
 		 time_after(now, req_peer->pre_send_jif + ent) &&
 		!time_in_range(now, connection->last_reconnect_jif, connection->last_reconnect_jif + e
 		drbd_warn(device, "Remote failed to finish a request within ko-count * timeout\n");
-		_drbd_set_state(_NS(device, conn, C_TIMEOUT), CS_VERBOSE | CS_HARD, NULL);
+		_conn_request_state(connection, NS(conn, C_TIMEOUT), CS_VERBOSE | CS_HARD);
 	}
 	if (dt && oldest_submit_jif != now &&
 		 time_after(now, oldest_submit_jif + dt) &&

or here:
http://git.drbd.org/gitweb.cgi?p=drbd-8.4.git;a=commit;h=79a03fc61fd04e91dd2a4562f28c57d256a075e4

best regards,
Phil