[DRBD-user] strange drbd bug

Philipp Reisner philipp.reisner at linbit.com
Mon Oct 20 14:54:20 CEST 2014


Hi,

we think this exposes a bug in DRBD.
It might be triggered if all of the following hold:

 the resource has multiple volumes
  AND
 ko-count >= 1
  AND
 a write request hits the timeout (ko-count * timeout)

In that case a wrong state transition confuses DRBD's state
handling.
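
To make the third condition concrete, here is a minimal user-space sketch of
the "ko-count * timeout" check. It is not DRBD code: the struct and function
names are hypothetical, and it assumes that "timeout" is configured in tenths
of a second and that a ko-count of 0 disables the check. The values 60 and 7
in the usage note are example values only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two net options; names are illustrative. */
struct net_opts {
    unsigned int timeout_ds; /* "timeout", assumed to be in tenths of a second */
    unsigned int ko_count;   /* "ko-count"; 0 is assumed to disable the check */
};

/* Effective deadline in milliseconds: ko-count * timeout. */
static unsigned long effective_timeout_ms(const struct net_opts *o)
{
    return (unsigned long)o->ko_count * o->timeout_ds * 100UL;
}

/* True if a request sent at sent_ms is still unfinished past the deadline. */
static bool request_timed_out(const struct net_opts *o,
                              unsigned long sent_ms, unsigned long now_ms)
{
    unsigned long ent = effective_timeout_ms(o);

    assert(now_ms >= sent_ms);
    if (ent == 0)
        return false; /* check disabled in this model */
    return now_ms - sent_ms > ent;
}
```

With timeout_ds = 60 and ko_count = 7, effective_timeout_ms() returns 42000,
so in this model a write that is still outstanding after more than 42 seconds
would reach the code path touched by the fix.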

The fix:
diff --git a/drbd/drbd_req.c b/drbd/drbd_req.c
index 7cd9e14..b7df80e 100644
--- a/drbd/drbd_req.c
+++ b/drbd/drbd_req.c
@@ -1733,7 +1733,7 @@ void request_timer_fn(unsigned long data)
                 time_after(now, req_peer->pre_send_jif + ent) &&
                !time_in_range(now, connection->last_reconnect_jif, connection->last_reconnect_jif + e
                drbd_warn(device, "Remote failed to finish a request within ko-count * timeout\n");
-               _drbd_set_state(_NS(device, conn, C_TIMEOUT), CS_VERBOSE | CS_HARD, NULL);
+               _conn_request_state(connection, NS(conn, C_TIMEOUT), CS_VERBOSE | CS_HARD);
        }
        if (dt && oldest_submit_jif != now &&
                 time_after(now, oldest_submit_jif + dt) &&

The same fix is available here:
http://git.drbd.org/gitweb.cgi?p=drbd-8.4.git;a=commit;h=79a03fc61fd04e91dd2a4562f28c57d256a075e4
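
One way to read the diff: the old call changed the connection state through a
single device (volume), while the new call requests it on the connection as a
whole. The toy model below illustrates why that distinction can matter when a
resource has several volumes. It is only an interpretation under stated
assumptions; the types and functions are hypothetical and do not mirror
DRBD's real data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VOLUMES 4

enum conn_state { CONNECTED, TIMEOUT };

/* Hypothetical resource: each volume keeps its own view of the shared link. */
struct resource {
    int nr_volumes;
    enum conn_state vol_conn[MAX_VOLUMES];
};

/* Device-level change: only one volume's view is updated. */
static void set_state_one_volume(struct resource *r, int vol, enum conn_state s)
{
    assert(vol >= 0 && vol < r->nr_volumes);
    r->vol_conn[vol] = s;
}

/* Connection-level change: every volume is updated together. */
static void set_state_all_volumes(struct resource *r, enum conn_state s)
{
    for (int i = 0; i < r->nr_volumes; i++)
        r->vol_conn[i] = s;
}

/* True if all volumes agree on the state of the shared connection. */
static bool volumes_agree(const struct resource *r)
{
    for (int i = 1; i < r->nr_volumes; i++)
        if (r->vol_conn[i] != r->vol_conn[0])
            return false;
    return true;
}
```

In this model, calling set_state_one_volume() on a two-volume resource leaves
the volumes disagreeing about the connection, while set_state_all_volumes()
keeps them consistent.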

best regards,
 Phil
