[DRBD-user] [CASE-20] What is the tail_recursion operation during drbd_req_destroy?

Lars Ellenberg lars.ellenberg at linbit.com
Mon Feb 15 14:54:26 CET 2016

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Sun, Feb 14, 2016 at 07:39:36PM +0900, 김재헌 wrote:
> Dear Philipp,
> 
> Please check my previous question of CASE-14("[DRBD-user] [CASE-14] primary
> node hang by VM-net-disconnect during big file copy").
> According to this case, Linux drbd deadlock may occur.
> On the other hand, Windows side there is no deadlock but sometimes the
> transfer_log  list is broken in _tl_restart function.
> 
> So, We are trying to modify the source code as follows:
> 
> 1. Modifications
> 
> 1) in drbd_send_and_submit()
> 
> if (likely(req->i.size != 0)) {
> if (rw == WRITE) {
> struct drbd_request *req2;
> resource->current_tle_writes++;
> #if 0 // WIN32 ### ignore tail_recursion ###
> list_for_each_entry_reverse(req2, &resource->transfer_log, tl_requests) {
> if (req2->rq_state[0] & RQ_WRITE) {
> /* Make the new write request depend on
> * the previous one. */
> kref_get(&req->kref);
> break;
> }
> }
> #endif
> }
> 
> list_add_tail(&req->tl_requests, &resource->transfer_log);
> }
> 
> 
> 2) in drbd_req_destroy()
> 
> if (s & RQ_WRITE && req_size) {
> list_for_each_entry(req, &device->resource->transfer_log, tl_requests) {
> if (req->rq_state[0] & RQ_WRITE) {
> /*
> * Do the equivalent of:
> *   kref_put(&req->kref, drbd_req_destroy)
> * without recursing into the destructor.
> */
> #if 0  // WIN32 ### ignore tail_recursion ###
> if (atomic_dec_and_test(&req->kref.refcount))
> goto tail_recursion;
> #endif
> break;
> }
> }
> }
> 
> 
> 2. Questions
> 
> 1) This part of "tail_recursion" is a new design on verson 9.
>      Is this essential operation?
>      I mean, what do you think about my ignoring tail_recursion part for
> temporary workaround?

I cannot explain all the implications within two lines of text, but we
want the "destruction" of drbd_requests to happen in the order they have
been put on the transfer log.
That is important in some multi-peer scenarios.

If there is no explicit dependency (here implemented via kref), they
could be destroyed out-of-order, and that could lead to bad decisions
elsewhere, or potentially not enough being resynced in some scenarios.

> 2)   And what is the reason for the marking of "kref_get(&req->kref);"  in
> drbd_send_and_submit and processing with recursion in  drbd_req_destroy
> later?

see above.

> 3) On Windows side, we ignore this part(see source code of "#if 0 // WIN32
> ### ignore tail_recursion ###").
>      Anyway, after ignore, Windows drbd engine works well, till now.  Is
> there any problem?

see above.

> On Linux side, you cannot see this list-crash-case because the CASE-14 test
> may be done by deadlock first.
> Please check the CASE-14 deadlock case first and then check this CASE-20.

Cheers,

	Lars




More information about the drbd-user mailing list