[Drbd-dev] [PATCH] drbd: fix a race of drbd_free_peer_req
joel.colledge at linbit.com
Wed Apr 20 12:22:32 CEST 2022
I think there is a potential race here, but I am not convinced that
this is a general solution. The peer request could still be freed by
got_peer_ack() between checking "list_empty(&peer_req->recv_order)"
and freeing it in drbd_finish_peer_reqs(). Also, this solution keeps
the page chain for peer requests for an unnecessarily long time, which
is not ideal in memory-constrained situations.
The underlying race, as far as I understand it, is that got_peer_ack()
can be called while still processing the request in
drbd_finish_peer_reqs(). This is only relevant for peer writes and not
resync, so only the e_end_block() path is of interest. got_peer_ack()
will only be called after we have sent the corresponding barrier ack
for the peer request.
On the basis of this reasoning, I think a simple solution is to swap
drbd_may_finish_epoch() and drbd_free_page_chain() in e_end_block().
Please try this and send it as a patch if it solves your problem.
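To illustrate the ordering argument, here is a toy model in C. It is not
DRBD code: every name prefixed with toy_ is a hypothetical stand-in, and
the model assumes that the epoch-finishing step is what leads to the
barrier ack and that it currently runs before the page chain is released,
as the reasoning above implies. The worst case is modelled by letting the
peer ack arrive immediately after the barrier ack.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a peer request; NOT the real struct drbd_peer_request. */
struct toy_peer_req {
    bool pages_attached;  /* page chain is still attached */
    bool freed;           /* request was freed by the peer-ack path */
    bool use_after_free;  /* request was touched after being freed */
};

/* Stand-in for drbd_free_page_chain(): it has to touch the request. */
static void toy_free_page_chain(struct toy_peer_req *r)
{
    if (r->freed) {
        /* In real code this would be a use-after-free. */
        r->use_after_free = true;
        return;
    }
    r->pages_attached = false;
}

/*
 * Stand-in for drbd_may_finish_epoch(). Assumption of this model: the
 * barrier ack goes out here, so from this point on the peer ack may
 * arrive. We model the worst case, where the got_peer_ack() analogue
 * runs immediately and frees the request.
 */
static void toy_may_finish_epoch(struct toy_peer_req *r)
{
    assert(!r->freed);  /* each request is acked at most once */
    r->freed = true;
}

/*
 * Toy e_end_block() with both orderings.
 * Returns true if the request was touched after it was freed.
 */
static bool toy_end_block(bool free_pages_first)
{
    struct toy_peer_req r = { .pages_attached = true };

    if (free_pages_first) {
        /* Proposed order: release pages before the ack can go out. */
        toy_free_page_chain(&r);
        toy_may_finish_epoch(&r);
    } else {
        /* Assumed current order: the ack can go out first. */
        toy_may_finish_epoch(&r);
        toy_free_page_chain(&r);
    }
    return r.use_after_free;
}
```

In this model, toy_end_block(false) reports that the request was touched
after being freed, while toy_end_block(true) does not. The flags live in
the struct itself, so the model never actually frees memory and is safe
to run.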
By the way, this patch doesn't compile due to a mismatched brace.