[PATCH] drbd: fix a null-pointer dereference when the request event in drbd_request_endio() is READ_COMPLETED_WITH_ERROR
Christoph Böhmwalder
christoph.boehmwalder at linbit.com
Thu Feb 19 15:53:53 CET 2026
On 1/4/26 17:53, Tuo Li wrote:
> In drbd_request_endio(), the request event what can be set to
> READ_COMPLETED_WITH_ERROR. In this case, __req_mod() is invoked with a NULL
> peer_device:
>
> __req_mod(req, what, NULL, &m);
>
> When handling READ_COMPLETED_WITH_ERROR, __req_mod() unconditionally calls
> drbd_set_out_of_sync():
>
> case READ_COMPLETED_WITH_ERROR:
> drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
>
> The drbd_set_out_of_sync() macro expands to __drbd_change_sync():
>
> #define drbd_set_out_of_sync(peer_device, sector, size) \
> __drbd_change_sync(peer_device, sector, size, SET_OUT_OF_SYNC)
>
> However, __drbd_change_sync() assumes a valid peer_device and immediately
> dereferences it:
>
> struct drbd_device *device = peer_device->device;
>
> If peer_device is NULL, this results in a NULL-pointer dereference.
>
> Fix this by adding a NULL check in __req_mod() before calling
> drbd_set_out_of_sync().

Thank you for the report and patch.
The bug analysis is correct, but the fix is not.

>
> Signed-off-by: Tuo Li <islituo at gmail.com>
> ---
> drivers/block/drbd/drbd_req.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index d15826f6ee81..aa3da2733f14 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -621,7 +621,8 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
>  		break;
>
>  	case READ_COMPLETED_WITH_ERROR:
> -		drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
> +		if (peer_device)
> +			drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
>  		drbd_report_io_error(device, req);
>  		__drbd_chk_io_error(device, DRBD_READ_ERROR);
>  		fallthrough;

In this code path, peer_device is *always* NULL -- the only caller that
sets READ_COMPLETED_WITH_ERROR is drbd_request_endio(), which always
passes NULL for peer_device. So this NULL check effectively turns the
drbd_set_out_of_sync() call into dead code.

Silently skipping the call here means we lose out-of-sync tracking
for local read errors, which is a data consistency problem.

The proper fix is to obtain the peer_device via
first_peer_device(device), as a similar path in drbd_req_destroy()
already does (drbd_req.c:125).

	case READ_COMPLETED_WITH_ERROR:
		drbd_set_out_of_sync(first_peer_device(device),
				     req->i.sector, req->i.size);

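To illustrate the difference outside the kernel, here is a minimal
user-space sketch. All mock_* names and the two read_error_* helpers are
hypothetical stand-ins invented for this example; they are not the DRBD
definitions, and the sketch assumes the device has at least one peer
device configured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's DRBD structures. */
struct mock_device;

struct mock_peer_device {
	struct mock_device *device;
};

struct mock_device {
	struct mock_peer_device *first_peer;	/* stand-in for the peer_devices list head */
	unsigned long out_of_sync_calls;	/* how often a range was marked out of sync */
};

/*
 * Mirrors the relevant property of __drbd_change_sync(): peer_device is
 * dereferenced immediately, so it must not be NULL.
 */
static void mock_set_out_of_sync(struct mock_peer_device *peer_device)
{
	assert(peer_device != NULL);
	struct mock_device *device = peer_device->device;

	device->out_of_sync_calls++;
}

/* Stand-in for first_peer_device(device). */
static struct mock_peer_device *mock_first_peer_device(struct mock_device *device)
{
	return device->first_peer;
}

/*
 * Shape of the submitted patch: the only caller passes NULL, so the
 * guarded call never runs and nothing is marked out of sync.
 */
static void read_error_null_checked(struct mock_device *device,
				    struct mock_peer_device *peer_device)
{
	(void)device;
	if (peer_device)
		mock_set_out_of_sync(peer_device);
}

/* Shape of the suggested fix: derive the peer device from the device. */
static void read_error_with_lookup(struct mock_device *device)
{
	mock_set_out_of_sync(mock_first_peer_device(device));
}
```

With a device that has one peer, calling read_error_null_checked(&dev, NULL)
leaves out_of_sync_calls at 0, while read_error_with_lookup(&dev) bumps it
to 1. That is the tracking the NULL check would silently drop.
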
Regards,
Christoph
--
Christoph Böhmwalder
LINBIT | Keeping the Digital World Running
DRBD HA — Disaster Recovery — Software defined Storage