[PATCH] drbd: fix a null-pointer dereference when the request event in drbd_request_endio() is READ_COMPLETED_WITH_ERROR

Christoph Böhmwalder christoph.boehmwalder at linbit.com
Thu Feb 19 15:53:53 CET 2026


On 1/4/26 17:53, Tuo Li wrote:
> In drbd_request_endio(), the request event what can be set to
> READ_COMPLETED_WITH_ERROR. In this case, __req_mod() is invoked with a NULL
> peer_device:
> 
>    __req_mod(req, what, NULL, &m);
> 
> When handling READ_COMPLETED_WITH_ERROR, __req_mod() unconditionally calls
> drbd_set_out_of_sync():
> 
>    case READ_COMPLETED_WITH_ERROR:
>      drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
> 
> The drbd_set_out_of_sync() macro expands to __drbd_change_sync():
> 
>    #define drbd_set_out_of_sync(peer_device, sector, size) \
> 	__drbd_change_sync(peer_device, sector, size, SET_OUT_OF_SYNC)
> 
> However, __drbd_change_sync() assumes a valid peer_device and immediately
> dereferences it:
> 
>    struct drbd_device *device = peer_device->device;
> 
> If peer_device is NULL, this results in a NULL-pointer dereference.
> 
> Fix this by adding a NULL check in __req_mod() before calling
> drbd_set_out_of_sync().

Thank you for the report and patch.
The bug analysis is correct, but the fix is not.

> 
> Signed-off-by: Tuo Li <islituo at gmail.com>
> ---
>   drivers/block/drbd/drbd_req.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index d15826f6ee81..aa3da2733f14 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -621,7 +621,8 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
>   		break;
>   
>   	case READ_COMPLETED_WITH_ERROR:
> -		drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
> +		if (peer_device)
> +			drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
>   		drbd_report_io_error(device, req);
>   		__drbd_chk_io_error(device, DRBD_READ_ERROR);
>   		fallthrough;

In this code path, peer_device is *always* NULL -- the only caller that
sets READ_COMPLETED_WITH_ERROR is drbd_request_endio(), which always
passes NULL for peer_device. So this NULL check effectively turns the
drbd_set_out_of_sync() call into dead code.

Silently skipping the call here means we lose out-of-sync tracking
for local read errors, which is a data consistency problem.

The proper fix is to obtain the peer_device via
first_peer_device(device), as is already done in a similar path in
drbd_req_destroy() (drbd_req.c:125).

case READ_COMPLETED_WITH_ERROR:
	drbd_set_out_of_sync(first_peer_device(device),
			     req->i.sector, req->i.size);
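
To make the difference between the two variants concrete, here is a
minimal userspace sketch. It is only a model: the model_* structs and
helpers are hypothetical stand-ins, not the real DRBD definitions, and it
assumes the device has a peer configured. It shows that the NULL check
never records anything when the caller passes NULL, while looking up the
peer from the device does.

```c
/* Userspace model only: the model_* names are hypothetical stand-ins,
 * not the real DRBD structures or functions. */
#include <stddef.h>

struct model_device;

struct model_peer_device {
	struct model_device *device;
};

struct model_device {
	struct model_peer_device *peer;		/* stand-in for the peer list */
	unsigned long out_of_sync_sectors;	/* stand-in for the bitmap */
};

/* Models first_peer_device(device): obtain the peer from the device. */
static struct model_peer_device *
model_first_peer_device(struct model_device *device)
{
	return device->peer;
}

/* Models __drbd_change_sync(): dereferences peer_device immediately. */
static void model_set_out_of_sync(struct model_peer_device *peer_device,
				  unsigned long nr_sectors)
{
	struct model_device *device = peer_device->device;

	device->out_of_sync_sectors += nr_sectors;
}

/* Variant from the submitted patch; the endio caller always passes NULL. */
static void read_error_with_null_check(struct model_device *device,
				       struct model_peer_device *peer_device,
				       unsigned long nr_sectors)
{
	(void)device;
	if (peer_device)
		model_set_out_of_sync(peer_device, nr_sectors);
}

/* Variant suggested above: derive the peer from the device instead. */
static void read_error_with_lookup(struct model_device *device,
				   unsigned long nr_sectors)
{
	model_set_out_of_sync(model_first_peer_device(device), nr_sectors);
}
```

Calling read_error_with_null_check(&dev, NULL, 8) leaves
out_of_sync_sectors untouched, whereas read_error_with_lookup(&dev, 8)
adds 8 to it. Whether the real code needs additional handling for a device
without any peer is outside what this model covers.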

Regards,
Christoph

--
Christoph Böhmwalder
LINBIT | Keeping the Digital World Running
DRBD HA —  Disaster Recovery — Software defined Storage

