Distributed Replicated Block Device (DRBD) development
* Re: [PATCH] drbd: fix a null-pointer dereference when the request event in drbd_request_endio() is READ_COMPLETED_WITH_ERROR
       [not found] <20260104165355.151864-1-islituo@gmail.com>
@ 2026-02-19 14:53 ` Christoph Böhmwalder
From: Christoph Böhmwalder @ 2026-02-19 14:53 UTC (permalink / raw)
  To: Tuo Li, philipp.reisner, lars.ellenberg, axboe
  Cc: linux-block, linux-kernel, drbd-dev

On 1/4/26 17:53, Tuo Li wrote:
> In drbd_request_endio(), the request event what can be set to
> READ_COMPLETED_WITH_ERROR. In this case, __req_mod() is invoked with a NULL
> peer_device:
> 
>    __req_mod(req, what, NULL, &m);
> 
> When handling READ_COMPLETED_WITH_ERROR, __req_mod() unconditionally calls
> drbd_set_out_of_sync():
> 
>    case READ_COMPLETED_WITH_ERROR:
>      drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
> 
> The drbd_set_out_of_sync() macro expands to __drbd_change_sync():
> 
>    #define drbd_set_out_of_sync(peer_device, sector, size) \
> 	__drbd_change_sync(peer_device, sector, size, SET_OUT_OF_SYNC)
> 
> However, __drbd_change_sync() assumes a valid peer_device and immediately
> dereferences it:
> 
>    struct drbd_device *device = peer_device->device;
> 
> If peer_device is NULL, this results in a NULL-pointer dereference.
> 
> Fix this by adding a NULL check in __req_mod() before calling
> drbd_set_out_of_sync().

Thank you for the report and patch.
The bug analysis is correct, but the fix is not.

> 
> Signed-off-by: Tuo Li <islituo@gmail.com>
> ---
>   drivers/block/drbd/drbd_req.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index d15826f6ee81..aa3da2733f14 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -621,7 +621,8 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
>   		break;
>   
>   	case READ_COMPLETED_WITH_ERROR:
> -		drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
> +		if (peer_device)
> +			drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
>   		drbd_report_io_error(device, req);
>   		__drbd_chk_io_error(device, DRBD_READ_ERROR);
>   		fallthrough;

In this code path, peer_device is *always* NULL -- the only caller that
sets READ_COMPLETED_WITH_ERROR is drbd_request_endio(), which always
passes NULL for peer_device. So this NULL check effectively turns the
drbd_set_out_of_sync() call into dead code.

Silently skipping the call here means we lose out-of-sync tracking
for local read errors, which is a data consistency problem.

The proper fix is to obtain the peer_device via
first_peer_device(device), as a similar path in drbd_req_destroy()
(drbd_req.c:125) already does:

case READ_COMPLETED_WITH_ERROR:
	drbd_set_out_of_sync(first_peer_device(device),
			     req->i.sector, req->i.size);
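
To illustrate the difference between the two variants, here is a rough
userspace sketch, not kernel code. All model_* names, both structs, and
marks_after_read_error() are hypothetical stand-ins invented for this
example; the bitmap is reduced to a counter and the peer list to a single
pointer. It assumes a peer device is configured, and does not address
whether the real first_peer_device() can return NULL on this path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the DRBD structures. */
struct model_device;

struct model_peer_device {
	struct model_device *device;
};

struct model_device {
	struct model_peer_device *first_peer;	/* stand-in for the peer list */
	unsigned long out_of_sync_marks;	/* stand-in for the bitmap */
};

/* Like __drbd_change_sync(): dereferences peer_device unconditionally. */
static void model_set_out_of_sync(struct model_peer_device *peer_device)
{
	struct model_device *device = peer_device->device;

	device->out_of_sync_marks++;
}

/* Stand-in for first_peer_device(device). */
static struct model_peer_device *
model_first_peer_device(struct model_device *device)
{
	return device->first_peer;
}

/* Variant from the submitted patch: guard on the caller-supplied pointer. */
static void read_error_guarded(struct model_device *device,
			       struct model_peer_device *peer_device)
{
	(void)device;
	if (peer_device)
		model_set_out_of_sync(peer_device);
}

/* Variant suggested in this review: look the peer device up via the device. */
static void read_error_lookup(struct model_device *device,
			      struct model_peer_device *peer_device)
{
	assert(peer_device == NULL);	/* the endio path always passes NULL */
	model_set_out_of_sync(model_first_peer_device(device));
}

/*
 * Run one handler the way the endio path would (peer_device == NULL)
 * and return how many out-of-sync marks were recorded.
 */
static unsigned long marks_after_read_error(bool use_lookup)
{
	struct model_device device = { 0 };
	struct model_peer_device peer = { .device = &device };

	device.first_peer = &peer;
	if (use_lookup)
		read_error_lookup(&device, NULL);
	else
		read_error_guarded(&device, NULL);
	return device.out_of_sync_marks;
}
```

In this model, marks_after_read_error(false) returns 0 because the guarded
call is never reached, while marks_after_read_error(true) returns 1.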

Regards,
Christoph

--
Christoph Böhmwalder
LINBIT | Keeping the Digital World Running
DRBD HA —  Disaster Recovery — Software defined Storage
