public inbox for ceph-devel@vger.kernel.org
From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: Ilya Dryomov <idryomov@gmail.com>, ceph-devel@vger.kernel.org
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Subject: Re: [PATCH v2] rbd: check for EOD after exclusive lock is ensured to be held
Date: Thu, 29 Jan 2026 16:52:46 +0800	[thread overview]
Message-ID: <fad9d632-5d27-4ff1-b787-48134f73853a@linux.dev> (raw)
In-Reply-To: <20260128124623.970785-1-idryomov@gmail.com>


On 1/28/2026 8:46 PM, Ilya Dryomov wrote:
> Similar to commit 870611e4877e ("rbd: get snapshot context after
> exclusive lock is ensured to be held"), move the "beyond EOD" check
> into the image request state machine so that it's performed after
> exclusive lock is ensured to be held.  This avoids various race
> conditions which can arise when the image is shrunk under I/O (in
> practice, mostly readahead).  In one such scenario
>
>      rbd_assert(objno < rbd_dev->object_map_size);
>
> can be triggered if a close-to-EOD read gets queued right before the
> shrink is initiated and the EOD check is performed against an outdated
> mapping_size.  After the resize is done on the server side and exclusive
> lock is (re)acquired bringing along the new (now shrunk) object map, the
> read starts going through the state machine and rbd_obj_may_exist() gets
> invoked on an object that is out of bounds of rbd_dev->object_map array.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Reviewed-by: Dongsheng Yang <dongsheng.yang@linux.dev>


Thanks

> ---
> v1 -> v2:
> - refactor to avoid taking header_rwsem for IMG_REQ_CHILD requests
>    unnecessarily
>
>   drivers/block/rbd.c | 33 +++++++++++++++++++++------------
>   1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index af0e21149dbc..8f441eb8b192 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -3495,11 +3495,29 @@ static void rbd_img_object_requests(struct rbd_img_request *img_req)
>   	rbd_assert(!need_exclusive_lock(img_req) ||
>   		   __rbd_is_lock_owner(rbd_dev));
>   
> -	if (rbd_img_is_write(img_req)) {
> -		rbd_assert(!img_req->snapc);
> +	if (test_bit(IMG_REQ_CHILD, &img_req->flags)) {
> +		rbd_assert(!rbd_img_is_write(img_req));
> +	} else {
> +		struct request *rq = blk_mq_rq_from_pdu(img_req);
> +		u64 off = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
> +		u64 len = blk_rq_bytes(rq);
> +		u64 mapping_size;
> +
>   		down_read(&rbd_dev->header_rwsem);
> -		img_req->snapc = ceph_get_snap_context(rbd_dev->header.snapc);
> +		mapping_size = rbd_dev->mapping.size;
> +		if (rbd_img_is_write(img_req)) {
> +			rbd_assert(!img_req->snapc);
> +			img_req->snapc =
> +			    ceph_get_snap_context(rbd_dev->header.snapc);
> +		}
>   		up_read(&rbd_dev->header_rwsem);
> +
> +		if (unlikely(off + len > mapping_size)) {
> +			rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)",
> +				 off, len, mapping_size);
> +			img_req->pending.result = -EIO;
> +			return;
> +		}
>   	}
>   
>   	for_each_obj_request(img_req, obj_req) {
> @@ -4725,7 +4743,6 @@ static void rbd_queue_workfn(struct work_struct *work)
>   	struct request *rq = blk_mq_rq_from_pdu(img_request);
>   	u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
>   	u64 length = blk_rq_bytes(rq);
> -	u64 mapping_size;
>   	int result;
>   
>   	/* Ignore/skip any zero-length requests */
> @@ -4738,17 +4755,9 @@ static void rbd_queue_workfn(struct work_struct *work)
>   	blk_mq_start_request(rq);
>   
>   	down_read(&rbd_dev->header_rwsem);
> -	mapping_size = rbd_dev->mapping.size;
>   	rbd_img_capture_header(img_request);
>   	up_read(&rbd_dev->header_rwsem);
>   
> -	if (offset + length > mapping_size) {
> -		rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)", offset,
> -			 length, mapping_size);
> -		result = -EIO;
> -		goto err_img_request;
> -	}
> -
>   	dout("%s rbd_dev %p img_req %p %s %llu~%llu\n", __func__, rbd_dev,
>   	     img_request, obj_op_name(op_type), offset, length);
>   


Thread overview:
2026-01-28 12:46 [PATCH v2] rbd: check for EOD after exclusive lock is ensured to be held Ilya Dryomov
2026-01-29  8:52 ` Dongsheng Yang [this message]
