From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: Ilya Dryomov <idryomov@gmail.com>, ceph-devel@vger.kernel.org
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Subject: Re: [PATCH v2] rbd: check for EOD after exclusive lock is ensured to be held
Date: Thu, 29 Jan 2026 16:52:46 +0800
Message-ID: <fad9d632-5d27-4ff1-b787-48134f73853a@linux.dev>
In-Reply-To: <20260128124623.970785-1-idryomov@gmail.com>
On 1/28/2026 8:46 PM, Ilya Dryomov wrote:
> Similar to commit 870611e4877e ("rbd: get snapshot context after
> exclusive lock is ensured to be held"), move the "beyond EOD" check
> into the image request state machine so that it's performed after
> exclusive lock is ensured to be held. This avoids various race
> conditions which can arise when the image is shrunk under I/O (in
> practice, mostly readahead). In one such scenario
>
> rbd_assert(objno < rbd_dev->object_map_size);
>
> can be triggered if a close-to-EOD read gets queued right before the
> shrink is initiated and the EOD check is performed against an outdated
> mapping_size. After the resize is done on the server side and exclusive
> lock is (re)acquired bringing along the new (now shrunk) object map, the
> read starts going through the state machine and rbd_obj_may_exist() gets
> invoked on an object that is out of bounds of rbd_dev->object_map array.
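
For anyone following along, the arithmetic behind that assert can be
sketched in userspace roughly as below. This is only an illustration,
not the kernel code: the sketch_* helpers, the 4 MiB object size and
the example image sizes are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same comparison as the "beyond EOD" check: does [off, off + len) end past the image? */
static bool sketch_beyond_eod(uint64_t off, uint64_t len, uint64_t mapping_size)
{
	return off + len > mapping_size;
}

/* Number of objects needed to back an image of image_size bytes (rounded up). */
static uint64_t sketch_object_count(uint64_t image_size, unsigned int obj_order)
{
	uint64_t obj_size = (uint64_t)1 << obj_order;

	return (image_size + obj_size - 1) >> obj_order;
}

/* Index of the object that contains byte offset off. */
static uint64_t sketch_objno(uint64_t off, unsigned int obj_order)
{
	return off >> obj_order;
}

/*
 * True when a request passes the EOD check against the old (stale) size
 * but its last byte lands in an object that the shrunk object map no
 * longer covers, i.e. the situation in which the assert would fire.
 */
static bool sketch_stale_check_goes_out_of_bounds(uint64_t old_size,
						  uint64_t new_size,
						  uint64_t off, uint64_t len,
						  unsigned int obj_order)
{
	assert(len > 0);

	return !sketch_beyond_eod(off, len, old_size) &&
	       sketch_objno(off + len - 1, obj_order) >=
	       sketch_object_count(new_size, obj_order);
}
```

With a 64 MiB image shrunk to 32 MiB and a 4 MiB read at offset 60 MiB,
the stale check lets the read through even though its object index is
past the end of the new, smaller object map. The same read checked
against the new size is rejected up front.
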
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@linux.dev>
Thanks
> ---
> v1 -> v2:
> - refactor to avoid taking header_rwsem for IMG_REQ_CHILD requests
> unnecessarily
>
> drivers/block/rbd.c | 33 +++++++++++++++++++++------------
> 1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index af0e21149dbc..8f441eb8b192 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -3495,11 +3495,29 @@ static void rbd_img_object_requests(struct rbd_img_request *img_req)
>  	rbd_assert(!need_exclusive_lock(img_req) ||
>  		   __rbd_is_lock_owner(rbd_dev));
>
> -	if (rbd_img_is_write(img_req)) {
> -		rbd_assert(!img_req->snapc);
> +	if (test_bit(IMG_REQ_CHILD, &img_req->flags)) {
> +		rbd_assert(!rbd_img_is_write(img_req));
> +	} else {
> +		struct request *rq = blk_mq_rq_from_pdu(img_req);
> +		u64 off = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
> +		u64 len = blk_rq_bytes(rq);
> +		u64 mapping_size;
> +
>  		down_read(&rbd_dev->header_rwsem);
> -		img_req->snapc = ceph_get_snap_context(rbd_dev->header.snapc);
> +		mapping_size = rbd_dev->mapping.size;
> +		if (rbd_img_is_write(img_req)) {
> +			rbd_assert(!img_req->snapc);
> +			img_req->snapc =
> +			    ceph_get_snap_context(rbd_dev->header.snapc);
> +		}
>  		up_read(&rbd_dev->header_rwsem);
> +
> +		if (unlikely(off + len > mapping_size)) {
> +			rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)",
> +				 off, len, mapping_size);
> +			img_req->pending.result = -EIO;
> +			return;
> +		}
>  	}
>
>  	for_each_obj_request(img_req, obj_req) {
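
To spell out the control flow this hunk ends up with, here is a rough
userspace model. It is a sketch under stated assumptions, not the
driver code: struct sketch_dev, struct sketch_img_req and
sketch_start_object_requests() are hypothetical stand-ins, the
header_rwsem read lock around the size sample is elided, and the
caller is assumed to already hold the exclusive lock where one is
needed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device; only the mapped size matters here. */
struct sketch_dev {
	uint64_t mapping_size;	/* current image size in bytes */
};

/* Hypothetical stand-in for an image request. */
struct sketch_img_req {
	bool is_child;		/* models the IMG_REQ_CHILD flag */
	uint64_t off;		/* byte offset of the request */
	uint64_t len;		/* byte length of the request */
	int result;		/* 0 or a negative errno */
};

/*
 * Models the point where object requests are about to be started.
 * Child requests skip the size check entirely; for all other requests
 * the size is sampled at this point (rather than at queue time) and the
 * request is failed with -EIO if it extends past the end of the image.
 * Returns true if the request may proceed.
 */
static bool sketch_start_object_requests(const struct sketch_dev *dev,
					 struct sketch_img_req *req)
{
	if (!req->is_child) {
		/* In the driver this read happens under header_rwsem. */
		uint64_t mapping_size = dev->mapping_size;

		if (req->off + req->len > mapping_size) {
			req->result = -EIO;
			return false;
		}
	}

	req->result = 0;
	return true;
}
```

The point of the model is only the ordering: the size that the request
is checked against is whatever the device reports when the request
reaches this stage, so a shrink that completed in between is seen.
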
> @@ -4725,7 +4743,6 @@ static void rbd_queue_workfn(struct work_struct *work)
>  	struct request *rq = blk_mq_rq_from_pdu(img_request);
>  	u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
>  	u64 length = blk_rq_bytes(rq);
> -	u64 mapping_size;
>  	int result;
>
>  	/* Ignore/skip any zero-length requests */
> @@ -4738,17 +4755,9 @@ static void rbd_queue_workfn(struct work_struct *work)
>  	blk_mq_start_request(rq);
>
>  	down_read(&rbd_dev->header_rwsem);
> -	mapping_size = rbd_dev->mapping.size;
>  	rbd_img_capture_header(img_request);
>  	up_read(&rbd_dev->header_rwsem);
>
> -	if (offset + length > mapping_size) {
> -		rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)", offset,
> -			 length, mapping_size);
> -		result = -EIO;
> -		goto err_img_request;
> -	}
> -
>  	dout("%s rbd_dev %p img_req %p %s %llu~%llu\n", __func__, rbd_dev,
>  	     img_request, obj_op_name(op_type), offset, length);
>