From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Keith Busch <kbusch@kernel.org>,
	Chaitanya Kulkarni <kch@nvidia.com>, Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] io_uring/rsrc: don't use blk_rq_nr_phys_segments() as number of bvecs
Date: Wed, 12 Nov 2025 09:59:31 +0800
Message-ID: <aRPqA1XGWnY4YpIm@fedora>
In-Reply-To: <CADUfDZovn5fPh_E6GGvGkPYbW12L2z6BS4jPkpQjuEjNd=bRGA@mail.gmail.com>

On Tue, Nov 11, 2025 at 05:44:18PM -0800, Caleb Sander Mateos wrote:
> On Tue, Nov 11, 2025 at 5:01 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Tue, Nov 11, 2025 at 12:15:29PM -0700, Caleb Sander Mateos wrote:
> > > io_buffer_register_bvec() currently uses blk_rq_nr_phys_segments() as
> > > the number of bvecs in the request. However, bvecs may be split into
> > > multiple segments depending on the queue limits. Thus, the number of
> > > segments may overestimate the number of bvecs. For ublk devices, the
> > > only current users of io_buffer_register_bvec(), virt_boundary_mask,
> > > seg_boundary_mask, max_segments, and max_segment_size can all be set
> > > arbitrarily by the ublk server process.
> > >
> > > Set imu->nr_bvecs based on the number of bvecs the rq_for_each_bvec()
> > > loop actually yields. However, continue using blk_rq_nr_phys_segments()
> > > as an upper bound on the number of bvecs when allocating imu to avoid
> > > needing to iterate the bvecs a second time.
> > >
> > > Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
> > > Fixes: 27cb27b6d5ea ("io_uring: add support for kernel registered bvecs")
> >
> > Reviewed-by: Ming Lei <ming.lei@redhat.com>
> >
> > BTW, this issue may not be a problem because ->nr_bvecs is only used in
> > iov_iter_bvec(), in which 'offset' and 'len' can control how far the
> > iterator can reach, so the uninitialized bvecs won't be touched basically.
>
> I see your point, but what about iov_iter_extract_bvec_pages()? That
> looks like it only uses i->nr_segs to bound the iteration, not
> i->count. Hopefully there aren't any other helpers relying on nr_segs.

iov_iter_extract_bvec_pages() is only called from iov_iter_extract_pages(),
in which 'maxsize' is capped by i->count.

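To illustrate the point about i->count, here is a minimal user-space sketch.
It is not kernel code: struct toy_bvec, struct toy_iter and toy_walk() are
hypothetical stand-ins invented for this example. They only model the idea
that a walk bounded by the byte count stops before reaching trailing array
slots, even when nr_segs overestimates the number of valid entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct bio_vec. */
struct toy_bvec {
	size_t len;       /* bytes described by this entry */
	int initialized;  /* 0 models a trailing slot that was never filled in */
};

/* Hypothetical, simplified stand-in for struct iov_iter. */
struct toy_iter {
	const struct toy_bvec *bvec;
	size_t nr_segs;   /* may overestimate the number of valid entries */
	size_t count;     /* bytes the iterator is allowed to cover */
};

/*
 * Walk the entries, consuming at most min(maxsize, it->count) bytes.
 * Returns the number of entries touched, or -1 if the walk reached an
 * entry that was never initialized.
 */
static long toy_walk(const struct toy_iter *it, size_t maxsize)
{
	size_t remaining = maxsize < it->count ? maxsize : it->count;
	long touched = 0;

	assert(it->bvec != NULL);

	for (size_t i = 0; i < it->nr_segs && remaining > 0; i++) {
		size_t n;

		if (!it->bvec[i].initialized)
			return -1;

		n = it->bvec[i].len < remaining ? it->bvec[i].len : remaining;
		remaining -= n;
		touched++;
	}
	return touched;
}
```

With two valid 4096-byte entries, nr_segs = 4 and count = 8192, toy_walk()
stops after the second entry. Only a count larger than the bytes actually
described would push the walk into an unset slot, which is the case an exact
nr_bvecs rules out regardless of count.
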
> If you really don't think it's a problem, I'm fine deferring the patch
> to 6.19. We haven't encountered any problems caused by this bug, but
> we haven't tested with any non-default virt_boundary_mask,
> seg_boundary_mask, max_segments, or max_segment_size on the ublk
> device.

IMO it belongs in v6.18: your fix not only makes the code more robust, it
is also the correct thing to do.

I am just wondering why the issue was never triggered, given that we have
lots of test cases (rw verify, mkfs & mount ...).

>
> >
> > Otherwise, the issue should have been triggered somewhere.
> >
> > Also the bvec allocation may be avoided in case of single-bio request,
> > which can be one future optimization.
>
> I'm not sure what you're suggesting. The bio_vec array is a flexible
> array member of io_mapped_ubuf, so unless we add another pointer
> indirection, I don't see how to reuse the bio's bi_io_vec array.
> io_mapped_ubuf is also used for user registered buffers, where this
> optimization isn't possible, so it may not be a clear win.

io_mapped_ubuf->acct_pages is one field that could be reused for the
indirect pointer; please see lo_rw_aio() for how to reuse the bvec array.

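The indirect-pointer idea can be sketched in user space as follows. This is
only an illustration under assumptions: struct toy_mapped_buf, its
is_borrowed flag, and the helpers toy_alloc() and toy_bvecs() are
hypothetical names, not the io_uring structures, and the union is an assumed
layout for reusing the acct_pages slot. How lo_rw_aio() does this in the
kernel is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct bio_vec. */
struct toy_bvec {
	size_t len;
};

/* Hypothetical model of a mapped buffer with a flexible bvec array. */
struct toy_mapped_buf {
	size_t nr_bvecs;
	int is_borrowed;  /* assumed flag: 1 if the bvec array is not owned */
	union {
		unsigned int acct_pages;          /* user-registered buffers */
		const struct toy_bvec *borrowed;  /* borrowed array (assumption) */
	};
	struct toy_bvec inline_bvec[];    /* used only when not borrowed */
};

/* Return the bvec array, whichever storage backs it. */
static const struct toy_bvec *toy_bvecs(const struct toy_mapped_buf *b)
{
	return b->is_borrowed ? b->borrowed : b->inline_bvec;
}

/*
 * Allocate a buffer. When 'borrow' is set, no bvec storage is allocated and
 * the caller's array is referenced directly; the caller must keep it alive.
 * Otherwise the entries are copied into the flexible array.
 */
static struct toy_mapped_buf *toy_alloc(const struct toy_bvec *src,
					size_t n, int borrow)
{
	size_t extra = borrow ? 0 : n * sizeof(struct toy_bvec);
	struct toy_mapped_buf *b = calloc(1, sizeof(*b) + extra);

	assert(src != NULL);
	if (!b)
		return NULL;

	b->nr_bvecs = n;
	b->is_borrowed = borrow;
	if (borrow) {
		b->borrowed = src;
	} else {
		for (size_t i = 0; i < n; i++)
			b->inline_bvec[i] = src[i];
	}
	return b;
}
```

In this model the borrowed case skips the per-entry copy and the extra
allocation, at the cost of a branch in the accessor and a lifetime
dependency on the source array.
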
Thanks,
Ming