From: Christoph Hellwig <hch@infradead.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
sw.prabhu6@gmail.com, axboe@kernel.dk, io-uring@vger.kernel.org,
linux-kernel@vger.kernel.org, dave@stgolabs.net,
dongjoo.seo1@samsung.com, Swarna Prabhu <s.prabhu@samsung.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Matthew Wilcox <willy@infradead.org>, Zi Yan <ziy@nvidia.com>
Subject: Re: [RFC v1] io_uring/rsrc: add fast path huge page handling in buffer registration
Date: Wed, 10 Jun 2026 04:34:19 -0700
Message-ID: <ailLu70plC9WK2dB@infradead.org>
In-Reply-To: <f2b5189f-10de-4685-97f3-6ee08d159743@kernel.org>
On Wed, Jun 10, 2026 at 11:54:01AM +0200, David Hildenbrand (Arm) wrote:
> > Yes. iov_iter_extract_bvecs and thus the block direct I/O fast path
> > would instantly benefit from that.
> The tricky bit for such an interface is that, soon, some pages won't be folios,
> but we could still end up with non-folio pages in the address space (e.g.,
> vm_insert_page()) and have to pin+return them. So using folios is not future-proof.
I'm still doubtful about the "soon" because of all the issues like this
in the I/O path.
> There are some long-term plans on providing an interface that would abstract how
> you refcount something you GUP'ed. (because, some pages we GUP in the future
> might not even have a dedicated refcount, all still fairly unclear). But it's
> all not really finalized I think.
>
> For now, we could expose a folio+page/offset+nr_pages interface, where we,
> long-term, would not be able to return non-folio pages (e.g., vm_insert_page())
> and would instead, in the future, fail the request if we stumble over a
> non-folio thing in the page tables. That sounds reasonable for now.
I think whatever we're going to use for direct I/O has to also support
non-folio pages, especially PCI P2P memory. So coming up with an
interface that supports this ASAP would be helpful.
> Another solution would be, exposing page-ranges (e.g., page + nr_pages), whereby
> we'd say, that all pages in a range belong to the same compound page, and that
> we took a single reference for all pages in the range. IOW, page_folio() would
> for now be the same for all pages in a range.
This does sound like a reasonable short-term improvement. One annoying
issue with returning only order-0 pages in the current interfaces is
that it fills up the pages array in the caller for no good reason.
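
To illustrate the page-range idea, here is a rough userspace sketch, not
kernel code. All names (struct page_ref, struct page_range,
coalesce_page_ranges) are hypothetical, and "same compound page" is
modelled by comparing the pfn of the compound head. It only shows how a
per-page array collapses into (first page, nr_pages) entries:

```c
#include <stddef.h>

/* Hypothetical model of one order-0 page as returned today. */
struct page_ref {
	unsigned long pfn;      /* page frame number of this page */
	unsigned long head_pfn; /* pfn of the compound head it belongs to */
};

/* Hypothetical range entry: one entry (and one reference) per range. */
struct page_range {
	unsigned long first_pfn; /* first page in the range */
	size_t nr_pages;         /* number of contiguous pages covered */
};

/*
 * Merge consecutive pages into ranges. Two neighbouring entries are
 * merged only if they are physically contiguous and share the same
 * compound head. 'out' must have room for 'n' entries (worst case).
 * Returns the number of ranges written.
 */
static size_t coalesce_page_ranges(const struct page_ref *pages, size_t n,
				   struct page_range *out)
{
	size_t nr_ranges = 0;

	for (size_t i = 0; i < n; i++) {
		if (nr_ranges > 0) {
			const struct page_ref *prev = &pages[i - 1];

			if (pages[i].pfn == prev->pfn + 1 &&
			    pages[i].head_pfn == prev->head_pfn) {
				/* Same compound page, contiguous: extend. */
				out[nr_ranges - 1].nr_pages++;
				continue;
			}
		}
		/* Start a new range at this page. */
		out[nr_ranges].first_pfn = pages[i].pfn;
		out[nr_ranges].nr_pages = 1;
		nr_ranges++;
	}
	return nr_ranges;
}
```

With 4 KiB base pages, a fully covered 2 MiB huge page is 512 entries in
the per-page array but a single entry in the range form, which is the
array bloat mentioned above.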