From: Christoph Hellwig <hch@infradead.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
sw.prabhu6@gmail.com, axboe@kernel.dk, io-uring@vger.kernel.org,
linux-kernel@vger.kernel.org, dave@stgolabs.net,
dongjoo.seo1@samsung.com, Swarna Prabhu <s.prabhu@samsung.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Matthew Wilcox <willy@infradead.org>, Zi Yan <ziy@nvidia.com>
Subject: Re: [RFC v1] io_uring/rsrc: add fast path huge page handling in buffer registration
Date: Wed, 10 Jun 2026 04:34:19 -0700
Message-ID: <ailLu70plC9WK2dB@infradead.org>
In-Reply-To: <f2b5189f-10de-4685-97f3-6ee08d159743@kernel.org>
On Wed, Jun 10, 2026 at 11:54:01AM +0200, David Hildenbrand (Arm) wrote:
> > Yes. iov_iter_extract_bvecs and thus the block direct I/O fast path
> > would instantly benefit from that.
> The tricky bit for such an interface is that, soon, some pages won't be folios,
> but we could still end up with non-folio pages in the address space (e.g.,
> vm_insert_page()) and have to pin+return them. So using folios is not future-proof.
I'm still doubtful about the "soon" because of all the issues like this
in the I/O path.
> There are some long-term plans on providing an interface that would abstract how
> you refcount something you GUP'ed. (because, some pages we GUP in the future
> might not even have a dedicated refcount, all still fairly unclear). But it's
> all not really finalized I think.
>
> For now, we could expose a folio+page/offset+nr_pages interface, where we,
> long-term, would not be able to return non-folio pages (e.g., vm_insert_page())
> and would instead, in the future, fail the request if we stumble over a
> non-folio thing in the page tables. That sounds reasonable for now.
I think whatever we're going to use for direct I/O has to also support
non-folio pages, especially PCI P2P memory. So coming up with an
interface that supports this ASAP would be helpful.
> Another solution would be, exposing page-ranges (e.g., page + nr_pages), whereby
> we'd say, that all pages in a range belong to the same compound page, and that
> we took a single reference for all pages in the range. IOW, page_folio() would
> for now be the same for all pages in a range.
This does sound like a reasonable short-term improvement. One annoying
issue with returning only order-0 pages in the current interfaces is
that it fills up the pages array in the caller for no good reason.
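
To illustrate the array pressure and the page + nr_pages idea, here is a
rough user-space sketch, not kernel code. Every name in it (struct
model_page, struct page_range, coalesce_pages) is hypothetical, and the
head_pfn field is only a stand-in for what page_folio() would report. It
assumes that pages sharing a compound head and adjacent in physical
address order can be reported as one range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a pinned page; not a kernel type. */
struct model_page {
	unsigned long pfn;	/* page frame number */
	unsigned long head_pfn;	/* pfn of the compound head (stand-in for page_folio()) */
};

/* Hypothetical range descriptor: first page plus a page count. */
struct page_range {
	unsigned long first_pfn;
	unsigned long nr_pages;
};

/*
 * Merge consecutive entries that are physically contiguous and share a
 * compound head into a single range. Returns the number of ranges
 * written, which never exceeds max_ranges; entries that do not fit are
 * left unprocessed.
 */
static size_t coalesce_pages(const struct model_page *pages, size_t nr,
			     struct page_range *ranges, size_t max_ranges)
{
	size_t n = 0;

	assert(ranges != NULL);

	for (size_t i = 0; i < nr; i++) {
		/* Extend the current range if this page continues it. */
		if (n > 0 &&
		    pages[i].head_pfn == pages[i - 1].head_pfn &&
		    pages[i].pfn == ranges[n - 1].first_pfn + ranges[n - 1].nr_pages) {
			ranges[n - 1].nr_pages++;
			continue;
		}

		if (n == max_ranges)
			break;

		/* Otherwise start a new single-page range. */
		ranges[n].first_pfn = pages[i].pfn;
		ranges[n].nr_pages = 1;
		n++;
	}

	return n;
}
```

In this model a fully mapped 2M huge page with 4K base pages takes one
range entry instead of 512 page entries, which is the saving the range
proposal is after. How the single reference per range would be taken and
dropped is left out here, since that part is still open in this thread.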