From: Jason Gunthorpe <jgg@ziepe.ca>
To: Lorenzo Stoakes <ljs@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>,
Peter Xu <peterx@redhat.com>, Alex Williamson <alex@shazbot.org>,
Anthony Pighin <anthony.pighin@nokia.com>,
linux-kernel@vger.kernel.org,
Kefeng Wang <wangkefeng.wang@huawei.com>,
kvm@vger.kernel.org, linux-mm@kvack.org,
"Liam R. Howlett" <liam@infradead.org>,
Ryan Roberts <ryan.roberts@arm.com>
Subject: Re: [PATCH] vfio: Request THP-aligned mmap for device fds
Date: Thu, 18 Jun 2026 12:28:05 -0300
Message-ID: <20260618152805.GF231643@ziepe.ca>
In-Reply-To: <ajQDVAu9b-LX8gUQ@lucifer>
On Thu, Jun 18, 2026 at 03:55:58PM +0100, Lorenzo Stoakes wrote:
> > A pfn driver often has a single already known physical range that it
> > will use for the VMA and that range should drive the alignment
> > decision of the VMA.
> >
> > vfio in particular has common use cases where you want to mmap from
> > weird offsets, but we still want to achieve a VMA starting point that
> > has pa % PUD_SIZE == va % PUD_SIZE. It is impossible to do this if the
> > thing building info does not know pa.
> >
> > I do think it makes sense that no file provider should be computing
> > the VA area itself, I think I made that case when Peter was last
> > working on this. Now that we have Lorenzo's mmap changes maybe we
> > should be talking about supporting VFIO by having a callback to obtain
> > the starting pfn for the VMA. Usable only by drivers like VFIO that
> > are working with the pfn functions.
>
> Can't we figure this out from what the driver tells us when it invokes an
> mmap_prepare action?
VFIO installs the pages via a fault handler, so there is no naturally
existing way to pass in the pfn?
> Can't we figure it out from the PFN the driver tells mmap_prepare about?
Maybe it can pass the pfn anyhow and not have the mmap logic map
anything?
> > Maybe other users would prefer a 'max order' callback and then the mm
> > would assume the VMA will be populated with pgoff aligned folios up
> > to that highest order?
>
> Not in favour of that, fear it'll be seen as a new go-faster stripe. Ask
> somebody how many free pints they want and they may veer rather towards the
> upper bound :)
I think you need something; otherwise we will be aligning VMAs that
never hold anything larger than a 2M THP to 1GB boundaries, which
doesn't seem good.
Jason
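
To make the alignment condition from this thread concrete (pa % PUD_SIZE ==
va % PUD_SIZE, and not asking for 1GB alignment when the range can never
hold more than a 2M mapping), here is a minimal user-space sketch in C. It
is not kernel code and not a proposed API: the sketch_* names are
hypothetical, and the 4K/2M/1G sizes are assumed x86-64 values. Overflow
near the top of the address space is ignored.

```c
#include <stdint.h>

/* Assumed x86-64 sizes; the real values are architecture dependent. */
#define SKETCH_PAGE_SIZE (UINT64_C(1) << 12) /* 4 KiB */
#define SKETCH_PMD_SIZE  (UINT64_C(1) << 21) /* 2 MiB */
#define SKETCH_PUD_SIZE  (UINT64_C(1) << 30) /* 1 GiB */

/*
 * Hypothetical helper: return the lowest va >= hint such that
 * va % align == pa % align. align must be a power of two.
 */
static uint64_t sketch_va_congruent_to_pa(uint64_t hint, uint64_t pa,
					   uint64_t align)
{
	uint64_t mask = align - 1;
	uint64_t want = pa & mask;           /* offset of pa in its block */
	uint64_t va = (hint & ~mask) | want; /* same offset in hint's block */

	if (va < hint)
		va += align;                 /* move to the next block */
	return va;
}

/*
 * Hypothetical helper: pick the largest alignment that can pay off for
 * the physical range [pa, pa + len). A size only qualifies if the range
 * contains at least one naturally aligned block of that size.
 */
static uint64_t sketch_useful_align(uint64_t pa, uint64_t len)
{
	static const uint64_t sizes[] = { SKETCH_PUD_SIZE, SKETCH_PMD_SIZE };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		uint64_t size = sizes[i];
		uint64_t first = (pa + size - 1) & ~(size - 1); /* round up */
		uint64_t skip = first - pa;

		if (first >= pa && skip <= len && len - skip >= size)
			return size;
	}
	return SKETCH_PAGE_SIZE;
}
```

For a 16 MiB range at pa 0x4010200000, sketch_useful_align() returns the 2M
size rather than 1G, since the range cannot contain a 1G-aligned block.
Either helper needs pa as an input, which is the reason the code choosing
the VA has to learn the starting pfn from the driver.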