From: Peter Xu <peterx@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kvm@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>,
Alex Williamson <alex.williamson@redhat.com>,
Zi Yan <ziy@nvidia.com>, Alex Mastro <amastro@fb.com>,
David Hildenbrand <david@redhat.com>,
Nico Pache <npache@redhat.com>
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
Date: Wed, 2 Jul 2025 16:58:46 -0400
Message-ID: <aGWdhnw7TKZKH5WM@x1.local>
In-Reply-To: <20250630140537.GW167785@nvidia.com>
On Mon, Jun 30, 2025 at 11:05:37AM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 25, 2025 at 03:26:44PM -0400, Peter Xu wrote:
> > On Wed, Jun 25, 2025 at 03:41:54PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Jun 25, 2025 at 01:12:11PM -0400, Peter Xu wrote:
> > >
> > > > After I read the two use cases, I mostly agree. Just one trivial thing to
> > > > mention, it may not be direct map but vmap() (see io_region_init_ptr()).
> > >
> > > If it is vmapped then this is all silly, you should vmap and mmap
> > > using the same cache colouring and, AFAIK, pgoff is how this works for
> > > purely userspace.
> > >
> > > Once vmap'd it should determine the cache colour and set the pgoff
> > > properly, then everything should already work no?
> >
> > I don't yet see how to set the pgoff. Here pgoff is passed from the
> > userspace, which follows io_uring's definition (per io_uring_mmap).
>
> That's too bad
>
> So you have to do it the other way and pass the pgoff to the vmap so
> the vmap ends up with the same colouring as a user VMA holding the
> same pages..
I'm not sure I get that point, but it'll be hard to achieve at least.
The vmap() happens when the io_uring instance is created (that is when the
submission/completion queues are initialized). The mmap() happens later,
and it can also happen multiple times, so every VA that gets mmap()ed
needs to share the same colouring as the vmap(). In this case it sounds
reasonable to me to have the alignment done at mmap(), against the vmap()
results.
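
To illustrate aligning at mmap() time against the vmap() result, here is a
minimal user-space sketch. It only shows the arithmetic of picking a user VA
with the same cache colour as an existing kernel address. The name
colour_align() and the span value used in the example are hypothetical; the
real colouring constraints are architecture-defined and are not taken from
this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the lowest address >= hint whose low
 * "colour" bits match those of kaddr (e.g. a vmap() address).
 * span is the colour period in bytes and must be a power of two.
 */
static uint64_t colour_align(uint64_t hint, uint64_t kaddr, uint64_t span)
{
	uint64_t mask = span - 1;

	/* Keep the high bits of the hint, take the colour bits from kaddr. */
	uint64_t va = (hint & ~mask) | (kaddr & mask);

	/* If that moved us below the hint, step to the next colour period. */
	if (va < hint)
		va += span;

	return va;
}
```

Because the kernel address is fixed at vmap() time, each later mmap() can run
the same computation independently and still end up with matching colours.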
>
> > So if we want the new API to be proposed here, and make VFIO use it first
> > (while considering it applicable to at least all existing MMU users,
> > all of which I have checked so far), I think this would be proper:
> >
> > int (*mmap_va_hint)(struct file *file, unsigned long *pgoff, size_t len);
> >
> > The changes compared to the previous version:
> >
> > (1) merged pgoff and *phys_pgoff parameters into one unsigned long, so
> > the hook can adjust the pgoff that the VA allocator will use. The
> > adjustment will not be visible to the later mmap() when the VMA is created.
>
> It seems functional, but the above is better, IMHO.
Do you mean we can start with no modification allowed on *pgoff? I'd
prefer having *pgoff modifiable from the start. It would then work for the
io_uring / parisc case above from day one (so we don't need to add it on
top later and modify existing users), and it would also be cleaner for the
current VFIO use case.
>
> > (2) I renamed it to mmap_va_hint(), because *pgoff can now be updated,
> > so it's no longer only about the order, but about both the "order" and
> > "pgoff adjustment" hints that the core mm will use when calculating the VA.
>
> Where does order come back though? Returns order?
Yes.
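
A rough user-space model of how such a hook might behave is sketched below.
Only the hook signature comes from the discussion above. The region encoding
(REGION_SHIFT), the demo_bar_pfn[] table, demo_mmap_va_hint() and
demo_align_va() are all hypothetical names made up for illustration, and the
alignment step is just one plausible way for the core mm to consume the
returned order and the adjusted pgoff.

```c
#include <errno.h>
#include <stddef.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PMD_ORDER	9	/* 2MB with 4K pages */
#define REGION_SHIFT	28	/* hypothetical: region index lives above this bit */

struct file;			/* opaque stand-in for the kernel type */

/* Hypothetical per-region base page-frame numbers of the backing memory. */
static const unsigned long demo_bar_pfn[] = { 0x80000UL, 0x90200UL };
#define DEMO_NR_REGIONS	(sizeof(demo_bar_pfn) / sizeof(demo_bar_pfn[0]))

/*
 * Same signature as the proposed hook. Rewrites *pgoff from the
 * user-visible encoding to the backing page-frame number, and returns
 * the preferred order (or a negative errno).
 */
static int demo_mmap_va_hint(struct file *file, unsigned long *pgoff, size_t len)
{
	unsigned long index = *pgoff >> REGION_SHIFT;
	unsigned long off = *pgoff & ((1UL << REGION_SHIFT) - 1);

	(void)file;
	if (index >= DEMO_NR_REGIONS)
		return -EINVAL;

	*pgoff = demo_bar_pfn[index] + off;
	return len >= (PAGE_SIZE << PMD_ORDER) ? PMD_ORDER : 0;
}

/*
 * Caller side: advance va so that, within a (PAGE_SIZE << order) block,
 * it has the same offset as the adjusted pgoff.
 */
static unsigned long demo_align_va(unsigned long va, unsigned long pgoff, int order)
{
	unsigned long size = PAGE_SIZE << order;
	unsigned long off = pgoff << PAGE_SHIFT;

	return va + ((off - va) & (size - 1));
}
```

In this model the adjusted pgoff is only a local copy consumed by the VA
calculation, which matches the point that the adjustment is not visible when
the VMA is created later.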
>
> It seems viable
Once I've double-checked the API above, I'll go and prepare a new version.
Thanks a lot, Jason.
--
Peter Xu