From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alex Williamson <alex.williamson@redhat.com>,
Zi Yan <ziy@nvidia.com>, Jason Gunthorpe <jgg@nvidia.com>,
Alex Mastro <amastro@fb.com>,
David Hildenbrand <david@redhat.com>,
Nico Pache <npache@redhat.com>,
peterx@redhat.com
Subject: [PATCH 0/5] mm/vfio: huge pfnmaps with !MAP_FIXED mappings
Date: Fri, 13 Jun 2025 09:41:06 -0400
Message-ID: <20250613134111.469884-1-peterx@redhat.com>
[based on latest akpm/mm-new as of June 12th 2025, commit 19d47edf9]
This series enables !MAP_FIXED huge pfnmaps for vfio-pci.
Before this series, a userapp in most cases needed to be modified to benefit
from huge mappings, by providing a huge-size-aligned VA using MAP_FIXED.
After this series, the userapp can benefit from huge pfnmaps automatically
once the kernel is upgraded, with no userspace modifications.
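For context, the userspace workaround needed without this series can be
sketched roughly as below. This is only an illustration, not code from the
series: mmap_aligned() is a hypothetical helper name, len is assumed to be a
multiple of the page size, and the caller is assumed to pass an alignment
(e.g. 2M) that also matches the alignment of the BAR's physical address.
With vfio-pci, fd and offset would come from the device fd and its region
info; the helper itself does not care.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes at a VA aligned to 'align' (a power
 * of two) by reserving a padded range first and then using MAP_FIXED.
 */
void *mmap_aligned(size_t len, size_t align, int prot, int flags,
                   int fd, off_t offset)
{
    assert(align && (align & (align - 1)) == 0);

    /* Reserve len + align bytes so an aligned start must exist inside. */
    size_t padded = len + align;
    uint8_t *base = mmap(NULL, padded, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Round the reserved start up to the requested alignment. */
    uintptr_t start = ((uintptr_t)base + align - 1) & ~((uintptr_t)align - 1);

    /* Replace part of the reservation with the real mapping. */
    void *p = mmap((void *)start, len, prot, flags | MAP_FIXED, fd, offset);
    if (p == MAP_FAILED) {
        munmap(base, padded);
        return MAP_FAILED;
    }

    /* Give back the unused head and tail of the reservation. */
    size_t head = start - (uintptr_t)base;
    size_t tail = padded - head - len;
    if (head)
        munmap(base, head);
    if (tail)
        munmap((uint8_t *)start + len, tail);

    return p;
}
```

The point of the series is that none of this should be needed any more: a
plain mmap() without MAP_FIXED is expected to return a suitably aligned
address on its own.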
It's still best-effort, because the auto-alignment requires a larger VA
range to be allocated via the per-arch allocator; if the huge-mapping-aligned
VA cannot be allocated, it'll still fall back to small mappings like
before. However, that's really only a theoretical concern: in practice I
don't yet know of a case where it would fail on any 64-bit system because
of this.
So far, only vfio-pci is supported. But the logic should be applicable to
all the drivers that support or will support huge pfnmaps.
Kudos go to Jason for the suggestion:
https://lore.kernel.org/r/20250530131050.GA233377@nvidia.com
Though instead of refactoring shmem, I found we already have a function we
can directly reuse for THP calculations.
The idea is fairly simple too: make sure that whatever virtual address is
returned from an mmap() request on the MMIO BAR regions is
huge-size-aligned with the physical address of the corresponding BAR.
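To make "huge-size-aligned with the physical address" concrete, here is a
small standalone sketch of the arithmetic. It is not the kernel code from
this series; align_va_to_pa() is a hypothetical name, and the sketch assumes
the caller already holds a free VA range padded by at least huge_size bytes.
The goal is that the returned VA and the physical address share the same
offset within a huge page, so that PMD/PUD-sized entries can be installed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the lowest address >= range_start whose
 * offset within a huge page equals that of 'phys'.  huge_size must be a
 * power of two (e.g. 2M for PMD mappings on x86_64).
 */
uint64_t align_va_to_pa(uint64_t range_start, uint64_t phys,
                        uint64_t huge_size)
{
    assert(huge_size && (huge_size & (huge_size - 1)) == 0);

    uint64_t mask = huge_size - 1;
    uint64_t want = phys & mask;        /* offset of PA in its huge page */
    uint64_t have = range_start & mask; /* offset of VA in its huge page */

    /* Distance to the next VA with the wanted offset; always < huge_size. */
    return range_start + ((want - have) & mask);
}
```

Because the adjustment is always smaller than huge_size, padding the
requested length by one huge page is enough to guarantee that the adjusted
range still fits inside the range that was found.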
It contains minimal mm changes, in reality only renaming and exporting the
THP function so that it can be reused. That is patch 3.
Patches 1 & 2 are trivial cleanups that I found while looking at this
problem. They can even be posted separately if anyone would like me to.
Patch 4 adds the plumbing needed to wire vfio-pci up to the mmap()
operations of vfio_device. Then, patch 5 is the real meat.
For testing: besides checkpatch and my daily cross-build harness, unit
tests from both myself [1] (based on another Alex's test program) and Alex
all work fine; they check that the alignments look sane with
mmap(!MAP_FIXED) and that huge mappings are properly installed.
Alex Mastro: please feel free to try this out with your internal tests. The
hope is that once this series is applied, your app gets huge pfnmaps
without any changes (with any pgoff specified). Logically, there should be
minimal dependencies for stable branches wherever huge pfnmap is available.
Comments welcome, thanks.
[1] https://github.com/xzpeter/clibs/blob/master/misc/vfio-pci-nofix.c
[2] https://github.com/awilliam/tests/blob/vfio-pci-device-map-alignment/vfio-pci-device-map-alignment.c
Peter Xu (5):
mm: Deduplicate mm_get_unmapped_area()
mm/hugetlb: Remove prepare_hugepage_range()
mm: Rename __thp_get_unmapped_area to mm_get_unmapped_area_aligned
vfio: Introduce vfio_device_ops.get_unmapped_area hook
vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
arch/loongarch/include/asm/hugetlb.h | 14 ------
arch/mips/include/asm/hugetlb.h | 14 ------
drivers/vfio/pci/vfio_pci.c | 3 ++
drivers/vfio/pci/vfio_pci_core.c | 65 ++++++++++++++++++++++++++++
drivers/vfio/vfio_main.c | 18 ++++++++
fs/hugetlbfs/inode.c | 8 +---
include/asm-generic/hugetlb.h | 8 ----
include/linux/huge_mm.h | 14 +++++-
include/linux/hugetlb.h | 6 ---
include/linux/vfio.h | 7 +++
include/linux/vfio_pci_core.h | 6 +++
mm/huge_memory.c | 6 ++-
mm/mmap.c | 5 +--
13 files changed, 120 insertions(+), 54 deletions(-)
--
2.49.0