linux-mm.kvack.org archive mirror
From: "Cédric Le Goater" <clg@redhat.com>
To: Peter Xu <peterx@redhat.com>,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: Jason Gunthorpe <jgg@nvidia.com>, Nico Pache <npache@redhat.com>,
	Zi Yan <ziy@nvidia.com>, Alex Mastro <amastro@fb.com>,
	David Hildenbrand <david@redhat.com>,
	Alex Williamson <alex@shazbot.org>, Zhi Wang <zhiw@nvidia.com>,
	David Laight <david.laight.linux@gmail.com>,
	Yi Liu <yi.l.liu@intel.com>, Ankit Agrawal <ankita@nvidia.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings
Date: Thu, 4 Dec 2025 19:16:54 +0100
Message-ID: <e2033095-9bf1-4d9c-9a5b-01148eaffc30@redhat.com>
In-Reply-To: <20251204151003.171039-1-peterx@redhat.com>

On 12/4/25 16:09, Peter Xu wrote:
> This series is based on v6.18.  It allows mmap(!MAP_FIXED) to work with
> huge pfnmaps with best effort.  Meanwhile, it enables it for vfio-pci as
> the first user.
> 
> v1: https://lore.kernel.org/r/20250613134111.469884-1-peterx@redhat.com
> 
> A changelog may not apply because all the patches were rewritten based on a
> new interface this v2 introduced.  Hence omitted.
> 
> In this version, a new file operation, get_mapping_order(), is introduced
> (based on discussion with Jason on v1) to minimize the code needed for
> drivers to implement this.  It also helps avoid exporting any mm functions.
> One can refer to the discussion in v1 for more information.
> 
> Currently, the get_mapping_order() API is defined as:
> 
>    int (*get_mapping_order)(struct file *file, unsigned long pgoff, size_t len);
> 
> The first argument is the file pointer, the 2nd+3rd are the pgoff+len
> specified from a mmap() request.  The driver can use this interface to
> opt in to providing mapping order hints to core mm on VA allocations for
> the range of the file specified.  I kept the interface simple for now, so
> that core mm will always do the alignment with pgoff, assuming that would
> always work.  The driver can only report the order from pgoff+len, which
> will be used to do the alignment.
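
To illustrate the kind of decision such a hook could make, here is a
minimal userspace sketch. It is not the code from the series: the helper
name pick_mapping_order(), the 4 KiB base page, the PMD/PUD orders, and the
reading of "order" as a page order (log2 of the number of base pages) are
all assumptions, and the struct file argument is dropped. It reports the
largest order for which the requested file range contains at least one
naturally aligned block of that size.

```c
#include <stddef.h>

/* Assumptions for illustration only (x86-64-like layout). */
#define PAGE_SHIFT 12	/* 4 KiB base pages */
#define PMD_ORDER  9	/* 2 MiB = 2^(12 + 9) bytes */
#define PUD_ORDER  18	/* 1 GiB = 2^(12 + 18) bytes */

/*
 * Hypothetical stand-in for a driver's get_mapping_order() logic.
 * pgoff is the file offset in pages, len the mapping length in bytes.
 * Returns the largest candidate order whose naturally aligned block
 * fits entirely inside [pgoff << PAGE_SHIFT, (pgoff << PAGE_SHIFT) + len),
 * or 0 when only base pages make sense.
 */
static int pick_mapping_order(unsigned long pgoff, size_t len)
{
	static const int orders[] = { PUD_ORDER, PMD_ORDER };
	unsigned long long start = (unsigned long long)pgoff << PAGE_SHIFT;
	unsigned long long end = start + len;
	size_t i;

	for (i = 0; i < sizeof(orders) / sizeof(orders[0]); i++) {
		unsigned long long size = 1ULL << (PAGE_SHIFT + orders[i]);
		/* First offset >= start that is a multiple of size. */
		unsigned long long aligned = (start + size - 1) & ~(size - 1);

		if (aligned + size <= end)
			return orders[i];
	}
	return 0;
}
```

Under these assumptions, a 128MB range starting at offset 0 yields the PMD
order, while a 2MB range starting one page into the file yields 0 because
no aligned 2MB block fits inside it.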
> 
> Before this series, a userapp in most cases needs to be modified to benefit
> from huge mappings, by providing a huge-size-aligned VA using MAP_FIXED.
> After this series, the userapp can benefit from huge pfnmaps automatically
> after a kernel upgrade, with no userspace modifications.
> 
> It's still best-effort, because the auto-alignment will require a larger VA
> range to be allocated via the per-arch allocator, hence if the huge-mapping
> aligned VA cannot be allocated then it'll still fall back to small mappings
> like before.  However, that's from a theoretical POV: in reality I don't yet
> know when it would fail, especially on a 64-bit system.
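
The "larger VA range" step can be pictured with a small sketch. This is an
illustration of the general over-allocate-then-align technique, not the
series' implementation; align_va_to_offset() is a hypothetical name. The
idea is that the allocator is asked for len + align bytes, and the returned
start is then advanced until it is congruent to the file offset modulo the
huge size.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * raw:   start of a free VA range of at least len + align bytes
 * off:   file offset of the mapping, in bytes
 * align: huge mapping size in bytes, must be a power of two
 *
 * Returns the lowest address >= raw whose low bits match those of off,
 * so that huge-size-aligned file offsets land on huge-size-aligned
 * virtual addresses. The result is always below raw + align, which is
 * why the padded range is enough.
 */
static uint64_t align_va_to_offset(uint64_t raw, uint64_t off, uint64_t align)
{
	return raw + ((off - raw) & (align - 1));
}
```

If the padded request cannot be satisfied, the caller would retry with the
plain length, which corresponds to the fallback to small mappings described
above.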
> 
> So far, only vfio-pci is supported.  But the logic should be applicable to
> all the drivers that support or will support huge pfnmaps.  I've copied
> some more people in this version too, from a hardware perspective.
> 
> For testing:
> 
> - checkpatch.pl
> - cross build harness
> - unit test that I got from Alex [1], checking mmap() alignments on a QEMU
>    instance with a 128MB BAR.
> 
> The alignments all look sane with mmap(!MAP_FIXED), and huge mappings are
> properly installed.  I didn't observe anything wrong.
> 
> I currently lack larger BARs to test PUD sizes.  Please kindly report if
> one can run this with 1G+ BARs and hit issues.

LGTM, with a 32G BAR:

Using device 0000:02:00.0 in IOMMU group 27
Device 0000:02:00.0 supports 9 regions, 5 irqs
[BAR0]: size 0x1000000, order 24, offset 0x0, flags 0xf
Testing BAR0, require at least 21 bit alignment
[PASS] Minimum alignment 21
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
[BAR1]: size 0x800000000, order 35, offset 0x10000000000, flags 0x7
Testing BAR1, require at least 30 bit alignment
[PASS] Minimum alignment 31
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
[BAR3]: size 0x2000000, order 25, offset 0x30000000000, flags 0x7
Testing BAR3, require at least 21 bit alignment
[PASS] Minimum alignment 21
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
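
For readers unfamiliar with the "N bit alignment" wording in the output, a
minimal sketch of how such a check could be computed follows. It is not
taken from the test program; alignment_bits() and meets_alignment() are
hypothetical names. The alignment of a mapped address is taken to be the
number of trailing zero bits in it.

```c
#include <stdint.h>

/*
 * Number of trailing zero bits in addr, i.e. log2 of the largest
 * power of two that divides it. Address 0 is treated as aligned to
 * every power of two (64 bits).
 */
static int alignment_bits(uint64_t addr)
{
	int bits = 0;

	if (addr == 0)
		return 64;
	while ((addr & 1) == 0) {
		addr >>= 1;
		bits++;
	}
	return bits;
}

/* Non-zero when addr is aligned to at least 2^required_bits bytes. */
static int meets_alignment(uint64_t addr, int required_bits)
{
	return alignment_bits(addr) >= required_bits;
}
```

With this reading, BAR1 passing with 31 bits against a requirement of 30
simply means the address happened to be better aligned than required.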


C.

> 
> Alex Mastro: thanks for the testing offered in v1, but since this series
> was rewritten, a re-test will be needed.  I hence didn't collect the T-b.
> 
> Comments welcomed, thanks.
> 
> [1] https://github.com/awilliam/tests/blob/vfio-pci-device-map-alignment/vfio-pci-device-map-alignment.c
> 
> Peter Xu (4):
>    mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment
>    mm: Add file_operations.get_mapping_order()
>    vfio: Introduce vfio_device_ops.get_mapping_order hook
>    vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
> 
>   Documentation/filesystems/vfs.rst |  4 +++
>   drivers/vfio/pci/vfio_pci.c       |  1 +
>   drivers/vfio/pci/vfio_pci_core.c  | 49 ++++++++++++++++++++++++++
>   drivers/vfio/vfio_main.c          | 14 ++++++++
>   include/linux/fs.h                |  1 +
>   include/linux/huge_mm.h           |  5 +--
>   include/linux/vfio.h              |  5 +++
>   include/linux/vfio_pci_core.h     |  2 ++
>   mm/huge_memory.c                  |  7 ++--
>   mm/mmap.c                         | 58 +++++++++++++++++++++++++++----
>   10 files changed, 135 insertions(+), 11 deletions(-)
> 



Thread overview: 17+ messages
2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
2025-12-04 15:10 ` [PATCH v2 1/4] mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment Peter Xu
2025-12-04 15:10 ` [PATCH v2 2/4] mm: Add file_operations.get_mapping_order() Peter Xu
2025-12-04 15:19   ` Peter Xu
2025-12-08  9:21     ` Matthew Wilcox
2025-12-10 20:24       ` Peter Xu
2025-12-07 16:21   ` Jason Gunthorpe
2025-12-10 20:23     ` Peter Xu
2025-12-04 15:10 ` [PATCH v2 3/4] vfio: Introduce vfio_device_ops.get_mapping_order hook Peter Xu
2025-12-04 15:10 ` [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings Peter Xu
2025-12-05  4:33   ` kernel test robot
2025-12-05  7:45   ` kernel test robot
2025-12-07 16:26   ` Jason Gunthorpe
2025-12-10 20:43     ` Peter Xu
2025-12-08  3:11   ` Alex Mastro
2025-12-04 18:16 ` Cédric Le Goater [this message]
2025-12-07  9:13 ` [PATCH v2 0/4] mm/vfio: " Alex Mastro