* [PATCH] vfio: Request THP-aligned mmap for device fds
@ 2026-06-16 18:01 Anthony Pighin
  2026-06-16 22:30 ` Alex Williamson
  0 siblings, 1 reply; 2+ messages in thread
From: Anthony Pighin @ 2026-06-16 18:01 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-kernel, stable, Kefeng Wang, Vlastimil Babka, Andrew Morton,
	kvm

VFIO PCI devices support PMD-sized page table entries for BAR mappings
via their huge_fault handler (vfio_pci_mmap_huge_fault).  However, the
VFIO device file_operations never provided a get_unmapped_area callback
to request PMD-aligned virtual address placement from the mmap address
allocator.

Before commit 34d7cf637c43 ("mm: don't try THP alignment for FS without
get_unmapped_area"), this was masked by a bug introduced in commit
ed48e87c7df3 ("thp: add thp_get_unmapped_area_vmflags()") which
inadvertently applied THP alignment to all file-backed mappings,
regardless of whether they provided a get_unmapped_area callback.

When commit 34d7cf637c43 ("mm: don't try THP alignment for FS without
get_unmapped_area") correctly restricted THP alignment to anonymous
mappings and files that explicitly opt in via get_unmapped_area, VFIO BAR
mappings lost their PMD-aligned placement.  Since the huge_fault handler
requires both the VMA start address and the physical PFN to be
PMD-aligned, unaligned VMAs force a fallback to 4KB page faults.
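As an illustration of the alignment requirement described above, here is a minimal userspace sketch of the kind of check involved.  It is not the kernel's code: huge_entry_fits(), SKETCH_PAGE_SHIFT and all parameter names are hypothetical, and 4KiB base pages are assumed.  It shows why an unaligned VMA start defeats huge mappings even when the BAR's first PFN is aligned: every PMD-aligned virtual address then lands on a misaligned PFN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4KiB base pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical predicate: can one huge entry of the given order
 * (9 = PMD on x86_64 with 4KiB pages) cover the fault at fault_addr?
 * first_pfn is the PFN that backs vma_start.
 */
static bool huge_entry_fits(uint64_t fault_addr, uint64_t vma_start,
			    uint64_t vma_end, uint64_t first_pfn,
			    unsigned int order)
{
	uint64_t size = (uint64_t)1 << (SKETCH_PAGE_SHIFT + order);
	uint64_t base = fault_addr & ~(size - 1);	/* round down */
	uint64_t pfn;

	/* The whole huge range must lie inside the VMA. */
	if (base < vma_start || base + size > vma_end)
		return false;

	/* The PFN backing the aligned address must itself be aligned. */
	pfn = first_pfn + ((base - vma_start) >> SKETCH_PAGE_SHIFT);
	return (pfn & (((uint64_t)1 << order) - 1)) == 0;
}
```

With a VMA start that is off by a single 4KiB page, the PFN offset at every PMD boundary is 511 modulo 512, so the predicate fails for the entire mapping and each fault falls back to a base page.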

For example, a 2GiB BAR results in 524,288 individual page faults
instead of 1,024 PMD-sized faults, increasing the VFIO_IOMMU_MAP_DMA
pinning time by orders of magnitude -- a regression directly visible to
KVM guests during PCI device initialization.
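The fault counts quoted above follow from dividing the BAR size by the size each fault maps.  A small sketch of that arithmetic (faults_needed() is a hypothetical helper, and the mapping is assumed to start on a boundary of the mapping size):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of faults needed to populate bar_size bytes when each fault
 * installs one entry covering map_size bytes.  Rounds up for a
 * partial trailing entry.
 */
static uint64_t faults_needed(uint64_t bar_size, uint64_t map_size)
{
	assert(map_size != 0);
	return (bar_size + map_size - 1) / map_size;
}
```

For a 2GiB BAR, faults_needed(2ULL << 30, 4096) gives 524,288 and faults_needed(2ULL << 30, 2ULL << 20) gives 1,024, matching the numbers in the paragraph above.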

Fix this by providing a get_unmapped_area callback in vfio_device_fops,
following the same pattern used by ext4, xfs, btrfs, fuse, and other
subsystems that benefit from THP-aligned placement.

Fixes: 34d7cf637c43 ("mm: don't try THP alignment for FS without get_unmapped_area")
Cc: stable@vger.kernel.org
Cc: Alex Williamson <alex@shazbot.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kvm@vger.kernel.org
Signed-off-by: Anthony Pighin <anthony.pighin@nokia.com>
---
 drivers/vfio/vfio_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 6222376ab6ab..2dbb1a84dbac 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -40,6 +40,7 @@
 #include <linux/interval_tree.h>
 #include <linux/iova_bitmap.h>
 #include <linux/iommufd.h>
+#include <linux/huge_mm.h>
 #include "vfio.h"
 
 #define DRIVER_VERSION	"0.3"
@@ -1461,6 +1462,7 @@ const struct file_operations vfio_device_fops = {
 	.unlocked_ioctl	= vfio_device_fops_unl_ioctl,
 	.compat_ioctl	= compat_ptr_ioctl,
 	.mmap		= vfio_device_fops_mmap,
+	.get_unmapped_area = thp_get_unmapped_area,
 #ifdef CONFIG_PROC_FS
 	.show_fdinfo	= vfio_device_show_fdinfo,
 #endif
-- 
2.43.0


* Re: [PATCH] vfio: Request THP-aligned mmap for device fds
  2026-06-16 18:01 [PATCH] vfio: Request THP-aligned mmap for device fds Anthony Pighin
@ 2026-06-16 22:30 ` Alex Williamson
  0 siblings, 0 replies; 2+ messages in thread
From: Alex Williamson @ 2026-06-16 22:30 UTC (permalink / raw)
  To: Anthony Pighin
  Cc: linux-kernel, stable, Kefeng Wang, Vlastimil Babka, Andrew Morton,
	kvm, alex, Matthew Wilcox, Jason Gunthorpe, Peter Xu

On Tue, 16 Jun 2026 14:01:29 -0400
Anthony Pighin <anthony.pighin@nokia.com> wrote:

> VFIO PCI devices support PMD-sized page table entries for BAR mappings
> via their huge_fault handler (vfio_pci_mmap_huge_fault).  However, the
> VFIO device file_operations never provided a get_unmapped_area callback
> to request PMD-aligned virtual address placement from the mmap address
> allocator.
> 
> Before commit 34d7cf637c43 ("mm: don't try THP alignment for FS without
> get_unmapped_area"), this was masked by a bug introduced in commit
> ed48e87c7df3 ("thp: add thp_get_unmapped_area_vmflags()") which
> inadvertently applied THP alignment to all file-backed mappings,
> regardless of whether they provided a get_unmapped_area callback.
> 
> When commit 34d7cf637c43 ("mm: don't try THP alignment for FS without
> get_unmapped_area") correctly restricted THP alignment to anonymous
> mappings and files that explicitly opt in via get_unmapped_area, VFIO BAR
> mappings lost their PMD-aligned placement.  Since the huge_fault handler
> requires both the VMA start address and the physical PFN to be
> PMD-aligned, unaligned VMAs force a fallback to 4KB page faults.
> 
> For example, a 2GiB BAR results in 524,288 individual page faults
> instead of 1,024 PMD-sized faults, increasing the VFIO_IOMMU_MAP_DMA
> pinning time by orders of magnitude -- a regression directly visible to
> KVM guests during PCI device initialization.
> 
> Fix this by providing a get_unmapped_area callback in vfio_device_fops,
> following the same pattern used by ext4, xfs, btrfs, fuse, and other
> subsystems that benefit from THP-aligned placement.

The trouble is that PMD alignment isn't right either: your 1,024 PMD
faults on a 2GiB BAR would be 2 faults on x86_64 with PUD mappings.
QEMU has forced the alignment to make it optimal for some time[1], so
there are userspace VMM options.  It seems you were previously
getting lucky.
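
For reference, one common way a userspace VMM can force alignment is to over-reserve address space and trim the excess.  The sketch below is a generic illustration of that technique, not QEMU's actual code; reserve_aligned() is a hypothetical helper, and both size and align are assumed to be multiples of the base page size, with align a power of two.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Reserve size bytes of PROT_NONE address space whose start is
 * aligned to align.  Returns MAP_FAILED on error.
 */
static void *reserve_aligned(size_t size, size_t align)
{
	size_t total = size + align;
	size_t head, tail;
	uint8_t *raw, *aligned;

	/* Reject non-power-of-two alignment and size overflow. */
	if (align == 0 || (align & (align - 1)) || total < size)
		return MAP_FAILED;

	/* Over-reserve so an aligned start must exist inside the range. */
	raw = mmap(NULL, total, PROT_NONE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (raw == MAP_FAILED)
		return MAP_FAILED;

	aligned = (uint8_t *)(((uintptr_t)raw + align - 1) &
			      ~((uintptr_t)align - 1));
	head = (size_t)(aligned - raw);
	tail = total - head - size;

	/* Trim the unused space before and after the aligned window. */
	if (head)
		munmap(raw, head);
	if (tail)
		munmap(aligned + size, tail);

	return aligned;
}
```

The caller would then map the device region over the reservation with MAP_FIXED, e.g. mmap(p, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, device_fd, region_offset), where device_fd and region_offset are placeholders for the VFIO device fd and region offset.  Passing 1GiB as align would allow PUD mappings on x86_64, 2MiB only PMD mappings.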

Peter Xu was working on a more comprehensive solution[2] late last
year, but it seems there was an objection to the
file_operations.get_mapping_order() proposal before Plumbers and the
thread hasn't rekindled.

Gentle bump to Peter and Willy that maybe we could resurrect that
effort.  Thanks,

Alex

[1] https://gitlab.com/qemu-project/qemu/-/commit/00b519c0bca0
[2] https://lore.kernel.org/all/20251204151003.171039-1-peterx@redhat.com/
