* [PATCH v2 0/3] vfio: selftests: Add MMIO DMA mapping test
@ 2026-01-13 23:08 Alex Mastro
  2026-01-13 23:08 ` [PATCH v2 1/3] vfio: selftests: Centralize IOMMU mode name definitions Alex Mastro
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Alex Mastro @ 2026-01-13 23:08 UTC
  To: Alex Williamson, David Matlack, Shuah Khan
  Cc: Peter Xu, linux-kernel, kvm, linux-kselftest, Jason Gunthorpe,
	Alex Mastro

Test IOMMU mapping of the BAR mmaps created during
vfio_pci_device_setup().

All IOMMU modes are tested: the vfio_type1 and vfio_type1v2 container
modes are expected to succeed, while the iommufd modes (compat and
native) are currently expected to fail. The iommufd compat expectation
can be updated to success once kernel support lands; native iommufd will
not support mapping vaddrs backed by MMIO (it will support dma-buf based
MMIO mapping instead).
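
As a minimal sketch of the per-mode expectation described above (the
helper name is hypothetical and not part of this series; the two mode
strings are the vfio_type1 MODE_* values that patch 1/3 defines):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mode name strings, matching the MODE_* macros from patch 1/3. */
#define MODE_VFIO_TYPE1_IOMMU "vfio_type1_iommu"
#define MODE_VFIO_TYPE1V2_IOMMU "vfio_type1v2_iommu"

/*
 * Hypothetical helper, for illustration only: returns true when IOMMU
 * mapping of an MMIO-backed vaddr is currently expected to succeed for
 * the given mode name. Every other mode is expected to fail for now.
 */
bool mmio_map_expected_to_succeed(const char *mode)
{
	assert(mode);

	return !strcmp(mode, MODE_VFIO_TYPE1_IOMMU) ||
	       !strcmp(mode, MODE_VFIO_TYPE1V2_IOMMU);
}
```

If iommufd compat support lands, only this predicate would need to grow
the two compat mode names.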

To: Alex Williamson <alex@shazbot.org>
To: David Matlack <dmatlack@google.com>
To: Shuah Khan <shuah@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: Jason Gunthorpe <jgg@ziepe.ca>

Changes in v2:
- Split into patch series
- Factor out mmap_aligned() for vaddr alignment
- Align BAR mmaps to improve hugepage IOMMU mapping efficiency
- Centralize MODE_* string definitions
- Add is_power_of_2() assertion for BAR size
- Simplify align calculation to min(size, 1G)
- Add map_bar_misaligned test case
- Link to v1: https://lore.kernel.org/all/aWA4GKp5ld92sY6e@devgpu015.cco6.facebook.com

Signed-off-by: Alex Mastro <amastro@fb.com>
---
Alex Mastro (3):
      vfio: selftests: Centralize IOMMU mode name definitions
      vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
      vfio: selftests: Add vfio_dma_mapping_mmio_test

 tools/testing/selftests/vfio/Makefile              |   1 +
 tools/testing/selftests/vfio/lib/include/libvfio.h |   9 ++
 .../selftests/vfio/lib/include/libvfio/iommu.h     |   6 +
 tools/testing/selftests/vfio/lib/iommu.c           |  12 +-
 tools/testing/selftests/vfio/lib/libvfio.c         |  25 ++++
 tools/testing/selftests/vfio/lib/vfio_pci_device.c |  24 +++-
 .../selftests/vfio/vfio_dma_mapping_mmio_test.c    | 144 +++++++++++++++++++++
 .../testing/selftests/vfio/vfio_dma_mapping_test.c |   2 +-
 8 files changed, 215 insertions(+), 8 deletions(-)
---
base-commit: d721f52e31553a848e0e9947ca15a49c5674aef3
change-id: 20260112-map-mmio-test-b4e4c2d917a9

Best regards,
-- 
Alex Mastro <amastro@fb.com>



* [PATCH v2 1/3] vfio: selftests: Centralize IOMMU mode name definitions
  2026-01-13 23:08 [PATCH v2 0/3] vfio: selftests: Add MMIO DMA mapping test Alex Mastro
@ 2026-01-13 23:08 ` Alex Mastro
  2026-01-13 23:08 ` [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping Alex Mastro
  2026-01-13 23:08 ` [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test Alex Mastro
  2 siblings, 0 replies; 8+ messages in thread
From: Alex Mastro @ 2026-01-13 23:08 UTC
  To: Alex Williamson, David Matlack, Shuah Khan
  Cc: Peter Xu, linux-kernel, kvm, linux-kselftest, Jason Gunthorpe,
	Alex Mastro

Replace scattered string literals with MODE_* macros in iommu.h. This
provides a single source of truth for IOMMU mode name strings.

Signed-off-by: Alex Mastro <amastro@fb.com>
---
 tools/testing/selftests/vfio/lib/include/libvfio/iommu.h |  6 ++++++
 tools/testing/selftests/vfio/lib/iommu.c                 | 12 ++++++------
 tools/testing/selftests/vfio/vfio_dma_mapping_test.c     |  2 +-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h b/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h
index 5c9b9dc6d993..e9a3386a4719 100644
--- a/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h
+++ b/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h
@@ -61,6 +61,12 @@ iova_t iommu_hva2iova(struct iommu *iommu, void *vaddr);
 
 struct iommu_iova_range *iommu_iova_ranges(struct iommu *iommu, u32 *nranges);
 
+#define MODE_VFIO_TYPE1_IOMMU "vfio_type1_iommu"
+#define MODE_VFIO_TYPE1V2_IOMMU "vfio_type1v2_iommu"
+#define MODE_IOMMUFD_COMPAT_TYPE1 "iommufd_compat_type1"
+#define MODE_IOMMUFD_COMPAT_TYPE1V2 "iommufd_compat_type1v2"
+#define MODE_IOMMUFD "iommufd"
+
 /*
  * Generator for VFIO selftests fixture variants that replicate across all
  * possible IOMMU modes. Tests must define FIXTURE_VARIANT_ADD_IOMMU_MODE()
diff --git a/tools/testing/selftests/vfio/lib/iommu.c b/tools/testing/selftests/vfio/lib/iommu.c
index 8079d43523f3..27d1d13abfeb 100644
--- a/tools/testing/selftests/vfio/lib/iommu.c
+++ b/tools/testing/selftests/vfio/lib/iommu.c
@@ -21,32 +21,32 @@
 #include "../../../kselftest.h"
 #include <libvfio.h>
 
-const char *default_iommu_mode = "iommufd";
+const char *default_iommu_mode = MODE_IOMMUFD;
 
 /* Reminder: Keep in sync with FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(). */
 static const struct iommu_mode iommu_modes[] = {
 	{
-		.name = "vfio_type1_iommu",
+		.name = MODE_VFIO_TYPE1_IOMMU,
 		.container_path = "/dev/vfio/vfio",
 		.iommu_type = VFIO_TYPE1_IOMMU,
 	},
 	{
-		.name = "vfio_type1v2_iommu",
+		.name = MODE_VFIO_TYPE1V2_IOMMU,
 		.container_path = "/dev/vfio/vfio",
 		.iommu_type = VFIO_TYPE1v2_IOMMU,
 	},
 	{
-		.name = "iommufd_compat_type1",
+		.name = MODE_IOMMUFD_COMPAT_TYPE1,
 		.container_path = "/dev/iommu",
 		.iommu_type = VFIO_TYPE1_IOMMU,
 	},
 	{
-		.name = "iommufd_compat_type1v2",
+		.name = MODE_IOMMUFD_COMPAT_TYPE1V2,
 		.container_path = "/dev/iommu",
 		.iommu_type = VFIO_TYPE1v2_IOMMU,
 	},
 	{
-		.name = "iommufd",
+		.name = MODE_IOMMUFD,
 	},
 };
 
diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c
index 5397822c3dd4..7cd396aa205c 100644
--- a/tools/testing/selftests/vfio/vfio_dma_mapping_test.c
+++ b/tools/testing/selftests/vfio/vfio_dma_mapping_test.c
@@ -166,7 +166,7 @@ TEST_F(vfio_dma_mapping_test, dma_map_unmap)
 	 * IOMMUFD compatibility-mode does not support huge mappings when
 	 * using VFIO_TYPE1_IOMMU.
 	 */
-	if (!strcmp(variant->iommu_mode, "iommufd_compat_type1"))
+	if (!strcmp(variant->iommu_mode, MODE_IOMMUFD_COMPAT_TYPE1))
 		mapping_size = SZ_4K;
 
 	ASSERT_EQ(0, rc);

-- 
2.47.3



* [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
  2026-01-13 23:08 [PATCH v2 0/3] vfio: selftests: Add MMIO DMA mapping test Alex Mastro
  2026-01-13 23:08 ` [PATCH v2 1/3] vfio: selftests: Centralize IOMMU mode name definitions Alex Mastro
@ 2026-01-13 23:08 ` Alex Mastro
  2026-01-14 17:30   ` David Matlack
  2026-01-13 23:08 ` [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test Alex Mastro
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Mastro @ 2026-01-13 23:08 UTC
  To: Alex Williamson, David Matlack, Shuah Khan
  Cc: Peter Xu, linux-kernel, kvm, linux-kselftest, Jason Gunthorpe,
	Alex Mastro

Update vfio_pci_bar_map() to align BAR mmaps for efficient huge page
mappings. The manual mmap alignment can be removed once mmap(!MAP_FIXED)
on vfio device fds improves to automatically return well-aligned
addresses.
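
For readers unfamiliar with the technique, here is a standalone sketch
of the over-reserve-and-trim approach. It is an illustration rather than
the code in this patch: reserve_aligned() is a hypothetical name, and it
assumes that align is a power of two and that size, align and offset are
all multiples of the page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve size bytes of PROT_NONE address space at
 * an address satisfying (vaddr % align) == offset. Returns MAP_FAILED
 * on error. The caller unmaps or maps over the returned range.
 */
void *reserve_aligned(size_t size, size_t align, size_t offset)
{
	uintptr_t base, aligned;
	size_t delta, head, tail;
	void *map;

	/* Assumptions: power-of-two alignment, offset below it. */
	assert(align && !(align & (align - 1)));
	assert(offset < align);
	delta = align - offset;

	/* Over-reserve by align so a suitable start always fits. */
	map = mmap(NULL, size + align, PROT_NONE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return MAP_FAILED;

	base = (uintptr_t)map;

	/* Round (base + delta) up to align, then step back by delta. */
	aligned = ((base + delta + align - 1) & ~((uintptr_t)align - 1)) - delta;

	head = aligned - base;	/* unused bytes before the result */
	tail = align - head;	/* unused bytes after it, always > 0 */

	/* Trim the slack on both sides of the aligned window. */
	if (head && munmap((void *)base, head))
		return MAP_FAILED;
	if (munmap((void *)(aligned + size), tail))
		return MAP_FAILED;

	return (void *)aligned;
}
```

The returned range can then be passed as the address hint to
mmap(MAP_FIXED) on the device fd, which replaces the reservation in
place.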

Signed-off-by: Alex Mastro <amastro@fb.com>
---
 tools/testing/selftests/vfio/lib/include/libvfio.h |  9 ++++++++
 tools/testing/selftests/vfio/lib/libvfio.c         | 25 ++++++++++++++++++++++
 tools/testing/selftests/vfio/lib/vfio_pci_device.c | 24 ++++++++++++++++++++-
 3 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vfio/lib/include/libvfio.h b/tools/testing/selftests/vfio/lib/include/libvfio.h
index 279ddcd70194..5ebf8503586e 100644
--- a/tools/testing/selftests/vfio/lib/include/libvfio.h
+++ b/tools/testing/selftests/vfio/lib/include/libvfio.h
@@ -23,4 +23,13 @@
 const char *vfio_selftests_get_bdf(int *argc, char *argv[]);
 char **vfio_selftests_get_bdfs(int *argc, char *argv[], int *nr_bdfs);
 
+/*
+ * Reserve virtual address space of size at an address satisfying
+ * (vaddr % align) == offset.
+ *
+ * Returns the reserved vaddr. The caller is responsible for unmapping
+ * the returned region.
+ */
+void *mmap_aligned(size_t size, size_t align, size_t offset);
+
 #endif /* SELFTESTS_VFIO_LIB_INCLUDE_LIBVFIO_H */
diff --git a/tools/testing/selftests/vfio/lib/libvfio.c b/tools/testing/selftests/vfio/lib/libvfio.c
index a23a3cc5be69..4529bb1e69d1 100644
--- a/tools/testing/selftests/vfio/lib/libvfio.c
+++ b/tools/testing/selftests/vfio/lib/libvfio.c
@@ -2,6 +2,9 @@
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <sys/mman.h>
+
+#include <linux/align.h>
 
 #include "../../../kselftest.h"
 #include <libvfio.h>
@@ -76,3 +79,25 @@ const char *vfio_selftests_get_bdf(int *argc, char *argv[])
 
 	return vfio_selftests_get_bdfs(argc, argv, &nr_bdfs)[0];
 }
+
+void *mmap_aligned(size_t size, size_t align, size_t offset)
+{
+	void *map_base, *map_align;
+	size_t delta;
+
+	VFIO_ASSERT_GT(align, offset);
+	delta = align - offset;
+
+	map_base = mmap(NULL, size + align, PROT_NONE,
+			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	VFIO_ASSERT_NE(map_base, MAP_FAILED);
+
+	map_align = (void *)(ALIGN((uintptr_t)map_base + delta, align) - delta);
+
+	if (map_align > map_base)
+		VFIO_ASSERT_EQ(munmap(map_base, map_align - map_base), 0);
+
+	VFIO_ASSERT_EQ(munmap(map_align + size, map_base + align - map_align), 0);
+
+	return map_align;
+}
diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
index 13fdb4b0b10f..03f35011b5f7 100644
--- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c
+++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
@@ -12,10 +12,14 @@
 #include <sys/mman.h>
 
 #include <uapi/linux/types.h>
+#include <linux/align.h>
 #include <linux/iommufd.h>
+#include <linux/kernel.h>
 #include <linux/limits.h>
+#include <linux/log2.h>
 #include <linux/mman.h>
 #include <linux/overflow.h>
+#include <linux/sizes.h>
 #include <linux/types.h>
 #include <linux/vfio.h>
 
@@ -124,20 +128,38 @@ static void vfio_pci_region_get(struct vfio_pci_device *device, int index,
 static void vfio_pci_bar_map(struct vfio_pci_device *device, int index)
 {
 	struct vfio_pci_bar *bar = &device->bars[index];
+	size_t align, size;
+	void *vaddr;
 	int prot = 0;
 
 	VFIO_ASSERT_LT(index, PCI_STD_NUM_BARS);
 	VFIO_ASSERT_NULL(bar->vaddr);
 	VFIO_ASSERT_TRUE(bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP);
+	VFIO_ASSERT_TRUE(is_power_of_2(bar->info.size));
 
 	if (bar->info.flags & VFIO_REGION_INFO_FLAG_READ)
 		prot |= PROT_READ;
 	if (bar->info.flags & VFIO_REGION_INFO_FLAG_WRITE)
 		prot |= PROT_WRITE;
 
-	bar->vaddr = mmap(NULL, bar->info.size, prot, MAP_FILE | MAP_SHARED,
+	size = bar->info.size;
+
+	/*
+	 * Align BAR mmaps to improve page fault granularity during potential
+	 * subsequent IOMMU mapping of these BAR vaddr. 1G for x86 is the
+	 * largest hugepage size across any architecture, so no benefit from
+	 * larger alignment. BARs smaller than 1G will be aligned by their
+	 * power-of-two size, guaranteeing sufficient alignment for smaller
+	 * hugepages, if present.
+	 */
+	align = min_t(size_t, size, SZ_1G);
+
+	vaddr = mmap_aligned(size, align, 0);
+	bar->vaddr = mmap(vaddr, size, prot, MAP_SHARED | MAP_FIXED,
 			  device->fd, bar->info.offset);
 	VFIO_ASSERT_NE(bar->vaddr, MAP_FAILED);
+
+	madvise(bar->vaddr, size, MADV_HUGEPAGE);
 }
 
 static void vfio_pci_bar_unmap(struct vfio_pci_device *device, int index)

-- 
2.47.3



* [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test
  2026-01-13 23:08 [PATCH v2 0/3] vfio: selftests: Add MMIO DMA mapping test Alex Mastro
  2026-01-13 23:08 ` [PATCH v2 1/3] vfio: selftests: Centralize IOMMU mode name definitions Alex Mastro
  2026-01-13 23:08 ` [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping Alex Mastro
@ 2026-01-13 23:08 ` Alex Mastro
  2026-01-14 17:52   ` David Matlack
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Mastro @ 2026-01-13 23:08 UTC
  To: Alex Williamson, David Matlack, Shuah Khan
  Cc: Peter Xu, linux-kernel, kvm, linux-kselftest, Jason Gunthorpe,
	Alex Mastro

Test IOMMU mapping of the BAR mmaps created during
vfio_pci_device_setup().

All IOMMU modes are tested: the vfio_type1 and vfio_type1v2 container
modes are expected to succeed, while the iommufd modes (compat and
native) are currently expected to fail. The iommufd compat expectation
can be updated to success once kernel support lands; native iommufd will
not support mapping vaddrs backed by MMIO (it will support dma-buf based
MMIO mapping instead).

Signed-off-by: Alex Mastro <amastro@fb.com>
---
 tools/testing/selftests/vfio/Makefile              |   1 +
 .../selftests/vfio/vfio_dma_mapping_mmio_test.c    | 144 +++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/tools/testing/selftests/vfio/Makefile b/tools/testing/selftests/vfio/Makefile
index 3c796ca99a50..ead27892ab65 100644
--- a/tools/testing/selftests/vfio/Makefile
+++ b/tools/testing/selftests/vfio/Makefile
@@ -1,5 +1,6 @@
 CFLAGS = $(KHDR_INCLUDES)
 TEST_GEN_PROGS += vfio_dma_mapping_test
+TEST_GEN_PROGS += vfio_dma_mapping_mmio_test
 TEST_GEN_PROGS += vfio_iommufd_setup_test
 TEST_GEN_PROGS += vfio_pci_device_test
 TEST_GEN_PROGS += vfio_pci_device_init_perf_test
diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_mmio_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_mmio_test.c
new file mode 100644
index 000000000000..5a86b34329ad
--- /dev/null
+++ b/tools/testing/selftests/vfio/vfio_dma_mapping_mmio_test.c
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <stdio.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+#include <uapi/linux/types.h>
+#include <linux/pci_regs.h>
+#include <linux/sizes.h>
+#include <linux/vfio.h>
+
+#include <libvfio.h>
+
+#include "../kselftest_harness.h"
+
+static const char *device_bdf;
+
+static struct vfio_pci_bar *largest_mapped_bar(struct vfio_pci_device *device)
+{
+	u32 flags = VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_WRITE;
+	struct vfio_pci_bar *largest = NULL;
+	u64 bar_size = 0;
+
+	for (int i = 0; i < PCI_STD_NUM_BARS; i++) {
+		struct vfio_pci_bar *bar = &device->bars[i];
+
+		if (!bar->vaddr)
+			continue;
+
+		/*
+		 * iommu_map() maps with READ|WRITE, so require the same
+		 * abilities for the underlying VFIO region.
+		 */
+		if ((bar->info.flags & flags) != flags)
+			continue;
+
+		if (bar->info.size > bar_size) {
+			bar_size = bar->info.size;
+			largest = bar;
+		}
+	}
+
+	return largest;
+}
+
+FIXTURE(vfio_dma_mapping_mmio_test) {
+	struct iommu *iommu;
+	struct vfio_pci_device *device;
+	struct iova_allocator *iova_allocator;
+	struct vfio_pci_bar *bar;
+};
+
+FIXTURE_VARIANT(vfio_dma_mapping_mmio_test) {
+	const char *iommu_mode;
+};
+
+#define FIXTURE_VARIANT_ADD_IOMMU_MODE(_iommu_mode)			       \
+FIXTURE_VARIANT_ADD(vfio_dma_mapping_mmio_test, _iommu_mode) {		       \
+	.iommu_mode = #_iommu_mode,					       \
+}
+
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES();
+
+#undef FIXTURE_VARIANT_ADD_IOMMU_MODE
+
+FIXTURE_SETUP(vfio_dma_mapping_mmio_test)
+{
+	self->iommu = iommu_init(variant->iommu_mode);
+	self->device = vfio_pci_device_init(device_bdf, self->iommu);
+	self->iova_allocator = iova_allocator_init(self->iommu);
+	self->bar = largest_mapped_bar(self->device);
+
+	if (!self->bar)
+		SKIP(return, "No mappable BAR found on device %s", device_bdf);
+
+	if (self->bar->info.size < 2 * getpagesize())
+		SKIP(return, "BAR too small (size=0x%llx)", self->bar->info.size);
+}
+
+FIXTURE_TEARDOWN(vfio_dma_mapping_mmio_test)
+{
+	iova_allocator_cleanup(self->iova_allocator);
+	vfio_pci_device_cleanup(self->device);
+	iommu_cleanup(self->iommu);
+}
+
+static void do_mmio_map_test(struct iommu *iommu,
+			     struct iova_allocator *iova_allocator,
+			     void *vaddr, size_t size)
+{
+	struct dma_region region = {
+		.vaddr = vaddr,
+		.size = size,
+		.iova = iova_allocator_alloc(iova_allocator, size),
+	};
+
+	/*
+	 * NOTE: Check for iommufd compat success once it lands. Native iommufd
+	 * will never support this.
+	 */
+	if (!strcmp(iommu->mode->name, MODE_VFIO_TYPE1V2_IOMMU) ||
+	    !strcmp(iommu->mode->name, MODE_VFIO_TYPE1_IOMMU)) {
+		iommu_map(iommu, &region);
+		iommu_unmap(iommu, &region);
+	} else {
+		VFIO_ASSERT_NE(__iommu_map(iommu, &region), 0);
+		VFIO_ASSERT_NE(__iommu_unmap(iommu, &region, NULL), 0);
+	}
+}
+
+TEST_F(vfio_dma_mapping_mmio_test, map_full_bar)
+{
+	do_mmio_map_test(self->iommu, self->iova_allocator,
+			 self->bar->vaddr, self->bar->info.size);
+}
+
+TEST_F(vfio_dma_mapping_mmio_test, map_partial_bar)
+{
+	do_mmio_map_test(self->iommu, self->iova_allocator,
+			 self->bar->vaddr, getpagesize());
+}
+
+/* Test IOMMU mapping of BAR mmap with intentionally poor vaddr alignment. */
+TEST_F(vfio_dma_mapping_mmio_test, map_bar_misaligned)
+{
+	/* Limit size to bound test time for large BARs */
+	size_t size = min_t(size_t, self->bar->info.size, SZ_1G);
+	size_t page_size = getpagesize();
+	void *vaddr;
+
+	vaddr = mmap_aligned(size, SZ_1G, page_size);
+	vaddr = mmap(vaddr, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
+		     self->device->fd, self->bar->info.offset);
+	VFIO_ASSERT_NE(vaddr, MAP_FAILED);
+
+	do_mmio_map_test(self->iommu, self->iova_allocator, vaddr, size);
+
+	VFIO_ASSERT_EQ(munmap(vaddr, size), 0);
+}
+
+int main(int argc, char *argv[])
+{
+	device_bdf = vfio_selftests_get_bdf(&argc, argv);
+	return test_harness_run(argc, argv);
+}

-- 
2.47.3



* Re: [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
  2026-01-13 23:08 ` [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping Alex Mastro
@ 2026-01-14 17:30   ` David Matlack
  2026-01-14 18:44     ` Alex Mastro
  0 siblings, 1 reply; 8+ messages in thread
From: David Matlack @ 2026-01-14 17:30 UTC
  To: Alex Mastro
  Cc: Alex Williamson, Shuah Khan, Peter Xu, linux-kernel, kvm,
	linux-kselftest, Jason Gunthorpe

On 2026-01-13 03:08 PM, Alex Mastro wrote:
> Update vfio_pci_bar_map() to align BAR mmaps for efficient huge page
> mappings. The manual mmap alignment can be removed once mmap(!MAP_FIXED)
> on vfio device fds improves to automatically return well-aligned
> addresses.

Please also mention that you added MADV_HUGEPAGE and why, and that you
dropped MAP_FILE (just mention that it was unnecessary in the first
place).
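
For reference, a small compile-time check of why dropping MAP_FILE
changes nothing. This is a sketch that assumes a Linux libc, where
MAP_FILE is a compatibility flag defined as 0; it is not portable to
every platform.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>

/*
 * Assumption: Linux, where MAP_FILE is defined as 0 and ignored by
 * mmap(). OR-ing it into the flags therefore has no effect.
 */
static_assert(MAP_FILE == 0, "MAP_FILE is expected to be 0 on Linux");
static_assert((MAP_FILE | MAP_SHARED) == MAP_SHARED,
	      "MAP_FILE | MAP_SHARED should equal MAP_SHARED");
```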

> 
> Signed-off-by: Alex Mastro <amastro@fb.com>
> ---
>  tools/testing/selftests/vfio/lib/include/libvfio.h |  9 ++++++++
>  tools/testing/selftests/vfio/lib/libvfio.c         | 25 ++++++++++++++++++++++
>  tools/testing/selftests/vfio/lib/vfio_pci_device.c | 24 ++++++++++++++++++++-
>  3 files changed, 57 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/vfio/lib/include/libvfio.h b/tools/testing/selftests/vfio/lib/include/libvfio.h
> index 279ddcd70194..5ebf8503586e 100644
> --- a/tools/testing/selftests/vfio/lib/include/libvfio.h
> +++ b/tools/testing/selftests/vfio/lib/include/libvfio.h
> @@ -23,4 +23,13 @@
>  const char *vfio_selftests_get_bdf(int *argc, char *argv[]);
>  char **vfio_selftests_get_bdfs(int *argc, char *argv[], int *nr_bdfs);
>  
> +/*
> + * Reserve virtual address space of size at an address satisfying
> + * (vaddr % align) == offset.
> + *
> + * Returns the reserved vaddr. The caller is responsible for unmapping
> + * the returned region.
> + */
> +void *mmap_aligned(size_t size, size_t align, size_t offset);

nit: Perhaps we should name this mmap_reserve()? The current name
implies something is being mmap'ed.

> +
>  #endif /* SELFTESTS_VFIO_LIB_INCLUDE_LIBVFIO_H */
> diff --git a/tools/testing/selftests/vfio/lib/libvfio.c b/tools/testing/selftests/vfio/lib/libvfio.c
> index a23a3cc5be69..4529bb1e69d1 100644
> --- a/tools/testing/selftests/vfio/lib/libvfio.c
> +++ b/tools/testing/selftests/vfio/lib/libvfio.c
> @@ -2,6 +2,9 @@
>  
>  #include <stdio.h>
>  #include <stdlib.h>
> +#include <sys/mman.h>
> +
> +#include <linux/align.h>
>  
>  #include "../../../kselftest.h"
>  #include <libvfio.h>
> @@ -76,3 +79,25 @@ const char *vfio_selftests_get_bdf(int *argc, char *argv[])
>  
>  	return vfio_selftests_get_bdfs(argc, argv, &nr_bdfs)[0];
>  }
> +
> +void *mmap_aligned(size_t size, size_t align, size_t offset)
> +{
> +	void *map_base, *map_align;
> +	size_t delta;
> +
> +	VFIO_ASSERT_GT(align, offset);
> +	delta = align - offset;
> +
> +	map_base = mmap(NULL, size + align, PROT_NONE,
> +			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +	VFIO_ASSERT_NE(map_base, MAP_FAILED);
> +
> +	map_align = (void *)(ALIGN((uintptr_t)map_base + delta, align) - delta);
> +
> +	if (map_align > map_base)
> +		VFIO_ASSERT_EQ(munmap(map_base, map_align - map_base), 0);
> +
> +	VFIO_ASSERT_EQ(munmap(map_align + size, map_base + align - map_align), 0);
> +
> +	return map_align;
> +}
> diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> index 13fdb4b0b10f..03f35011b5f7 100644
> --- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> +++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> @@ -12,10 +12,14 @@
>  #include <sys/mman.h>
>  
>  #include <uapi/linux/types.h>
> +#include <linux/align.h>
>  #include <linux/iommufd.h>
> +#include <linux/kernel.h>
>  #include <linux/limits.h>
> +#include <linux/log2.h>
>  #include <linux/mman.h>
>  #include <linux/overflow.h>
> +#include <linux/sizes.h>
>  #include <linux/types.h>
>  #include <linux/vfio.h>
>  
> @@ -124,20 +128,38 @@ static void vfio_pci_region_get(struct vfio_pci_device *device, int index,
>  static void vfio_pci_bar_map(struct vfio_pci_device *device, int index)
>  {
>  	struct vfio_pci_bar *bar = &device->bars[index];
> +	size_t align, size;
> +	void *vaddr;
>  	int prot = 0;

uber-nit: Put vaddr after prot to preserve the reverse-fir-tree ordering
of variables.

Here's the tip tree documentation:

  https://docs.kernel.org/process/maintainer-tip.html#variable-declarations

I should probably document somewhere that this is preferred in VFIO
selftests as well.

>  
>  	VFIO_ASSERT_LT(index, PCI_STD_NUM_BARS);
>  	VFIO_ASSERT_NULL(bar->vaddr);
>  	VFIO_ASSERT_TRUE(bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP);
> +	VFIO_ASSERT_TRUE(is_power_of_2(bar->info.size));
>  
>  	if (bar->info.flags & VFIO_REGION_INFO_FLAG_READ)
>  		prot |= PROT_READ;
>  	if (bar->info.flags & VFIO_REGION_INFO_FLAG_WRITE)
>  		prot |= PROT_WRITE;
>  
> -	bar->vaddr = mmap(NULL, bar->info.size, prot, MAP_FILE | MAP_SHARED,
> +	size = bar->info.size;
> +
> +	/*
> +	 * Align BAR mmaps to improve page fault granularity during potential
> +	 * subsequent IOMMU mapping of these BAR vaddr. 1G for x86 is the
> +	 * largest hugepage size across any architecture, so no benefit from
> +	 * larger alignment. BARs smaller than 1G will be aligned by their
> +	 * power-of-two size, guaranteeing sufficient alignment for smaller
> +	 * hugepages, if present.
> +	 */
> +	align = min_t(size_t, size, SZ_1G);
> +
> +	vaddr = mmap_aligned(size, align, 0);
> +	bar->vaddr = mmap(vaddr, size, prot, MAP_SHARED | MAP_FIXED,
>  			  device->fd, bar->info.offset);
>  	VFIO_ASSERT_NE(bar->vaddr, MAP_FAILED);
> +
> +	madvise(bar->vaddr, size, MADV_HUGEPAGE);
>  }
>  
>  static void vfio_pci_bar_unmap(struct vfio_pci_device *device, int index)
> 
> -- 
> 2.47.3
> 


* Re: [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test
  2026-01-13 23:08 ` [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test Alex Mastro
@ 2026-01-14 17:52   ` David Matlack
  2026-01-14 18:45     ` Alex Mastro
  0 siblings, 1 reply; 8+ messages in thread
From: David Matlack @ 2026-01-14 17:52 UTC
  To: Alex Mastro
  Cc: Alex Williamson, Shuah Khan, Peter Xu, linux-kernel, kvm,
	linux-kselftest, Jason Gunthorpe

On 2026-01-13 03:08 PM, Alex Mastro wrote:

> +FIXTURE_SETUP(vfio_dma_mapping_mmio_test)
> +{
> +	self->iommu = iommu_init(variant->iommu_mode);
> +	self->device = vfio_pci_device_init(device_bdf, self->iommu);
> +	self->iova_allocator = iova_allocator_init(self->iommu);
> +	self->bar = largest_mapped_bar(self->device);
> +
> +	if (!self->bar)
> +		SKIP(return, "No mappable BAR found on device %s", device_bdf);
> +
> +	if (self->bar->info.size < 2 * getpagesize())
> +		SKIP(return, "BAR too small (size=0x%llx)", self->bar->info.size);

It seems like the selftest should only skip map_partial_bar if the BAR
is less than 2 pages. map_full_bar would still be a valid test to run.
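
One possible shape for that, as an untested sketch: keep only the
"no mappable BAR" skip in FIXTURE_SETUP() and move the size check into
map_partial_bar behind a small predicate. The helper name below is
hypothetical and the snippet is a standalone model, not harness code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate: a partial (single page) mapping is only a
 * distinct case from map_full_bar when the BAR spans at least two
 * pages. Dividing instead of multiplying avoids overflow for huge
 * page_size values.
 */
bool bar_supports_partial_map(size_t bar_size, size_t page_size)
{
	assert(page_size != 0);

	return bar_size / page_size >= 2;
}
```

map_partial_bar would then call SKIP() when the predicate is false,
while map_full_bar runs regardless of BAR size.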


* Re: [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
  2026-01-14 17:30   ` David Matlack
@ 2026-01-14 18:44     ` Alex Mastro
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Mastro @ 2026-01-14 18:44 UTC
  To: David Matlack
  Cc: Alex Williamson, Shuah Khan, Peter Xu, linux-kernel, kvm,
	linux-kselftest, Jason Gunthorpe

On Wed, Jan 14, 2026 at 05:30:04PM +0000, David Matlack wrote:
> On 2026-01-13 03:08 PM, Alex Mastro wrote:
> > Update vfio_pci_bar_map() to align BAR mmaps for efficient huge page
> > mappings. The manual mmap alignment can be removed once mmap(!MAP_FIXED)
> > on vfio device fds improves to automatically return well-aligned
> > addresses.
> 
> Please also mention that you added MADV_HUGEPAGE and why, and that you
> dropped MAP_FILE (just mention that it was unnecessary in the first
> place).

Ack

> 
> > 
> > Signed-off-by: Alex Mastro <amastro@fb.com>
> > ---
> >  tools/testing/selftests/vfio/lib/include/libvfio.h |  9 ++++++++
> >  tools/testing/selftests/vfio/lib/libvfio.c         | 25 ++++++++++++++++++++++
> >  tools/testing/selftests/vfio/lib/vfio_pci_device.c | 24 ++++++++++++++++++++-
> >  3 files changed, 57 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/testing/selftests/vfio/lib/include/libvfio.h b/tools/testing/selftests/vfio/lib/include/libvfio.h
> > index 279ddcd70194..5ebf8503586e 100644
> > --- a/tools/testing/selftests/vfio/lib/include/libvfio.h
> > +++ b/tools/testing/selftests/vfio/lib/include/libvfio.h
> > @@ -23,4 +23,13 @@
> >  const char *vfio_selftests_get_bdf(int *argc, char *argv[]);
> >  char **vfio_selftests_get_bdfs(int *argc, char *argv[], int *nr_bdfs);
> >  
> > +/*
> > + * Reserve virtual address space of size at an address satisfying
> > + * (vaddr % align) == offset.
> > + *
> > + * Returns the reserved vaddr. The caller is responsible for unmapping
> > + * the returned region.
> > + */
> > +void *mmap_aligned(size_t size, size_t align, size_t offset);
> 
> nit: Perhaps we should name this mmap_reserve()? The current name
> implies something is being mmap'ed.

SGTM

> 
> > +
> >  #endif /* SELFTESTS_VFIO_LIB_INCLUDE_LIBVFIO_H */
> > diff --git a/tools/testing/selftests/vfio/lib/libvfio.c b/tools/testing/selftests/vfio/lib/libvfio.c
> > index a23a3cc5be69..4529bb1e69d1 100644
> > --- a/tools/testing/selftests/vfio/lib/libvfio.c
> > +++ b/tools/testing/selftests/vfio/lib/libvfio.c
> > @@ -2,6 +2,9 @@
> >  
> >  #include <stdio.h>
> >  #include <stdlib.h>
> > +#include <sys/mman.h>
> > +
> > +#include <linux/align.h>
> >  
> >  #include "../../../kselftest.h"
> >  #include <libvfio.h>
> > @@ -76,3 +79,25 @@ const char *vfio_selftests_get_bdf(int *argc, char *argv[])
> >  
> >  	return vfio_selftests_get_bdfs(argc, argv, &nr_bdfs)[0];
> >  }
> > +
> > +void *mmap_aligned(size_t size, size_t align, size_t offset)
> > +{
> > +	void *map_base, *map_align;
> > +	size_t delta;
> > +
> > +	VFIO_ASSERT_GT(align, offset);
> > +	delta = align - offset;
> > +
> > +	map_base = mmap(NULL, size + align, PROT_NONE,
> > +			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > +	VFIO_ASSERT_NE(map_base, MAP_FAILED);
> > +
> > +	map_align = (void *)(ALIGN((uintptr_t)map_base + delta, align) - delta);
> > +
> > +	if (map_align > map_base)
> > +		VFIO_ASSERT_EQ(munmap(map_base, map_align - map_base), 0);
> > +
> > +	VFIO_ASSERT_EQ(munmap(map_align + size, map_base + align - map_align), 0);
> > +
> > +	return map_align;
> > +}
> > diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> > index 13fdb4b0b10f..03f35011b5f7 100644
> > --- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> > +++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> > @@ -12,10 +12,14 @@
> >  #include <sys/mman.h>
> >  
> >  #include <uapi/linux/types.h>
> > +#include <linux/align.h>
> >  #include <linux/iommufd.h>
> > +#include <linux/kernel.h>
> >  #include <linux/limits.h>
> > +#include <linux/log2.h>
> >  #include <linux/mman.h>
> >  #include <linux/overflow.h>
> > +#include <linux/sizes.h>
> >  #include <linux/types.h>
> >  #include <linux/vfio.h>
> >  
> > @@ -124,20 +128,38 @@ static void vfio_pci_region_get(struct vfio_pci_device *device, int index,
> >  static void vfio_pci_bar_map(struct vfio_pci_device *device, int index)
> >  {
> >  	struct vfio_pci_bar *bar = &device->bars[index];
> > +	size_t align, size;
> > +	void *vaddr;
> >  	int prot = 0;
> 
> uber-nit: Put vaddr after prot to preserve the reverse-fir-tree ordering
> of variables.
> 
> Here's the tip tree documentation:
> 
>   https://docs.kernel.org/process/maintainer-tip.html#variable-declarations
> 
> I should probably document somewhere that this is preferred in VFIO
> selftests as well.

Ah, thanks. I usually try to do this but missed it here.

> 
> >  
> >  	VFIO_ASSERT_LT(index, PCI_STD_NUM_BARS);
> >  	VFIO_ASSERT_NULL(bar->vaddr);
> >  	VFIO_ASSERT_TRUE(bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP);
> > +	VFIO_ASSERT_TRUE(is_power_of_2(bar->info.size));
> >  
> >  	if (bar->info.flags & VFIO_REGION_INFO_FLAG_READ)
> >  		prot |= PROT_READ;
> >  	if (bar->info.flags & VFIO_REGION_INFO_FLAG_WRITE)
> >  		prot |= PROT_WRITE;
> >  
> > -	bar->vaddr = mmap(NULL, bar->info.size, prot, MAP_FILE | MAP_SHARED,
> > +	size = bar->info.size;
> > +
> > +	/*
> > +	 * Align BAR mmaps to improve page fault granularity during potential
> > +	 * subsequent IOMMU mapping of these BAR vaddr. 1G for x86 is the
> > +	 * largest hugepage size across any architecture, so no benefit from
> > +	 * larger alignment. BARs smaller than 1G will be aligned by their
> > +	 * power-of-two size, guaranteeing sufficient alignment for smaller
> > +	 * hugepages, if present.
> > +	 */
> > +	align = min_t(size_t, size, SZ_1G);
> > +
> > +	vaddr = mmap_aligned(size, align, 0);
> > +	bar->vaddr = mmap(vaddr, size, prot, MAP_SHARED | MAP_FIXED,
> >  			  device->fd, bar->info.offset);
> >  	VFIO_ASSERT_NE(bar->vaddr, MAP_FAILED);
> > +
> > +	madvise(bar->vaddr, size, MADV_HUGEPAGE);
> >  }
> >  
> >  static void vfio_pci_bar_unmap(struct vfio_pci_device *device, int index)
> > 
> > -- 
> > 2.47.3
> > 


* Re: [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test
  2026-01-14 17:52   ` David Matlack
@ 2026-01-14 18:45     ` Alex Mastro
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Mastro @ 2026-01-14 18:45 UTC
  To: David Matlack
  Cc: Alex Williamson, Shuah Khan, Peter Xu, linux-kernel, kvm,
	linux-kselftest, Jason Gunthorpe

On Wed, Jan 14, 2026 at 05:52:15PM +0000, David Matlack wrote:
> On 2026-01-13 03:08 PM, Alex Mastro wrote:
> 
> > +FIXTURE_SETUP(vfio_dma_mapping_mmio_test)
> > +{
> > +	self->iommu = iommu_init(variant->iommu_mode);
> > +	self->device = vfio_pci_device_init(device_bdf, self->iommu);
> > +	self->iova_allocator = iova_allocator_init(self->iommu);
> > +	self->bar = largest_mapped_bar(self->device);
> > +
> > +	if (!self->bar)
> > +		SKIP(return, "No mappable BAR found on device %s", device_bdf);
> > +
> > +	if (self->bar->info.size < 2 * getpagesize())
> > +		SKIP(return, "BAR too small (size=0x%llx)", self->bar->info.size);
> 
> It seems like the selftest should only skip map_partial_bar if the BAR
> is less than 2 pages. map_full_bar would still be a valid test to run.

True. This was me being lax.

