* [PATCH 0/4] Introduce vfio_dma_mapping_perf_test
@ 2026-07-01 20:33 Aaron Lewis
2026-07-01 20:33 ` [PATCH 1/4] vfio: selftests: " Aaron Lewis
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Aaron Lewis @ 2026-07-01 20:33 UTC (permalink / raw)
To: kvm; +Cc: alex, dmatlack, jgg, Aaron Lewis
Add a new VFIO selftest, vfio_dma_mapping_perf_test, to provide a
configurable tool for measuring DMA mapping and unmapping latencies.
This selftest introduces command-line parameters to provide a convenient
way to tune the test for specific situations. For example, it can be used
to observe the lengthy unmap times of the Type 1 IOMMU on large memory
regions, or to compare DMA mapping performance between the Type 1 IOMMU
and IOMMUFD. These scenarios can now be easily configured and run via the
command line.
During development, I abandoned an approach where I initialized memfd
globally in main() for all tests. While that allowed memfd to be
initialized once for all test variants, it proved awkward because a
different memfd would be needed for each supported page size (e.g., 4K,
2M, 1G). Also, if a user is only running a single test variant,
pre-allocating all sizes is unnecessary overhead. Therefore, I dropped
that approach in favor of test-local allocation. However, I'm open to
ideas on a better way to handle this.
Aaron Lewis (4):
vfio: selftests: Introduce vfio_dma_mapping_perf_test
vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test
vfio: selftests: Allow a size for vfio_dma_mapping_perf_test
vfio: selftests: Allow the flag MAP_POPULATE to be set on the cmdline
tools/testing/selftests/vfio/Makefile | 1 +
.../vfio/lib/include/libvfio/iommu.h | 7 +
tools/testing/selftests/vfio/lib/iommu.c | 24 ++
.../vfio/vfio_dma_mapping_perf_test.c | 380 ++++++++++++++++++
4 files changed, 412 insertions(+)
create mode 100644 tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
--
2.55.0.rc0.799.gd6f94ed593-goog
* [PATCH 1/4] vfio: selftests: Introduce vfio_dma_mapping_perf_test
2026-07-01 20:33 [PATCH 0/4] Introduce vfio_dma_mapping_perf_test Aaron Lewis
@ 2026-07-01 20:33 ` Aaron Lewis
2026-07-01 20:44 ` sashiko-bot
2026-07-01 20:33 ` [PATCH 2/4] vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test Aaron Lewis
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Aaron Lewis @ 2026-07-01 20:33 UTC (permalink / raw)
To: kvm; +Cc: alex, dmatlack, jgg, Aaron Lewis
Introduce a tool to aid in latency testing rather than a traditional
pass/fail test. The goal is to make the test configurable to expose
latency issues and make them easy to reproduce and share.
Start with a basic test that performs a simple DMA map/unmap and reports
the duration of each operation. Report the following four metrics:
1. Time taken by mmap().
2. Time taken to map a DMA region.
3. Time taken to unmap a DMA region.
4. Time taken by munmap().
Subsequent commits in this series will introduce more features to allow
for a wider variety of testing.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
tools/testing/selftests/vfio/Makefile | 1 +
.../vfio/vfio_dma_mapping_perf_test.c | 124 ++++++++++++++++++
2 files changed, 125 insertions(+)
create mode 100644 tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
diff --git a/tools/testing/selftests/vfio/Makefile b/tools/testing/selftests/vfio/Makefile
index e6e8cb52ab03..9f620c5ef6ab 100644
--- a/tools/testing/selftests/vfio/Makefile
+++ b/tools/testing/selftests/vfio/Makefile
@@ -8,6 +8,7 @@ else
CFLAGS = $(KHDR_INCLUDES)
TEST_GEN_PROGS += vfio_dma_mapping_test
TEST_GEN_PROGS += vfio_dma_mapping_mmio_test
+TEST_GEN_PROGS += vfio_dma_mapping_perf_test
TEST_GEN_PROGS += vfio_iommufd_setup_test
TEST_GEN_PROGS += vfio_pci_device_test
TEST_GEN_PROGS += vfio_pci_device_init_perf_test
diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
new file mode 100644
index 000000000000..87d84450fc47
--- /dev/null
+++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <limits.h>
+#include <stdio.h>
+#include <sys/mman.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <linux/iommufd.h>
+#include <linux/limits.h>
+#include <linux/memfd.h>
+#include <linux/mman.h>
+#include <linux/sizes.h>
+#include <linux/time64.h>
+#include <linux/vfio.h>
+
+#include <libvfio.h>
+
+#include "kselftest_harness.h"
+
+static const char *device_bdf;
+
+struct iommu_mapping {
+ u64 pgd;
+ u64 p4d;
+ u64 pud;
+ u64 pmd;
+ u64 pte;
+};
+
+static void timer_start(struct timespec *start) {
+ clock_gettime(CLOCK_MONOTONIC, start);
+}
+
+static double timer_elapsed_ms(struct timespec start)
+{
+ struct timespec end;
+
+ clock_gettime(CLOCK_MONOTONIC, &end);
+
+ return (double)(end.tv_sec - start.tv_sec) * MSEC_PER_SEC +
+ (double)(end.tv_nsec - start.tv_nsec) / NSEC_PER_MSEC;
+}
+
+FIXTURE(vfio_dma_mapping_perf_test) {
+ struct iommu *iommu;
+ struct vfio_pci_device *device;
+ struct iova_allocator *iova_allocator;
+};
+
+FIXTURE_VARIANT(vfio_dma_mapping_perf_test) {
+ const char *iommu_mode;
+ int mmap_flags;
+};
+
+#define FIXTURE_VARIANT_ADD_IOMMU_MODE(_iommu_mode, _name, _mmap_flags) \
+FIXTURE_VARIANT_ADD(vfio_dma_mapping_perf_test, _iommu_mode ## _ ## _name) { \
+ .iommu_mode = #_iommu_mode, \
+ .mmap_flags = MAP_ANONYMOUS | MAP_PRIVATE | (_mmap_flags), \
+}
+
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous, 0);
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous_hugetlb_2mb, MAP_HUGETLB | MAP_HUGE_2MB);
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(anonymous_hugetlb_1gb, MAP_HUGETLB | MAP_HUGE_1GB);
+
+#undef FIXTURE_VARIANT_ADD_IOMMU_MODE
+
+FIXTURE_SETUP(vfio_dma_mapping_perf_test)
+{
+ self->iommu = iommu_init(variant->iommu_mode);
+ self->device = vfio_pci_device_init(device_bdf, self->iommu);
+ self->iova_allocator = iova_allocator_init(self->iommu);
+}
+
+FIXTURE_TEARDOWN(vfio_dma_mapping_perf_test)
+{
+ iova_allocator_cleanup(self->iova_allocator);
+ vfio_pci_device_cleanup(self->device);
+ iommu_cleanup(self->iommu);
+}
+
+TEST_F(vfio_dma_mapping_perf_test, dma_map_unmap)
+{
+ const u64 size = SZ_1G;
+ const int flags = variant->mmap_flags;
+ struct dma_region region;
+ struct timespec start;
+ u64 unmapped;
+ int rc;
+
+ timer_start(&start);
+ region.vaddr = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
+ printf("Completed mmap() in %.2lfms\n", timer_elapsed_ms(start));
+
+ /* Skip the test if there aren't enough HugeTLB pages available. */
+ if (flags & MAP_HUGETLB && region.vaddr == MAP_FAILED)
+ SKIP(return, "mmap() failed: %s (%d)\n", strerror(errno), errno);
+ else
+ ASSERT_NE(region.vaddr, MAP_FAILED);
+
+ region.iova = iova_allocator_alloc(self->iova_allocator, size);
+ region.size = size;
+
+ timer_start(&start);
+ iommu_map(self->iommu, &region);
+ printf("Mapped HVA %p (size %luG) at IOVA 0x%lx in %.2lfms\n",
+ region.vaddr, size / SZ_1G, region.iova, timer_elapsed_ms(start));
+ ASSERT_EQ(region.iova, to_iova(self->device, region.vaddr));
+
+ timer_start(&start);
+ rc = __iommu_unmap(self->iommu, &region, &unmapped);
+ printf("Unmapped IOVA 0x%lx in %.2lfms\n", region.iova, timer_elapsed_ms(start));
+ ASSERT_EQ(rc, 0);
+ ASSERT_EQ(unmapped, region.size);
+
+ timer_start(&start);
+ ASSERT_TRUE(!munmap(region.vaddr, size));
+ printf("Completed munmap() in %.2lfms\n", timer_elapsed_ms(start));
+}
+
+int main(int argc, char *argv[])
+{
+ device_bdf = vfio_selftests_get_bdf(&argc, argv);
+ return test_harness_run(argc, argv);
+}
--
2.55.0.rc0.799.gd6f94ed593-goog
* [PATCH 2/4] vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test
2026-07-01 20:33 [PATCH 0/4] Introduce vfio_dma_mapping_perf_test Aaron Lewis
2026-07-01 20:33 ` [PATCH 1/4] vfio: selftests: " Aaron Lewis
@ 2026-07-01 20:33 ` Aaron Lewis
2026-07-01 20:43 ` sashiko-bot
2026-07-01 20:33 ` [PATCH 3/4] vfio: selftests: Allow a size for vfio_dma_mapping_perf_test Aaron Lewis
2026-07-01 20:33 ` [PATCH 4/4] vfio: selftests: Allow the flag MAP_POPULATE to be set on the cmdline Aaron Lewis
3 siblings, 1 reply; 8+ messages in thread
From: Aaron Lewis @ 2026-07-01 20:33 UTC (permalink / raw)
To: kvm; +Cc: alex, dmatlack, jgg, Aaron Lewis
Add a second test to "vfio_dma_mapping_perf_test" to evaluate memfd
latencies. A key motivator for adding this test is to demonstrate that
the IOMMU_IOAS_MAP_FILE ioctl is significantly faster than other methods
of mapping DMA regions. While this performance difference is not fully
apparent with the test's current capabilities, it will become evident
as more features are introduced later in the series.
To support this, add IOMMU_IOAS_MAP_FILE ioctl support to the VFIO
selftest library via iommufd_map_file().
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
.../vfio/lib/include/libvfio/iommu.h | 7 ++
tools/testing/selftests/vfio/lib/iommu.c | 24 ++++
.../vfio/vfio_dma_mapping_perf_test.c | 118 ++++++++++++++++++
3 files changed, 149 insertions(+)
diff --git a/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h b/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h
index e9a3386a4719..eeef5990292c 100644
--- a/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h
+++ b/tools/testing/selftests/vfio/lib/include/libvfio/iommu.h
@@ -35,6 +35,13 @@ struct iommu {
struct iommu *iommu_init(const char *iommu_mode);
void iommu_cleanup(struct iommu *iommu);
+int __iommufd_map_file(struct iommu *iommu, struct dma_region *region, int fd);
+
+static inline void iommufd_map_file(struct iommu *iommu, struct dma_region *region, int fd)
+{
+ VFIO_ASSERT_EQ(__iommufd_map_file(iommu, region, fd), 0);
+}
+
int __iommu_map(struct iommu *iommu, struct dma_region *region);
static inline void iommu_map(struct iommu *iommu, struct dma_region *region)
diff --git a/tools/testing/selftests/vfio/lib/iommu.c b/tools/testing/selftests/vfio/lib/iommu.c
index 035dac069d60..6730b39060ca 100644
--- a/tools/testing/selftests/vfio/lib/iommu.c
+++ b/tools/testing/selftests/vfio/lib/iommu.c
@@ -97,6 +97,30 @@ iova_t iommu_hva2iova(struct iommu *iommu, void *vaddr)
return iova;
}
+int __iommufd_map_file(struct iommu *iommu, struct dma_region *region, int fd)
+{
+ VFIO_ASSERT_TRUE(iommu->iommufd, "IOMMU_IOAS_MAP_FILE is an IOMMUFD IOCTL.");
+
+ struct iommu_ioas_map_file args = {
+ .size = sizeof(args),
+ .flags = IOMMU_IOAS_MAP_READABLE |
+ IOMMU_IOAS_MAP_WRITEABLE |
+ IOMMU_IOAS_MAP_FIXED_IOVA,
+ .ioas_id = iommu->ioas_id,
+ .fd = fd,
+ .start = 0,
+ .iova = region->iova,
+ .length = region->size,
+ };
+
+ if (ioctl(iommu->iommufd, IOMMU_IOAS_MAP_FILE, &args))
+ return -errno;
+
+ list_add(&region->link, &iommu->dma_regions);
+
+ return 0;
+}
+
static int vfio_iommu_map(struct iommu *iommu, struct dma_region *region)
{
struct vfio_iommu_type1_dma_map args = {
diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
index 87d84450fc47..4639bda7ebaa 100644
--- a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
+++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
@@ -117,6 +117,124 @@ TEST_F(vfio_dma_mapping_perf_test, dma_map_unmap)
printf("Completed munmap() in %.2lfms\n", timer_elapsed_ms(start));
}
+FIXTURE(vfio_dma_mapping_perf_memfd_test) {
+ struct iommu *iommu;
+ struct vfio_pci_device *device;
+ struct iova_allocator *iova_allocator;
+};
+
+FIXTURE_VARIANT(vfio_dma_mapping_perf_memfd_test) {
+ const char *iommu_mode;
+ int mmap_flags;
+ int memfd_flags;
+};
+
+#define FIXTURE_VARIANT_ADD_IOMMU_MODE(_iommu_mode, _name, _mmap_flags, _memfd_flags) \
+FIXTURE_VARIANT_ADD(vfio_dma_mapping_perf_memfd_test, _iommu_mode ## _ ## _name) { \
+ .iommu_mode = #_iommu_mode, \
+ .mmap_flags = MAP_SHARED | (_mmap_flags), \
+ .memfd_flags = MAP_SHARED | (_memfd_flags), \
+}
+
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(memfd, 0, 0);
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(memfd_hugetlb_2mb,
+ MAP_HUGETLB | MAP_HUGE_2MB,
+ MFD_HUGETLB | MFD_HUGE_2MB);
+FIXTURE_VARIANT_ADD_ALL_IOMMU_MODES(memfd_hugetlb_1gb,
+ MAP_HUGETLB | MAP_HUGE_1GB,
+ MFD_HUGETLB | MFD_HUGE_1GB);
+
+#undef FIXTURE_VARIANT_ADD_IOMMU_MODE
+
+FIXTURE_SETUP(vfio_dma_mapping_perf_memfd_test)
+{
+ self->iommu = iommu_init(variant->iommu_mode);
+ self->device = vfio_pci_device_init(device_bdf, self->iommu);
+ self->iova_allocator = iova_allocator_init(self->iommu);
+}
+
+FIXTURE_TEARDOWN(vfio_dma_mapping_perf_memfd_test)
+{
+ iova_allocator_cleanup(self->iova_allocator);
+ vfio_pci_device_cleanup(self->device);
+ iommu_cleanup(self->iommu);
+}
+
+static void *setup_memfd(int *fd, u64 size, int mmap_flags, int mfd_flags)
+{
+ void *buf = MAP_FAILED;
+ struct timespec start;
+
+ timer_start(&start);
+ *fd = memfd_create("vfio_dma_mapping_perf_memfd_test", mfd_flags);
+ printf("Completed memfd_create() in %.2lfms\n", timer_elapsed_ms(start));
+ if (*fd <= 0)
+ return MAP_FAILED;
+
+ if (ftruncate(*fd, size))
+ goto out;
+
+ timer_start(&start);
+ buf = mmap(NULL, size, PROT_READ | PROT_WRITE, mmap_flags, *fd, 0);
+ printf("Completed mmap() for memfd in %.2lfms\n", timer_elapsed_ms(start));
+
+out:
+ if (buf == MAP_FAILED)
+ close(*fd);
+
+ return buf;
+}
+
+static void teardown_memfd(int fd, u64 size, void *vaddr)
+{
+ struct timespec start;
+
+ if (vaddr != MAP_FAILED) {
+ timer_start(&start);
+ munmap(vaddr, size);
+ printf("Completed munmap() in %.2lfms\n", timer_elapsed_ms(start));
+ }
+
+ if (fd != -1) {
+ timer_start(&start);
+ close(fd);
+ printf("Completed close() in %.2lfms\n", timer_elapsed_ms(start));
+ }
+}
+
+TEST_F(vfio_dma_mapping_perf_memfd_test, dma_map_unmap_from_file)
+{
+ const u64 size = SZ_1G;
+ struct dma_region region;
+ struct timespec start;
+ u64 unmapped;
+ int rc, fd;
+
+ region.vaddr = setup_memfd(&fd, size, variant->mmap_flags, variant->memfd_flags);
+ ASSERT_NE(region.vaddr, MAP_FAILED);
+
+ region.iova = iova_allocator_alloc(self->iova_allocator, size);
+ region.size = size;
+
+ timer_start(&start);
+ if (strcmp(variant->iommu_mode, MODE_IOMMUFD) == 0) {
+ iommufd_map_file(self->iommu, &region, fd);
+ } else {
+ iommu_map(self->iommu, &region);
+ }
+ printf("Mapped HVA %p (size %luG) at IOVA 0x%lx in %.2lfms\n",
+ region.vaddr, size / SZ_1G, region.iova, timer_elapsed_ms(start));
+ ASSERT_EQ(region.iova, to_iova(self->device, region.vaddr));
+
+ timer_start(&start);
+ rc = __iommu_unmap(self->iommu, &region, &unmapped);
+ printf("Unmapped IOVA 0x%lx in %.2lfms\n", region.iova, timer_elapsed_ms(start));
+ ASSERT_EQ(rc, 0);
+ ASSERT_EQ(unmapped, region.size);
+
+ teardown_memfd(fd, size, region.vaddr);
+}
+
int main(int argc, char *argv[])
{
device_bdf = vfio_selftests_get_bdf(&argc, argv);
--
2.55.0.rc0.799.gd6f94ed593-goog
* [PATCH 3/4] vfio: selftests: Allow a size for vfio_dma_mapping_perf_test
2026-07-01 20:33 [PATCH 0/4] Introduce vfio_dma_mapping_perf_test Aaron Lewis
2026-07-01 20:33 ` [PATCH 1/4] vfio: selftests: " Aaron Lewis
2026-07-01 20:33 ` [PATCH 2/4] vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test Aaron Lewis
@ 2026-07-01 20:33 ` Aaron Lewis
2026-07-01 20:44 ` sashiko-bot
2026-07-01 20:33 ` [PATCH 4/4] vfio: selftests: Allow the flag MAP_POPULATE to be set on the cmdline Aaron Lewis
3 siblings, 1 reply; 8+ messages in thread
From: Aaron Lewis @ 2026-07-01 20:33 UTC (permalink / raw)
To: kvm; +Cc: alex, dmatlack, jgg, Aaron Lewis
Allow the user to specify a DMA region size via the command line for
vfio_dma_mapping_perf_test.
Because the selftest harness also parses command-line parameters, sharing
them directly is problematic. Adding options directly to the test could
create conflicts with harness-defined options. Even without conflicts, the
harness would need to be updated to recognize test-specific options to avoid
failing on unknown parameters.
Resolve this by isolating the two sets of parameters. The standard command-line
options are consumed by the test itself. To pass options through to the test
harness, introduce a new '-a' option.
For example, both the test size and the test harness options can be set
like this:
./vfio_dma_mapping_perf_test -b 16G -a "-v vfio_type1_iommu_memfd_hugetlb_1gb"
This invocation configures a 16G DMA region and restricts execution to the
specified test variant, which is useful when debugging DMA mapping latency
issues for a specific IOMMU type.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
.../vfio/vfio_dma_mapping_perf_test.c | 132 +++++++++++++++++-
1 file changed, 129 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
index 4639bda7ebaa..6c025e9c4420 100644
--- a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
+++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
@@ -4,6 +4,7 @@
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>
+#include <wordexp.h>
#include <linux/iommufd.h>
#include <linux/limits.h>
@@ -19,6 +20,12 @@
static const char *device_bdf;
+struct test_params {
+ u64 size;
+};
+
+struct test_params test_params;
+
struct iommu_mapping {
u64 pgd;
u64 p4d;
@@ -80,7 +87,7 @@ FIXTURE_TEARDOWN(vfio_dma_mapping_perf_test)
TEST_F(vfio_dma_mapping_perf_test, dma_map_unmap)
{
- const u64 size = SZ_1G;
+ const u64 size = test_params.size;
const int flags = variant->mmap_flags;
struct dma_region region;
struct timespec start;
@@ -204,7 +211,7 @@ static void teardown_memfd(int fd, u64 size, void *vaddr)
TEST_F(vfio_dma_mapping_perf_memfd_test, dma_map_unmap_from_file)
{
- const u64 size = SZ_1G;
+ const u64 size = test_params.size;
struct dma_region region;
struct timespec start;
u64 unmapped;
@@ -235,8 +242,127 @@ TEST_F(vfio_dma_mapping_perf_memfd_test, dma_map_unmap_from_file)
teardown_memfd(fd, size, region.vaddr);
}
+/*
+ * Parses "[0-9]+[kmgt]?".
+ */
+size_t parse_size(const char *size)
+{
+ size_t base;
+ char *scale;
+ int shift = 0;
+
+ VFIO_ASSERT_TRUE(size && isdigit(size[0]),
+ "Need at least one digit in '%s'.", size);
+
+ base = strtoull(size, &scale, 0);
+
+ VFIO_ASSERT_TRUE(base != ULLONG_MAX, "Overflow parsing size!");
+
+ switch (tolower(*scale)) {
+ case 't':
+ shift = 40;
+ break;
+ case 'g':
+ shift = 30;
+ break;
+ case 'm':
+ shift = 20;
+ break;
+ case 'k':
+ shift = 10;
+ break;
+ case 'b':
+ case '\0':
+ shift = 0;
+ break;
+ default:
+ VFIO_FAIL("Unknown size letter '%c'.", *scale);
+ }
+
+ VFIO_ASSERT_TRUE((base << shift) >> shift == base,
+ "Overflow scaling size!");
+
+ return base << shift;
+}
+
+static void help(char *name)
+{
+ puts("");
+ printf("usage: %s [-h] [-b bytes] [-a \"test harness args\"]\n", name);
+ puts("");
+ printf(" -h: Display this help message.\n"
+ " -b: Specify the size of the DMA region to be mapped\n"
+ " and unmapped. e.g. 16M or 8G, (default: 1G)\n"
+ " -a: Args that are forwarded to the test harness,\n"
+ " e.g. -a \"-t dma_map_unmap_from_file\"\n");
+}
+
+struct harness_args
+{
+ int argc;
+ char **argv;
+ wordexp_t exp;
+};
+
+static void populate_harness_args(struct harness_args *args, const char *argv_0,
+ const char *cmdlne)
+{
+ if (wordexp(argv_0, &args->exp, WRDE_NOCMD) == 0 &&
+ wordexp(cmdlne, &args->exp, WRDE_APPEND | WRDE_NOCMD) == 0) {
+ args->argc = args->exp.we_wordc;
+ args->argv = args->exp.we_wordv;
+ }
+}
+
+static void setup_test(struct harness_args *args, int argc, char *argv[])
+{
+ int opt;
+
+ test_params = (struct test_params) {
+ .size = SZ_1G,
+ };
+
+ while ((opt = getopt(argc, argv, "a:b:h")) != -1) {
+ switch (opt) {
+ case 'a':
+ populate_harness_args(args, argv[0], optarg);
+ break;
+ case 'b':
+ test_params.size = parse_size(optarg);
+ break;
+ case 'h':
+ default:
+ help(argv[0]);
+ goto out;
+ }
+ }
+
+out:
+ // Reset getopt() state to allow the test harness to use it.
+ optind = 1;
+}
+
+static void teardown_test(struct harness_args *args)
+{
+ if (args->argv) {
+ args->argc = 0;
+ args->argv = NULL;
+ wordfree(&args->exp);
+ }
+}
+
int main(int argc, char *argv[])
{
+ struct harness_args args = (struct harness_args) {
+ .argc = 0,
+ .argv = NULL,
+ };
+ int r;
+
+ setup_test(&args, argc, argv);
device_bdf = vfio_selftests_get_bdf(&argc, argv);
- return test_harness_run(argc, argv);
+ r = test_harness_run(args.argc, args.argv);
+ teardown_test(&args);
+
+ return r;
}
--
2.55.0.rc0.799.gd6f94ed593-goog
* [PATCH 4/4] vfio: selftests: Allow the flag MAP_POPULATE to be set on the cmdline
2026-07-01 20:33 [PATCH 0/4] Introduce vfio_dma_mapping_perf_test Aaron Lewis
` (2 preceding siblings ...)
2026-07-01 20:33 ` [PATCH 3/4] vfio: selftests: Allow a size for vfio_dma_mapping_perf_test Aaron Lewis
@ 2026-07-01 20:33 ` Aaron Lewis
3 siblings, 0 replies; 8+ messages in thread
From: Aaron Lewis @ 2026-07-01 20:33 UTC (permalink / raw)
To: kvm; +Cc: alex, dmatlack, jgg, Aaron Lewis
Add a command-line option to vfio_dma_mapping_perf_test to set the
MAP_POPULATE flag during mmap(). This ensures consistent and accurate DMA
mapping latency measurements by forcing page prefaulting to occur during
the mmap() phase, rather than during the DMA mapping ioctl.
For example, mapping a 32G region without the -p (MAP_POPULATE) flag
results in:
$ ./vfio_dma_mapping_perf_test -b 32G -a "-v vfio_type1_iommu_memfd_hugetlb_1gb"
Completed mmap() for memfd in 0.00ms
Mapped HVA 0x7f1840000000 (size 32G) at IOVA 0x800000000 in 2887.56ms
Here, the mmap() call completes instantly, but the ioctl to map the DMA
region incurs the ~3s cost of prefaulting the pages.
Running the same test with the -p flag results in:
$ ./vfio_dma_mapping_perf_test -p -b 32G -a "-v vfio_type1_iommu_memfd_hugetlb_1gb"
Completed mmap() for memfd in 2887.41ms
Mapped HVA 0x7f8000000000 (size 32G) at IOVA 0x800000000 in 11.51ms
By shifting the prefaulting overhead to mmap(), the actual DMA mapping
latency is isolated and accurately reported as 11.51ms.
With user-defined mapping sizes and prefaulting in place, this test now
clearly demonstrates the difference in mapping speeds between VFIO's Type 1
IOMMU and IOMMUFD (i.e., IOMMU_IOAS_MAP_FILE).
Running the same configuration with the "iommufd_memfd_hugetlb_1gb" variant
highlights this difference:
$ ./vfio_dma_mapping_perf_test -p -b 32G -a "-v iommufd_memfd_hugetlb_1gb"
Mapped HVA 0x7f4fc0000000 (size 32G) at IOVA 0x800000000 in 0.11ms
The DMA mapping time drops dramatically from 11.51ms to 0.11ms.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
.../vfio/vfio_dma_mapping_perf_test.c | 20 +++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
index 6c025e9c4420..8032f51c1a3e 100644
--- a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
+++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
@@ -22,6 +22,7 @@ static const char *device_bdf;
struct test_params {
u64 size;
+ int mmap_flags;
};
struct test_params test_params;
@@ -88,7 +89,7 @@ FIXTURE_TEARDOWN(vfio_dma_mapping_perf_test)
TEST_F(vfio_dma_mapping_perf_test, dma_map_unmap)
{
const u64 size = test_params.size;
- const int flags = variant->mmap_flags;
+ const int flags = variant->mmap_flags | test_params.mmap_flags;
struct dma_region region;
struct timespec start;
u64 unmapped;
@@ -217,7 +218,9 @@ TEST_F(vfio_dma_mapping_perf_memfd_test, dma_map_unmap_from_file)
u64 unmapped;
int rc, fd;
- region.vaddr = setup_memfd(&fd, size, variant->mmap_flags, variant->memfd_flags);
+ region.vaddr = setup_memfd(&fd, size,
+ variant->mmap_flags | test_params.mmap_flags,
+ variant->memfd_flags);
ASSERT_NE(region.vaddr, MAP_FAILED);
region.iova = iova_allocator_alloc(self->iova_allocator, size);
@@ -288,11 +291,16 @@ size_t parse_size(const char *size)
static void help(char *name)
{
puts("");
- printf("usage: %s [-h] [-b bytes] [-a \"test harness args\"]\n", name);
+ printf("usage: %s [-h|-p] [-b bytes] [-a \"test harness args\"]\n", name);
puts("");
printf(" -h: Display this help message.\n"
" -b: Specify the size of the DMA region to be mapped\n"
" and unmapped. e.g. 16M or 8G, (default: 1G)\n"
+ " -p: Append 'MAP_POPULATE' to the mmap() flags to avoid\n"
+ " prefaulting while mapping DMA regions. Instead, any\n"
+ " and all prefaulting needed will happen during the\n"
+ " mmap() call. This will make mapping DMA regions\n"
+ " more consistent.\n"
" -a: Args that are forwarded to the test harness,\n"
" e.g. -a \"-t dma_map_unmap_from_file\"\n");
}
@@ -320,9 +328,10 @@ static void setup_test(struct harness_args *args, int argc, char *argv[])
test_params = (struct test_params) {
.size = SZ_1G,
+ .mmap_flags = 0,
};
- while ((opt = getopt(argc, argv, "a:b:h")) != -1) {
+ while ((opt = getopt(argc, argv, "a:b:ph")) != -1) {
switch (opt) {
case 'a':
populate_harness_args(args, argv[0], optarg);
@@ -330,6 +339,9 @@ static void setup_test(struct harness_args *args, int argc, char *argv[])
case 'b':
test_params.size = parse_size(optarg);
break;
+ case 'p':
+ test_params.mmap_flags = MAP_POPULATE;
+ break;
case 'h':
default:
help(argv[0]);
--
2.55.0.rc0.799.gd6f94ed593-goog
* Re: [PATCH 2/4] vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test
2026-07-01 20:33 ` [PATCH 2/4] vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test Aaron Lewis
@ 2026-07-01 20:43 ` sashiko-bot
0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-07-01 20:43 UTC (permalink / raw)
To: Aaron Lewis; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Medium] Missing SKIP fallback for hugepage allocation failures.
- [Low] MAP_SHARED (an mmap flag) is incorrectly passed to memfd_create() via memfd_flags.
- [Low] memfd_create() return value is incorrectly validated against <= 0, which rejects fd 0 and leaks it.
--
commit ffa43469ab562932001b32361e9549a362e44c3c
Author: Aaron Lewis <aaronlewis@google.com>
vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test
Add a second test to "vfio_dma_mapping_perf_test" to evaluate memfd
latencies. A key motivator for adding this test is to demonstrate that
the IOMMU_IOAS_MAP_FILE ioctl is significantly faster than other methods
of mapping DMA regions.
> diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
> index 87d84450fc47e..4639bda7ebaa3 100644
> --- a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
> +++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
[ ... ]
> +#define FIXTURE_VARIANT_ADD_IOMMU_MODE(_iommu_mode, _name, _mmap_flags, _memfd_flags) \
> +FIXTURE_VARIANT_ADD(vfio_dma_mapping_perf_memfd_test, _iommu_mode ## _ ## _name) { \
> + .iommu_mode = #_iommu_mode, \
> + .mmap_flags = MAP_SHARED | (_mmap_flags), \
> + .memfd_flags = MAP_SHARED | (_memfd_flags), \
[Severity: Low]
Does this incorrectly pass the mmap flag MAP_SHARED to memfd_create() via
memfd_flags?
MAP_SHARED (0x01) aliases with MFD_CLOEXEC (0x0001U) in the memfd flag
namespace, which silently sets close-on-exec rather than failing at runtime.
Could conflating the namespaces here cause confusion?
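To make the aliasing concrete, here is a small sketch. memfd_flags_request_cloexec() is a hypothetical helper, not something in the selftest library; it just reports whether a flags value would be read by memfd_create() as asking for close-on-exec. It assumes the uapi header <linux/memfd.h> is available for the MFD_* definitions.

```c
#include <linux/memfd.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of the patch): returns nonzero if `flags`,
 * when passed to memfd_create(), has the MFD_CLOEXEC bit set.
 */
int memfd_flags_request_cloexec(unsigned int flags)
{
	return (flags & MFD_CLOEXEC) != 0;
}
```

Passing MAP_SHARED to this helper returns nonzero, which is the silent close-on-exec behavior described above. Building .memfd_flags from MFD_* values only would keep the two namespaces apart.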
> +}
> +
[ ... ]
> +static void *setup_memfd(int *fd, u64 size, int mmap_flags, int mfd_flags)
> +{
> + void *buf = MAP_FAILED;
> + struct timespec start;
> +
> + timer_start(&start);
> + *fd = memfd_create("vfio_dma_mapping_perf_memfd_test", mfd_flags);
> + printf("Completed memfd_create() in %.2lfms\n", timer_elapsed_ms(start));
> + if (*fd <= 0)
> + return MAP_FAILED;
[Severity: Low]
Could this leak fd 0 if memfd_create() successfully returns it?
File descriptors can validly be 0 if the test is run with stdin closed.
Would it be better to check if (*fd < 0) instead so it doesn't reject fd 0?
> +
> + if (ftruncate(*fd, size))
> + goto out;
[ ... ]
> +TEST_F(vfio_dma_mapping_perf_memfd_test, dma_map_unmap_from_file)
> +{
> + const u64 size = SZ_1G;
> + struct dma_region region;
> + struct timespec start;
> + u64 unmapped;
> + int rc, fd;
> +
> + region.vaddr = setup_memfd(&fd, size, variant->mmap_flags, variant->memfd_flags);
> + ASSERT_NE(region.vaddr, MAP_FAILED);
[Severity: Medium]
Will this unconditional ASSERT_NE cause the test suite to fail on systems
that do not have sufficient hugepages configured (e.g., for MFD_HUGE_1GB or
MFD_HUGE_2MB) instead of gracefully skipping the unsupported variants?
> +
> + region.iova = iova_allocator_alloc(self->iova_allocator, size);
> + region.size = size;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701203311.326798-1-aaronlewis@google.com?part=2
* Re: [PATCH 1/4] vfio: selftests: Introduce vfio_dma_mapping_perf_test
2026-07-01 20:33 ` [PATCH 1/4] vfio: selftests: " Aaron Lewis
@ 2026-07-01 20:44 ` sashiko-bot
0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-07-01 20:44 UTC (permalink / raw)
To: Aaron Lewis; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] Format String Mismatch (Undefined Behavior) for 64-bit `u64` values.
- [Low] Unused struct `iommu_mapping` definition.
--
commit ed41e119412fd95d65a62de77183ae8dfc8f2449
Author: Aaron Lewis <aaronlewis@google.com>
vfio: selftests: Introduce vfio_dma_mapping_perf_test
Introduce a tool to aid in latency testing rather than a traditional
pass/fail test. Start with a basic test that performs a simple DMA
map/unmap and reports the duration of each operation.
> diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
> new file mode 100644
> index 0000000000000..87d84450fc47e
> --- /dev/null
> +++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: GPL-2.0-only
[ ... ]
> +static const char *device_bdf;
> +
> +struct iommu_mapping {
> + u64 pgd;
> + u64 p4d;
> + u64 pud;
> + u64 pmd;
> + u64 pte;
> +};
[Severity: Low]
This isn't a bug, but struct iommu_mapping is defined and never used in this
file. Is it needed, or is it left over from earlier iterations?
[ ... ]
> +TEST_F(vfio_dma_mapping_perf_test, dma_map_unmap)
> +{
> + const u64 size = SZ_1G;
> + const int flags = variant->mmap_flags;
> + struct dma_region region;
> + struct timespec start;
> + u64 unmapped;
> + int rc;
[ ... ]
> + timer_start(&start);
> + iommu_map(self->iommu, &region);
> + printf("Mapped HVA %p (size %luG) at IOVA 0x%lx in %.2lfms\n",
> + region.vaddr, size / SZ_1G, region.iova, timer_elapsed_ms(start));
[Severity: Medium]
Does this printf format string mismatch cause stack misalignment on 32-bit
platforms?
Because size / SZ_1G and region.iova are 64-bit u64 values, using %lu and %lx
instead of %llu and %llx can misalign the variadic arguments. On 32-bit
systems, %lu expects a 32-bit value, so printf will misread the 64-bit integer,
corrupting the subsequent double argument and producing garbage output for
timer_elapsed_ms().
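One width-independent option is the <inttypes.h> format macros. The sketch below assumes the values are uint64_t; whether the selftest's u64 is defined that way is an assumption here and should be checked against the tools headers. format_mapped_line() is a hypothetical helper used only to illustrate the format string.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): format a shortened version of
 * the test's "Mapped ..." line using PRIu64/PRIx64, so each 64-bit argument
 * is consumed with the correct width on both 32-bit and 64-bit builds.
 */
int format_mapped_line(char *buf, size_t len, uint64_t size, uint64_t iova,
		       double ms)
{
	return snprintf(buf, len,
			"size %" PRIu64 "G at IOVA 0x%" PRIx64 " in %.2fms",
			size >> 30, iova, ms);
}
```

The same pattern would apply to the "Unmapped IOVA" line further down.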
> + ASSERT_EQ(region.iova, to_iova(self->device, region.vaddr));
> +
> + timer_start(&start);
> + rc = __iommu_unmap(self->iommu, &region, &unmapped);
> + printf("Unmapped IOVA 0x%lx in %.2lfms\n", region.iova, timer_elapsed_ms(start));
[Severity: Medium]
Similarly here, does region.iova need %llx to avoid stack misalignment issues
with the subsequent double on 32-bit platforms?
> + ASSERT_EQ(rc, 0);
> + ASSERT_EQ(unmapped, region.size);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701203311.326798-1-aaronlewis@google.com?part=1
* Re: [PATCH 3/4] vfio: selftests: Allow a size for vfio_dma_mapping_perf_test
2026-07-01 20:33 ` [PATCH 3/4] vfio: selftests: Allow a size for vfio_dma_mapping_perf_test Aaron Lewis
@ 2026-07-01 20:44 ` sashiko-bot
0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-07-01 20:44 UTC (permalink / raw)
To: Aaron Lewis; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Medium] The -h option and invalid arguments fail to exit the program, causing unintended execution of all tests.
- [Medium] Multiple usages of the '-a' flag overwrite the wordexp structure without WRDE_REUSE, causing memory leaks and dropping previous arguments.
- [Medium] Using size_t for math and return type in parse_size() causes undefined behavior and silent truncation on 32-bit architectures.
--
commit d938d0d26cc3c95a10e15ae011fce57f55a320c9
Author: Aaron Lewis <aaronlewis@google.com>
vfio: selftests: Allow a size for vfio_dma_mapping_perf_test
Allow the user to specify a DMA region size via the command line for
vfio_dma_mapping_perf_test.
> diff --git a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
> index 4639bda7ebaa3..2d8fc3e8e0787 100644
> --- a/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
> +++ b/tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c
[ ... ]
> +size_t parse_size(const char *size)
> +{
> + size_t base;
> + char *scale;
> + int shift = 0;
[ ... ]
> + switch (tolower(*scale)) {
> + case 't':
> + shift = 40;
> + break;
[ ... ]
> + VFIO_ASSERT_TRUE((base << shift) >> shift == base,
> + "Overflow scaling size!");
> +
> + return base << shift;
> +}
[Severity: Medium]
On 32-bit architectures where size_t is 32 bits wide, will shifting by 40 when
parsing terabytes cause undefined behavior, since the shift count exceeds the
width of the type?
Because some CPUs mask over-sized shift counts (x86 uses only the low five
bits for 32-bit shifts), the overflow check here might evaluate as true and
silently pass while returning a drastically truncated size. Should this
function use u64 internally and as its return type to match test_params.size?
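A 64-bit variant could check for overflow before shifting, so the check itself
never depends on an out-of-range shift. This is a minimal sketch, not the
patch's implementation: parse_size_u64() is a hypothetical name, the k/m/g
suffixes are assumed (the quoted hunk only shows 't'), and it returns false
instead of asserting.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical variant of parse_size(): parses "<number>[k|m|g|t]" into
 * *out. Returns false on malformed input or overflow.
 */
static bool parse_size_u64(const char *str, uint64_t *out)
{
	unsigned long long base;
	char *scale;
	int shift = 0;

	errno = 0;
	base = strtoull(str, &scale, 0);
	if (scale == str || errno == ERANGE)
		return false;

	/* Allow at most one suffix character after the number. */
	if (scale[0] != '\0' && scale[1] != '\0')
		return false;

	switch (tolower((unsigned char)*scale)) {
	case 't': shift = 40; break;
	case 'g': shift = 30; break;
	case 'm': shift = 20; break;
	case 'k': shift = 10; break;
	case '\0': break;
	default: return false;
	}

	/* shift < 64, so UINT64_MAX >> shift is always well defined. */
	if (base > (UINT64_MAX >> shift))
		return false;

	*out = (uint64_t)base << shift;
	return true;
}
```

Comparing against UINT64_MAX >> shift rejects values that would not fit,
without ever performing the overflowing shift.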
[ ... ]
> +static void populate_harness_args(struct harness_args *args, const char *argv_0,
> + const char *cmdlne)
> +{
> + if (wordexp(argv_0, &args->exp, WRDE_NOCMD) == 0 &&
> + wordexp(cmdlne, &args->exp, WRDE_APPEND | WRDE_NOCMD) == 0) {
[Severity: Medium]
If a user passes multiple '-a' flags, does the first wordexp() call without
WRDE_REUSE or WRDE_APPEND leak the previously allocated memory in args->exp?
This also looks like it would discard any arguments appended by earlier '-a'
flags, and potentially leave args->argv pointing to leaked allocations if the
second wordexp() fails.
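One possible shape for handling repeated calls is to expand argv_0 only once
and append every later command line to the same wordexp_t. The sketch below
assumes accumulating arguments across '-a' flags is the desired behavior;
struct harness_args_sketch and populate_args_sketch() are hypothetical
stand-ins, not the patch's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wordexp.h>

/* Hypothetical stand-in for the patch's struct harness_args. */
struct harness_args_sketch {
	wordexp_t exp;
	bool valid;	/* true once exp holds a live expansion */
	int argc;
	char **argv;
};

static bool populate_args_sketch(struct harness_args_sketch *args,
				 const char *argv_0, const char *cmdline)
{
	/* Expand argv_0 only on the first call, so exp is never overwritten. */
	if (!args->valid) {
		if (wordexp(argv_0, &args->exp, WRDE_NOCMD) != 0)
			return false;
		args->valid = true;
	}

	/* Later calls only append; on failure argc/argv keep their old values. */
	if (wordexp(cmdline, &args->exp, WRDE_APPEND | WRDE_NOCMD) != 0)
		return false;

	args->argc = (int)args->exp.we_wordc;
	args->argv = args->exp.we_wordv;
	return true;
}
```

The caller would zero-initialize the struct and call wordfree(&args->exp) once
when finished. If "last '-a' wins" is preferred instead, calling wordfree()
before re-expanding would avoid the leak as well.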
> + args->argc = args->exp.we_wordc;
> + args->argv = args->exp.we_wordv;
> + }
> +}
> +
> +static void setup_test(struct harness_args *args, int argc, char *argv[])
> +{
> + int opt;
> +
> + test_params = (struct test_params) {
> + .size = SZ_1G,
> + };
> +
> + while ((opt = getopt(argc, argv, "a:b:h")) != -1) {
> + switch (opt) {
[ ... ]
> + case 'h':
> + default:
> + help(argv[0]);
> + goto out;
> + }
> + }
> +
> +out:
> + // Reset getopt() state to allow the test harness to use it.
> + optind = 1;
> +}
[Severity: Medium]
When the '-h' option or an invalid argument is provided, does jumping to 'out'
unintentionally allow the program to continue executing?
Because setup_test() returns normally to main() in this path, main() will
subsequently execute test_harness_run(), which runs the full test suite
immediately after printing the help message instead of terminating.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701203311.326798-1-aaronlewis@google.com?part=3
Thread overview: 8+ messages
2026-07-01 20:33 [PATCH 0/4] Introduce vfio_dma_mapping_perf_test Aaron Lewis
2026-07-01 20:33 ` [PATCH 1/4] vfio: selftests: " Aaron Lewis
2026-07-01 20:44 ` sashiko-bot
2026-07-01 20:33 ` [PATCH 2/4] vfio: selftests: Add memfd test to vfio_dma_mapping_perf_test Aaron Lewis
2026-07-01 20:43 ` sashiko-bot
2026-07-01 20:33 ` [PATCH 3/4] vfio: selftests: Allow a size for vfio_dma_mapping_perf_test Aaron Lewis
2026-07-01 20:44 ` sashiko-bot
2026-07-01 20:33 ` [PATCH 4/4] vfio: selftests: Allow the flag MAP_POPULATE to be set on the cmdline Aaron Lewis