From: David Matlack <dmatlack@google.com>
To: Alex Mastro <amastro@fb.com>
Cc: Alex Williamson <alex@shazbot.org>, Shuah Khan <shuah@kernel.org>,
Peter Xu <peterx@redhat.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
linux-kselftest@vger.kernel.org, Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping
Date: Wed, 14 Jan 2026 17:30:04 +0000 [thread overview]
Message-ID: <aWfSnNIF-3xDQr4A@google.com> (raw)
In-Reply-To: <20260113-map-mmio-test-v2-2-e6d34f09c0bb@fb.com>
On 2026-01-13 03:08 PM, Alex Mastro wrote:
> Update vfio_pci_bar_map() to align BAR mmaps for efficient huge page
> mappings. The manual mmap alignment can be removed once mmap(!MAP_FIXED)
> on vfio device fds improves to automatically return well-aligned
> addresses.
Please also mention that you added MADV_HUGEPAGE and why, and that you
dropped MAP_FILE (just mention that it was unnecessary in the first
place).
>
> Signed-off-by: Alex Mastro <amastro@fb.com>
> ---
> tools/testing/selftests/vfio/lib/include/libvfio.h | 9 ++++++++
> tools/testing/selftests/vfio/lib/libvfio.c | 25 ++++++++++++++++++++++
> tools/testing/selftests/vfio/lib/vfio_pci_device.c | 24 ++++++++++++++++++++-
> 3 files changed, 57 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/vfio/lib/include/libvfio.h b/tools/testing/selftests/vfio/lib/include/libvfio.h
> index 279ddcd70194..5ebf8503586e 100644
> --- a/tools/testing/selftests/vfio/lib/include/libvfio.h
> +++ b/tools/testing/selftests/vfio/lib/include/libvfio.h
> @@ -23,4 +23,13 @@
> const char *vfio_selftests_get_bdf(int *argc, char *argv[]);
> char **vfio_selftests_get_bdfs(int *argc, char *argv[], int *nr_bdfs);
>
> +/*
> + * Reserve virtual address space of size at an address satisfying
> + * (vaddr % align) == offset.
> + *
> + * Returns the reserved vaddr. The caller is responsible for unmapping
> + * the returned region.
> + */
> +void *mmap_aligned(size_t size, size_t align, size_t offset);
nit: Perhaps we should name this mmap_reserve()? The current name
implies something is being mmap'ed.
> +
> #endif /* SELFTESTS_VFIO_LIB_INCLUDE_LIBVFIO_H */
> diff --git a/tools/testing/selftests/vfio/lib/libvfio.c b/tools/testing/selftests/vfio/lib/libvfio.c
> index a23a3cc5be69..4529bb1e69d1 100644
> --- a/tools/testing/selftests/vfio/lib/libvfio.c
> +++ b/tools/testing/selftests/vfio/lib/libvfio.c
> @@ -2,6 +2,9 @@
>
> #include <stdio.h>
> #include <stdlib.h>
> +#include <sys/mman.h>
> +
> +#include <linux/align.h>
>
> #include "../../../kselftest.h"
> #include <libvfio.h>
> @@ -76,3 +79,25 @@ const char *vfio_selftests_get_bdf(int *argc, char *argv[])
>
> return vfio_selftests_get_bdfs(argc, argv, &nr_bdfs)[0];
> }
> +
> +void *mmap_aligned(size_t size, size_t align, size_t offset)
> +{
> + void *map_base, *map_align;
> + size_t delta;
> +
> + VFIO_ASSERT_GT(align, offset);
> + delta = align - offset;
> +
> + map_base = mmap(NULL, size + align, PROT_NONE,
> + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> + VFIO_ASSERT_NE(map_base, MAP_FAILED);
> +
> + map_align = (void *)(ALIGN((uintptr_t)map_base + delta, align) - delta);
> +
> + if (map_align > map_base)
> + VFIO_ASSERT_EQ(munmap(map_base, map_align - map_base), 0);
> +
> + VFIO_ASSERT_EQ(munmap(map_align + size, map_base + align - map_align), 0);
> +
> + return map_align;
> +}
> diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> index 13fdb4b0b10f..03f35011b5f7 100644
> --- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> +++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
> @@ -12,10 +12,14 @@
> #include <sys/mman.h>
>
> #include <uapi/linux/types.h>
> +#include <linux/align.h>
> #include <linux/iommufd.h>
> +#include <linux/kernel.h>
> #include <linux/limits.h>
> +#include <linux/log2.h>
> #include <linux/mman.h>
> #include <linux/overflow.h>
> +#include <linux/sizes.h>
> #include <linux/types.h>
> #include <linux/vfio.h>
>
> @@ -124,20 +128,38 @@ static void vfio_pci_region_get(struct vfio_pci_device *device, int index,
> static void vfio_pci_bar_map(struct vfio_pci_device *device, int index)
> {
> struct vfio_pci_bar *bar = &device->bars[index];
> + size_t align, size;
> + void *vaddr;
> int prot = 0;
uber-nit: Put vaddr after prot to preserve the reverse-fir-tree ordering
of variables.
Here's the tip tree documentation:
https://docs.kernel.org/process/maintainer-tip.html#variable-declarations
I should probably document somewhere that this is preferred in VFIO
selftests as well.
>
> VFIO_ASSERT_LT(index, PCI_STD_NUM_BARS);
> VFIO_ASSERT_NULL(bar->vaddr);
> VFIO_ASSERT_TRUE(bar->info.flags & VFIO_REGION_INFO_FLAG_MMAP);
> + VFIO_ASSERT_TRUE(is_power_of_2(bar->info.size));
>
> if (bar->info.flags & VFIO_REGION_INFO_FLAG_READ)
> prot |= PROT_READ;
> if (bar->info.flags & VFIO_REGION_INFO_FLAG_WRITE)
> prot |= PROT_WRITE;
>
> - bar->vaddr = mmap(NULL, bar->info.size, prot, MAP_FILE | MAP_SHARED,
> + size = bar->info.size;
> +
> + /*
> + * Align BAR mmaps to improve page fault granularity during potential
> + * subsequent IOMMU mapping of these BAR vaddr. 1G for x86 is the
> + * largest hugepage size across any architecture, so no benefit from
> + * larger alignment. BARs smaller than 1G will be aligned by their
> + * power-of-two size, guaranteeing sufficient alignment for smaller
> + * hugepages, if present.
> + */
> + align = min_t(size_t, size, SZ_1G);
> +
> + vaddr = mmap_aligned(size, align, 0);
> + bar->vaddr = mmap(vaddr, size, prot, MAP_SHARED | MAP_FIXED,
> device->fd, bar->info.offset);
> VFIO_ASSERT_NE(bar->vaddr, MAP_FAILED);
> +
> + madvise(bar->vaddr, size, MADV_HUGEPAGE);
> }
>
> static void vfio_pci_bar_unmap(struct vfio_pci_device *device, int index)
>
> --
> 2.47.3
>
next prev parent reply other threads:[~2026-01-14 17:30 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-13 23:08 [PATCH v2 0/3] vfio: selftests: Add MMIO DMA mapping test Alex Mastro
2026-01-13 23:08 ` [PATCH v2 1/3] vfio: selftests: Centralize IOMMU mode name definitions Alex Mastro
2026-01-13 23:08 ` [PATCH v2 2/3] vfio: selftests: Align BAR mmaps for efficient IOMMU mapping Alex Mastro
2026-01-14 17:30 ` David Matlack [this message]
2026-01-14 18:44 ` Alex Mastro
2026-01-13 23:08 ` [PATCH v2 3/3] vfio: selftests: Add vfio_dma_mapping_mmio_test Alex Mastro
2026-01-14 17:52 ` David Matlack
2026-01-14 18:45 ` Alex Mastro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aWfSnNIF-3xDQr4A@google.com \
--to=dmatlack@google.com \
--cc=alex@shazbot.org \
--cc=amastro@fb.com \
--cc=jgg@ziepe.ca \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=peterx@redhat.com \
--cc=shuah@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.