linux-mm.kvack.org archive mirror
* [PATCH 0/5] vfio: Improve DMA mapping performance for huge pfnmaps
From: Alex Williamson @ 2025-02-05 23:17 UTC
  To: alex.williamson
  Cc: kvm, linux-kernel, peterx, mitchell.augustin, clg, akpm, linux-mm

As GPU BAR sizes increase, the cost of DMA mapping pfnmap ranges has
become a significant overhead for VMs making use of device assignment.
Not only does each mapping require upwards of a few seconds, but BARs
are mapped in and out of the VM address space multiple times during
guest boot.  Also factor in that multi-GPU configurations are
increasingly commonplace and BAR sizes are continuing to increase.
Configurations today can already be delayed minutes during guest boot.

We've taken steps to make Linux a better guest by batching PCI BAR
sizing operations[1], but it only provides an incremental improvement.

This series attempts to fully address the issue by leveraging the huge
pfnmap support added in v6.12.  When we insert pfnmaps using pud and pmd
mappings, we can later take advantage of the knowledge of the mapping
level page mask to iterate on the relevant mapping stride.  In the
commonly achieved optimal case, a 1GB pud mapping in place of 4KB
pages, this reduces pfn lookups by a factor of 256k.  For a local test
system, an overhead of ~1s for DMA mapping a 32GB PCI BAR is reduced to
sub-millisecond (8M page-sized operations reduced to 32 pud-sized
operations).
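
The arithmetic above can be checked with a small user-space sketch.  It
is not code from this series: the sketch_* names and the shift values
(4KB pages, 2MB pmd, 1GB pud, as on x86-64) are assumptions made for
illustration only.  It counts how many lookups are needed to walk a
range when each lookup reports a mapping-level page mask and the walk
advances to the end of that mapping.

```c
#include <stdint.h>

/* Assumed x86-64 style shifts; hypothetical names, not kernel macros. */
#define SKETCH_PAGE_SHIFT 12	/* 4KB base page */
#define SKETCH_PMD_SHIFT  21	/* 2MB pmd mapping */
#define SKETCH_PUD_SHIFT  30	/* 1GB pud mapping */

/* Mask selecting the base address of a mapping at the given level. */
uint64_t sketch_level_mask(unsigned int shift)
{
	return ~((UINT64_C(1) << shift) - 1);
}

/*
 * Count the lookups needed to cover [start, start + size) when every
 * lookup reports the same mapping-level mask.  Each lookup covers the
 * remainder of the mapping that contains the current address.
 */
uint64_t sketch_count_lookups(uint64_t start, uint64_t size, uint64_t pgmask)
{
	uint64_t end = start + size;
	uint64_t addr = start;
	uint64_t lookups = 0;

	while (addr < end) {
		/* First address past the mapping that contains addr. */
		uint64_t next = (addr & pgmask) + (~pgmask + 1);

		lookups++;
		/* Clamp to the end of the range, also guarding wraparound. */
		if (next <= addr || next > end)
			next = end;
		addr = next;
	}

	return lookups;
}
```

For a 32GB range starting at a 1GB-aligned address, calling
sketch_count_lookups() with sketch_level_mask(SKETCH_PAGE_SHIFT) and
then with sketch_level_mask(SKETCH_PUD_SHIFT) gives the 8M and 32
operation counts quoted above.  A range that does not start on a
mapping boundary costs one extra lookup for the partial leading
mapping.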

Please review, test, and provide feedback.  I hope that mm folks can
ack the trivial follow_pfnmap_args update to provide the mapping level
page mask.  Naming is hard, so any preference other than pgmask is
welcome.  Thanks,

Alex

[1]https://lore.kernel.org/all/20250120182202.1878581-1-alex.williamson@redhat.com/


Alex Williamson (5):
  vfio/type1: Catch zero from pin_user_pages_remote()
  vfio/type1: Convert all vaddr_get_pfns() callers to use vfio_batch
  vfio/type1: Use vfio_batch for vaddr_get_pfns()
  mm: Provide page mask in struct follow_pfnmap_args
  vfio/type1: Use mapping page mask for pfnmaps

 drivers/vfio/vfio_iommu_type1.c | 107 ++++++++++++++++++++------------
 include/linux/mm.h              |   2 +
 mm/memory.c                     |   1 +
 3 files changed, 72 insertions(+), 38 deletions(-)

-- 
2.47.1



Thread overview: 15+ messages
2025-02-05 23:17 [PATCH 0/5] vfio: Improve DMA mapping performance for huge pfnmaps Alex Williamson
2025-02-05 23:17 ` [PATCH 4/5] mm: Provide page mask in struct follow_pfnmap_args Alex Williamson
2025-02-07  1:38   ` Mitchell Augustin
2025-02-14 17:17   ` Alex Williamson
2025-02-14 21:39     ` David Hildenbrand
2025-02-17 21:56       ` Alex Williamson
2025-02-14 19:14   ` Jason Gunthorpe
2025-02-05 23:17 ` [PATCH 5/5] vfio/type1: Use mapping page mask for pfnmaps Alex Williamson
2025-02-07  1:39   ` Mitchell Augustin
2025-02-14 19:27   ` Jason Gunthorpe
2025-02-17 21:52     ` Alex Williamson
2025-02-14 19:46   ` Matthew Wilcox
2025-02-17 19:33     ` Alex Williamson
2025-02-06 19:14 ` [PATCH 0/5] vfio: Improve DMA mapping performance for huge pfnmaps Peter Xu
2025-02-07  1:39 ` Mitchell Augustin
