* [RFC PATCH 0/2] vfio: Improve DMA mapping performance for huge pages
@ 2025-12-23 23:00 Aaron Lewis
From: Aaron Lewis @ 2025-12-23 23:00 UTC
  To: alex.williamson, jgg, dmatlack; +Cc: kvm, seanjc, Aaron Lewis

This RFC explores the current state of DMA mapping performance across
vfio, and proposes an implementation to improve the performance
for "vfio_type1_iommu" for huge pages.

To gather performance metrics, three IOMMU modes were measured:
vfio_type1_iommu, iommufd_compat_type1, and iommufd.  The measurements
were taken with the selftest "vfio_dma_mapping_perf_test" (included in
this series).
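
For readers without the series at hand, the following is a minimal
sketch of how the latency of a single mapping call could be timed.  It
is not the code of "vfio_dma_mapping_perf_test"; the names map_fn,
elapsed_ns, time_map_ns, and noop_map are hypothetical, and the
callback merely stands in for whatever mapping operation (for example,
a DMA map ioctl) is being measured.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: performs one mapping operation, 0 on success. */
typedef int (*map_fn)(void *ctx);

/* Nanoseconds between two CLOCK_MONOTONIC samples. */
int64_t elapsed_ns(struct timespec start, struct timespec end)
{
	return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
	       (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Time one call of fn(ctx); returns the latency in ns, or -1 on failure. */
int64_t time_map_ns(map_fn fn, void *ctx)
{
	struct timespec start, end;

	if (clock_gettime(CLOCK_MONOTONIC, &start))
		return -1;
	if (fn(ctx))
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &end))
		return -1;

	return elapsed_ns(start, end);
}

/* Placeholder operation (assumption): a real test would map memory here. */
int noop_map(void *ctx)
{
	(void)ctx;
	return 0;
}
```

Calling time_map_ns(noop_map, NULL) returns the overhead of the harness
itself; swapping in a callback that performs the real mapping would give
the per-call latency for a given IOMMU mode.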

These changes were developed on the branch "vfio/next" in the repo:
 - https://github.com/awilliam/linux-vfio

The optimization demonstrated in patch 1/2 shows a >300x speedup when
pinning gigantic pages in "vfio_type1_iommu".  More work will be needed to
improve iommufd's mapping performance for gigantic pages, but a
callstack showing the slow path is included in that patch to help drive
the conversation forward.

The IOMMU mode "iommufd_compat_type1" lags much further behind the other
two: as it stands, it is an order of magnitude slower.  If the intention
is for it to perform on par (or near par) with them, I can attach a
callstack in a follow-up to see whether there is any low-hanging fruit
to be had.

This is being sent as an RFC because, while there is a proposed solution
for "vfio_type1_iommu", there are no solutions for the other two IOMMU
modes.  Patch 1/2 includes a callstack showing where the latency issues
are for iommufd, but I haven't posted one for "iommufd_compat_type1".
I'm also not clear on what the intention is for "iommufd_compat_type1"
w.r.t. this issue, especially given that it is so much slower than the
others.

Aaron Lewis (2):
  vfio: Improve DMA mapping performance for huge pages
  vfio: selftest: Add vfio_dma_mapping_perf_test

 drivers/vfio/vfio_iommu_type1.c               |  37 ++-
 tools/testing/selftests/vfio/Makefile         |   1 +
 .../vfio/vfio_dma_mapping_perf_test.c         | 247 ++++++++++++++++++
 3 files changed, 277 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/vfio/vfio_dma_mapping_perf_test.c

-- 
2.52.0.351.gbe84eed79e-goog



Thread overview: 10+ messages
2025-12-23 23:00 [RFC PATCH 0/2] vfio: Improve DMA mapping performance for huge pages Aaron Lewis
2025-12-23 23:00 ` [RFC PATCH 1/2] " Aaron Lewis
2025-12-24  2:10   ` Jason Gunthorpe
2025-12-29 21:40     ` Aaron Lewis
2025-12-30  1:12       ` Jason Gunthorpe
2026-01-05 18:31         ` David Matlack
2026-01-05 19:01           ` Jason Gunthorpe
2026-01-05 19:36             ` David Matlack
2025-12-23 23:00 ` [RFC PATCH 2/2] vfio: selftest: Add vfio_dma_mapping_perf_test Aaron Lewis
2025-12-24  2:04 ` [RFC PATCH 0/2] vfio: Improve DMA mapping performance for huge pages Jason Gunthorpe
