public inbox for dri-devel@lists.freedesktop.org
* [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap
@ 2026-02-19 20:10 Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 1/5] drm/pagemap: Add helper to access zone_device_data Matthew Brost
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:10 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

The dma-map IOVA alloc, link, and sync APIs perform significantly better
than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
This difference is especially noticeable when mapping a 2MB region in
4KB pages.
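
For reference, the basic usage pattern is roughly the following (a
minimal sketch against the dma_iova_* API in <linux/dma-mapping.h>;
dev, pages, npages, dir and the unwind path are illustrative):

struct dma_iova_state state = {};
size_t offset = 0;
int i, err;

/* One IOVA allocation covers the whole range... */
if (dma_iova_try_alloc(dev, &state, 0, npages * PAGE_SIZE)) {
	for (i = 0; i < npages; i++) {
		/* ...each page is linked without a per-page IOTLB flush... */
		err = dma_iova_link(dev, &state, page_to_phys(pages[i]),
				    offset, PAGE_SIZE, dir, 0);
		if (err)
			goto unwind;
		offset += PAGE_SIZE;
	}
	/* ...and a single sync flushes the IOMMU once for the range. */
	err = dma_iova_sync(dev, &state, 0, offset);
} else {
	/* No IOVA space: fall back to per-page dma_map_page(). */
}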

Use the dma-map IOVA alloc, link, and sync APIs for GPU SVM and DRM
pagemap, which create DMA mappings between the CPU and GPU.

Initial results are promising.

Baseline CPU time during 2M / 64K fault with a migration:
Average migrate 2M cpu time (us, percentage): 334.00, 0.611
Average migrate 64K cpu time (us, percentage): 18.63, 0.301

After this series, CPU time during 2M / 64K fault with a migration:
Average migrate 2M cpu time (us, percentage): 224.82, 0.514
Average migrate 64K cpu time (us, percentage): 14.66, 0.257

Matt

v2:
 - Include missing baseline patch for CI
v3:
 - Fix memory corruption
 - PoC IOVA alloc for multi-GPU
v4:
 - Pack IOVA / drop dummy pages
 - Drop multi-GPU IOVA alloc
v5:
 - Address Thomas's comments

Francois Dugast (1):
  drm/pagemap: Add helper to access zone_device_data

Matthew Brost (4):
  drm/gpusvm: Use dma-map IOVA alloc, link, and sync API in GPU SVM
  drm/pagemap: Drop source_peer_migrates flag and assume true
  drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system
  drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM
    pagemap

 drivers/gpu/drm/drm_gpusvm.c  |  62 +++++++--
 drivers/gpu/drm/drm_pagemap.c | 242 ++++++++++++++++++++++++----------
 drivers/gpu/drm/xe/xe_svm.c   |   1 -
 include/drm/drm_gpusvm.h      |   5 +
 include/drm/drm_pagemap.h     |  22 +++-
 5 files changed, 247 insertions(+), 85 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v5 1/5] drm/pagemap: Add helper to access zone_device_data
  2026-02-19 20:10 [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
@ 2026-02-19 20:10 ` Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 2/5] drm/gpusvm: Use dma-map IOVA alloc, link, and sync API in GPU SVM Matthew Brost
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:10 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

From: Francois Dugast <francois.dugast@intel.com>

This new helper ensures that all accesses to zone_device_data go
through the correct API, whether or not the page is part of a folio.
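
As an illustration, a direct field access such as

	struct drm_pagemap_zdd *zdd = page->zone_device_data;

becomes

	struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);

which goes through page_folio() and folio_zone_device_data(), so the
lookup also does the right thing for tail pages of a large folio.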

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
 drivers/gpu/drm/drm_gpusvm.c  |  7 +++++--
 drivers/gpu/drm/drm_pagemap.c | 21 ++++++++++++---------
 include/drm/drm_pagemap.h     | 14 ++++++++++++++
 3 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 81626b00b755..1490d1929b1a 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1488,12 +1488,15 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 		order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
 		if (is_device_private_page(page) ||
 		    is_device_coherent_page(page)) {
+			struct drm_pagemap_zdd *__zdd =
+				drm_pagemap_page_zone_device_data(page);
+
 			if (!ctx->allow_mixed &&
-			    zdd != page->zone_device_data && i > 0) {
+			    zdd != __zdd && i > 0) {
 				err = -EOPNOTSUPP;
 				goto err_unmap;
 			}
-			zdd = page->zone_device_data;
+			zdd = __zdd;
 			if (pagemap != page_pgmap(page)) {
 				if (pagemap) {
 					err = -EOPNOTSUPP;
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index f83a76f7f37c..01a06d1fd1a0 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -244,7 +244,7 @@ static int drm_pagemap_migrate_map_pages(struct device *dev,
 		order = folio_order(folio);
 
 		if (is_device_private_page(page)) {
-			struct drm_pagemap_zdd *zdd = page->zone_device_data;
+			struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
 			struct drm_pagemap *dpagemap = zdd->dpagemap;
 			struct drm_pagemap_addr addr;
 
@@ -315,7 +315,7 @@ static void drm_pagemap_migrate_unmap_pages(struct device *dev,
 			goto next;
 
 		if (is_zone_device_page(page)) {
-			struct drm_pagemap_zdd *zdd = page->zone_device_data;
+			struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
 			struct drm_pagemap *dpagemap = zdd->dpagemap;
 
 			dpagemap->ops->device_unmap(dpagemap, dev, &pagemap_addr[i]);
@@ -593,7 +593,8 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 
 		pages[i] = NULL;
 		if (src_page && is_device_private_page(src_page)) {
-			struct drm_pagemap_zdd *src_zdd = src_page->zone_device_data;
+			struct drm_pagemap_zdd *src_zdd =
+				drm_pagemap_page_zone_device_data(src_page);
 
 			if (page_pgmap(src_page) == pagemap &&
 			    !mdetails->can_migrate_same_pagemap) {
@@ -715,8 +716,8 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
 			goto next;
 
 		if (fault_page) {
-			if (src_page->zone_device_data !=
-			    fault_page->zone_device_data)
+			if (drm_pagemap_page_zone_device_data(src_page) !=
+			    drm_pagemap_page_zone_device_data(fault_page))
 				goto next;
 		}
 
@@ -1057,7 +1058,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
 	void *buf;
 	int i, err = 0;
 
-	zdd = page->zone_device_data;
+	zdd = drm_pagemap_page_zone_device_data(page);
 	if (time_before64(get_jiffies_64(), zdd->devmem_allocation->timeslice_expiration))
 		return 0;
 
@@ -1140,7 +1141,9 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
  */
 static void drm_pagemap_folio_free(struct folio *folio)
 {
-	drm_pagemap_zdd_put(folio->page.zone_device_data);
+	struct page *page = folio_page(folio, 0);
+
+	drm_pagemap_zdd_put(drm_pagemap_page_zone_device_data(page));
 }
 
 /**
@@ -1156,7 +1159,7 @@ static void drm_pagemap_folio_free(struct folio *folio)
  */
 static vm_fault_t drm_pagemap_migrate_to_ram(struct vm_fault *vmf)
 {
-	struct drm_pagemap_zdd *zdd = vmf->page->zone_device_data;
+	struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(vmf->page);
 	int err;
 
 	err = __drm_pagemap_migrate_to_ram(vmf->vma,
@@ -1222,7 +1225,7 @@ EXPORT_SYMBOL_GPL(drm_pagemap_devmem_init);
  */
 struct drm_pagemap *drm_pagemap_page_to_dpagemap(struct page *page)
 {
-	struct drm_pagemap_zdd *zdd = page->zone_device_data;
+	struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
 
 	return zdd->devmem_allocation->dpagemap;
 }
diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
index c848f578e3da..72f6828f2604 100644
--- a/include/drm/drm_pagemap.h
+++ b/include/drm/drm_pagemap.h
@@ -4,6 +4,7 @@
 
 #include <linux/dma-direction.h>
 #include <linux/hmm.h>
+#include <linux/memremap.h>
 #include <linux/types.h>
 
 #define NR_PAGES(order) (1U << (order))
@@ -341,6 +342,19 @@ struct drm_pagemap_migrate_details {
 	u32 source_peer_migrates : 1;
 };
 
+/**
+ * drm_pagemap_page_zone_device_data() - Page to zone_device_data
+ * @page: Pointer to the page
+ *
+ * Return: Page's zone_device_data
+ */
+static inline struct drm_pagemap_zdd *drm_pagemap_page_zone_device_data(struct page *page)
+{
+	struct folio *folio = page_folio(page);
+
+	return folio_zone_device_data(folio);
+}
+
 #if IS_ENABLED(CONFIG_ZONE_DEVICE)
 
 int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 2/5] drm/gpusvm: Use dma-map IOVA alloc, link, and sync API in GPU SVM
  2026-02-19 20:10 [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 1/5] drm/pagemap: Add helper to access zone_device_data Matthew Brost
@ 2026-02-19 20:10 ` Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true Matthew Brost
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:10 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

The dma-map IOVA alloc, link, and sync APIs perform significantly better
than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
This difference is especially noticeable when mapping a 2MB region in
4KB pages.

Use the IOVA alloc, link, and sync APIs for GPU SVM, which create DMA
mappings between the CPU and GPU.
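
Teardown is symmetric: when dma_use_iova() reports that the IOVA path
was used, the whole range is unlinked and freed in one shot instead of
via per-page dma_unmap_page() calls (a sketch; state, addrs and
mapped_size are illustrative):

if (dma_use_iova(&state)) {
	dma_iova_unlink(dev, &state, 0, mapped_size, dir, 0);
	dma_iova_free(dev, &state);
} else {
	for (i = 0; i < npages; i++)
		dma_unmap_page(dev, addrs[i], PAGE_SIZE, dir);
}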

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/drm_gpusvm.c | 55 ++++++++++++++++++++++++++++++------
 include/drm/drm_gpusvm.h     |  5 ++++
 2 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 1490d1929b1a..e406ceb8d000 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1139,11 +1139,19 @@ static void __drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
 		struct drm_gpusvm_pages_flags flags = {
 			.__flags = svm_pages->flags.__flags,
 		};
+		bool use_iova = dma_use_iova(&svm_pages->state);
+
+		if (use_iova) {
+			dma_iova_unlink(dev, &svm_pages->state, 0,
+					svm_pages->state_offset,
+					svm_pages->dma_addr[0].dir, 0);
+			dma_iova_free(dev, &svm_pages->state);
+		}
 
 		for (i = 0, j = 0; i < npages; j++) {
 			struct drm_pagemap_addr *addr = &svm_pages->dma_addr[j];
 
-			if (addr->proto == DRM_INTERCONNECT_SYSTEM)
+			if (!use_iova && addr->proto == DRM_INTERCONNECT_SYSTEM)
 				dma_unmap_page(dev,
 					       addr->addr,
 					       PAGE_SIZE << addr->order,
@@ -1408,6 +1416,7 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 	struct drm_gpusvm_pages_flags flags;
 	enum dma_data_direction dma_dir = ctx->read_only ? DMA_TO_DEVICE :
 							   DMA_BIDIRECTIONAL;
+	struct dma_iova_state *state = &svm_pages->state;
 
 retry:
 	if (time_after(jiffies, timeout))
@@ -1446,6 +1455,9 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 	if (err)
 		goto err_free;
 
+	*state = (struct dma_iova_state){};
+	svm_pages->state_offset = 0;
+
 map_pages:
 	/*
 	 * Perform all dma mappings under the notifier lock to not
@@ -1539,13 +1551,33 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 				goto err_unmap;
 			}
 
-			addr = dma_map_page(gpusvm->drm->dev,
-					    page, 0,
-					    PAGE_SIZE << order,
-					    dma_dir);
-			if (dma_mapping_error(gpusvm->drm->dev, addr)) {
-				err = -EFAULT;
-				goto err_unmap;
+			if (!i)
+				dma_iova_try_alloc(gpusvm->drm->dev, state,
+						   npages * PAGE_SIZE >=
+						   HPAGE_PMD_SIZE ?
+						   HPAGE_PMD_SIZE : 0,
+						   npages * PAGE_SIZE);
+
+			if (dma_use_iova(state)) {
+				err = dma_iova_link(gpusvm->drm->dev, state,
+						    hmm_pfn_to_phys(pfns[i]),
+						    svm_pages->state_offset,
+						    PAGE_SIZE << order,
+						    dma_dir, 0);
+				if (err)
+					goto err_unmap;
+
+				addr = state->addr + svm_pages->state_offset;
+				svm_pages->state_offset += PAGE_SIZE << order;
+			} else {
+				addr = dma_map_page(gpusvm->drm->dev,
+						    page, 0,
+						    PAGE_SIZE << order,
+						    dma_dir);
+				if (dma_mapping_error(gpusvm->drm->dev, addr)) {
+					err = -EFAULT;
+					goto err_unmap;
+				}
 			}
 
 			svm_pages->dma_addr[j] = drm_pagemap_addr_encode
@@ -1557,6 +1589,13 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
 		flags.has_dma_mapping = true;
 	}
 
+	if (dma_use_iova(state)) {
+		err = dma_iova_sync(gpusvm->drm->dev, state, 0,
+				    svm_pages->state_offset);
+		if (err)
+			goto err_unmap;
+	}
+
 	if (pagemap) {
 		flags.has_devmem_pages = true;
 		drm_pagemap_get(dpagemap);
diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
index 2578ac92a8d4..cd94bb2ee6ee 100644
--- a/include/drm/drm_gpusvm.h
+++ b/include/drm/drm_gpusvm.h
@@ -6,6 +6,7 @@
 #ifndef __DRM_GPUSVM_H__
 #define __DRM_GPUSVM_H__
 
+#include <linux/dma-mapping.h>
 #include <linux/kref.h>
 #include <linux/interval_tree.h>
 #include <linux/mmu_notifier.h>
@@ -136,6 +137,8 @@ struct drm_gpusvm_pages_flags {
  * @dma_addr: Device address array
  * @dpagemap: The struct drm_pagemap of the device pages we're dma-mapping.
  *            Note this is assuming only one drm_pagemap per range is allowed.
+ * @state: DMA IOVA state for mapping.
+ * @state_offset: DMA IOVA offset for mapping.
  * @notifier_seq: Notifier sequence number of the range's pages
  * @flags: Flags for range
  * @flags.migrate_devmem: Flag indicating whether the range can be migrated to device memory
@@ -147,6 +150,8 @@ struct drm_gpusvm_pages_flags {
 struct drm_gpusvm_pages {
 	struct drm_pagemap_addr *dma_addr;
 	struct drm_pagemap *dpagemap;
+	struct dma_iova_state state;
+	unsigned long state_offset;
 	unsigned long notifier_seq;
 	struct drm_gpusvm_pages_flags flags;
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true
  2026-02-19 20:10 [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 1/5] drm/pagemap: Add helper to access zone_device_data Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 2/5] drm/gpusvm: Use dma-map IOVA alloc, link, and sync API in GPU SVM Matthew Brost
@ 2026-02-19 20:10 ` Matthew Brost
  2026-02-19 20:53   ` Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 4/5] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system Matthew Brost
  2026-02-19 20:10 ` [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap Matthew Brost
  4 siblings, 1 reply; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:10 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

All current users of DRM pagemap set source_peer_migrates to true during
migration, and it is unclear whether any user would ever want to disable
this for performance reasons or for features such as compression. It is
also questionable whether this flag could be made to work with
high-speed fabric mapping APIs.

Drop the flag and make DRM pagemap unconditionally assume that
source_peer_migrates is true.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/drm_pagemap.c | 10 ++++------
 drivers/gpu/drm/xe/xe_svm.c   |  1 -
 include/drm/drm_pagemap.h     |  8 ++------
 3 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index 01a06d1fd1a0..32535ab01c0f 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -602,12 +602,10 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 				own_pages++;
 				continue;
 			}
-			if (mdetails->source_peer_migrates) {
-				cur.dpagemap = src_zdd->dpagemap;
-				cur.ops = src_zdd->devmem_allocation->ops;
-				cur.device = cur.dpagemap->drm->dev;
-				pages[i] = src_page;
-			}
+			cur.dpagemap = src_zdd->dpagemap;
+			cur.ops = src_zdd->devmem_allocation->ops;
+			cur.device = cur.dpagemap->drm->dev;
+			pages[i] = src_page;
 		}
 		if (!pages[i]) {
 			cur.dpagemap = NULL;
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index c96ed760c077..e86e69087c7e 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -1027,7 +1027,6 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
 	struct xe_pagemap *xpagemap = container_of(dpagemap, typeof(*xpagemap), dpagemap);
 	struct drm_pagemap_migrate_details mdetails = {
 		.timeslice_ms = timeslice_ms,
-		.source_peer_migrates = 1,
 	};
 	struct xe_vram_region *vr = xe_pagemap_to_vr(xpagemap);
 	struct dma_fence *pre_migrate_fence = NULL;
diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
index 72f6828f2604..5c33982141c2 100644
--- a/include/drm/drm_pagemap.h
+++ b/include/drm/drm_pagemap.h
@@ -329,12 +329,8 @@ struct drm_pagemap_devmem {
  * struct drm_pagemap_migrate_details - Details to govern migration.
  * @timeslice_ms: The time requested for the migrated pagemap pages to
  * be present in @mm before being allowed to be migrated back.
- * @can_migrate_same_pagemap: Whether the copy function as indicated by
- * the @source_peer_migrates flag, can migrate device pages within a
- * single drm_pagemap.
- * @source_peer_migrates: Whether on p2p migration, The source drm_pagemap
- * should use the copy_to_ram() callback rather than the destination
- * drm_pagemap should use the copy_to_devmem() callback.
+ * @can_migrate_same_pagemap: Whether the copy function can migrate
+ * device pages within a single drm_pagemap.
  */
 struct drm_pagemap_migrate_details {
 	unsigned long timeslice_ms;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 4/5] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system
  2026-02-19 20:10 [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
                   ` (2 preceding siblings ...)
  2026-02-19 20:10 ` [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true Matthew Brost
@ 2026-02-19 20:10 ` Matthew Brost
  2026-04-02 14:12   ` Francois Dugast
  2026-02-19 20:10 ` [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap Matthew Brost
  4 siblings, 1 reply; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:10 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

Split drm_pagemap_migrate_map_pages into device / system helpers,
clearly separating these operations. This will help with upcoming
changes that split out the IOVA allocation steps.
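
After the split, callers pick the helper matching the source of the
pages, roughly (a sketch; the argument lists match the diff below):

/* Peer-to-peer migration: pages are device private. */
err = drm_pagemap_migrate_map_device_private_pages(dev, local_dpagemap,
						   pagemap_addr, pfns,
						   npages, dir, mdetails);

/* Eviction / migration to RAM: system or device coherent pages. */
err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr, pfns,
					   npages, dir);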

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

---
v5:
 - s/map_device_pages/map_device_private_pages (Thomas)
 - Fix map_system_pages kernel doc (Thomas)
---
 drivers/gpu/drm/drm_pagemap.c | 150 ++++++++++++++++++++++------------
 1 file changed, 100 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index 32535ab01c0f..ef8b9c69d1d4 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -205,7 +205,8 @@ static void drm_pagemap_get_devmem_page(struct page *page,
 }
 
 /**
- * drm_pagemap_migrate_map_pages() - Map migration pages for GPU SVM migration
+ * drm_pagemap_migrate_map_device_private_pages() - Map device privaet migration
+ * pages for GPU SVM migration
  * @dev: The device performing the migration.
  * @local_dpagemap: The drm_pagemap local to the migrating device.
  * @pagemap_addr: Array to store DMA information corresponding to mapped pages.
@@ -221,19 +222,22 @@ static void drm_pagemap_get_devmem_page(struct page *page,
  *
  * Returns: 0 on success, -EFAULT if an error occurs during mapping.
  */
-static int drm_pagemap_migrate_map_pages(struct device *dev,
-					 struct drm_pagemap *local_dpagemap,
-					 struct drm_pagemap_addr *pagemap_addr,
-					 unsigned long *migrate_pfn,
-					 unsigned long npages,
-					 enum dma_data_direction dir,
-					 const struct drm_pagemap_migrate_details *mdetails)
+static int
+drm_pagemap_migrate_map_device_private_pages(struct device *dev,
+					     struct drm_pagemap *local_dpagemap,
+					     struct drm_pagemap_addr *pagemap_addr,
+					     unsigned long *migrate_pfn,
+					     unsigned long npages,
+					     enum dma_data_direction dir,
+					     const struct drm_pagemap_migrate_details *mdetails)
 {
 	unsigned long num_peer_pages = 0, num_local_pages = 0, i;
 
 	for (i = 0; i < npages;) {
 		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
-		dma_addr_t dma_addr;
+		struct drm_pagemap_zdd *zdd;
+		struct drm_pagemap *dpagemap;
+		struct drm_pagemap_addr addr;
 		struct folio *folio;
 		unsigned int order = 0;
 
@@ -243,36 +247,26 @@ static int drm_pagemap_migrate_map_pages(struct device *dev,
 		folio = page_folio(page);
 		order = folio_order(folio);
 
-		if (is_device_private_page(page)) {
-			struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
-			struct drm_pagemap *dpagemap = zdd->dpagemap;
-			struct drm_pagemap_addr addr;
-
-			if (dpagemap == local_dpagemap) {
-				if (!mdetails->can_migrate_same_pagemap)
-					goto next;
+		WARN_ON_ONCE(!is_device_private_page(page));
 
-				num_local_pages += NR_PAGES(order);
-			} else {
-				num_peer_pages += NR_PAGES(order);
-			}
+		zdd = drm_pagemap_page_zone_device_data(page);
+		dpagemap = zdd->dpagemap;
 
-			addr = dpagemap->ops->device_map(dpagemap, dev, page, order, dir);
-			if (dma_mapping_error(dev, addr.addr))
-				return -EFAULT;
+		if (dpagemap == local_dpagemap) {
+			if (!mdetails->can_migrate_same_pagemap)
+				goto next;
 
-			pagemap_addr[i] = addr;
+			num_local_pages += NR_PAGES(order);
 		} else {
-			dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
-			if (dma_mapping_error(dev, dma_addr))
-				return -EFAULT;
-
-			pagemap_addr[i] =
-				drm_pagemap_addr_encode(dma_addr,
-							DRM_INTERCONNECT_SYSTEM,
-							order, dir);
+			num_peer_pages += NR_PAGES(order);
 		}
 
+		addr = dpagemap->ops->device_map(dpagemap, dev, page, order, dir);
+		if (dma_mapping_error(dev, addr.addr))
+			return -EFAULT;
+
+		pagemap_addr[i] = addr;
+
 next:
 		i += NR_PAGES(order);
 	}
@@ -287,6 +281,60 @@ static int drm_pagemap_migrate_map_pages(struct device *dev,
 	return 0;
 }
 
+/**
+ * drm_pagemap_migrate_map_system_pages() - Map system or device coherent
+ * migration pages for GPU SVM migration
+ * @dev: The device performing the migration.
+ * @pagemap_addr: Array to store DMA information corresponding to mapped pages.
+ * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
+ * @npages: Number of system or device coherent pages to map.
+ * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
+ *
+ * This function maps pages of memory for migration usage in GPU SVM. It
+ * iterates over each page frame number provided in @migrate_pfn, maps the
+ * corresponding page, and stores the DMA address in the provided @dma_addr
+ * array.
+ *
+ * Returns: 0 on success, -EFAULT if an error occurs during mapping.
+ */
+static int
+drm_pagemap_migrate_map_system_pages(struct device *dev,
+				     struct drm_pagemap_addr *pagemap_addr,
+				     unsigned long *migrate_pfn,
+				     unsigned long npages,
+				     enum dma_data_direction dir)
+{
+	unsigned long i;
+
+	for (i = 0; i < npages;) {
+		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
+		dma_addr_t dma_addr;
+		struct folio *folio;
+		unsigned int order = 0;
+
+		if (!page)
+			goto next;
+
+		WARN_ON_ONCE(is_device_private_page(page));
+		folio = page_folio(page);
+		order = folio_order(folio);
+
+		dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
+		if (dma_mapping_error(dev, dma_addr))
+			return -EFAULT;
+
+		pagemap_addr[i] =
+			drm_pagemap_addr_encode(dma_addr,
+						DRM_INTERCONNECT_SYSTEM,
+						order, dir);
+
+next:
+		i += NR_PAGES(order);
+	}
+
+	return 0;
+}
+
 /**
  * drm_pagemap_migrate_unmap_pages() - Unmap pages previously mapped for GPU SVM migration
  * @dev: The device for which the pages were mapped
@@ -347,9 +395,13 @@ drm_pagemap_migrate_remote_to_local(struct drm_pagemap_devmem *devmem,
 				    const struct drm_pagemap_migrate_details *mdetails)
 
 {
-	int err = drm_pagemap_migrate_map_pages(remote_device, remote_dpagemap,
-						pagemap_addr, local_pfns,
-						npages, DMA_FROM_DEVICE, mdetails);
+	int err = drm_pagemap_migrate_map_device_private_pages(remote_device,
+							       remote_dpagemap,
+							       pagemap_addr,
+							       local_pfns,
+							       npages,
+							       DMA_FROM_DEVICE,
+							       mdetails);
 
 	if (err)
 		goto out;
@@ -368,12 +420,11 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
 			       struct page *local_pages[],
 			       struct drm_pagemap_addr pagemap_addr[],
 			       unsigned long npages,
-			       const struct drm_pagemap_devmem_ops *ops,
-			       const struct drm_pagemap_migrate_details *mdetails)
+			       const struct drm_pagemap_devmem_ops *ops)
 {
-	int err = drm_pagemap_migrate_map_pages(devmem->dev, devmem->dpagemap,
-						pagemap_addr, sys_pfns, npages,
-						DMA_TO_DEVICE, mdetails);
+	int err = drm_pagemap_migrate_map_system_pages(devmem->dev,
+						       pagemap_addr, sys_pfns,
+						       npages, DMA_TO_DEVICE);
 
 	if (err)
 		goto out;
@@ -437,7 +488,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
 						     &pages[last->start],
 						     &pagemap_addr[last->start],
 						     cur->start - last->start,
-						     last->ops, mdetails);
+						     last->ops);
 
 out:
 	*last = *cur;
@@ -942,7 +993,6 @@ EXPORT_SYMBOL(drm_pagemap_put);
 int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
 {
 	const struct drm_pagemap_devmem_ops *ops = devmem_allocation->ops;
-	struct drm_pagemap_migrate_details mdetails = {};
 	unsigned long npages, mpages = 0;
 	struct page **pages;
 	unsigned long *src, *dst;
@@ -981,10 +1031,10 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
 	if (err || !mpages)
 		goto err_finalize;
 
-	err = drm_pagemap_migrate_map_pages(devmem_allocation->dev,
-					    devmem_allocation->dpagemap, pagemap_addr,
-					    dst, npages, DMA_FROM_DEVICE,
-					    &mdetails);
+	err = drm_pagemap_migrate_map_system_pages(devmem_allocation->dev,
+						   pagemap_addr,
+						   dst, npages,
+						   DMA_FROM_DEVICE);
 	if (err)
 		goto err_finalize;
 
@@ -1045,7 +1095,6 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
 		MIGRATE_VMA_SELECT_DEVICE_COHERENT,
 		.fault_page	= page,
 	};
-	struct drm_pagemap_migrate_details mdetails = {};
 	struct drm_pagemap_zdd *zdd;
 	const struct drm_pagemap_devmem_ops *ops;
 	struct device *dev = NULL;
@@ -1103,8 +1152,9 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
 	if (err)
 		goto err_finalize;
 
-	err = drm_pagemap_migrate_map_pages(dev, zdd->dpagemap, pagemap_addr, migrate.dst, npages,
-					    DMA_FROM_DEVICE, &mdetails);
+	err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr,
+						   migrate.dst, npages,
+						   DMA_FROM_DEVICE);
 	if (err)
 		goto err_finalize;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap
  2026-02-19 20:10 [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
                   ` (3 preceding siblings ...)
  2026-02-19 20:10 ` [PATCH v5 4/5] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system Matthew Brost
@ 2026-02-19 20:10 ` Matthew Brost
  2026-04-02 15:59   ` Francois Dugast
  4 siblings, 1 reply; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:10 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

The dma-map IOVA alloc, link, and sync APIs perform significantly better
than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
This difference is especially noticeable when mapping a 2MB region in
4KB pages.

Use the IOVA alloc, link, and sync APIs for DRM pagemap, which create DMA
mappings between the CPU and GPU for copying data.
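
The IOVA is allocated once per mapping pass and packed using a running
offset; the PMD-size alignment hint is only requested when the
remaining range can still hold a huge page (a sketch mirroring the
logic in the diff below; dev, page, npages, i and dir are
illustrative):

struct drm_pagemap_iova_state state = {};

dma_iova_try_alloc(dev, &state.dma_state,
		   (npages - i) * PAGE_SIZE >= HPAGE_PMD_SIZE ?
		   HPAGE_PMD_SIZE : 0,
		   npages * PAGE_SIZE);

if (dma_use_iova(&state.dma_state)) {
	/* Each linked page lands at state.offset within the IOVA. */
	err = dma_iova_link(dev, &state.dma_state, page_to_phys(page),
			    state.offset, page_size(page), dir, 0);
	state.offset += page_size(page);
}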

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

---
v5:
 - Remove extra newline (Thomas)
 - Adjust alignment calculation (Thomas)
---
 drivers/gpu/drm/drm_pagemap.c | 83 +++++++++++++++++++++++++++++------
 1 file changed, 69 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index ef8b9c69d1d4..d9fceffce347 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -281,6 +281,19 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
 	return 0;
 }
 
+/**
+ * struct drm_pagemap_iova_state - DRM pagemap IOVA state
+ * @dma_state: DMA IOVA state.
+ * @offset: Current offset in IOVA.
+ *
+ * This structure acts as an iterator for packing all IOVA addresses within a
+ * contiguous range.
+ */
+struct drm_pagemap_iova_state {
+	struct dma_iova_state dma_state;
+	unsigned long offset;
+};
+
 /**
  * drm_pagemap_migrate_map_system_pages() - Map system or device coherent
  * migration pages for GPU SVM migration
@@ -289,6 +302,7 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
  * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
  * @npages: Number of system or device coherent pages to map.
  * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
+ * @state: DMA IOVA state for mapping.
  *
  * This function maps pages of memory for migration usage in GPU SVM. It
  * iterates over each page frame number provided in @migrate_pfn, maps the
@@ -302,9 +316,11 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
 				     struct drm_pagemap_addr *pagemap_addr,
 				     unsigned long *migrate_pfn,
 				     unsigned long npages,
-				     enum dma_data_direction dir)
+				     enum dma_data_direction dir,
+				     struct drm_pagemap_iova_state *state)
 {
 	unsigned long i;
+	bool try_alloc = false;
 
 	for (i = 0; i < npages;) {
 		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
@@ -319,9 +335,31 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
 		folio = page_folio(page);
 		order = folio_order(folio);
 
-		dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
-		if (dma_mapping_error(dev, dma_addr))
-			return -EFAULT;
+		if (!try_alloc) {
+			dma_iova_try_alloc(dev, &state->dma_state,
+					   (npages - i) * PAGE_SIZE >=
+					   HPAGE_PMD_SIZE ?
+					   HPAGE_PMD_SIZE : 0,
+					   npages * PAGE_SIZE);
+			try_alloc = true;
+		}
+
+		if (dma_use_iova(&state->dma_state)) {
+			int err = dma_iova_link(dev, &state->dma_state,
+						page_to_phys(page),
+						state->offset, page_size(page),
+						dir, 0);
+			if (err)
+				return err;
+
+			dma_addr = state->dma_state.addr + state->offset;
+			state->offset += page_size(page);
+		} else {
+			dma_addr = dma_map_page(dev, page, 0, page_size(page),
+						dir);
+			if (dma_mapping_error(dev, dma_addr))
+				return -EFAULT;
+		}
 
 		pagemap_addr[i] =
 			drm_pagemap_addr_encode(dma_addr,
@@ -332,6 +370,9 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
 		i += NR_PAGES(order);
 	}
 
+	if (dma_use_iova(&state->dma_state))
+		return dma_iova_sync(dev, &state->dma_state, 0, state->offset);
+
 	return 0;
 }
 
@@ -343,6 +384,7 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
  * @pagemap_addr: Array of DMA information corresponding to mapped pages
  * @npages: Number of pages to unmap
  * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
+ * @state: DMA IOVA state for mapping.
  *
  * This function unmaps previously mapped pages of memory for GPU Shared Virtual
  * Memory (SVM). It iterates over each DMA address provided in @dma_addr, checks
@@ -352,10 +394,17 @@ static void drm_pagemap_migrate_unmap_pages(struct device *dev,
 					    struct drm_pagemap_addr *pagemap_addr,
 					    unsigned long *migrate_pfn,
 					    unsigned long npages,
-					    enum dma_data_direction dir)
+					    enum dma_data_direction dir,
+					    struct drm_pagemap_iova_state *state)
 {
 	unsigned long i;
 
+	if (state && dma_use_iova(&state->dma_state)) {
+		dma_iova_unlink(dev, &state->dma_state, 0, state->offset, dir, 0);
+		dma_iova_free(dev, &state->dma_state);
+		return;
+	}
+
 	for (i = 0; i < npages;) {
 		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
 
@@ -410,7 +459,7 @@ drm_pagemap_migrate_remote_to_local(struct drm_pagemap_devmem *devmem,
 			       devmem->pre_migrate_fence);
 out:
 	drm_pagemap_migrate_unmap_pages(remote_device, pagemap_addr, local_pfns,
-					npages, DMA_FROM_DEVICE);
+					npages, DMA_FROM_DEVICE, NULL);
 	return err;
 }
 
@@ -420,11 +469,13 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
 			       struct page *local_pages[],
 			       struct drm_pagemap_addr pagemap_addr[],
 			       unsigned long npages,
-			       const struct drm_pagemap_devmem_ops *ops)
+			       const struct drm_pagemap_devmem_ops *ops,
+			       struct drm_pagemap_iova_state *state)
 {
 	int err = drm_pagemap_migrate_map_system_pages(devmem->dev,
 						       pagemap_addr, sys_pfns,
-						       npages, DMA_TO_DEVICE);
+						       npages, DMA_TO_DEVICE,
+						       state);
 
 	if (err)
 		goto out;
@@ -433,7 +484,7 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
 				  devmem->pre_migrate_fence);
 out:
 	drm_pagemap_migrate_unmap_pages(devmem->dev, pagemap_addr, sys_pfns, npages,
-					DMA_TO_DEVICE);
+					DMA_TO_DEVICE, state);
 	return err;
 }
 
@@ -461,6 +512,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
 				     const struct migrate_range_loc *cur,
 				     const struct drm_pagemap_migrate_details *mdetails)
 {
+	struct drm_pagemap_iova_state state = {};
 	int ret = 0;
 
 	if (cur->start == 0)
@@ -488,7 +540,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
 						     &pages[last->start],
 						     &pagemap_addr[last->start],
 						     cur->start - last->start,
-						     last->ops);
+						     last->ops, &state);
 
 out:
 	*last = *cur;
@@ -993,6 +1045,7 @@ EXPORT_SYMBOL(drm_pagemap_put);
 int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
 {
 	const struct drm_pagemap_devmem_ops *ops = devmem_allocation->ops;
+	struct drm_pagemap_iova_state state = {};
 	unsigned long npages, mpages = 0;
 	struct page **pages;
 	unsigned long *src, *dst;
@@ -1034,7 +1087,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
 	err = drm_pagemap_migrate_map_system_pages(devmem_allocation->dev,
 						   pagemap_addr,
 						   dst, npages,
-						   DMA_FROM_DEVICE);
+						   DMA_FROM_DEVICE, &state);
 	if (err)
 		goto err_finalize;
 
@@ -1051,7 +1104,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
 	migrate_device_pages(src, dst, npages);
 	migrate_device_finalize(src, dst, npages);
 	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, pagemap_addr, dst, npages,
-					DMA_FROM_DEVICE);
+					DMA_FROM_DEVICE, &state);
 
 err_free:
 	kvfree(buf);
@@ -1095,6 +1148,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
 		MIGRATE_VMA_SELECT_DEVICE_COHERENT,
 		.fault_page	= page,
 	};
+	struct drm_pagemap_iova_state state = {};
 	struct drm_pagemap_zdd *zdd;
 	const struct drm_pagemap_devmem_ops *ops;
 	struct device *dev = NULL;
@@ -1154,7 +1208,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
 
 	err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr,
 						   migrate.dst, npages,
-						   DMA_FROM_DEVICE);
+						   DMA_FROM_DEVICE, &state);
 	if (err)
 		goto err_finalize;
 
@@ -1172,7 +1226,8 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
 	migrate_vma_finalize(&migrate);
 	if (dev)
 		drm_pagemap_migrate_unmap_pages(dev, pagemap_addr, migrate.dst,
-						npages, DMA_FROM_DEVICE);
+						npages, DMA_FROM_DEVICE,
+						&state);
 err_free:
 	kvfree(buf);
 err_out:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true
  2026-02-19 20:10 ` [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true Matthew Brost
@ 2026-02-19 20:53   ` Matthew Brost
  2026-04-02 10:33     ` Francois Dugast
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Brost @ 2026-02-19 20:53 UTC (permalink / raw)
  To: intel-xe, dri-devel
  Cc: leonro, jgg, francois.dugast, thomas.hellstrom,
	himal.prasad.ghimiray

On Thu, Feb 19, 2026 at 12:10:55PM -0800, Matthew Brost wrote:
> All current users of DRM pagemap set source_peer_migrates to true during
> migration, and it is unclear whether any user would ever want to disable
> this for performance reasons or for features such as compression. It is
> also questionable whether this flag could be made to work with
> high-speed fabric mapping APIs.
> 
> Drop the flag and make DRM pagemap unconditionally assume that
> source_peer_migrates is true.
> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/drm_pagemap.c | 10 ++++------
>  drivers/gpu/drm/xe/xe_svm.c   |  1 -
>  include/drm/drm_pagemap.h     |  8 ++------
>  3 files changed, 6 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index 01a06d1fd1a0..32535ab01c0f 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -602,12 +602,10 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
>  				own_pages++;
>  				continue;
>  			}
> -			if (mdetails->source_peer_migrates) {
> -				cur.dpagemap = src_zdd->dpagemap;
> -				cur.ops = src_zdd->devmem_allocation->ops;
> -				cur.device = cur.dpagemap->drm->dev;
> -				pages[i] = src_page;
> -			}
> +			cur.dpagemap = src_zdd->dpagemap;
> +			cur.ops = src_zdd->devmem_allocation->ops;
> +			cur.device = cur.dpagemap->drm->dev;
> +			pages[i] = src_page;
>  		}
>  		if (!pages[i]) {
>  			cur.dpagemap = NULL;
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index c96ed760c077..e86e69087c7e 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -1027,7 +1027,6 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
>  	struct xe_pagemap *xpagemap = container_of(dpagemap, typeof(*xpagemap), dpagemap);
>  	struct drm_pagemap_migrate_details mdetails = {
>  		.timeslice_ms = timeslice_ms,
> -		.source_peer_migrates = 1,
>  	};
>  	struct xe_vram_region *vr = xe_pagemap_to_vr(xpagemap);
>  	struct dma_fence *pre_migrate_fence = NULL;
> diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
> index 72f6828f2604..5c33982141c2 100644
> --- a/include/drm/drm_pagemap.h
> +++ b/include/drm/drm_pagemap.h
> @@ -329,12 +329,8 @@ struct drm_pagemap_devmem {
>   * struct drm_pagemap_migrate_details - Details to govern migration.
>   * @timeslice_ms: The time requested for the migrated pagemap pages to
>   * be present in @mm before being allowed to be migrated back.
> - * @can_migrate_same_pagemap: Whether the copy function as indicated by
> - * the @source_peer_migrates flag, can migrate device pages within a
> - * single drm_pagemap.
> - * @source_peer_migrates: Whether on p2p migration, The source drm_pagemap
> - * should use the copy_to_ram() callback rather than the destination
> - * drm_pagemap should use the copy_to_devmem() callback.
> + * @can_migrate_same_pagemap: Whether the copy function can migrate
> + * device pages within a single drm_pagemap.

I forgot to delete this variable; in an effort to save CI cycles, I
will fix it in the next rev or when merging.

Matt 

>   */
>  struct drm_pagemap_migrate_details {
>  	unsigned long timeslice_ms;
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true
  2026-02-19 20:53   ` Matthew Brost
@ 2026-04-02 10:33     ` Francois Dugast
  0 siblings, 0 replies; 11+ messages in thread
From: Francois Dugast @ 2026-04-02 10:33 UTC (permalink / raw)
  To: Matthew Brost
  Cc: intel-xe, dri-devel, leonro, jgg, thomas.hellstrom,
	himal.prasad.ghimiray

On Thu, Feb 19, 2026 at 12:53:13PM -0800, Matthew Brost wrote:
> On Thu, Feb 19, 2026 at 12:10:55PM -0800, Matthew Brost wrote:
> > All current users of DRM pagemap set source_peer_migrates to true during
> > migration, and it is unclear whether any user would ever want to disable
> > this for performance reasons or for features such as compression. It is
> > also questionable whether this flag could be made to work with
> > high-speed fabric mapping APIs.
> > 
> > Drop the flag and make DRM pagemap unconditionally assume that
> > source_peer_migrates is true.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/drm_pagemap.c | 10 ++++------
> >  drivers/gpu/drm/xe/xe_svm.c   |  1 -
> >  include/drm/drm_pagemap.h     |  8 ++------
> >  3 files changed, 6 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> > index 01a06d1fd1a0..32535ab01c0f 100644
> > --- a/drivers/gpu/drm/drm_pagemap.c
> > +++ b/drivers/gpu/drm/drm_pagemap.c
> > @@ -602,12 +602,10 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
> >  				own_pages++;
> >  				continue;
> >  			}
> > -			if (mdetails->source_peer_migrates) {
> > -				cur.dpagemap = src_zdd->dpagemap;
> > -				cur.ops = src_zdd->devmem_allocation->ops;
> > -				cur.device = cur.dpagemap->drm->dev;
> > -				pages[i] = src_page;
> > -			}
> > +			cur.dpagemap = src_zdd->dpagemap;
> > +			cur.ops = src_zdd->devmem_allocation->ops;
> > +			cur.device = cur.dpagemap->drm->dev;
> > +			pages[i] = src_page;
> >  		}
> >  		if (!pages[i]) {
> >  			cur.dpagemap = NULL;
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > index c96ed760c077..e86e69087c7e 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -1027,7 +1027,6 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
> >  	struct xe_pagemap *xpagemap = container_of(dpagemap, typeof(*xpagemap), dpagemap);
> >  	struct drm_pagemap_migrate_details mdetails = {
> >  		.timeslice_ms = timeslice_ms,
> > -		.source_peer_migrates = 1,
> >  	};
> >  	struct xe_vram_region *vr = xe_pagemap_to_vr(xpagemap);
> >  	struct dma_fence *pre_migrate_fence = NULL;
> > diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
> > index 72f6828f2604..5c33982141c2 100644
> > --- a/include/drm/drm_pagemap.h
> > +++ b/include/drm/drm_pagemap.h
> > @@ -329,12 +329,8 @@ struct drm_pagemap_devmem {
> >   * struct drm_pagemap_migrate_details - Details to govern migration.
> >   * @timeslice_ms: The time requested for the migrated pagemap pages to
> >   * be present in @mm before being allowed to be migrated back.
> > - * @can_migrate_same_pagemap: Whether the copy function as indicated by
> > - * the @source_peer_migrates flag, can migrate device pages within a
> > - * single drm_pagemap.
> > - * @source_peer_migrates: Whether on p2p migration, The source drm_pagemap
> > - * should use the copy_to_ram() callback rather than the destination
> > - * drm_pagemap should use the copy_to_devmem() callback.
> > + * @can_migrate_same_pagemap: Whether the copy function can migrate
> > + * device pages within a single drm_pagemap.
> 
> I forgot to delete this variable; in an effort to save CI cycles, I
> will fix it in the next rev or when merging.

With source_peer_migrates removed:

    Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> 
> Matt 
> 
> >   */
> >  struct drm_pagemap_migrate_details {
> >  	unsigned long timeslice_ms;
> > -- 
> > 2.34.1
> > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 4/5] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system
  2026-02-19 20:10 ` [PATCH v5 4/5] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system Matthew Brost
@ 2026-04-02 14:12   ` Francois Dugast
  0 siblings, 0 replies; 11+ messages in thread
From: Francois Dugast @ 2026-04-02 14:12 UTC (permalink / raw)
  To: Matthew Brost
  Cc: intel-xe, dri-devel, leonro, jgg, thomas.hellstrom,
	himal.prasad.ghimiray

On Thu, Feb 19, 2026 at 12:10:56PM -0800, Matthew Brost wrote:
> Split drm_pagemap_migrate_map_pages into device / system helpers,
> clearly separating these operations. This will help with upcoming
> changes that split out the IOVA allocation steps.

A side effect is that it makes the code a lot more readable. A couple
of nits below.

Reviewed-by: Francois Dugast <francois.dugast@intel.com>

> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> 
> ---
> v5:
>  - s/map_device_pages/map_device_private_pages (Thomas)
>  - Fix map_system_pages kernel doc (Thomas)
> ---
>  drivers/gpu/drm/drm_pagemap.c | 150 ++++++++++++++++++++++------------
>  1 file changed, 100 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index 32535ab01c0f..ef8b9c69d1d4 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -205,7 +205,8 @@ static void drm_pagemap_get_devmem_page(struct page *page,
>  }
>  
>  /**
> - * drm_pagemap_migrate_map_pages() - Map migration pages for GPU SVM migration
> + * drm_pagemap_migrate_map_device_private_pages() - Map device privaet migration

s/privaet/private/

> + * pages for GPU SVM migration
>   * @dev: The device performing the migration.
>   * @local_dpagemap: The drm_pagemap local to the migrating device.
>   * @pagemap_addr: Array to store DMA information corresponding to mapped pages.
> @@ -221,19 +222,22 @@ static void drm_pagemap_get_devmem_page(struct page *page,
>   *
>   * Returns: 0 on success, -EFAULT if an error occurs during mapping.
>   */
> -static int drm_pagemap_migrate_map_pages(struct device *dev,
> -					 struct drm_pagemap *local_dpagemap,
> -					 struct drm_pagemap_addr *pagemap_addr,
> -					 unsigned long *migrate_pfn,
> -					 unsigned long npages,
> -					 enum dma_data_direction dir,
> -					 const struct drm_pagemap_migrate_details *mdetails)
> +static int
> +drm_pagemap_migrate_map_device_private_pages(struct device *dev,
> +					     struct drm_pagemap *local_dpagemap,
> +					     struct drm_pagemap_addr *pagemap_addr,
> +					     unsigned long *migrate_pfn,
> +					     unsigned long npages,
> +					     enum dma_data_direction dir,
> +					     const struct drm_pagemap_migrate_details *mdetails)
>  {
>  	unsigned long num_peer_pages = 0, num_local_pages = 0, i;
>  
>  	for (i = 0; i < npages;) {
>  		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> -		dma_addr_t dma_addr;
> +		struct drm_pagemap_zdd *zdd;
> +		struct drm_pagemap *dpagemap;
> +		struct drm_pagemap_addr addr;
>  		struct folio *folio;
>  		unsigned int order = 0;
>  
> @@ -243,36 +247,26 @@ static int drm_pagemap_migrate_map_pages(struct device *dev,
>  		folio = page_folio(page);
>  		order = folio_order(folio);
>  
> -		if (is_device_private_page(page)) {
> -			struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
> -			struct drm_pagemap *dpagemap = zdd->dpagemap;
> -			struct drm_pagemap_addr addr;
> -
> -			if (dpagemap == local_dpagemap) {
> -				if (!mdetails->can_migrate_same_pagemap)
> -					goto next;
> +		WARN_ON_ONCE(!is_device_private_page(page));

Another nit: could we move this line ^ above that one:

    folio = page_folio(page);

so that the check is the first thing we do with that page, and also to
have a bit more symmetry with +drm_pagemap_migrate_map_system_pages().

Francois

>  
> -				num_local_pages += NR_PAGES(order);
> -			} else {
> -				num_peer_pages += NR_PAGES(order);
> -			}
> +		zdd = drm_pagemap_page_zone_device_data(page);
> +		dpagemap = zdd->dpagemap;
>  
> -			addr = dpagemap->ops->device_map(dpagemap, dev, page, order, dir);
> -			if (dma_mapping_error(dev, addr.addr))
> -				return -EFAULT;
> +		if (dpagemap == local_dpagemap) {
> +			if (!mdetails->can_migrate_same_pagemap)
> +				goto next;
>  
> -			pagemap_addr[i] = addr;
> +			num_local_pages += NR_PAGES(order);
>  		} else {
> -			dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
> -			if (dma_mapping_error(dev, dma_addr))
> -				return -EFAULT;
> -
> -			pagemap_addr[i] =
> -				drm_pagemap_addr_encode(dma_addr,
> -							DRM_INTERCONNECT_SYSTEM,
> -							order, dir);
> +			num_peer_pages += NR_PAGES(order);
>  		}
>  
> +		addr = dpagemap->ops->device_map(dpagemap, dev, page, order, dir);
> +		if (dma_mapping_error(dev, addr.addr))
> +			return -EFAULT;
> +
> +		pagemap_addr[i] = addr;
> +
>  next:
>  		i += NR_PAGES(order);
>  	}
> @@ -287,6 +281,60 @@ static int drm_pagemap_migrate_map_pages(struct device *dev,
>  	return 0;
>  }
>  
> +/**
> + * drm_pagemap_migrate_map_system_pages() - Map system or device coherent
> + * migration pages for GPU SVM migration
> + * @dev: The device performing the migration.
> + * @pagemap_addr: Array to store DMA information corresponding to mapped pages.
> + * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
> + * @npages: Number of system or device coherent pages to map.
> + * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> + *
> + * This function maps pages of memory for migration usage in GPU SVM. It
> + * iterates over each page frame number provided in @migrate_pfn, maps the
> + * corresponding page, and stores the DMA address in the provided @dma_addr
> + * array.
> + *
> + * Returns: 0 on success, -EFAULT if an error occurs during mapping.
> + */
> +static int
> +drm_pagemap_migrate_map_system_pages(struct device *dev,
> +				     struct drm_pagemap_addr *pagemap_addr,
> +				     unsigned long *migrate_pfn,
> +				     unsigned long npages,
> +				     enum dma_data_direction dir)
> +{
> +	unsigned long i;
> +
> +	for (i = 0; i < npages;) {
> +		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> +		dma_addr_t dma_addr;
> +		struct folio *folio;
> +		unsigned int order = 0;
> +
> +		if (!page)
> +			goto next;
> +
> +		WARN_ON_ONCE(is_device_private_page(page));
> +		folio = page_folio(page);
> +		order = folio_order(folio);
> +
> +		dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
> +		if (dma_mapping_error(dev, dma_addr))
> +			return -EFAULT;
> +
> +		pagemap_addr[i] =
> +			drm_pagemap_addr_encode(dma_addr,
> +						DRM_INTERCONNECT_SYSTEM,
> +						order, dir);
> +
> +next:
> +		i += NR_PAGES(order);
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * drm_pagemap_migrate_unmap_pages() - Unmap pages previously mapped for GPU SVM migration
>   * @dev: The device for which the pages were mapped
> @@ -347,9 +395,13 @@ drm_pagemap_migrate_remote_to_local(struct drm_pagemap_devmem *devmem,
>  				    const struct drm_pagemap_migrate_details *mdetails)
>  
>  {
> -	int err = drm_pagemap_migrate_map_pages(remote_device, remote_dpagemap,
> -						pagemap_addr, local_pfns,
> -						npages, DMA_FROM_DEVICE, mdetails);
> +	int err = drm_pagemap_migrate_map_device_private_pages(remote_device,
> +							       remote_dpagemap,
> +							       pagemap_addr,
> +							       local_pfns,
> +							       npages,
> +							       DMA_FROM_DEVICE,
> +							       mdetails);
>  
>  	if (err)
>  		goto out;
> @@ -368,12 +420,11 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
>  			       struct page *local_pages[],
>  			       struct drm_pagemap_addr pagemap_addr[],
>  			       unsigned long npages,
> -			       const struct drm_pagemap_devmem_ops *ops,
> -			       const struct drm_pagemap_migrate_details *mdetails)
> +			       const struct drm_pagemap_devmem_ops *ops)
>  {
> -	int err = drm_pagemap_migrate_map_pages(devmem->dev, devmem->dpagemap,
> -						pagemap_addr, sys_pfns, npages,
> -						DMA_TO_DEVICE, mdetails);
> +	int err = drm_pagemap_migrate_map_system_pages(devmem->dev,
> +						       pagemap_addr, sys_pfns,
> +						       npages, DMA_TO_DEVICE);
>  
>  	if (err)
>  		goto out;
> @@ -437,7 +488,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
>  						     &pages[last->start],
>  						     &pagemap_addr[last->start],
>  						     cur->start - last->start,
> -						     last->ops, mdetails);
> +						     last->ops);
>  
>  out:
>  	*last = *cur;
> @@ -942,7 +993,6 @@ EXPORT_SYMBOL(drm_pagemap_put);
>  int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
>  {
>  	const struct drm_pagemap_devmem_ops *ops = devmem_allocation->ops;
> -	struct drm_pagemap_migrate_details mdetails = {};
>  	unsigned long npages, mpages = 0;
>  	struct page **pages;
>  	unsigned long *src, *dst;
> @@ -981,10 +1031,10 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
>  	if (err || !mpages)
>  		goto err_finalize;
>  
> -	err = drm_pagemap_migrate_map_pages(devmem_allocation->dev,
> -					    devmem_allocation->dpagemap, pagemap_addr,
> -					    dst, npages, DMA_FROM_DEVICE,
> -					    &mdetails);
> +	err = drm_pagemap_migrate_map_system_pages(devmem_allocation->dev,
> +						   pagemap_addr,
> +						   dst, npages,
> +						   DMA_FROM_DEVICE);
>  	if (err)
>  		goto err_finalize;
>  
> @@ -1045,7 +1095,6 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  		MIGRATE_VMA_SELECT_DEVICE_COHERENT,
>  		.fault_page	= page,
>  	};
> -	struct drm_pagemap_migrate_details mdetails = {};
>  	struct drm_pagemap_zdd *zdd;
>  	const struct drm_pagemap_devmem_ops *ops;
>  	struct device *dev = NULL;
> @@ -1103,8 +1152,9 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  	if (err)
>  		goto err_finalize;
>  
> -	err = drm_pagemap_migrate_map_pages(dev, zdd->dpagemap, pagemap_addr, migrate.dst, npages,
> -					    DMA_FROM_DEVICE, &mdetails);
> +	err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr,
> +						   migrate.dst, npages,
> +						   DMA_FROM_DEVICE);
>  	if (err)
>  		goto err_finalize;
>  
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap
  2026-02-19 20:10 ` [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap Matthew Brost
@ 2026-04-02 15:59   ` Francois Dugast
  2026-04-08 16:46     ` Matthew Brost
  0 siblings, 1 reply; 11+ messages in thread
From: Francois Dugast @ 2026-04-02 15:59 UTC (permalink / raw)
  To: Matthew Brost
  Cc: intel-xe, dri-devel, leonro, jgg, thomas.hellstrom,
	himal.prasad.ghimiray

On Thu, Feb 19, 2026 at 12:10:57PM -0800, Matthew Brost wrote:
> The dma-map IOVA alloc, link, and sync APIs perform significantly better
> than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
> This difference is especially noticeable when mapping a 2MB region in
> 4KB pages.

Still a good improvement but with device THP now in drm-tip for GPU SVM,
the speedup is less noticeable when looking at latency and throughput.

> 
> Use the IOVA alloc, link, and sync APIs for DRM pagemap, which create DMA
> mappings between the CPU and GPU for copying data.
> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> 
> ---
> v5:
>  - Remove extra newline (Thomas)
>  - Adjust alignment calculation (Thomas)
> ---
>  drivers/gpu/drm/drm_pagemap.c | 83 +++++++++++++++++++++++++++++------
>  1 file changed, 69 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index ef8b9c69d1d4..d9fceffce347 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -281,6 +281,19 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
>  	return 0;
>  }
>  
> +/**
> + * struct drm_pagemap_iova_state - DRM pagemap IOVA state
> + * @dma_state: DMA IOVA state.
> + * @offset: Current offset in IOVA.
> + *
> + * This structure acts as an iterator for packing all IOVA addresses within a
> + * contiguous range.
> + */
> +struct drm_pagemap_iova_state {
> +	struct dma_iova_state dma_state;
> +	unsigned long offset;
> +};
> +
>  /**
>   * drm_pagemap_migrate_map_system_pages() - Map system or device coherent
>   * migration pages for GPU SVM migration
> @@ -289,6 +302,7 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
>   * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
>   * @npages: Number of system or device coherent pages to map.
>   * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> + * @state: DMA IOVA state for mapping.
>   *
>   * This function maps pages of memory for migration usage in GPU SVM. It
>   * iterates over each page frame number provided in @migrate_pfn, maps the

Not visible in this diff, but we should update the doc: the return value is
not only 0 or -EFAULT, it can also be any error code returned by dma_iova_link().
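
For readers following along, the shape of the new mapping path is roughly
the sketch below. This is a minimal illustration distilled from the hunks
quoted in this mail, not code from the patch: the helper name
map_pages_iova() is invented, order-0 pages are assumed throughout, and
the HPAGE_PMD_SIZE alignment hint plus the dma_map_page() fallback of the
real code are reduced to comments.

#include <linux/dma-mapping.h>
#include <linux/mm.h>

static int map_pages_iova(struct device *dev, struct dma_iova_state *state,
			  struct page **pages, unsigned long npages,
			  enum dma_data_direction dir)
{
	size_t offset = 0;
	unsigned long i;
	int err;

	/*
	 * Reserve one IOVA range for the whole transfer. The patch
	 * additionally passes an HPAGE_PMD_SIZE-based alignment hint
	 * here, and falls back to per-page dma_map_page() when the
	 * reservation fails; both are elided in this sketch.
	 */
	if (!dma_iova_try_alloc(dev, state, 0, npages * PAGE_SIZE))
		return -ENOMEM;

	/* Link every page into the reserved range; no IOTLB sync yet. */
	for (i = 0; i < npages; i++, offset += PAGE_SIZE) {
		err = dma_iova_link(dev, state, page_to_phys(pages[i]),
				    offset, PAGE_SIZE, dir, 0);
		if (err)
			return err; /* caller unwinds via the unmap path */
	}

	/* A single IOTLB sync instead of one per dma_map_page(). */
	return dma_iova_sync(dev, state, 0, offset);
}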

> @@ -302,9 +316,11 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
>  				     struct drm_pagemap_addr *pagemap_addr,
>  				     unsigned long *migrate_pfn,
>  				     unsigned long npages,
> -				     enum dma_data_direction dir)
> +				     enum dma_data_direction dir,
> +				     struct drm_pagemap_iova_state *state)
>  {
>  	unsigned long i;
> +	bool try_alloc = false;
>  
>  	for (i = 0; i < npages;) {
>  		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> @@ -319,9 +335,31 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
>  		folio = page_folio(page);
>  		order = folio_order(folio);
>  
> -		dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
> -		if (dma_mapping_error(dev, dma_addr))
> -			return -EFAULT;
> +		if (!try_alloc) {
> +			dma_iova_try_alloc(dev, &state->dma_state,
> +					   (npages - i) * PAGE_SIZE >=
> +					   HPAGE_PMD_SIZE ?
> +					   HPAGE_PMD_SIZE : 0,
> +					   npages * PAGE_SIZE);
> +			try_alloc = true;
> +		}
> +
> +		if (dma_use_iova(&state->dma_state)) {
> +			int err = dma_iova_link(dev, &state->dma_state,
> +						page_to_phys(page),
> +						state->offset, page_size(page),
> +						dir, 0);
> +			if (err)
> +				return err;
> +
> +			dma_addr = state->dma_state.addr + state->offset;
> +			state->offset += page_size(page);
> +		} else {
> +			dma_addr = dma_map_page(dev, page, 0, page_size(page),
> +						dir);
> +			if (dma_mapping_error(dev, dma_addr))
> +				return -EFAULT;
> +		}
>  
>  		pagemap_addr[i] =
>  			drm_pagemap_addr_encode(dma_addr,
> @@ -332,6 +370,9 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
>  		i += NR_PAGES(order);
>  	}
>  
> +	if (dma_use_iova(&state->dma_state))
> +		return dma_iova_sync(dev, &state->dma_state, 0, state->offset);
> +
>  	return 0;
>  }
>  
> @@ -343,6 +384,7 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
>   * @pagemap_addr: Array of DMA information corresponding to mapped pages
>   * @npages: Number of pages to unmap
>   * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> + * @state: DMA IOVA state for mapping.
>   *
>   * This function unmaps previously mapped pages of memory for GPU Shared Virtual
>   * Memory (SVM). It iterates over each DMA address provided in @dma_addr, checks

While we are here: s/@dma_addr/@pagemap_addr/

Francois
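
As an aside, the matching teardown (the hunk quoted just below) collapses
to one unlink plus free once dma_use_iova() reports the range was packed
into a single IOVA allocation. A hedged sketch, with an invented helper
name and @mapped standing for the number of bytes linked:

static void unmap_pages_iova(struct device *dev,
			     struct dma_iova_state *state,
			     size_t mapped, enum dma_data_direction dir)
{
	/* One unlink + free for the whole range... */
	if (dma_use_iova(state)) {
		dma_iova_unlink(dev, state, 0, mapped, dir, 0);
		dma_iova_free(dev, state);
		return;
	}
	/* ...else fall back to per-page dma_unmap_page(), as before. */
}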

> @@ -352,10 +394,17 @@ static void drm_pagemap_migrate_unmap_pages(struct device *dev,
>  					    struct drm_pagemap_addr *pagemap_addr,
>  					    unsigned long *migrate_pfn,
>  					    unsigned long npages,
> -					    enum dma_data_direction dir)
> +					    enum dma_data_direction dir,
> +					    struct drm_pagemap_iova_state *state)
>  {
>  	unsigned long i;
>  
> +	if (state && dma_use_iova(&state->dma_state)) {
> +		dma_iova_unlink(dev, &state->dma_state, 0, state->offset, dir, 0);
> +		dma_iova_free(dev, &state->dma_state);
> +		return;
> +	}
> +
>  	for (i = 0; i < npages;) {
>  		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
>  
> @@ -410,7 +459,7 @@ drm_pagemap_migrate_remote_to_local(struct drm_pagemap_devmem *devmem,
>  			       devmem->pre_migrate_fence);
>  out:
>  	drm_pagemap_migrate_unmap_pages(remote_device, pagemap_addr, local_pfns,
> -					npages, DMA_FROM_DEVICE);
> +					npages, DMA_FROM_DEVICE, NULL);
>  	return err;
>  }
>  
> @@ -420,11 +469,13 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
>  			       struct page *local_pages[],
>  			       struct drm_pagemap_addr pagemap_addr[],
>  			       unsigned long npages,
> -			       const struct drm_pagemap_devmem_ops *ops)
> +			       const struct drm_pagemap_devmem_ops *ops,
> +			       struct drm_pagemap_iova_state *state)
>  {
>  	int err = drm_pagemap_migrate_map_system_pages(devmem->dev,
>  						       pagemap_addr, sys_pfns,
> -						       npages, DMA_TO_DEVICE);
> +						       npages, DMA_TO_DEVICE,
> +						       state);
>  
>  	if (err)
>  		goto out;
> @@ -433,7 +484,7 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
>  				  devmem->pre_migrate_fence);
>  out:
>  	drm_pagemap_migrate_unmap_pages(devmem->dev, pagemap_addr, sys_pfns, npages,
> -					DMA_TO_DEVICE);
> +					DMA_TO_DEVICE, state);
>  	return err;
>  }
>  
> @@ -461,6 +512,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
>  				     const struct migrate_range_loc *cur,
>  				     const struct drm_pagemap_migrate_details *mdetails)
>  {
> +	struct drm_pagemap_iova_state state = {};
>  	int ret = 0;
>  
>  	if (cur->start == 0)
> @@ -488,7 +540,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
>  						     &pages[last->start],
>  						     &pagemap_addr[last->start],
>  						     cur->start - last->start,
> -						     last->ops);
> +						     last->ops, &state);
>  
>  out:
>  	*last = *cur;
> @@ -993,6 +1045,7 @@ EXPORT_SYMBOL(drm_pagemap_put);
>  int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
>  {
>  	const struct drm_pagemap_devmem_ops *ops = devmem_allocation->ops;
> +	struct drm_pagemap_iova_state state = {};
>  	unsigned long npages, mpages = 0;
>  	struct page **pages;
>  	unsigned long *src, *dst;
> @@ -1034,7 +1087,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
>  	err = drm_pagemap_migrate_map_system_pages(devmem_allocation->dev,
>  						   pagemap_addr,
>  						   dst, npages,
> -						   DMA_FROM_DEVICE);
> +						   DMA_FROM_DEVICE, &state);
>  	if (err)
>  		goto err_finalize;
>  
> @@ -1051,7 +1104,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
>  	migrate_device_pages(src, dst, npages);
>  	migrate_device_finalize(src, dst, npages);
>  	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, pagemap_addr, dst, npages,
> -					DMA_FROM_DEVICE);
> +					DMA_FROM_DEVICE, &state);
>  
>  err_free:
>  	kvfree(buf);
> @@ -1095,6 +1148,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  		MIGRATE_VMA_SELECT_DEVICE_COHERENT,
>  		.fault_page	= page,
>  	};
> +	struct drm_pagemap_iova_state state = {};
>  	struct drm_pagemap_zdd *zdd;
>  	const struct drm_pagemap_devmem_ops *ops;
>  	struct device *dev = NULL;
> @@ -1154,7 +1208,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  
>  	err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr,
>  						   migrate.dst, npages,
> -						   DMA_FROM_DEVICE);
> +						   DMA_FROM_DEVICE, &state);
>  	if (err)
>  		goto err_finalize;
>  
> @@ -1172,7 +1226,8 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  	migrate_vma_finalize(&migrate);
>  	if (dev)
>  		drm_pagemap_migrate_unmap_pages(dev, pagemap_addr, migrate.dst,
> -						npages, DMA_FROM_DEVICE);
> +						npages, DMA_FROM_DEVICE,
> +						&state);
>  err_free:
>  	kvfree(buf);
>  err_out:
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap
  2026-04-02 15:59   ` Francois Dugast
@ 2026-04-08 16:46     ` Matthew Brost
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Brost @ 2026-04-08 16:46 UTC (permalink / raw)
  To: Francois Dugast
  Cc: intel-xe, dri-devel, leonro, jgg, thomas.hellstrom,
	himal.prasad.ghimiray

On Thu, Apr 02, 2026 at 05:59:21PM +0200, Francois Dugast wrote:
> On Thu, Feb 19, 2026 at 12:10:57PM -0800, Matthew Brost wrote:
> > The dma-map IOVA alloc, link, and sync APIs perform significantly better
> > than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
> > This difference is especially noticeable when mapping a 2MB region in
> > 4KB pages.
> 
> Still a good improvement but with device THP now in drm-tip for GPU SVM,
> the speedup is less noticeable when looking at latency and throughput.
> 

Yes, it is less important with THP, but 64k still gets a speedup, and if
memory gets fragmented and THP allocation fails, we will still get a perf
win.

> > 
> > Use the IOVA alloc, link, and sync APIs for DRM pagemap, which create DMA
> > mappings between the CPU and GPU for copying data.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > 
> > ---
> > v5:
> >  - Remove extra newline (Thomas)
> >  - Adjust alignment calculation (Thomas)
> > ---
> >  drivers/gpu/drm/drm_pagemap.c | 83 +++++++++++++++++++++++++++++------
> >  1 file changed, 69 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> > index ef8b9c69d1d4..d9fceffce347 100644
> > --- a/drivers/gpu/drm/drm_pagemap.c
> > +++ b/drivers/gpu/drm/drm_pagemap.c
> > @@ -281,6 +281,19 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
> >  	return 0;
> >  }
> >  
> > +/**
> > + * struct drm_pagemap_iova_state - DRM pagemap IOVA state
> > + * @dma_state: DMA IOVA state.
> > + * @offset: Current offset in IOVA.
> > + *
> > + * This structure acts as an iterator for packing all IOVA addresses within a
> > + * contiguous range.
> > + */
> > +struct drm_pagemap_iova_state {
> > +	struct dma_iova_state dma_state;
> > +	unsigned long offset;
> > +};
> > +
> >  /**
> >   * drm_pagemap_migrate_map_system_pages() - Map system or device coherent
> >   * migration pages for GPU SVM migration
> > @@ -289,6 +302,7 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
> >   * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
> >   * @npages: Number of system or device coherent pages to map.
> >   * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> > + * @state: DMA IOVA state for mapping.
> >   *
> >   * This function maps pages of memory for migration usage in GPU SVM. It
> >   * iterates over each page frame number provided in @migrate_pfn, maps the
> 
> Not visible in this diff, but we should update the doc: the return value is
> not only 0 or -EFAULT, it can also be any error code returned by dma_iova_link().
> 

Will fix.

> > @@ -302,9 +316,11 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> >  				     struct drm_pagemap_addr *pagemap_addr,
> >  				     unsigned long *migrate_pfn,
> >  				     unsigned long npages,
> > -				     enum dma_data_direction dir)
> > +				     enum dma_data_direction dir,
> > +				     struct drm_pagemap_iova_state *state)
> >  {
> >  	unsigned long i;
> > +	bool try_alloc = false;
> >  
> >  	for (i = 0; i < npages;) {
> >  		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> > @@ -319,9 +335,31 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> >  		folio = page_folio(page);
> >  		order = folio_order(folio);
> >  
> > -		dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
> > -		if (dma_mapping_error(dev, dma_addr))
> > -			return -EFAULT;
> > +		if (!try_alloc) {
> > +			dma_iova_try_alloc(dev, &state->dma_state,
> > +					   (npages - i) * PAGE_SIZE >=
> > +					   HPAGE_PMD_SIZE ?
> > +					   HPAGE_PMD_SIZE : 0,
> > +					   npages * PAGE_SIZE);
> > +			try_alloc = true;
> > +		}
> > +
> > +		if (dma_use_iova(&state->dma_state)) {
> > +			int err = dma_iova_link(dev, &state->dma_state,
> > +						page_to_phys(page),
> > +						state->offset, page_size(page),
> > +						dir, 0);
> > +			if (err)
> > +				return err;
> > +
> > +			dma_addr = state->dma_state.addr + state->offset;
> > +			state->offset += page_size(page);
> > +		} else {
> > +			dma_addr = dma_map_page(dev, page, 0, page_size(page),
> > +						dir);
> > +			if (dma_mapping_error(dev, dma_addr))
> > +				return -EFAULT;
> > +		}
> >  
> >  		pagemap_addr[i] =
> >  			drm_pagemap_addr_encode(dma_addr,
> > @@ -332,6 +370,9 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> >  		i += NR_PAGES(order);
> >  	}
> >  
> > +	if (dma_use_iova(&state->dma_state))
> > +		return dma_iova_sync(dev, &state->dma_state, 0, state->offset);
> > +
> >  	return 0;
> >  }
> >  
> > @@ -343,6 +384,7 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> >   * @pagemap_addr: Array of DMA information corresponding to mapped pages
> >   * @npages: Number of pages to unmap
> >   * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> > + * @state: DMA IOVA state for mapping.
> >   *
> >   * This function unmaps previously mapped pages of memory for GPU Shared Virtual
> >   * Memory (SVM). It iterates over each DMA address provided in @dma_addr, checks
> 
> While we are here: s/@dma_addr/@pagemap_addr/
> 

Will fix.

Matt

> Francois
> 
> > @@ -352,10 +394,17 @@ static void drm_pagemap_migrate_unmap_pages(struct device *dev,
> >  					    struct drm_pagemap_addr *pagemap_addr,
> >  					    unsigned long *migrate_pfn,
> >  					    unsigned long npages,
> > -					    enum dma_data_direction dir)
> > +					    enum dma_data_direction dir,
> > +					    struct drm_pagemap_iova_state *state)
> >  {
> >  	unsigned long i;
> >  
> > +	if (state && dma_use_iova(&state->dma_state)) {
> > +		dma_iova_unlink(dev, &state->dma_state, 0, state->offset, dir, 0);
> > +		dma_iova_free(dev, &state->dma_state);
> > +		return;
> > +	}
> > +
> >  	for (i = 0; i < npages;) {
> >  		struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> >  
> > @@ -410,7 +459,7 @@ drm_pagemap_migrate_remote_to_local(struct drm_pagemap_devmem *devmem,
> >  			       devmem->pre_migrate_fence);
> >  out:
> >  	drm_pagemap_migrate_unmap_pages(remote_device, pagemap_addr, local_pfns,
> > -					npages, DMA_FROM_DEVICE);
> > +					npages, DMA_FROM_DEVICE, NULL);
> >  	return err;
> >  }
> >  
> > @@ -420,11 +469,13 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
> >  			       struct page *local_pages[],
> >  			       struct drm_pagemap_addr pagemap_addr[],
> >  			       unsigned long npages,
> > -			       const struct drm_pagemap_devmem_ops *ops)
> > +			       const struct drm_pagemap_devmem_ops *ops,
> > +			       struct drm_pagemap_iova_state *state)
> >  {
> >  	int err = drm_pagemap_migrate_map_system_pages(devmem->dev,
> >  						       pagemap_addr, sys_pfns,
> > -						       npages, DMA_TO_DEVICE);
> > +						       npages, DMA_TO_DEVICE,
> > +						       state);
> >  
> >  	if (err)
> >  		goto out;
> > @@ -433,7 +484,7 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
> >  				  devmem->pre_migrate_fence);
> >  out:
> >  	drm_pagemap_migrate_unmap_pages(devmem->dev, pagemap_addr, sys_pfns, npages,
> > -					DMA_TO_DEVICE);
> > +					DMA_TO_DEVICE, state);
> >  	return err;
> >  }
> >  
> > @@ -461,6 +512,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
> >  				     const struct migrate_range_loc *cur,
> >  				     const struct drm_pagemap_migrate_details *mdetails)
> >  {
> > +	struct drm_pagemap_iova_state state = {};
> >  	int ret = 0;
> >  
> >  	if (cur->start == 0)
> > @@ -488,7 +540,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
> >  						     &pages[last->start],
> >  						     &pagemap_addr[last->start],
> >  						     cur->start - last->start,
> > -						     last->ops);
> > +						     last->ops, &state);
> >  
> >  out:
> >  	*last = *cur;
> > @@ -993,6 +1045,7 @@ EXPORT_SYMBOL(drm_pagemap_put);
> >  int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> >  {
> >  	const struct drm_pagemap_devmem_ops *ops = devmem_allocation->ops;
> > +	struct drm_pagemap_iova_state state = {};
> >  	unsigned long npages, mpages = 0;
> >  	struct page **pages;
> >  	unsigned long *src, *dst;
> > @@ -1034,7 +1087,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> >  	err = drm_pagemap_migrate_map_system_pages(devmem_allocation->dev,
> >  						   pagemap_addr,
> >  						   dst, npages,
> > -						   DMA_FROM_DEVICE);
> > +						   DMA_FROM_DEVICE, &state);
> >  	if (err)
> >  		goto err_finalize;
> >  
> > @@ -1051,7 +1104,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> >  	migrate_device_pages(src, dst, npages);
> >  	migrate_device_finalize(src, dst, npages);
> >  	drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, pagemap_addr, dst, npages,
> > -					DMA_FROM_DEVICE);
> > +					DMA_FROM_DEVICE, &state);
> >  
> >  err_free:
> >  	kvfree(buf);
> > @@ -1095,6 +1148,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> >  		MIGRATE_VMA_SELECT_DEVICE_COHERENT,
> >  		.fault_page	= page,
> >  	};
> > +	struct drm_pagemap_iova_state state = {};
> >  	struct drm_pagemap_zdd *zdd;
> >  	const struct drm_pagemap_devmem_ops *ops;
> >  	struct device *dev = NULL;
> > @@ -1154,7 +1208,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> >  
> >  	err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr,
> >  						   migrate.dst, npages,
> > -						   DMA_FROM_DEVICE);
> > +						   DMA_FROM_DEVICE, &state);
> >  	if (err)
> >  		goto err_finalize;
> >  
> > @@ -1172,7 +1226,8 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> >  	migrate_vma_finalize(&migrate);
> >  	if (dev)
> >  		drm_pagemap_migrate_unmap_pages(dev, pagemap_addr, migrate.dst,
> > -						npages, DMA_FROM_DEVICE);
> > +						npages, DMA_FROM_DEVICE,
> > +						&state);
> >  err_free:
> >  	kvfree(buf);
> >  err_out:
> > -- 
> > 2.34.1
> > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-04-08 16:46 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-19 20:10 [PATCH v5 0/5] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
2026-02-19 20:10 ` [PATCH v5 1/5] drm/pagemap: Add helper to access zone_device_data Matthew Brost
2026-02-19 20:10 ` [PATCH v5 2/5] drm/gpusvm: Use dma-map IOVA alloc, link, and sync API in GPU SVM Matthew Brost
2026-02-19 20:10 ` [PATCH v5 3/5] drm/pagemap: Drop source_peer_migrates flag and assume true Matthew Brost
2026-02-19 20:53   ` Matthew Brost
2026-04-02 10:33     ` Francois Dugast
2026-02-19 20:10 ` [PATCH v5 4/5] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system Matthew Brost
2026-04-02 14:12   ` Francois Dugast
2026-02-19 20:10 ` [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap Matthew Brost
2026-04-02 15:59   ` Francois Dugast
2026-04-08 16:46     ` Matthew Brost

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox