* [PATCH 0/7] Replace xe_hmm with gpusvm
@ 2025-03-20 17:29 Matthew Auld
2025-03-20 17:29 ` [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
` (6 more replies)
0 siblings, 7 replies; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:29 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel
As a first step towards moving userptr handling over to drm, replace the hmm
usage in xe with gpusvm, which already offers similar functionality. As prep
we also fill in some missing pieces in gpusvm that were already handled in
xe_hmm.
--
2.48.1
* [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
@ 2025-03-20 17:29 ` Matthew Auld
2025-03-20 19:52 ` Thomas Hellström
2025-03-20 20:40 ` Matthew Brost
2025-03-20 17:29 ` [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages() Matthew Auld
` (5 subsequent siblings)
6 siblings, 2 replies; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:29 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Thomas Hellström, Matthew Brost
Handle the case where the hmm range only partially covers a huge page (like
2M), otherwise we can end up doing something nasty like mapping memory which
is outside the range, and which may not even be mapped by the mm. The fix is
based on the xe userptr code, which in a future patch will use gpusvm
directly, so the two need to align here.
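A rough worked example with made-up numbers: the loop index i is 0, npages is
16, and the pfn sits at the start of a 2M huge page, so hmm_pfn_to_map_order()
returns 9. The new helper then clamps as follows:
	size = 1UL << 9;		/* 512 pages covered by the huge PTE */
	size -= 0;			/* pfn is huge-page aligned, nothing to trim before it */
	size -= (0 + 512) - 16;		/* clamp to the 16 pages left in the range */
	order = fls(16) - 1;		/* = 4, i.e. advance/map 16 pages, not 512 */
Without the clamp the caller would step the loop by 512 pages and read (and
potentially dma-map) pfns well past the end of the array.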
Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 2451c816edd5..48993cef4a74 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -817,6 +817,27 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
return range;
}
+/*
+ * To allow skipping PFNs with the same flags (like when they belong to
+ * the same huge PTE) when looping over the pfn array, take a given hmm_pfn,
+ * and return the largest order that will fit inside the PTE, but also crucially
+ * accounting for the original hmm range boundaries.
+ */
+static unsigned int drm_gpusvm_hmm_pfn_to_order(unsigned long hmm_pfn,
+ unsigned long hmm_pfn_index,
+ unsigned long npages)
+{
+ unsigned long size;
+
+ size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
+ size -= (hmm_pfn & ~HMM_PFN_FLAGS) & (size - 1);
+ hmm_pfn_index += size;
+ if (hmm_pfn_index > npages)
+ size -= (hmm_pfn_index - npages);
+
+ return fls(size) - 1;
+}
+
/**
* drm_gpusvm_check_pages() - Check pages
* @gpusvm: Pointer to the GPU SVM structure
@@ -875,7 +896,7 @@ static bool drm_gpusvm_check_pages(struct drm_gpusvm *gpusvm,
err = -EFAULT;
goto err_free;
}
- i += 0x1 << hmm_pfn_to_map_order(pfns[i]);
+ i += 0x1 << drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
}
err_free:
@@ -1408,7 +1429,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
for (i = 0, j = 0; i < npages; ++j) {
struct page *page = hmm_pfn_to_page(pfns[i]);
- order = hmm_pfn_to_map_order(pfns[i]);
+ order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
if (is_device_private_page(page) ||
is_device_coherent_page(page)) {
if (zdd != page->zone_device_data && i > 0) {
--
2.48.1
* [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages()
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
2025-03-20 17:29 ` [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
@ 2025-03-20 17:29 ` Matthew Auld
2025-03-20 19:31 ` Thomas Hellström
2025-03-20 19:37 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 3/7] drm/gpusvm: mark pages as dirty Matthew Auld
` (4 subsequent siblings)
6 siblings, 2 replies; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:29 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Thomas Hellström, Matthew Brost
If we are only reading the memory then, from the device's point of view, the
direction can be DMA_TO_DEVICE. This aligns with the xe userptr code. Using
the most restrictive data direction that still describes the access is
normally a good idea.
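As a minimal usage sketch (names as in this series), a caller that only needs
device reads would set:
	struct drm_gpusvm_ctx ctx = {
		.read_only = true,	/* device will only read this memory */
	};
	/* get_pages() now dma-maps with DMA_TO_DEVICE instead of DMA_BIDIRECTIONAL */
	err = drm_gpusvm_range_get_pages(gpusvm, range, &ctx);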
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 48993cef4a74..7f1cf5492bba 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1355,6 +1355,8 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
int err = 0;
struct dev_pagemap *pagemap;
struct drm_pagemap *dpagemap;
+ enum dma_data_direction dma_dir = ctx->read_only ? DMA_TO_DEVICE :
+ DMA_BIDIRECTIONAL;
retry:
hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
@@ -1459,7 +1461,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
dpagemap->ops->device_map(dpagemap,
gpusvm->drm->dev,
page, order,
- DMA_BIDIRECTIONAL);
+ dma_dir);
if (dma_mapping_error(gpusvm->drm->dev,
range->dma_addr[j].addr)) {
err = -EFAULT;
@@ -1478,7 +1480,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
addr = dma_map_page(gpusvm->drm->dev,
page, 0,
PAGE_SIZE << order,
- DMA_BIDIRECTIONAL);
+ dma_dir);
if (dma_mapping_error(gpusvm->drm->dev, addr)) {
err = -EFAULT;
goto err_unmap;
@@ -1486,7 +1488,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
range->dma_addr[j] = drm_pagemap_device_addr_encode
(addr, DRM_INTERCONNECT_SYSTEM, order,
- DMA_BIDIRECTIONAL);
+ dma_dir);
}
i += 1 << order;
num_dma_mapped = i;
--
2.48.1
* [PATCH 3/7] drm/gpusvm: mark pages as dirty
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
2025-03-20 17:29 ` [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
2025-03-20 17:29 ` [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages() Matthew Auld
@ 2025-03-20 17:30 ` Matthew Auld
2025-03-20 19:29 ` Thomas Hellström
2025-03-20 17:30 ` [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure Matthew Auld
` (3 subsequent siblings)
6 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:30 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Thomas Hellström, Matthew Brost
If the memory is going to be accessed by the device, make sure we mark the
pages accordingly so that core mm knows about it. This aligns with the xe
userptr code.
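The per-page loop added below boils down to the following (comments mine),
mirroring what xe_mark_range_accessed() did in xe_hmm.c:
	for (k = 0; k < 1u << order; k++) {
		if (!ctx->read_only)
			/* don't let writeback/eviction lose device writes */
			set_page_dirty_lock(page);
		/* keep reclaim's LRU heuristics informed */
		mark_page_accessed(page);
		page++;
	}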
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 7f1cf5492bba..5b4ecd36dff1 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1471,6 +1471,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
pages[i] = page;
} else {
dma_addr_t addr;
+ unsigned int k;
if (is_zone_device_page(page) || zdd) {
err = -EOPNOTSUPP;
@@ -1489,6 +1490,14 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
range->dma_addr[j] = drm_pagemap_device_addr_encode
(addr, DRM_INTERCONNECT_SYSTEM, order,
dma_dir);
+
+ for (k = 0; k < 1u << order; k++) {
+ if (!ctx->read_only)
+ set_page_dirty_lock(page);
+
+ mark_page_accessed(page);
+ page++;
+ }
}
i += 1 << order;
num_dma_mapped = i;
--
2.48.1
* [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
` (2 preceding siblings ...)
2025-03-20 17:30 ` [PATCH 3/7] drm/gpusvm: mark pages as dirty Matthew Auld
@ 2025-03-20 17:30 ` Matthew Auld
2025-03-20 20:43 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 5/7] drm/gpusvm: lower get/unmap pages Matthew Auld
` (2 subsequent siblings)
6 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:30 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Matthew Brost, Thomas Hellström
Pull the pages state out of the svm range into its own substructure, with the
idea of having the main pages-related routines, like get_pages(),
unmap_pages() and free_pages(), operate on this lower-level structure. That
structure can then be re-used for things like userptr, which wants the basic
page handling such as get_pages(), but not all the other more complex svm
machinery.
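For reference, a sketch of where this is heading (the userptr side is
hypothetical here and only lands later in the series): an embedder only needs
the pages piece plus a notifier, not the whole interval-tree range:
	/* sketch only: a userptr-style embedder */
	struct my_userptr_mapping {
		struct drm_gpusvm_pages pages;		/* dma_addr array + mapping flags */
		struct mmu_interval_notifier notifier;	/* one notifier covering the range */
		unsigned long notifier_seq;
	};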
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 95 +++++++++++++++++++-----------------
drivers/gpu/drm/xe/xe_pt.c | 2 +-
drivers/gpu/drm/xe/xe_svm.c | 6 +--
drivers/gpu/drm/xe/xe_svm.h | 2 +-
include/drm/drm_gpusvm.h | 43 +++++++++-------
5 files changed, 82 insertions(+), 66 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 5b4ecd36dff1..f27731a51f34 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -812,7 +812,7 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
range->itree.last = ALIGN(fault_addr + 1, chunk_size) - 1;
INIT_LIST_HEAD(&range->entry);
range->notifier_seq = LONG_MAX;
- range->flags.migrate_devmem = migrate_devmem ? 1 : 0;
+ range->pages.flags.migrate_devmem = migrate_devmem ? 1 : 0;
return range;
}
@@ -1120,27 +1120,27 @@ drm_gpusvm_range_find_or_insert(struct drm_gpusvm *gpusvm,
EXPORT_SYMBOL_GPL(drm_gpusvm_range_find_or_insert);
/**
- * __drm_gpusvm_range_unmap_pages() - Unmap pages associated with a GPU SVM range (internal)
+ * __drm_gpusvm_unmap_pages() - Unmap pages associated with a GPU SVM range (internal)
* @gpusvm: Pointer to the GPU SVM structure
- * @range: Pointer to the GPU SVM range structure
+ * @svm_pages: Pointer to the GPU SVM pages structure
* @npages: Number of pages to unmap
*
* This function unmap pages associated with a GPU SVM range. Assumes and
* asserts correct locking is in place when called.
*/
-static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
- struct drm_gpusvm_range *range,
- unsigned long npages)
+static void __drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
+ struct drm_gpusvm_pages *svm_pages,
+ unsigned long npages)
{
unsigned long i, j;
- struct drm_pagemap *dpagemap = range->dpagemap;
+ struct drm_pagemap *dpagemap = svm_pages->dpagemap;
struct device *dev = gpusvm->drm->dev;
lockdep_assert_held(&gpusvm->notifier_lock);
- if (range->flags.has_dma_mapping) {
+ if (svm_pages->flags.has_dma_mapping) {
for (i = 0, j = 0; i < npages; j++) {
- struct drm_pagemap_device_addr *addr = &range->dma_addr[j];
+ struct drm_pagemap_device_addr *addr = &svm_pages->dma_addr[j];
if (addr->proto == DRM_INTERCONNECT_SYSTEM)
dma_unmap_page(dev,
@@ -1152,9 +1152,9 @@ static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
dev, *addr);
i += 1 << addr->order;
}
- range->flags.has_devmem_pages = false;
- range->flags.has_dma_mapping = false;
- range->dpagemap = NULL;
+ svm_pages->flags.has_devmem_pages = false;
+ svm_pages->flags.has_dma_mapping = false;
+ svm_pages->dpagemap = NULL;
}
}
@@ -1165,14 +1165,14 @@ static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
*
* This function frees the dma address array associated with a GPU SVM range.
*/
-static void drm_gpusvm_range_free_pages(struct drm_gpusvm *gpusvm,
- struct drm_gpusvm_range *range)
+static void drm_gpusvm_free_pages(struct drm_gpusvm *gpusvm,
+ struct drm_gpusvm_pages *svm_pages)
{
lockdep_assert_held(&gpusvm->notifier_lock);
- if (range->dma_addr) {
- kvfree(range->dma_addr);
- range->dma_addr = NULL;
+ if (svm_pages->dma_addr) {
+ kvfree(svm_pages->dma_addr);
+ svm_pages->dma_addr = NULL;
}
}
@@ -1200,8 +1200,8 @@ void drm_gpusvm_range_remove(struct drm_gpusvm *gpusvm,
return;
drm_gpusvm_notifier_lock(gpusvm);
- __drm_gpusvm_range_unmap_pages(gpusvm, range, npages);
- drm_gpusvm_range_free_pages(gpusvm, range);
+ __drm_gpusvm_unmap_pages(gpusvm, &range->pages, npages);
+ drm_gpusvm_free_pages(gpusvm, &range->pages);
__drm_gpusvm_range_remove(notifier, range);
drm_gpusvm_notifier_unlock(gpusvm);
@@ -1266,6 +1266,14 @@ void drm_gpusvm_range_put(struct drm_gpusvm_range *range)
}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_put);
+static bool drm_gpusvm_pages_valid(struct drm_gpusvm *gpusvm,
+ struct drm_gpusvm_pages *svm_pages)
+{
+ lockdep_assert_held(&gpusvm->notifier_lock);
+
+ return svm_pages->flags.has_devmem_pages || svm_pages->flags.has_dma_mapping;
+}
+
/**
* drm_gpusvm_range_pages_valid() - GPU SVM range pages valid
* @gpusvm: Pointer to the GPU SVM structure
@@ -1283,9 +1291,7 @@ EXPORT_SYMBOL_GPL(drm_gpusvm_range_put);
bool drm_gpusvm_range_pages_valid(struct drm_gpusvm *gpusvm,
struct drm_gpusvm_range *range)
{
- lockdep_assert_held(&gpusvm->notifier_lock);
-
- return range->flags.has_devmem_pages || range->flags.has_dma_mapping;
+ return drm_gpusvm_pages_valid(gpusvm, &range->pages);
}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
@@ -1301,17 +1307,17 @@ EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
*/
static bool
drm_gpusvm_range_pages_valid_unlocked(struct drm_gpusvm *gpusvm,
- struct drm_gpusvm_range *range)
+ struct drm_gpusvm_pages *svm_pages)
{
bool pages_valid;
- if (!range->dma_addr)
+ if (!svm_pages->dma_addr)
return false;
drm_gpusvm_notifier_lock(gpusvm);
- pages_valid = drm_gpusvm_range_pages_valid(gpusvm, range);
+ pages_valid = drm_gpusvm_pages_valid(gpusvm, svm_pages);
if (!pages_valid)
- drm_gpusvm_range_free_pages(gpusvm, range);
+ drm_gpusvm_free_pages(gpusvm, svm_pages);
drm_gpusvm_notifier_unlock(gpusvm);
return pages_valid;
@@ -1332,6 +1338,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
struct drm_gpusvm_range *range,
const struct drm_gpusvm_ctx *ctx)
{
+ struct drm_gpusvm_pages *svm_pages = &range->pages;
struct mmu_interval_notifier *notifier = &range->notifier->notifier;
struct hmm_range hmm_range = {
.default_flags = HMM_PFN_REQ_FAULT | (ctx->read_only ? 0 :
@@ -1360,7 +1367,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
retry:
hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
- if (drm_gpusvm_range_pages_valid_unlocked(gpusvm, range))
+ if (drm_gpusvm_range_pages_valid_unlocked(gpusvm, svm_pages))
goto set_seqno;
pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
@@ -1401,7 +1408,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
*/
drm_gpusvm_notifier_lock(gpusvm);
- if (range->flags.unmapped) {
+ if (svm_pages->flags.unmapped) {
drm_gpusvm_notifier_unlock(gpusvm);
err = -EFAULT;
goto err_free;
@@ -1413,13 +1420,12 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
goto retry;
}
- if (!range->dma_addr) {
+ if (!svm_pages->dma_addr) {
/* Unlock and restart mapping to allocate memory. */
drm_gpusvm_notifier_unlock(gpusvm);
- range->dma_addr = kvmalloc_array(npages,
- sizeof(*range->dma_addr),
- GFP_KERNEL);
- if (!range->dma_addr) {
+ svm_pages->dma_addr =
+ kvmalloc_array(npages, sizeof(*svm_pages->dma_addr), GFP_KERNEL);
+ if (!svm_pages->dma_addr) {
err = -ENOMEM;
goto err_free;
}
@@ -1457,13 +1463,13 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
goto err_unmap;
}
}
- range->dma_addr[j] =
+ svm_pages->dma_addr[j] =
dpagemap->ops->device_map(dpagemap,
gpusvm->drm->dev,
page, order,
dma_dir);
if (dma_mapping_error(gpusvm->drm->dev,
- range->dma_addr[j].addr)) {
+ svm_pages->dma_addr[j].addr)) {
err = -EFAULT;
goto err_unmap;
}
@@ -1487,7 +1493,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
goto err_unmap;
}
- range->dma_addr[j] = drm_pagemap_device_addr_encode
+ svm_pages->dma_addr[j] = drm_pagemap_device_addr_encode
(addr, DRM_INTERCONNECT_SYSTEM, order,
dma_dir);
@@ -1503,10 +1509,10 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
num_dma_mapped = i;
}
- range->flags.has_dma_mapping = true;
+ svm_pages->flags.has_dma_mapping = true;
if (zdd) {
- range->flags.has_devmem_pages = true;
- range->dpagemap = dpagemap;
+ svm_pages->flags.has_devmem_pages = true;
+ svm_pages->dpagemap = dpagemap;
}
drm_gpusvm_notifier_unlock(gpusvm);
@@ -1517,7 +1523,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
return 0;
err_unmap:
- __drm_gpusvm_range_unmap_pages(gpusvm, range, num_dma_mapped);
+ __drm_gpusvm_unmap_pages(gpusvm, svm_pages, num_dma_mapped);
drm_gpusvm_notifier_unlock(gpusvm);
err_free:
kvfree(pfns);
@@ -1543,6 +1549,7 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
struct drm_gpusvm_range *range,
const struct drm_gpusvm_ctx *ctx)
{
+ struct drm_gpusvm_pages *svm_pages = &range->pages;
unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
drm_gpusvm_range_end(range));
@@ -1551,7 +1558,7 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
else
drm_gpusvm_notifier_lock(gpusvm);
- __drm_gpusvm_range_unmap_pages(gpusvm, range, npages);
+ __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
if (!ctx->in_notifier)
drm_gpusvm_notifier_unlock(gpusvm);
@@ -1719,7 +1726,7 @@ int drm_gpusvm_migrate_to_devmem(struct drm_gpusvm *gpusvm,
mmap_assert_locked(gpusvm->mm);
- if (!range->flags.migrate_devmem)
+ if (!range->pages.flags.migrate_devmem)
return -EINVAL;
if (!ops->populate_devmem_pfn || !ops->copy_to_devmem ||
@@ -2248,10 +2255,10 @@ void drm_gpusvm_range_set_unmapped(struct drm_gpusvm_range *range,
{
lockdep_assert_held_write(&range->gpusvm->notifier_lock);
- range->flags.unmapped = true;
+ range->pages.flags.unmapped = true;
if (drm_gpusvm_range_start(range) < mmu_range->start ||
drm_gpusvm_range_end(range) > mmu_range->end)
- range->flags.partial_unmap = true;
+ range->pages.flags.partial_unmap = true;
}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_set_unmapped);
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index ffaf0d02dc7d..c43e7619cb80 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -659,7 +659,7 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
return -EAGAIN;
}
if (xe_svm_range_has_dma_mapping(range)) {
- xe_res_first_dma(range->base.dma_addr, 0,
+ xe_res_first_dma(range->base.pages.dma_addr, 0,
range->base.itree.last + 1 - range->base.itree.start,
&curs);
is_devmem = xe_res_is_vram(&curs);
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index 08617a62ab07..4e7f2e77b38d 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -17,7 +17,7 @@
static bool xe_svm_range_in_vram(struct xe_svm_range *range)
{
/* Not reliable without notifier lock */
- return range->base.flags.has_devmem_pages;
+ return range->base.pages.flags.has_devmem_pages;
}
static bool xe_svm_range_has_vram_binding(struct xe_svm_range *range)
@@ -135,7 +135,7 @@ xe_svm_range_notifier_event_begin(struct xe_vm *vm, struct drm_gpusvm_range *r,
range_debug(range, "NOTIFIER");
/* Skip if already unmapped or if no binding exist */
- if (range->base.flags.unmapped || !range->tile_present)
+ if (range->base.pages.flags.unmapped || !range->tile_present)
return 0;
range_debug(range, "NOTIFIER - EXECUTE");
@@ -766,7 +766,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
range_debug(range, "PAGE FAULT");
/* XXX: Add migration policy, for now migrate range once */
- if (!range->skip_migrate && range->base.flags.migrate_devmem &&
+ if (!range->skip_migrate && range->base.pages.flags.migrate_devmem &&
xe_svm_range_size(range) >= SZ_64K) {
range->skip_migrate = true;
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
index 93442738666e..79fbd4fec1fb 100644
--- a/drivers/gpu/drm/xe/xe_svm.h
+++ b/drivers/gpu/drm/xe/xe_svm.h
@@ -136,7 +136,7 @@ void xe_svm_range_debug(struct xe_svm_range *range, const char *operation)
static inline bool xe_svm_range_has_dma_mapping(struct xe_svm_range *range)
{
lockdep_assert_held(&range->base.gpusvm->notifier_lock);
- return range->base.flags.has_dma_mapping;
+ return range->base.pages.flags.has_dma_mapping;
}
#define xe_svm_assert_in_notifier(vm__) \
diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
index df120b4d1f83..c15c733ef0ad 100644
--- a/include/drm/drm_gpusvm.h
+++ b/include/drm/drm_gpusvm.h
@@ -186,14 +186,8 @@ struct drm_gpusvm_notifier {
};
/**
- * struct drm_gpusvm_range - Structure representing a GPU SVM range
+ * struct drm_gpusvm_pages - Structure representing a GPU SVM mapped pages
*
- * @gpusvm: Pointer to the GPU SVM structure
- * @notifier: Pointer to the GPU SVM notifier
- * @refcount: Reference count for the range
- * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
- * @entry: List entry to fast interval tree traversal
- * @notifier_seq: Notifier sequence number of the range's pages
* @dma_addr: Device address array
* @dpagemap: The struct drm_pagemap of the device pages we're dma-mapping.
* Note this is assuming only one drm_pagemap per range is allowed.
@@ -203,17 +197,8 @@ struct drm_gpusvm_notifier {
* @flags.partial_unmap: Flag indicating if the range has been partially unmapped
* @flags.has_devmem_pages: Flag indicating if the range has devmem pages
* @flags.has_dma_mapping: Flag indicating if the range has a DMA mapping
- *
- * This structure represents a GPU SVM range used for tracking memory ranges
- * mapped in a DRM device.
*/
-struct drm_gpusvm_range {
- struct drm_gpusvm *gpusvm;
- struct drm_gpusvm_notifier *notifier;
- struct kref refcount;
- struct interval_tree_node itree;
- struct list_head entry;
- unsigned long notifier_seq;
+struct drm_gpusvm_pages {
struct drm_pagemap_device_addr *dma_addr;
struct drm_pagemap *dpagemap;
struct {
@@ -227,6 +212,30 @@ struct drm_gpusvm_range {
} flags;
};
+/**
+ * struct drm_gpusvm_range - Structure representing a GPU SVM range
+ *
+ * @gpusvm: Pointer to the GPU SVM structure
+ * @notifier: Pointer to the GPU SVM notifier
+ * @refcount: Reference count for the range
+ * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
+ * @entry: List entry to fast interval tree traversal
+ * @notifier_seq: Notifier sequence number of the range's pages
+ * @pages: The pages for this range.
+ *
+ * This structure represents a GPU SVM range used for tracking memory ranges
+ * mapped in a DRM device.
+ */
+struct drm_gpusvm_range {
+ struct drm_gpusvm *gpusvm;
+ struct drm_gpusvm_notifier *notifier;
+ struct kref refcount;
+ struct interval_tree_node itree;
+ struct list_head entry;
+ unsigned long notifier_seq;
+ struct drm_gpusvm_pages pages;
+};
+
/**
* struct drm_gpusvm - GPU SVM structure
*
--
2.48.1
* [PATCH 5/7] drm/gpusvm: lower get/unmap pages
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
` (3 preceding siblings ...)
2025-03-20 17:30 ` [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure Matthew Auld
@ 2025-03-20 17:30 ` Matthew Auld
2025-03-20 20:59 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 6/7] drm/gpusvm: support basic_range Matthew Auld
2025-03-20 17:30 ` [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm Matthew Auld
6 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:30 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Thomas Hellström, Matthew Brost
Lower the get/unmap pages helpers so they operate on the lowest-level pieces,
without needing a full drm_gpusvm_range structure. In the next patch we want
to expose get/unmap/free operating on a different range type.
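A sketch of what the lowering buys us: anything that can supply its own mm,
notifier, sequence number and [start, end) can reuse the same path (these
helpers are still static here; the next patch adds exported wrappers):
	err = drm_gpusvm_get_pages(gpusvm, &my_pages, my_mm, &my_notifier,
				   &my_seq, start, end, &ctx);
	...
	drm_gpusvm_unmap_pages(gpusvm, start, end, &my_pages, &ctx);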
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 90 ++++++++++++++++++++++--------------
1 file changed, 55 insertions(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index f27731a51f34..2beca5a6dc0a 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1323,38 +1323,28 @@ drm_gpusvm_range_pages_valid_unlocked(struct drm_gpusvm *gpusvm,
return pages_valid;
}
-/**
- * drm_gpusvm_range_get_pages() - Get pages for a GPU SVM range
- * @gpusvm: Pointer to the GPU SVM structure
- * @range: Pointer to the GPU SVM range structure
- * @ctx: GPU SVM context
- *
- * This function gets pages for a GPU SVM range and ensures they are mapped for
- * DMA access.
- *
- * Return: 0 on success, negative error code on failure.
- */
-int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
- struct drm_gpusvm_range *range,
- const struct drm_gpusvm_ctx *ctx)
+static int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
+ struct drm_gpusvm_pages *svm_pages,
+ struct mm_struct *mm,
+ struct mmu_interval_notifier *notifier,
+ unsigned long *notifier_seq,
+ unsigned long mm_start,
+ unsigned long mm_end,
+ const struct drm_gpusvm_ctx *ctx)
{
- struct drm_gpusvm_pages *svm_pages = &range->pages;
- struct mmu_interval_notifier *notifier = &range->notifier->notifier;
struct hmm_range hmm_range = {
.default_flags = HMM_PFN_REQ_FAULT | (ctx->read_only ? 0 :
HMM_PFN_REQ_WRITE),
.notifier = notifier,
- .start = drm_gpusvm_range_start(range),
- .end = drm_gpusvm_range_end(range),
+ .start = mm_start,
+ .end = mm_end,
.dev_private_owner = gpusvm->device_private_page_owner,
};
- struct mm_struct *mm = gpusvm->mm;
struct drm_gpusvm_zdd *zdd;
unsigned long timeout =
jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
unsigned long i, j;
- unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
- drm_gpusvm_range_end(range));
+ unsigned long npages = npages_in_range(mm_start, mm_end);
unsigned long num_dma_mapped;
unsigned int order = 0;
unsigned long *pfns;
@@ -1518,7 +1508,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
drm_gpusvm_notifier_unlock(gpusvm);
kvfree(pfns);
set_seqno:
- range->notifier_seq = hmm_range.notifier_seq;
+ *notifier_seq = hmm_range.notifier_seq;
return 0;
@@ -1531,8 +1521,48 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
goto retry;
return err;
}
+
+/**
+ * drm_gpusvm_range_get_pages() - Get pages for a GPU SVM range
+ * @gpusvm: Pointer to the GPU SVM structure
+ * @range: Pointer to the GPU SVM range structure
+ * @ctx: GPU SVM context
+ *
+ * This function gets pages for a GPU SVM range and ensures they are mapped for
+ * DMA access.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
+ struct drm_gpusvm_range *range,
+ const struct drm_gpusvm_ctx *ctx)
+{
+ return drm_gpusvm_get_pages(gpusvm, &range->pages, gpusvm->mm,
+ &range->notifier->notifier,
+ &range->notifier_seq,
+ drm_gpusvm_range_start(range),
+ drm_gpusvm_range_end(range), ctx);
+}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_get_pages);
+static void drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
+ unsigned long mm_start, unsigned long mm_end,
+ struct drm_gpusvm_pages *svm_pages,
+ const struct drm_gpusvm_ctx *ctx)
+{
+ unsigned long npages = npages_in_range(mm_start, mm_end);
+
+ if (ctx->in_notifier)
+ lockdep_assert_held_write(&gpusvm->notifier_lock);
+ else
+ drm_gpusvm_notifier_lock(gpusvm);
+
+ __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
+
+ if (!ctx->in_notifier)
+ drm_gpusvm_notifier_unlock(gpusvm);
+}
+
/**
* drm_gpusvm_range_unmap_pages() - Unmap pages associated with a GPU SVM range
* @gpusvm: Pointer to the GPU SVM structure
@@ -1549,19 +1579,9 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
struct drm_gpusvm_range *range,
const struct drm_gpusvm_ctx *ctx)
{
- struct drm_gpusvm_pages *svm_pages = &range->pages;
- unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
- drm_gpusvm_range_end(range));
-
- if (ctx->in_notifier)
- lockdep_assert_held_write(&gpusvm->notifier_lock);
- else
- drm_gpusvm_notifier_lock(gpusvm);
-
- __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
-
- if (!ctx->in_notifier)
- drm_gpusvm_notifier_unlock(gpusvm);
+ return drm_gpusvm_unmap_pages(gpusvm, drm_gpusvm_range_start(range),
+ drm_gpusvm_range_end(range),
+ &range->pages, ctx);
}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_unmap_pages);
--
2.48.1
* [PATCH 6/7] drm/gpusvm: support basic_range
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
` (4 preceding siblings ...)
2025-03-20 17:30 ` [PATCH 5/7] drm/gpusvm: lower get/unmap pages Matthew Auld
@ 2025-03-20 17:30 ` Matthew Auld
2025-03-20 21:04 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm Matthew Auld
6 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:30 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Thomas Hellström, Matthew Brost
The idea is to use this for userptr, where we mostly just need get/unmap/free
pages, plus some kind of common notifier lock at the svm level (provided by
gpusvm). The range itself also maps to a single notifier, covering the entire
range (provided by the user).
TODO: needs proper kernel-doc, assuming this change makes sense.
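A sketch of the intended lifecycle with the new API (names as declared in this
patch; error handling elided):
	drm_gpusvm_basic_init(&gpusvm, "xe-vm-userptr", drm);
	drm_gpusvm_driver_set_lock(&gpusvm, &driver_lock);
	drm_gpusvm_basic_range_init(&gpusvm, &range, &notifier, &notifier_seq);
	err = drm_gpusvm_basic_range_get_pages(&range, &ctx);
	/* ... program GPU page tables from range.pages.dma_addr ... */
	/* on notifier invalidation: */
	drm_gpusvm_basic_range_unmap_pages(&range, &ctx);
	/* on final teardown: */
	drm_gpusvm_basic_range_free_pages(&range);
	drm_gpusvm_basic_range_fini(&range);
	drm_gpusvm_basic_fini(&gpusvm);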
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 138 +++++++++++++++++++++++++++++------
include/drm/drm_gpusvm.h | 23 ++++++
2 files changed, 137 insertions(+), 24 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 2beca5a6dc0a..ca571610214c 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -521,6 +521,41 @@ static const struct mmu_interval_notifier_ops drm_gpusvm_notifier_ops = {
.invalidate = drm_gpusvm_notifier_invalidate,
};
+static void __drm_gpusvm_init(struct drm_gpusvm *gpusvm, const char *name,
+ struct drm_device *drm, struct mm_struct *mm,
+ void *device_private_page_owner,
+ unsigned long mm_start, unsigned long mm_range,
+ unsigned long notifier_size,
+ const struct drm_gpusvm_ops *ops,
+ const unsigned long *chunk_sizes, int num_chunks)
+{
+ gpusvm->name = name;
+ gpusvm->drm = drm;
+ gpusvm->mm = mm;
+ gpusvm->device_private_page_owner = device_private_page_owner;
+ gpusvm->mm_start = mm_start;
+ gpusvm->mm_range = mm_range;
+ gpusvm->notifier_size = notifier_size;
+ gpusvm->ops = ops;
+ gpusvm->chunk_sizes = chunk_sizes;
+ gpusvm->num_chunks = num_chunks;
+
+ if (mm)
+ mmgrab(mm);
+ gpusvm->root = RB_ROOT_CACHED;
+ INIT_LIST_HEAD(&gpusvm->notifier_list);
+
+ init_rwsem(&gpusvm->notifier_lock);
+
+ fs_reclaim_acquire(GFP_KERNEL);
+ might_lock(&gpusvm->notifier_lock);
+ fs_reclaim_release(GFP_KERNEL);
+
+#ifdef CONFIG_LOCKDEP
+ gpusvm->lock_dep_map = NULL;
+#endif
+}
+
/**
* drm_gpusvm_init() - Initialize the GPU SVM.
* @gpusvm: Pointer to the GPU SVM structure.
@@ -552,35 +587,32 @@ int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
if (!ops->invalidate || !num_chunks)
return -EINVAL;
- gpusvm->name = name;
- gpusvm->drm = drm;
- gpusvm->mm = mm;
- gpusvm->device_private_page_owner = device_private_page_owner;
- gpusvm->mm_start = mm_start;
- gpusvm->mm_range = mm_range;
- gpusvm->notifier_size = notifier_size;
- gpusvm->ops = ops;
- gpusvm->chunk_sizes = chunk_sizes;
- gpusvm->num_chunks = num_chunks;
-
- mmgrab(mm);
- gpusvm->root = RB_ROOT_CACHED;
- INIT_LIST_HEAD(&gpusvm->notifier_list);
-
- init_rwsem(&gpusvm->notifier_lock);
-
- fs_reclaim_acquire(GFP_KERNEL);
- might_lock(&gpusvm->notifier_lock);
- fs_reclaim_release(GFP_KERNEL);
-
-#ifdef CONFIG_LOCKDEP
- gpusvm->lock_dep_map = NULL;
-#endif
+ __drm_gpusvm_init(gpusvm, name, drm, mm, device_private_page_owner,
+ mm_start, mm_range, notifier_size, ops, chunk_sizes,
+ num_chunks);
return 0;
}
EXPORT_SYMBOL_GPL(drm_gpusvm_init);
+static bool drm_gpusvm_is_basic(struct drm_gpusvm *svm)
+{
+ return !svm->mm;
+}
+
+void drm_gpusvm_basic_init(struct drm_gpusvm *gpusvm, const char *name,
+ struct drm_device *drm)
+{
+ __drm_gpusvm_init(gpusvm, name, drm, NULL, NULL, 0, 0, 0, NULL, NULL,
+ 0);
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_init);
+
+void drm_gpusvm_basic_fini(struct drm_gpusvm *gpusvm)
+{
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_fini);
+
/**
* drm_gpusvm_notifier_find() - Find GPU SVM notifier
* @gpusvm: Pointer to the GPU SVM structure
@@ -1001,6 +1033,27 @@ static void drm_gpusvm_driver_lock_held(struct drm_gpusvm *gpusvm)
}
#endif
+void drm_gpusvm_basic_range_init(struct drm_gpusvm *svm,
+ struct drm_gpusvm_basic_range *range,
+ struct mmu_interval_notifier *notifier,
+ unsigned long *notifier_seq)
+{
+ WARN_ON(!drm_gpusvm_is_basic(svm));
+
+ range->gpusvm = svm;
+ range->notifier = notifier;
+ range->notifier_seq = notifier_seq;
+ *notifier_seq = LONG_MAX;
+ memset(&range->pages, 0, sizeof(range->pages));
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_init);
+
+void drm_gpusvm_basic_range_fini(struct drm_gpusvm_basic_range *range)
+{
+ WARN_ON(range->pages.flags.has_dma_mapping);
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_fini);
+
/**
* drm_gpusvm_range_find_or_insert() - Find or insert GPU SVM range
* @gpusvm: Pointer to the GPU SVM structure
@@ -1176,6 +1229,19 @@ static void drm_gpusvm_free_pages(struct drm_gpusvm *gpusvm,
}
}
+void drm_gpusvm_basic_range_free_pages(struct drm_gpusvm_basic_range *range)
+{
+ unsigned long npages =
+ npages_in_range(range->notifier->interval_tree.start,
+ range->notifier->interval_tree.last + 1);
+
+ drm_gpusvm_notifier_lock(range->gpusvm);
+ __drm_gpusvm_unmap_pages(range->gpusvm, &range->pages, npages);
+ drm_gpusvm_free_pages(range->gpusvm, &range->pages);
+ drm_gpusvm_notifier_unlock(range->gpusvm);
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_free_pages);
+
/**
* drm_gpusvm_range_remove() - Remove GPU SVM range
* @gpusvm: Pointer to the GPU SVM structure
@@ -1545,6 +1611,20 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_get_pages);
+int drm_gpusvm_basic_range_get_pages(struct drm_gpusvm_basic_range *range,
+ const struct drm_gpusvm_ctx *ctx)
+{
+ drm_gpusvm_driver_lock_held(range->gpusvm);
+
+ return drm_gpusvm_get_pages(range->gpusvm, &range->pages,
+ range->notifier->mm, range->notifier,
+ range->notifier_seq,
+ range->notifier->interval_tree.start,
+ range->notifier->interval_tree.last + 1,
+ ctx);
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_get_pages);
+
static void drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
unsigned long mm_start, unsigned long mm_end,
struct drm_gpusvm_pages *svm_pages,
@@ -1585,6 +1665,16 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
}
EXPORT_SYMBOL_GPL(drm_gpusvm_range_unmap_pages);
+void drm_gpusvm_basic_range_unmap_pages(struct drm_gpusvm_basic_range *range,
+ const struct drm_gpusvm_ctx *ctx)
+{
+ drm_gpusvm_unmap_pages(range->gpusvm,
+ range->notifier->interval_tree.start,
+ range->notifier->interval_tree.last + 1,
+ &range->pages, ctx);
+}
+EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_unmap_pages);
+
/**
* drm_gpusvm_migration_unlock_put_page() - Put a migration page
* @page: Pointer to the page to put
diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
index c15c733ef0ad..82c4e58fa84c 100644
--- a/include/drm/drm_gpusvm.h
+++ b/include/drm/drm_gpusvm.h
@@ -305,6 +305,29 @@ struct drm_gpusvm_ctx {
unsigned int devmem_possible :1;
};
+struct drm_gpusvm_basic_range {
+ struct drm_gpusvm *gpusvm;
+ struct drm_gpusvm_pages pages;
+ struct mmu_interval_notifier *notifier;
+ unsigned long *notifier_seq;
+};
+
+void drm_gpusvm_basic_init(struct drm_gpusvm *gpusvm, const char *name,
+ struct drm_device *drm);
+void drm_gpusvm_basic_fini(struct drm_gpusvm *gpusvm);
+
+void drm_gpusvm_basic_range_init(struct drm_gpusvm *svm,
+ struct drm_gpusvm_basic_range *range,
+ struct mmu_interval_notifier *notifier,
+ unsigned long *notifier_seq);
+void drm_gpusvm_basic_range_fini(struct drm_gpusvm_basic_range *range);
+
+int drm_gpusvm_basic_range_get_pages(struct drm_gpusvm_basic_range *range,
+ const struct drm_gpusvm_ctx *ctx);
+void drm_gpusvm_basic_range_unmap_pages(struct drm_gpusvm_basic_range *range,
+ const struct drm_gpusvm_ctx *ctx);
+void drm_gpusvm_basic_range_free_pages(struct drm_gpusvm_basic_range *range);
+
int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
const char *name, struct drm_device *drm,
struct mm_struct *mm, void *device_private_page_owner,
--
2.48.1
* [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
` (5 preceding siblings ...)
2025-03-20 17:30 ` [PATCH 6/7] drm/gpusvm: support basic_range Matthew Auld
@ 2025-03-20 17:30 ` Matthew Auld
2025-03-20 20:52 ` Matthew Brost
6 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2025-03-20 17:30 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, Thomas Hellström, Matthew Brost
The goal here is to cut over to gpusvm and remove xe_hmm, relying instead on
common code. The core facilities we need are get_pages(), unmap_pages() and
free_pages() for a given userptr range, plus a vm level notifier lock, which
is now provided by gpusvm.
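Roughly, the old xe_hmm entry points map onto the gpusvm helpers as follows
(see the diff below):
	xe_hmm_userptr_populate_range()  ->  drm_gpusvm_basic_range_get_pages()
	xe_hmm_userptr_unmap()           ->  drm_gpusvm_basic_range_unmap_pages()
	xe_hmm_userptr_free_sg()         ->  drm_gpusvm_basic_range_free_pages()
	vm->userptr.notifier_lock        ->  vm->userptr.gpusvm.notifier_lock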
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 -
drivers/gpu/drm/xe/xe_exec.c | 4 +-
drivers/gpu/drm/xe/xe_hmm.c | 349 -------------------------------
drivers/gpu/drm/xe/xe_hmm.h | 18 --
drivers/gpu/drm/xe/xe_pt.c | 20 +-
drivers/gpu/drm/xe/xe_vm.c | 74 ++++---
drivers/gpu/drm/xe/xe_vm_types.h | 30 ++-
7 files changed, 70 insertions(+), 426 deletions(-)
delete mode 100644 drivers/gpu/drm/xe/xe_hmm.c
delete mode 100644 drivers/gpu/drm/xe/xe_hmm.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 9699b08585f7..ada1f0fad629 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -124,7 +124,6 @@ xe-y += xe_bb.o \
xe_wait_user_fence.o \
xe_wopcm.o
-xe-$(CONFIG_HMM_MIRROR) += xe_hmm.o
xe-$(CONFIG_DRM_GPUSVM) += xe_svm.o
# graphics hardware monitoring (HWMON) support
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index b75adfc99fb7..84daeedd65d4 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -294,7 +294,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
if (err)
goto err_put_job;
- err = down_read_interruptible(&vm->userptr.notifier_lock);
+ err = down_read_interruptible(&vm->userptr.gpusvm.notifier_lock);
if (err)
goto err_put_job;
@@ -336,7 +336,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
err_repin:
if (!xe_vm_in_lr_mode(vm))
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
err_put_job:
if (err)
xe_sched_job_put(job);
diff --git a/drivers/gpu/drm/xe/xe_hmm.c b/drivers/gpu/drm/xe/xe_hmm.c
deleted file mode 100644
index c3cc0fa105e8..000000000000
--- a/drivers/gpu/drm/xe/xe_hmm.c
+++ /dev/null
@@ -1,349 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2024 Intel Corporation
- */
-
-#include <linux/scatterlist.h>
-#include <linux/mmu_notifier.h>
-#include <linux/dma-mapping.h>
-#include <linux/memremap.h>
-#include <linux/swap.h>
-#include <linux/hmm.h>
-#include <linux/mm.h>
-#include "xe_hmm.h"
-#include "xe_vm.h"
-#include "xe_bo.h"
-
-static u64 xe_npages_in_range(unsigned long start, unsigned long end)
-{
- return (end - start) >> PAGE_SHIFT;
-}
-
-/**
- * xe_mark_range_accessed() - mark a range is accessed, so core mm
- * have such information for memory eviction or write back to
- * hard disk
- * @range: the range to mark
- * @write: if write to this range, we mark pages in this range
- * as dirty
- */
-static void xe_mark_range_accessed(struct hmm_range *range, bool write)
-{
- struct page *page;
- u64 i, npages;
-
- npages = xe_npages_in_range(range->start, range->end);
- for (i = 0; i < npages; i++) {
- page = hmm_pfn_to_page(range->hmm_pfns[i]);
- if (write)
- set_page_dirty_lock(page);
-
- mark_page_accessed(page);
- }
-}
-
-static int xe_alloc_sg(struct xe_device *xe, struct sg_table *st,
- struct hmm_range *range, struct rw_semaphore *notifier_sem)
-{
- unsigned long i, npages, hmm_pfn;
- unsigned long num_chunks = 0;
- int ret;
-
- /* HMM docs says this is needed. */
- ret = down_read_interruptible(notifier_sem);
- if (ret)
- return ret;
-
- if (mmu_interval_read_retry(range->notifier, range->notifier_seq)) {
- up_read(notifier_sem);
- return -EAGAIN;
- }
-
- npages = xe_npages_in_range(range->start, range->end);
- for (i = 0; i < npages;) {
- unsigned long len;
-
- hmm_pfn = range->hmm_pfns[i];
- xe_assert(xe, hmm_pfn & HMM_PFN_VALID);
-
- len = 1UL << hmm_pfn_to_map_order(hmm_pfn);
-
- /* If order > 0 the page may extend beyond range->start */
- len -= (hmm_pfn & ~HMM_PFN_FLAGS) & (len - 1);
- i += len;
- num_chunks++;
- }
- up_read(notifier_sem);
-
- return sg_alloc_table(st, num_chunks, GFP_KERNEL);
-}
-
-/**
- * xe_build_sg() - build a scatter gather table for all the physical pages/pfn
- * in a hmm_range. dma-map pages if necessary. dma-address is save in sg table
- * and will be used to program GPU page table later.
- * @xe: the xe device who will access the dma-address in sg table
- * @range: the hmm range that we build the sg table from. range->hmm_pfns[]
- * has the pfn numbers of pages that back up this hmm address range.
- * @st: pointer to the sg table.
- * @notifier_sem: The xe notifier lock.
- * @write: whether we write to this range. This decides dma map direction
- * for system pages. If write we map it bi-diretional; otherwise
- * DMA_TO_DEVICE
- *
- * All the contiguous pfns will be collapsed into one entry in
- * the scatter gather table. This is for the purpose of efficiently
- * programming GPU page table.
- *
- * The dma_address in the sg table will later be used by GPU to
- * access memory. So if the memory is system memory, we need to
- * do a dma-mapping so it can be accessed by GPU/DMA.
- *
- * FIXME: This function currently only support pages in system
- * memory. If the memory is GPU local memory (of the GPU who
- * is going to access memory), we need gpu dpa (device physical
- * address), and there is no need of dma-mapping. This is TBD.
- *
- * FIXME: dma-mapping for peer gpu device to access remote gpu's
- * memory. Add this when you support p2p
- *
- * This function allocates the storage of the sg table. It is
- * caller's responsibility to free it calling sg_free_table.
- *
- * Returns 0 if successful; -ENOMEM if fails to allocate memory
- */
-static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
- struct sg_table *st,
- struct rw_semaphore *notifier_sem,
- bool write)
-{
- unsigned long npages = xe_npages_in_range(range->start, range->end);
- struct device *dev = xe->drm.dev;
- struct scatterlist *sgl;
- struct page *page;
- unsigned long i, j;
-
- lockdep_assert_held(notifier_sem);
-
- i = 0;
- for_each_sg(st->sgl, sgl, st->nents, j) {
- unsigned long hmm_pfn, size;
-
- hmm_pfn = range->hmm_pfns[i];
- page = hmm_pfn_to_page(hmm_pfn);
- xe_assert(xe, !is_device_private_page(page));
-
- size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
- size -= page_to_pfn(page) & (size - 1);
- i += size;
-
- if (unlikely(j == st->nents - 1)) {
- xe_assert(xe, i >= npages);
- if (i > npages)
- size -= (i - npages);
-
- sg_mark_end(sgl);
- } else {
- xe_assert(xe, i < npages);
- }
-
- sg_set_page(sgl, page, size << PAGE_SHIFT, 0);
- }
-
- return dma_map_sgtable(dev, st, write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
- DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_NO_KERNEL_MAPPING);
-}
-
-static void xe_hmm_userptr_set_mapped(struct xe_userptr_vma *uvma)
-{
- struct xe_userptr *userptr = &uvma->userptr;
- struct xe_vm *vm = xe_vma_vm(&uvma->vma);
-
- lockdep_assert_held_write(&vm->lock);
- lockdep_assert_held(&vm->userptr.notifier_lock);
-
- mutex_lock(&userptr->unmap_mutex);
- xe_assert(vm->xe, !userptr->mapped);
- userptr->mapped = true;
- mutex_unlock(&userptr->unmap_mutex);
-}
-
-void xe_hmm_userptr_unmap(struct xe_userptr_vma *uvma)
-{
- struct xe_userptr *userptr = &uvma->userptr;
- struct xe_vma *vma = &uvma->vma;
- bool write = !xe_vma_read_only(vma);
- struct xe_vm *vm = xe_vma_vm(vma);
- struct xe_device *xe = vm->xe;
-
- if (!lockdep_is_held_type(&vm->userptr.notifier_lock, 0) &&
- !lockdep_is_held_type(&vm->lock, 0) &&
- !(vma->gpuva.flags & XE_VMA_DESTROYED)) {
- /* Don't unmap in exec critical section. */
- xe_vm_assert_held(vm);
- /* Don't unmap while mapping the sg. */
- lockdep_assert_held(&vm->lock);
- }
-
- mutex_lock(&userptr->unmap_mutex);
- if (userptr->sg && userptr->mapped)
- dma_unmap_sgtable(xe->drm.dev, userptr->sg,
- write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE, 0);
- userptr->mapped = false;
- mutex_unlock(&userptr->unmap_mutex);
-}
-
-/**
- * xe_hmm_userptr_free_sg() - Free the scatter gather table of userptr
- * @uvma: the userptr vma which hold the scatter gather table
- *
- * With function xe_userptr_populate_range, we allocate storage of
- * the userptr sg table. This is a helper function to free this
- * sg table, and dma unmap the address in the table.
- */
-void xe_hmm_userptr_free_sg(struct xe_userptr_vma *uvma)
-{
- struct xe_userptr *userptr = &uvma->userptr;
-
- xe_assert(xe_vma_vm(&uvma->vma)->xe, userptr->sg);
- xe_hmm_userptr_unmap(uvma);
- sg_free_table(userptr->sg);
- userptr->sg = NULL;
-}
-
-/**
- * xe_hmm_userptr_populate_range() - Populate physical pages of a virtual
- * address range
- *
- * @uvma: userptr vma which has information of the range to populate.
- * @is_mm_mmap_locked: True if mmap_read_lock is already acquired by caller.
- *
- * This function populate the physical pages of a virtual
- * address range. The populated physical pages is saved in
- * userptr's sg table. It is similar to get_user_pages but call
- * hmm_range_fault.
- *
- * This function also read mmu notifier sequence # (
- * mmu_interval_read_begin), for the purpose of later
- * comparison (through mmu_interval_read_retry).
- *
- * This must be called with mmap read or write lock held.
- *
- * This function allocates the storage of the userptr sg table.
- * It is caller's responsibility to free it calling sg_free_table.
- *
- * returns: 0 for success; negative error no on failure
- */
-int xe_hmm_userptr_populate_range(struct xe_userptr_vma *uvma,
- bool is_mm_mmap_locked)
-{
- unsigned long timeout =
- jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
- unsigned long *pfns;
- struct xe_userptr *userptr;
- struct xe_vma *vma = &uvma->vma;
- u64 userptr_start = xe_vma_userptr(vma);
- u64 userptr_end = userptr_start + xe_vma_size(vma);
- struct xe_vm *vm = xe_vma_vm(vma);
- struct hmm_range hmm_range = {
- .pfn_flags_mask = 0, /* ignore pfns */
- .default_flags = HMM_PFN_REQ_FAULT,
- .start = userptr_start,
- .end = userptr_end,
- .notifier = &uvma->userptr.notifier,
- .dev_private_owner = vm->xe,
- };
- bool write = !xe_vma_read_only(vma);
- unsigned long notifier_seq;
- u64 npages;
- int ret;
-
- userptr = &uvma->userptr;
-
- if (is_mm_mmap_locked)
- mmap_assert_locked(userptr->notifier.mm);
-
- if (vma->gpuva.flags & XE_VMA_DESTROYED)
- return 0;
-
- notifier_seq = mmu_interval_read_begin(&userptr->notifier);
- if (notifier_seq == userptr->notifier_seq)
- return 0;
-
- if (userptr->sg)
- xe_hmm_userptr_free_sg(uvma);
-
- npages = xe_npages_in_range(userptr_start, userptr_end);
- pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
- if (unlikely(!pfns))
- return -ENOMEM;
-
- if (write)
- hmm_range.default_flags |= HMM_PFN_REQ_WRITE;
-
- if (!mmget_not_zero(userptr->notifier.mm)) {
- ret = -EFAULT;
- goto free_pfns;
- }
-
- hmm_range.hmm_pfns = pfns;
-
- while (true) {
- hmm_range.notifier_seq = mmu_interval_read_begin(&userptr->notifier);
-
- if (!is_mm_mmap_locked)
- mmap_read_lock(userptr->notifier.mm);
-
- ret = hmm_range_fault(&hmm_range);
-
- if (!is_mm_mmap_locked)
- mmap_read_unlock(userptr->notifier.mm);
-
- if (ret == -EBUSY) {
- if (time_after(jiffies, timeout))
- break;
-
- continue;
- }
- break;
- }
-
- mmput(userptr->notifier.mm);
-
- if (ret)
- goto free_pfns;
-
- ret = xe_alloc_sg(vm->xe, &userptr->sgt, &hmm_range, &vm->userptr.notifier_lock);
- if (ret)
- goto free_pfns;
-
- ret = down_read_interruptible(&vm->userptr.notifier_lock);
- if (ret)
- goto free_st;
-
- if (mmu_interval_read_retry(hmm_range.notifier, hmm_range.notifier_seq)) {
- ret = -EAGAIN;
- goto out_unlock;
- }
-
- ret = xe_build_sg(vm->xe, &hmm_range, &userptr->sgt,
- &vm->userptr.notifier_lock, write);
- if (ret)
- goto out_unlock;
-
- xe_mark_range_accessed(&hmm_range, write);
- userptr->sg = &userptr->sgt;
- xe_hmm_userptr_set_mapped(uvma);
- userptr->notifier_seq = hmm_range.notifier_seq;
- up_read(&vm->userptr.notifier_lock);
- kvfree(pfns);
- return 0;
-
-out_unlock:
- up_read(&vm->userptr.notifier_lock);
-free_st:
- sg_free_table(&userptr->sgt);
-free_pfns:
- kvfree(pfns);
- return ret;
-}
diff --git a/drivers/gpu/drm/xe/xe_hmm.h b/drivers/gpu/drm/xe/xe_hmm.h
deleted file mode 100644
index 0ea98d8e7bbc..000000000000
--- a/drivers/gpu/drm/xe/xe_hmm.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/* SPDX-License-Identifier: MIT
- *
- * Copyright © 2024 Intel Corporation
- */
-
-#ifndef _XE_HMM_H_
-#define _XE_HMM_H_
-
-#include <linux/types.h>
-
-struct xe_userptr_vma;
-
-int xe_hmm_userptr_populate_range(struct xe_userptr_vma *uvma, bool is_mm_mmap_locked);
-
-void xe_hmm_userptr_free_sg(struct xe_userptr_vma *uvma);
-
-void xe_hmm_userptr_unmap(struct xe_userptr_vma *uvma);
-#endif
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index c43e7619cb80..824bf99e5f71 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -727,8 +727,8 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
if (!xe_vma_is_null(vma) && !range) {
if (xe_vma_is_userptr(vma))
- xe_res_first_sg(to_userptr_vma(vma)->userptr.sg, 0,
- xe_vma_size(vma), &curs);
+ xe_res_first_dma(to_userptr_vma(vma)->userptr.range.pages.dma_addr, 0,
+ xe_vma_size(vma), &curs);
else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
xe_vma_size(vma), &curs);
@@ -998,7 +998,7 @@ static void xe_pt_commit_locks_assert(struct xe_vma *vma)
xe_pt_commit_prepare_locks_assert(vma);
if (xe_vma_is_userptr(vma))
- lockdep_assert_held_read(&vm->userptr.notifier_lock);
+ lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
}
static void xe_pt_commit(struct xe_vma *vma,
@@ -1336,7 +1336,7 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
struct xe_userptr_vma *uvma;
unsigned long notifier_seq;
- lockdep_assert_held_read(&vm->userptr.notifier_lock);
+ lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
if (!xe_vma_is_userptr(vma))
return 0;
@@ -1366,7 +1366,7 @@ static int op_check_userptr(struct xe_vm *vm, struct xe_vma_op *op,
{
int err = 0;
- lockdep_assert_held_read(&vm->userptr.notifier_lock);
+ lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
switch (op->base.op) {
case DRM_GPUVA_OP_MAP:
@@ -1407,12 +1407,12 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
if (err)
return err;
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
list_for_each_entry(op, &vops->list, link) {
err = op_check_userptr(vm, op, pt_update_ops);
if (err) {
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
break;
}
}
@@ -2133,7 +2133,7 @@ static void bind_op_commit(struct xe_vm *vm, struct xe_tile *tile,
vma->tile_present |= BIT(tile->id);
vma->tile_staged &= ~BIT(tile->id);
if (xe_vma_is_userptr(vma)) {
- lockdep_assert_held_read(&vm->userptr.notifier_lock);
+ lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
to_userptr_vma(vma)->userptr.initial_bind = true;
}
@@ -2169,7 +2169,7 @@ static void unbind_op_commit(struct xe_vm *vm, struct xe_tile *tile,
if (!vma->tile_present) {
list_del_init(&vma->combined_links.rebind);
if (xe_vma_is_userptr(vma)) {
- lockdep_assert_held_read(&vm->userptr.notifier_lock);
+ lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
spin_lock(&vm->userptr.invalidated_lock);
list_del_init(&to_userptr_vma(vma)->userptr.invalidate_link);
@@ -2414,7 +2414,7 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops)
if (pt_update_ops->needs_svm_lock)
xe_svm_notifier_unlock(vm);
if (pt_update_ops->needs_userptr_lock)
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
return fence;
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 60303998bd61..fdc24718b9ad 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -5,6 +5,7 @@
#include "xe_vm.h"
+#include "drm/drm_gpusvm.h"
#include <linux/dma-fence-array.h>
#include <linux/nospec.h>
@@ -40,7 +41,6 @@
#include "xe_sync.h"
#include "xe_trace_bo.h"
#include "xe_wa.h"
-#include "xe_hmm.h"
static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
{
@@ -53,7 +53,7 @@ static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
*
* Check if the userptr vma has been invalidated since last successful
* repin. The check is advisory only and can the function can be called
- * without the vm->userptr.notifier_lock held. There is no guarantee that the
+ * without the vm->userptr.gpusvm.notifier_lock held. There is no guarantee that the
* vma userptr will remain valid after a lockless check, so typically
* the call needs to be followed by a proper check under the notifier_lock.
*
@@ -71,11 +71,17 @@ int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma)
struct xe_vma *vma = &uvma->vma;
struct xe_vm *vm = xe_vma_vm(vma);
struct xe_device *xe = vm->xe;
+ struct drm_gpusvm_ctx ctx = {
+ .read_only = xe_vma_read_only(vma),
+ };
lockdep_assert_held(&vm->lock);
xe_assert(xe, xe_vma_is_userptr(vma));
- return xe_hmm_userptr_populate_range(uvma, false);
+ if (vma->gpuva.flags & XE_VMA_DESTROYED)
+ return 0;
+
+ return drm_gpusvm_basic_range_get_pages(&uvma->userptr.range, &ctx);
}
static bool preempt_fences_waiting(struct xe_vm *vm)
@@ -249,7 +255,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
++vm->preempt.num_exec_queues;
q->lr.pfence = pfence;
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
drm_gpuvm_resv_add_fence(&vm->gpuvm, exec, pfence,
DMA_RESV_USAGE_BOOKKEEP, DMA_RESV_USAGE_BOOKKEEP);
@@ -263,7 +269,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
if (wait)
dma_fence_enable_sw_signaling(pfence);
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
out_fini:
drm_exec_fini(exec);
@@ -305,14 +311,14 @@ void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
* @vm: The VM.
*
* This function checks for whether the VM has userptrs that need repinning,
- * and provides a release-type barrier on the userptr.notifier_lock after
+ * and provides a release-type barrier on the userptr.gpusvm.notifier_lock after
* checking.
*
* Return: 0 if there are no userptrs needing repinning, -EAGAIN if there are.
*/
int __xe_vm_userptr_needs_repin(struct xe_vm *vm)
{
- lockdep_assert_held_read(&vm->userptr.notifier_lock);
+ lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
return (list_empty(&vm->userptr.repin_list) &&
list_empty(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
@@ -546,9 +552,9 @@ static void preempt_rebind_work_func(struct work_struct *w)
(!(__tries)++ || __xe_vm_userptr_needs_repin(__vm)) : \
__xe_vm_userptr_needs_repin(__vm))
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
if (retry_required(tries, vm)) {
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
err = -EAGAIN;
goto out_unlock;
}
@@ -562,7 +568,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
/* Point of no return. */
arm_preempt_fences(vm, &preempt_fences);
resume_and_reinstall_preempt_fences(vm, &exec);
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
out_unlock:
drm_exec_fini(&exec);
@@ -589,6 +595,9 @@ static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uv
struct xe_vma *vma = &uvma->vma;
struct dma_resv_iter cursor;
struct dma_fence *fence;
+ struct drm_gpusvm_ctx ctx = {
+ .in_notifier = true,
+ };
long err;
/*
@@ -625,7 +634,7 @@ static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uv
XE_WARN_ON(err);
}
- xe_hmm_userptr_unmap(uvma);
+ drm_gpusvm_basic_range_unmap_pages(&uvma->userptr.range, &ctx);
}
static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
@@ -646,11 +655,11 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
"NOTIFIER: addr=0x%016llx, range=0x%016llx",
xe_vma_start(vma), xe_vma_size(vma));
- down_write(&vm->userptr.notifier_lock);
+ down_write(&vm->userptr.gpusvm.notifier_lock);
mmu_interval_set_seq(mni, cur_seq);
__vma_userptr_invalidate(vm, uvma);
- up_write(&vm->userptr.notifier_lock);
+ up_write(&vm->userptr.gpusvm.notifier_lock);
trace_xe_vma_userptr_invalidate_complete(vma);
return true;
@@ -674,7 +683,7 @@ void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma)
/* Protect against concurrent userptr pinning */
lockdep_assert_held(&vm->lock);
/* Protect against concurrent notifiers */
- lockdep_assert_held(&vm->userptr.notifier_lock);
+ lockdep_assert_held(&vm->userptr.gpusvm.notifier_lock);
/*
* Protect against concurrent instances of this function and
* the critical exec sections
@@ -747,7 +756,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
}
if (err) {
- down_write(&vm->userptr.notifier_lock);
+ down_write(&vm->userptr.gpusvm.notifier_lock);
spin_lock(&vm->userptr.invalidated_lock);
list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
userptr.repin_link) {
@@ -756,7 +765,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
&vm->userptr.invalidated);
}
spin_unlock(&vm->userptr.invalidated_lock);
- up_write(&vm->userptr.notifier_lock);
+ up_write(&vm->userptr.gpusvm.notifier_lock);
}
return err;
}
@@ -1222,7 +1231,6 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
INIT_LIST_HEAD(&userptr->invalidate_link);
INIT_LIST_HEAD(&userptr->repin_link);
vma->gpuva.gem.offset = bo_offset_or_userptr;
- mutex_init(&userptr->unmap_mutex);
err = mmu_interval_notifier_insert(&userptr->notifier,
current->mm,
@@ -1233,7 +1241,10 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
return ERR_PTR(err);
}
- userptr->notifier_seq = LONG_MAX;
+ drm_gpusvm_basic_range_init(&vm->userptr.gpusvm,
+ &userptr->range,
+ &userptr->notifier,
+ &userptr->notifier_seq);
}
xe_vm_get(vm);
@@ -1255,8 +1266,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
struct xe_userptr_vma *uvma = to_userptr_vma(vma);
struct xe_userptr *userptr = &uvma->userptr;
- if (userptr->sg)
- xe_hmm_userptr_free_sg(uvma);
+ drm_gpusvm_basic_range_free_pages(&uvma->userptr.range);
/*
* Since userptr pages are not pinned, we can't remove
@@ -1264,7 +1274,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
* them anymore
*/
mmu_interval_notifier_remove(&userptr->notifier);
- mutex_destroy(&userptr->unmap_mutex);
+ drm_gpusvm_basic_range_fini(&uvma->userptr.range);
xe_vm_put(vm);
} else if (xe_vma_is_null(vma) || xe_vma_is_cpu_addr_mirror(vma)) {
xe_vm_put(vm);
@@ -1657,7 +1667,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
INIT_LIST_HEAD(&vm->userptr.repin_list);
INIT_LIST_HEAD(&vm->userptr.invalidated);
- init_rwsem(&vm->userptr.notifier_lock);
spin_lock_init(&vm->userptr.invalidated_lock);
ttm_lru_bulk_move_init(&vm->lru_bulk_move);
@@ -1763,6 +1772,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
goto err_close;
}
+ drm_gpusvm_basic_init(&vm->userptr.gpusvm, "xe-vm-userptr", &xe->drm);
+ drm_gpusvm_driver_set_lock(&vm->userptr.gpusvm, &vm->lock);
+
if (number_tiles > 1)
vm->composite_fence_ctx = dma_fence_context_alloc(1);
@@ -1867,9 +1879,9 @@ void xe_vm_close_and_put(struct xe_vm *vm)
vma = gpuva_to_vma(gpuva);
if (xe_vma_has_no_bo(vma)) {
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
vma->gpuva.flags |= XE_VMA_DESTROYED;
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
}
xe_vm_remove_vma(vm, vma);
@@ -1916,6 +1928,8 @@ void xe_vm_close_and_put(struct xe_vm *vm)
if (xe_vm_in_fault_mode(vm))
xe_svm_fini(vm);
+ drm_gpusvm_basic_fini(&vm->userptr.gpusvm);
+
up_write(&vm->lock);
down_write(&xe->usm.lock);
@@ -2144,9 +2158,9 @@ static const u32 region_to_mem_type[] = {
static void prep_vma_destroy(struct xe_vm *vm, struct xe_vma *vma,
bool post_commit)
{
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
vma->gpuva.flags |= XE_VMA_DESTROYED;
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
if (post_commit)
xe_vm_remove_vma(vm, vma);
}
@@ -2625,9 +2639,9 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
struct xe_vma *vma = gpuva_to_vma(op->base.unmap.va);
if (vma) {
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
vma->gpuva.flags &= ~XE_VMA_DESTROYED;
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
if (post_commit)
xe_vm_insert_vma(vm, vma);
}
@@ -2646,9 +2660,9 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
xe_vma_destroy_unlocked(op->remap.next);
}
if (vma) {
- down_read(&vm->userptr.notifier_lock);
+ down_read(&vm->userptr.gpusvm.notifier_lock);
vma->gpuva.flags &= ~XE_VMA_DESTROYED;
- up_read(&vm->userptr.notifier_lock);
+ up_read(&vm->userptr.gpusvm.notifier_lock);
if (post_commit)
xe_vm_insert_vma(vm, vma);
}
diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
index 84fa41b9fa20..08f295873a2b 100644
--- a/drivers/gpu/drm/xe/xe_vm_types.h
+++ b/drivers/gpu/drm/xe/xe_vm_types.h
@@ -52,26 +52,23 @@ struct xe_userptr {
struct list_head invalidate_link;
/** @userptr: link into VM repin list if userptr. */
struct list_head repin_link;
+ /**
+ * @range: gpusvm range for this user pointer.
+ */
+ struct drm_gpusvm_basic_range range;
/**
* @notifier: MMU notifier for user pointer (invalidation call back)
*/
struct mmu_interval_notifier notifier;
- /** @sgt: storage for a scatter gather table */
- struct sg_table sgt;
- /** @sg: allocated scatter gather table */
- struct sg_table *sg;
+
/** @notifier_seq: notifier sequence number */
unsigned long notifier_seq;
- /** @unmap_mutex: Mutex protecting dma-unmapping */
- struct mutex unmap_mutex;
/**
* @initial_bind: user pointer has been bound at least once.
- * write: vm->userptr.notifier_lock in read mode and vm->resv held.
- * read: vm->userptr.notifier_lock in write mode or vm->resv held.
+ * write: vm->userptr.gpusvm.notifier_lock in read mode and vm->resv held.
+ * read: vm->userptr.gpusvm.notifier_lock in write mode or vm->resv held.
*/
bool initial_bind;
- /** @mapped: Whether the @sgt sg-table is dma-mapped. Protected by @unmap_mutex. */
- bool mapped;
#if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
u32 divisor;
#endif
@@ -109,7 +106,7 @@ struct xe_vma {
/**
* @tile_present: GT mask of binding are present for this VMA.
* protected by vm->lock, vm->resv and for userptrs,
- * vm->userptr.notifier_lock for writing. Needs either for reading,
+ * vm->userptr.gpusvm.notifier_lock for writing. Needs either for reading,
* but if reading is done under the vm->lock only, it needs to be held
* in write mode.
*/
@@ -238,16 +235,17 @@ struct xe_vm {
/** @userptr: user pointer state */
struct {
+ /*
+ * @gpusvm: The gpusvm userptr for this vm. The
+ * gpusvm.notifier_lock protects notifier in write mode and
+ * submission in read mode.
+ */
+ struct drm_gpusvm gpusvm;
/**
* @userptr.repin_list: list of VMAs which are user pointers,
* and needs repinning. Protected by @lock.
*/
struct list_head repin_list;
- /**
- * @notifier_lock: protects notifier in write mode and
- * submission in read mode.
- */
- struct rw_semaphore notifier_lock;
/**
* @userptr.invalidated_lock: Protects the
* @userptr.invalidated list.
--
2.48.1
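
[Recap of the conversion above: after this patch the xe userptr paths boil down to the gpusvm calls below. This is a condensed sketch reconstructed from the hunks in this mail (arguments shortened, error handling and locking asserts omitted); it is orientation only, not a separate code change.]

    /* VM creation (xe_vm_create): */
    drm_gpusvm_basic_init(&vm->userptr.gpusvm, "xe-vm-userptr", &xe->drm);
    drm_gpusvm_driver_set_lock(&vm->userptr.gpusvm, &vm->lock);

    /* Userptr VMA creation (xe_vma_create): */
    err = mmu_interval_notifier_insert(&userptr->notifier, current->mm, ...);
    drm_gpusvm_basic_range_init(&vm->userptr.gpusvm, &userptr->range,
                                &userptr->notifier, &userptr->notifier_seq);

    /* (Re)pinning pages (xe_vma_userptr_pin_pages): */
    struct drm_gpusvm_ctx ctx = { .read_only = xe_vma_read_only(vma) };
    err = drm_gpusvm_basic_range_get_pages(&uvma->userptr.range, &ctx);

    /* Invalidation (__vma_userptr_invalidate, notifier_lock held in write mode): */
    struct drm_gpusvm_ctx ctx = { .in_notifier = true };
    drm_gpusvm_basic_range_unmap_pages(&uvma->userptr.range, &ctx);

    /* Userptr VMA destruction (xe_vma_destroy_late): */
    drm_gpusvm_basic_range_free_pages(&uvma->userptr.range);
    mmu_interval_notifier_remove(&userptr->notifier);
    drm_gpusvm_basic_range_fini(&uvma->userptr.range);

    /* VM teardown (xe_vm_close_and_put): */
    drm_gpusvm_basic_fini(&vm->userptr.gpusvm);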
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 3/7] drm/gpusvm: mark pages as dirty
2025-03-20 17:30 ` [PATCH 3/7] drm/gpusvm: mark pages as dirty Matthew Auld
@ 2025-03-20 19:29 ` Thomas Hellström
2025-03-20 19:33 ` Matthew Brost
0 siblings, 1 reply; 22+ messages in thread
From: Thomas Hellström @ 2025-03-20 19:29 UTC (permalink / raw)
To: Matthew Auld, intel-xe; +Cc: dri-devel, Matthew Brost
On Thu, 2025-03-20 at 17:30 +0000, Matthew Auld wrote:
> If the memory is going to be accessed by the device, make sure we mark
> the pages accordingly such that the kernel knows this. This aligns with
> the xe-userptr code.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 7f1cf5492bba..5b4ecd36dff1 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1471,6 +1471,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> pages[i] = page;
> } else {
> dma_addr_t addr;
> + unsigned int k;
>
> if (is_zone_device_page(page) || zdd) {
> err = -EOPNOTSUPP;
> @@ -1489,6 +1490,14 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> range->dma_addr[j] = drm_pagemap_device_addr_encode
> (addr, DRM_INTERCONNECT_SYSTEM, order,
> dma_dir);
> +
> + for (k = 0; k < 1u << order; k++) {
> + if (!ctx->read_only)
> + set_page_dirty_lock(page);
> +
> + mark_page_accessed(page);
> + page++;
> + }
Actually I think the userptr code did this unnecessarily. This is done
in the CPU page-fault handler, which means it's taken care of during
hmm_range_fault(). Now if the CPU PTE happens to be present and
writeable there will be no fault, but that was done when the page was
faulted in anyway.
If there was a page cleaning event in between so the dirty flag was
dropped, then my understanding is that in addition to an invalidation
notifier, also the CPU PTE is zapped, so that it will be dirtied again
on the next write access, either by the CPU faulting the page or
hmm_range_fault() if there is a GPU page-fault.
So I think we're good without this patch.
/Thomas
> }
> i += 1 << order;
> num_dma_mapped = i;
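
[For context on the above: the write intent is already expressed to HMM when the pages are faulted in. A condensed sketch of that setup, lifted from the xe_hmm_userptr_populate_range() code removed later in this series and trimmed for brevity:]

    struct hmm_range hmm_range = {
        .pfn_flags_mask = 0,          /* ignore pfns */
        .default_flags = HMM_PFN_REQ_FAULT,
        .start = userptr_start,
        .end = userptr_end,
        .notifier = &uvma->userptr.notifier,
        .dev_private_owner = vm->xe,
    };

    if (write)
        hmm_range.default_flags |= HMM_PFN_REQ_WRITE;

    /*
     * HMM_PFN_REQ_WRITE makes hmm_range_fault() install a writable CPU PTE,
     * and it is that CPU fault path which marks the page accessed and dirty.
     * If the page is cleaned later, the PTE is zapped alongside the
     * invalidation notifier, so the next write access re-dirties it.
     */
    err = hmm_range_fault(&hmm_range);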
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages()
2025-03-20 17:29 ` [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages() Matthew Auld
@ 2025-03-20 19:31 ` Thomas Hellström
2025-03-20 19:37 ` Matthew Brost
1 sibling, 0 replies; 22+ messages in thread
From: Thomas Hellström @ 2025-03-20 19:31 UTC (permalink / raw)
To: Matthew Auld, intel-xe; +Cc: dri-devel, Matthew Brost
On Thu, 2025-03-20 at 17:29 +0000, Matthew Auld wrote:
> If we are only reading the memory then from the device pov the direction
> can be DMA_TO_DEVICE. This aligns with the xe-userptr code. Using the
> most restrictive data direction to represent the access is normally a
> good idea.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 48993cef4a74..7f1cf5492bba 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1355,6 +1355,8 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> int err = 0;
> struct dev_pagemap *pagemap;
> struct drm_pagemap *dpagemap;
> + enum dma_data_direction dma_dir = ctx->read_only ? DMA_TO_DEVICE :
> + DMA_BIDIRECTIONAL;
>
> retry:
> hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
> @@ -1459,7 +1461,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> dpagemap->ops->device_map(dpagemap,
> gpusvm->drm->dev,
> page, order,
> - DMA_BIDIRECTIONAL);
> + dma_dir);
> if (dma_mapping_error(gpusvm->drm->dev,
> range->dma_addr[j].addr)) {
> err = -EFAULT;
> @@ -1478,7 +1480,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> addr = dma_map_page(gpusvm->drm->dev,
> page, 0,
> PAGE_SIZE << order,
> - DMA_BIDIRECTIONAL);
> + dma_dir);
> if (dma_mapping_error(gpusvm->drm->dev, addr)) {
> err = -EFAULT;
> goto err_unmap;
> @@ -1486,7 +1488,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>
> range->dma_addr[j] = drm_pagemap_device_addr_encode
> (addr, DRM_INTERCONNECT_SYSTEM, order,
> - DMA_BIDIRECTIONAL);
> + dma_dir);
> }
> i += 1 << order;
> num_dma_mapped = i;
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/7] drm/gpusvm: mark pages as dirty
2025-03-20 19:29 ` Thomas Hellström
@ 2025-03-20 19:33 ` Matthew Brost
2025-03-21 11:37 ` Matthew Auld
0 siblings, 1 reply; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 19:33 UTC (permalink / raw)
To: Thomas Hellström; +Cc: Matthew Auld, intel-xe, dri-devel
On Thu, Mar 20, 2025 at 08:29:42PM +0100, Thomas Hellström wrote:
> On Thu, 2025-03-20 at 17:30 +0000, Matthew Auld wrote:
> > If the memory is going to be accessed by the device, make sure we mark
> > the pages accordingly such that the kernel knows this. This aligns with
> > the xe-userptr code.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/drm_gpusvm.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> > index 7f1cf5492bba..5b4ecd36dff1 100644
> > --- a/drivers/gpu/drm/drm_gpusvm.c
> > +++ b/drivers/gpu/drm/drm_gpusvm.c
> > @@ -1471,6 +1471,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> > pages[i] = page;
> > } else {
> > dma_addr_t addr;
> > + unsigned int k;
> >
> > if (is_zone_device_page(page) || zdd) {
> > err = -EOPNOTSUPP;
> > @@ -1489,6 +1490,14 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> > range->dma_addr[j] = drm_pagemap_device_addr_encode
> > (addr, DRM_INTERCONNECT_SYSTEM, order,
> > dma_dir);
> > +
> > + for (k = 0; k < 1u << order; k++) {
> > + if (!ctx->read_only)
> > + set_page_dirty_lock(page);
> > +
> > + mark_page_accessed(page);
> > + page++;
> > + }
>
> Actually I think the userptr code did this unnecessarily. This is done
> in the CPU page-fault handler, which means it's taken care of during
> hmm_range_fault(). Now if the CPU PTE happens to be present and
> writeable there will be no fault, but that was done when the page was
> faulted in anyway.
>
> If there was a page cleaning event in between so the dirty flag was
> dropped, then my understanding is that in addition to an invalidation
> notifier, also the CPU PTE is zapped, so that it will be dirtied again
> on the next write access, either by the CPU faulting the page or
> hmm_range_fault() if there is a GPU page-fault.
>
> So I think we're good without this patch.
>
I was going to suggest the same thing as Thomas - we are good without
this patch for the reasons he states.
Matt
> /Thomas
>
>
>
> > }
> > i += 1 << order;
> > num_dma_mapped = i;
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages()
2025-03-20 17:29 ` [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages() Matthew Auld
2025-03-20 19:31 ` Thomas Hellström
@ 2025-03-20 19:37 ` Matthew Brost
1 sibling, 0 replies; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 19:37 UTC (permalink / raw)
To: Matthew Auld; +Cc: intel-xe, dri-devel, Thomas Hellström
On Thu, Mar 20, 2025 at 05:29:59PM +0000, Matthew Auld wrote:
> If we are only reading the memory then from the device pov the direction
> can be DMA_TO_DEVICE. This aligns with the xe-userptr code. Using the
> most restrictive data direction to represent the access is normally a
> good idea.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 48993cef4a74..7f1cf5492bba 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1355,6 +1355,8 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> int err = 0;
> struct dev_pagemap *pagemap;
> struct drm_pagemap *dpagemap;
> + enum dma_data_direction dma_dir = ctx->read_only ? DMA_TO_DEVICE :
> + DMA_BIDIRECTIONAL;
>
> retry:
> hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
> @@ -1459,7 +1461,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> dpagemap->ops->device_map(dpagemap,
> gpusvm->drm->dev,
> page, order,
> - DMA_BIDIRECTIONAL);
> + dma_dir);
> if (dma_mapping_error(gpusvm->drm->dev,
> range->dma_addr[j].addr)) {
> err = -EFAULT;
> @@ -1478,7 +1480,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> addr = dma_map_page(gpusvm->drm->dev,
> page, 0,
> PAGE_SIZE << order,
> - DMA_BIDIRECTIONAL);
> + dma_dir);
> if (dma_mapping_error(gpusvm->drm->dev, addr)) {
> err = -EFAULT;
> goto err_unmap;
> @@ -1486,7 +1488,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>
> range->dma_addr[j] = drm_pagemap_device_addr_encode
> (addr, DRM_INTERCONNECT_SYSTEM, order,
> - DMA_BIDIRECTIONAL);
> + dma_dir);
> }
> i += 1 << order;
> num_dma_mapped = i;
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage
2025-03-20 17:29 ` [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
@ 2025-03-20 19:52 ` Thomas Hellström
2025-03-20 20:40 ` Matthew Brost
1 sibling, 0 replies; 22+ messages in thread
From: Thomas Hellström @ 2025-03-20 19:52 UTC (permalink / raw)
To: Matthew Auld, intel-xe; +Cc: dri-devel, Matthew Brost
On Thu, 2025-03-20 at 17:29 +0000, Matthew Auld wrote:
> Handle the case where the hmm range partially covers a huge page (like
> 2M), otherwise we can potentially end up doing something nasty like
> mapping memory which potentially is outside the range, and maybe not
> even mapped by the mm. Fix is based on the xe userptr code, which in a
> future patch will directly use gpusvm, so needs alignment here.
>
> Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 2451c816edd5..48993cef4a74 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -817,6 +817,27 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
> return range;
> }
>
> +/*
> + * To allow skipping PFNs with the same flags (like when they belong to
> + * the same huge PTE) when looping over the pfn array, take a given a hmm_pfn,
> + * and return the largest order that will fit inside the PTE, but also crucially
> + * accounting for the original hmm range boundaries.
> + */
> +static unsigned int drm_gpusvm_hmm_pfn_to_order(unsigned long hmm_pfn,
> + unsigned long hmm_pfn_index,
> + unsigned long npages)
> +{
> + unsigned long size;
> +
> + size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
> + size -= (hmm_pfn & ~HMM_PFN_FLAGS) & (size - 1);
> + hmm_pfn_index += size;
> + if (hmm_pfn_index > npages)
> + size -= (hmm_pfn_index - npages);
> +
> + return fls(size) - 1;
ilog2() for readability?
> +}
> +
> /**
> * drm_gpusvm_check_pages() - Check pages
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -875,7 +896,7 @@ static bool drm_gpusvm_check_pages(struct drm_gpusvm *gpusvm,
> err = -EFAULT;
> goto err_free;
> }
> - i += 0x1 << hmm_pfn_to_map_order(pfns[i]);
> + i += 0x1 << drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
> }
>
> err_free:
> @@ -1408,7 +1429,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> for (i = 0, j = 0; i < npages; ++j) {
> struct page *page = hmm_pfn_to_page(pfns[i]);
>
> - order = hmm_pfn_to_map_order(pfns[i]);
> + order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
> if (is_device_private_page(page) ||
> is_device_coherent_page(page)) {
> if (zdd != page->zone_device_data && i > 0) {
Either way
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
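
[The ilog2() variant suggested above would only change the final conversion; for a non-constant, non-zero size, ilog2() expands to fls()/fls64() minus one, so behaviour is unchanged. A sketch, needing <linux/log2.h>:]

    static unsigned int drm_gpusvm_hmm_pfn_to_order(unsigned long hmm_pfn,
                                                    unsigned long hmm_pfn_index,
                                                    unsigned long npages)
    {
        unsigned long size;

        size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
        /* Clamp to the part of the huge PTE that lies inside the hmm range. */
        size -= (hmm_pfn & ~HMM_PFN_FLAGS) & (size - 1);
        hmm_pfn_index += size;
        if (hmm_pfn_index > npages)
            size -= (hmm_pfn_index - npages);

        return ilog2(size);
    }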
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage
2025-03-20 17:29 ` [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
2025-03-20 19:52 ` Thomas Hellström
@ 2025-03-20 20:40 ` Matthew Brost
2025-03-21 11:42 ` Matthew Auld
1 sibling, 1 reply; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 20:40 UTC (permalink / raw)
To: Matthew Auld; +Cc: intel-xe, dri-devel, Thomas Hellström
On Thu, Mar 20, 2025 at 05:29:58PM +0000, Matthew Auld wrote:
> Handle the case where the hmm range partially covers a huge page (like
> 2M), otherwise we can potentially end up doing something nasty like
> mapping memory which potentially is outside the range, and maybe not
> even mapped by the mm. Fix is based on the xe userptr code, which in a
> future patch will directly use gpusvm, so needs alignment here.
>
> Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 2451c816edd5..48993cef4a74 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -817,6 +817,27 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
> return range;
> }
>
> +/*
> + * To allow skipping PFNs with the same flags (like when they belong to
> + * the same huge PTE) when looping over the pfn array, take a given a hmm_pfn,
> + * and return the largest order that will fit inside the PTE, but also crucially
> + * accounting for the original hmm range boundaries.
> + */
I'd make this proper kernel doc given all of drm_gpusvm.c has proper kernel doc.
Otherwise LGTM.
Matt
> +static unsigned int drm_gpusvm_hmm_pfn_to_order(unsigned long hmm_pfn,
> + unsigned long hmm_pfn_index,
> + unsigned long npages)
> +{
> + unsigned long size;
> +
> + size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
> + size -= (hmm_pfn & ~HMM_PFN_FLAGS) & (size - 1);
> + hmm_pfn_index += size;
> + if (hmm_pfn_index > npages)
> + size -= (hmm_pfn_index - npages);
> +
> + return fls(size) - 1;
> +}
> +
> /**
> * drm_gpusvm_check_pages() - Check pages
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -875,7 +896,7 @@ static bool drm_gpusvm_check_pages(struct drm_gpusvm *gpusvm,
> err = -EFAULT;
> goto err_free;
> }
> - i += 0x1 << hmm_pfn_to_map_order(pfns[i]);
> + i += 0x1 << drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
> }
>
> err_free:
> @@ -1408,7 +1429,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> for (i = 0, j = 0; i < npages; ++j) {
> struct page *page = hmm_pfn_to_page(pfns[i]);
>
> - order = hmm_pfn_to_map_order(pfns[i]);
> + order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
> if (is_device_private_page(page) ||
> is_device_coherent_page(page)) {
> if (zdd != page->zone_device_data && i > 0) {
> --
> 2.48.1
>
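
[One possible shape for the kernel-doc requested above, reusing the wording of the posted comment; illustrative only, not necessarily what eventually landed:]

    /**
     * drm_gpusvm_hmm_pfn_to_order() - Largest mapping order usable for a PFN
     * @hmm_pfn: The current hmm_pfn.
     * @hmm_pfn_index: Index of @hmm_pfn within the pfn array.
     * @npages: Number of pages within the pfn array, i.e. the hmm range size.
     *
     * To allow skipping PFNs with the same flags (like when they belong to
     * the same huge PTE) when looping over the pfn array, take a given
     * hmm_pfn and return the largest order that will fit inside the PTE,
     * crucially clamped to the original hmm range boundaries.
     *
     * Return: The largest order usable for this PFN.
     */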
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure
2025-03-20 17:30 ` [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure Matthew Auld
@ 2025-03-20 20:43 ` Matthew Brost
2025-03-21 11:53 ` Matthew Auld
0 siblings, 1 reply; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 20:43 UTC (permalink / raw)
To: Matthew Auld; +Cc: intel-xe, dri-devel, Thomas Hellström
On Thu, Mar 20, 2025 at 05:30:01PM +0000, Matthew Auld wrote:
> Pull the pages stuff from the svm range into its own substructure, with
> the idea of having the main pages related routines, like get_pages(),
> unmap_pages() and free_pages() all operating on some lower level
> structures, which can then be re-used for stuff like userptr which wants
> to use basic stuff like get_pages(), but not all the other more complex
> svm stuff.
>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 95 +++++++++++++++++++-----------------
> drivers/gpu/drm/xe/xe_pt.c | 2 +-
> drivers/gpu/drm/xe/xe_svm.c | 6 +--
> drivers/gpu/drm/xe/xe_svm.h | 2 +-
> include/drm/drm_gpusvm.h | 43 +++++++++-------
> 5 files changed, 82 insertions(+), 66 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 5b4ecd36dff1..f27731a51f34 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -812,7 +812,7 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
> range->itree.last = ALIGN(fault_addr + 1, chunk_size) - 1;
> INIT_LIST_HEAD(&range->entry);
> range->notifier_seq = LONG_MAX;
> - range->flags.migrate_devmem = migrate_devmem ? 1 : 0;
> + range->pages.flags.migrate_devmem = migrate_devmem ? 1 : 0;
>
> return range;
> }
> @@ -1120,27 +1120,27 @@ drm_gpusvm_range_find_or_insert(struct drm_gpusvm *gpusvm,
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_find_or_insert);
>
> /**
> - * __drm_gpusvm_range_unmap_pages() - Unmap pages associated with a GPU SVM range (internal)
> + * __drm_gpusvm_unmap_pages() - Unmap pages associated with a GPU SVM range (internal)
> * @gpusvm: Pointer to the GPU SVM structure
> - * @range: Pointer to the GPU SVM range structure
> + * @svm_pages: Pointer to the GPU SVM pages structure
> * @npages: Number of pages to unmap
> *
> * This function unmap pages associated with a GPU SVM range. Assumes and
> * asserts correct locking is in place when called.
> */
> -static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> - struct drm_gpusvm_range *range,
> - unsigned long npages)
> +static void __drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
> + struct drm_gpusvm_pages *svm_pages,
> + unsigned long npages)
> {
> unsigned long i, j;
> - struct drm_pagemap *dpagemap = range->dpagemap;
> + struct drm_pagemap *dpagemap = svm_pages->dpagemap;
> struct device *dev = gpusvm->drm->dev;
>
> lockdep_assert_held(&gpusvm->notifier_lock);
>
> - if (range->flags.has_dma_mapping) {
> + if (svm_pages->flags.has_dma_mapping) {
> for (i = 0, j = 0; i < npages; j++) {
> - struct drm_pagemap_device_addr *addr = &range->dma_addr[j];
> + struct drm_pagemap_device_addr *addr = &svm_pages->dma_addr[j];
>
> if (addr->proto == DRM_INTERCONNECT_SYSTEM)
> dma_unmap_page(dev,
> @@ -1152,9 +1152,9 @@ static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> dev, *addr);
> i += 1 << addr->order;
> }
> - range->flags.has_devmem_pages = false;
> - range->flags.has_dma_mapping = false;
> - range->dpagemap = NULL;
> + svm_pages->flags.has_devmem_pages = false;
> + svm_pages->flags.has_dma_mapping = false;
> + svm_pages->dpagemap = NULL;
> }
> }
>
> @@ -1165,14 +1165,14 @@ static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> *
> * This function frees the dma address array associated with a GPU SVM range.
> */
> -static void drm_gpusvm_range_free_pages(struct drm_gpusvm *gpusvm,
> - struct drm_gpusvm_range *range)
> +static void drm_gpusvm_free_pages(struct drm_gpusvm *gpusvm,
> + struct drm_gpusvm_pages *svm_pages)
> {
> lockdep_assert_held(&gpusvm->notifier_lock);
>
> - if (range->dma_addr) {
> - kvfree(range->dma_addr);
> - range->dma_addr = NULL;
> + if (svm_pages->dma_addr) {
> + kvfree(svm_pages->dma_addr);
> + svm_pages->dma_addr = NULL;
> }
> }
>
> @@ -1200,8 +1200,8 @@ void drm_gpusvm_range_remove(struct drm_gpusvm *gpusvm,
> return;
>
> drm_gpusvm_notifier_lock(gpusvm);
> - __drm_gpusvm_range_unmap_pages(gpusvm, range, npages);
> - drm_gpusvm_range_free_pages(gpusvm, range);
> + __drm_gpusvm_unmap_pages(gpusvm, &range->pages, npages);
> + drm_gpusvm_free_pages(gpusvm, &range->pages);
> __drm_gpusvm_range_remove(notifier, range);
> drm_gpusvm_notifier_unlock(gpusvm);
>
> @@ -1266,6 +1266,14 @@ void drm_gpusvm_range_put(struct drm_gpusvm_range *range)
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_put);
>
Same comment as patch #1, let's aim to keep kernel doc for all functions in drm_gpusvm.c.
> +static bool drm_gpusvm_pages_valid(struct drm_gpusvm *gpusvm,
> + struct drm_gpusvm_pages *svm_pages)
> +{
> + lockdep_assert_held(&gpusvm->notifier_lock);
> +
> + return svm_pages->flags.has_devmem_pages || svm_pages->flags.has_dma_mapping;
> +}
> +
> /**
> * drm_gpusvm_range_pages_valid() - GPU SVM range pages valid
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -1283,9 +1291,7 @@ EXPORT_SYMBOL_GPL(drm_gpusvm_range_put);
> bool drm_gpusvm_range_pages_valid(struct drm_gpusvm *gpusvm,
> struct drm_gpusvm_range *range)
> {
> - lockdep_assert_held(&gpusvm->notifier_lock);
> -
> - return range->flags.has_devmem_pages || range->flags.has_dma_mapping;
> + return drm_gpusvm_pages_valid(gpusvm, &range->pages);
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
>
> @@ -1301,17 +1307,17 @@ EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
> */
> static bool
> drm_gpusvm_range_pages_valid_unlocked(struct drm_gpusvm *gpusvm,
> - struct drm_gpusvm_range *range)
> + struct drm_gpusvm_pages *svm_pages)
> {
> bool pages_valid;
>
> - if (!range->dma_addr)
> + if (!svm_pages->dma_addr)
> return false;
>
> drm_gpusvm_notifier_lock(gpusvm);
> - pages_valid = drm_gpusvm_range_pages_valid(gpusvm, range);
> + pages_valid = drm_gpusvm_pages_valid(gpusvm, svm_pages);
> if (!pages_valid)
> - drm_gpusvm_range_free_pages(gpusvm, range);
> + drm_gpusvm_free_pages(gpusvm, svm_pages);
> drm_gpusvm_notifier_unlock(gpusvm);
>
> return pages_valid;
> @@ -1332,6 +1338,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> struct drm_gpusvm_range *range,
> const struct drm_gpusvm_ctx *ctx)
> {
> + struct drm_gpusvm_pages *svm_pages = &range->pages;
> struct mmu_interval_notifier *notifier = &range->notifier->notifier;
> struct hmm_range hmm_range = {
> .default_flags = HMM_PFN_REQ_FAULT | (ctx->read_only ? 0 :
> @@ -1360,7 +1367,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>
> retry:
> hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
> - if (drm_gpusvm_range_pages_valid_unlocked(gpusvm, range))
> + if (drm_gpusvm_range_pages_valid_unlocked(gpusvm, svm_pages))
> goto set_seqno;
>
> pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
> @@ -1401,7 +1408,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> */
> drm_gpusvm_notifier_lock(gpusvm);
>
> - if (range->flags.unmapped) {
> + if (svm_pages->flags.unmapped) {
> drm_gpusvm_notifier_unlock(gpusvm);
> err = -EFAULT;
> goto err_free;
> @@ -1413,13 +1420,12 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> goto retry;
> }
>
> - if (!range->dma_addr) {
> + if (!svm_pages->dma_addr) {
> /* Unlock and restart mapping to allocate memory. */
> drm_gpusvm_notifier_unlock(gpusvm);
> - range->dma_addr = kvmalloc_array(npages,
> - sizeof(*range->dma_addr),
> - GFP_KERNEL);
> - if (!range->dma_addr) {
> + svm_pages->dma_addr =
> + kvmalloc_array(npages, sizeof(*svm_pages->dma_addr), GFP_KERNEL);
> + if (!svm_pages->dma_addr) {
> err = -ENOMEM;
> goto err_free;
> }
> @@ -1457,13 +1463,13 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> goto err_unmap;
> }
> }
> - range->dma_addr[j] =
> + svm_pages->dma_addr[j] =
> dpagemap->ops->device_map(dpagemap,
> gpusvm->drm->dev,
> page, order,
> dma_dir);
> if (dma_mapping_error(gpusvm->drm->dev,
> - range->dma_addr[j].addr)) {
> + svm_pages->dma_addr[j].addr)) {
> err = -EFAULT;
> goto err_unmap;
> }
> @@ -1487,7 +1493,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> goto err_unmap;
> }
>
> - range->dma_addr[j] = drm_pagemap_device_addr_encode
> + svm_pages->dma_addr[j] = drm_pagemap_device_addr_encode
> (addr, DRM_INTERCONNECT_SYSTEM, order,
> dma_dir);
>
> @@ -1503,10 +1509,10 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> num_dma_mapped = i;
> }
>
> - range->flags.has_dma_mapping = true;
> + svm_pages->flags.has_dma_mapping = true;
> if (zdd) {
> - range->flags.has_devmem_pages = true;
> - range->dpagemap = dpagemap;
> + svm_pages->flags.has_devmem_pages = true;
> + svm_pages->dpagemap = dpagemap;
> }
>
> drm_gpusvm_notifier_unlock(gpusvm);
> @@ -1517,7 +1523,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> return 0;
>
> err_unmap:
> - __drm_gpusvm_range_unmap_pages(gpusvm, range, num_dma_mapped);
> + __drm_gpusvm_unmap_pages(gpusvm, svm_pages, num_dma_mapped);
> drm_gpusvm_notifier_unlock(gpusvm);
> err_free:
> kvfree(pfns);
> @@ -1543,6 +1549,7 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> struct drm_gpusvm_range *range,
> const struct drm_gpusvm_ctx *ctx)
> {
> + struct drm_gpusvm_pages *svm_pages = &range->pages;
> unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
> drm_gpusvm_range_end(range));
>
> @@ -1551,7 +1558,7 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> else
> drm_gpusvm_notifier_lock(gpusvm);
>
> - __drm_gpusvm_range_unmap_pages(gpusvm, range, npages);
> + __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
>
> if (!ctx->in_notifier)
> drm_gpusvm_notifier_unlock(gpusvm);
> @@ -1719,7 +1726,7 @@ int drm_gpusvm_migrate_to_devmem(struct drm_gpusvm *gpusvm,
>
> mmap_assert_locked(gpusvm->mm);
>
> - if (!range->flags.migrate_devmem)
> + if (!range->pages.flags.migrate_devmem)
> return -EINVAL;
>
> if (!ops->populate_devmem_pfn || !ops->copy_to_devmem ||
> @@ -2248,10 +2255,10 @@ void drm_gpusvm_range_set_unmapped(struct drm_gpusvm_range *range,
> {
> lockdep_assert_held_write(&range->gpusvm->notifier_lock);
>
> - range->flags.unmapped = true;
> + range->pages.flags.unmapped = true;
> if (drm_gpusvm_range_start(range) < mmu_range->start ||
> drm_gpusvm_range_end(range) > mmu_range->end)
> - range->flags.partial_unmap = true;
> + range->pages.flags.partial_unmap = true;
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_set_unmapped);
>
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index ffaf0d02dc7d..c43e7619cb80 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -659,7 +659,7 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
> return -EAGAIN;
> }
> if (xe_svm_range_has_dma_mapping(range)) {
> - xe_res_first_dma(range->base.dma_addr, 0,
> + xe_res_first_dma(range->base.pages.dma_addr, 0,
> range->base.itree.last + 1 - range->base.itree.start,
> &curs);
> is_devmem = xe_res_is_vram(&curs);
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index 08617a62ab07..4e7f2e77b38d 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -17,7 +17,7 @@
> static bool xe_svm_range_in_vram(struct xe_svm_range *range)
> {
> /* Not reliable without notifier lock */
> - return range->base.flags.has_devmem_pages;
> + return range->base.pages.flags.has_devmem_pages;
> }
>
> static bool xe_svm_range_has_vram_binding(struct xe_svm_range *range)
> @@ -135,7 +135,7 @@ xe_svm_range_notifier_event_begin(struct xe_vm *vm, struct drm_gpusvm_range *r,
> range_debug(range, "NOTIFIER");
>
> /* Skip if already unmapped or if no binding exist */
> - if (range->base.flags.unmapped || !range->tile_present)
> + if (range->base.pages.flags.unmapped || !range->tile_present)
> return 0;
>
> range_debug(range, "NOTIFIER - EXECUTE");
> @@ -766,7 +766,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> range_debug(range, "PAGE FAULT");
>
> /* XXX: Add migration policy, for now migrate range once */
> - if (!range->skip_migrate && range->base.flags.migrate_devmem &&
> + if (!range->skip_migrate && range->base.pages.flags.migrate_devmem &&
> xe_svm_range_size(range) >= SZ_64K) {
> range->skip_migrate = true;
>
> diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> index 93442738666e..79fbd4fec1fb 100644
> --- a/drivers/gpu/drm/xe/xe_svm.h
> +++ b/drivers/gpu/drm/xe/xe_svm.h
> @@ -136,7 +136,7 @@ void xe_svm_range_debug(struct xe_svm_range *range, const char *operation)
> static inline bool xe_svm_range_has_dma_mapping(struct xe_svm_range *range)
> {
> lockdep_assert_held(&range->base.gpusvm->notifier_lock);
> - return range->base.flags.has_dma_mapping;
> + return range->base.pages.flags.has_dma_mapping;
> }
>
> #define xe_svm_assert_in_notifier(vm__) \
> diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
> index df120b4d1f83..c15c733ef0ad 100644
> --- a/include/drm/drm_gpusvm.h
> +++ b/include/drm/drm_gpusvm.h
> @@ -186,14 +186,8 @@ struct drm_gpusvm_notifier {
> };
>
> /**
> - * struct drm_gpusvm_range - Structure representing a GPU SVM range
> + * struct drm_gpusvm_pages - Structure representing a GPU SVM mapped pages
> *
> - * @gpusvm: Pointer to the GPU SVM structure
> - * @notifier: Pointer to the GPU SVM notifier
> - * @refcount: Reference count for the range
> - * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
> - * @entry: List entry to fast interval tree traversal
> - * @notifier_seq: Notifier sequence number of the range's pages
> * @dma_addr: Device address array
> * @dpagemap: The struct drm_pagemap of the device pages we're dma-mapping.
> * Note this is assuming only one drm_pagemap per range is allowed.
> @@ -203,17 +197,8 @@ struct drm_gpusvm_notifier {
> * @flags.partial_unmap: Flag indicating if the range has been partially unmapped
> * @flags.has_devmem_pages: Flag indicating if the range has devmem pages
> * @flags.has_dma_mapping: Flag indicating if the range has a DMA mapping
> - *
> - * This structure represents a GPU SVM range used for tracking memory ranges
> - * mapped in a DRM device.
> */
> -struct drm_gpusvm_range {
> - struct drm_gpusvm *gpusvm;
> - struct drm_gpusvm_notifier *notifier;
> - struct kref refcount;
> - struct interval_tree_node itree;
> - struct list_head entry;
> - unsigned long notifier_seq;
> +struct drm_gpusvm_pages {
> struct drm_pagemap_device_addr *dma_addr;
> struct drm_pagemap *dpagemap;
> struct {
> @@ -227,6 +212,30 @@ struct drm_gpusvm_range {
> } flags;
> };
>
> +/**
> + * struct drm_gpusvm_range - Structure representing a GPU SVM range
> + *
> + * @gpusvm: Pointer to the GPU SVM structure
> + * @notifier: Pointer to the GPU SVM notifier
> + * @refcount: Reference count for the range
> + * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
> + * @entry: List entry to fast interval tree traversal
> + * @notifier_seq: Notifier sequence number of the range's pages
> + * @pages: The pages for this range.
> + *
> + * This structure represents a GPU SVM range used for tracking memory ranges
> + * mapped in a DRM device.
> + */
> +struct drm_gpusvm_range {
> + struct drm_gpusvm *gpusvm;
> + struct drm_gpusvm_notifier *notifier;
> + struct kref refcount;
> + struct interval_tree_node itree;
> + struct list_head entry;
> + unsigned long notifier_seq;
Should the notifier seqno be in the pages?
Matt
> + struct drm_gpusvm_pages pages;
> +};
> +
> /**
> * struct drm_gpusvm - GPU SVM structure
> *
> --
> 2.48.1
>
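
[If the answer to the seqno question above is yes, the move would look roughly like this; a hypothetical sketch in which only the notifier_seq placement changes relative to the posted patch:]

    struct drm_gpusvm_pages {
        struct drm_pagemap_device_addr *dma_addr;
        struct drm_pagemap *dpagemap;
        unsigned long notifier_seq;     /* moved in from drm_gpusvm_range */
        /* ... flags as in the posted patch ... */
    };

    struct drm_gpusvm_range {
        struct drm_gpusvm *gpusvm;
        struct drm_gpusvm_notifier *notifier;
        struct kref refcount;
        struct interval_tree_node itree;
        struct list_head entry;
        struct drm_gpusvm_pages pages;  /* notifier_seq now lives in here */
    };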
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm
2025-03-20 17:30 ` [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm Matthew Auld
@ 2025-03-20 20:52 ` Matthew Brost
2025-03-25 9:50 ` Ghimiray, Himal Prasad
0 siblings, 1 reply; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 20:52 UTC (permalink / raw)
To: Matthew Auld
Cc: intel-xe, dri-devel, Thomas Hellström, himal.prasad.ghimiray
On Thu, Mar 20, 2025 at 05:30:04PM +0000, Matthew Auld wrote:
> Goal here is cut over to gpusvm and remove xe_hmm, relying instead on
> common code. The core facilities we need are get_pages(), unmap_pages()
> and free_pages() for a given useptr range, plus a vm level notifier
> lock, which is now provided by gpusvm.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 -
> drivers/gpu/drm/xe/xe_exec.c | 4 +-
> drivers/gpu/drm/xe/xe_hmm.c | 349 -------------------------------
> drivers/gpu/drm/xe/xe_hmm.h | 18 --
> drivers/gpu/drm/xe/xe_pt.c | 20 +-
> drivers/gpu/drm/xe/xe_vm.c | 74 ++++---
> drivers/gpu/drm/xe/xe_vm_types.h | 30 ++-
> 7 files changed, 70 insertions(+), 426 deletions(-)
> delete mode 100644 drivers/gpu/drm/xe/xe_hmm.c
> delete mode 100644 drivers/gpu/drm/xe/xe_hmm.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 9699b08585f7..ada1f0fad629 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -124,7 +124,6 @@ xe-y += xe_bb.o \
> xe_wait_user_fence.o \
> xe_wopcm.o
>
> -xe-$(CONFIG_HMM_MIRROR) += xe_hmm.o
> xe-$(CONFIG_DRM_GPUSVM) += xe_svm.o
>
> # graphics hardware monitoring (HWMON) support
> diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
> index b75adfc99fb7..84daeedd65d4 100644
> --- a/drivers/gpu/drm/xe/xe_exec.c
> +++ b/drivers/gpu/drm/xe/xe_exec.c
> @@ -294,7 +294,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> if (err)
> goto err_put_job;
>
> - err = down_read_interruptible(&vm->userptr.notifier_lock);
> + err = down_read_interruptible(&vm->userptr.gpusvm.notifier_lock);
> if (err)
> goto err_put_job;
>
> @@ -336,7 +336,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>
> err_repin:
> if (!xe_vm_in_lr_mode(vm))
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> err_put_job:
> if (err)
> xe_sched_job_put(job);
> diff --git a/drivers/gpu/drm/xe/xe_hmm.c b/drivers/gpu/drm/xe/xe_hmm.c
> deleted file mode 100644
> index c3cc0fa105e8..000000000000
> --- a/drivers/gpu/drm/xe/xe_hmm.c
> +++ /dev/null
> @@ -1,349 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2024 Intel Corporation
> - */
> -
> -#include <linux/scatterlist.h>
> -#include <linux/mmu_notifier.h>
> -#include <linux/dma-mapping.h>
> -#include <linux/memremap.h>
> -#include <linux/swap.h>
> -#include <linux/hmm.h>
> -#include <linux/mm.h>
> -#include "xe_hmm.h"
> -#include "xe_vm.h"
> -#include "xe_bo.h"
> -
> -static u64 xe_npages_in_range(unsigned long start, unsigned long end)
> -{
> - return (end - start) >> PAGE_SHIFT;
> -}
> -
> -/**
> - * xe_mark_range_accessed() - mark a range is accessed, so core mm
> - * have such information for memory eviction or write back to
> - * hard disk
> - * @range: the range to mark
> - * @write: if write to this range, we mark pages in this range
> - * as dirty
> - */
> -static void xe_mark_range_accessed(struct hmm_range *range, bool write)
> -{
> - struct page *page;
> - u64 i, npages;
> -
> - npages = xe_npages_in_range(range->start, range->end);
> - for (i = 0; i < npages; i++) {
> - page = hmm_pfn_to_page(range->hmm_pfns[i]);
> - if (write)
> - set_page_dirty_lock(page);
> -
> - mark_page_accessed(page);
> - }
> -}
> -
> -static int xe_alloc_sg(struct xe_device *xe, struct sg_table *st,
> - struct hmm_range *range, struct rw_semaphore *notifier_sem)
> -{
> - unsigned long i, npages, hmm_pfn;
> - unsigned long num_chunks = 0;
> - int ret;
> -
> - /* HMM docs says this is needed. */
> - ret = down_read_interruptible(notifier_sem);
> - if (ret)
> - return ret;
> -
> - if (mmu_interval_read_retry(range->notifier, range->notifier_seq)) {
> - up_read(notifier_sem);
> - return -EAGAIN;
> - }
> -
> - npages = xe_npages_in_range(range->start, range->end);
> - for (i = 0; i < npages;) {
> - unsigned long len;
> -
> - hmm_pfn = range->hmm_pfns[i];
> - xe_assert(xe, hmm_pfn & HMM_PFN_VALID);
> -
> - len = 1UL << hmm_pfn_to_map_order(hmm_pfn);
> -
> - /* If order > 0 the page may extend beyond range->start */
> - len -= (hmm_pfn & ~HMM_PFN_FLAGS) & (len - 1);
> - i += len;
> - num_chunks++;
> - }
> - up_read(notifier_sem);
> -
> - return sg_alloc_table(st, num_chunks, GFP_KERNEL);
> -}
> -
> -/**
> - * xe_build_sg() - build a scatter gather table for all the physical pages/pfn
> - * in a hmm_range. dma-map pages if necessary. dma-address is save in sg table
> - * and will be used to program GPU page table later.
> - * @xe: the xe device who will access the dma-address in sg table
> - * @range: the hmm range that we build the sg table from. range->hmm_pfns[]
> - * has the pfn numbers of pages that back up this hmm address range.
> - * @st: pointer to the sg table.
> - * @notifier_sem: The xe notifier lock.
> - * @write: whether we write to this range. This decides dma map direction
> - * for system pages. If write we map it bi-diretional; otherwise
> - * DMA_TO_DEVICE
> - *
> - * All the contiguous pfns will be collapsed into one entry in
> - * the scatter gather table. This is for the purpose of efficiently
> - * programming GPU page table.
> - *
> - * The dma_address in the sg table will later be used by GPU to
> - * access memory. So if the memory is system memory, we need to
> - * do a dma-mapping so it can be accessed by GPU/DMA.
> - *
> - * FIXME: This function currently only support pages in system
> - * memory. If the memory is GPU local memory (of the GPU who
> - * is going to access memory), we need gpu dpa (device physical
> - * address), and there is no need of dma-mapping. This is TBD.
> - *
> - * FIXME: dma-mapping for peer gpu device to access remote gpu's
> - * memory. Add this when you support p2p
> - *
> - * This function allocates the storage of the sg table. It is
> - * caller's responsibility to free it calling sg_free_table.
> - *
> - * Returns 0 if successful; -ENOMEM if fails to allocate memory
> - */
> -static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
> - struct sg_table *st,
> - struct rw_semaphore *notifier_sem,
> - bool write)
> -{
> - unsigned long npages = xe_npages_in_range(range->start, range->end);
> - struct device *dev = xe->drm.dev;
> - struct scatterlist *sgl;
> - struct page *page;
> - unsigned long i, j;
> -
> - lockdep_assert_held(notifier_sem);
> -
> - i = 0;
> - for_each_sg(st->sgl, sgl, st->nents, j) {
> - unsigned long hmm_pfn, size;
> -
> - hmm_pfn = range->hmm_pfns[i];
> - page = hmm_pfn_to_page(hmm_pfn);
> - xe_assert(xe, !is_device_private_page(page));
> -
> - size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
> - size -= page_to_pfn(page) & (size - 1);
> - i += size;
> -
> - if (unlikely(j == st->nents - 1)) {
> - xe_assert(xe, i >= npages);
> - if (i > npages)
> - size -= (i - npages);
> -
> - sg_mark_end(sgl);
> - } else {
> - xe_assert(xe, i < npages);
> - }
> -
> - sg_set_page(sgl, page, size << PAGE_SHIFT, 0);
> - }
> -
> - return dma_map_sgtable(dev, st, write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
> - DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_NO_KERNEL_MAPPING);
> -}
> -
> -static void xe_hmm_userptr_set_mapped(struct xe_userptr_vma *uvma)
> -{
> - struct xe_userptr *userptr = &uvma->userptr;
> - struct xe_vm *vm = xe_vma_vm(&uvma->vma);
> -
> - lockdep_assert_held_write(&vm->lock);
> - lockdep_assert_held(&vm->userptr.notifier_lock);
> -
> - mutex_lock(&userptr->unmap_mutex);
> - xe_assert(vm->xe, !userptr->mapped);
> - userptr->mapped = true;
> - mutex_unlock(&userptr->unmap_mutex);
> -}
> -
> -void xe_hmm_userptr_unmap(struct xe_userptr_vma *uvma)
> -{
> - struct xe_userptr *userptr = &uvma->userptr;
> - struct xe_vma *vma = &uvma->vma;
> - bool write = !xe_vma_read_only(vma);
> - struct xe_vm *vm = xe_vma_vm(vma);
> - struct xe_device *xe = vm->xe;
> -
> - if (!lockdep_is_held_type(&vm->userptr.notifier_lock, 0) &&
> - !lockdep_is_held_type(&vm->lock, 0) &&
> - !(vma->gpuva.flags & XE_VMA_DESTROYED)) {
> - /* Don't unmap in exec critical section. */
> - xe_vm_assert_held(vm);
> - /* Don't unmap while mapping the sg. */
> - lockdep_assert_held(&vm->lock);
> - }
> -
> - mutex_lock(&userptr->unmap_mutex);
> - if (userptr->sg && userptr->mapped)
> - dma_unmap_sgtable(xe->drm.dev, userptr->sg,
> - write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE, 0);
> - userptr->mapped = false;
> - mutex_unlock(&userptr->unmap_mutex);
> -}
> -
> -/**
> - * xe_hmm_userptr_free_sg() - Free the scatter gather table of userptr
> - * @uvma: the userptr vma which hold the scatter gather table
> - *
> - * With function xe_userptr_populate_range, we allocate storage of
> - * the userptr sg table. This is a helper function to free this
> - * sg table, and dma unmap the address in the table.
> - */
> -void xe_hmm_userptr_free_sg(struct xe_userptr_vma *uvma)
> -{
> - struct xe_userptr *userptr = &uvma->userptr;
> -
> - xe_assert(xe_vma_vm(&uvma->vma)->xe, userptr->sg);
> - xe_hmm_userptr_unmap(uvma);
> - sg_free_table(userptr->sg);
> - userptr->sg = NULL;
> -}
> -
> -/**
> - * xe_hmm_userptr_populate_range() - Populate physical pages of a virtual
> - * address range
> - *
> - * @uvma: userptr vma which has information of the range to populate.
> - * @is_mm_mmap_locked: True if mmap_read_lock is already acquired by caller.
> - *
> - * This function populate the physical pages of a virtual
> - * address range. The populated physical pages is saved in
> - * userptr's sg table. It is similar to get_user_pages but call
> - * hmm_range_fault.
> - *
> - * This function also read mmu notifier sequence # (
> - * mmu_interval_read_begin), for the purpose of later
> - * comparison (through mmu_interval_read_retry).
> - *
> - * This must be called with mmap read or write lock held.
> - *
> - * This function allocates the storage of the userptr sg table.
> - * It is caller's responsibility to free it calling sg_free_table.
> - *
> - * returns: 0 for success; negative error no on failure
> - */
> -int xe_hmm_userptr_populate_range(struct xe_userptr_vma *uvma,
> - bool is_mm_mmap_locked)
> -{
> - unsigned long timeout =
> - jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
> - unsigned long *pfns;
> - struct xe_userptr *userptr;
> - struct xe_vma *vma = &uvma->vma;
> - u64 userptr_start = xe_vma_userptr(vma);
> - u64 userptr_end = userptr_start + xe_vma_size(vma);
> - struct xe_vm *vm = xe_vma_vm(vma);
> - struct hmm_range hmm_range = {
> - .pfn_flags_mask = 0, /* ignore pfns */
> - .default_flags = HMM_PFN_REQ_FAULT,
> - .start = userptr_start,
> - .end = userptr_end,
> - .notifier = &uvma->userptr.notifier,
> - .dev_private_owner = vm->xe,
> - };
> - bool write = !xe_vma_read_only(vma);
> - unsigned long notifier_seq;
> - u64 npages;
> - int ret;
> -
> - userptr = &uvma->userptr;
> -
> - if (is_mm_mmap_locked)
> - mmap_assert_locked(userptr->notifier.mm);
> -
> - if (vma->gpuva.flags & XE_VMA_DESTROYED)
> - return 0;
> -
> - notifier_seq = mmu_interval_read_begin(&userptr->notifier);
> - if (notifier_seq == userptr->notifier_seq)
> - return 0;
> -
> - if (userptr->sg)
> - xe_hmm_userptr_free_sg(uvma);
> -
> - npages = xe_npages_in_range(userptr_start, userptr_end);
> - pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
> - if (unlikely(!pfns))
> - return -ENOMEM;
> -
> - if (write)
> - hmm_range.default_flags |= HMM_PFN_REQ_WRITE;
> -
> - if (!mmget_not_zero(userptr->notifier.mm)) {
> - ret = -EFAULT;
> - goto free_pfns;
> - }
> -
> - hmm_range.hmm_pfns = pfns;
> -
> - while (true) {
> - hmm_range.notifier_seq = mmu_interval_read_begin(&userptr->notifier);
> -
> - if (!is_mm_mmap_locked)
> - mmap_read_lock(userptr->notifier.mm);
> -
> - ret = hmm_range_fault(&hmm_range);
> -
> - if (!is_mm_mmap_locked)
> - mmap_read_unlock(userptr->notifier.mm);
> -
> - if (ret == -EBUSY) {
> - if (time_after(jiffies, timeout))
> - break;
> -
> - continue;
> - }
> - break;
> - }
> -
> - mmput(userptr->notifier.mm);
> -
> - if (ret)
> - goto free_pfns;
> -
> - ret = xe_alloc_sg(vm->xe, &userptr->sgt, &hmm_range, &vm->userptr.notifier_lock);
> - if (ret)
> - goto free_pfns;
> -
> - ret = down_read_interruptible(&vm->userptr.notifier_lock);
> - if (ret)
> - goto free_st;
> -
> - if (mmu_interval_read_retry(hmm_range.notifier, hmm_range.notifier_seq)) {
> - ret = -EAGAIN;
> - goto out_unlock;
> - }
> -
> - ret = xe_build_sg(vm->xe, &hmm_range, &userptr->sgt,
> - &vm->userptr.notifier_lock, write);
> - if (ret)
> - goto out_unlock;
> -
> - xe_mark_range_accessed(&hmm_range, write);
> - userptr->sg = &userptr->sgt;
> - xe_hmm_userptr_set_mapped(uvma);
> - userptr->notifier_seq = hmm_range.notifier_seq;
> - up_read(&vm->userptr.notifier_lock);
> - kvfree(pfns);
> - return 0;
> -
> -out_unlock:
> - up_read(&vm->userptr.notifier_lock);
> -free_st:
> - sg_free_table(&userptr->sgt);
> -free_pfns:
> - kvfree(pfns);
> - return ret;
> -}
> diff --git a/drivers/gpu/drm/xe/xe_hmm.h b/drivers/gpu/drm/xe/xe_hmm.h
> deleted file mode 100644
> index 0ea98d8e7bbc..000000000000
> --- a/drivers/gpu/drm/xe/xe_hmm.h
> +++ /dev/null
> @@ -1,18 +0,0 @@
> -/* SPDX-License-Identifier: MIT
> - *
> - * Copyright © 2024 Intel Corporation
> - */
> -
> -#ifndef _XE_HMM_H_
> -#define _XE_HMM_H_
> -
> -#include <linux/types.h>
> -
> -struct xe_userptr_vma;
> -
> -int xe_hmm_userptr_populate_range(struct xe_userptr_vma *uvma, bool is_mm_mmap_locked);
> -
> -void xe_hmm_userptr_free_sg(struct xe_userptr_vma *uvma);
> -
> -void xe_hmm_userptr_unmap(struct xe_userptr_vma *uvma);
> -#endif
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index c43e7619cb80..824bf99e5f71 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -727,8 +727,8 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>
> if (!xe_vma_is_null(vma) && !range) {
> if (xe_vma_is_userptr(vma))
> - xe_res_first_sg(to_userptr_vma(vma)->userptr.sg, 0,
> - xe_vma_size(vma), &curs);
> + xe_res_first_dma(to_userptr_vma(vma)->userptr.range.pages.dma_addr, 0,
> + xe_vma_size(vma), &curs);
> else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
> xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
> xe_vma_size(vma), &curs);
> @@ -998,7 +998,7 @@ static void xe_pt_commit_locks_assert(struct xe_vma *vma)
> xe_pt_commit_prepare_locks_assert(vma);
>
> if (xe_vma_is_userptr(vma))
> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
> }
>
> static void xe_pt_commit(struct xe_vma *vma,
> @@ -1336,7 +1336,7 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
> struct xe_userptr_vma *uvma;
> unsigned long notifier_seq;
>
> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>
> if (!xe_vma_is_userptr(vma))
> return 0;
> @@ -1366,7 +1366,7 @@ static int op_check_userptr(struct xe_vm *vm, struct xe_vma_op *op,
> {
> int err = 0;
>
> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>
> switch (op->base.op) {
> case DRM_GPUVA_OP_MAP:
> @@ -1407,12 +1407,12 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
> if (err)
> return err;
>
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
>
> list_for_each_entry(op, &vops->list, link) {
> err = op_check_userptr(vm, op, pt_update_ops);
> if (err) {
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> break;
> }
> }
> @@ -2133,7 +2133,7 @@ static void bind_op_commit(struct xe_vm *vm, struct xe_tile *tile,
> vma->tile_present |= BIT(tile->id);
> vma->tile_staged &= ~BIT(tile->id);
> if (xe_vma_is_userptr(vma)) {
> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
> to_userptr_vma(vma)->userptr.initial_bind = true;
> }
>
> @@ -2169,7 +2169,7 @@ static void unbind_op_commit(struct xe_vm *vm, struct xe_tile *tile,
> if (!vma->tile_present) {
> list_del_init(&vma->combined_links.rebind);
> if (xe_vma_is_userptr(vma)) {
> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>
> spin_lock(&vm->userptr.invalidated_lock);
> list_del_init(&to_userptr_vma(vma)->userptr.invalidate_link);
> @@ -2414,7 +2414,7 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops)
> if (pt_update_ops->needs_svm_lock)
> xe_svm_notifier_unlock(vm);
> if (pt_update_ops->needs_userptr_lock)
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
>
> return fence;
>
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 60303998bd61..fdc24718b9ad 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -5,6 +5,7 @@
>
> #include "xe_vm.h"
>
> +#include "drm/drm_gpusvm.h"
> #include <linux/dma-fence-array.h>
> #include <linux/nospec.h>
>
> @@ -40,7 +41,6 @@
> #include "xe_sync.h"
> #include "xe_trace_bo.h"
> #include "xe_wa.h"
> -#include "xe_hmm.h"
>
> static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
> {
> @@ -53,7 +53,7 @@ static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
> *
> * Check if the userptr vma has been invalidated since last successful
> * repin. The check is advisory only and can the function can be called
> - * without the vm->userptr.notifier_lock held. There is no guarantee that the
> + * without the vm->userptr.gpusvm.notifier_lock held. There is no guarantee that the
> * vma userptr will remain valid after a lockless check, so typically
> * the call needs to be followed by a proper check under the notifier_lock.
> *
> @@ -71,11 +71,17 @@ int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma)
> struct xe_vma *vma = &uvma->vma;
> struct xe_vm *vm = xe_vma_vm(vma);
> struct xe_device *xe = vm->xe;
> + struct drm_gpusvm_ctx ctx = {
> + .read_only = xe_vma_read_only(vma),
> + };
>
> lockdep_assert_held(&vm->lock);
> xe_assert(xe, xe_vma_is_userptr(vma));
>
> - return xe_hmm_userptr_populate_range(uvma, false);
> + if (vma->gpuva.flags & XE_VMA_DESTROYED)
> + return 0;
> +
> + return drm_gpusvm_basic_range_get_pages(&uvma->userptr.range, &ctx);
> }
>
> static bool preempt_fences_waiting(struct xe_vm *vm)
> @@ -249,7 +255,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
> ++vm->preempt.num_exec_queues;
> q->lr.pfence = pfence;
>
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
>
> drm_gpuvm_resv_add_fence(&vm->gpuvm, exec, pfence,
> DMA_RESV_USAGE_BOOKKEEP, DMA_RESV_USAGE_BOOKKEEP);
> @@ -263,7 +269,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
> if (wait)
> dma_fence_enable_sw_signaling(pfence);
>
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
>
> out_fini:
> drm_exec_fini(exec);
> @@ -305,14 +311,14 @@ void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
> * @vm: The VM.
> *
> * This function checks for whether the VM has userptrs that need repinning,
> - * and provides a release-type barrier on the userptr.notifier_lock after
> + * and provides a release-type barrier on the userptr.gpusvm.notifier_lock after
> * checking.
> *
> * Return: 0 if there are no userptrs needing repinning, -EAGAIN if there are.
> */
> int __xe_vm_userptr_needs_repin(struct xe_vm *vm)
> {
> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>
> return (list_empty(&vm->userptr.repin_list) &&
> list_empty(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
> @@ -546,9 +552,9 @@ static void preempt_rebind_work_func(struct work_struct *w)
> (!(__tries)++ || __xe_vm_userptr_needs_repin(__vm)) : \
> __xe_vm_userptr_needs_repin(__vm))
>
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
> if (retry_required(tries, vm)) {
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> err = -EAGAIN;
> goto out_unlock;
> }
> @@ -562,7 +568,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
> /* Point of no return. */
> arm_preempt_fences(vm, &preempt_fences);
> resume_and_reinstall_preempt_fences(vm, &exec);
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
>
> out_unlock:
> drm_exec_fini(&exec);
> @@ -589,6 +595,9 @@ static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uv
> struct xe_vma *vma = &uvma->vma;
> struct dma_resv_iter cursor;
> struct dma_fence *fence;
> + struct drm_gpusvm_ctx ctx = {
> + .in_notifier = true,
> + };
> long err;
>
> /*
> @@ -625,7 +634,7 @@ static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uv
> XE_WARN_ON(err);
> }
>
> - xe_hmm_userptr_unmap(uvma);
> + drm_gpusvm_basic_range_unmap_pages(&uvma->userptr.range, &ctx);
> }
>
> static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
> @@ -646,11 +655,11 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
> "NOTIFIER: addr=0x%016llx, range=0x%016llx",
> xe_vma_start(vma), xe_vma_size(vma));
>
> - down_write(&vm->userptr.notifier_lock);
> + down_write(&vm->userptr.gpusvm.notifier_lock);
> mmu_interval_set_seq(mni, cur_seq);
>
> __vma_userptr_invalidate(vm, uvma);
> - up_write(&vm->userptr.notifier_lock);
> + up_write(&vm->userptr.gpusvm.notifier_lock);
> trace_xe_vma_userptr_invalidate_complete(vma);
>
> return true;
> @@ -674,7 +683,7 @@ void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma)
> /* Protect against concurrent userptr pinning */
> lockdep_assert_held(&vm->lock);
> /* Protect against concurrent notifiers */
> - lockdep_assert_held(&vm->userptr.notifier_lock);
> + lockdep_assert_held(&vm->userptr.gpusvm.notifier_lock);
> /*
> * Protect against concurrent instances of this function and
> * the critical exec sections
> @@ -747,7 +756,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
> }
>
> if (err) {
> - down_write(&vm->userptr.notifier_lock);
> + down_write(&vm->userptr.gpusvm.notifier_lock);
> spin_lock(&vm->userptr.invalidated_lock);
> list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
> userptr.repin_link) {
> @@ -756,7 +765,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
> &vm->userptr.invalidated);
> }
> spin_unlock(&vm->userptr.invalidated_lock);
> - up_write(&vm->userptr.notifier_lock);
> + up_write(&vm->userptr.gpusvm.notifier_lock);
> }
> return err;
> }
> @@ -1222,7 +1231,6 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
> INIT_LIST_HEAD(&userptr->invalidate_link);
> INIT_LIST_HEAD(&userptr->repin_link);
> vma->gpuva.gem.offset = bo_offset_or_userptr;
> - mutex_init(&userptr->unmap_mutex);
>
> err = mmu_interval_notifier_insert(&userptr->notifier,
> current->mm,
> @@ -1233,7 +1241,10 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
> return ERR_PTR(err);
> }
>
> - userptr->notifier_seq = LONG_MAX;
> + drm_gpusvm_basic_range_init(&vm->userptr.gpusvm,
> + &userptr->range,
> + &userptr->notifier,
> + &userptr->notifier_seq);
> }
>
> xe_vm_get(vm);
> @@ -1255,8 +1266,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
> struct xe_userptr_vma *uvma = to_userptr_vma(vma);
> struct xe_userptr *userptr = &uvma->userptr;
>
> - if (userptr->sg)
> - xe_hmm_userptr_free_sg(uvma);
> + drm_gpusvm_basic_range_free_pages(&uvma->userptr.range);
>
> /*
> * Since userptr pages are not pinned, we can't remove
> @@ -1264,7 +1274,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
> * them anymore
> */
> mmu_interval_notifier_remove(&userptr->notifier);
> - mutex_destroy(&userptr->unmap_mutex);
> + drm_gpusvm_basic_range_fini(&uvma->userptr.range);
> xe_vm_put(vm);
> } else if (xe_vma_is_null(vma) || xe_vma_is_cpu_addr_mirror(vma)) {
> xe_vm_put(vm);
> @@ -1657,7 +1667,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
>
> INIT_LIST_HEAD(&vm->userptr.repin_list);
> INIT_LIST_HEAD(&vm->userptr.invalidated);
> - init_rwsem(&vm->userptr.notifier_lock);
> spin_lock_init(&vm->userptr.invalidated_lock);
>
> ttm_lru_bulk_move_init(&vm->lru_bulk_move);
> @@ -1763,6 +1772,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
> goto err_close;
> }
>
> + drm_gpusvm_basic_init(&vm->userptr.gpusvm, "xe-vm-userptr", &xe->drm);
> + drm_gpusvm_driver_set_lock(&vm->userptr.gpusvm, &vm->lock);
> +
> if (number_tiles > 1)
> vm->composite_fence_ctx = dma_fence_context_alloc(1);
>
> @@ -1867,9 +1879,9 @@ void xe_vm_close_and_put(struct xe_vm *vm)
> vma = gpuva_to_vma(gpuva);
>
> if (xe_vma_has_no_bo(vma)) {
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
> vma->gpuva.flags |= XE_VMA_DESTROYED;
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> }
>
> xe_vm_remove_vma(vm, vma);
> @@ -1916,6 +1928,8 @@ void xe_vm_close_and_put(struct xe_vm *vm)
> if (xe_vm_in_fault_mode(vm))
> xe_svm_fini(vm);
>
> + drm_gpusvm_basic_fini(&vm->userptr.gpusvm);
> +
> up_write(&vm->lock);
>
> down_write(&xe->usm.lock);
> @@ -2144,9 +2158,9 @@ static const u32 region_to_mem_type[] = {
> static void prep_vma_destroy(struct xe_vm *vm, struct xe_vma *vma,
> bool post_commit)
> {
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
> vma->gpuva.flags |= XE_VMA_DESTROYED;
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> if (post_commit)
> xe_vm_remove_vma(vm, vma);
> }
> @@ -2625,9 +2639,9 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
> struct xe_vma *vma = gpuva_to_vma(op->base.unmap.va);
>
> if (vma) {
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
> vma->gpuva.flags &= ~XE_VMA_DESTROYED;
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> if (post_commit)
> xe_vm_insert_vma(vm, vma);
> }
> @@ -2646,9 +2660,9 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
> xe_vma_destroy_unlocked(op->remap.next);
> }
> if (vma) {
> - down_read(&vm->userptr.notifier_lock);
> + down_read(&vm->userptr.gpusvm.notifier_lock);
> vma->gpuva.flags &= ~XE_VMA_DESTROYED;
> - up_read(&vm->userptr.notifier_lock);
> + up_read(&vm->userptr.gpusvm.notifier_lock);
> if (post_commit)
> xe_vm_insert_vma(vm, vma);
> }
> diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> index 84fa41b9fa20..08f295873a2b 100644
> --- a/drivers/gpu/drm/xe/xe_vm_types.h
> +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> @@ -52,26 +52,23 @@ struct xe_userptr {
> struct list_head invalidate_link;
> /** @userptr: link into VM repin list if userptr. */
> struct list_head repin_link;
> + /**
> + * @range: gpusvm range for this user pointer.
> + */
> + struct drm_gpusvm_basic_range range;
> /**
> * @notifier: MMU notifier for user pointer (invalidation call back)
> */
> struct mmu_interval_notifier notifier;
> - /** @sgt: storage for a scatter gather table */
> - struct sg_table sgt;
> - /** @sg: allocated scatter gather table */
> - struct sg_table *sg;
> +
> /** @notifier_seq: notifier sequence number */
> unsigned long notifier_seq;
> - /** @unmap_mutex: Mutex protecting dma-unmapping */
> - struct mutex unmap_mutex;
> /**
> * @initial_bind: user pointer has been bound at least once.
> - * write: vm->userptr.notifier_lock in read mode and vm->resv held.
> - * read: vm->userptr.notifier_lock in write mode or vm->resv held.
> + * write: vm->userptr.gpusvm.notifier_lock in read mode and vm->resv held.
> + * read: vm->userptr.gpusvm.notifier_lock in write mode or vm->resv held.
> */
> bool initial_bind;
> - /** @mapped: Whether the @sgt sg-table is dma-mapped. Protected by @unmap_mutex. */
> - bool mapped;
> #if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
> u32 divisor;
> #endif
> @@ -109,7 +106,7 @@ struct xe_vma {
> /**
> * @tile_present: GT mask of binding are present for this VMA.
> * protected by vm->lock, vm->resv and for userptrs,
> - * vm->userptr.notifier_lock for writing. Needs either for reading,
> + * vm->userptr.gpusvm.notifier_lock for writing. Needs either for reading,
> * but if reading is done under the vm->lock only, it needs to be held
> * in write mode.
> */
> @@ -238,16 +235,17 @@ struct xe_vm {
>
> /** @userptr: user pointer state */
> struct {
> + /*
> + * @gpusvm: The gpusvm userptr for this vm. The
> + * gpusvm.notifier_lock protects notifier in write mode and
> + * submission in read mode.
> + */
> + struct drm_gpusvm gpusvm;
This is not a full review, but I believe the notifier lock should be
shared between SVM and userptr (i.e., we should have a single drm_gpusvm
structure for both). The reasoning is that if we prefetch both SVM
ranges and userptr within the same list of VM bind operations, we take a
single lock at the final step for both. Following this approach,
xe_pt_userptr_pre_commit and xe_pt_svm_pre_commit would be merged into a
single function.
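Roughly what I have in mind, as an illustrative sketch only (not a real
patch; the exact field placement here is an assumption on my side):

struct xe_vm {
	/* ...other members unchanged... */

	/*
	 * One gpusvm instance for the whole VM, shared by SVM ranges and
	 * userptr, so gpusvm.notifier_lock is the single notifier lock.
	 */
	struct drm_gpusvm gpusvm;

	/** @userptr: user pointer state */
	struct {
		struct list_head repin_list;
		spinlock_t invalidated_lock;
		struct list_head invalidated;
	} userptr;
};

xe_pt_userptr_pre_commit() and xe_pt_svm_pre_commit() would then both take
vm->gpusvm.notifier_lock around the op checks, which is what would let them
be folded into one function.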
Adding +Himal and Thomas to see if they agree with me here.
Matt
> /**
> * @userptr.repin_list: list of VMAs which are user pointers,
> * and needs repinning. Protected by @lock.
> */
> struct list_head repin_list;
> - /**
> - * @notifier_lock: protects notifier in write mode and
> - * submission in read mode.
> - */
> - struct rw_semaphore notifier_lock;
> /**
> * @userptr.invalidated_lock: Protects the
> * @userptr.invalidated list.
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 5/7] drm/gpusvm: lower get/unmap pages
2025-03-20 17:30 ` [PATCH 5/7] drm/gpusvm: lower get/unmap pages Matthew Auld
@ 2025-03-20 20:59 ` Matthew Brost
0 siblings, 0 replies; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 20:59 UTC (permalink / raw)
To: Matthew Auld; +Cc: intel-xe, dri-devel, Thomas Hellström
On Thu, Mar 20, 2025 at 05:30:02PM +0000, Matthew Auld wrote:
Maybe try rewording this. I find "lower" in the patch subject / usage
below a bit confusing.
> Lower get/unmap pages to facilitate operating on the lowest level
> pieces, without needing a full drm_gpusvm_range structure. In the next
> patch we want to extract get/unmap/free to operate on a different range
> type.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 90 ++++++++++++++++++++++--------------
> 1 file changed, 55 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index f27731a51f34..2beca5a6dc0a 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1323,38 +1323,28 @@ drm_gpusvm_range_pages_valid_unlocked(struct drm_gpusvm *gpusvm,
> return pages_valid;
> }
>
> -/**
> - * drm_gpusvm_range_get_pages() - Get pages for a GPU SVM range
> - * @gpusvm: Pointer to the GPU SVM structure
> - * @range: Pointer to the GPU SVM range structure
> - * @ctx: GPU SVM context
> - *
> - * This function gets pages for a GPU SVM range and ensures they are mapped for
> - * DMA access.
> - *
> - * Return: 0 on success, negative error code on failure.
> - */
> -int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> - struct drm_gpusvm_range *range,
> - const struct drm_gpusvm_ctx *ctx)
> +static int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
> + struct drm_gpusvm_pages *svm_pages,
> + struct mm_struct *mm,
> + struct mmu_interval_notifier *notifier,
> + unsigned long *notifier_seq,
> + unsigned long mm_start,
> + unsigned long mm_end,
s/mm_start/pages_start ?
s/mm_end/pages_end ?
Matt
> + const struct drm_gpusvm_ctx *ctx)
> {
> - struct drm_gpusvm_pages *svm_pages = &range->pages;
> - struct mmu_interval_notifier *notifier = &range->notifier->notifier;
> struct hmm_range hmm_range = {
> .default_flags = HMM_PFN_REQ_FAULT | (ctx->read_only ? 0 :
> HMM_PFN_REQ_WRITE),
> .notifier = notifier,
> - .start = drm_gpusvm_range_start(range),
> - .end = drm_gpusvm_range_end(range),
> + .start = mm_start,
> + .end = mm_end,
> .dev_private_owner = gpusvm->device_private_page_owner,
> };
> - struct mm_struct *mm = gpusvm->mm;
> struct drm_gpusvm_zdd *zdd;
> unsigned long timeout =
> jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
> unsigned long i, j;
> - unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
> - drm_gpusvm_range_end(range));
> + unsigned long npages = npages_in_range(mm_start, mm_end);
> unsigned long num_dma_mapped;
> unsigned int order = 0;
> unsigned long *pfns;
> @@ -1518,7 +1508,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> drm_gpusvm_notifier_unlock(gpusvm);
> kvfree(pfns);
> set_seqno:
> - range->notifier_seq = hmm_range.notifier_seq;
> + *notifier_seq = hmm_range.notifier_seq;
>
> return 0;
>
> @@ -1531,8 +1521,48 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> goto retry;
> return err;
> }
> +
> +/**
> + * drm_gpusvm_range_get_pages() - Get pages for a GPU SVM range
> + * @gpusvm: Pointer to the GPU SVM structure
> + * @range: Pointer to the GPU SVM range structure
> + * @ctx: GPU SVM context
> + *
> + * This function gets pages for a GPU SVM range and ensures they are mapped for
> + * DMA access.
> + *
> + * Return: 0 on success, negative error code on failure.
> + */
> +int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> + struct drm_gpusvm_range *range,
> + const struct drm_gpusvm_ctx *ctx)
> +{
> + return drm_gpusvm_get_pages(gpusvm, &range->pages, gpusvm->mm,
> + &range->notifier->notifier,
> + &range->notifier_seq,
> + drm_gpusvm_range_start(range),
> + drm_gpusvm_range_end(range), ctx);
> +}
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_get_pages);
>
> +static void drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
> + unsigned long mm_start, unsigned long mm_end,
> + struct drm_gpusvm_pages *svm_pages,
> + const struct drm_gpusvm_ctx *ctx)
> +{
> + unsigned long npages = npages_in_range(mm_start, mm_end);
> +
> + if (ctx->in_notifier)
> + lockdep_assert_held_write(&gpusvm->notifier_lock);
> + else
> + drm_gpusvm_notifier_lock(gpusvm);
> +
> + __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
> +
> + if (!ctx->in_notifier)
> + drm_gpusvm_notifier_unlock(gpusvm);
> +}
> +
> /**
> * drm_gpusvm_range_unmap_pages() - Unmap pages associated with a GPU SVM range
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -1549,19 +1579,9 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> struct drm_gpusvm_range *range,
> const struct drm_gpusvm_ctx *ctx)
> {
> - struct drm_gpusvm_pages *svm_pages = &range->pages;
> - unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
> - drm_gpusvm_range_end(range));
> -
> - if (ctx->in_notifier)
> - lockdep_assert_held_write(&gpusvm->notifier_lock);
> - else
> - drm_gpusvm_notifier_lock(gpusvm);
> -
> - __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
> -
> - if (!ctx->in_notifier)
> - drm_gpusvm_notifier_unlock(gpusvm);
> + return drm_gpusvm_unmap_pages(gpusvm, drm_gpusvm_range_start(range),
> + drm_gpusvm_range_end(range),
> + &range->pages, ctx);
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_unmap_pages);
>
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 6/7] drm/gpusvm: support basic_range
2025-03-20 17:30 ` [PATCH 6/7] drm/gpusvm: support basic_range Matthew Auld
@ 2025-03-20 21:04 ` Matthew Brost
0 siblings, 0 replies; 22+ messages in thread
From: Matthew Brost @ 2025-03-20 21:04 UTC (permalink / raw)
To: Matthew Auld
Cc: intel-xe, dri-devel, Thomas Hellström, himal.prasad.ghimiray
On Thu, Mar 20, 2025 at 05:30:03PM +0000, Matthew Auld wrote:
> Idea is to use this for userptr, where we mostly just need
> get/unmap/free pages, plus some kind of common notifier lock at the svm
> level (provided by gpusvm). The range itself also maps to a single
> notifier, covering the entire range (provided by the user).
>
So, same comment as in patch #7 [1]: could we drop the idea of a basic
GPUSVM and unify SVM and userptr GPUSVM to share the locking?
Following that, rather than having wrappers in GPU SVM for
drm_gpusvm_basic_range, can we expose the lower-level GPU SVM functions
that operate on pages and have wrappers call these in the Xe layer?
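Concretely, something along these lines (rough sketch only, untested; the
prototype just mirrors what patch #5 already has internally, and the
Xe-side field/helper names below are assumptions):

/* drm_gpusvm.h: expose the pages-level helper instead of wrapping it */
int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
			 struct drm_gpusvm_pages *svm_pages,
			 struct mm_struct *mm,
			 struct mmu_interval_notifier *notifier,
			 unsigned long *notifier_seq,
			 unsigned long pages_start, unsigned long pages_end,
			 const struct drm_gpusvm_ctx *ctx);

/* xe_vm.c: the userptr wrapper then lives in the Xe layer */
int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma)
{
	struct xe_vma *vma = &uvma->vma;
	struct xe_vm *vm = xe_vma_vm(vma);
	struct xe_userptr *userptr = &uvma->userptr;
	struct drm_gpusvm_ctx ctx = {
		.read_only = xe_vma_read_only(vma),
	};

	/* Assumes a struct drm_gpusvm_pages embedded directly in
	 * xe_userptr instead of a basic_range. */
	return drm_gpusvm_get_pages(&vm->userptr.gpusvm, &userptr->pages,
				    userptr->notifier.mm, &userptr->notifier,
				    &userptr->notifier_seq,
				    xe_vma_userptr(vma),
				    xe_vma_userptr(vma) + xe_vma_size(vma),
				    &ctx);
}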
Again, adding +Himal and Thomas for their opinions.
Matt
[1] https://patchwork.freedesktop.org/patch/643976/?series=146553&rev=1#comment_1177820
> TODO: needs proper kernel-doc, assuming this change makes sense.
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 138 +++++++++++++++++++++++++++++------
> include/drm/drm_gpusvm.h | 23 ++++++
> 2 files changed, 137 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 2beca5a6dc0a..ca571610214c 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -521,6 +521,41 @@ static const struct mmu_interval_notifier_ops drm_gpusvm_notifier_ops = {
> .invalidate = drm_gpusvm_notifier_invalidate,
> };
>
> +static void __drm_gpusvm_init(struct drm_gpusvm *gpusvm, const char *name,
> + struct drm_device *drm, struct mm_struct *mm,
> + void *device_private_page_owner,
> + unsigned long mm_start, unsigned long mm_range,
> + unsigned long notifier_size,
> + const struct drm_gpusvm_ops *ops,
> + const unsigned long *chunk_sizes, int num_chunks)
> +{
> + gpusvm->name = name;
> + gpusvm->drm = drm;
> + gpusvm->mm = mm;
> + gpusvm->device_private_page_owner = device_private_page_owner;
> + gpusvm->mm_start = mm_start;
> + gpusvm->mm_range = mm_range;
> + gpusvm->notifier_size = notifier_size;
> + gpusvm->ops = ops;
> + gpusvm->chunk_sizes = chunk_sizes;
> + gpusvm->num_chunks = num_chunks;
> +
> + if (mm)
> + mmgrab(mm);
> + gpusvm->root = RB_ROOT_CACHED;
> + INIT_LIST_HEAD(&gpusvm->notifier_list);
> +
> + init_rwsem(&gpusvm->notifier_lock);
> +
> + fs_reclaim_acquire(GFP_KERNEL);
> + might_lock(&gpusvm->notifier_lock);
> + fs_reclaim_release(GFP_KERNEL);
> +
> +#ifdef CONFIG_LOCKDEP
> + gpusvm->lock_dep_map = NULL;
> +#endif
> +}
> +
> /**
> * drm_gpusvm_init() - Initialize the GPU SVM.
> * @gpusvm: Pointer to the GPU SVM structure.
> @@ -552,35 +587,32 @@ int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
> if (!ops->invalidate || !num_chunks)
> return -EINVAL;
>
> - gpusvm->name = name;
> - gpusvm->drm = drm;
> - gpusvm->mm = mm;
> - gpusvm->device_private_page_owner = device_private_page_owner;
> - gpusvm->mm_start = mm_start;
> - gpusvm->mm_range = mm_range;
> - gpusvm->notifier_size = notifier_size;
> - gpusvm->ops = ops;
> - gpusvm->chunk_sizes = chunk_sizes;
> - gpusvm->num_chunks = num_chunks;
> -
> - mmgrab(mm);
> - gpusvm->root = RB_ROOT_CACHED;
> - INIT_LIST_HEAD(&gpusvm->notifier_list);
> -
> - init_rwsem(&gpusvm->notifier_lock);
> -
> - fs_reclaim_acquire(GFP_KERNEL);
> - might_lock(&gpusvm->notifier_lock);
> - fs_reclaim_release(GFP_KERNEL);
> -
> -#ifdef CONFIG_LOCKDEP
> - gpusvm->lock_dep_map = NULL;
> -#endif
> + __drm_gpusvm_init(gpusvm, name, drm, mm, device_private_page_owner,
> + mm_start, mm_range, notifier_size, ops, chunk_sizes,
> + num_chunks);
>
> return 0;
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_init);
>
> +static bool drm_gpusvm_is_basic(struct drm_gpusvm *svm)
> +{
> + return !svm->mm;
> +}
> +
> +void drm_gpusvm_basic_init(struct drm_gpusvm *gpusvm, const char *name,
> + struct drm_device *drm)
> +{
> + __drm_gpusvm_init(gpusvm, name, drm, NULL, NULL, 0, 0, 0, NULL, NULL,
> + 0);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_init);
> +
> +void drm_gpusvm_basic_fini(struct drm_gpusvm *gpusvm)
> +{
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_fini);
> +
> /**
> * drm_gpusvm_notifier_find() - Find GPU SVM notifier
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -1001,6 +1033,27 @@ static void drm_gpusvm_driver_lock_held(struct drm_gpusvm *gpusvm)
> }
> #endif
>
> +void drm_gpusvm_basic_range_init(struct drm_gpusvm *svm,
> + struct drm_gpusvm_basic_range *range,
> + struct mmu_interval_notifier *notifier,
> + unsigned long *notifier_seq)
> +{
> + WARN_ON(!drm_gpusvm_is_basic(svm));
> +
> + range->gpusvm = svm;
> + range->notifier = notifier;
> + range->notifier_seq = notifier_seq;
> + *notifier_seq = LONG_MAX;
> + memset(&range->pages, 0, sizeof(range->pages));
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_init);
> +
> +void drm_gpusvm_basic_range_fini(struct drm_gpusvm_basic_range *range)
> +{
> + WARN_ON(range->pages.flags.has_dma_mapping);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_fini);
> +
> /**
> * drm_gpusvm_range_find_or_insert() - Find or insert GPU SVM range
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -1176,6 +1229,19 @@ static void drm_gpusvm_free_pages(struct drm_gpusvm *gpusvm,
> }
> }
>
> +void drm_gpusvm_basic_range_free_pages(struct drm_gpusvm_basic_range *range)
> +{
> + unsigned long npages =
> + npages_in_range(range->notifier->interval_tree.start,
> + range->notifier->interval_tree.last + 1);
> +
> + drm_gpusvm_notifier_lock(range->gpusvm);
> + __drm_gpusvm_unmap_pages(range->gpusvm, &range->pages, npages);
> + drm_gpusvm_free_pages(range->gpusvm, &range->pages);
> + drm_gpusvm_notifier_unlock(range->gpusvm);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_free_pages);
> +
> /**
> * drm_gpusvm_range_remove() - Remove GPU SVM range
> * @gpusvm: Pointer to the GPU SVM structure
> @@ -1545,6 +1611,20 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_get_pages);
>
> +int drm_gpusvm_basic_range_get_pages(struct drm_gpusvm_basic_range *range,
> + const struct drm_gpusvm_ctx *ctx)
> +{
> + drm_gpusvm_driver_lock_held(range->gpusvm);
> +
> + return drm_gpusvm_get_pages(range->gpusvm, &range->pages,
> + range->notifier->mm, range->notifier,
> + range->notifier_seq,
> + range->notifier->interval_tree.start,
> + range->notifier->interval_tree.last + 1,
> + ctx);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_get_pages);
> +
> static void drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
> unsigned long mm_start, unsigned long mm_end,
> struct drm_gpusvm_pages *svm_pages,
> @@ -1585,6 +1665,16 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
> }
> EXPORT_SYMBOL_GPL(drm_gpusvm_range_unmap_pages);
>
> +void drm_gpusvm_basic_range_unmap_pages(struct drm_gpusvm_basic_range *range,
> + const struct drm_gpusvm_ctx *ctx)
> +{
> + drm_gpusvm_unmap_pages(range->gpusvm,
> + range->notifier->interval_tree.start,
> + range->notifier->interval_tree.last + 1,
> + &range->pages, ctx);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpusvm_basic_range_unmap_pages);
> +
> /**
> * drm_gpusvm_migration_unlock_put_page() - Put a migration page
> * @page: Pointer to the page to put
> diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
> index c15c733ef0ad..82c4e58fa84c 100644
> --- a/include/drm/drm_gpusvm.h
> +++ b/include/drm/drm_gpusvm.h
> @@ -305,6 +305,29 @@ struct drm_gpusvm_ctx {
> unsigned int devmem_possible :1;
> };
>
> +struct drm_gpusvm_basic_range {
> + struct drm_gpusvm *gpusvm;
> + struct drm_gpusvm_pages pages;
> + struct mmu_interval_notifier *notifier;
> + unsigned long *notifier_seq;
> +};
> +
> +void drm_gpusvm_basic_init(struct drm_gpusvm *gpusvm, const char *name,
> + struct drm_device *drm);
> +void drm_gpusvm_basic_fini(struct drm_gpusvm *gpusvm);
> +
> +void drm_gpusvm_basic_range_init(struct drm_gpusvm *svm,
> + struct drm_gpusvm_basic_range *range,
> + struct mmu_interval_notifier *notifier,
> + unsigned long *notifier_seq);
> +void drm_gpusvm_basic_range_fini(struct drm_gpusvm_basic_range *range);
> +
> +int drm_gpusvm_basic_range_get_pages(struct drm_gpusvm_basic_range *range,
> + const struct drm_gpusvm_ctx *ctx);
> +void drm_gpusvm_basic_range_unmap_pages(struct drm_gpusvm_basic_range *range,
> + const struct drm_gpusvm_ctx *ctx);
> +void drm_gpusvm_basic_range_free_pages(struct drm_gpusvm_basic_range *range);
> +
> int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
> const char *name, struct drm_device *drm,
> struct mm_struct *mm, void *device_private_page_owner,
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/7] drm/gpusvm: mark pages as dirty
2025-03-20 19:33 ` Matthew Brost
@ 2025-03-21 11:37 ` Matthew Auld
0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2025-03-21 11:37 UTC (permalink / raw)
To: Matthew Brost, Thomas Hellström; +Cc: intel-xe, dri-devel
On 20/03/2025 19:33, Matthew Brost wrote:
> On Thu, Mar 20, 2025 at 08:29:42PM +0100, Thomas Hellström wrote:
>> On Thu, 2025-03-20 at 17:30 +0000, Matthew Auld wrote:
>>> If the memory is going to be accessed by the device, make sure we
>>> mark
>>> the pages accordingly such that the kernel knows this. This aligns
>>> with
>>> the xe-userptr code.
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>> drivers/gpu/drm/drm_gpusvm.c | 9 +++++++++
>>> 1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gpusvm.c
>>> b/drivers/gpu/drm/drm_gpusvm.c
>>> index 7f1cf5492bba..5b4ecd36dff1 100644
>>> --- a/drivers/gpu/drm/drm_gpusvm.c
>>> +++ b/drivers/gpu/drm/drm_gpusvm.c
>>> @@ -1471,6 +1471,7 @@ int drm_gpusvm_range_get_pages(struct
>>> drm_gpusvm *gpusvm,
>>> pages[i] = page;
>>> } else {
>>> dma_addr_t addr;
>>> + unsigned int k;
>>>
>>> if (is_zone_device_page(page) || zdd) {
>>> err = -EOPNOTSUPP;
>>> @@ -1489,6 +1490,14 @@ int drm_gpusvm_range_get_pages(struct
>>> drm_gpusvm *gpusvm,
>>> range->dma_addr[j] =
>>> drm_pagemap_device_addr_encode
>>> (addr, DRM_INTERCONNECT_SYSTEM,
>>> order,
>>> dma_dir);
>>> +
>>> + for (k = 0; k < 1u << order; k++) {
>>> + if (!ctx->read_only)
>>> + set_page_dirty_lock(page);
>>> +
>>> + mark_page_accessed(page);
>>> + page++;
>>> + }
>>
>> Actually I think the userptr code did this unnecessarily. This is done
>> in the CPU page-fault handler, which means it's taken care of during
>> hmm_range_fault(). Now if the CPU PTE happens to be present and
>> writeable there will be no fault, but that was done when the page was
>> faulted in anyway.
>>
>> If there was a page cleaning event in between so the dirty flag was
>> dropped, then my understanding is that in addition to an invalidation
>> notifier, also the CPU PTE is zapped, so that it will be dirtied again
>> on the next write access, either by the CPU faulting the page or
>> hmm_range_fault() if there is a GPU page-fault.
>>
>> So I think we're good without this patch.
>>
>
> I was going to suggest the same thing as Thomas - we are good without
> this patch for the reasons he states.
Ah, will drop this then. Thanks.
>
> Matt
>
>> /Thomas
>>
>>
>>
>>> }
>>> i += 1 << order;
>>> num_dma_mapped = i;
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage
2025-03-20 20:40 ` Matthew Brost
@ 2025-03-21 11:42 ` Matthew Auld
0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2025-03-21 11:42 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, Thomas Hellström
On 20/03/2025 20:40, Matthew Brost wrote:
> On Thu, Mar 20, 2025 at 05:29:58PM +0000, Matthew Auld wrote:
>> Handle the case where the hmm range partially covers a huge page (like
>> 2M), otherwise we can potentially end up doing something nasty like
>> mapping memory which potentially is outside the range, and maybe not
>> even mapped by the mm. Fix is based on the xe userptr code, which in a
>> future patch will directly use gpusvm, so needs alignment here.
>>
>> Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> ---
>> drivers/gpu/drm/drm_gpusvm.c | 25 +++++++++++++++++++++++--
>> 1 file changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
>> index 2451c816edd5..48993cef4a74 100644
>> --- a/drivers/gpu/drm/drm_gpusvm.c
>> +++ b/drivers/gpu/drm/drm_gpusvm.c
>> @@ -817,6 +817,27 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
>> return range;
>> }
>>
>> +/*
>> + * To allow skipping PFNs with the same flags (like when they belong to
>> + * the same huge PTE) when looping over the pfn array, take a given a hmm_pfn,
>> + * and return the largest order that will fit inside the PTE, but also crucially
>> + * accounting for the original hmm range boundaries.
>> + */
>
> I'd make this proper kernel doc given all of drm_gpusvm.c has proper kernel doc.
Ok, typically we don't add kernel-doc for static declarations, so I didn't
bother. Will make this consistent. Thanks.
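Something like this, I guess (just a sketch of the doc, matching the
existing comment):

/**
 * drm_gpusvm_hmm_pfn_to_order() - Get the largest CPU mapping order that
 * fits within the hmm range
 * @hmm_pfn: The current hmm_pfn
 * @hmm_pfn_index: Index of the hmm_pfn within the pfn array
 * @npages: Number of pages within the pfn array
 *
 * To allow skipping PFNs with the same flags (like when they belong to
 * the same huge PTE) when looping over the pfn array, take a given
 * hmm_pfn and return the largest order that will fit inside the PTE,
 * while also accounting for the original hmm range boundaries.
 *
 * Return: The largest order that fits both the PTE and the range.
 */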
>
> Otherwise LGTM.
>
> Matt
>
>> +static unsigned int drm_gpusvm_hmm_pfn_to_order(unsigned long hmm_pfn,
>> + unsigned long hmm_pfn_index,
>> + unsigned long npages)
>> +{
>> + unsigned long size;
>> +
>> + size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
>> + size -= (hmm_pfn & ~HMM_PFN_FLAGS) & (size - 1);
>> + hmm_pfn_index += size;
>> + if (hmm_pfn_index > npages)
>> + size -= (hmm_pfn_index - npages);
>> +
>> + return fls(size) - 1;
>> +}
>> +
>> /**
>> * drm_gpusvm_check_pages() - Check pages
>> * @gpusvm: Pointer to the GPU SVM structure
>> @@ -875,7 +896,7 @@ static bool drm_gpusvm_check_pages(struct drm_gpusvm *gpusvm,
>> err = -EFAULT;
>> goto err_free;
>> }
>> - i += 0x1 << hmm_pfn_to_map_order(pfns[i]);
>> + i += 0x1 << drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
>> }
>>
>> err_free:
>> @@ -1408,7 +1429,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> for (i = 0, j = 0; i < npages; ++j) {
>> struct page *page = hmm_pfn_to_page(pfns[i]);
>>
>> - order = hmm_pfn_to_map_order(pfns[i]);
>> + order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
>> if (is_device_private_page(page) ||
>> is_device_coherent_page(page)) {
>> if (zdd != page->zone_device_data && i > 0) {
>> --
>> 2.48.1
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure
2025-03-20 20:43 ` Matthew Brost
@ 2025-03-21 11:53 ` Matthew Auld
0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2025-03-21 11:53 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, Thomas Hellström
On 20/03/2025 20:43, Matthew Brost wrote:
> On Thu, Mar 20, 2025 at 05:30:01PM +0000, Matthew Auld wrote:
>> Pull the pages stuff from the svm range into its own substructure, with
>> the idea of having the main pages related routines, like get_pages(),
>> unmap_pages() and free_pages() all operating on some lower level
>> structures, which can then be re-used for stuff like userptr which wants
>> to use basic stuff like get_pages(), but not all the other more complex
>> svm stuff.
>>
>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> ---
>> drivers/gpu/drm/drm_gpusvm.c | 95 +++++++++++++++++++-----------------
>> drivers/gpu/drm/xe/xe_pt.c | 2 +-
>> drivers/gpu/drm/xe/xe_svm.c | 6 +--
>> drivers/gpu/drm/xe/xe_svm.h | 2 +-
>> include/drm/drm_gpusvm.h | 43 +++++++++-------
>> 5 files changed, 82 insertions(+), 66 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
>> index 5b4ecd36dff1..f27731a51f34 100644
>> --- a/drivers/gpu/drm/drm_gpusvm.c
>> +++ b/drivers/gpu/drm/drm_gpusvm.c
>> @@ -812,7 +812,7 @@ drm_gpusvm_range_alloc(struct drm_gpusvm *gpusvm,
>> range->itree.last = ALIGN(fault_addr + 1, chunk_size) - 1;
>> INIT_LIST_HEAD(&range->entry);
>> range->notifier_seq = LONG_MAX;
>> - range->flags.migrate_devmem = migrate_devmem ? 1 : 0;
>> + range->pages.flags.migrate_devmem = migrate_devmem ? 1 : 0;
>>
>> return range;
>> }
>> @@ -1120,27 +1120,27 @@ drm_gpusvm_range_find_or_insert(struct drm_gpusvm *gpusvm,
>> EXPORT_SYMBOL_GPL(drm_gpusvm_range_find_or_insert);
>>
>> /**
>> - * __drm_gpusvm_range_unmap_pages() - Unmap pages associated with a GPU SVM range (internal)
>> + * __drm_gpusvm_unmap_pages() - Unmap pages associated with a GPU SVM range (internal)
>> * @gpusvm: Pointer to the GPU SVM structure
>> - * @range: Pointer to the GPU SVM range structure
>> + * @svm_pages: Pointer to the GPU SVM pages structure
>> * @npages: Number of pages to unmap
>> *
>> * This function unmap pages associated with a GPU SVM range. Assumes and
>> * asserts correct locking is in place when called.
>> */
>> -static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
>> - struct drm_gpusvm_range *range,
>> - unsigned long npages)
>> +static void __drm_gpusvm_unmap_pages(struct drm_gpusvm *gpusvm,
>> + struct drm_gpusvm_pages *svm_pages,
>> + unsigned long npages)
>> {
>> unsigned long i, j;
>> - struct drm_pagemap *dpagemap = range->dpagemap;
>> + struct drm_pagemap *dpagemap = svm_pages->dpagemap;
>> struct device *dev = gpusvm->drm->dev;
>>
>> lockdep_assert_held(&gpusvm->notifier_lock);
>>
>> - if (range->flags.has_dma_mapping) {
>> + if (svm_pages->flags.has_dma_mapping) {
>> for (i = 0, j = 0; i < npages; j++) {
>> - struct drm_pagemap_device_addr *addr = &range->dma_addr[j];
>> + struct drm_pagemap_device_addr *addr = &svm_pages->dma_addr[j];
>>
>> if (addr->proto == DRM_INTERCONNECT_SYSTEM)
>> dma_unmap_page(dev,
>> @@ -1152,9 +1152,9 @@ static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
>> dev, *addr);
>> i += 1 << addr->order;
>> }
>> - range->flags.has_devmem_pages = false;
>> - range->flags.has_dma_mapping = false;
>> - range->dpagemap = NULL;
>> + svm_pages->flags.has_devmem_pages = false;
>> + svm_pages->flags.has_dma_mapping = false;
>> + svm_pages->dpagemap = NULL;
>> }
>> }
>>
>> @@ -1165,14 +1165,14 @@ static void __drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
>> *
>> * This function frees the dma address array associated with a GPU SVM range.
>> */
>> -static void drm_gpusvm_range_free_pages(struct drm_gpusvm *gpusvm,
>> - struct drm_gpusvm_range *range)
>> +static void drm_gpusvm_free_pages(struct drm_gpusvm *gpusvm,
>> + struct drm_gpusvm_pages *svm_pages)
>> {
>> lockdep_assert_held(&gpusvm->notifier_lock);
>>
>> - if (range->dma_addr) {
>> - kvfree(range->dma_addr);
>> - range->dma_addr = NULL;
>> + if (svm_pages->dma_addr) {
>> + kvfree(svm_pages->dma_addr);
>> + svm_pages->dma_addr = NULL;
>> }
>> }
>>
>> @@ -1200,8 +1200,8 @@ void drm_gpusvm_range_remove(struct drm_gpusvm *gpusvm,
>> return;
>>
>> drm_gpusvm_notifier_lock(gpusvm);
>> - __drm_gpusvm_range_unmap_pages(gpusvm, range, npages);
>> - drm_gpusvm_range_free_pages(gpusvm, range);
>> + __drm_gpusvm_unmap_pages(gpusvm, &range->pages, npages);
>> + drm_gpusvm_free_pages(gpusvm, &range->pages);
>> __drm_gpusvm_range_remove(notifier, range);
>> drm_gpusvm_notifier_unlock(gpusvm);
>>
>> @@ -1266,6 +1266,14 @@ void drm_gpusvm_range_put(struct drm_gpusvm_range *range)
>> }
>> EXPORT_SYMBOL_GPL(drm_gpusvm_range_put);
>>
>
> Same comment as patch #1, let's aim to keep kernel doc for all functions in drm_gpusvm.c.
>
>> +static bool drm_gpusvm_pages_valid(struct drm_gpusvm *gpusvm,
>> + struct drm_gpusvm_pages *svm_pages)
>> +{
>> + lockdep_assert_held(&gpusvm->notifier_lock);
>> +
>> + return svm_pages->flags.has_devmem_pages || svm_pages->flags.has_dma_mapping;
>> +}
>> +
>> /**
>> * drm_gpusvm_range_pages_valid() - GPU SVM range pages valid
>> * @gpusvm: Pointer to the GPU SVM structure
>> @@ -1283,9 +1291,7 @@ EXPORT_SYMBOL_GPL(drm_gpusvm_range_put);
>> bool drm_gpusvm_range_pages_valid(struct drm_gpusvm *gpusvm,
>> struct drm_gpusvm_range *range)
>> {
>> - lockdep_assert_held(&gpusvm->notifier_lock);
>> -
>> - return range->flags.has_devmem_pages || range->flags.has_dma_mapping;
>> + return drm_gpusvm_pages_valid(gpusvm, &range->pages);
>> }
>> EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
>>
>> @@ -1301,17 +1307,17 @@ EXPORT_SYMBOL_GPL(drm_gpusvm_range_pages_valid);
>> */
>> static bool
>> drm_gpusvm_range_pages_valid_unlocked(struct drm_gpusvm *gpusvm,
>> - struct drm_gpusvm_range *range)
>> + struct drm_gpusvm_pages *svm_pages)
>> {
>> bool pages_valid;
>>
>> - if (!range->dma_addr)
>> + if (!svm_pages->dma_addr)
>> return false;
>>
>> drm_gpusvm_notifier_lock(gpusvm);
>> - pages_valid = drm_gpusvm_range_pages_valid(gpusvm, range);
>> + pages_valid = drm_gpusvm_pages_valid(gpusvm, svm_pages);
>> if (!pages_valid)
>> - drm_gpusvm_range_free_pages(gpusvm, range);
>> + drm_gpusvm_free_pages(gpusvm, svm_pages);
>> drm_gpusvm_notifier_unlock(gpusvm);
>>
>> return pages_valid;
>> @@ -1332,6 +1338,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> struct drm_gpusvm_range *range,
>> const struct drm_gpusvm_ctx *ctx)
>> {
>> + struct drm_gpusvm_pages *svm_pages = &range->pages;
>> struct mmu_interval_notifier *notifier = &range->notifier->notifier;
>> struct hmm_range hmm_range = {
>> .default_flags = HMM_PFN_REQ_FAULT | (ctx->read_only ? 0 :
>> @@ -1360,7 +1367,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>>
>> retry:
>> hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
>> - if (drm_gpusvm_range_pages_valid_unlocked(gpusvm, range))
>> + if (drm_gpusvm_range_pages_valid_unlocked(gpusvm, svm_pages))
>> goto set_seqno;
>>
>> pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
>> @@ -1401,7 +1408,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> */
>> drm_gpusvm_notifier_lock(gpusvm);
>>
>> - if (range->flags.unmapped) {
>> + if (svm_pages->flags.unmapped) {
>> drm_gpusvm_notifier_unlock(gpusvm);
>> err = -EFAULT;
>> goto err_free;
>> @@ -1413,13 +1420,12 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> goto retry;
>> }
>>
>> - if (!range->dma_addr) {
>> + if (!svm_pages->dma_addr) {
>> /* Unlock and restart mapping to allocate memory. */
>> drm_gpusvm_notifier_unlock(gpusvm);
>> - range->dma_addr = kvmalloc_array(npages,
>> - sizeof(*range->dma_addr),
>> - GFP_KERNEL);
>> - if (!range->dma_addr) {
>> + svm_pages->dma_addr =
>> + kvmalloc_array(npages, sizeof(*svm_pages->dma_addr), GFP_KERNEL);
>> + if (!svm_pages->dma_addr) {
>> err = -ENOMEM;
>> goto err_free;
>> }
>> @@ -1457,13 +1463,13 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> goto err_unmap;
>> }
>> }
>> - range->dma_addr[j] =
>> + svm_pages->dma_addr[j] =
>> dpagemap->ops->device_map(dpagemap,
>> gpusvm->drm->dev,
>> page, order,
>> dma_dir);
>> if (dma_mapping_error(gpusvm->drm->dev,
>> - range->dma_addr[j].addr)) {
>> + svm_pages->dma_addr[j].addr)) {
>> err = -EFAULT;
>> goto err_unmap;
>> }
>> @@ -1487,7 +1493,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> goto err_unmap;
>> }
>>
>> - range->dma_addr[j] = drm_pagemap_device_addr_encode
>> + svm_pages->dma_addr[j] = drm_pagemap_device_addr_encode
>> (addr, DRM_INTERCONNECT_SYSTEM, order,
>> dma_dir);
>>
>> @@ -1503,10 +1509,10 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> num_dma_mapped = i;
>> }
>>
>> - range->flags.has_dma_mapping = true;
>> + svm_pages->flags.has_dma_mapping = true;
>> if (zdd) {
>> - range->flags.has_devmem_pages = true;
>> - range->dpagemap = dpagemap;
>> + svm_pages->flags.has_devmem_pages = true;
>> + svm_pages->dpagemap = dpagemap;
>> }
>>
>> drm_gpusvm_notifier_unlock(gpusvm);
>> @@ -1517,7 +1523,7 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
>> return 0;
>>
>> err_unmap:
>> - __drm_gpusvm_range_unmap_pages(gpusvm, range, num_dma_mapped);
>> + __drm_gpusvm_unmap_pages(gpusvm, svm_pages, num_dma_mapped);
>> drm_gpusvm_notifier_unlock(gpusvm);
>> err_free:
>> kvfree(pfns);
>> @@ -1543,6 +1549,7 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
>> struct drm_gpusvm_range *range,
>> const struct drm_gpusvm_ctx *ctx)
>> {
>> + struct drm_gpusvm_pages *svm_pages = &range->pages;
>> unsigned long npages = npages_in_range(drm_gpusvm_range_start(range),
>> drm_gpusvm_range_end(range));
>>
>> @@ -1551,7 +1558,7 @@ void drm_gpusvm_range_unmap_pages(struct drm_gpusvm *gpusvm,
>> else
>> drm_gpusvm_notifier_lock(gpusvm);
>>
>> - __drm_gpusvm_range_unmap_pages(gpusvm, range, npages);
>> + __drm_gpusvm_unmap_pages(gpusvm, svm_pages, npages);
>>
>> if (!ctx->in_notifier)
>> drm_gpusvm_notifier_unlock(gpusvm);
>> @@ -1719,7 +1726,7 @@ int drm_gpusvm_migrate_to_devmem(struct drm_gpusvm *gpusvm,
>>
>> mmap_assert_locked(gpusvm->mm);
>>
>> - if (!range->flags.migrate_devmem)
>> + if (!range->pages.flags.migrate_devmem)
>> return -EINVAL;
>>
>> if (!ops->populate_devmem_pfn || !ops->copy_to_devmem ||
>> @@ -2248,10 +2255,10 @@ void drm_gpusvm_range_set_unmapped(struct drm_gpusvm_range *range,
>> {
>> lockdep_assert_held_write(&range->gpusvm->notifier_lock);
>>
>> - range->flags.unmapped = true;
>> + range->pages.flags.unmapped = true;
>> if (drm_gpusvm_range_start(range) < mmu_range->start ||
>> drm_gpusvm_range_end(range) > mmu_range->end)
>> - range->flags.partial_unmap = true;
>> + range->pages.flags.partial_unmap = true;
>> }
>> EXPORT_SYMBOL_GPL(drm_gpusvm_range_set_unmapped);
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>> index ffaf0d02dc7d..c43e7619cb80 100644
>> --- a/drivers/gpu/drm/xe/xe_pt.c
>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>> @@ -659,7 +659,7 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>> return -EAGAIN;
>> }
>> if (xe_svm_range_has_dma_mapping(range)) {
>> - xe_res_first_dma(range->base.dma_addr, 0,
>> + xe_res_first_dma(range->base.pages.dma_addr, 0,
>> range->base.itree.last + 1 - range->base.itree.start,
>> &curs);
>> is_devmem = xe_res_is_vram(&curs);
>> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
>> index 08617a62ab07..4e7f2e77b38d 100644
>> --- a/drivers/gpu/drm/xe/xe_svm.c
>> +++ b/drivers/gpu/drm/xe/xe_svm.c
>> @@ -17,7 +17,7 @@
>> static bool xe_svm_range_in_vram(struct xe_svm_range *range)
>> {
>> /* Not reliable without notifier lock */
>> - return range->base.flags.has_devmem_pages;
>> + return range->base.pages.flags.has_devmem_pages;
>> }
>>
>> static bool xe_svm_range_has_vram_binding(struct xe_svm_range *range)
>> @@ -135,7 +135,7 @@ xe_svm_range_notifier_event_begin(struct xe_vm *vm, struct drm_gpusvm_range *r,
>> range_debug(range, "NOTIFIER");
>>
>> /* Skip if already unmapped or if no binding exist */
>> - if (range->base.flags.unmapped || !range->tile_present)
>> + if (range->base.pages.flags.unmapped || !range->tile_present)
>> return 0;
>>
>> range_debug(range, "NOTIFIER - EXECUTE");
>> @@ -766,7 +766,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
>> range_debug(range, "PAGE FAULT");
>>
>> /* XXX: Add migration policy, for now migrate range once */
>> - if (!range->skip_migrate && range->base.flags.migrate_devmem &&
>> + if (!range->skip_migrate && range->base.pages.flags.migrate_devmem &&
>> xe_svm_range_size(range) >= SZ_64K) {
>> range->skip_migrate = true;
>>
>> diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
>> index 93442738666e..79fbd4fec1fb 100644
>> --- a/drivers/gpu/drm/xe/xe_svm.h
>> +++ b/drivers/gpu/drm/xe/xe_svm.h
>> @@ -136,7 +136,7 @@ void xe_svm_range_debug(struct xe_svm_range *range, const char *operation)
>> static inline bool xe_svm_range_has_dma_mapping(struct xe_svm_range *range)
>> {
>> lockdep_assert_held(&range->base.gpusvm->notifier_lock);
>> - return range->base.flags.has_dma_mapping;
>> + return range->base.pages.flags.has_dma_mapping;
>> }
>>
>> #define xe_svm_assert_in_notifier(vm__) \
>> diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
>> index df120b4d1f83..c15c733ef0ad 100644
>> --- a/include/drm/drm_gpusvm.h
>> +++ b/include/drm/drm_gpusvm.h
>> @@ -186,14 +186,8 @@ struct drm_gpusvm_notifier {
>> };
>>
>> /**
>> - * struct drm_gpusvm_range - Structure representing a GPU SVM range
>> + * struct drm_gpusvm_pages - Structure representing a GPU SVM mapped pages
>> *
>> - * @gpusvm: Pointer to the GPU SVM structure
>> - * @notifier: Pointer to the GPU SVM notifier
>> - * @refcount: Reference count for the range
>> - * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
>> - * @entry: List entry to fast interval tree traversal
>> - * @notifier_seq: Notifier sequence number of the range's pages
>> * @dma_addr: Device address array
>> * @dpagemap: The struct drm_pagemap of the device pages we're dma-mapping.
>> * Note this is assuming only one drm_pagemap per range is allowed.
>> @@ -203,17 +197,8 @@ struct drm_gpusvm_notifier {
>> * @flags.partial_unmap: Flag indicating if the range has been partially unmapped
>> * @flags.has_devmem_pages: Flag indicating if the range has devmem pages
>> * @flags.has_dma_mapping: Flag indicating if the range has a DMA mapping
>> - *
>> - * This structure represents a GPU SVM range used for tracking memory ranges
>> - * mapped in a DRM device.
>> */
>> -struct drm_gpusvm_range {
>> - struct drm_gpusvm *gpusvm;
>> - struct drm_gpusvm_notifier *notifier;
>> - struct kref refcount;
>> - struct interval_tree_node itree;
>> - struct list_head entry;
>> - unsigned long notifier_seq;
>> +struct drm_gpusvm_pages {
>> struct drm_pagemap_device_addr *dma_addr;
>> struct drm_pagemap *dpagemap;
>> struct {
>> @@ -227,6 +212,30 @@ struct drm_gpusvm_range {
>> } flags;
>> };
>>
>> +/**
>> + * struct drm_gpusvm_range - Structure representing a GPU SVM range
>> + *
>> + * @gpusvm: Pointer to the GPU SVM structure
>> + * @notifier: Pointer to the GPU SVM notifier
>> + * @refcount: Reference count for the range
>> + * @itree: Interval tree node for the range (inserted in GPU SVM notifier)
>> + * @entry: List entry to fast interval tree traversal
>> + * @notifier_seq: Notifier sequence number of the range's pages
>> + * @pages: The pages for this range.
>> + *
>> + * This structure represents a GPU SVM range used for tracking memory ranges
>> + * mapped in a DRM device.
>> + */
>> +struct drm_gpusvm_range {
>> + struct drm_gpusvm *gpusvm;
>> + struct drm_gpusvm_notifier *notifier;
>> + struct kref refcount;
>> + struct interval_tree_node itree;
>> + struct list_head entry;
>> + unsigned long notifier_seq;
>
> Should the notifier seqno be in the pages?
Yeah, I think that makes sense. Thanks.
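Roughly this, then (sketch; the existing flags stay as they are, their
bit-field types are just guessed here):

struct drm_gpusvm_pages {
	struct drm_pagemap_device_addr *dma_addr;
	struct drm_pagemap *dpagemap;
	/* Notifier sequence number of the pages, moved in from the range. */
	unsigned long notifier_seq;
	struct {
		unsigned int migrate_devmem : 1;
		unsigned int unmapped : 1;
		unsigned int partial_unmap : 1;
		unsigned int has_devmem_pages : 1;
		unsigned int has_dma_mapping : 1;
	} flags;
};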
>
> Matt
>
>> + struct drm_gpusvm_pages pages;
>> +};
>> +
>> /**
>> * struct drm_gpusvm - GPU SVM structure
>> *
>> --
>> 2.48.1
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm
2025-03-20 20:52 ` Matthew Brost
@ 2025-03-25 9:50 ` Ghimiray, Himal Prasad
0 siblings, 0 replies; 22+ messages in thread
From: Ghimiray, Himal Prasad @ 2025-03-25 9:50 UTC (permalink / raw)
To: Matthew Brost, Matthew Auld; +Cc: intel-xe, dri-devel, Thomas Hellström
On 21-03-2025 02:22, Matthew Brost wrote:
> On Thu, Mar 20, 2025 at 05:30:04PM +0000, Matthew Auld wrote:
>> Goal here is cut over to gpusvm and remove xe_hmm, relying instead on
>> common code. The core facilities we need are get_pages(), unmap_pages()
>> and free_pages() for a given useptr range, plus a vm level notifier
>> lock, which is now provided by gpusvm.
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> ---
>> drivers/gpu/drm/xe/Makefile | 1 -
>> drivers/gpu/drm/xe/xe_exec.c | 4 +-
>> drivers/gpu/drm/xe/xe_hmm.c | 349 -------------------------------
>> drivers/gpu/drm/xe/xe_hmm.h | 18 --
>> drivers/gpu/drm/xe/xe_pt.c | 20 +-
>> drivers/gpu/drm/xe/xe_vm.c | 74 ++++---
>> drivers/gpu/drm/xe/xe_vm_types.h | 30 ++-
>> 7 files changed, 70 insertions(+), 426 deletions(-)
>> delete mode 100644 drivers/gpu/drm/xe/xe_hmm.c
>> delete mode 100644 drivers/gpu/drm/xe/xe_hmm.h
>>
>> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>> index 9699b08585f7..ada1f0fad629 100644
>> --- a/drivers/gpu/drm/xe/Makefile
>> +++ b/drivers/gpu/drm/xe/Makefile
>> @@ -124,7 +124,6 @@ xe-y += xe_bb.o \
>> xe_wait_user_fence.o \
>> xe_wopcm.o
>>
>> -xe-$(CONFIG_HMM_MIRROR) += xe_hmm.o
>> xe-$(CONFIG_DRM_GPUSVM) += xe_svm.o
>>
>> # graphics hardware monitoring (HWMON) support
>> diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
>> index b75adfc99fb7..84daeedd65d4 100644
>> --- a/drivers/gpu/drm/xe/xe_exec.c
>> +++ b/drivers/gpu/drm/xe/xe_exec.c
>> @@ -294,7 +294,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>> if (err)
>> goto err_put_job;
>>
>> - err = down_read_interruptible(&vm->userptr.notifier_lock);
>> + err = down_read_interruptible(&vm->userptr.gpusvm.notifier_lock);
>> if (err)
>> goto err_put_job;
>>
>> @@ -336,7 +336,7 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>>
>> err_repin:
>> if (!xe_vm_in_lr_mode(vm))
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> err_put_job:
>> if (err)
>> xe_sched_job_put(job);
>> diff --git a/drivers/gpu/drm/xe/xe_hmm.c b/drivers/gpu/drm/xe/xe_hmm.c
>> deleted file mode 100644
>> index c3cc0fa105e8..000000000000
>> --- a/drivers/gpu/drm/xe/xe_hmm.c
>> +++ /dev/null
>> @@ -1,349 +0,0 @@
>> -// SPDX-License-Identifier: MIT
>> -/*
>> - * Copyright © 2024 Intel Corporation
>> - */
>> -
>> -#include <linux/scatterlist.h>
>> -#include <linux/mmu_notifier.h>
>> -#include <linux/dma-mapping.h>
>> -#include <linux/memremap.h>
>> -#include <linux/swap.h>
>> -#include <linux/hmm.h>
>> -#include <linux/mm.h>
>> -#include "xe_hmm.h"
>> -#include "xe_vm.h"
>> -#include "xe_bo.h"
>> -
>> -static u64 xe_npages_in_range(unsigned long start, unsigned long end)
>> -{
>> - return (end - start) >> PAGE_SHIFT;
>> -}
>> -
>> -/**
>> - * xe_mark_range_accessed() - mark a range is accessed, so core mm
>> - * have such information for memory eviction or write back to
>> - * hard disk
>> - * @range: the range to mark
>> - * @write: if write to this range, we mark pages in this range
>> - * as dirty
>> - */
>> -static void xe_mark_range_accessed(struct hmm_range *range, bool write)
>> -{
>> - struct page *page;
>> - u64 i, npages;
>> -
>> - npages = xe_npages_in_range(range->start, range->end);
>> - for (i = 0; i < npages; i++) {
>> - page = hmm_pfn_to_page(range->hmm_pfns[i]);
>> - if (write)
>> - set_page_dirty_lock(page);
>> -
>> - mark_page_accessed(page);
>> - }
>> -}
>> -
>> -static int xe_alloc_sg(struct xe_device *xe, struct sg_table *st,
>> - struct hmm_range *range, struct rw_semaphore *notifier_sem)
>> -{
>> - unsigned long i, npages, hmm_pfn;
>> - unsigned long num_chunks = 0;
>> - int ret;
>> -
>> - /* HMM docs says this is needed. */
>> - ret = down_read_interruptible(notifier_sem);
>> - if (ret)
>> - return ret;
>> -
>> - if (mmu_interval_read_retry(range->notifier, range->notifier_seq)) {
>> - up_read(notifier_sem);
>> - return -EAGAIN;
>> - }
>> -
>> - npages = xe_npages_in_range(range->start, range->end);
>> - for (i = 0; i < npages;) {
>> - unsigned long len;
>> -
>> - hmm_pfn = range->hmm_pfns[i];
>> - xe_assert(xe, hmm_pfn & HMM_PFN_VALID);
>> -
>> - len = 1UL << hmm_pfn_to_map_order(hmm_pfn);
>> -
>> - /* If order > 0 the page may extend beyond range->start */
>> - len -= (hmm_pfn & ~HMM_PFN_FLAGS) & (len - 1);
>> - i += len;
>> - num_chunks++;
>> - }
>> - up_read(notifier_sem);
>> -
>> - return sg_alloc_table(st, num_chunks, GFP_KERNEL);
>> -}
>> -
>> -/**
>> - * xe_build_sg() - build a scatter gather table for all the physical pages/pfn
>> - * in a hmm_range. dma-map pages if necessary. dma-address is save in sg table
>> - * and will be used to program GPU page table later.
>> - * @xe: the xe device who will access the dma-address in sg table
>> - * @range: the hmm range that we build the sg table from. range->hmm_pfns[]
>> - * has the pfn numbers of pages that back up this hmm address range.
>> - * @st: pointer to the sg table.
>> - * @notifier_sem: The xe notifier lock.
>> - * @write: whether we write to this range. This decides dma map direction
>> - * for system pages. If write we map it bi-diretional; otherwise
>> - * DMA_TO_DEVICE
>> - *
>> - * All the contiguous pfns will be collapsed into one entry in
>> - * the scatter gather table. This is for the purpose of efficiently
>> - * programming GPU page table.
>> - *
>> - * The dma_address in the sg table will later be used by GPU to
>> - * access memory. So if the memory is system memory, we need to
>> - * do a dma-mapping so it can be accessed by GPU/DMA.
>> - *
>> - * FIXME: This function currently only support pages in system
>> - * memory. If the memory is GPU local memory (of the GPU who
>> - * is going to access memory), we need gpu dpa (device physical
>> - * address), and there is no need of dma-mapping. This is TBD.
>> - *
>> - * FIXME: dma-mapping for peer gpu device to access remote gpu's
>> - * memory. Add this when you support p2p
>> - *
>> - * This function allocates the storage of the sg table. It is
>> - * caller's responsibility to free it calling sg_free_table.
>> - *
>> - * Returns 0 if successful; -ENOMEM if fails to allocate memory
>> - */
>> -static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
>> - struct sg_table *st,
>> - struct rw_semaphore *notifier_sem,
>> - bool write)
>> -{
>> - unsigned long npages = xe_npages_in_range(range->start, range->end);
>> - struct device *dev = xe->drm.dev;
>> - struct scatterlist *sgl;
>> - struct page *page;
>> - unsigned long i, j;
>> -
>> - lockdep_assert_held(notifier_sem);
>> -
>> - i = 0;
>> - for_each_sg(st->sgl, sgl, st->nents, j) {
>> - unsigned long hmm_pfn, size;
>> -
>> - hmm_pfn = range->hmm_pfns[i];
>> - page = hmm_pfn_to_page(hmm_pfn);
>> - xe_assert(xe, !is_device_private_page(page));
>> -
>> - size = 1UL << hmm_pfn_to_map_order(hmm_pfn);
>> - size -= page_to_pfn(page) & (size - 1);
>> - i += size;
>> -
>> - if (unlikely(j == st->nents - 1)) {
>> - xe_assert(xe, i >= npages);
>> - if (i > npages)
>> - size -= (i - npages);
>> -
>> - sg_mark_end(sgl);
>> - } else {
>> - xe_assert(xe, i < npages);
>> - }
>> -
>> - sg_set_page(sgl, page, size << PAGE_SHIFT, 0);
>> - }
>> -
>> - return dma_map_sgtable(dev, st, write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
>> - DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_NO_KERNEL_MAPPING);
>> -}
>> -
>> -static void xe_hmm_userptr_set_mapped(struct xe_userptr_vma *uvma)
>> -{
>> - struct xe_userptr *userptr = &uvma->userptr;
>> - struct xe_vm *vm = xe_vma_vm(&uvma->vma);
>> -
>> - lockdep_assert_held_write(&vm->lock);
>> - lockdep_assert_held(&vm->userptr.notifier_lock);
>> -
>> - mutex_lock(&userptr->unmap_mutex);
>> - xe_assert(vm->xe, !userptr->mapped);
>> - userptr->mapped = true;
>> - mutex_unlock(&userptr->unmap_mutex);
>> -}
>> -
>> -void xe_hmm_userptr_unmap(struct xe_userptr_vma *uvma)
>> -{
>> - struct xe_userptr *userptr = &uvma->userptr;
>> - struct xe_vma *vma = &uvma->vma;
>> - bool write = !xe_vma_read_only(vma);
>> - struct xe_vm *vm = xe_vma_vm(vma);
>> - struct xe_device *xe = vm->xe;
>> -
>> - if (!lockdep_is_held_type(&vm->userptr.notifier_lock, 0) &&
>> - !lockdep_is_held_type(&vm->lock, 0) &&
>> - !(vma->gpuva.flags & XE_VMA_DESTROYED)) {
>> - /* Don't unmap in exec critical section. */
>> - xe_vm_assert_held(vm);
>> - /* Don't unmap while mapping the sg. */
>> - lockdep_assert_held(&vm->lock);
>> - }
>> -
>> - mutex_lock(&userptr->unmap_mutex);
>> - if (userptr->sg && userptr->mapped)
>> - dma_unmap_sgtable(xe->drm.dev, userptr->sg,
>> - write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE, 0);
>> - userptr->mapped = false;
>> - mutex_unlock(&userptr->unmap_mutex);
>> -}
>> -
>> -/**
>> - * xe_hmm_userptr_free_sg() - Free the scatter gather table of userptr
>> - * @uvma: the userptr vma which holds the scatter gather table
>> - *
>> - * With xe_hmm_userptr_populate_range(), we allocate the storage of
>> - * the userptr sg table. This is a helper function to free that
>> - * sg table and dma-unmap the addresses in it.
>> - */
>> -void xe_hmm_userptr_free_sg(struct xe_userptr_vma *uvma)
>> -{
>> - struct xe_userptr *userptr = &uvma->userptr;
>> -
>> - xe_assert(xe_vma_vm(&uvma->vma)->xe, userptr->sg);
>> - xe_hmm_userptr_unmap(uvma);
>> - sg_free_table(userptr->sg);
>> - userptr->sg = NULL;
>> -}
>> -
>> -/**
>> - * xe_hmm_userptr_populate_range() - Populate physical pages of a virtual
>> - * address range
>> - *
>> - * @uvma: userptr vma which has information of the range to populate.
>> - * @is_mm_mmap_locked: True if mmap_read_lock is already acquired by caller.
>> - *
>> - * This function populates the physical pages of a virtual
>> - * address range. The populated physical pages are saved in
>> - * the userptr's sg table. It is similar to get_user_pages() but calls
>> - * hmm_range_fault().
>> - *
>> - * This function also reads the mmu notifier sequence number
>> - * (mmu_interval_read_begin()), for the purpose of later
>> - * comparison (through mmu_interval_read_retry()).
>> - *
>> - * This must be called with mmap read or write lock held.
>> - *
>> - * This function allocates the storage of the userptr sg table.
>> - * It is the caller's responsibility to free it by calling sg_free_table().
>> - *
>> - * Returns: 0 for success; negative error code on failure
>> - */
>> -int xe_hmm_userptr_populate_range(struct xe_userptr_vma *uvma,
>> - bool is_mm_mmap_locked)
>> -{
>> - unsigned long timeout =
>> - jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
>> - unsigned long *pfns;
>> - struct xe_userptr *userptr;
>> - struct xe_vma *vma = &uvma->vma;
>> - u64 userptr_start = xe_vma_userptr(vma);
>> - u64 userptr_end = userptr_start + xe_vma_size(vma);
>> - struct xe_vm *vm = xe_vma_vm(vma);
>> - struct hmm_range hmm_range = {
>> - .pfn_flags_mask = 0, /* ignore pfns */
>> - .default_flags = HMM_PFN_REQ_FAULT,
>> - .start = userptr_start,
>> - .end = userptr_end,
>> - .notifier = &uvma->userptr.notifier,
>> - .dev_private_owner = vm->xe,
>> - };
>> - bool write = !xe_vma_read_only(vma);
>> - unsigned long notifier_seq;
>> - u64 npages;
>> - int ret;
>> -
>> - userptr = &uvma->userptr;
>> -
>> - if (is_mm_mmap_locked)
>> - mmap_assert_locked(userptr->notifier.mm);
>> -
>> - if (vma->gpuva.flags & XE_VMA_DESTROYED)
>> - return 0;
>> -
>> - notifier_seq = mmu_interval_read_begin(&userptr->notifier);
>> - if (notifier_seq == userptr->notifier_seq)
>> - return 0;
>> -
>> - if (userptr->sg)
>> - xe_hmm_userptr_free_sg(uvma);
>> -
>> - npages = xe_npages_in_range(userptr_start, userptr_end);
>> - pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
>> - if (unlikely(!pfns))
>> - return -ENOMEM;
>> -
>> - if (write)
>> - hmm_range.default_flags |= HMM_PFN_REQ_WRITE;
>> -
>> - if (!mmget_not_zero(userptr->notifier.mm)) {
>> - ret = -EFAULT;
>> - goto free_pfns;
>> - }
>> -
>> - hmm_range.hmm_pfns = pfns;
>> -
>> - while (true) {
>> - hmm_range.notifier_seq = mmu_interval_read_begin(&userptr->notifier);
>> -
>> - if (!is_mm_mmap_locked)
>> - mmap_read_lock(userptr->notifier.mm);
>> -
>> - ret = hmm_range_fault(&hmm_range);
>> -
>> - if (!is_mm_mmap_locked)
>> - mmap_read_unlock(userptr->notifier.mm);
>> -
>> - if (ret == -EBUSY) {
>> - if (time_after(jiffies, timeout))
>> - break;
>> -
>> - continue;
>> - }
>> - break;
>> - }
>> -
>> - mmput(userptr->notifier.mm);
>> -
>> - if (ret)
>> - goto free_pfns;
>> -
>> - ret = xe_alloc_sg(vm->xe, &userptr->sgt, &hmm_range, &vm->userptr.notifier_lock);
>> - if (ret)
>> - goto free_pfns;
>> -
>> - ret = down_read_interruptible(&vm->userptr.notifier_lock);
>> - if (ret)
>> - goto free_st;
>> -
>> - if (mmu_interval_read_retry(hmm_range.notifier, hmm_range.notifier_seq)) {
>> - ret = -EAGAIN;
>> - goto out_unlock;
>> - }
>> -
>> - ret = xe_build_sg(vm->xe, &hmm_range, &userptr->sgt,
>> - &vm->userptr.notifier_lock, write);
>> - if (ret)
>> - goto out_unlock;
>> -
>> - xe_mark_range_accessed(&hmm_range, write);
>> - userptr->sg = &userptr->sgt;
>> - xe_hmm_userptr_set_mapped(uvma);
>> - userptr->notifier_seq = hmm_range.notifier_seq;
>> - up_read(&vm->userptr.notifier_lock);
>> - kvfree(pfns);
>> - return 0;
>> -
>> -out_unlock:
>> - up_read(&vm->userptr.notifier_lock);
>> -free_st:
>> - sg_free_table(&userptr->sgt);
>> -free_pfns:
>> - kvfree(pfns);
>> - return ret;
>> -}
>> diff --git a/drivers/gpu/drm/xe/xe_hmm.h b/drivers/gpu/drm/xe/xe_hmm.h
>> deleted file mode 100644
>> index 0ea98d8e7bbc..000000000000
>> --- a/drivers/gpu/drm/xe/xe_hmm.h
>> +++ /dev/null
>> @@ -1,18 +0,0 @@
>> -/* SPDX-License-Identifier: MIT
>> - *
>> - * Copyright © 2024 Intel Corporation
>> - */
>> -
>> -#ifndef _XE_HMM_H_
>> -#define _XE_HMM_H_
>> -
>> -#include <linux/types.h>
>> -
>> -struct xe_userptr_vma;
>> -
>> -int xe_hmm_userptr_populate_range(struct xe_userptr_vma *uvma, bool is_mm_mmap_locked);
>> -
>> -void xe_hmm_userptr_free_sg(struct xe_userptr_vma *uvma);
>> -
>> -void xe_hmm_userptr_unmap(struct xe_userptr_vma *uvma);
>> -#endif
>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>> index c43e7619cb80..824bf99e5f71 100644
>> --- a/drivers/gpu/drm/xe/xe_pt.c
>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>> @@ -727,8 +727,8 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>>
>> if (!xe_vma_is_null(vma) && !range) {
>> if (xe_vma_is_userptr(vma))
>> - xe_res_first_sg(to_userptr_vma(vma)->userptr.sg, 0,
>> - xe_vma_size(vma), &curs);
>> + xe_res_first_dma(to_userptr_vma(vma)->userptr.range.pages.dma_addr, 0,
>> + xe_vma_size(vma), &curs);
>> else if (xe_bo_is_vram(bo) || xe_bo_is_stolen(bo))
>> xe_res_first(bo->ttm.resource, xe_vma_bo_offset(vma),
>> xe_vma_size(vma), &curs);
>> @@ -998,7 +998,7 @@ static void xe_pt_commit_locks_assert(struct xe_vma *vma)
>> xe_pt_commit_prepare_locks_assert(vma);
>>
>> if (xe_vma_is_userptr(vma))
>> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
>> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>> }
>>
>> static void xe_pt_commit(struct xe_vma *vma,
>> @@ -1336,7 +1336,7 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
>> struct xe_userptr_vma *uvma;
>> unsigned long notifier_seq;
>>
>> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
>> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> if (!xe_vma_is_userptr(vma))
>> return 0;
>> @@ -1366,7 +1366,7 @@ static int op_check_userptr(struct xe_vm *vm, struct xe_vma_op *op,
>> {
>> int err = 0;
>>
>> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
>> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> switch (op->base.op) {
>> case DRM_GPUVA_OP_MAP:
>> @@ -1407,12 +1407,12 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
>> if (err)
>> return err;
>>
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> list_for_each_entry(op, &vops->list, link) {
>> err = op_check_userptr(vm, op, pt_update_ops);
>> if (err) {
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> break;
>> }
>> }
>> @@ -2133,7 +2133,7 @@ static void bind_op_commit(struct xe_vm *vm, struct xe_tile *tile,
>> vma->tile_present |= BIT(tile->id);
>> vma->tile_staged &= ~BIT(tile->id);
>> if (xe_vma_is_userptr(vma)) {
>> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
>> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>> to_userptr_vma(vma)->userptr.initial_bind = true;
>> }
>>
>> @@ -2169,7 +2169,7 @@ static void unbind_op_commit(struct xe_vm *vm, struct xe_tile *tile,
>> if (!vma->tile_present) {
>> list_del_init(&vma->combined_links.rebind);
>> if (xe_vma_is_userptr(vma)) {
>> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
>> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> spin_lock(&vm->userptr.invalidated_lock);
>> list_del_init(&to_userptr_vma(vma)->userptr.invalidate_link);
>> @@ -2414,7 +2414,7 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops)
>> if (pt_update_ops->needs_svm_lock)
>> xe_svm_notifier_unlock(vm);
>> if (pt_update_ops->needs_userptr_lock)
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> return fence;
>>
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index 60303998bd61..fdc24718b9ad 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -5,6 +5,7 @@
>>
>> #include "xe_vm.h"
>>
>> +#include "drm/drm_gpusvm.h"
>> #include <linux/dma-fence-array.h>
>> #include <linux/nospec.h>
>>
>> @@ -40,7 +41,6 @@
>> #include "xe_sync.h"
>> #include "xe_trace_bo.h"
>> #include "xe_wa.h"
>> -#include "xe_hmm.h"
>>
>> static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
>> {
>> @@ -53,7 +53,7 @@ static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
>> *
>> * Check if the userptr vma has been invalidated since last successful
>> * repin. The check is advisory only and the function can be called
>> - * without the vm->userptr.notifier_lock held. There is no guarantee that the
>> + * without the vm->userptr.gpusvm.notifier_lock held. There is no guarantee that the
>> * vma userptr will remain valid after a lockless check, so typically
>> * the call needs to be followed by a proper check under the notifier_lock.
>> *
>> @@ -71,11 +71,17 @@ int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma)
>> struct xe_vma *vma = &uvma->vma;
>> struct xe_vm *vm = xe_vma_vm(vma);
>> struct xe_device *xe = vm->xe;
>> + struct drm_gpusvm_ctx ctx = {
>> + .read_only = xe_vma_read_only(vma),
>> + };
>>
>> lockdep_assert_held(&vm->lock);
>> xe_assert(xe, xe_vma_is_userptr(vma));
>>
>> - return xe_hmm_userptr_populate_range(uvma, false);
>> + if (vma->gpuva.flags & XE_VMA_DESTROYED)
>> + return 0;
>> +
>> + return drm_gpusvm_basic_range_get_pages(&uvma->userptr.range, &ctx);
>> }
>>
>> static bool preempt_fences_waiting(struct xe_vm *vm)
>> @@ -249,7 +255,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
>> ++vm->preempt.num_exec_queues;
>> q->lr.pfence = pfence;
>>
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> drm_gpuvm_resv_add_fence(&vm->gpuvm, exec, pfence,
>> DMA_RESV_USAGE_BOOKKEEP, DMA_RESV_USAGE_BOOKKEEP);
>> @@ -263,7 +269,7 @@ int xe_vm_add_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
>> if (wait)
>> dma_fence_enable_sw_signaling(pfence);
>>
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> out_fini:
>> drm_exec_fini(exec);
>> @@ -305,14 +311,14 @@ void xe_vm_remove_compute_exec_queue(struct xe_vm *vm, struct xe_exec_queue *q)
>> * @vm: The VM.
>> *
>> * This function checks for whether the VM has userptrs that need repinning,
>> - * and provides a release-type barrier on the userptr.notifier_lock after
>> + * and provides a release-type barrier on the userptr.gpusvm.notifier_lock after
>> * checking.
>> *
>> * Return: 0 if there are no userptrs needing repinning, -EAGAIN if there are.
>> */
>> int __xe_vm_userptr_needs_repin(struct xe_vm *vm)
>> {
>> - lockdep_assert_held_read(&vm->userptr.notifier_lock);
>> + lockdep_assert_held_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> return (list_empty(&vm->userptr.repin_list) &&
>> list_empty(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
>> @@ -546,9 +552,9 @@ static void preempt_rebind_work_func(struct work_struct *w)
>> (!(__tries)++ || __xe_vm_userptr_needs_repin(__vm)) : \
>> __xe_vm_userptr_needs_repin(__vm))
>>
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>> if (retry_required(tries, vm)) {
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> err = -EAGAIN;
>> goto out_unlock;
>> }
>> @@ -562,7 +568,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
>> /* Point of no return. */
>> arm_preempt_fences(vm, &preempt_fences);
>> resume_and_reinstall_preempt_fences(vm, &exec);
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>>
>> out_unlock:
>> drm_exec_fini(&exec);
>> @@ -589,6 +595,9 @@ static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uv
>> struct xe_vma *vma = &uvma->vma;
>> struct dma_resv_iter cursor;
>> struct dma_fence *fence;
>> + struct drm_gpusvm_ctx ctx = {
>> + .in_notifier = true,
>> + };
>> long err;
>>
>> /*
>> @@ -625,7 +634,7 @@ static void __vma_userptr_invalidate(struct xe_vm *vm, struct xe_userptr_vma *uv
>> XE_WARN_ON(err);
>> }
>>
>> - xe_hmm_userptr_unmap(uvma);
>> + drm_gpusvm_basic_range_unmap_pages(&uvma->userptr.range, &ctx);
>> }
>>
>> static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
>> @@ -646,11 +655,11 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
>> "NOTIFIER: addr=0x%016llx, range=0x%016llx",
>> xe_vma_start(vma), xe_vma_size(vma));
>>
>> - down_write(&vm->userptr.notifier_lock);
>> + down_write(&vm->userptr.gpusvm.notifier_lock);
>> mmu_interval_set_seq(mni, cur_seq);
>>
>> __vma_userptr_invalidate(vm, uvma);
>> - up_write(&vm->userptr.notifier_lock);
>> + up_write(&vm->userptr.gpusvm.notifier_lock);
>> trace_xe_vma_userptr_invalidate_complete(vma);
>>
>> return true;
>> @@ -674,7 +683,7 @@ void xe_vma_userptr_force_invalidate(struct xe_userptr_vma *uvma)
>> /* Protect against concurrent userptr pinning */
>> lockdep_assert_held(&vm->lock);
>> /* Protect against concurrent notifiers */
>> - lockdep_assert_held(&vm->userptr.notifier_lock);
>> + lockdep_assert_held(&vm->userptr.gpusvm.notifier_lock);
>> /*
>> * Protect against concurrent instances of this function and
>> * the critical exec sections
>> @@ -747,7 +756,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>> }
>>
>> if (err) {
>> - down_write(&vm->userptr.notifier_lock);
>> + down_write(&vm->userptr.gpusvm.notifier_lock);
>> spin_lock(&vm->userptr.invalidated_lock);
>> list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
>> userptr.repin_link) {
>> @@ -756,7 +765,7 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>> &vm->userptr.invalidated);
>> }
>> spin_unlock(&vm->userptr.invalidated_lock);
>> - up_write(&vm->userptr.notifier_lock);
>> + up_write(&vm->userptr.gpusvm.notifier_lock);
>> }
>> return err;
>> }
>> @@ -1222,7 +1231,6 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
>> INIT_LIST_HEAD(&userptr->invalidate_link);
>> INIT_LIST_HEAD(&userptr->repin_link);
>> vma->gpuva.gem.offset = bo_offset_or_userptr;
>> - mutex_init(&userptr->unmap_mutex);
>>
>> err = mmu_interval_notifier_insert(&userptr->notifier,
>> current->mm,
>> @@ -1233,7 +1241,10 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
>> return ERR_PTR(err);
>> }
>>
>> - userptr->notifier_seq = LONG_MAX;
>> + drm_gpusvm_basic_range_init(&vm->userptr.gpusvm,
>> + &userptr->range,
>> + &userptr->notifier,
>> + &userptr->notifier_seq);
>> }
>>
>> xe_vm_get(vm);
>> @@ -1255,8 +1266,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
>> struct xe_userptr_vma *uvma = to_userptr_vma(vma);
>> struct xe_userptr *userptr = &uvma->userptr;
>>
>> - if (userptr->sg)
>> - xe_hmm_userptr_free_sg(uvma);
>> + drm_gpusvm_basic_range_free_pages(&uvma->userptr.range);
>>
>> /*
>> * Since userptr pages are not pinned, we can't remove
>> @@ -1264,7 +1274,7 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
>> * them anymore
>> */
>> mmu_interval_notifier_remove(&userptr->notifier);
>> - mutex_destroy(&userptr->unmap_mutex);
>> + drm_gpusvm_basic_range_fini(&uvma->userptr.range);
>> xe_vm_put(vm);
>> } else if (xe_vma_is_null(vma) || xe_vma_is_cpu_addr_mirror(vma)) {
>> xe_vm_put(vm);
>> @@ -1657,7 +1667,6 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
>>
>> INIT_LIST_HEAD(&vm->userptr.repin_list);
>> INIT_LIST_HEAD(&vm->userptr.invalidated);
>> - init_rwsem(&vm->userptr.notifier_lock);
>> spin_lock_init(&vm->userptr.invalidated_lock);
>>
>> ttm_lru_bulk_move_init(&vm->lru_bulk_move);
>> @@ -1763,6 +1772,9 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32 flags)
>> goto err_close;
>> }
>>
>> + drm_gpusvm_basic_init(&vm->userptr.gpusvm, "xe-vm-userptr", &xe->drm);
>> + drm_gpusvm_driver_set_lock(&vm->userptr.gpusvm, &vm->lock);
>> +
>> if (number_tiles > 1)
>> vm->composite_fence_ctx = dma_fence_context_alloc(1);
>>
>> @@ -1867,9 +1879,9 @@ void xe_vm_close_and_put(struct xe_vm *vm)
>> vma = gpuva_to_vma(gpuva);
>>
>> if (xe_vma_has_no_bo(vma)) {
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>> vma->gpuva.flags |= XE_VMA_DESTROYED;
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> }
>>
>> xe_vm_remove_vma(vm, vma);
>> @@ -1916,6 +1928,8 @@ void xe_vm_close_and_put(struct xe_vm *vm)
>> if (xe_vm_in_fault_mode(vm))
>> xe_svm_fini(vm);
>>
>> + drm_gpusvm_basic_fini(&vm->userptr.gpusvm);
>> +
>> up_write(&vm->lock);
>>
>> down_write(&xe->usm.lock);
>> @@ -2144,9 +2158,9 @@ static const u32 region_to_mem_type[] = {
>> static void prep_vma_destroy(struct xe_vm *vm, struct xe_vma *vma,
>> bool post_commit)
>> {
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>> vma->gpuva.flags |= XE_VMA_DESTROYED;
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> if (post_commit)
>> xe_vm_remove_vma(vm, vma);
>> }
>> @@ -2625,9 +2639,9 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
>> struct xe_vma *vma = gpuva_to_vma(op->base.unmap.va);
>>
>> if (vma) {
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>> vma->gpuva.flags &= ~XE_VMA_DESTROYED;
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> if (post_commit)
>> xe_vm_insert_vma(vm, vma);
>> }
>> @@ -2646,9 +2660,9 @@ static void xe_vma_op_unwind(struct xe_vm *vm, struct xe_vma_op *op,
>> xe_vma_destroy_unlocked(op->remap.next);
>> }
>> if (vma) {
>> - down_read(&vm->userptr.notifier_lock);
>> + down_read(&vm->userptr.gpusvm.notifier_lock);
>> vma->gpuva.flags &= ~XE_VMA_DESTROYED;
>> - up_read(&vm->userptr.notifier_lock);
>> + up_read(&vm->userptr.gpusvm.notifier_lock);
>> if (post_commit)
>> xe_vm_insert_vma(vm, vma);
>> }
>> diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
>> index 84fa41b9fa20..08f295873a2b 100644
>> --- a/drivers/gpu/drm/xe/xe_vm_types.h
>> +++ b/drivers/gpu/drm/xe/xe_vm_types.h
>> @@ -52,26 +52,23 @@ struct xe_userptr {
>> struct list_head invalidate_link;
>> /** @userptr: link into VM repin list if userptr. */
>> struct list_head repin_link;
>> + /**
>> + * @range: gpusvm range for this user pointer.
>> + */
>> + struct drm_gpusvm_basic_range range;
>> /**
>> * @notifier: MMU notifier for user pointer (invalidation call back)
>> */
>> struct mmu_interval_notifier notifier;
>> - /** @sgt: storage for a scatter gather table */
>> - struct sg_table sgt;
>> - /** @sg: allocated scatter gather table */
>> - struct sg_table *sg;
>> +
>> /** @notifier_seq: notifier sequence number */
>> unsigned long notifier_seq;
>> - /** @unmap_mutex: Mutex protecting dma-unmapping */
>> - struct mutex unmap_mutex;
>> /**
>> * @initial_bind: user pointer has been bound at least once.
>> - * write: vm->userptr.notifier_lock in read mode and vm->resv held.
>> - * read: vm->userptr.notifier_lock in write mode or vm->resv held.
>> + * write: vm->userptr.gpusvm.notifier_lock in read mode and vm->resv held.
>> + * read: vm->userptr.gpusvm.notifier_lock in write mode or vm->resv held.
>> */
>> bool initial_bind;
>> - /** @mapped: Whether the @sgt sg-table is dma-mapped. Protected by @unmap_mutex. */
>> - bool mapped;
>> #if IS_ENABLED(CONFIG_DRM_XE_USERPTR_INVAL_INJECT)
>> u32 divisor;
>> #endif
>> @@ -109,7 +106,7 @@ struct xe_vma {
>> /**
>> * @tile_present: GT mask of binding are present for this VMA.
>> * protected by vm->lock, vm->resv and for userptrs,
>> - * vm->userptr.notifier_lock for writing. Needs either for reading,
>> + * vm->userptr.gpusvm.notifier_lock for writing. Needs either for reading,
>> * but if reading is done under the vm->lock only, it needs to be held
>> * in write mode.
>> */
>> @@ -238,16 +235,17 @@ struct xe_vm {
>>
>> /** @userptr: user pointer state */
>> struct {
>> + /*
>> + * @gpusvm: The gpusvm userptr for this vm. The
>> + * gpusvm.notifier_lock protects notifier in write mode and
>> + * submission in read mode.
>> + */
>> + struct drm_gpusvm gpusvm;
>
> This is not a full review, but I believe the notifier lock should be
> shared between SVM and userptr (i.e., we should have a single drm_gpusvm
> structure for both). The reasoning is that if we prefetch both SVM
> ranges and userptr within the same list of VM bind operations, we take a
> single lock at the final step for both. Following this approach,
> xe_pt_userptr_pre_commit and xe_pt_svm_pre_commit would be merged into a
> single function.
>
> +Himal, Thomas to see if they agree with me here.
+1 from my end.
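
To make that concrete, here is a rough sketch of how a merged pre-commit
step could look once a single gpusvm (and hence a single notifier_lock) is
shared by SVM ranges and userptrs. Purely illustrative: the
xe_pt_shared_pre_commit() name, the vm->svm.gpusvm placement and the
op_check_one() helper below are made up for the sketch, not existing code.

static int xe_pt_shared_pre_commit(struct xe_vm *vm, struct xe_vma_ops *vops)
{
	struct xe_vma_op *op;
	int err = 0;

	/* One read lock now covers userptr *and* SVM range invalidation. */
	down_read(&vm->svm.gpusvm.notifier_lock);

	list_for_each_entry(op, &vops->list, link) {
		/* op_check_one() would fold the userptr and SVM checks. */
		err = op_check_one(vm, op);
		if (err)
			break;
	}

	/* As today, the lock stays held on success and is dropped on error. */
	if (err)
		up_read(&vm->svm.gpusvm.notifier_lock);

	return err;
}

The point is just that one down_read() then serialises the whole list of
bind ops against both kinds of invalidation, so the separate userptr and
SVM pre-commit paths collapse into one.
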
>
> Matt
>
>> /**
>> * @userptr.repin_list: list of VMAs which are user pointers,
>> * and needs repinning. Protected by @lock.
>> */
>> struct list_head repin_list;
>> - /**
>> - * @notifier_lock: protects notifier in write mode and
>> - * submission in read mode.
>> - */
>> - struct rw_semaphore notifier_lock;
>> /**
>> * @userptr.invalidated_lock: Protects the
>> * @userptr.invalidated list.
>> --
>> 2.48.1
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2025-03-25 9:50 UTC | newest]
Thread overview: 22+ messages
2025-03-20 17:29 [PATCH 0/7] Replace xe_hmm with gpusvm Matthew Auld
2025-03-20 17:29 ` [PATCH 1/7] drm/gpusvm: fix hmm_pfn_to_map_order() usage Matthew Auld
2025-03-20 19:52 ` Thomas Hellström
2025-03-20 20:40 ` Matthew Brost
2025-03-21 11:42 ` Matthew Auld
2025-03-20 17:29 ` [PATCH 2/7] drm/gpusvm: use more selective dma dir in get_pages() Matthew Auld
2025-03-20 19:31 ` Thomas Hellström
2025-03-20 19:37 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 3/7] drm/gpusvm: mark pages as dirty Matthew Auld
2025-03-20 19:29 ` Thomas Hellström
2025-03-20 19:33 ` Matthew Brost
2025-03-21 11:37 ` Matthew Auld
2025-03-20 17:30 ` [PATCH 4/7] drm/gpusvm: pull out drm_gpusvm_pages substructure Matthew Auld
2025-03-20 20:43 ` Matthew Brost
2025-03-21 11:53 ` Matthew Auld
2025-03-20 17:30 ` [PATCH 5/7] drm/gpusvm: lower get/unmap pages Matthew Auld
2025-03-20 20:59 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 6/7] drm/gpusvm: support basic_range Matthew Auld
2025-03-20 21:04 ` Matthew Brost
2025-03-20 17:30 ` [PATCH 7/7] drm/xe/userptr: replace xe_hmm with gpusvm Matthew Auld
2025-03-20 20:52 ` Matthew Brost
2025-03-25 9:50 ` Ghimiray, Himal Prasad