* [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM
@ 2025-04-22 17:04 Matthew Brost
2025-04-22 17:04 ` [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation Matthew Brost
` (4 more replies)
0 siblings, 5 replies; 17+ messages in thread
From: Matthew Brost @ 2025-04-22 17:04 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
Minimal set of patches to enable compute UMD SVM atomics. Done in
collaboration with Himal.
Matt
Himal Prasad Ghimiray (1):
drm/gpusvm: Introduce devmem_only flag for allocation
Matthew Brost (4):
drm/xe: Strict migration policy for atomic SVM faults
drm/gpusvm: Add timeslicing support to GPU SVM
drm/xe: Timeslice GPU on atomic SVM fault
drm/xe: Add atomic_svm_timeslice_ms debugfs entry
drivers/gpu/drm/drm_gpusvm.c | 14 +++++
drivers/gpu/drm/xe/xe_debugfs.c | 38 ++++++++++++
drivers/gpu/drm/xe/xe_device.c | 1 +
drivers/gpu/drm/xe/xe_device_types.h | 3 +
drivers/gpu/drm/xe/xe_module.c | 3 -
drivers/gpu/drm/xe/xe_module.h | 1 -
drivers/gpu/drm/xe/xe_svm.c | 93 +++++++++++++++++++++-------
drivers/gpu/drm/xe/xe_svm.h | 5 --
include/drm/drm_gpusvm.h | 7 +++
9 files changed, 132 insertions(+), 33 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation
2025-04-22 17:04 [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM Matthew Brost
@ 2025-04-22 17:04 ` Matthew Brost
2025-04-23 16:14 ` Matthew Brost
2025-04-22 17:04 ` [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults Matthew Brost
` (3 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-22 17:04 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
From: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
This commit adds a new flag, devmem_only, to the drm_gpusvm_ctx structure. The
purpose of this flag is to ensure that the get_pages function uses memory
allocated exclusively from the device's memory. If the allocation from
device memory fails, the function returns an -EFAULT error.
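A rough usage sketch (the gpusvm and range variables here stand in for a
driver's own objects):

	struct drm_gpusvm_ctx ctx = {
		.devmem_possible = true,
		/*
		 * With devmem_only set, get_pages fails with -EFAULT instead
		 * of falling back to mapping system memory pages.
		 */
		.devmem_only = true,
	};
	int err;

	err = drm_gpusvm_range_get_pages(gpusvm, range, &ctx);
	if (err == -EFAULT) {
		/* Range not fully backed by device memory; migrate and retry. */
	}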
v3:
- s/vram_only/devmem_only/
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 5 +++++
include/drm/drm_gpusvm.h | 2 ++
2 files changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index 38431e8360e7..edf107809d20 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1454,6 +1454,11 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
goto err_unmap;
}
+ if (ctx->devmem_only) {
+ err = -EFAULT;
+ goto err_unmap;
+ }
+
addr = dma_map_page(gpusvm->drm->dev,
page, 0,
PAGE_SIZE << order,
diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
index df120b4d1f83..9fd25fc880a4 100644
--- a/include/drm/drm_gpusvm.h
+++ b/include/drm/drm_gpusvm.h
@@ -286,6 +286,7 @@ struct drm_gpusvm {
* @in_notifier: entering from a MMU notifier
* @read_only: operating on read-only memory
* @devmem_possible: possible to use device memory
+ * @devmem_only: use only device memory
*
* Context that is DRM GPUSVM is operating in (i.e. user arguments).
*/
@@ -294,6 +295,7 @@ struct drm_gpusvm_ctx {
unsigned int in_notifier :1;
unsigned int read_only :1;
unsigned int devmem_possible :1;
+ unsigned int devmem_only :1;
};
int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-22 17:04 [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM Matthew Brost
2025-04-22 17:04 ` [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation Matthew Brost
@ 2025-04-22 17:04 ` Matthew Brost
2025-04-22 17:21 ` Ghimiray, Himal Prasad
2025-04-24 14:39 ` Thomas Hellström
2025-04-22 17:04 ` [PATCH v4 3/5] drm/gpusvm: Add timeslicing support to GPU SVM Matthew Brost
` (2 subsequent siblings)
4 siblings, 2 replies; 17+ messages in thread
From: Matthew Brost @ 2025-04-22 17:04 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
Mixing GPU and CPU atomics does not work unless a strict migration policy
ensures that memory targeted by GPU atomics is in device memory. Enforce a
must-be-in-VRAM policy with a retry loop of 2 attempts; if the retry loop
fails, abort the fault.
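In rough pseudo-code, the fault-handler policy implemented by the diff below
(simplified, error reporting omitted):

	int migrate_try_count = ctx.devmem_only ? 3 : 1;

retry:
	if (--migrate_try_count >= 0 &&
	    xe_svm_range_needs_migrate_to_vram(range, vma)) {
		err = xe_svm_alloc_vram(vm, tile, range, &ctx);
		if (err) {
			if (migrate_try_count || !ctx.devmem_only)
				goto retry;
			return err;	/* devmem_only and retries exhausted */
		}
	}

	err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
	if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
		if (migrate_try_count > 0 || !ctx.devmem_only)
			goto retry;	/* CPU mappings changed, try again */
		return err;
	}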
v2:
- Only retry migration on atomics
- Drop alway migrate modparam
v3:
- Only set vram_only on DGFX (Himal)
- Bail on get_pages failure if vram_only and retry count exceeded (Himal)
- s/vram_only/devmem_only
- Update xe_svm_range_is_valid to accept devmem_only argument
v4:
- Fix logic bug get_pages failure
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_module.c | 3 --
drivers/gpu/drm/xe/xe_module.h | 1 -
drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++---------
drivers/gpu/drm/xe/xe_svm.h | 5 --
4 files changed, 65 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 05c7d0ae6d83..1c4dfafbcd0b 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size, uint, 0600);
MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size(in MiB), must be power of 2");
-module_param_named(always_migrate_to_vram, xe_modparam.always_migrate_to_vram, bool, 0444);
-MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM on GPU fault");
-
module_param_named_unsafe(force_execlist, xe_modparam.force_execlist, bool, 0444);
MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
index 84339e509c80..5a3bfea8b7b4 100644
--- a/drivers/gpu/drm/xe/xe_module.h
+++ b/drivers/gpu/drm/xe/xe_module.h
@@ -12,7 +12,6 @@
struct xe_modparam {
bool force_execlist;
bool probe_display;
- bool always_migrate_to_vram;
u32 force_vram_bar_size;
int guc_log_level;
char *guc_firmware_path;
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index 890f6b2f40e9..f749ae367a8f 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
}
static bool xe_svm_range_is_valid(struct xe_svm_range *range,
- struct xe_tile *tile)
+ struct xe_tile *tile,
+ bool devmem_only)
{
- return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id);
+ return ((range->tile_present & ~range->tile_invalidated) & BIT(tile->id))
+ && (!devmem_only || range->base.flags.migrate_devmem);
}
#if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
@@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile,
}
#endif
+static bool supports_4K_migration(struct xe_device *xe)
+{
+ if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
+ return false;
+
+ return true;
+}
+
+static bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range *range,
+ struct xe_vma *vma)
+{
+ struct xe_vm *vm = range_to_vm(&range->base);
+ u64 range_size = xe_svm_range_size(range);
+
+ if (!range->base.flags.migrate_devmem)
+ return false;
+
+ if (xe_svm_range_in_vram(range)) {
+ drm_dbg(&vm->xe->drm, "Range is already in VRAM\n");
+ return false;
+ }
+
+ if (range_size <= SZ_64K && !supports_4K_migration(vm->xe)) {
+ drm_dbg(&vm->xe->drm, "Platform doesn't support SZ_4K range migration\n");
+ return false;
+ }
+
+ return true;
+}
/**
* xe_svm_handle_pagefault() - SVM handle page fault
@@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
.check_pages_threshold = IS_DGFX(vm->xe) &&
IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0,
+ .devmem_only = atomic && IS_DGFX(vm->xe) &&
+ IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
};
struct xe_svm_range *range;
struct drm_gpusvm_range *r;
struct drm_exec exec;
struct dma_fence *fence;
struct xe_tile *tile = gt_to_tile(gt);
+ int migrate_try_count = ctx.devmem_only ? 3 : 1;
ktime_t end = 0;
int err;
@@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
return PTR_ERR(r);
range = to_xe_range(r);
- if (xe_svm_range_is_valid(range, tile))
+ if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
return 0;
range_debug(range, "PAGE FAULT");
- /* XXX: Add migration policy, for now migrate range once */
- if (!range->skip_migrate && range->base.flags.migrate_devmem &&
- xe_svm_range_size(range) >= SZ_64K) {
- range->skip_migrate = true;
-
+ if (--migrate_try_count >= 0 &&
+ xe_svm_range_needs_migrate_to_vram(range, vma)) {
err = xe_svm_alloc_vram(vm, tile, range, &ctx);
if (err) {
- drm_dbg(&vm->xe->drm,
- "VRAM allocation failed, falling back to "
- "retrying fault, asid=%u, errno=%pe\n",
- vm->usm.asid, ERR_PTR(err));
- goto retry;
+ if (migrate_try_count || !ctx.devmem_only) {
+ drm_dbg(&vm->xe->drm,
+ "VRAM allocation failed, falling back to retrying fault, asid=%u, errno=%pe\n",
+ vm->usm.asid, ERR_PTR(err));
+ goto retry;
+ } else {
+ drm_err(&vm->xe->drm,
+ "VRAM allocation failed, retry count exceeded, asid=%u, errno=%pe\n",
+ vm->usm.asid, ERR_PTR(err));
+ return err;
+ }
}
}
@@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
/* Corner where CPU mappings have changed */
if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
- if (err == -EOPNOTSUPP) {
- range_debug(range, "PAGE FAULT - EVICT PAGES");
- drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base);
+ if (migrate_try_count > 0 || !ctx.devmem_only) {
+ if (err == -EOPNOTSUPP) {
+ range_debug(range, "PAGE FAULT - EVICT PAGES");
+ drm_gpusvm_range_evict(&vm->svm.gpusvm,
+ &range->base);
+ }
+ drm_dbg(&vm->xe->drm,
+ "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
+ vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
+ range_debug(range, "PAGE FAULT - RETRY PAGES");
+ goto retry;
+ } else {
+ drm_err(&vm->xe->drm,
+ "Get pages failed, retry count exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
+ vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
}
- drm_dbg(&vm->xe->drm,
- "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
- vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
- range_debug(range, "PAGE FAULT - RETRY PAGES");
- goto retry;
}
if (err) {
range_debug(range, "PAGE FAULT - FAIL PAGE COLLECT");
@@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
}
drm_exec_fini(&exec);
- if (xe_modparam.always_migrate_to_vram)
- range->skip_migrate = false;
-
dma_fence_wait(fence, false);
dma_fence_put(fence);
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
index 3d441eb1f7ea..0e1f376a7471 100644
--- a/drivers/gpu/drm/xe/xe_svm.h
+++ b/drivers/gpu/drm/xe/xe_svm.h
@@ -39,11 +39,6 @@ struct xe_svm_range {
* range. Protected by GPU SVM notifier lock.
*/
u8 tile_invalidated;
- /**
- * @skip_migrate: Skip migration to VRAM, protected by GPU fault handler
- * locking.
- */
- u8 skip_migrate :1;
};
/**
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 3/5] drm/gpusvm: Add timeslicing support to GPU SVM
2025-04-22 17:04 [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM Matthew Brost
2025-04-22 17:04 ` [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation Matthew Brost
2025-04-22 17:04 ` [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults Matthew Brost
@ 2025-04-22 17:04 ` Matthew Brost
2025-04-23 5:36 ` Ghimiray, Himal Prasad
2025-04-22 17:04 ` [PATCH v4 4/5] drm/xe: Timeslice GPU on atomic SVM fault Matthew Brost
2025-04-22 17:04 ` [PATCH v4 5/5] drm/xe: Add atomic_svm_timeslice_ms debugfs entry Matthew Brost
4 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-22 17:04 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
Add timeslicing support to GPU SVM, which guarantees the GPU a minimum
execution time on a piece of physical memory before it is migrated back to
the CPU. This is intended for implementing strict migration policies which
require memory to be in a certain placement for correct execution.
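In short, the mechanism added below: the migrate-to-devmem path stamps an
expiration time on the allocation, and CPU faults that arrive before it has
passed simply back off (sketch, lifted from the diff):

	/* Driver requests a minimum residency, e.g. 5 ms: */
	ctx.timeslice_ms = 5;

	/* drm_gpusvm_migrate_to_devmem(), on success: */
	devmem_allocation->timeslice_expiration = get_jiffies_64() +
		msecs_to_jiffies(ctx->timeslice_ms);

	/* __drm_gpusvm_migrate_to_ram(), on a CPU fault: */
	if (time_before64(get_jiffies_64(),
			  zdd->devmem_allocation->timeslice_expiration))
		return 0;	/* fault is retried after the timeslice expires */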
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 9 +++++++++
include/drm/drm_gpusvm.h | 5 +++++
2 files changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index edf107809d20..40a56f38ff8e 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1770,6 +1770,8 @@ int drm_gpusvm_migrate_to_devmem(struct drm_gpusvm *gpusvm,
goto err_finalize;
/* Upon success bind devmem allocation to range and zdd */
+ devmem_allocation->timeslice_expiration = get_jiffies_64() +
+ msecs_to_jiffies(ctx->timeslice_ms);
zdd->devmem_allocation = devmem_allocation; /* Owns ref */
err_finalize:
@@ -1990,6 +1992,13 @@ static int __drm_gpusvm_migrate_to_ram(struct vm_area_struct *vas,
void *buf;
int i, err = 0;
+ if (page) {
+ zdd = page->zone_device_data;
+ if (time_before64(get_jiffies_64(),
+ zdd->devmem_allocation->timeslice_expiration))
+ return 0;
+ }
+
start = ALIGN_DOWN(fault_addr, size);
end = ALIGN(fault_addr + 1, size);
diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
index 9fd25fc880a4..cce217bc136f 100644
--- a/include/drm/drm_gpusvm.h
+++ b/include/drm/drm_gpusvm.h
@@ -89,6 +89,7 @@ struct drm_gpusvm_devmem_ops {
* @ops: Pointer to the operations structure for GPU SVM device memory
* @dpagemap: The struct drm_pagemap of the pages this allocation belongs to.
* @size: Size of device memory allocation
+ * @timeslice_expiration: Timeslice expiration in jiffies
*/
struct drm_gpusvm_devmem {
struct device *dev;
@@ -97,6 +98,7 @@ struct drm_gpusvm_devmem {
const struct drm_gpusvm_devmem_ops *ops;
struct drm_pagemap *dpagemap;
size_t size;
+ u64 timeslice_expiration;
};
/**
@@ -283,6 +285,8 @@ struct drm_gpusvm {
* @check_pages_threshold: Check CPU pages for present if chunk is less than or
* equal to threshold. If not present, reduce chunk
* size.
+ * @timeslice_ms: The timeslice MS which in minimum time a piece of memory
+ * remains with either exclusive GPU or CPU access.
* @in_notifier: entering from a MMU notifier
* @read_only: operating on read-only memory
* @devmem_possible: possible to use device memory
@@ -292,6 +296,7 @@ struct drm_gpusvm {
*/
struct drm_gpusvm_ctx {
unsigned long check_pages_threshold;
+ unsigned long timeslice_ms;
unsigned int in_notifier :1;
unsigned int read_only :1;
unsigned int devmem_possible :1;
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 4/5] drm/xe: Timeslice GPU on atomic SVM fault
2025-04-22 17:04 [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM Matthew Brost
` (2 preceding siblings ...)
2025-04-22 17:04 ` [PATCH v4 3/5] drm/gpusvm: Add timeslicing support to GPU SVM Matthew Brost
@ 2025-04-22 17:04 ` Matthew Brost
2025-04-23 5:36 ` Ghimiray, Himal Prasad
2025-04-22 17:04 ` [PATCH v4 5/5] drm/xe: Add atomic_svm_timeslice_ms debugfs entry Matthew Brost
4 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-22 17:04 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
Ensure the GPU can make forward progress on an atomic SVM GPU fault by
giving the GPU a 5ms timeslice.
v2:
- Reduce timeslice to 5ms
- Double timeslice on retry
- Split out GPU SVM changes into independent patch
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_svm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index f749ae367a8f..d5376a76cdd1 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -783,6 +783,8 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0,
.devmem_only = atomic && IS_DGFX(vm->xe) &&
IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
+ .timeslice_ms = atomic && IS_DGFX(vm->xe) &&
+ IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? 5 : 0,
};
struct xe_svm_range *range;
struct drm_gpusvm_range *r;
@@ -819,6 +821,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
if (--migrate_try_count >= 0 &&
xe_svm_range_needs_migrate_to_vram(range, vma)) {
err = xe_svm_alloc_vram(vm, tile, range, &ctx);
+ ctx.timeslice_ms <<= 1; /* Double timeslice if we have to retry */
if (err) {
if (migrate_try_count || !ctx.devmem_only) {
drm_dbg(&vm->xe->drm,
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v4 5/5] drm/xe: Add atomic_svm_timeslice_ms debugfs entry
2025-04-22 17:04 [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM Matthew Brost
` (3 preceding siblings ...)
2025-04-22 17:04 ` [PATCH v4 4/5] drm/xe: Timeslice GPU on atomic SVM fault Matthew Brost
@ 2025-04-22 17:04 ` Matthew Brost
2025-04-23 5:37 ` Ghimiray, Himal Prasad
4 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-22 17:04 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
Add an informal control for the atomic SVM fault GPU timeslice so that it is
possible to experiment with values and tune performance.
v2:
- Reduce timeslice default value to 5ms
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_debugfs.c | 38 ++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_device.c | 1 +
drivers/gpu/drm/xe/xe_device_types.h | 3 +++
drivers/gpu/drm/xe/xe_svm.c | 3 ++-
4 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
index d0503959a8ed..d83cd6ed3fa8 100644
--- a/drivers/gpu/drm/xe/xe_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_debugfs.c
@@ -191,6 +191,41 @@ static const struct file_operations wedged_mode_fops = {
.write = wedged_mode_set,
};
+static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user *ubuf,
+ size_t size, loff_t *pos)
+{
+ struct xe_device *xe = file_inode(f)->i_private;
+ char buf[32];
+ int len = 0;
+
+ len = scnprintf(buf, sizeof(buf), "%d\n", xe->atomic_svm_timeslice_ms);
+
+ return simple_read_from_buffer(ubuf, size, pos, buf, len);
+}
+
+static ssize_t atomic_svm_timeslice_ms_set(struct file *f,
+ const char __user *ubuf,
+ size_t size, loff_t *pos)
+{
+ struct xe_device *xe = file_inode(f)->i_private;
+ u32 atomic_svm_timeslice_ms;
+ ssize_t ret;
+
+ ret = kstrtouint_from_user(ubuf, size, 0, &atomic_svm_timeslice_ms);
+ if (ret)
+ return ret;
+
+ xe->atomic_svm_timeslice_ms = atomic_svm_timeslice_ms;
+
+ return size;
+}
+
+static const struct file_operations atomic_svm_timeslice_ms_fops = {
+ .owner = THIS_MODULE,
+ .read = atomic_svm_timeslice_ms_show,
+ .write = atomic_svm_timeslice_ms_set,
+};
+
void xe_debugfs_register(struct xe_device *xe)
{
struct ttm_device *bdev = &xe->ttm;
@@ -211,6 +246,9 @@ void xe_debugfs_register(struct xe_device *xe)
debugfs_create_file("wedged_mode", 0600, root, xe,
&wedged_mode_fops);
+ debugfs_create_file("atomic_svm_timeslice_ms", 0600, root, xe,
+ &atomic_svm_timeslice_ms_fops);
+
for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
man = ttm_manager_type(bdev, mem_type);
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 75e753e0a682..abf3c72baaa6 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -444,6 +444,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
xe->info.devid = pdev->device;
xe->info.revid = pdev->revision;
xe->info.force_execlist = xe_modparam.force_execlist;
+ xe->atomic_svm_timeslice_ms = 5;
err = xe_irq_init(xe);
if (err)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index b9a892c44c67..6f5222f42410 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -567,6 +567,9 @@ struct xe_device {
/** @pmu: performance monitoring unit */
struct xe_pmu pmu;
+ /** @atomic_svm_timeslice_ms: Atomic SVM fault timeslice MS */
+ u32 atomic_svm_timeslice_ms;
+
#ifdef TEST_VM_OPS_ERROR
/**
* @vm_inject_error_position: inject errors at different places in VM
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index d5376a76cdd1..de4eb04fc78e 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -784,7 +784,8 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
.devmem_only = atomic && IS_DGFX(vm->xe) &&
IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
.timeslice_ms = atomic && IS_DGFX(vm->xe) &&
- IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? 5 : 0,
+ IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ?
+ vm->xe->atomic_svm_timeslice_ms : 0,
};
struct xe_svm_range *range;
struct drm_gpusvm_range *r;
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-22 17:04 ` [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults Matthew Brost
@ 2025-04-22 17:21 ` Ghimiray, Himal Prasad
2025-04-23 17:29 ` Matthew Brost
2025-04-24 14:39 ` Thomas Hellström
1 sibling, 1 reply; 17+ messages in thread
From: Ghimiray, Himal Prasad @ 2025-04-22 17:21 UTC (permalink / raw)
To: Matthew Brost, intel-xe; +Cc: dri-devel, thomas.hellstrom
On 22-04-2025 22:34, Matthew Brost wrote:
> Mixing GPU and CPU atomics does not work unless a strict migration
> policy of GPU atomics must be device memory. Enforce a policy of must be
> in VRAM with a retry loop of 2 attempts, if retry loop fails abort
> fault.
The retry loop is 3 attempts, not 2. With that addressed, the patch looks good
to me. Since both of us collaborated on this, it has my ack and needs someone
else to review it as well.
Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
>
> v2:
> - Only retry migration on atomics
> - Drop alway migrate modparam
> v3:
> - Only set vram_only on DGFX (Himal)
> - Bail on get_pages failure if vram_only and retry count exceeded (Himal)
> - s/vram_only/devmem_only
> - Update xe_svm_range_is_valid to accept devmem_only argument
> v4:
> - Fix logic bug get_pages failure
>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_module.c | 3 --
> drivers/gpu/drm/xe/xe_module.h | 1 -
> drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++---------
> drivers/gpu/drm/xe/xe_svm.h | 5 --
> 4 files changed, 65 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 05c7d0ae6d83..1c4dfafbcd0b 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size, uint, 0600);
> MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size(in MiB), must be power of 2");
>
> -module_param_named(always_migrate_to_vram, xe_modparam.always_migrate_to_vram, bool, 0444);
> -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM on GPU fault");
> -
> module_param_named_unsafe(force_execlist, xe_modparam.force_execlist, bool, 0444);
> MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
>
> diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
> index 84339e509c80..5a3bfea8b7b4 100644
> --- a/drivers/gpu/drm/xe/xe_module.h
> +++ b/drivers/gpu/drm/xe/xe_module.h
> @@ -12,7 +12,6 @@
> struct xe_modparam {
> bool force_execlist;
> bool probe_display;
> - bool always_migrate_to_vram;
> u32 force_vram_bar_size;
> int guc_log_level;
> char *guc_firmware_path;
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index 890f6b2f40e9..f749ae367a8f 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> }
>
> static bool xe_svm_range_is_valid(struct xe_svm_range *range,
> - struct xe_tile *tile)
> + struct xe_tile *tile,
> + bool devmem_only)
> {
> - return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id);
> + return ((range->tile_present & ~range->tile_invalidated) & BIT(tile->id))
> + && (!devmem_only || range->base.flags.migrate_devmem);
> }
>
> #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile,
> }
> #endif
>
> +static bool supports_4K_migration(struct xe_device *xe)
> +{
> + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> + return false;
> +
> + return true;
> +}
> +
> +static bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range *range,
> + struct xe_vma *vma)
> +{
> + struct xe_vm *vm = range_to_vm(&range->base);
> + u64 range_size = xe_svm_range_size(range);
> +
> + if (!range->base.flags.migrate_devmem)
> + return false;
> +
> + if (xe_svm_range_in_vram(range)) {
> + drm_dbg(&vm->xe->drm, "Range is already in VRAM\n");
> + return false;
> + }
> +
> + if (range_size <= SZ_64K && !supports_4K_migration(vm->xe)) {
> + drm_dbg(&vm->xe->drm, "Platform doesn't support SZ_4K range migration\n");
> + return false;
> + }
> +
> + return true;
> +}
>
> /**
> * xe_svm_handle_pagefault() - SVM handle page fault
> @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> .check_pages_threshold = IS_DGFX(vm->xe) &&
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0,
> + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> };
> struct xe_svm_range *range;
> struct drm_gpusvm_range *r;
> struct drm_exec exec;
> struct dma_fence *fence;
> struct xe_tile *tile = gt_to_tile(gt);
> + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> ktime_t end = 0;
> int err;
>
> @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> return PTR_ERR(r);
>
> range = to_xe_range(r);
> - if (xe_svm_range_is_valid(range, tile))
> + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
> return 0;
>
> range_debug(range, "PAGE FAULT");
>
> - /* XXX: Add migration policy, for now migrate range once */
> - if (!range->skip_migrate && range->base.flags.migrate_devmem &&
> - xe_svm_range_size(range) >= SZ_64K) {
> - range->skip_migrate = true;
> -
> + if (--migrate_try_count >= 0 &&
> + xe_svm_range_needs_migrate_to_vram(range, vma)) {
> err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> if (err) {
> - drm_dbg(&vm->xe->drm,
> - "VRAM allocation failed, falling back to "
> - "retrying fault, asid=%u, errno=%pe\n",
> - vm->usm.asid, ERR_PTR(err));
> - goto retry;
> + if (migrate_try_count || !ctx.devmem_only) {
> + drm_dbg(&vm->xe->drm,
> + "VRAM allocation failed, falling back to retrying fault, asid=%u, errno=%pe\n",
> + vm->usm.asid, ERR_PTR(err));
> + goto retry;
> + } else {
> + drm_err(&vm->xe->drm,
> + "VRAM allocation failed, retry count exceeded, asid=%u, errno=%pe\n",
> + vm->usm.asid, ERR_PTR(err));
> + return err;
> + }
> }
> }
>
> @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
> /* Corner where CPU mappings have changed */
> if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
> - if (err == -EOPNOTSUPP) {
> - range_debug(range, "PAGE FAULT - EVICT PAGES");
> - drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base);
> + if (migrate_try_count > 0 || !ctx.devmem_only) {
> + if (err == -EOPNOTSUPP) {
> + range_debug(range, "PAGE FAULT - EVICT PAGES");
> + drm_gpusvm_range_evict(&vm->svm.gpusvm,
> + &range->base);
> + }
> + drm_dbg(&vm->xe->drm,
> + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
> + range_debug(range, "PAGE FAULT - RETRY PAGES");
> + goto retry;
> + } else {
> + drm_err(&vm->xe->drm,
> + "Get pages failed, retry count exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
> }
> - drm_dbg(&vm->xe->drm,
> - "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> - vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
> - range_debug(range, "PAGE FAULT - RETRY PAGES");
> - goto retry;
> }
> if (err) {
> range_debug(range, "PAGE FAULT - FAIL PAGE COLLECT");
> @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> }
> drm_exec_fini(&exec);
>
> - if (xe_modparam.always_migrate_to_vram)
> - range->skip_migrate = false;
> -
> dma_fence_wait(fence, false);
> dma_fence_put(fence);
>
> diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> index 3d441eb1f7ea..0e1f376a7471 100644
> --- a/drivers/gpu/drm/xe/xe_svm.h
> +++ b/drivers/gpu/drm/xe/xe_svm.h
> @@ -39,11 +39,6 @@ struct xe_svm_range {
> * range. Protected by GPU SVM notifier lock.
> */
> u8 tile_invalidated;
> - /**
> - * @skip_migrate: Skip migration to VRAM, protected by GPU fault handler
> - * locking.
> - */
> - u8 skip_migrate :1;
> };
>
> /**
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 3/5] drm/gpusvm: Add timeslicing support to GPU SVM
2025-04-22 17:04 ` [PATCH v4 3/5] drm/gpusvm: Add timeslicing support to GPU SVM Matthew Brost
@ 2025-04-23 5:36 ` Ghimiray, Himal Prasad
0 siblings, 0 replies; 17+ messages in thread
From: Ghimiray, Himal Prasad @ 2025-04-23 5:36 UTC (permalink / raw)
To: Matthew Brost, intel-xe; +Cc: dri-devel, thomas.hellstrom
On 22-04-2025 22:34, Matthew Brost wrote:
> Add timeslicing support to GPU SVM which will guarantee the GPU a
> minimum execution time on piece of physical memory before migration back
> to CPU. Intended to implement strict migration policies which require
> memory to be in a certain placement for correct execution.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 9 +++++++++
> include/drm/drm_gpusvm.h | 5 +++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index edf107809d20..40a56f38ff8e 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1770,6 +1770,8 @@ int drm_gpusvm_migrate_to_devmem(struct drm_gpusvm *gpusvm,
> goto err_finalize;
>
> /* Upon success bind devmem allocation to range and zdd */
> + devmem_allocation->timeslice_expiration = get_jiffies_64() +
> + msecs_to_jiffies(ctx->timeslice_ms);
> zdd->devmem_allocation = devmem_allocation; /* Owns ref */
>
> err_finalize:
> @@ -1990,6 +1992,13 @@ static int __drm_gpusvm_migrate_to_ram(struct vm_area_struct *vas,
> void *buf;
> int i, err = 0;
>
> + if (page) {
> + zdd = page->zone_device_data;
> + if (time_before64(get_jiffies_64(),
> + zdd->devmem_allocation->timeslice_expiration))
> + return 0;
> + }
> +
> start = ALIGN_DOWN(fault_addr, size);
> end = ALIGN(fault_addr + 1, size);
>
> diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
> index 9fd25fc880a4..cce217bc136f 100644
> --- a/include/drm/drm_gpusvm.h
> +++ b/include/drm/drm_gpusvm.h
> @@ -89,6 +89,7 @@ struct drm_gpusvm_devmem_ops {
> * @ops: Pointer to the operations structure for GPU SVM device memory
> * @dpagemap: The struct drm_pagemap of the pages this allocation belongs to.
> * @size: Size of device memory allocation
> + * @timeslice_expiration: Timeslice expiration in jiffies
> */
> struct drm_gpusvm_devmem {
> struct device *dev;
> @@ -97,6 +98,7 @@ struct drm_gpusvm_devmem {
> const struct drm_gpusvm_devmem_ops *ops;
> struct drm_pagemap *dpagemap;
> size_t size;
> + u64 timeslice_expiration;
> };
>
> /**
> @@ -283,6 +285,8 @@ struct drm_gpusvm {
> * @check_pages_threshold: Check CPU pages for present if chunk is less than or
> * equal to threshold. If not present, reduce chunk
> * size.
> + * @timeslice_ms: The timeslice MS which in minimum time a piece of memory
> + * remains with either exclusive GPU or CPU access.
> * @in_notifier: entering from a MMU notifier
> * @read_only: operating on read-only memory
> * @devmem_possible: possible to use device memory
> @@ -292,6 +296,7 @@ struct drm_gpusvm {
> */
> struct drm_gpusvm_ctx {
> unsigned long check_pages_threshold;
> + unsigned long timeslice_ms;
LGTM
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> unsigned int in_notifier :1;
> unsigned int read_only :1;
> unsigned int devmem_possible :1;
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 4/5] drm/xe: Timeslice GPU on atomic SVM fault
2025-04-22 17:04 ` [PATCH v4 4/5] drm/xe: Timeslice GPU on atomic SVM fault Matthew Brost
@ 2025-04-23 5:36 ` Ghimiray, Himal Prasad
0 siblings, 0 replies; 17+ messages in thread
From: Ghimiray, Himal Prasad @ 2025-04-23 5:36 UTC (permalink / raw)
To: Matthew Brost, intel-xe; +Cc: dri-devel, thomas.hellstrom
On 22-04-2025 22:34, Matthew Brost wrote:
> Ensure GPU can make forward progress on an atomic SVM GPU fault by
> giving the GPU a timeslice of 5ms
>
> v2:
> - Reduce timeslice to 5ms
> - Double timeslice on retry
> - Split out GPU SVM changes into independent patch
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_svm.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index f749ae367a8f..d5376a76cdd1 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -783,6 +783,8 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0,
> .devmem_only = atomic && IS_DGFX(vm->xe) &&
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> + .timeslice_ms = atomic && IS_DGFX(vm->xe) &&
> + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? 5 : 0,
> };
> struct xe_svm_range *range;
> struct drm_gpusvm_range *r;
> @@ -819,6 +821,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> if (--migrate_try_count >= 0 &&
> xe_svm_range_needs_migrate_to_vram(range, vma)) {
> err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> + ctx.timeslice_ms <<= 1; /* Double timeslice if we have to retry */
LGTM
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> if (err) {
> if (migrate_try_count || !ctx.devmem_only) {
> drm_dbg(&vm->xe->drm,
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 5/5] drm/xe: Add atomic_svm_timeslice_ms debugfs entry
2025-04-22 17:04 ` [PATCH v4 5/5] drm/xe: Add atomic_svm_timeslice_ms debugfs entry Matthew Brost
@ 2025-04-23 5:37 ` Ghimiray, Himal Prasad
0 siblings, 0 replies; 17+ messages in thread
From: Ghimiray, Himal Prasad @ 2025-04-23 5:37 UTC (permalink / raw)
To: Matthew Brost, intel-xe; +Cc: dri-devel, thomas.hellstrom
On 22-04-2025 22:34, Matthew Brost wrote:
> Add some informal control for atomic SVM fault GPU timeslice to be able
> to play around with values and tweak performance.
>
> v2:
> - Reduce timeslice default value to 5ms
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_debugfs.c | 38 ++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_device.c | 1 +
> drivers/gpu/drm/xe/xe_device_types.h | 3 +++
> drivers/gpu/drm/xe/xe_svm.c | 3 ++-
> 4 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> index d0503959a8ed..d83cd6ed3fa8 100644
> --- a/drivers/gpu/drm/xe/xe_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> @@ -191,6 +191,41 @@ static const struct file_operations wedged_mode_fops = {
> .write = wedged_mode_set,
> };
>
> +static ssize_t atomic_svm_timeslice_ms_show(struct file *f, char __user *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_device *xe = file_inode(f)->i_private;
> + char buf[32];
> + int len = 0;
> +
> + len = scnprintf(buf, sizeof(buf), "%d\n", xe->atomic_svm_timeslice_ms);
> +
> + return simple_read_from_buffer(ubuf, size, pos, buf, len);
> +}
> +
> +static ssize_t atomic_svm_timeslice_ms_set(struct file *f,
> + const char __user *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_device *xe = file_inode(f)->i_private;
> + u32 atomic_svm_timeslice_ms;
> + ssize_t ret;
> +
> + ret = kstrtouint_from_user(ubuf, size, 0, &atomic_svm_timeslice_ms);
> + if (ret)
> + return ret;
> +
> + xe->atomic_svm_timeslice_ms = atomic_svm_timeslice_ms;
> +
> + return size;
> +}
> +
> +static const struct file_operations atomic_svm_timeslice_ms_fops = {
> + .owner = THIS_MODULE,
> + .read = atomic_svm_timeslice_ms_show,
> + .write = atomic_svm_timeslice_ms_set,
> +};
> +
> void xe_debugfs_register(struct xe_device *xe)
> {
> struct ttm_device *bdev = &xe->ttm;
> @@ -211,6 +246,9 @@ void xe_debugfs_register(struct xe_device *xe)
> debugfs_create_file("wedged_mode", 0600, root, xe,
> &wedged_mode_fops);
>
> + debugfs_create_file("atomic_svm_timeslice_ms", 0600, root, xe,
> + &atomic_svm_timeslice_ms_fops);
> +
> for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
> man = ttm_manager_type(bdev, mem_type);
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 75e753e0a682..abf3c72baaa6 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -444,6 +444,7 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> xe->info.devid = pdev->device;
> xe->info.revid = pdev->revision;
> xe->info.force_execlist = xe_modparam.force_execlist;
> + xe->atomic_svm_timeslice_ms = 5;
>
> err = xe_irq_init(xe);
> if (err)
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index b9a892c44c67..6f5222f42410 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -567,6 +567,9 @@ struct xe_device {
> /** @pmu: performance monitoring unit */
> struct xe_pmu pmu;
>
> + /** @atomic_svm_timeslice_ms: Atomic SVM fault timeslice MS */
> + u32 atomic_svm_timeslice_ms;
> +
> #ifdef TEST_VM_OPS_ERROR
> /**
> * @vm_inject_error_position: inject errors at different places in VM
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index d5376a76cdd1..de4eb04fc78e 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -784,7 +784,8 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> .devmem_only = atomic && IS_DGFX(vm->xe) &&
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> .timeslice_ms = atomic && IS_DGFX(vm->xe) &&
> - IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? 5 : 0,
> + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ?
> + vm->xe->atomic_svm_timeslice_ms : 0,
LGTM
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> };
> struct xe_svm_range *range;
> struct drm_gpusvm_range *r;
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation
2025-04-22 17:04 ` [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation Matthew Brost
@ 2025-04-23 16:14 ` Matthew Brost
0 siblings, 0 replies; 17+ messages in thread
From: Matthew Brost @ 2025-04-23 16:14 UTC (permalink / raw)
To: intel-xe; +Cc: dri-devel, thomas.hellstrom, himal.prasad.ghimiray
On Tue, Apr 22, 2025 at 10:04:11AM -0700, Matthew Brost wrote:
> From: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
>
> This commit adds a new flag, devmem_only, to the drm_gpusvm_ctx structure. The
> purpose of this flag is to ensure that the get_pages function uses memory
> allocated exclusively from the device's memory. If the allocation from
> device memory fails, the function returns an -EFAULT error.
>
> v3:
> - s/vram_only/devmem_only/
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> ---
> drivers/gpu/drm/drm_gpusvm.c | 5 +++++
> include/drm/drm_gpusvm.h | 2 ++
> 2 files changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
> index 38431e8360e7..edf107809d20 100644
> --- a/drivers/gpu/drm/drm_gpusvm.c
> +++ b/drivers/gpu/drm/drm_gpusvm.c
> @@ -1454,6 +1454,11 @@ int drm_gpusvm_range_get_pages(struct drm_gpusvm *gpusvm,
> goto err_unmap;
> }
>
> + if (ctx->devmem_only) {
> + err = -EFAULT;
> + goto err_unmap;
> + }
> +
> addr = dma_map_page(gpusvm->drm->dev,
> page, 0,
> PAGE_SIZE << order,
> diff --git a/include/drm/drm_gpusvm.h b/include/drm/drm_gpusvm.h
> index df120b4d1f83..9fd25fc880a4 100644
> --- a/include/drm/drm_gpusvm.h
> +++ b/include/drm/drm_gpusvm.h
> @@ -286,6 +286,7 @@ struct drm_gpusvm {
> * @in_notifier: entering from a MMU notifier
> * @read_only: operating on read-only memory
> * @devmem_possible: possible to use device memory
> + * @devmem_only: use only device memory
> *
> * Context that is DRM GPUSVM is operating in (i.e. user arguments).
> */
> @@ -294,6 +295,7 @@ struct drm_gpusvm_ctx {
> unsigned int in_notifier :1;
> unsigned int read_only :1;
> unsigned int devmem_possible :1;
> + unsigned int devmem_only :1;
> };
>
> int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-22 17:21 ` Ghimiray, Himal Prasad
@ 2025-04-23 17:29 ` Matthew Brost
0 siblings, 0 replies; 17+ messages in thread
From: Matthew Brost @ 2025-04-23 17:29 UTC (permalink / raw)
To: Ghimiray, Himal Prasad; +Cc: intel-xe, dri-devel, thomas.hellstrom
On Tue, Apr 22, 2025 at 10:51:44PM +0530, Ghimiray, Himal Prasad wrote:
>
>
> On 22-04-2025 22:34, Matthew Brost wrote:
> > Mixing GPU and CPU atomics does not work unless a strict migration
> > policy of GPU atomics must be device memory. Enforce a policy of must be
> > in VRAM with a retry loop of 2 attempts, if retry loop fails abort
> > fault.
>
> retry loop of 3 attempts not 2. with that addressed, Patch looks good to me.
> Since both of us collaborated on this , it has my ack and need someone else
> also to review it.
>
> Acked-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
>
> >
> > v2:
> > - Only retry migration on atomics
> > - Drop alway migrate modparam
> > v3:
> > - Only set vram_only on DGFX (Himal)
> > - Bail on get_pages failure if vram_only and retry count exceeded (Himal)
> > - s/vram_only/devmem_only
> > - Update xe_svm_range_is_valid to accept devmem_only argument
> > v4:
> > - Fix logic bug get_pages failure
> >
> > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_module.c | 3 --
> > drivers/gpu/drm/xe/xe_module.h | 1 -
> > drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++---------
> > drivers/gpu/drm/xe/xe_svm.h | 5 --
> > 4 files changed, 65 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> > index 05c7d0ae6d83..1c4dfafbcd0b 100644
> > --- a/drivers/gpu/drm/xe/xe_module.c
> > +++ b/drivers/gpu/drm/xe/xe_module.c
> > @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> > module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size, uint, 0600);
> > MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size(in MiB), must be power of 2");
> > -module_param_named(always_migrate_to_vram, xe_modparam.always_migrate_to_vram, bool, 0444);
> > -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM on GPU fault");
> > -
> > module_param_named_unsafe(force_execlist, xe_modparam.force_execlist, bool, 0444);
> > MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
> > diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
> > index 84339e509c80..5a3bfea8b7b4 100644
> > --- a/drivers/gpu/drm/xe/xe_module.h
> > +++ b/drivers/gpu/drm/xe/xe_module.h
> > @@ -12,7 +12,6 @@
> > struct xe_modparam {
> > bool force_execlist;
> > bool probe_display;
> > - bool always_migrate_to_vram;
> > u32 force_vram_bar_size;
> > int guc_log_level;
> > char *guc_firmware_path;
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > index 890f6b2f40e9..f749ae367a8f 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> > }
> > static bool xe_svm_range_is_valid(struct xe_svm_range *range,
> > - struct xe_tile *tile)
> > + struct xe_tile *tile,
> > + bool devmem_only)
> > {
> > - return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id);
> > + return ((range->tile_present & ~range->tile_invalidated) & BIT(tile->id))
> > + && (!devmem_only || range->base.flags.migrate_devmem);
Typo...
s/migrate_devmem/has_devmem_pages/
Matt
> > }
> > #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm *vm, struct xe_tile *tile,
> > }
> > #endif
> > +static bool supports_4K_migration(struct xe_device *xe)
> > +{
> > + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> > + return false;
> > +
> > + return true;
> > +}
> > +
> > +static bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range *range,
> > + struct xe_vma *vma)
> > +{
> > + struct xe_vm *vm = range_to_vm(&range->base);
> > + u64 range_size = xe_svm_range_size(range);
> > +
> > + if (!range->base.flags.migrate_devmem)
> > + return false;
> > +
> > + if (xe_svm_range_in_vram(range)) {
> > + drm_dbg(&vm->xe->drm, "Range is already in VRAM\n");
> > + return false;
> > + }
> > +
> > + if (range_size <= SZ_64K && !supports_4K_migration(vm->xe)) {
> > + drm_dbg(&vm->xe->drm, "Platform doesn't support SZ_4K range migration\n");
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> > /**
> > * xe_svm_handle_pagefault() - SVM handle page fault
> > @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > .check_pages_threshold = IS_DGFX(vm->xe) &&
> > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0,
> > + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > };
> > struct xe_svm_range *range;
> > struct drm_gpusvm_range *r;
> > struct drm_exec exec;
> > struct dma_fence *fence;
> > struct xe_tile *tile = gt_to_tile(gt);
> > + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> > ktime_t end = 0;
> > int err;
> > @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> > return PTR_ERR(r);
> > range = to_xe_range(r);
> > - if (xe_svm_range_is_valid(range, tile))
> > + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
> > return 0;
> > range_debug(range, "PAGE FAULT");
> > - /* XXX: Add migration policy, for now migrate range once */
> > - if (!range->skip_migrate && range->base.flags.migrate_devmem &&
> > - xe_svm_range_size(range) >= SZ_64K) {
> > - range->skip_migrate = true;
> > -
> > + if (--migrate_try_count >= 0 &&
> > + xe_svm_range_needs_migrate_to_vram(range, vma)) {
> > err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> > if (err) {
> > - drm_dbg(&vm->xe->drm,
> > - "VRAM allocation failed, falling back to "
> > - "retrying fault, asid=%u, errno=%pe\n",
> > - vm->usm.asid, ERR_PTR(err));
> > - goto retry;
> > + if (migrate_try_count || !ctx.devmem_only) {
> > + drm_dbg(&vm->xe->drm,
> > + "VRAM allocation failed, falling back to retrying fault, asid=%u, errno=%pe\n",
> > + vm->usm.asid, ERR_PTR(err));
> > + goto retry;
> > + } else {
> > + drm_err(&vm->xe->drm,
> > + "VRAM allocation failed, retry count exceeded, asid=%u, errno=%pe\n",
> > + vm->usm.asid, ERR_PTR(err));
> > + return err;
> > + }
> > }
> > }
> > @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
> > /* Corner where CPU mappings have changed */
> > if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
> > - if (err == -EOPNOTSUPP) {
> > - range_debug(range, "PAGE FAULT - EVICT PAGES");
> > - drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base);
> > + if (migrate_try_count > 0 || !ctx.devmem_only) {
> > + if (err == -EOPNOTSUPP) {
> > + range_debug(range, "PAGE FAULT - EVICT PAGES");
> > + drm_gpusvm_range_evict(&vm->svm.gpusvm,
> > + &range->base);
> > + }
> > + drm_dbg(&vm->xe->drm,
> > + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> > + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
> > + range_debug(range, "PAGE FAULT - RETRY PAGES");
> > + goto retry;
> > + } else {
> > + drm_err(&vm->xe->drm,
> > + "Get pages failed, retry count exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> > + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
> > }
> > - drm_dbg(&vm->xe->drm,
> > - "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> > - vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
> > - range_debug(range, "PAGE FAULT - RETRY PAGES");
> > - goto retry;
> > }
> > if (err) {
> > range_debug(range, "PAGE FAULT - FAIL PAGE COLLECT");
> > @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> > }
> > drm_exec_fini(&exec);
> > - if (xe_modparam.always_migrate_to_vram)
> > - range->skip_migrate = false;
> > -
> > dma_fence_wait(fence, false);
> > dma_fence_put(fence);
> > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> > index 3d441eb1f7ea..0e1f376a7471 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.h
> > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > @@ -39,11 +39,6 @@ struct xe_svm_range {
> > * range. Protected by GPU SVM notifier lock.
> > */
> > u8 tile_invalidated;
> > - /**
> > - * @skip_migrate: Skip migration to VRAM, protected by GPU fault handler
> > - * locking.
> > - */
> > - u8 skip_migrate :1;
> > };
> > /**
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-22 17:04 ` [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults Matthew Brost
2025-04-22 17:21 ` Ghimiray, Himal Prasad
@ 2025-04-24 14:39 ` Thomas Hellström
2025-04-24 18:03 ` Matthew Brost
1 sibling, 1 reply; 17+ messages in thread
From: Thomas Hellström @ 2025-04-24 14:39 UTC (permalink / raw)
To: Matthew Brost, intel-xe; +Cc: dri-devel, himal.prasad.ghimiray
On Tue, 2025-04-22 at 10:04 -0700, Matthew Brost wrote:
> Mixing GPU and CPU atomics does not work unless a strict migration
> policy of GPU atomics must be device memory. Enforce a policy of must
> be
> in VRAM with a retry loop of 2 attempts, if retry loop fails abort
> fault.
>
> v2:
> - Only retry migration on atomics
> - Drop alway migrate modparam
> v3:
> - Only set vram_only on DGFX (Himal)
> - Bail on get_pages failure if vram_only and retry count exceeded
> (Himal)
> - s/vram_only/devmem_only
> - Update xe_svm_range_is_valid to accept devmem_only argument
> v4:
> - Fix logic bug get_pages failure
>
> Signed-off-by: Himal Prasad Ghimiray
> <himal.prasad.ghimiray@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_module.c | 3 --
> drivers/gpu/drm/xe/xe_module.h | 1 -
> drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++-------
> --
> drivers/gpu/drm/xe/xe_svm.h | 5 --
> 4 files changed, 65 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_module.c
> b/drivers/gpu/drm/xe/xe_module.c
> index 05c7d0ae6d83..1c4dfafbcd0b 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size,
> uint, 0600);
> MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size(in
> MiB), must be power of 2");
>
> -module_param_named(always_migrate_to_vram,
> xe_modparam.always_migrate_to_vram, bool, 0444);
> -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM on
> GPU fault");
> -
> module_param_named_unsafe(force_execlist,
> xe_modparam.force_execlist, bool, 0444);
> MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
>
> diff --git a/drivers/gpu/drm/xe/xe_module.h
> b/drivers/gpu/drm/xe/xe_module.h
> index 84339e509c80..5a3bfea8b7b4 100644
> --- a/drivers/gpu/drm/xe/xe_module.h
> +++ b/drivers/gpu/drm/xe/xe_module.h
> @@ -12,7 +12,6 @@
> struct xe_modparam {
> bool force_execlist;
> bool probe_display;
> - bool always_migrate_to_vram;
> u32 force_vram_bar_size;
> int guc_log_level;
> char *guc_firmware_path;
> diff --git a/drivers/gpu/drm/xe/xe_svm.c
> b/drivers/gpu/drm/xe/xe_svm.c
> index 890f6b2f40e9..f749ae367a8f 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> }
>
> static bool xe_svm_range_is_valid(struct xe_svm_range *range,
> - struct xe_tile *tile)
> + struct xe_tile *tile,
> + bool devmem_only)
> {
> - return (range->tile_present & ~range->tile_invalidated) &
> BIT(tile->id);
> + return ((range->tile_present & ~range->tile_invalidated) &
> BIT(tile->id))
> + && (!devmem_only || range-
> >base.flags.migrate_devmem);
> }
So let's say devmem_only is true here, and range->base.flags.migrate_devmem
is false. Wouldn't that mean the range is unusable and needs to be freed and
re-allocated?

Also, another thing going back to older code: it seems like
range->tile_invalidated is protected by the notifier lock, so shouldn't we
assert that it is held in the function? It seems not to be held further
below:
>
> #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm *vm,
> struct xe_tile *tile,
> }
> #endif
>
> +static bool supports_4K_migration(struct xe_device *xe)
> +{
> + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> + return false;
> +
> + return true;
> +}
Do we have any hardware that supports pagefaults but not 4K VRAM pages?
> +
> +static bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range
> *range,
> + struct xe_vma *vma)
> +{
> + struct xe_vm *vm = range_to_vm(&range->base);
> + u64 range_size = xe_svm_range_size(range);
> +
> + if (!range->base.flags.migrate_devmem)
> + return false;
> +
> + if (xe_svm_range_in_vram(range)) {
> + drm_dbg(&vm->xe->drm, "Range is already in VRAM\n");
> + return false;
> + }
> +
> + if (range_size <= SZ_64K && !supports_4K_migration(vm->xe))
> {
> + drm_dbg(&vm->xe->drm, "Platform doesn't support
> SZ_4K range migration\n");
> + return false;
> + }
> +
> + return true;
> +}
>
> /**
> * xe_svm_handle_pagefault() - SVM handle page fault
> @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> struct xe_vma *vma,
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> .check_pages_threshold = IS_DGFX(vm->xe) &&
> IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ?
> SZ_64K : 0,
> + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> };
> struct xe_svm_range *range;
> struct drm_gpusvm_range *r;
> struct drm_exec exec;
> struct dma_fence *fence;
> struct xe_tile *tile = gt_to_tile(gt);
> + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> ktime_t end = 0;
> int err;
>
> @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> struct xe_vma *vma,
> return PTR_ERR(r);
>
> range = to_xe_range(r);
> - if (xe_svm_range_is_valid(range, tile))
> + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
Requires notifier lock. Also see comment on re-allocating the range
above.
> return 0;
>
> range_debug(range, "PAGE FAULT");
>
> - /* XXX: Add migration policy, for now migrate range once */
> - if (!range->skip_migrate && range->base.flags.migrate_devmem
> &&
> - xe_svm_range_size(range) >= SZ_64K) {
> - range->skip_migrate = true;
> -
> + if (--migrate_try_count >= 0 &&
> + xe_svm_range_needs_migrate_to_vram(range, vma)
Requires notifier lock.
Should we have some sort of timeout instead of a try-count? Perhaps as
a last resort fall back to a 4K range?
/Thomas
> ) {
> err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> if (err) {
> - drm_dbg(&vm->xe->drm,
> - "VRAM allocation failed, falling
> back to "
> - "retrying fault, asid=%u,
> errno=%pe\n",
> - vm->usm.asid, ERR_PTR(err));
> - goto retry;
> + if (migrate_try_count || !ctx.devmem_only) {
> + drm_dbg(&vm->xe->drm,
> + "VRAM allocation failed,
> falling back to retrying fault, asid=%u, errno=%pe\n",
> + vm->usm.asid, ERR_PTR(err));
> + goto retry;
> + } else {
> + drm_err(&vm->xe->drm,
> + "VRAM allocation failed,
> retry count exceeded, asid=%u, errno=%pe\n",
> + vm->usm.asid, ERR_PTR(err));
> + return err;
> + }
> }
> }
>
> @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> struct xe_vma *vma,
> err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
> /* Corner where CPU mappings have changed */
> if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
> - if (err == -EOPNOTSUPP) {
> - range_debug(range, "PAGE FAULT - EVICT
> PAGES");
> - drm_gpusvm_range_evict(&vm->svm.gpusvm,
> &range->base);
> + if (migrate_try_count > 0 || !ctx.devmem_only) {
> + if (err == -EOPNOTSUPP) {
> + range_debug(range, "PAGE FAULT -
> EVICT PAGES");
> + drm_gpusvm_range_evict(&vm-
> >svm.gpusvm,
> + &range-
> >base);
> + }
> + drm_dbg(&vm->xe->drm,
> + "Get pages failed, falling back to
> retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> + vm->usm.asid, &vm->svm.gpusvm,
> ERR_PTR(err));
> + range_debug(range, "PAGE FAULT - RETRY
> PAGES");
> + goto retry;
> + } else {
> + drm_err(&vm->xe->drm,
> + "Get pages failed, retry count
> exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> + vm->usm.asid, &vm->svm.gpusvm,
> ERR_PTR(err));
> }
> - drm_dbg(&vm->xe->drm,
> - "Get pages failed, falling back to retrying,
> asid=%u, gpusvm=%p, errno=%pe\n",
> - vm->usm.asid, &vm->svm.gpusvm,
> ERR_PTR(err));
> - range_debug(range, "PAGE FAULT - RETRY PAGES");
> - goto retry;
> }
> if (err) {
> range_debug(range, "PAGE FAULT - FAIL PAGE
> COLLECT");
> @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> struct xe_vma *vma,
> }
> drm_exec_fini(&exec);
>
> - if (xe_modparam.always_migrate_to_vram)
> - range->skip_migrate = false;
> -
> dma_fence_wait(fence, false);
> dma_fence_put(fence);
>
> diff --git a/drivers/gpu/drm/xe/xe_svm.h
> b/drivers/gpu/drm/xe/xe_svm.h
> index 3d441eb1f7ea..0e1f376a7471 100644
> --- a/drivers/gpu/drm/xe/xe_svm.h
> +++ b/drivers/gpu/drm/xe/xe_svm.h
> @@ -39,11 +39,6 @@ struct xe_svm_range {
> * range. Protected by GPU SVM notifier lock.
> */
> u8 tile_invalidated;
> - /**
> - * @skip_migrate: Skip migration to VRAM, protected by GPU
> fault handler
> - * locking.
> - */
> - u8 skip_migrate :1;
> };
>
> /**
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-24 14:39 ` Thomas Hellström
@ 2025-04-24 18:03 ` Matthew Brost
2025-04-25 7:18 ` Thomas Hellström
0 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-24 18:03 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, dri-devel, himal.prasad.ghimiray
On Thu, Apr 24, 2025 at 04:39:21PM +0200, Thomas Hellström wrote:
> On Tue, 2025-04-22 at 10:04 -0700, Matthew Brost wrote:
> > Mixing GPU and CPU atomics does not work unless a strict migration
> > policy of GPU atomics must be device memory. Enforce a policy of must
> > be
> > in VRAM with a retry loop of 2 attempts, if retry loop fails abort
> > fault.
> >
> > v2:
> > - Only retry migration on atomics
> > - Drop alway migrate modparam
> > v3:
> > - Only set vram_only on DGFX (Himal)
> > - Bail on get_pages failure if vram_only and retry count exceeded
> > (Himal)
> > - s/vram_only/devmem_only
> > - Update xe_svm_range_is_valid to accept devmem_only argument
> > v4:
> > - Fix logic bug get_pages failure
> >
> > Signed-off-by: Himal Prasad Ghimiray
> > <himal.prasad.ghimiray@intel.com>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_module.c | 3 --
> > drivers/gpu/drm/xe/xe_module.h | 1 -
> > drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++-------
> > --
> > drivers/gpu/drm/xe/xe_svm.h | 5 --
> > 4 files changed, 65 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_module.c
> > b/drivers/gpu/drm/xe/xe_module.c
> > index 05c7d0ae6d83..1c4dfafbcd0b 100644
> > --- a/drivers/gpu/drm/xe/xe_module.c
> > +++ b/drivers/gpu/drm/xe/xe_module.c
> > @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> > module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size,
> > uint, 0600);
> > MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size(in
> > MiB), must be power of 2");
> >
> > -module_param_named(always_migrate_to_vram,
> > xe_modparam.always_migrate_to_vram, bool, 0444);
> > -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM on
> > GPU fault");
> > -
> module_param_named_unsafe(force_execlist,
> > xe_modparam.force_execlist, bool, 0444);
> > MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
> >
> > diff --git a/drivers/gpu/drm/xe/xe_module.h
> > b/drivers/gpu/drm/xe/xe_module.h
> > index 84339e509c80..5a3bfea8b7b4 100644
> > --- a/drivers/gpu/drm/xe/xe_module.h
> > +++ b/drivers/gpu/drm/xe/xe_module.h
> > @@ -12,7 +12,6 @@
> > struct xe_modparam {
> > bool force_execlist;
> > bool probe_display;
> > - bool always_migrate_to_vram;
> > u32 force_vram_bar_size;
> > int guc_log_level;
> > char *guc_firmware_path;
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c
> > b/drivers/gpu/drm/xe/xe_svm.c
> > index 890f6b2f40e9..f749ae367a8f 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> > }
> >
> > static bool xe_svm_range_is_valid(struct xe_svm_range *range,
> > - struct xe_tile *tile)
> > + struct xe_tile *tile,
> > + bool devmem_only)
> > {
> > - return (range->tile_present & ~range->tile_invalidated) &
> > BIT(tile->id);
> > + return ((range->tile_present & ~range->tile_invalidated) &
> > BIT(tile->id))
> > + && (!devmem_only || range-
> > >base.flags.migrate_devmem);
> > }
>
> So let's say devmem_only is true here, and range-
> >base.flags.migrate_devmem is false. Wouldn't that mean the range is
> unusable and needs to be freed and re-allocated?
>
This is a typo; it should be s/migrate_devmem/has_devmem_pages.
That translates to: either devmem_only is not required, or we already
have devmem pages with a valid mapping.
If migrate_devmem is false and devmem_only is true, that is actually a
fatal error; we should add a check for that and kill the fault. An
example of this would be a shared mapping, which cannot be migrated to
devmem.
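For reference, the intended check would look roughly like this (sketch
only, not the actual patch, and assuming has_devmem_pages sits in
range->base.flags alongside migrate_devmem):

static bool xe_svm_range_is_valid(struct xe_svm_range *range,
				  struct xe_tile *tile,
				  bool devmem_only)
{
	/* Valid for this tile: pages present and not invalidated... */
	return ((range->tile_present & ~range->tile_invalidated) &
		BIT(tile->id)) &&
		/*
		 * ...and either devmem is not required, or the range is
		 * already backed by devmem pages.
		 */
		(!devmem_only || range->base.flags.has_devmem_pages);
}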
> Also another thing going back to older code, it seems like range-
> >tile_invalidated is protected by the notifier lock, so shouldn't we
> assert that to be held in the function? It seems not to be held further
> below:
Yeah, technically we'd need the notifier lock to get a stable value,
but this is an opportunistic check - at worst, if we read a stale
"valid" state we skip servicing the fault and will immediately get
another page fault. So we could take the notifier lock here, but I
don't think it is strictly required. Let me know what you think.
>
> >
> > #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm *vm,
> > struct xe_tile *tile,
> > }
> > #endif
> >
> > +static bool supports_4K_migration(struct xe_device *xe)
> > +{
> > + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> > + return false;
> > +
> > + return true;
> > +}
>
> Do we have any hardware that supports pagefaults but not 4K VRAM pages?
>
PVC
> > +
> > +static bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range
> > *range,
> > + struct xe_vma *vma)
> > +{
> > + struct xe_vm *vm = range_to_vm(&range->base);
> > + u64 range_size = xe_svm_range_size(range);
> > +
> > + if (!range->base.flags.migrate_devmem)
> > + return false;
> > +
> > + if (xe_svm_range_in_vram(range)) {
> > + drm_dbg(&vm->xe->drm, "Range is already in VRAM\n");
> > + return false;
> > + }
> > +
> > + if (range_size <= SZ_64K && !supports_4K_migration(vm->xe))
> > {
> > + drm_dbg(&vm->xe->drm, "Platform doesn't support
> > SZ_4K range migration\n");
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> >
> > /**
> > * xe_svm_handle_pagefault() - SVM handle page fault
> > @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > struct xe_vma *vma,
> > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > .check_pages_threshold = IS_DGFX(vm->xe) &&
> > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ?
> > SZ_64K : 0,
> > + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > };
> > struct xe_svm_range *range;
> > struct drm_gpusvm_range *r;
> > struct drm_exec exec;
> > struct dma_fence *fence;
> > struct xe_tile *tile = gt_to_tile(gt);
> > + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> > ktime_t end = 0;
> > int err;
> >
> > @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > struct xe_vma *vma,
> > return PTR_ERR(r);
> >
> > range = to_xe_range(r);
> > - if (xe_svm_range_is_valid(range, tile))
> > + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
>
> Requires notifier lock. Also see comment on re-allocating the range
> above.
>
Same as above.
> > return 0;
> >
> > range_debug(range, "PAGE FAULT");
> >
> > - /* XXX: Add migration policy, for now migrate range once */
> > - if (!range->skip_migrate && range->base.flags.migrate_devmem
> > &&
> > - xe_svm_range_size(range) >= SZ_64K) {
> > - range->skip_migrate = true;
> > -
> > + if (--migrate_try_count >= 0 &&
> > + xe_svm_range_needs_migrate_to_vram(range, vma)
>
> Requires notifier lock.
>
Same as above.
> Should we have some sort of timeout instead of a try-count? Perhaps as
> a last resort fall back to a 4K range?
>
I did have code like that at one point to reduce the range size, but
it is a bit complicated as we'd have to remove the range... I'd rather
stick with the retry loop for now and, if this becomes problematic,
circle back to reducing the size of the faulted range on each retry.
Matt
> /Thomas
>
>
>
> > ) {
> > err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> > if (err) {
> > - drm_dbg(&vm->xe->drm,
> > - "VRAM allocation failed, falling
> > back to "
> > - "retrying fault, asid=%u,
> > errno=%pe\n",
> > - vm->usm.asid, ERR_PTR(err));
> > - goto retry;
> > + if (migrate_try_count || !ctx.devmem_only) {
> > + drm_dbg(&vm->xe->drm,
> > + "VRAM allocation failed,
> > falling back to retrying fault, asid=%u, errno=%pe\n",
> > + vm->usm.asid, ERR_PTR(err));
> > + goto retry;
> > + } else {
> > + drm_err(&vm->xe->drm,
> > + "VRAM allocation failed,
> > retry count exceeded, asid=%u, errno=%pe\n",
> > + vm->usm.asid, ERR_PTR(err));
> > + return err;
> > + }
> > }
> > }
> >
> > @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > struct xe_vma *vma,
> > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
> > /* Corner where CPU mappings have changed */
> > if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
> > - if (err == -EOPNOTSUPP) {
> > - range_debug(range, "PAGE FAULT - EVICT
> > PAGES");
> > - drm_gpusvm_range_evict(&vm->svm.gpusvm,
> > &range->base);
> > + if (migrate_try_count > 0 || !ctx.devmem_only) {
> > + if (err == -EOPNOTSUPP) {
> > + range_debug(range, "PAGE FAULT -
> > EVICT PAGES");
> > + drm_gpusvm_range_evict(&vm-
> > >svm.gpusvm,
> > + &range-
> > >base);
> > + }
> > + drm_dbg(&vm->xe->drm,
> > + "Get pages failed, falling back to
> > retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> > + vm->usm.asid, &vm->svm.gpusvm,
> > ERR_PTR(err));
> > + range_debug(range, "PAGE FAULT - RETRY
> > PAGES");
> > + goto retry;
> > + } else {
> > + drm_err(&vm->xe->drm,
> > + "Get pages failed, retry count
> > exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> > + vm->usm.asid, &vm->svm.gpusvm,
> > ERR_PTR(err));
> > }
> > - drm_dbg(&vm->xe->drm,
> > - "Get pages failed, falling back to retrying,
> > asid=%u, gpusvm=%p, errno=%pe\n",
> > - vm->usm.asid, &vm->svm.gpusvm,
> > ERR_PTR(err));
> > - range_debug(range, "PAGE FAULT - RETRY PAGES");
> > - goto retry;
> > }
> > if (err) {
> > range_debug(range, "PAGE FAULT - FAIL PAGE
> > COLLECT");
> > @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > struct xe_vma *vma,
> > }
> > drm_exec_fini(&exec);
> >
> > - if (xe_modparam.always_migrate_to_vram)
> > - range->skip_migrate = false;
> > -
> > dma_fence_wait(fence, false);
> > dma_fence_put(fence);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_svm.h
> > b/drivers/gpu/drm/xe/xe_svm.h
> > index 3d441eb1f7ea..0e1f376a7471 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.h
> > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > @@ -39,11 +39,6 @@ struct xe_svm_range {
> > * range. Protected by GPU SVM notifier lock.
> > */
> > u8 tile_invalidated;
> > - /**
> > - * @skip_migrate: Skip migration to VRAM, protected by GPU
> > fault handler
> > - * locking.
> > - */
> > - u8 skip_migrate :1;
> > };
> >
> > /**
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-24 18:03 ` Matthew Brost
@ 2025-04-25 7:18 ` Thomas Hellström
2025-04-25 7:39 ` Matthew Brost
0 siblings, 1 reply; 17+ messages in thread
From: Thomas Hellström @ 2025-04-25 7:18 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, himal.prasad.ghimiray
On Thu, 2025-04-24 at 11:03 -0700, Matthew Brost wrote:
> On Thu, Apr 24, 2025 at 04:39:21PM +0200, Thomas Hellström wrote:
> > On Tue, 2025-04-22 at 10:04 -0700, Matthew Brost wrote:
> > > Mixing GPU and CPU atomics does not work unless a strict
> > > migration
> > > policy of GPU atomics must be device memory. Enforce a policy of
> > > must
> > > be
> > > in VRAM with a retry loop of 2 attempts, if retry loop fails
> > > abort
> > > fault.
> > >
> > > v2:
> > > - Only retry migration on atomics
> > > - Drop alway migrate modparam
> > > v3:
> > > - Only set vram_only on DGFX (Himal)
> > > - Bail on get_pages failure if vram_only and retry count
> > > exceeded
> > > (Himal)
> > > - s/vram_only/devmem_only
> > > - Update xe_svm_range_is_valid to accept devmem_only argument
> > > v4:
> > > - Fix logic bug get_pages failure
> > >
> > > Signed-off-by: Himal Prasad Ghimiray
> > > <himal.prasad.ghimiray@intel.com>
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_module.c | 3 --
> > > drivers/gpu/drm/xe/xe_module.h | 1 -
> > > drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++---
> > > ----
> > > --
> > > drivers/gpu/drm/xe/xe_svm.h | 5 --
> > > 4 files changed, 65 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_module.c
> > > b/drivers/gpu/drm/xe/xe_module.c
> > > index 05c7d0ae6d83..1c4dfafbcd0b 100644
> > > --- a/drivers/gpu/drm/xe/xe_module.c
> > > +++ b/drivers/gpu/drm/xe/xe_module.c
> > > @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> > > module_param_named(svm_notifier_size,
> > > xe_modparam.svm_notifier_size,
> > > uint, 0600);
> > > MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier
> > > size(in
> > > MiB), must be power of 2");
> > >
> > > -module_param_named(always_migrate_to_vram,
> > > xe_modparam.always_migrate_to_vram, bool, 0444);
> > > -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM
> > > on
> > > GPU fault");
> > > -
> > module_param_named_unsafe(force_execlist,
> > > xe_modparam.force_execlist, bool, 0444);
> > > MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_module.h
> > > b/drivers/gpu/drm/xe/xe_module.h
> > > index 84339e509c80..5a3bfea8b7b4 100644
> > > --- a/drivers/gpu/drm/xe/xe_module.h
> > > +++ b/drivers/gpu/drm/xe/xe_module.h
> > > @@ -12,7 +12,6 @@
> > > struct xe_modparam {
> > > bool force_execlist;
> > > bool probe_display;
> > > - bool always_migrate_to_vram;
> > > u32 force_vram_bar_size;
> > > int guc_log_level;
> > > char *guc_firmware_path;
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.c
> > > b/drivers/gpu/drm/xe/xe_svm.c
> > > index 890f6b2f40e9..f749ae367a8f 100644
> > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> > > }
> > >
> > > static bool xe_svm_range_is_valid(struct xe_svm_range *range,
> > > - struct xe_tile *tile)
> > > + struct xe_tile *tile,
> > > + bool devmem_only)
> > > {
> > > - return (range->tile_present & ~range->tile_invalidated)
> > > &
> > > BIT(tile->id);
> > > + return ((range->tile_present & ~range->tile_invalidated)
> > > &
> > > BIT(tile->id))
> > > + && (!devmem_only || range-
> > > > base.flags.migrate_devmem);
> > > }
> >
> > So let's say devmem_only is true here, and range-
> > > base.flags.migrate_devmem is false. Wouldn't that mean the range
> > > is
> > unusable and needs to be freed and re-allocated?
> >
>
> This is typo, this should be s/migrate_devmem/has_devmem_pages.
>
> This translates to:
>
> Either devmem_only is not required or we have devmem pages with a
> valid mapping.
>
> If migrate_devmem is false and devmem_only is true, that is a fatal
> error actually, we should have check for that and kill the fault. An
> example of this would be shared mapping which cannot be migrated to
> devmem.
>
> > Also another thing going back to older code, it seems like range-
> > > tile_invalidated is protected by the notifier lock, so shouldn't
> > > we
> > assert that to be held in the function? It seems not to be held
> > further
> > below:
>
> Yea techincally to get a stable value we'd need the notifier lock but
> this is an opportunistic check - at worst if we read a valid range we
> skip the page faults and will immediately get another page fault. So
> we
> could take the notifier lock here but I don't think this is strickly
> required. Let me know what you think here.
The problem with this is that the code gets harder to maintain and
understand. A new reader would probably first stumble over the lockless
read, then wonder why there are no memory barriers, and then wonder
what happens if the page fault was marked as resolved without actually
being resolved.
So IMO, if we do opportunistic tests to opt out of locking (which is
discouraged in the drm locking guidelines,
https://blog.ffwll.ch/2022/08/locking-hierarchy.html),
we should definitely add separate functions for that with extensive
docs and READ_ONCE() annotation.
But also consider whether this is really worth sacrificing readability
for, instead of simply relying on alloc_vram() and get_pages() exiting
early if everything looks OK?
>
> >
> > >
> > > #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > > @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm
> > > *vm,
> > > struct xe_tile *tile,
> > > }
> > > #endif
> > >
> > > +static bool supports_4K_migration(struct xe_device *xe)
> > > +{
> > > + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> > > + return false;
> > > +
> > > + return true;
> > > +}
> >
> > Do we have any hardware that supports pagefaults but not 4K VRAM
> > pages?
> >
>
> PVC
OK, I was under the impression that PVC actually supported 4K pages.
But perhaps there was a bug encountered while implementing that.
>
> > > +
> > > +static bool xe_svm_range_needs_migrate_to_vram(struct
> > > xe_svm_range
> > > *range,
> > > + struct xe_vma
> > > *vma)
> > > +{
> > > + struct xe_vm *vm = range_to_vm(&range->base);
> > > + u64 range_size = xe_svm_range_size(range);
> > > +
> > > + if (!range->base.flags.migrate_devmem)
> > > + return false;
> > > +
> > > + if (xe_svm_range_in_vram(range)) {
> > > + drm_dbg(&vm->xe->drm, "Range is already in
> > > VRAM\n");
> > > + return false;
> > > + }
> > > +
> > > + if (range_size <= SZ_64K && !supports_4K_migration(vm-
> > > >xe))
> > > {
> > > + drm_dbg(&vm->xe->drm, "Platform doesn't support
> > > SZ_4K range migration\n");
> > > + return false;
> > > + }
> > > +
> > > + return true;
> > > +}
> > >
> > > /**
> > > * xe_svm_handle_pagefault() - SVM handle page fault
> > > @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > *vm,
> > > struct xe_vma *vma,
> > > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > > .check_pages_threshold = IS_DGFX(vm->xe) &&
> > > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > > ?
> > > SZ_64K : 0,
> > > + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > > };
> > > struct xe_svm_range *range;
> > > struct drm_gpusvm_range *r;
> > > struct drm_exec exec;
> > > struct dma_fence *fence;
> > > struct xe_tile *tile = gt_to_tile(gt);
> > > + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> > > ktime_t end = 0;
> > > int err;
> > >
> > > @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > *vm,
> > > struct xe_vma *vma,
> > > return PTR_ERR(r);
> > >
> > > range = to_xe_range(r);
> > > - if (xe_svm_range_is_valid(range, tile))
> > > + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
> >
> > Requires notifier lock. Also see comment on re-allocating the range
> > above.
> >
>
> Same as above.
>
> > > return 0;
> > >
> > > range_debug(range, "PAGE FAULT");
> > >
> > > - /* XXX: Add migration policy, for now migrate range once
> > > */
> > > - if (!range->skip_migrate && range-
> > > >base.flags.migrate_devmem
> > > &&
> > > - xe_svm_range_size(range) >= SZ_64K) {
> > > - range->skip_migrate = true;
> > > -
> > > + if (--migrate_try_count >= 0 &&
> > > + xe_svm_range_needs_migrate_to_vram(range, vma)
> >
> > Requires notifier lock.
> >
>
> Same as above.
>
> > Should we have some sort of timeout instead of a try-count? Perhaps
> > as
> > a last resort fall back to a 4K range?
> >
>
> I did have code like that at one point to reduce range size but it is
> a
> bit complicated as we'd have to remove the range... I'd rather stick
> with the retry loop for now and if this becomes problematic, circle
> back
> to reducing the size of the fault page on each retry loop.
OK, makes sense.
/Thomas
>
> Matt
>
> > /Thomas
> >
> >
> >
> > > ) {
> > > err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> > > if (err) {
> > > - drm_dbg(&vm->xe->drm,
> > > - "VRAM allocation failed, falling
> > > back to "
> > > - "retrying fault, asid=%u,
> > > errno=%pe\n",
> > > - vm->usm.asid, ERR_PTR(err));
> > > - goto retry;
> > > + if (migrate_try_count ||
> > > !ctx.devmem_only) {
> > > + drm_dbg(&vm->xe->drm,
> > > + "VRAM allocation failed,
> > > falling back to retrying fault, asid=%u, errno=%pe\n",
> > > + vm->usm.asid,
> > > ERR_PTR(err));
> > > + goto retry;
> > > + } else {
> > > + drm_err(&vm->xe->drm,
> > > + "VRAM allocation failed,
> > > retry count exceeded, asid=%u, errno=%pe\n",
> > > + vm->usm.asid,
> > > ERR_PTR(err));
> > > + return err;
> > > + }
> > > }
> > > }
> > >
> > > @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > *vm,
> > > struct xe_vma *vma,
> > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r,
> > > &ctx);
> > > /* Corner where CPU mappings have changed */
> > > if (err == -EOPNOTSUPP || err == -EFAULT || err == -
> > > EPERM) {
> > > - if (err == -EOPNOTSUPP) {
> > > - range_debug(range, "PAGE FAULT - EVICT
> > > PAGES");
> > > - drm_gpusvm_range_evict(&vm->svm.gpusvm,
> > > &range->base);
> > > + if (migrate_try_count > 0 || !ctx.devmem_only) {
> > > + if (err == -EOPNOTSUPP) {
> > > + range_debug(range, "PAGE FAULT -
> > > EVICT PAGES");
> > > + drm_gpusvm_range_evict(&vm-
> > > > svm.gpusvm,
> > > + &range-
> > > > base);
> > > + }
> > > + drm_dbg(&vm->xe->drm,
> > > + "Get pages failed, falling back
> > > to
> > > retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> > > + vm->usm.asid, &vm->svm.gpusvm,
> > > ERR_PTR(err));
> > > + range_debug(range, "PAGE FAULT - RETRY
> > > PAGES");
> > > + goto retry;
> > > + } else {
> > > + drm_err(&vm->xe->drm,
> > > + "Get pages failed, retry count
> > > exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> > > + vm->usm.asid, &vm->svm.gpusvm,
> > > ERR_PTR(err));
> > > }
> > > - drm_dbg(&vm->xe->drm,
> > > - "Get pages failed, falling back to
> > > retrying,
> > > asid=%u, gpusvm=%p, errno=%pe\n",
> > > - vm->usm.asid, &vm->svm.gpusvm,
> > > ERR_PTR(err));
> > > - range_debug(range, "PAGE FAULT - RETRY PAGES");
> > > - goto retry;
> > > }
> > > if (err) {
> > > range_debug(range, "PAGE FAULT - FAIL PAGE
> > > COLLECT");
> > > @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > > struct xe_vma *vma,
> > > }
> > > drm_exec_fini(&exec);
> > >
> > > - if (xe_modparam.always_migrate_to_vram)
> > > - range->skip_migrate = false;
> > > -
> > > dma_fence_wait(fence, false);
> > > dma_fence_put(fence);
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.h
> > > b/drivers/gpu/drm/xe/xe_svm.h
> > > index 3d441eb1f7ea..0e1f376a7471 100644
> > > --- a/drivers/gpu/drm/xe/xe_svm.h
> > > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > > @@ -39,11 +39,6 @@ struct xe_svm_range {
> > > * range. Protected by GPU SVM notifier lock.
> > > */
> > > u8 tile_invalidated;
> > > - /**
> > > - * @skip_migrate: Skip migration to VRAM, protected by
> > > GPU
> > > fault handler
> > > - * locking.
> > > - */
> > > - u8 skip_migrate :1;
> > > };
> > >
> > > /**
> >
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-25 7:18 ` Thomas Hellström
@ 2025-04-25 7:39 ` Matthew Brost
2025-04-25 9:10 ` Thomas Hellström
0 siblings, 1 reply; 17+ messages in thread
From: Matthew Brost @ 2025-04-25 7:39 UTC (permalink / raw)
To: Thomas Hellström; +Cc: intel-xe, dri-devel, himal.prasad.ghimiray
On Fri, Apr 25, 2025 at 09:18:19AM +0200, Thomas Hellström wrote:
> On Thu, 2025-04-24 at 11:03 -0700, Matthew Brost wrote:
> > On Thu, Apr 24, 2025 at 04:39:21PM +0200, Thomas Hellström wrote:
> > > On Tue, 2025-04-22 at 10:04 -0700, Matthew Brost wrote:
> > > > Mixing GPU and CPU atomics does not work unless a strict
> > > > migration
> > > > policy of GPU atomics must be device memory. Enforce a policy of
> > > > must
> > > > be
> > > > in VRAM with a retry loop of 2 attempts, if retry loop fails
> > > > abort
> > > > fault.
> > > >
> > > > v2:
> > > > - Only retry migration on atomics
> > > > - Drop alway migrate modparam
> > > > v3:
> > > > - Only set vram_only on DGFX (Himal)
> > > > - Bail on get_pages failure if vram_only and retry count
> > > > exceeded
> > > > (Himal)
> > > > - s/vram_only/devmem_only
> > > > - Update xe_svm_range_is_valid to accept devmem_only argument
> > > > v4:
> > > > - Fix logic bug get_pages failure
> > > >
> > > > Signed-off-by: Himal Prasad Ghimiray
> > > > <himal.prasad.ghimiray@intel.com>
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_module.c | 3 --
> > > > drivers/gpu/drm/xe/xe_module.h | 1 -
> > > > drivers/gpu/drm/xe/xe_svm.c | 89 +++++++++++++++++++++++++---
> > > > ----
> > > > --
> > > > drivers/gpu/drm/xe/xe_svm.h | 5 --
> > > > 4 files changed, 65 insertions(+), 33 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_module.c
> > > > b/drivers/gpu/drm/xe/xe_module.c
> > > > index 05c7d0ae6d83..1c4dfafbcd0b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_module.c
> > > > +++ b/drivers/gpu/drm/xe/xe_module.c
> > > > @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> > > > module_param_named(svm_notifier_size,
> > > > xe_modparam.svm_notifier_size,
> > > > uint, 0600);
> > > > MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier
> > > > size(in
> > > > MiB), must be power of 2");
> > > >
> > > > -module_param_named(always_migrate_to_vram,
> > > > xe_modparam.always_migrate_to_vram, bool, 0444);
> > > > -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM
> > > > on
> > > > GPU fault");
> > > > -
> > > module_param_named_unsafe(force_execlist,
> > > > xe_modparam.force_execlist, bool, 0444);
> > > > MODULE_PARM_DESC(force_execlist, "Force Execlist submission");
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_module.h
> > > > b/drivers/gpu/drm/xe/xe_module.h
> > > > index 84339e509c80..5a3bfea8b7b4 100644
> > > > --- a/drivers/gpu/drm/xe/xe_module.h
> > > > +++ b/drivers/gpu/drm/xe/xe_module.h
> > > > @@ -12,7 +12,6 @@
> > > > struct xe_modparam {
> > > > bool force_execlist;
> > > > bool probe_display;
> > > > - bool always_migrate_to_vram;
> > > > u32 force_vram_bar_size;
> > > > int guc_log_level;
> > > > char *guc_firmware_path;
> > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c
> > > > b/drivers/gpu/drm/xe/xe_svm.c
> > > > index 890f6b2f40e9..f749ae367a8f 100644
> > > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > > @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> > > > }
> > > >
> > > > static bool xe_svm_range_is_valid(struct xe_svm_range *range,
> > > > - struct xe_tile *tile)
> > > > + struct xe_tile *tile,
> > > > + bool devmem_only)
> > > > {
> > > > - return (range->tile_present & ~range->tile_invalidated)
> > > > &
> > > > BIT(tile->id);
> > > > + return ((range->tile_present & ~range->tile_invalidated)
> > > > &
> > > > BIT(tile->id))
> > > > + && (!devmem_only || range-
> > > > > base.flags.migrate_devmem);
> > > > }
> > >
> > > So let's say devmem_only is true here, and range-
> > > > base.flags.migrate_devmem is false. Wouldn't that mean the range
> > > > is
> > > unusable and needs to be freed and re-allocated?
> > >
> >
> > This is typo, this should be s/migrate_devmem/has_devmem_pages.
> >
> > This translates to:
> >
> > Either devmem_only is not required or we have devmem pages with a
> > valid mapping.
> >
> > If migrate_devmem is false and devmem_only is true, that is a fatal
> > error actually, we should have check for that and kill the fault. An
> > example of this would be shared mapping which cannot be migrated to
> > devmem.
> >
> > > Also another thing going back to older code, it seems like range-
> > > > tile_invalidated is protected by the notifier lock, so shouldn't
> > > > we
> > > assert that to be held in the function? It seems not to be held
> > > further
> > > below:
> >
> > Yea techincally to get a stable value we'd need the notifier lock but
> > this is an opportunistic check - at worst if we read a valid range we
> > skip the page faults and will immediately get another page fault. So
> > we
> > could take the notifier lock here but I don't think this is strickly
> > required. Let me know what you think here.
>
> The problem with this is that the code gets harder to maintain and
Agree.
> understand. A new reader would probably first react over the lockless
> read, and then why there are no memory barriers and then what happens
> if the page-fault was marked as resolved without actually resolving it.
>
> So IMO if we do opportunistic tests to opt out of locking (which is
> discouraged in the drm locking guidelines
> https://blog.ffwll.ch/2022/08/locking-hierarchy.html)
> we should definitely add separate functions for that with extensive
> docs and READ_ONCE() annotation.
>
A lock here doesn't actually gain us anything, though, as the state
can immediately change after the lock is released, triggering another
fault. If you agree, I'll go with READ_ONCE() and add comments in the
code indicating it's an opportunistic check.
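For example, at the call site, something like (sketch only):

	range = to_xe_range(r);

	/*
	 * Opportunistic, lockless check (notifier lock not held): a
	 * stale read is harmless - at worst we treat the range as valid,
	 * skip servicing this fault and immediately take another one. We
	 * never wrongly mark an unresolved fault as resolved.
	 */
	if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
		return 0;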
> But also think if this is really worth sacrificing readability instead
> of actually relying on alloc_vram() and get_pages() exiting early if
> everything looks ok?
>
alloc_vram() as it stands is very expensive; get_pages() less so, but
it still costs CPU cycles.
The idea here is to short-circuit "page fault storms", where many EUs
access the same page simultaneously. If I recall correctly, this was a
significant issue on PVC - so much so that we are considering firmware
and hardware changes going forward. We should try to mitigate these
conditions in the page fault handler, if possible.
Matt
> >
> > >
> > > >
> > > > #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > > > @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct xe_vm
> > > > *vm,
> > > > struct xe_tile *tile,
> > > > }
> > > > #endif
> > > >
> > > > +static bool supports_4K_migration(struct xe_device *xe)
> > > > +{
> > > > + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> > > > + return false;
> > > > +
> > > > + return true;
> > > > +}
> > >
> > > Do we have any hardware that supports pagefaults but not 4K VRAM
> > > pages?
> > >
> >
> > PVC
>
> OK, I was under the impression that PVC actually supported 4K pages.
> But perhaps there was a bug encountered while implementing that.
>
>
> >
> > > > +
> > > > +static bool xe_svm_range_needs_migrate_to_vram(struct
> > > > xe_svm_range
> > > > *range,
> > > > + struct xe_vma
> > > > *vma)
> > > > +{
> > > > + struct xe_vm *vm = range_to_vm(&range->base);
> > > > + u64 range_size = xe_svm_range_size(range);
> > > > +
> > > > + if (!range->base.flags.migrate_devmem)
> > > > + return false;
> > > > +
> > > > + if (xe_svm_range_in_vram(range)) {
> > > > + drm_dbg(&vm->xe->drm, "Range is already in
> > > > VRAM\n");
> > > > + return false;
> > > > + }
> > > > +
> > > > + if (range_size <= SZ_64K && !supports_4K_migration(vm-
> > > > >xe))
> > > > {
> > > > + drm_dbg(&vm->xe->drm, "Platform doesn't support
> > > > SZ_4K range migration\n");
> > > > + return false;
> > > > + }
> > > > +
> > > > + return true;
> > > > +}
> > > >
> > > > /**
> > > > * xe_svm_handle_pagefault() - SVM handle page fault
> > > > @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > > *vm,
> > > > struct xe_vma *vma,
> > > > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > > > .check_pages_threshold = IS_DGFX(vm->xe) &&
> > > > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > > > ?
> > > > SZ_64K : 0,
> > > > + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > > > };
> > > > struct xe_svm_range *range;
> > > > struct drm_gpusvm_range *r;
> > > > struct drm_exec exec;
> > > > struct dma_fence *fence;
> > > > struct xe_tile *tile = gt_to_tile(gt);
> > > > + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> > > > ktime_t end = 0;
> > > > int err;
> > > >
> > > > @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > > *vm,
> > > > struct xe_vma *vma,
> > > > return PTR_ERR(r);
> > > >
> > > > range = to_xe_range(r);
> > > > - if (xe_svm_range_is_valid(range, tile))
> > > > + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
> > >
> > > Requires notifier lock. Also see comment on re-allocating the range
> > > above.
> > >
> >
> > Same as above.
> >
> > > > return 0;
> > > >
> > > > range_debug(range, "PAGE FAULT");
> > > >
> > > > - /* XXX: Add migration policy, for now migrate range once
> > > > */
> > > > - if (!range->skip_migrate && range-
> > > > >base.flags.migrate_devmem
> > > > &&
> > > > - xe_svm_range_size(range) >= SZ_64K) {
> > > > - range->skip_migrate = true;
> > > > -
> > > > + if (--migrate_try_count >= 0 &&
> > > > + xe_svm_range_needs_migrate_to_vram(range, vma)
> > >
> > > Requires notifier lock.
> > >
> >
> > Same as above.
> >
> > > Should we have some sort of timeout instead of a try-count? Perhaps
> > > as
> > > a last resort fall back to a 4K range?
> > >
> >
> > I did have code like that at one point to reduce range size but it is
> > a
> > bit complicated as we'd have to remove the range... I'd rather stick
> > with the retry loop for now and if this becomes problematic, circle
> > back
> > to reducing the size of the fault page on each retry loop.
>
> OK, makes sense.
>
> /Thomas
>
>
> >
> > Matt
> >
> > > /Thomas
> > >
> > >
> > >
> > > > ) {
> > > > err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> > > > if (err) {
> > > > - drm_dbg(&vm->xe->drm,
> > > > - "VRAM allocation failed, falling
> > > > back to "
> > > > - "retrying fault, asid=%u,
> > > > errno=%pe\n",
> > > > - vm->usm.asid, ERR_PTR(err));
> > > > - goto retry;
> > > > + if (migrate_try_count ||
> > > > !ctx.devmem_only) {
> > > > + drm_dbg(&vm->xe->drm,
> > > > + "VRAM allocation failed,
> > > > falling back to retrying fault, asid=%u, errno=%pe\n",
> > > > + vm->usm.asid,
> > > > ERR_PTR(err));
> > > > + goto retry;
> > > > + } else {
> > > > + drm_err(&vm->xe->drm,
> > > > + "VRAM allocation failed,
> > > > retry count exceeded, asid=%u, errno=%pe\n",
> > > > + vm->usm.asid,
> > > > ERR_PTR(err));
> > > > + return err;
> > > > + }
> > > > }
> > > > }
> > > >
> > > > @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > > *vm,
> > > > struct xe_vma *vma,
> > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r,
> > > > &ctx);
> > > > /* Corner where CPU mappings have changed */
> > > > if (err == -EOPNOTSUPP || err == -EFAULT || err == -
> > > > EPERM) {
> > > > - if (err == -EOPNOTSUPP) {
> > > > - range_debug(range, "PAGE FAULT - EVICT
> > > > PAGES");
> > > > - drm_gpusvm_range_evict(&vm->svm.gpusvm,
> > > > &range->base);
> > > > + if (migrate_try_count > 0 || !ctx.devmem_only) {
> > > > + if (err == -EOPNOTSUPP) {
> > > > + range_debug(range, "PAGE FAULT -
> > > > EVICT PAGES");
> > > > + drm_gpusvm_range_evict(&vm-
> > > > > svm.gpusvm,
> > > > + &range-
> > > > > base);
> > > > + }
> > > > + drm_dbg(&vm->xe->drm,
> > > > + "Get pages failed, falling back
> > > > to
> > > > retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> > > > + vm->usm.asid, &vm->svm.gpusvm,
> > > > ERR_PTR(err));
> > > > + range_debug(range, "PAGE FAULT - RETRY
> > > > PAGES");
> > > > + goto retry;
> > > > + } else {
> > > > + drm_err(&vm->xe->drm,
> > > > + "Get pages failed, retry count
> > > > exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> > > > + vm->usm.asid, &vm->svm.gpusvm,
> > > > ERR_PTR(err));
> > > > }
> > > > - drm_dbg(&vm->xe->drm,
> > > > - "Get pages failed, falling back to
> > > > retrying,
> > > > asid=%u, gpusvm=%p, errno=%pe\n",
> > > > - vm->usm.asid, &vm->svm.gpusvm,
> > > > ERR_PTR(err));
> > > > - range_debug(range, "PAGE FAULT - RETRY PAGES");
> > > > - goto retry;
> > > > }
> > > > if (err) {
> > > > range_debug(range, "PAGE FAULT - FAIL PAGE
> > > > COLLECT");
> > > > @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > > > struct xe_vma *vma,
> > > > }
> > > > drm_exec_fini(&exec);
> > > >
> > > > - if (xe_modparam.always_migrate_to_vram)
> > > > - range->skip_migrate = false;
> > > > -
> > > > dma_fence_wait(fence, false);
> > > > dma_fence_put(fence);
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h
> > > > b/drivers/gpu/drm/xe/xe_svm.h
> > > > index 3d441eb1f7ea..0e1f376a7471 100644
> > > > --- a/drivers/gpu/drm/xe/xe_svm.h
> > > > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > > > @@ -39,11 +39,6 @@ struct xe_svm_range {
> > > > * range. Protected by GPU SVM notifier lock.
> > > > */
> > > > u8 tile_invalidated;
> > > > - /**
> > > > - * @skip_migrate: Skip migration to VRAM, protected by
> > > > GPU
> > > > fault handler
> > > > - * locking.
> > > > - */
> > > > - u8 skip_migrate :1;
> > > > };
> > > >
> > > > /**
> > >
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults
2025-04-25 7:39 ` Matthew Brost
@ 2025-04-25 9:10 ` Thomas Hellström
0 siblings, 0 replies; 17+ messages in thread
From: Thomas Hellström @ 2025-04-25 9:10 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe, dri-devel, himal.prasad.ghimiray
On Fri, 2025-04-25 at 00:39 -0700, Matthew Brost wrote:
> On Fri, Apr 25, 2025 at 09:18:19AM +0200, Thomas Hellström wrote:
> > On Thu, 2025-04-24 at 11:03 -0700, Matthew Brost wrote:
> > > On Thu, Apr 24, 2025 at 04:39:21PM +0200, Thomas Hellström wrote:
> > > > On Tue, 2025-04-22 at 10:04 -0700, Matthew Brost wrote:
> > > > > Mixing GPU and CPU atomics does not work unless a strict
> > > > > migration
> > > > > policy of GPU atomics must be device memory. Enforce a policy
> > > > > of
> > > > > must
> > > > > be
> > > > > in VRAM with a retry loop of 2 attempts, if retry loop fails
> > > > > abort
> > > > > fault.
> > > > >
> > > > > v2:
> > > > > - Only retry migration on atomics
> > > > > - Drop alway migrate modparam
> > > > > v3:
> > > > > - Only set vram_only on DGFX (Himal)
> > > > > - Bail on get_pages failure if vram_only and retry count
> > > > > exceeded
> > > > > (Himal)
> > > > > - s/vram_only/devmem_only
> > > > > - Update xe_svm_range_is_valid to accept devmem_only
> > > > > argument
> > > > > v4:
> > > > > - Fix logic bug get_pages failure
> > > > >
> > > > > Signed-off-by: Himal Prasad Ghimiray
> > > > > <himal.prasad.ghimiray@intel.com>
> > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_module.c | 3 --
> > > > > drivers/gpu/drm/xe/xe_module.h | 1 -
> > > > > drivers/gpu/drm/xe/xe_svm.c | 89
> > > > > +++++++++++++++++++++++++---
> > > > > ----
> > > > > --
> > > > > drivers/gpu/drm/xe/xe_svm.h | 5 --
> > > > > 4 files changed, 65 insertions(+), 33 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_module.c
> > > > > b/drivers/gpu/drm/xe/xe_module.c
> > > > > index 05c7d0ae6d83..1c4dfafbcd0b 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_module.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_module.c
> > > > > @@ -33,9 +33,6 @@ struct xe_modparam xe_modparam = {
> > > > > module_param_named(svm_notifier_size,
> > > > > xe_modparam.svm_notifier_size,
> > > > > uint, 0600);
> > > > > MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier
> > > > > size(in
> > > > > MiB), must be power of 2");
> > > > >
> > > > > -module_param_named(always_migrate_to_vram,
> > > > > xe_modparam.always_migrate_to_vram, bool, 0444);
> > > > > -MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to
> > > > > VRAM
> > > > > on
> > > > > GPU fault");
> > > > > -
> > > > module_param_named_unsafe(force_execlist,
> > > > > xe_modparam.force_execlist, bool, 0444);
> > > > > MODULE_PARM_DESC(force_execlist, "Force Execlist
> > > > > submission");
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_module.h
> > > > > b/drivers/gpu/drm/xe/xe_module.h
> > > > > index 84339e509c80..5a3bfea8b7b4 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_module.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_module.h
> > > > > @@ -12,7 +12,6 @@
> > > > > struct xe_modparam {
> > > > > bool force_execlist;
> > > > > bool probe_display;
> > > > > - bool always_migrate_to_vram;
> > > > > u32 force_vram_bar_size;
> > > > > int guc_log_level;
> > > > > char *guc_firmware_path;
> > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c
> > > > > b/drivers/gpu/drm/xe/xe_svm.c
> > > > > index 890f6b2f40e9..f749ae367a8f 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > > > @@ -650,9 +650,11 @@ void xe_svm_fini(struct xe_vm *vm)
> > > > > }
> > > > >
> > > > > static bool xe_svm_range_is_valid(struct xe_svm_range
> > > > > *range,
> > > > > - struct xe_tile *tile)
> > > > > + struct xe_tile *tile,
> > > > > + bool devmem_only)
> > > > > {
> > > > > - return (range->tile_present & ~range-
> > > > > >tile_invalidated)
> > > > > &
> > > > > BIT(tile->id);
> > > > > + return ((range->tile_present & ~range-
> > > > > >tile_invalidated)
> > > > > &
> > > > > BIT(tile->id))
> > > > > + && (!devmem_only || range-
> > > > > > base.flags.migrate_devmem);
> > > > > }
> > > >
> > > > So let's say devmem_only is true here, and range-
> > > > > base.flags.migrate_devmem is false. Wouldn't that mean the
> > > > > range
> > > > > is
> > > > unusable and needs to be freed and re-allocated?
> > > >
> > >
> > > This is typo, this should be s/migrate_devmem/has_devmem_pages.
> > >
> > > This translates to:
> > >
> > > Either devmem_only is not required or we have devmem pages with a
> > > valid mapping.
> > >
> > > If migrate_devmem is false and devmem_only is true, that is a
> > > fatal
> > > error actually, we should have check for that and kill the fault.
> > > An
> > > example of this would be shared mapping which cannot be migrated
> > > to
> > > devmem.
> > >
> > > > Also another thing going back to older code, it seems like
> > > > range-
> > > > > tile_invalidated is protected by the notifier lock, so
> > > > > shouldn't
> > > > > we
> > > > assert that to be held in the function? It seems not to be held
> > > > further
> > > > below:
> > >
> > > Yea techincally to get a stable value we'd need the notifier lock
> > > but
> > > this is an opportunistic check - at worst if we read a valid
> > > range we
> > > skip the page faults and will immediately get another page fault.
> > > So
> > > we
> > > could take the notifier lock here but I don't think this is
> > > strickly
> > > required. Let me know what you think here.
> >
> > The problem with this is that the code gets harder to maintain and
>
> Agree.
>
> > understand. A new reader would probably first react over the
> > lockless
> > read, and then why there are no memory barriers and then what
> > happens
> > if the page-fault was marked as resolved without actually resolving
> > it.
> >
> > So IMO if we do opportunistic tests to opt out of locking (which is
> > discouraged in the drm locking guidelines
> > https://blog.ffwll.ch/2022/08/locking-hierarchy.html)
> > we should definitely add separate functions for that with extensive
> > docs and READ_ONCE() annotation.
> >
>
> A lock here doesn't actually gain us anything, though, as the state
> can
> immediately change after lock release triggering another fault. If
> you
> agree, I'll go with READ_ONCE and add comments in the code indicating
> it's an opportunistic check.
Yeah. Ideally I think we'd have a well-documented
xe_svm_range_check_valid() (or perhaps a better name) and an
xe_svm_range_is_valid() with an assert, in the spirit of
mmu_interval_check_retry() (lockless, documented) and
mmu_interval_read_retry() (requires lock).
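Roughly like this (sketch only; assuming the notifier lock is
reachable as range->base.gpusvm->notifier_lock, and leaving the
devmem_only handling out for brevity):

/* Lockless, documented - in the spirit of mmu_interval_check_retry(). */
static bool xe_svm_range_check_valid(struct xe_svm_range *range,
				     struct xe_tile *tile)
{
	/*
	 * Opportunistic peek without the notifier lock. A stale value can
	 * only cause a spurious extra fault, never a missed invalidation.
	 */
	return (READ_ONCE(range->tile_present) &
		~READ_ONCE(range->tile_invalidated)) & BIT(tile->id);
}

/* Requires the notifier lock - in the spirit of mmu_interval_read_retry(). */
static bool xe_svm_range_is_valid(struct xe_svm_range *range,
				  struct xe_tile *tile)
{
	lockdep_assert_held(&range->base.gpusvm->notifier_lock);

	return (range->tile_present & ~range->tile_invalidated) &
		BIT(tile->id);
}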
>
> > But also think if this is really worth sacrificing readability
> > instead
> > of actually relying on alloc_vram() and get_pages() exiting early
> > if
> > everything looks ok?
> >
>
> alloc_vram() as is very expensive, get_pages() less but still CPU
> cycles.
>
> The idea here is to short-circuit "page fault storms," where many EUs
> access the same page simultaneously. If I recall correctly, this was
> a
> significant issue on PVC—so much so that we are considering firmware
> and
> hardware changes going forward. We should try to mitigate these
> conditions in the page fault handler, if possible.
Yes, I think that's a perfectly legit case.
Perhaps in the future we could even short-circuit pf storms in the G2H
handler?
/Thomas
>
> Matt
>
> > >
> > > >
> > > > >
> > > > > #if IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR)
> > > > > @@ -726,6 +728,35 @@ static int xe_svm_alloc_vram(struct
> > > > > xe_vm
> > > > > *vm,
> > > > > struct xe_tile *tile,
> > > > > }
> > > > > #endif
> > > > >
> > > > > +static bool supports_4K_migration(struct xe_device *xe)
> > > > > +{
> > > > > + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
> > > > > + return false;
> > > > > +
> > > > > + return true;
> > > > > +}
> > > >
> > > > Do we have any hardware that supports pagefaults but not 4K
> > > > VRAM
> > > > pages?
> > > >
> > >
> > > PVC
> >
> > OK, I was under the impression that PVC actually supported 4K
> > pages.
> > But perhaps there was a bug encountered while implementing that.
> >
> >
> > >
> > > > > +
> > > > > +static bool xe_svm_range_needs_migrate_to_vram(struct
> > > > > xe_svm_range
> > > > > *range,
> > > > > + struct xe_vma
> > > > > *vma)
> > > > > +{
> > > > > + struct xe_vm *vm = range_to_vm(&range->base);
> > > > > + u64 range_size = xe_svm_range_size(range);
> > > > > +
> > > > > + if (!range->base.flags.migrate_devmem)
> > > > > + return false;
> > > > > +
> > > > > + if (xe_svm_range_in_vram(range)) {
> > > > > + drm_dbg(&vm->xe->drm, "Range is already in
> > > > > VRAM\n");
> > > > > + return false;
> > > > > + }
> > > > > +
> > > > > + if (range_size <= SZ_64K &&
> > > > > !supports_4K_migration(vm-
> > > > > > xe))
> > > > > {
> > > > > + drm_dbg(&vm->xe->drm, "Platform doesn't
> > > > > support
> > > > > SZ_4K range migration\n");
> > > > > + return false;
> > > > > + }
> > > > > +
> > > > > + return true;
> > > > > +}
> > > > >
> > > > > /**
> > > > > * xe_svm_handle_pagefault() - SVM handle page fault
> > > > > @@ -750,12 +781,15 @@ int xe_svm_handle_pagefault(struct
> > > > > xe_vm
> > > > > *vm,
> > > > > struct xe_vma *vma,
> > > > > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR
> > > > > OR),
> > > > > .check_pages_threshold = IS_DGFX(vm->xe) &&
> > > > > IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR
> > > > > OR)
> > > > > ?
> > > > > SZ_64K : 0,
> > > > > + .devmem_only = atomic && IS_DGFX(vm->xe) &&
> > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRR
> > > > > OR),
> > > > > };
> > > > > struct xe_svm_range *range;
> > > > > struct drm_gpusvm_range *r;
> > > > > struct drm_exec exec;
> > > > > struct dma_fence *fence;
> > > > > struct xe_tile *tile = gt_to_tile(gt);
> > > > > + int migrate_try_count = ctx.devmem_only ? 3 : 1;
> > > > > ktime_t end = 0;
> > > > > int err;
> > > > >
> > > > > @@ -777,23 +811,26 @@ int xe_svm_handle_pagefault(struct
> > > > > xe_vm
> > > > > *vm,
> > > > > struct xe_vma *vma,
> > > > > return PTR_ERR(r);
> > > > >
> > > > > range = to_xe_range(r);
> > > > > - if (xe_svm_range_is_valid(range, tile))
> > > > > + if (xe_svm_range_is_valid(range, tile,
> > > > > ctx.devmem_only))
> > > >
> > > > Requires notifier lock. Also see comment on re-allocating the
> > > > range
> > > > above.
> > > >
> > >
> > > Same as above.
> > >
> > > > > return 0;
> > > > >
> > > > > range_debug(range, "PAGE FAULT");
> > > > >
> > > > > - /* XXX: Add migration policy, for now migrate range
> > > > > once
> > > > > */
> > > > > - if (!range->skip_migrate && range-
> > > > > > base.flags.migrate_devmem
> > > > > &&
> > > > > - xe_svm_range_size(range) >= SZ_64K) {
> > > > > - range->skip_migrate = true;
> > > > > -
> > > > > + if (--migrate_try_count >= 0 &&
> > > > > + xe_svm_range_needs_migrate_to_vram(range, vma)
> > > >
> > > > Requires notifier lock.
> > > >
> > >
> > > Same as above.
> > >
> > > > Should we have some sort of timeout instead of a try-count?
> > > > Perhaps
> > > > as
> > > > a last resort fall back to a 4K range?
> > > >
> > >
> > > I did have code like that at one point to reduce range size but
> > > it is
> > > a
> > > bit complicated as we'd have to remove the range... I'd rather
> > > stick
> > > with the retry loop for now and if this becomes problematic,
> > > circle
> > > back
> > > to reducing the size of the fault page on each retry loop.
> >
> > OK, makes sense.
> >
> > /Thomas
> >
> >
> > >
> > > Matt
> > >
> > > > /Thomas
> > > >
> > > >
> > > >
> > > > > ) {
> > > > > err = xe_svm_alloc_vram(vm, tile, range,
> > > > > &ctx);
> > > > > if (err) {
> > > > > - drm_dbg(&vm->xe->drm,
> > > > > - "VRAM allocation failed,
> > > > > falling
> > > > > back to "
> > > > > - "retrying fault, asid=%u,
> > > > > errno=%pe\n",
> > > > > - vm->usm.asid, ERR_PTR(err));
> > > > > - goto retry;
> > > > > + if (migrate_try_count ||
> > > > > !ctx.devmem_only) {
> > > > > + drm_dbg(&vm->xe->drm,
> > > > > + "VRAM allocation
> > > > > failed,
> > > > > falling back to retrying fault, asid=%u, errno=%pe\n",
> > > > > + vm->usm.asid,
> > > > > ERR_PTR(err));
> > > > > + goto retry;
> > > > > + } else {
> > > > > + drm_err(&vm->xe->drm,
> > > > > + "VRAM allocation
> > > > > failed,
> > > > > retry count exceeded, asid=%u, errno=%pe\n",
> > > > > + vm->usm.asid,
> > > > > ERR_PTR(err));
> > > > > + return err;
> > > > > + }
> > > > > }
> > > > > }
> > > > >
> > > > > @@ -801,15 +838,22 @@ int xe_svm_handle_pagefault(struct
> > > > > xe_vm
> > > > > *vm,
> > > > > struct xe_vma *vma,
> > > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r,
> > > > > &ctx);
> > > > > /* Corner where CPU mappings have changed */
> > > > > if (err == -EOPNOTSUPP || err == -EFAULT || err == -
> > > > > EPERM) {
> > > > > - if (err == -EOPNOTSUPP) {
> > > > > - range_debug(range, "PAGE FAULT -
> > > > > EVICT
> > > > > PAGES");
> > > > > - drm_gpusvm_range_evict(&vm-
> > > > > >svm.gpusvm,
> > > > > &range->base);
> > > > > + if (migrate_try_count > 0 ||
> > > > > !ctx.devmem_only) {
> > > > > + if (err == -EOPNOTSUPP) {
> > > > > + range_debug(range, "PAGE
> > > > > FAULT -
> > > > > EVICT PAGES");
> > > > > + drm_gpusvm_range_evict(&vm-
> > > > > > svm.gpusvm,
> > > > > +
> > > > > &range-
> > > > > > base);
> > > > > + }
> > > > > + drm_dbg(&vm->xe->drm,
> > > > > + "Get pages failed, falling
> > > > > back
> > > > > to
> > > > > retrying, asid=%u, gpusvm=%p, errno=%pe\n",
> > > > > + vm->usm.asid, &vm-
> > > > > >svm.gpusvm,
> > > > > ERR_PTR(err));
> > > > > + range_debug(range, "PAGE FAULT -
> > > > > RETRY
> > > > > PAGES");
> > > > > + goto retry;
> > > > > + } else {
> > > > > + drm_err(&vm->xe->drm,
> > > > > + "Get pages failed, retry
> > > > > count
> > > > > exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
> > > > > + vm->usm.asid, &vm-
> > > > > >svm.gpusvm,
> > > > > ERR_PTR(err));
> > > > > }
> > > > > - drm_dbg(&vm->xe->drm,
> > > > > - "Get pages failed, falling back to
> > > > > retrying,
> > > > > asid=%u, gpusvm=%p, errno=%pe\n",
> > > > > - vm->usm.asid, &vm->svm.gpusvm,
> > > > > ERR_PTR(err));
> > > > > - range_debug(range, "PAGE FAULT - RETRY
> > > > > PAGES");
> > > > > - goto retry;
> > > > > }
> > > > > if (err) {
> > > > > range_debug(range, "PAGE FAULT - FAIL PAGE
> > > > > COLLECT");
> > > > > @@ -843,9 +887,6 @@ int xe_svm_handle_pagefault(struct xe_vm
> > > > > *vm,
> > > > > struct xe_vma *vma,
> > > > > }
> > > > > drm_exec_fini(&exec);
> > > > >
> > > > > - if (xe_modparam.always_migrate_to_vram)
> > > > > - range->skip_migrate = false;
> > > > > -
> > > > > dma_fence_wait(fence, false);
> > > > > dma_fence_put(fence);
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h
> > > > > b/drivers/gpu/drm/xe/xe_svm.h
> > > > > index 3d441eb1f7ea..0e1f376a7471 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_svm.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > > > > @@ -39,11 +39,6 @@ struct xe_svm_range {
> > > > > * range. Protected by GPU SVM notifier lock.
> > > > > */
> > > > > u8 tile_invalidated;
> > > > > - /**
> > > > > - * @skip_migrate: Skip migration to VRAM, protected
> > > > > by
> > > > > GPU
> > > > > fault handler
> > > > > - * locking.
> > > > > - */
> > > > > - u8 skip_migrate :1;
> > > > > };
> > > > >
> > > > > /**
> > > >
> >
^ permalink raw reply [flat|nested] 17+ messages in thread
Thread overview: 17+ messages
2025-04-22 17:04 [PATCH v4 0/5] Enable SVM atomics in Xe / GPU SVM Matthew Brost
2025-04-22 17:04 ` [PATCH v4 1/5] drm/gpusvm: Introduce devmem_only flag for allocation Matthew Brost
2025-04-23 16:14 ` Matthew Brost
2025-04-22 17:04 ` [PATCH v4 2/5] drm/xe: Strict migration policy for atomic SVM faults Matthew Brost
2025-04-22 17:21 ` Ghimiray, Himal Prasad
2025-04-23 17:29 ` Matthew Brost
2025-04-24 14:39 ` Thomas Hellström
2025-04-24 18:03 ` Matthew Brost
2025-04-25 7:18 ` Thomas Hellström
2025-04-25 7:39 ` Matthew Brost
2025-04-25 9:10 ` Thomas Hellström
2025-04-22 17:04 ` [PATCH v4 3/5] drm/gpusvm: Add timeslicing support to GPU SVM Matthew Brost
2025-04-23 5:36 ` Ghimiray, Himal Prasad
2025-04-22 17:04 ` [PATCH v4 4/5] drm/xe: Timeslice GPU on atomic SVM fault Matthew Brost
2025-04-23 5:36 ` Ghimiray, Himal Prasad
2025-04-22 17:04 ` [PATCH v4 5/5] drm/xe: Add atomic_svm_timeslice_ms debugfs entry Matthew Brost
2025-04-23 5:37 ` Ghimiray, Himal Prasad