Intel-XE Archive on lore.kernel.org
* [PATCH] drm/xe: Avoid evicting object of the same vm in none fault mode
@ 2024-12-03  2:19 Oak Zeng
  2024-12-03  2:10 ` ✓ CI.Patch_applied: success for drm/xe: Avoid evicting object of the same vm in none fault mode (rev2) Patchwork
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Oak Zeng @ 2024-12-03  2:19 UTC
  To: intel-xe; +Cc: Thomas.Hellstrom

BO validation during vm_bind can trigger memory eviction when the
system runs under memory pressure. Right now we blindly evict
BOs of all VMs. This scheme has a problem when the system runs in
non-recoverable page fault mode: even though the vm_bind could
succeed by evicting BOs, the later rebinding of the evicted BOs
would fail. So it is better to report an out-of-memory failure
at vm_bind time than at rebinding time, where xekmd currently
doesn't have a good mechanism to report errors to user space.

This patch implements a scheme that evicts only objects of other
VMs at vm_bind time. Objects of the same VM are skipped.
If we fail to find enough memory for vm_bind, we report the error
to user space at vm_bind time.

This scheme is not needed in recoverable page fault mode, in
which we can dynamically fault in pages on demand.

v1: Use xe_vm_in_preempt_fence_mode instead of stack variable (Thomas)

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 2492750505d69..016fedae5d554 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2359,13 +2359,15 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
 				 bool validate)
 {
 	struct xe_bo *bo = xe_vma_bo(vma);
+	struct xe_vm *vm = xe_vma_vm(vma);
 	int err = 0;
 
 	if (bo) {
 		if (!bo->vm)
 			err = drm_exec_lock_obj(exec, &bo->ttm.base);
 		if (!err && validate)
-			err = xe_bo_validate(bo, xe_vma_vm(vma), true);
+			err = xe_bo_validate(bo, vm,
+					     !xe_vm_in_preempt_fence_mode(vm));
 	}
 
 	return err;
-- 
2.26.3
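
For readers following the thread, the policy described in the commit message can be sketched as a small user-space toy model. This is only an illustration under stated assumptions: struct toy_bo, struct toy_pool, toy_validate() and the allow_same_vm_evict flag are hypothetical names that do not exist in the xe driver, and the flag stands in for the value the patch passes as the third argument of xe_bo_validate(), namely !xe_vm_in_preempt_fence_mode(vm). Ownership is reduced to a plain VM id, which is much simpler than how the driver tracks BOs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the xe driver. */
struct toy_bo {
	int vm_id;	/* owning VM in this toy model (assumption) */
	size_t pages;	/* size in pages */
	bool resident;	/* currently placed in the toy memory pool */
};

struct toy_pool {
	size_t capacity;	/* total pages in the pool */
	size_t used;		/* pages held by resident BOs */
	struct toy_bo **bos;	/* every BO known to the pool */
	size_t nr_bos;
};

/*
 * Make @bo resident on behalf of @vm_id. Victims are evicted in array
 * order until @bo fits. With @allow_same_vm_evict false, BOs owned by
 * @vm_id are never chosen as victims. A failed attempt is not rolled
 * back, so other VMs' BOs may already have been evicted.
 */
int toy_validate(struct toy_pool *pool, struct toy_bo *bo, int vm_id,
		 bool allow_same_vm_evict)
{
	size_t i;

	assert(pool && bo);
	if (bo->resident)
		return 0;

	for (i = 0; i < pool->nr_bos &&
	     pool->capacity - pool->used < bo->pages; i++) {
		struct toy_bo *victim = pool->bos[i];

		if (!victim->resident || victim == bo)
			continue;
		if (!allow_same_vm_evict && victim->vm_id == vm_id)
			continue;	/* keep this VM's own BOs in place */
		victim->resident = false;
		pool->used -= victim->pages;
	}

	if (pool->capacity - pool->used < bo->pages)
		return -ENOMEM;	/* failure is reported at bind time */

	bo->resident = true;
	pool->used += bo->pages;
	return 0;
}
```

In this model, a full 8-page pool holding a 6-page BO of VM 1 and a 2-page BO of VM 2 cannot satisfy a new 4-page bind for VM 1 when allow_same_vm_evict is false: only the VM 2 BO is evicted and the call returns -ENOMEM right away. With the flag set to true, the 6-page BO of VM 1 is evicted as well and the bind succeeds, which corresponds to the behaviour the commit message wants to avoid when the evicted BO cannot be faulted back in later.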


* [PATCH] drm/xe: Avoid evicting object of the same vm in none fault mode
@ 2024-11-28 21:01 Oak Zeng
  2024-11-29  8:45 ` Thomas Hellström
  0 siblings, 1 reply; 14+ messages in thread
From: Oak Zeng @ 2024-11-28 21:01 UTC
  To: intel-xe; +Cc: Thomas.Hellstrom

BO validation during vm_bind can trigger memory eviction when the
system runs under memory pressure. Right now we blindly evict
BOs of all VMs. This scheme has a problem when the system runs in
non-recoverable page fault mode: even though the vm_bind could
succeed by evicting BOs, the later rebinding of the evicted BOs
would fail. So it is better to report an out-of-memory failure
at vm_bind time than at rebinding time, where xekmd currently
doesn't have a good mechanism to report errors to user space.

This patch implements a scheme that evicts only objects of other
VMs at vm_bind time. Objects of the same VM are skipped.
If we fail to find enough memory for vm_bind, we report the error
to user space at vm_bind time.

This scheme is not needed in recoverable page fault mode, in
which we can dynamically fault in pages on demand.

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_vm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 2492750505d69..c005c96b88167 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2359,13 +2359,15 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
 				 bool validate)
 {
 	struct xe_bo *bo = xe_vma_bo(vma);
+	struct xe_vm *vm = xe_vma_vm(vma);
+	bool preempt_mode = xe_vm_in_preempt_fence_mode(vm);
 	int err = 0;
 
 	if (bo) {
 		if (!bo->vm)
 			err = drm_exec_lock_obj(exec, &bo->ttm.base);
 		if (!err && validate)
-			err = xe_bo_validate(bo, xe_vma_vm(vma), true);
+			err = xe_bo_validate(bo, vm, !preempt_mode);
 	}
 
 	return err;
-- 
2.26.3




Thread overview: 14+ messages
2024-12-03  2:19 [PATCH] drm/xe: Avoid evicting object of the same vm in none fault mode Oak Zeng
2024-12-03  2:10 ` ✓ CI.Patch_applied: success for drm/xe: Avoid evicting object of the same vm in none fault mode (rev2) Patchwork
2024-12-03  2:11 ` ✓ CI.checkpatch: " Patchwork
2024-12-03  2:12 ` ✓ CI.KUnit: " Patchwork
2024-12-03  2:30 ` ✓ CI.Build: " Patchwork
2024-12-03  2:32 ` ✓ CI.Hooks: " Patchwork
2024-12-03  2:34 ` ✓ CI.checksparse: " Patchwork
2024-12-03  2:54 ` ✓ Xe.CI.BAT: " Patchwork
2024-12-03  4:00 ` ✗ Xe.CI.Full: failure " Patchwork
2024-12-06 14:45 ` [PATCH] drm/xe: Avoid evicting object of the same vm in none fault mode Rodrigo Vivi
2024-12-06 15:19   ` Zeng, Oak
2024-12-06 15:53     ` Rodrigo Vivi
  -- strict thread matches above, loose matches on Subject: below --
2024-11-28 21:01 Oak Zeng
2024-11-29  8:45 ` Thomas Hellström
