Intel-XE Archive on lore.kernel.org
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	dri-devel@lists.freedesktop.org, himal.prasad.ghimiray@intel.com,
	apopple@nvidia.com, airlied@gmail.com,
	"Simona Vetter" <simona.vetter@ffwll.ch>,
	felix.kuehling@amd.com,
	"Christian König" <christian.koenig@amd.com>,
	dakr@kernel.org, "Mrozek, Michal" <michal.mrozek@intel.com>,
	"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Subject: [PATCH v6 24/24] drm/xe/svm: Serialize migration to device if racing
Date: Fri, 19 Dec 2025 12:33:20 +0100
Message-ID: <20251219113320.183860-25-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20251219113320.183860-1-thomas.hellstrom@linux.intel.com>

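Introduce an rw-semaphore to serialize migration to device when the
migration is likely to be racing with another device migration of
the same CPU address space range. Migration initially takes the
semaphore in read mode. If populating the range fails with -EBUSY,
the read lock is dropped and the semaphore is retaken in write mode
before the range is evicted and the migration retried.
This is a temporary fix that attempts to mitigate a livelock that
might happen if many devices try to migrate a range at the same
time. It affects only devices using the xe driver.
A longer-term fix would probably involve improvements to the core mm
migration layer.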

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_svm.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index 84ff99aa3e49..fa2ee2c08f31 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -1593,10 +1593,12 @@ struct drm_pagemap *xe_vma_resolve_pagemap(struct xe_vma *vma, struct xe_tile *t
 int xe_svm_alloc_vram(struct xe_svm_range *range, const struct drm_gpusvm_ctx *ctx,
 		      struct drm_pagemap *dpagemap)
 {
+	static DECLARE_RWSEM(driver_migrate_lock);
 	struct xe_vm *vm = range_to_vm(&range->base);
 	enum drm_gpusvm_scan_result migration_state;
 	struct xe_device *xe = vm->xe;
 	int err, retries = 1;
+	bool write_locked = false;
 
 	xe_assert(range_to_vm(&range->base)->xe, range->base.pages.flags.migrate_devmem);
 	range_debug(range, "ALLOCATE VRAM");
@@ -1615,16 +1617,32 @@ int xe_svm_alloc_vram(struct xe_svm_range *range, const struct drm_gpusvm_ctx *c
 		drm_dbg(&xe->drm, "Request migration to device memory on \"%s\".\n",
 			dpagemap->drm->unique);
 
+	err = down_read_interruptible(&driver_migrate_lock);
+	if (err)
+		return err;
 	do {
 		err = drm_pagemap_populate_mm(dpagemap, xe_svm_range_start(range),
 					      xe_svm_range_end(range),
 					      range->base.gpusvm->mm,
 					      ctx->timeslice_ms);
 
-		if (err == -EBUSY && retries)
-			drm_gpusvm_range_evict(range->base.gpusvm, &range->base);
+		if (err == -EBUSY && retries) {
+			if (!write_locked) {
+				int lock_err;
 
+				up_read(&driver_migrate_lock);
+				lock_err = down_write_killable(&driver_migrate_lock);
+				if (lock_err)
+					return lock_err;
+				write_locked = true;
+			}
+			drm_gpusvm_range_evict(range->base.gpusvm, &range->base);
+		}
 	} while (err == -EBUSY && retries--);
+	if (write_locked)
+		up_write(&driver_migrate_lock);
+	else
+		up_read(&driver_migrate_lock);
 
 	return err;
 }
-- 
2.51.1

