From: Matthew Auld <matthew.auld@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: [Intel-xe] [PATCH v5 6/7] drm/xe: fix xe_device_mem_access_get() races
Date: Wed, 17 May 2023 16:22:43 +0100	[thread overview]
Message-ID: <20230517152244.348171-6-matthew.auld@intel.com> (raw)
In-Reply-To: <20230517152244.348171-1-matthew.auld@intel.com>

It looks like there is at least one race here, given that the
pm_runtime_suspended() check looks to return false if we are in the
process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED).  We
later also do xe_pm_runtime_get_if_active(), but since the device is
suspending or has now suspended, this doesn't do anything either.
Following from this we can potentially return from
xe_device_mem_access_get() with the device suspended or about to be,
leading to broken behaviour.

Attempt to fix this by always grabbing the runtime ref when our
internal ref transitions from 0 -> 1, and then wrap the whole thing with
a lock to ensure callers are serialized.
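
As a rough illustration of the scheme described above, here is a minimal userspace sketch, not the driver code. The sketch_* names and the rpm_usage counter are hypothetical stand-ins for the xe structures and the runtime PM usage count, and the resume is assumed to always succeed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the driver state; not the real xe structs. */
struct sketch_dev {
	pthread_mutex_t lock;	/* plays the role of mem_access.lock */
	int ref;		/* plays the role of mem_access.ref */
	bool hold_rpm;		/* did the 0 -> 1 transition take a runtime ref? */
	int rpm_usage;		/* models the runtime PM usage counter */
};

/* Models a resume-and-get that always succeeds (assumption of this sketch). */
static bool sketch_rpm_resume_and_get(struct sketch_dev *d)
{
	d->rpm_usage++;
	return true;
}

static void sketch_rpm_put(struct sketch_dev *d)
{
	d->rpm_usage--;
}

static void sketch_mem_access_get(struct sketch_dev *d)
{
	/* The lock serializes callers across every transition, not just 0 -> 1. */
	pthread_mutex_lock(&d->lock);
	if (d->ref == 0)
		d->hold_rpm = sketch_rpm_resume_and_get(d);
	d->ref++;
	pthread_mutex_unlock(&d->lock);
}

static void sketch_mem_access_put(struct sketch_dev *d)
{
	pthread_mutex_lock(&d->lock);
	assert(d->ref > 0);
	/* Only the 1 -> 0 transition drops the runtime ref taken in get(). */
	if (--d->ref == 0 && d->hold_rpm)
		sketch_rpm_put(d);
	pthread_mutex_unlock(&d->lock);
}
```

In this model a second caller cannot observe ref != 0 until the first caller has finished taking the runtime ref, which is the property the lock is meant to provide.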

v2:
 - ct->lock looks to be primed with fs_reclaim, so holding that and then
   allocating memory will cause lockdep to complain. Now that we
   unconditionally grab the mem_access.lock around mem_access_{get,put}, we
   need to change the ordering wrt grabbing the ct->lock, since some of
   the runtime_pm routines can allocate memory (or at least that's what
   lockdep seems to suggest). Hopefully not a big deal.  It might be that
   there were already issues with this, just that the atomics were
   "hiding" the potential issues.
v3:
 - Use Thomas Hellström's idea of tracking the active task that is
   executing in the resume or suspend callback, in order to avoid
   recursive resume/suspend calls deadlocking on themselves.
 - Split the ct->lock change.
v4:
 - Add smp_mb() around accessing the pm_callback_task for extra safety
   (Thomas Hellström)
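
The callback-task tracking from v3/v4 can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions: the cb_* names are hypothetical, the current task is passed in as an opaque non-NULL token instead of using the kernel's `current`, and sequentially consistent C11 atomics stand in for the WRITE_ONCE()/READ_ONCE() plus smp_mb() pairing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the device; not the real struct xe_device. */
struct cb_dev {
	_Atomic(const void *) pm_callback_task;	/* NULL when no callback runs */
	int ref;				/* models mem_access.ref */
};

static void cb_write_callback_task(struct cb_dev *d, const void *task)
{
	/* seq_cst store: stands in for WRITE_ONCE() followed by smp_mb() */
	atomic_store(&d->pm_callback_task, task);
}

static const void *cb_read_callback_task(struct cb_dev *d)
{
	/* seq_cst load: stands in for smp_mb() followed by READ_ONCE() */
	return atomic_load(&d->pm_callback_task);
}

/* current_task must be a non-NULL token identifying the caller. */
static void cb_mem_access_get(struct cb_dev *d, const void *current_task)
{
	/* Called from inside the PM callback: skip to avoid recursing. */
	if (cb_read_callback_task(d) == current_task)
		return;

	d->ref++;	/* locking and the runtime PM ref are omitted here */
}

static void cb_runtime_resume(struct cb_dev *d, const void *current_task)
{
	cb_write_callback_task(d, current_task);
	/* Work done during resume may itself call the get helper. */
	cb_mem_access_get(d, current_task);
	cb_write_callback_task(d, NULL);
}
```

A get issued from within the modelled callback returns early and leaves the count untouched, while a get from any other task proceeds as usual.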

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_device.c       | 52 +++++++++++++++++++---
 drivers/gpu/drm/xe/xe_device.h       | 17 +------
 drivers/gpu/drm/xe/xe_device_types.h | 11 ++++-
 drivers/gpu/drm/xe/xe_pm.c           | 66 +++++++++++++++++++---------
 drivers/gpu/drm/xe/xe_pm.h           |  3 +-
 5 files changed, 104 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index d0297ee432b2..55f974b1ee78 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -403,24 +403,62 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
 		DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0;
 }
 
+void xe_device_assert_mem_access(struct xe_device *xe)
+{
+	if (xe_pm_read_callback_task(xe) == current)
+		return;
+
+	XE_WARN_ON(!xe->mem_access.ref);
+}
+
+bool xe_device_mem_access_ongoing(struct xe_device *xe)
+{
+	bool ret;
+
+	if (xe_pm_read_callback_task(xe) == current)
+		return true;
+
+	mutex_lock(&xe->mem_access.lock);
+	ret = xe->mem_access.ref;
+	mutex_unlock(&xe->mem_access.lock);
+
+	return ret;
+}
+
 void xe_device_mem_access_get(struct xe_device *xe)
 {
-	bool resumed = xe_pm_runtime_resume_if_suspended(xe);
+	/*
+	 * This looks racy, but should be fine since the pm_callback_task only
+	 * transitions from NULL -> current (and back to NULL again), during the
+	 * runtime_resume() or runtime_suspend() callbacks, for which there can
+	 * only be a single one running for our device. We only need to prevent
+	 * recursively calling the runtime_get or runtime_put from those
+	 * callbacks, as well as preventing triggering any access_ongoing
+	 * asserts.
+	 */
+	if (xe_pm_read_callback_task(xe) == current)
+		return;
 
+	/*
+	 * It's possible to get rid of the mem_access.lock here, however lockdep
+	 * has an easier time digesting the mem_access.lock -> runtime_pm chain
+	 * (and any locks the caller is holding), on all transitions, and not
+	 * for example just on the 0 -> 1.
+	 */
 	mutex_lock(&xe->mem_access.lock);
-	if (xe->mem_access.ref++ == 0)
-		xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
+	if (xe->mem_access.ref == 0)
+		xe->mem_access.hold_rpm = xe_pm_runtime_resume_and_get(xe);
+	xe->mem_access.ref++;
 	mutex_unlock(&xe->mem_access.lock);
 
-	/* The usage counter increased if device was immediately resumed */
-	if (resumed)
-		xe_pm_runtime_put(xe);
-
 	XE_WARN_ON(xe->mem_access.ref == S32_MAX);
 }
 
 void xe_device_mem_access_put(struct xe_device *xe)
 {
+	if (xe_pm_read_callback_task(xe) == current)
+		return;
+
 	mutex_lock(&xe->mem_access.lock);
 	if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
 		xe_pm_runtime_put(xe);
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 9ab7e6134f89..64576ed9323e 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -90,21 +90,8 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
 void xe_device_mem_access_get(struct xe_device *xe);
 void xe_device_mem_access_put(struct xe_device *xe);
 
-static inline void xe_device_assert_mem_access(struct xe_device *xe)
-{
-	XE_WARN_ON(!xe->mem_access.ref);
-}
-
-static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
-{
-	bool ret;
-
-	mutex_lock(&xe->mem_access.lock);
-	ret = xe->mem_access.ref;
-	mutex_unlock(&xe->mem_access.lock);
-
-	return ret;
-}
+void xe_device_assert_mem_access(struct xe_device *xe);
+bool xe_device_mem_access_ongoing(struct xe_device *xe);
 
 static inline bool xe_device_in_fault_mode(struct xe_device *xe)
 {
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 0b6860156574..b0c69fcb4065 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -256,7 +256,10 @@ struct xe_device {
 	 * triggering additional actions when they occur.
 	 */
 	struct {
-		/** @lock: protect the ref count */
+		/**
+		 * @lock: Serialize xe_device_mem_access users,
+		 * and protect the below internal state, like @ref.
+		 */
 		struct mutex lock;
 		/** @ref: ref count of memory accesses */
 		s32 ref;
@@ -264,6 +267,12 @@ struct xe_device {
 		bool hold_rpm;
 	} mem_access;
 
+	/**
+	 * @pm_callback_task: Track the active task that is running in either
+	 * the runtime_suspend or runtime_resume callbacks.
+	 */
+	struct task_struct *pm_callback_task;
+
 	/** @d3cold_allowed: Indicates if d3cold is a valid device state */
 	bool d3cold_allowed;
 
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index b7b57f10ba25..29c8861e58a5 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -137,43 +137,71 @@ void xe_pm_runtime_fini(struct xe_device *xe)
 	pm_runtime_forbid(dev);
 }
 
+static void xe_pm_write_callback_task(struct xe_device *xe,
+				      struct task_struct *task)
+{
+	WRITE_ONCE(xe->pm_callback_task, task);
+
+	/*
+	 * Just in case it's somehow possible for our writes to be reordered to
+	 * the extent that something else re-uses the task written in
+	 * pm_callback_task. For example after returning from the callback, but
+	 * before the reordered write that resets pm_callback_task back to NULL.
+	 */
+	smp_mb(); /* pairs with xe_pm_read_callback_task */
+}
+
+struct task_struct *xe_pm_read_callback_task(struct xe_device *xe)
+{
+	smp_mb(); /* pairs with xe_pm_write_callback_task */
+
+	return READ_ONCE(xe->pm_callback_task);
+}
+
 int xe_pm_runtime_suspend(struct xe_device *xe)
 {
 	struct xe_gt *gt;
 	u8 id;
-	int err;
+	int err = 0;
+
+	if (xe->d3cold_allowed && xe_device_mem_access_ongoing(xe))
+		return -EBUSY;
+
+	/* Disable access_ongoing asserts and prevent recursive pm calls */
+	xe_pm_write_callback_task(xe, current);
 
 	if (xe->d3cold_allowed) {
-		if (xe_device_mem_access_ongoing(xe))
-			return -EBUSY;
-
 		err = xe_bo_evict_all(xe);
 		if (err)
-			return err;
+			goto out;
 	}
 
 	for_each_gt(gt, xe, id) {
 		err = xe_gt_suspend(gt);
 		if (err)
-			return err;
+			goto out;
 	}
 
 	xe_irq_suspend(xe);
-
-	return 0;
+out:
+	xe_pm_write_callback_task(xe, NULL);
+	return err;
 }
 
 int xe_pm_runtime_resume(struct xe_device *xe)
 {
 	struct xe_gt *gt;
 	u8 id;
-	int err;
+	int err = 0;
+
+	/* Disable access_ongoing asserts and prevent recursive pm calls */
+	xe_pm_write_callback_task(xe, current);
 
 	if (xe->d3cold_allowed) {
 		for_each_gt(gt, xe, id) {
 			err = xe_pcode_init(gt);
 			if (err)
-				return err;
+				goto out;
 		}
 
 		/*
@@ -182,7 +210,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 		 */
 		err = xe_bo_restore_kernel(xe);
 		if (err)
-			return err;
+			goto out;
 	}
 
 	xe_irq_resume(xe);
@@ -193,10 +221,11 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	if (xe->d3cold_allowed) {
 		err = xe_bo_restore_user(xe);
 		if (err)
-			return err;
+			goto out;
 	}
-
-	return 0;
+out:
+	xe_pm_write_callback_task(xe, NULL);
+	return err;
 }
 
 int xe_pm_runtime_get(struct xe_device *xe)
@@ -210,14 +239,9 @@ int xe_pm_runtime_put(struct xe_device *xe)
 	return pm_runtime_put_autosuspend(xe->drm.dev);
 }
 
-/* Return true if resume operation happened and usage count was increased */
-bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe)
+bool xe_pm_runtime_resume_and_get(struct xe_device *xe)
 {
-	/* In case we are suspended we need to immediately wake up */
-	if (pm_runtime_suspended(xe->drm.dev))
-		return !pm_runtime_resume_and_get(xe->drm.dev);
-
-	return false;
+	return !pm_runtime_resume_and_get(xe->drm.dev);
 }
 
 int xe_pm_runtime_get_if_active(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 6a885585f653..e92c508d44b9 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -19,7 +19,8 @@ int xe_pm_runtime_suspend(struct xe_device *xe);
 int xe_pm_runtime_resume(struct xe_device *xe);
 int xe_pm_runtime_get(struct xe_device *xe);
 int xe_pm_runtime_put(struct xe_device *xe);
-bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe);
+bool xe_pm_runtime_resume_and_get(struct xe_device *xe);
 int xe_pm_runtime_get_if_active(struct xe_device *xe);
+struct task_struct *xe_pm_read_callback_task(struct xe_device *xe);
 
 #endif
-- 
2.40.1

