Intel-XE Archive on lore.kernel.org
* [PATCH v5 1/1] drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure
@ 2026-02-20 22:53 Zhanjun Dong
From: Zhanjun Dong @ 2026-02-20 22:53 UTC (permalink / raw)
  To: intel-xe; +Cc: Zhanjun Dong, Daniele Ceraolo Spurio

xe_gsc_proxy_remove undoes what is done in both xe_gsc_proxy_init and
xe_gsc_proxy_start; however, if we fail between those two calls, the HW
forcewake access may not have been initialized yet, and we hit errors
when the cleanup code tries to write GSC registers. To avoid that, split
the cleanup into two functions so that the HW cleanup is only called if
the HW setup was completed successfully.

Since the HW cleanup (interrupt disabling) is now removed from
xe_gsc_proxy_remove, the cleanup on error paths in xe_gsc_proxy_start
must be updated to disable interrupts before returning.

Fixes: ff6cd29b690b ("drm/xe: Cleanup unwind of gt initialization")
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
v5:
- Update comments (Daniele)
v4:
- Replace devm-managed cleanup action for xe_gsc_proxy_stop() with a
  manual flag-based approach using a 'started' flag. This avoids a race
  condition where module unload could start while the async GSC proxy
  initialization is still in progress, potentially causing the devm
  cleanup to be called at the wrong time.
- Set gsc->proxy.started = true at the end of xe_gsc_proxy_start() when
  initialization completes successfully.
- Check gsc->proxy.started in xe_gsc_proxy_remove() to conditionally
  call xe_gsc_proxy_stop() only if the proxy was actually started.

v3:
- Move xe_gsc_wait_for_worker_completion() to xe_gsc_proxy_stop() after
  disabling interrupts, since the worker shouldn't be queued anymore
  after interrupts are disabled.
- Update commit message to clarify that the error handling changes in
  xe_gsc_proxy_start() are necessary due to the cleanup refactoring,
  not a separate fix.

v2:
- Split cleanup into two functions: xe_gsc_proxy_remove() for SW cleanup
  and xe_gsc_proxy_stop() for HW cleanup that requires forcewake access.
- Add error handling in xe_gsc_proxy_start to disable interrupts on
  early error exits.
---
 drivers/gpu/drm/xe/xe_gsc_proxy.c | 43 +++++++++++++++++++++++++------
 drivers/gpu/drm/xe/xe_gsc_types.h |  2 ++
 2 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gsc_proxy.c b/drivers/gpu/drm/xe/xe_gsc_proxy.c
index 42438b21f235..707db650a2ae 100644
--- a/drivers/gpu/drm/xe/xe_gsc_proxy.c
+++ b/drivers/gpu/drm/xe/xe_gsc_proxy.c
@@ -435,15 +435,11 @@ static int proxy_channel_alloc(struct xe_gsc *gsc)
 	return 0;
 }
 
-static void xe_gsc_proxy_remove(void *arg)
+static void xe_gsc_proxy_stop(struct xe_gsc *gsc)
 {
-	struct xe_gsc *gsc = arg;
 	struct xe_gt *gt = gsc_to_gt(gsc);
 	struct xe_device *xe = gt_to_xe(gt);
 
-	if (!gsc->proxy.component_added)
-		return;
-
 	/* disable HECI2 IRQs */
 	scoped_guard(xe_pm_runtime, xe) {
 		CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GSC);
@@ -455,6 +451,30 @@ static void xe_gsc_proxy_remove(void *arg)
 	}
 
 	xe_gsc_wait_for_worker_completion(gsc);
+	gsc->proxy.started = false;
+}
+
+static void xe_gsc_proxy_remove(void *arg)
+{
+	struct xe_gsc *gsc = arg;
+	struct xe_gt *gt = gsc_to_gt(gsc);
+	struct xe_device *xe = gt_to_xe(gt);
+
+	if (!gsc->proxy.component_added)
+		return;
+
+	/*
+	 * GSC proxy start is an async process that can be ongoing during
+	 * Xe module load/unload. Using devm managed action to register
+	 * xe_gsc_proxy_stop could cause issues if Xe module unload has
+	 * already started when the action is registered, potentially leading
+	 * to the cleanup being called at the wrong time. Therefore, instead
+	 * of registering a separate devm action to undo what is done in
+	 * proxy start, we call it from here, but only if the start has
+	 * completed successfully (tracked with the 'started' flag).
+	 */
+	if (gsc->proxy.started)
+		xe_gsc_proxy_stop(gsc);
 
 	component_del(xe->drm.dev, &xe_gsc_proxy_component_ops);
 	gsc->proxy.component_added = false;
@@ -510,6 +530,7 @@ int xe_gsc_proxy_init(struct xe_gsc *gsc)
  */
 int xe_gsc_proxy_start(struct xe_gsc *gsc)
 {
+	struct xe_gt *gt = gsc_to_gt(gsc);
 	int err;
 
 	/* enable the proxy interrupt in the GSC shim layer */
@@ -521,12 +542,18 @@ int xe_gsc_proxy_start(struct xe_gsc *gsc)
 	 */
 	err = xe_gsc_proxy_request_handler(gsc);
 	if (err)
-		return err;
+		goto err_irq_disable;
 
 	if (!xe_gsc_proxy_init_done(gsc)) {
-		xe_gt_err(gsc_to_gt(gsc), "GSC FW reports proxy init not completed\n");
-		return -EIO;
+		xe_gt_err(gt, "GSC FW reports proxy init not completed\n");
+		err = -EIO;
+		goto err_irq_disable;
 	}
 
+	gsc->proxy.started = true;
 	return 0;
+
+err_irq_disable:
+	gsc_proxy_irq_toggle(gsc, false);
+	return err;
 }
diff --git a/drivers/gpu/drm/xe/xe_gsc_types.h b/drivers/gpu/drm/xe/xe_gsc_types.h
index 97c056656df0..5aaa2a75861f 100644
--- a/drivers/gpu/drm/xe/xe_gsc_types.h
+++ b/drivers/gpu/drm/xe/xe_gsc_types.h
@@ -58,6 +58,8 @@ struct xe_gsc {
 		struct mutex mutex;
 		/** @proxy.component_added: whether the component has been added */
 		bool component_added;
+		/** @proxy.started: whether the proxy has been started */
+		bool started;
 		/** @proxy.bo: object to store message to and from the GSC */
 		struct xe_bo *bo;
 		/** @proxy.to_gsc: map of the memory used to send messages to the GSC */
-- 
2.34.1
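
The teardown ordering described in the commit message and the v4 notes can be modelled outside the kernel. The following is only a minimal user-space sketch, not driver code: struct proxy_model, the model_* functions, and the hw_ready/hw_errors fields are hypothetical stand-ins invented for illustration, and a counter replaces the real forcewake and register accesses. It shows the two ideas in the patch: remove() only calls stop() when the 'started' flag is set, and start() disables the interrupt itself on its error paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proxy state; not the real struct xe_gsc. */
struct proxy_model {
	bool component_added;	/* SW setup done (init stage) */
	bool started;		/* HW setup done (start stage) */
	bool irq_enabled;	/* models the proxy interrupt enable */
	bool hw_ready;		/* models HW access being usable */
	int hw_errors;		/* HW accesses attempted while !hw_ready */
};

/* Models a register write: counts an error if the HW is not ready. */
static void model_irq_toggle(struct proxy_model *p, bool enable)
{
	if (!p->hw_ready)
		p->hw_errors++;
	p->irq_enabled = enable;
}

static int model_init(struct proxy_model *p)
{
	p->component_added = true;
	return 0;
}

/* handler_err stands in for the result of the first proxy request. */
static int model_start(struct proxy_model *p, int handler_err)
{
	int err;

	model_irq_toggle(p, true);

	err = handler_err;
	if (err)
		goto err_irq_disable;

	p->started = true;
	return 0;

err_irq_disable:
	/* remove() no longer touches the HW, so unwind here. */
	model_irq_toggle(p, false);
	return err;
}

/* HW cleanup: only valid once start has completed. */
static void model_stop(struct proxy_model *p)
{
	model_irq_toggle(p, false);
	p->started = false;
}

/* SW cleanup: calls the HW cleanup only if start completed. */
static void model_remove(struct proxy_model *p)
{
	if (!p->component_added)
		return;

	if (p->started)
		model_stop(p);

	p->component_added = false;
}

/* Exercises the three paths; returns 0 if every check passes. */
static int model_selftest(void)
{
	struct proxy_model early = { 0 }, failed = { 0 }, ok = { 0 };

	/* Failure between init and start: no HW access on remove. */
	model_init(&early);
	model_remove(&early);
	assert(early.hw_errors == 0 && !early.component_added);

	/* Start fails: interrupt is disabled by start itself. */
	model_init(&failed);
	failed.hw_ready = true;
	assert(model_start(&failed, -5) == -5);
	assert(!failed.irq_enabled && !failed.started);
	model_remove(&failed);
	assert(failed.hw_errors == 0);

	/* Start succeeds: remove performs the HW cleanup. */
	model_init(&ok);
	ok.hw_ready = true;
	assert(model_start(&ok, 0) == 0 && ok.started && ok.irq_enabled);
	model_remove(&ok);
	assert(!ok.irq_enabled && !ok.started && ok.hw_errors == 0);

	return 0;
}
```

Calling model_selftest() from a small main() runs all three paths. In this model, dropping the 'started' check from model_remove() makes the first scenario count a HW access while hw_ready is false, which is the class of failure the patch describes.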



Thread overview: 9+ messages
2026-02-20 22:53 [PATCH v5 1/1] drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure Zhanjun Dong
2026-02-20 23:00 ` ✓ CI.KUnit: success for series starting with [v5,1/1] " Patchwork
2026-02-20 23:34 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-23 11:58 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-02-24 15:24 ` ✓ CI.KUnit: success for series starting with [v5,1/1] drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure (rev2) Patchwork
2026-02-25 23:41 ` ✓ CI.KUnit: success for series starting with [v5,1/1] drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure (rev3) Patchwork
2026-02-26  0:37 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-26  3:13 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-02-26 15:02   ` Dong, Zhanjun
