From: Zhanjun Dong <zhanjun.dong@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Zhanjun Dong <zhanjun.dong@intel.com>,
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Subject: [PATCH v5 1/1] drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure
Date: Fri, 20 Feb 2026 17:53:08 -0500 [thread overview]
Message-ID: <20260220225308.101469-1-zhanjun.dong@intel.com> (raw)
xe_gsc_proxy_remove undoes what is done in both xe_gsc_proxy_init and
xe_gsc_proxy_start; however, if we fail between those two calls, it is
possible that HW forcewake access hasn't been initialized yet, so the
cleanup code hits errors when it tries to write the GSC registers. To
avoid that, split the cleanup into two functions so that the HW cleanup
is only called if the HW setup completed successfully.

Since the HW cleanup (interrupt disabling) is now removed from
xe_gsc_proxy_remove, the cleanup on error paths in xe_gsc_proxy_start
must be updated to disable interrupts before returning.
Fixes: ff6cd29b690b ("drm/xe: Cleanup unwind of gt initialization")
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
v5:
- Update comments (Daniele)
v4:
- Replace devm-managed cleanup action for xe_gsc_proxy_stop() with a
manual flag-based approach using a 'started' flag. This avoids a race
condition where module unload could start while the async GSC proxy
initialization is still in progress, potentially causing the devm
cleanup to be called at the wrong time.
- Set gsc->proxy.started = true at the end of xe_gsc_proxy_start() when
initialization completes successfully.
- Check gsc->proxy.started in xe_gsc_proxy_remove() to conditionally
call xe_gsc_proxy_stop() only if the proxy was actually started.
v3:
- Move xe_gsc_wait_for_worker_completion() to xe_gsc_proxy_stop() after
disabling interrupts, since the worker shouldn't be queued anymore
after interrupts are disabled.
- Update commit message to clarify that the error handling changes in
xe_gsc_proxy_start() are necessary due to the cleanup refactoring,
not a separate fix.
v2:
- Split cleanup into two functions: xe_gsc_proxy_remove() for SW cleanup
and xe_gsc_proxy_stop() for HW cleanup that requires forcewake access.
- Add error handling in xe_gsc_proxy_start to disable interrupts on
early error exits.
---
drivers/gpu/drm/xe/xe_gsc_proxy.c | 43 +++++++++++++++++++++++++------
drivers/gpu/drm/xe/xe_gsc_types.h | 2 ++
2 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gsc_proxy.c b/drivers/gpu/drm/xe/xe_gsc_proxy.c
index 42438b21f235..707db650a2ae 100644
--- a/drivers/gpu/drm/xe/xe_gsc_proxy.c
+++ b/drivers/gpu/drm/xe/xe_gsc_proxy.c
@@ -435,15 +435,11 @@ static int proxy_channel_alloc(struct xe_gsc *gsc)
return 0;
}
-static void xe_gsc_proxy_remove(void *arg)
+static void xe_gsc_proxy_stop(struct xe_gsc *gsc)
{
- struct xe_gsc *gsc = arg;
struct xe_gt *gt = gsc_to_gt(gsc);
struct xe_device *xe = gt_to_xe(gt);
- if (!gsc->proxy.component_added)
- return;
-
/* disable HECI2 IRQs */
scoped_guard(xe_pm_runtime, xe) {
CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FW_GSC);
@@ -455,6 +451,30 @@ static void xe_gsc_proxy_remove(void *arg)
}
xe_gsc_wait_for_worker_completion(gsc);
+ gsc->proxy.started = false;
+}
+
+static void xe_gsc_proxy_remove(void *arg)
+{
+ struct xe_gsc *gsc = arg;
+ struct xe_gt *gt = gsc_to_gt(gsc);
+ struct xe_device *xe = gt_to_xe(gt);
+
+ if (!gsc->proxy.component_added)
+ return;
+
+ /*
+ * GSC proxy start is an async process that can be ongoing during
+ * Xe module load/unload. Using devm managed action to register
+ * xe_gsc_proxy_stop could cause issues if Xe module unload has
+ * already started when the action is registered, potentially leading
+ * to the cleanup being called at the wrong time. Therefore, instead
+ * of registering a separate devm action to undo what is done in
+ * proxy start, we call it from here, but only if the start has
+ * completed successfully (tracked with the 'started' flag).
+ */
+ if (gsc->proxy.started)
+ xe_gsc_proxy_stop(gsc);
component_del(xe->drm.dev, &xe_gsc_proxy_component_ops);
gsc->proxy.component_added = false;
@@ -510,6 +530,7 @@ int xe_gsc_proxy_init(struct xe_gsc *gsc)
*/
int xe_gsc_proxy_start(struct xe_gsc *gsc)
{
+ struct xe_gt *gt = gsc_to_gt(gsc);
int err;
/* enable the proxy interrupt in the GSC shim layer */
@@ -521,12 +542,18 @@ int xe_gsc_proxy_start(struct xe_gsc *gsc)
*/
err = xe_gsc_proxy_request_handler(gsc);
if (err)
- return err;
+ goto err_irq_disable;
if (!xe_gsc_proxy_init_done(gsc)) {
- xe_gt_err(gsc_to_gt(gsc), "GSC FW reports proxy init not completed\n");
- return -EIO;
+ xe_gt_err(gt, "GSC FW reports proxy init not completed\n");
+ err = -EIO;
+ goto err_irq_disable;
}
+ gsc->proxy.started = true;
return 0;
+
+err_irq_disable:
+ gsc_proxy_irq_toggle(gsc, false);
+ return err;
}
diff --git a/drivers/gpu/drm/xe/xe_gsc_types.h b/drivers/gpu/drm/xe/xe_gsc_types.h
index 97c056656df0..5aaa2a75861f 100644
--- a/drivers/gpu/drm/xe/xe_gsc_types.h
+++ b/drivers/gpu/drm/xe/xe_gsc_types.h
@@ -58,6 +58,8 @@ struct xe_gsc {
struct mutex mutex;
/** @proxy.component_added: whether the component has been added */
bool component_added;
+ /** @proxy.started: whether the proxy has been started */
+ bool started;
/** @proxy.bo: object to store message to and from the GSC */
struct xe_bo *bo;
/** @proxy.to_gsc: map of the memory used to send messages to the GSC */
--
2.34.1