From: Alan Previn <alan.previn.teres.alexis@intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>,
dri-devel@lists.freedesktop.org,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
intel.com@freedesktop.org, Mousumi Jana <mousumi.jana@intel.com>
Subject: [Intel-gfx] [PATCH v3 3/3] drm/i915/gt: Timeout when waiting for idle in suspending
Date: Sat, 9 Sep 2023 20:58:46 -0700
Message-ID: <20230910035846.493766-4-alan.previn.teres.alexis@intel.com>
In-Reply-To: <20230910035846.493766-1-alan.previn.teres.alexis@intel.com>
When suspending, add a timeout to the call to
intel_gt_pm_wait_for_idle. Otherwise, if a lost
G2H event is holding a wakeref (which would
indicate a bug elsewhere in the driver), the
suspend hangs in the kernel. With the timeout,
the driver at least completes the suspend-resume
cycle, albeit without hitting all the targets
for the low-power hw counters.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Tested-by: Mousumi Jana <mousumi.jana@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +-
drivers/gpu/drm/i915/gt/intel_gt_pm.c | 7 ++++++-
drivers/gpu/drm/i915/gt/intel_gt_pm.h | 7 ++++++-
drivers/gpu/drm/i915/intel_wakeref.c | 14 ++++++++++----
drivers/gpu/drm/i915/intel_wakeref.h | 6 ++++--
5 files changed, 27 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index dfb69fc977a0..4811f3be0332 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -688,7 +688,7 @@ void intel_engines_release(struct intel_gt *gt)
if (!engine->release)
continue;
- intel_wakeref_wait_for_idle(&engine->wakeref);
+ intel_wakeref_wait_for_idle(&engine->wakeref, 0);
GEM_BUG_ON(intel_engine_pm_is_awake(engine));
engine->release(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 5a942af0a14e..ca46aee72573 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -289,6 +289,8 @@ int intel_gt_resume(struct intel_gt *gt)
static void wait_for_suspend(struct intel_gt *gt)
{
+ int timeout_ms = CONFIG_DRM_I915_MAX_REQUEST_BUSYWAIT ? : 10000;
+
if (!intel_gt_pm_is_awake(gt))
return;
@@ -301,7 +303,10 @@ static void wait_for_suspend(struct intel_gt *gt)
intel_gt_retire_requests(gt);
}
- intel_gt_pm_wait_for_idle(gt);
+ /* we are suspending, so we shouldn't be waiting forever */
+ if (intel_gt_pm_wait_timeout_for_idle(gt, timeout_ms) == -ETIMEDOUT)
+ gt_warn(gt, "bailing from %s after %d milisec timeout\n",
+ __func__, timeout_ms);
}
void intel_gt_suspend_prepare(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index 6c9a46452364..5358acc2b5b1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -68,7 +68,12 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt)
static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
{
- return intel_wakeref_wait_for_idle(&gt->wakeref);
+ return intel_wakeref_wait_for_idle(&gt->wakeref, 0);
+}
+
+static inline int intel_gt_pm_wait_timeout_for_idle(struct intel_gt *gt, int timeout_ms)
+{
+ return intel_wakeref_wait_for_idle(&gt->wakeref, timeout_ms);
}
void intel_gt_pm_init_early(struct intel_gt *gt);
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index 718f2f1b6174..383a37521415 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -111,14 +111,20 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
"wakeref.work", &key->work, 0);
}
-int intel_wakeref_wait_for_idle(struct intel_wakeref *wf)
+int intel_wakeref_wait_for_idle(struct intel_wakeref *wf, int timeout_ms)
{
- int err;
+ int err = 0;
might_sleep();
- err = wait_var_event_killable(&wf->wakeref,
- !intel_wakeref_is_active(wf));
+ if (!timeout_ms)
+ err = wait_var_event_killable(&wf->wakeref,
+ !intel_wakeref_is_active(wf));
+ else if (wait_var_event_timeout(&wf->wakeref,
+ !intel_wakeref_is_active(wf),
+ msecs_to_jiffies(timeout_ms)) < 1)
+ err = -ETIMEDOUT;
+
if (err)
return err;
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index ec881b097368..302694a780d2 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -251,15 +251,17 @@ __intel_wakeref_defer_park(struct intel_wakeref *wf)
/**
* intel_wakeref_wait_for_idle: Wait until the wakeref is idle
* @wf: the wakeref
+ * @timeout_ms: Timeout in ms, 0 means never timeout.
*
* Wait for the earlier asynchronous release of the wakeref. Note
* this will wait for any third party as well, so make sure you only wait
* when you have control over the wakeref and trust no one else is acquiring
* it.
*
- * Return: 0 on success, error code if killed.
+ * Returns 0 on success, -ETIMEDOUT upon a timeout, or the unlikely
+ * error propagation from wait_var_event_killable if timeout_ms is 0.
*/
-int intel_wakeref_wait_for_idle(struct intel_wakeref *wf);
+int intel_wakeref_wait_for_idle(struct intel_wakeref *wf, int timeout_ms);
struct intel_wakeref_auto {
struct drm_i915_private *i915;
--
2.39.0