All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues
@ 2023-05-30 11:15 Luca Coelho
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 1/3] drm/i915: use pointer to i915 instead of rpm in wakeref Luca Coelho
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Luca Coelho @ 2023-05-30 11:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: rodrigo.vivi

Hi,

This series implements internal workqueues in the i915 driver in order
to avoid using the system queue.  We add one generic workqueue in the
drm_i915_private structure, one specific for wake references and one
in a self-test.

This is based on Tetsuo's work[1] and is required to get rid of the
flush_scheduled_work() usage.

As discussed, we can either take Tetsuo's implementation first, which
uses a module-global workqueue, and then my series on top of that, to
change the implementation to per-device workqueues, or apply my series
directly.  And a third option would be to keep the workqueue as
module-global.  I'm fine with any of this options.  It's up to the
maintainers to decide which one to take.

In v2:
   * Removed per-engine workqueue and wakeref-specific queue;
   * Renamed the workqueue name from "i915-wq" to "unordered_wq";
   * Added comment about when the new workqueue should be used;
   * Changed wakeref structs to store i915 instead of rpm;

In v3:
   * Fixed queue to destroy in the init error path;
   * Improved the new queue's description a bit;
   * Removed stray empty-line removal;

Please review.

[1] https://patchwork.freedesktop.org/series/114608/

Cheers,
Luca.


Luca Coelho (3):
  drm/i915: use pointer to i915 instead of rpm in wakeref
  drm/i915: add a dedicated workqueue inside drm_i915_private
  drm/i915/selftests: add local workqueue for SW fence selftest

 drivers/gpu/drm/i915/display/intel_display.c  |  5 ++--
 .../drm/i915/display/intel_display_driver.c   |  2 +-
 drivers/gpu/drm/i915/display/intel_dmc.c      |  2 +-
 drivers/gpu/drm/i915/display/intel_dp.c       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c |  3 ++-
 drivers/gpu/drm/i915/display/intel_drrs.c     |  4 +++-
 drivers/gpu/drm/i915/display/intel_fbc.c      |  2 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c    |  3 ++-
 drivers/gpu/drm/i915/display/intel_hdcp.c     | 23 +++++++++++--------
 drivers/gpu/drm/i915/display/intel_hotplug.c  | 18 ++++++++++-----
 drivers/gpu/drm/i915/display/intel_opregion.c |  3 ++-
 drivers/gpu/drm/i915/display/intel_pps.c      |  4 +++-
 drivers/gpu/drm/i915/display/intel_psr.c      |  8 ++++---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |  4 +---
 .../drm/i915/gt/intel_execlists_submission.c  |  5 ++--
 .../gpu/drm/i915/gt/intel_gt_buffer_pool.c    | 10 ++++----
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c         |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_requests.c   | 10 ++++----
 drivers/gpu/drm/i915/gt/intel_reset.c         |  2 +-
 drivers/gpu/drm/i915/gt/intel_rps.c           | 20 ++++++++--------
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  2 +-
 drivers/gpu/drm/i915/i915_driver.c            | 13 +++++++++++
 drivers/gpu/drm/i915/i915_drv.h               | 10 ++++++++
 drivers/gpu/drm/i915/i915_request.c           |  2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c       |  2 +-
 drivers/gpu/drm/i915/intel_wakeref.c          | 22 ++++++++++--------
 drivers/gpu/drm/i915/intel_wakeref.h          | 12 +++++-----
 .../gpu/drm/i915/selftests/i915_sw_fence.c    | 16 ++++++++++---
 29 files changed, 136 insertions(+), 77 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-gfx] [PATCH v3 1/3] drm/i915: use pointer to i915 instead of rpm in wakeref
  2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
@ 2023-05-30 11:15 ` Luca Coelho
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 2/3] drm/i915: add a dedicated workqueue inside drm_i915_private Luca Coelho
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Luca Coelho @ 2023-05-30 11:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: rodrigo.vivi

Currently a pointer to an intel_runtime_pm structure is stored in the
wake reference structures so the runtime data can be accessed.  We can
save the entire device information (drm_i915_private) instead, since
we'll need to reference the new workqueue we'll add in subsequent
patches.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  4 +---
 drivers/gpu/drm/i915/gt/intel_gt_pm.c     |  2 +-
 drivers/gpu/drm/i915/intel_runtime_pm.c   |  2 +-
 drivers/gpu/drm/i915/intel_wakeref.c      | 20 +++++++++++---------
 drivers/gpu/drm/i915/intel_wakeref.h      | 12 ++++++------
 5 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index ee531a5c142c..21af0ec52223 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -296,9 +296,7 @@ static const struct intel_wakeref_ops wf_ops = {
 
 void intel_engine_init__pm(struct intel_engine_cs *engine)
 {
-	struct intel_runtime_pm *rpm = engine->uncore->rpm;
-
-	intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);
+	intel_wakeref_init(&engine->wakeref, engine->i915, &wf_ops);
 	intel_engine_init_heartbeat(engine);
 
 	intel_gsc_idle_msg_enable(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index c2e69bafd02b..5a942af0a14e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -137,7 +137,7 @@ void intel_gt_pm_init_early(struct intel_gt *gt)
 	 * runtime_pm is per-device rather than per-tile, so this is still the
 	 * correct structure.
 	 */
-	intel_wakeref_init(&gt->wakeref, &gt->i915->runtime_pm, &wf_ops);
+	intel_wakeref_init(&gt->wakeref, gt->i915, &wf_ops);
 	seqcount_mutex_init(&gt->stats.lock, &gt->wakeref.mutex);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index cf5122299b6b..6d8e5e5c0cba 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -658,5 +658,5 @@ void intel_runtime_pm_init_early(struct intel_runtime_pm *rpm)
 	init_intel_runtime_pm_wakeref(rpm);
 	INIT_LIST_HEAD(&rpm->lmem_userfault_list);
 	spin_lock_init(&rpm->lmem_userfault_lock);
-	intel_wakeref_auto_init(&rpm->userfault_wakeref, rpm);
+	intel_wakeref_auto_init(&rpm->userfault_wakeref, i915);
 }
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index dfd87d082218..40aafe676017 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -8,17 +8,18 @@
 
 #include "intel_runtime_pm.h"
 #include "intel_wakeref.h"
+#include "i915_drv.h"
 
 static void rpm_get(struct intel_wakeref *wf)
 {
-	wf->wakeref = intel_runtime_pm_get(wf->rpm);
+	wf->wakeref = intel_runtime_pm_get(&wf->i915->runtime_pm);
 }
 
 static void rpm_put(struct intel_wakeref *wf)
 {
 	intel_wakeref_t wakeref = fetch_and_zero(&wf->wakeref);
 
-	intel_runtime_pm_put(wf->rpm, wakeref);
+	intel_runtime_pm_put(&wf->i915->runtime_pm, wakeref);
 	INTEL_WAKEREF_BUG_ON(!wakeref);
 }
 
@@ -94,11 +95,11 @@ static void __intel_wakeref_put_work(struct work_struct *wrk)
 }
 
 void __intel_wakeref_init(struct intel_wakeref *wf,
-			  struct intel_runtime_pm *rpm,
+			  struct drm_i915_private *i915,
 			  const struct intel_wakeref_ops *ops,
 			  struct intel_wakeref_lockclass *key)
 {
-	wf->rpm = rpm;
+	wf->i915 = i915;
 	wf->ops = ops;
 
 	__mutex_init(&wf->mutex, "wakeref.mutex", &key->mutex);
@@ -137,17 +138,17 @@ static void wakeref_auto_timeout(struct timer_list *t)
 	wakeref = fetch_and_zero(&wf->wakeref);
 	spin_unlock_irqrestore(&wf->lock, flags);
 
-	intel_runtime_pm_put(wf->rpm, wakeref);
+	intel_runtime_pm_put(&wf->i915->runtime_pm, wakeref);
 }
 
 void intel_wakeref_auto_init(struct intel_wakeref_auto *wf,
-			     struct intel_runtime_pm *rpm)
+			     struct drm_i915_private *i915)
 {
 	spin_lock_init(&wf->lock);
 	timer_setup(&wf->timer, wakeref_auto_timeout, 0);
 	refcount_set(&wf->count, 0);
 	wf->wakeref = 0;
-	wf->rpm = rpm;
+	wf->i915 = i915;
 }
 
 void intel_wakeref_auto(struct intel_wakeref_auto *wf, unsigned long timeout)
@@ -161,13 +162,14 @@ void intel_wakeref_auto(struct intel_wakeref_auto *wf, unsigned long timeout)
 	}
 
 	/* Our mission is that we only extend an already active wakeref */
-	assert_rpm_wakelock_held(wf->rpm);
+	assert_rpm_wakelock_held(&wf->i915->runtime_pm);
 
 	if (!refcount_inc_not_zero(&wf->count)) {
 		spin_lock_irqsave(&wf->lock, flags);
 		if (!refcount_inc_not_zero(&wf->count)) {
 			INTEL_WAKEREF_BUG_ON(wf->wakeref);
-			wf->wakeref = intel_runtime_pm_get_if_in_use(wf->rpm);
+			wf->wakeref =
+				intel_runtime_pm_get_if_in_use(&wf->i915->runtime_pm);
 			refcount_set(&wf->count, 1);
 		}
 		spin_unlock_irqrestore(&wf->lock, flags);
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index 0b6b4852ab23..ec881b097368 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -39,7 +39,7 @@ struct intel_wakeref {
 
 	intel_wakeref_t wakeref;
 
-	struct intel_runtime_pm *rpm;
+	struct drm_i915_private *i915;
 	const struct intel_wakeref_ops *ops;
 
 	struct delayed_work work;
@@ -51,13 +51,13 @@ struct intel_wakeref_lockclass {
 };
 
 void __intel_wakeref_init(struct intel_wakeref *wf,
-			  struct intel_runtime_pm *rpm,
+			  struct drm_i915_private *i915,
 			  const struct intel_wakeref_ops *ops,
 			  struct intel_wakeref_lockclass *key);
-#define intel_wakeref_init(wf, rpm, ops) do {				\
+#define intel_wakeref_init(wf, i915, ops) do {				\
 	static struct intel_wakeref_lockclass __key;			\
 									\
-	__intel_wakeref_init((wf), (rpm), (ops), &__key);		\
+	__intel_wakeref_init((wf), (i915), (ops), &__key);		\
 } while (0)
 
 int __intel_wakeref_get_first(struct intel_wakeref *wf);
@@ -262,7 +262,7 @@ __intel_wakeref_defer_park(struct intel_wakeref *wf)
 int intel_wakeref_wait_for_idle(struct intel_wakeref *wf);
 
 struct intel_wakeref_auto {
-	struct intel_runtime_pm *rpm;
+	struct drm_i915_private *i915;
 	struct timer_list timer;
 	intel_wakeref_t wakeref;
 	spinlock_t lock;
@@ -287,7 +287,7 @@ struct intel_wakeref_auto {
 void intel_wakeref_auto(struct intel_wakeref_auto *wf, unsigned long timeout);
 
 void intel_wakeref_auto_init(struct intel_wakeref_auto *wf,
-			     struct intel_runtime_pm *rpm);
+			     struct drm_i915_private *i915);
 void intel_wakeref_auto_fini(struct intel_wakeref_auto *wf);
 
 #endif /* INTEL_WAKEREF_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Intel-gfx] [PATCH v3 2/3] drm/i915: add a dedicated workqueue inside drm_i915_private
  2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 1/3] drm/i915: use pointer to i915 instead of rpm in wakeref Luca Coelho
@ 2023-05-30 11:15 ` Luca Coelho
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 3/3] drm/i915/selftests: add local workqueue for SW fence selftest Luca Coelho
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Luca Coelho @ 2023-05-30 11:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: rodrigo.vivi

In order to avoid flush_scheduled_work() usage, add a dedicated
workqueue in the drm_i915_private structure.  In this way, we don't
need to use the system queue anymore.

This change is mostly mechanical and based on Tetsuo's original
patch[1].

Link: https://patchwork.freedesktop.org/series/114608/ [1]
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c  |  5 ++--
 .../drm/i915/display/intel_display_driver.c   |  2 +-
 drivers/gpu/drm/i915/display/intel_dmc.c      |  2 +-
 drivers/gpu/drm/i915/display/intel_dp.c       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c |  3 ++-
 drivers/gpu/drm/i915/display/intel_drrs.c     |  4 +++-
 drivers/gpu/drm/i915/display/intel_fbc.c      |  2 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c    |  3 ++-
 drivers/gpu/drm/i915/display/intel_hdcp.c     | 23 +++++++++++--------
 drivers/gpu/drm/i915/display/intel_hotplug.c  | 18 ++++++++++-----
 drivers/gpu/drm/i915/display/intel_opregion.c |  3 ++-
 drivers/gpu/drm/i915/display/intel_pps.c      |  4 +++-
 drivers/gpu/drm/i915/display/intel_psr.c      |  8 ++++---
 .../drm/i915/gt/intel_execlists_submission.c  |  5 ++--
 .../gpu/drm/i915/gt/intel_gt_buffer_pool.c    | 10 ++++----
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_requests.c   | 10 ++++----
 drivers/gpu/drm/i915/gt/intel_reset.c         |  2 +-
 drivers/gpu/drm/i915/gt/intel_rps.c           | 20 ++++++++--------
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  2 +-
 drivers/gpu/drm/i915/i915_driver.c            | 13 +++++++++++
 drivers/gpu/drm/i915/i915_drv.h               | 10 ++++++++
 drivers/gpu/drm/i915/i915_request.c           |  2 +-
 drivers/gpu/drm/i915/intel_wakeref.c          |  2 +-
 24 files changed, 103 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index f51a55f4e9d0..39af6fc43bb6 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -7164,11 +7164,12 @@ intel_atomic_commit_ready(struct i915_sw_fence *fence,
 		break;
 	case FENCE_FREE:
 		{
+			struct drm_i915_private *i915 = to_i915(state->base.dev);
 			struct intel_atomic_helper *helper =
-				&to_i915(state->base.dev)->display.atomic_helper;
+				&i915->display.atomic_helper;
 
 			if (llist_add(&state->freed, &helper->free_list))
-				schedule_work(&helper->free_work);
+				queue_work(i915->unordered_wq, &helper->free_work);
 			break;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c
index 60ce10fc7205..a9e36ddb6c1d 100644
--- a/drivers/gpu/drm/i915/display/intel_display_driver.c
+++ b/drivers/gpu/drm/i915/display/intel_display_driver.c
@@ -435,7 +435,7 @@ void intel_display_driver_remove_noirq(struct drm_i915_private *i915)
 	intel_unregister_dsm_handler();
 
 	/* flush any delayed tasks or pending work */
-	flush_scheduled_work();
+	flush_workqueue(i915->unordered_wq);
 
 	intel_hdcp_component_fini(i915);
 
diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
index 8a88de67ff0a..5f479f3828bb 100644
--- a/drivers/gpu/drm/i915/display/intel_dmc.c
+++ b/drivers/gpu/drm/i915/display/intel_dmc.c
@@ -1057,7 +1057,7 @@ void intel_dmc_init(struct drm_i915_private *i915)
 	i915->display.dmc.dmc = dmc;
 
 	drm_dbg_kms(&i915->drm, "Loading %s\n", dmc->fw_path);
-	schedule_work(&dmc->work);
+	queue_work(i915->unordered_wq, &dmc->work);
 
 	return;
 
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 4bec8cd7979f..1466283cd569 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5251,7 +5251,7 @@ static void intel_dp_oob_hotplug_event(struct drm_connector *connector)
 	spin_lock_irq(&i915->irq_lock);
 	i915->display.hotplug.event_bits |= BIT(encoder->hpd_pin);
 	spin_unlock_irq(&i915->irq_lock);
-	queue_delayed_work(system_wq, &i915->display.hotplug.hotplug_work, 0);
+	queue_delayed_work(i915->unordered_wq, &i915->display.hotplug.hotplug_work, 0);
 }
 
 static const struct drm_connector_funcs intel_dp_connector_funcs = {
diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
index 0952a707358c..2cccc0a970f1 100644
--- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
@@ -1064,6 +1064,7 @@ static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 						     const struct intel_crtc_state *crtc_state)
 {
 	struct intel_connector *intel_connector = intel_dp->attached_connector;
+	struct drm_i915_private *i915 = dp_to_i915(intel_dp);
 
 	if (!intel_digital_port_connected(&dp_to_dig_port(intel_dp)->base)) {
 		lt_dbg(intel_dp, DP_PHY_DPRX, "Link Training failed on disconnected sink.\n");
@@ -1081,7 +1082,7 @@ static void intel_dp_schedule_fallback_link_training(struct intel_dp *intel_dp,
 	}
 
 	/* Schedule a Hotplug Uevent to userspace to start modeset */
-	schedule_work(&intel_connector->modeset_retry_work);
+	queue_work(i915->unordered_wq, &intel_connector->modeset_retry_work);
 }
 
 /* Perform the link training on all LTTPRs and the DPRX on a link. */
diff --git a/drivers/gpu/drm/i915/display/intel_drrs.c b/drivers/gpu/drm/i915/display/intel_drrs.c
index 760e63cdc0c8..0d35b6be5b6a 100644
--- a/drivers/gpu/drm/i915/display/intel_drrs.c
+++ b/drivers/gpu/drm/i915/display/intel_drrs.c
@@ -111,7 +111,9 @@ static void intel_drrs_set_state(struct intel_crtc *crtc,
 
 static void intel_drrs_schedule_work(struct intel_crtc *crtc)
 {
-	mod_delayed_work(system_wq, &crtc->drrs.work, msecs_to_jiffies(1000));
+	struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+
+	mod_delayed_work(i915->unordered_wq, &crtc->drrs.work, msecs_to_jiffies(1000));
 }
 
 static unsigned int intel_drrs_frontbuffer_bits(const struct intel_crtc_state *crtc_state)
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index 1966f9396201..89bb7c588f05 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -1600,7 +1600,7 @@ static void __intel_fbc_handle_fifo_underrun_irq(struct intel_fbc *fbc)
 	if (READ_ONCE(fbc->underrun_detected))
 		return;
 
-	schedule_work(&fbc->underrun_work);
+	queue_work(fbc->i915->unordered_wq, &fbc->underrun_work);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index aab1ae74a8f7..c88f13d319b9 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -689,7 +689,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 			/* Don't block our own workqueue as this can
 			 * be run in parallel with other i915.ko tasks.
 			 */
-			schedule_work(&dev_priv->display.fbdev.suspend_work);
+			queue_work(dev_priv->unordered_wq,
+				   &dev_priv->display.fbdev.suspend_work);
 			return;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c b/drivers/gpu/drm/i915/display/intel_hdcp.c
index 1f96d1fa68e0..9b239c565377 100644
--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
@@ -983,6 +983,7 @@ static void intel_hdcp_update_value(struct intel_connector *connector,
 	struct drm_device *dev = connector->base.dev;
 	struct intel_digital_port *dig_port = intel_attached_dig_port(connector);
 	struct intel_hdcp *hdcp = &connector->hdcp;
+	struct drm_i915_private *i915 = to_i915(connector->base.dev);
 
 	drm_WARN_ON(connector->base.dev, !mutex_is_locked(&hdcp->mutex));
 
@@ -1001,7 +1002,7 @@ static void intel_hdcp_update_value(struct intel_connector *connector,
 	hdcp->value = value;
 	if (update_property) {
 		drm_connector_get(&connector->base);
-		schedule_work(&hdcp->prop_work);
+		queue_work(i915->unordered_wq, &hdcp->prop_work);
 	}
 }
 
@@ -2090,16 +2091,17 @@ static void intel_hdcp_check_work(struct work_struct *work)
 					       struct intel_hdcp,
 					       check_work);
 	struct intel_connector *connector = intel_hdcp_to_connector(hdcp);
+	struct drm_i915_private *i915 = to_i915(connector->base.dev);
 
 	if (drm_connector_is_unregistered(&connector->base))
 		return;
 
 	if (!intel_hdcp2_check_link(connector))
-		schedule_delayed_work(&hdcp->check_work,
-				      DRM_HDCP2_CHECK_PERIOD_MS);
+		queue_delayed_work(i915->unordered_wq, &hdcp->check_work,
+				   DRM_HDCP2_CHECK_PERIOD_MS);
 	else if (!intel_hdcp_check_link(connector))
-		schedule_delayed_work(&hdcp->check_work,
-				      DRM_HDCP_CHECK_PERIOD_MS);
+		queue_delayed_work(i915->unordered_wq, &hdcp->check_work,
+				   DRM_HDCP_CHECK_PERIOD_MS);
 }
 
 static int i915_hdcp_component_bind(struct device *i915_kdev,
@@ -2398,7 +2400,8 @@ int intel_hdcp_enable(struct intel_atomic_state *state,
 	}
 
 	if (!ret) {
-		schedule_delayed_work(&hdcp->check_work, check_link_interval);
+		queue_delayed_work(dev_priv->unordered_wq, &hdcp->check_work,
+				   check_link_interval);
 		intel_hdcp_update_value(connector,
 					DRM_MODE_CONTENT_PROTECTION_ENABLED,
 					true);
@@ -2447,6 +2450,7 @@ void intel_hdcp_update_pipe(struct intel_atomic_state *state,
 				to_intel_connector(conn_state->connector);
 	struct intel_hdcp *hdcp = &connector->hdcp;
 	bool content_protection_type_changed, desired_and_not_enabled = false;
+	struct drm_i915_private *i915 = to_i915(connector->base.dev);
 
 	if (!connector->hdcp.shim)
 		return;
@@ -2473,7 +2477,7 @@ void intel_hdcp_update_pipe(struct intel_atomic_state *state,
 		mutex_lock(&hdcp->mutex);
 		hdcp->value = DRM_MODE_CONTENT_PROTECTION_DESIRED;
 		drm_connector_get(&connector->base);
-		schedule_work(&hdcp->prop_work);
+		queue_work(i915->unordered_wq, &hdcp->prop_work);
 		mutex_unlock(&hdcp->mutex);
 	}
 
@@ -2490,7 +2494,7 @@ void intel_hdcp_update_pipe(struct intel_atomic_state *state,
 		 */
 		if (!desired_and_not_enabled && !content_protection_type_changed) {
 			drm_connector_get(&connector->base);
-			schedule_work(&hdcp->prop_work);
+			queue_work(i915->unordered_wq, &hdcp->prop_work);
 		}
 	}
 
@@ -2602,6 +2606,7 @@ void intel_hdcp_atomic_check(struct drm_connector *connector,
 void intel_hdcp_handle_cp_irq(struct intel_connector *connector)
 {
 	struct intel_hdcp *hdcp = &connector->hdcp;
+	struct drm_i915_private *i915 = to_i915(connector->base.dev);
 
 	if (!hdcp->shim)
 		return;
@@ -2609,5 +2614,5 @@ void intel_hdcp_handle_cp_irq(struct intel_connector *connector)
 	atomic_inc(&connector->hdcp.cp_irq_count);
 	wake_up_all(&connector->hdcp.cp_irq_queue);
 
-	schedule_delayed_work(&hdcp->check_work, 0);
+	queue_delayed_work(i915->unordered_wq, &hdcp->check_work, 0);
 }
diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c
index 23a5e1a875f1..1160fa20433b 100644
--- a/drivers/gpu/drm/i915/display/intel_hotplug.c
+++ b/drivers/gpu/drm/i915/display/intel_hotplug.c
@@ -212,7 +212,8 @@ intel_hpd_irq_storm_switch_to_polling(struct drm_i915_private *dev_priv)
 	/* Enable polling and queue hotplug re-enabling. */
 	if (hpd_disabled) {
 		drm_kms_helper_poll_enable(&dev_priv->drm);
-		mod_delayed_work(system_wq, &dev_priv->display.hotplug.reenable_work,
+		mod_delayed_work(dev_priv->unordered_wq,
+				 &dev_priv->display.hotplug.reenable_work,
 				 msecs_to_jiffies(HPD_STORM_REENABLE_DELAY));
 	}
 }
@@ -339,7 +340,8 @@ static void i915_digport_work_func(struct work_struct *work)
 		spin_lock_irq(&dev_priv->irq_lock);
 		dev_priv->display.hotplug.event_bits |= old_bits;
 		spin_unlock_irq(&dev_priv->irq_lock);
-		queue_delayed_work(system_wq, &dev_priv->display.hotplug.hotplug_work, 0);
+		queue_delayed_work(dev_priv->unordered_wq,
+				   &dev_priv->display.hotplug.hotplug_work, 0);
 	}
 }
 
@@ -446,7 +448,8 @@ static void i915_hotplug_work_func(struct work_struct *work)
 		dev_priv->display.hotplug.retry_bits |= retry;
 		spin_unlock_irq(&dev_priv->irq_lock);
 
-		mod_delayed_work(system_wq, &dev_priv->display.hotplug.hotplug_work,
+		mod_delayed_work(dev_priv->unordered_wq,
+				 &dev_priv->display.hotplug.hotplug_work,
 				 msecs_to_jiffies(HPD_RETRY_DELAY));
 	}
 }
@@ -577,7 +580,8 @@ void intel_hpd_irq_handler(struct drm_i915_private *dev_priv,
 	if (queue_dig)
 		queue_work(dev_priv->display.hotplug.dp_wq, &dev_priv->display.hotplug.dig_port_work);
 	if (queue_hp)
-		queue_delayed_work(system_wq, &dev_priv->display.hotplug.hotplug_work, 0);
+		queue_delayed_work(dev_priv->unordered_wq,
+				   &dev_priv->display.hotplug.hotplug_work, 0);
 }
 
 /**
@@ -687,7 +691,8 @@ void intel_hpd_poll_enable(struct drm_i915_private *dev_priv)
 	 * As well, there's no issue if we race here since we always reschedule
 	 * this worker anyway
 	 */
-	schedule_work(&dev_priv->display.hotplug.poll_init_work);
+	queue_work(dev_priv->unordered_wq,
+		   &dev_priv->display.hotplug.poll_init_work);
 }
 
 /**
@@ -715,7 +720,8 @@ void intel_hpd_poll_disable(struct drm_i915_private *dev_priv)
 		return;
 
 	WRITE_ONCE(dev_priv->display.hotplug.poll_enabled, false);
-	schedule_work(&dev_priv->display.hotplug.poll_init_work);
+	queue_work(dev_priv->unordered_wq,
+		   &dev_priv->display.hotplug.poll_init_work);
 }
 
 void intel_hpd_init_early(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index b7973a05d022..84078fb82b2f 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -635,7 +635,8 @@ static void asle_work(struct work_struct *work)
 void intel_opregion_asle_intr(struct drm_i915_private *dev_priv)
 {
 	if (dev_priv->display.opregion.asle)
-		schedule_work(&dev_priv->display.opregion.asle_work);
+		queue_work(dev_priv->unordered_wq,
+			   &dev_priv->display.opregion.asle_work);
 }
 
 #define ACPI_EV_DISPLAY_SWITCH (1<<0)
diff --git a/drivers/gpu/drm/i915/display/intel_pps.c b/drivers/gpu/drm/i915/display/intel_pps.c
index 5e7ba594e7e7..73f0f1714b37 100644
--- a/drivers/gpu/drm/i915/display/intel_pps.c
+++ b/drivers/gpu/drm/i915/display/intel_pps.c
@@ -867,6 +867,7 @@ static void edp_panel_vdd_work(struct work_struct *__work)
 
 static void edp_panel_vdd_schedule_off(struct intel_dp *intel_dp)
 {
+	struct drm_i915_private *i915 = dp_to_i915(intel_dp);
 	unsigned long delay;
 
 	/*
@@ -882,7 +883,8 @@ static void edp_panel_vdd_schedule_off(struct intel_dp *intel_dp)
 	 * operations.
 	 */
 	delay = msecs_to_jiffies(intel_dp->pps.panel_power_cycle_delay * 5);
-	schedule_delayed_work(&intel_dp->pps.panel_vdd_work, delay);
+	queue_delayed_work(i915->unordered_wq,
+			   &intel_dp->pps.panel_vdd_work, delay);
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c
index ea0389c5f656..d58ed9b62e67 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -341,7 +341,7 @@ void intel_psr_irq_handler(struct intel_dp *intel_dp, u32 psr_iir)
 		 */
 		intel_de_rmw(dev_priv, imr_reg, 0, psr_irq_psr_error_bit_get(intel_dp));
 
-		schedule_work(&intel_dp->psr.work);
+		queue_work(dev_priv->unordered_wq, &intel_dp->psr.work);
 	}
 }
 
@@ -2440,6 +2440,8 @@ static void
 tgl_dc3co_flush_locked(struct intel_dp *intel_dp, unsigned int frontbuffer_bits,
 		       enum fb_op_origin origin)
 {
+	struct drm_i915_private *i915 = dp_to_i915(intel_dp);
+
 	if (!intel_dp->psr.dc3co_exitline || !intel_dp->psr.psr2_enabled ||
 	    !intel_dp->psr.active)
 		return;
@@ -2453,7 +2455,7 @@ tgl_dc3co_flush_locked(struct intel_dp *intel_dp, unsigned int frontbuffer_bits,
 		return;
 
 	tgl_psr2_enable_dc3co(intel_dp);
-	mod_delayed_work(system_wq, &intel_dp->psr.dc3co_work,
+	mod_delayed_work(i915->unordered_wq, &intel_dp->psr.dc3co_work,
 			 intel_dp->psr.dc3co_exit_delay);
 }
 
@@ -2493,7 +2495,7 @@ static void _psr_flush_handle(struct intel_dp *intel_dp)
 		psr_force_hw_tracking_exit(intel_dp);
 
 		if (!intel_dp->psr.active && !intel_dp->psr.busy_frontbuffer_bits)
-			schedule_work(&intel_dp->psr.work);
+			queue_work(dev_priv->unordered_wq, &intel_dp->psr.work);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 750326434677..2ebd937f3b4c 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2327,6 +2327,7 @@ static u32 active_ccid(struct intel_engine_cs *engine)
 
 static void execlists_capture(struct intel_engine_cs *engine)
 {
+	struct drm_i915_private *i915 = engine->i915;
 	struct execlists_capture *cap;
 
 	if (!IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR))
@@ -2375,7 +2376,7 @@ static void execlists_capture(struct intel_engine_cs *engine)
 		goto err_rq;
 
 	INIT_WORK(&cap->work, execlists_capture_work);
-	schedule_work(&cap->work);
+	queue_work(i915->unordered_wq, &cap->work);
 	return;
 
 err_rq:
@@ -3680,7 +3681,7 @@ static void virtual_context_destroy(struct kref *kref)
 	 * lock, we can delegate the free of the engine to an RCU worker.
 	 */
 	INIT_RCU_WORK(&ve->rcu, rcu_virtual_context_destroy);
-	queue_rcu_work(system_wq, &ve->rcu);
+	queue_rcu_work(ve->context.engine->i915->unordered_wq, &ve->rcu);
 }
 
 static void virtual_engine_initial_hint(struct virtual_engine *ve)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index cadfd85785b1..86b5a9ba323d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -88,10 +88,11 @@ static void pool_free_work(struct work_struct *wrk)
 {
 	struct intel_gt_buffer_pool *pool =
 		container_of(wrk, typeof(*pool), work.work);
+	struct intel_gt *gt = container_of(pool, struct intel_gt, buffer_pool);
 
 	if (pool_free_older_than(pool, HZ))
-		schedule_delayed_work(&pool->work,
-				      round_jiffies_up_relative(HZ));
+		queue_delayed_work(gt->i915->unordered_wq, &pool->work,
+				   round_jiffies_up_relative(HZ));
 }
 
 static void pool_retire(struct i915_active *ref)
@@ -99,6 +100,7 @@ static void pool_retire(struct i915_active *ref)
 	struct intel_gt_buffer_pool_node *node =
 		container_of(ref, typeof(*node), active);
 	struct intel_gt_buffer_pool *pool = node->pool;
+	struct intel_gt *gt = container_of(pool, struct intel_gt, buffer_pool);
 	struct list_head *list = bucket_for_size(pool, node->obj->base.size);
 	unsigned long flags;
 
@@ -116,8 +118,8 @@ static void pool_retire(struct i915_active *ref)
 	WRITE_ONCE(node->age, jiffies ?: 1); /* 0 reserved for active nodes */
 	spin_unlock_irqrestore(&pool->lock, flags);
 
-	schedule_delayed_work(&pool->work,
-			      round_jiffies_up_relative(HZ));
+	queue_delayed_work(gt->i915->unordered_wq, &pool->work,
+			   round_jiffies_up_relative(HZ));
 }
 
 void intel_gt_buffer_pool_mark_used(struct intel_gt_buffer_pool_node *node)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index 8f888d36f16d..62fd00c9e519 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -376,7 +376,7 @@ static void gen7_parity_error_irq_handler(struct intel_gt *gt, u32 iir)
 	if (iir & GT_RENDER_L3_PARITY_ERROR_INTERRUPT)
 		gt->i915->l3_parity.which_slice |= 1 << 0;
 
-	schedule_work(&gt->i915->l3_parity.error_work);
+	queue_work(gt->i915->unordered_wq, &gt->i915->l3_parity.error_work);
 }
 
 void gen6_gt_irq_handler(struct intel_gt *gt, u32 gt_iir)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 1dfd01668c79..d1a382dfaa1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -116,7 +116,7 @@ void intel_engine_add_retire(struct intel_engine_cs *engine,
 	GEM_BUG_ON(intel_engine_is_virtual(engine));
 
 	if (add_retire(engine, tl))
-		schedule_work(&engine->retire_work);
+		queue_work(engine->i915->unordered_wq, &engine->retire_work);
 }
 
 void intel_engine_init_retire(struct intel_engine_cs *engine)
@@ -207,8 +207,8 @@ static void retire_work_handler(struct work_struct *work)
 	struct intel_gt *gt =
 		container_of(work, typeof(*gt), requests.retire_work.work);
 
-	schedule_delayed_work(&gt->requests.retire_work,
-			      round_jiffies_up_relative(HZ));
+	queue_delayed_work(gt->i915->unordered_wq, &gt->requests.retire_work,
+			   round_jiffies_up_relative(HZ));
 	intel_gt_retire_requests(gt);
 }
 
@@ -224,8 +224,8 @@ void intel_gt_park_requests(struct intel_gt *gt)
 
 void intel_gt_unpark_requests(struct intel_gt *gt)
 {
-	schedule_delayed_work(&gt->requests.retire_work,
-			      round_jiffies_up_relative(HZ));
+	queue_delayed_work(gt->i915->unordered_wq, &gt->requests.retire_work,
+			   round_jiffies_up_relative(HZ));
 }
 
 void intel_gt_fini_requests(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 195ff72d7a14..e2152f75ba2e 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1625,7 +1625,7 @@ void __intel_init_wedge(struct intel_wedge_me *w,
 	w->name = name;
 
 	INIT_DELAYED_WORK_ONSTACK(&w->work, intel_wedge_me);
-	schedule_delayed_work(&w->work, timeout);
+	queue_delayed_work(gt->i915->unordered_wq, &w->work, timeout);
 }
 
 void __intel_fini_wedge(struct intel_wedge_me *w)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index e68a99205599..e92e626d4994 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -73,13 +73,14 @@ static void set(struct intel_uncore *uncore, i915_reg_t reg, u32 val)
 static void rps_timer(struct timer_list *t)
 {
 	struct intel_rps *rps = from_timer(rps, t, timer);
+	struct intel_gt *gt = rps_to_gt(rps);
 	struct intel_engine_cs *engine;
 	ktime_t dt, last, timestamp;
 	enum intel_engine_id id;
 	s64 max_busy[3] = {};
 
 	timestamp = 0;
-	for_each_engine(engine, rps_to_gt(rps), id) {
+	for_each_engine(engine, gt, id) {
 		s64 busy;
 		int i;
 
@@ -123,7 +124,7 @@ static void rps_timer(struct timer_list *t)
 
 			busy += div_u64(max_busy[i], 1 << i);
 		}
-		GT_TRACE(rps_to_gt(rps),
+		GT_TRACE(gt,
 			 "busy:%lld [%d%%], max:[%lld, %lld, %lld], interval:%d\n",
 			 busy, (int)div64_u64(100 * busy, dt),
 			 max_busy[0], max_busy[1], max_busy[2],
@@ -133,12 +134,12 @@ static void rps_timer(struct timer_list *t)
 		    rps->cur_freq < rps->max_freq_softlimit) {
 			rps->pm_iir |= GEN6_PM_RP_UP_THRESHOLD;
 			rps->pm_interval = 1;
-			schedule_work(&rps->work);
+			queue_work(gt->i915->unordered_wq, &rps->work);
 		} else if (100 * busy < rps->power.down_threshold * dt &&
 			   rps->cur_freq > rps->min_freq_softlimit) {
 			rps->pm_iir |= GEN6_PM_RP_DOWN_THRESHOLD;
 			rps->pm_interval = 1;
-			schedule_work(&rps->work);
+			queue_work(gt->i915->unordered_wq, &rps->work);
 		} else {
 			rps->last_adj = 0;
 		}
@@ -973,7 +974,7 @@ static int rps_set_boost_freq(struct intel_rps *rps, u32 val)
 	}
 	mutex_unlock(&rps->lock);
 	if (boost)
-		schedule_work(&rps->work);
+		queue_work(rps_to_gt(rps)->i915->unordered_wq, &rps->work);
 
 	return 0;
 }
@@ -1025,7 +1026,8 @@ void intel_rps_boost(struct i915_request *rq)
 			if (!atomic_fetch_inc(&slpc->num_waiters)) {
 				GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n",
 					 rq->fence.context, rq->fence.seqno);
-				schedule_work(&slpc->boost_work);
+				queue_work(rps_to_gt(rps)->i915->unordered_wq,
+					   &slpc->boost_work);
 			}
 
 			return;
@@ -1041,7 +1043,7 @@ void intel_rps_boost(struct i915_request *rq)
 			 rq->fence.context, rq->fence.seqno);
 
 		if (READ_ONCE(rps->cur_freq) < rps->boost_freq)
-			schedule_work(&rps->work);
+			queue_work(rps_to_gt(rps)->i915->unordered_wq, &rps->work);
 
 		WRITE_ONCE(rps->boosts, rps->boosts + 1); /* debug only */
 	}
@@ -1900,7 +1902,7 @@ void gen11_rps_irq_handler(struct intel_rps *rps, u32 pm_iir)
 	gen6_gt_pm_mask_irq(gt, events);
 
 	rps->pm_iir |= events;
-	schedule_work(&rps->work);
+	queue_work(gt->i915->unordered_wq, &rps->work);
 }
 
 void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir)
@@ -1917,7 +1919,7 @@ void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir)
 		gen6_gt_pm_mask_irq(gt, events);
 		rps->pm_iir |= events;
 
-		schedule_work(&rps->work);
+		queue_work(gt->i915->unordered_wq, &rps->work);
 		spin_unlock(gt->irq_lock);
 	}
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
index 542ce6d2de19..78cdfc6f315f 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
@@ -27,7 +27,7 @@ static void perf_begin(struct intel_gt *gt)
 
 	/* Boost gpufreq to max [waitboost] and keep it fixed */
 	atomic_inc(&gt->rps.num_waiters);
-	schedule_work(&gt->rps.work);
+	queue_work(gt->i915->unordered_wq, &gt->rps.work);
 	flush_work(&gt->rps.work);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index 37532e55327d..d7d77f435450 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -132,8 +132,20 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
 	if (dev_priv->display.hotplug.dp_wq == NULL)
 		goto out_free_wq;
 
+	/*
+	 * The unordered i915 workqueue should be used for all work
+	 * scheduling that do not require running in order, which used
+	 * to be scheduled on the system_wq before moving to a driver
+	 * instance due deprecation of flush_scheduled_work().
+	 */
+	dev_priv->unordered_wq = alloc_workqueue("i915-unordered", 0, 0);
+	if (dev_priv->unordered_wq == NULL)
+		goto out_free_dp_wq;
+
 	return 0;
 
+out_free_dp_wq:
+	destroy_workqueue(dev_priv->display.hotplug.dp_wq);
 out_free_wq:
 	destroy_workqueue(dev_priv->wq);
 out_err:
@@ -144,6 +156,7 @@ static int i915_workqueues_init(struct drm_i915_private *dev_priv)
 
 static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
 {
+	destroy_workqueue(dev_priv->unordered_wq);
 	destroy_workqueue(dev_priv->display.hotplug.dp_wq);
 	destroy_workqueue(dev_priv->wq);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f1205ed3ba71..f125f04b5f18 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -260,6 +260,16 @@ struct drm_i915_private {
 	 */
 	struct workqueue_struct *wq;
 
+	/**
+	 * unordered_wq - internal workqueue for unordered work
+	 *
+	 * This workqueue should be used for all unordered work
+	 * scheduling within i915, which used to be scheduled on the
+	 * system_wq before moving to a driver instance due
+	 * deprecation of flush_scheduled_work().
+	 */
+	struct workqueue_struct *unordered_wq;
+
 	/* pm private clock gating functions */
 	const struct drm_i915_clock_gating_funcs *clock_gating_funcs;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 630a732aaecc..894068bb37b6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -290,7 +290,7 @@ static enum hrtimer_restart __rq_watchdog_expired(struct hrtimer *hrtimer)
 
 	if (!i915_request_completed(rq)) {
 		if (llist_add(&rq->watchdog.link, &gt->watchdog.list))
-			schedule_work(&gt->watchdog.work);
+			queue_work(gt->i915->unordered_wq, &gt->watchdog.work);
 	} else {
 		i915_request_put(rq);
 	}
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index 40aafe676017..497ea21a347e 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -75,7 +75,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
 
 	/* Assume we are not in process context and so cannot sleep. */
 	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
-		mod_delayed_work(system_wq, &wf->work,
+		mod_delayed_work(wf->i915->wq, &wf->work,
 				 FIELD_GET(INTEL_WAKEREF_PUT_DELAY, flags));
 		return;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Intel-gfx] [PATCH v3 3/3] drm/i915/selftests: add local workqueue for SW fence selftest
  2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 1/3] drm/i915: use pointer to i915 instead of rpm in wakeref Luca Coelho
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 2/3] drm/i915: add a dedicated workqueue inside drm_i915_private Luca Coelho
@ 2023-05-30 11:15 ` Luca Coelho
  2023-05-31  6:05 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: implement internal workqueues (rev3) Patchwork
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Luca Coelho @ 2023-05-30 11:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: rodrigo.vivi

Instead of using a global workqueue for the SW fence selftest,
allocate a separate one temporarily only while running the test.

Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_sw_fence.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_sw_fence.c b/drivers/gpu/drm/i915/selftests/i915_sw_fence.c
index daa985e5a19b..8f5ce71fa453 100644
--- a/drivers/gpu/drm/i915/selftests/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/selftests/i915_sw_fence.c
@@ -523,12 +523,19 @@ static void task_ipc(struct work_struct *work)
 static int test_ipc(void *arg)
 {
 	struct task_ipc ipc;
+	struct workqueue_struct *wq;
 	int ret = 0;
 
+	wq = alloc_workqueue("i1915-selftest", 0, 0);
+	if (wq == NULL)
+		return -ENOMEM;
+
 	/* Test use of i915_sw_fence as an interprocess signaling mechanism */
 	ipc.in = alloc_fence();
-	if (!ipc.in)
-		return -ENOMEM;
+	if (!ipc.in) {
+		ret = -ENOMEM;
+		goto err_work;
+	}
 	ipc.out = alloc_fence();
 	if (!ipc.out) {
 		ret = -ENOMEM;
@@ -540,7 +547,7 @@ static int test_ipc(void *arg)
 
 	ipc.value = 0;
 	INIT_WORK_ONSTACK(&ipc.work, task_ipc);
-	schedule_work(&ipc.work);
+	queue_work(wq, &ipc.work);
 
 	wait_for_completion(&ipc.started);
 
@@ -563,6 +570,9 @@ static int test_ipc(void *arg)
 	free_fence(ipc.out);
 err_in:
 	free_fence(ipc.in);
+err_work:
+	destroy_workqueue(wq);
+
 	return ret;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: implement internal workqueues (rev3)
  2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
                   ` (2 preceding siblings ...)
  2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 3/3] drm/i915/selftests: add local workqueue for SW fence selftest Luca Coelho
@ 2023-05-31  6:05 ` Patchwork
  2023-05-31  6:05 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
  2023-05-31  6:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  5 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2023-05-31  6:05 UTC (permalink / raw)
  To: Luca Coelho; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: implement internal workqueues (rev3)
URL   : https://patchwork.freedesktop.org/series/117618/
State : warning

== Summary ==

Error: dim checkpatch failed
722468704c00 drm/i915: use pointer to i915 instead of rpm in wakeref
0b06c13ceb1d drm/i915: add a dedicated workqueue inside drm_i915_private
-:624: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!dev_priv->unordered_wq"
#624: FILE: drivers/gpu/drm/i915/i915_driver.c:142:
+	if (dev_priv->unordered_wq == NULL)

total: 0 errors, 0 warnings, 1 checks, 514 lines checked
03dd16579f0a drm/i915/selftests: add local workqueue for SW fence selftest
-:30: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!wq"
#30: FILE: drivers/gpu/drm/i915/selftests/i915_sw_fence.c:530:
+	if (wq == NULL)

total: 0 errors, 0 warnings, 1 checks, 38 lines checked



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: implement internal workqueues (rev3)
  2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
                   ` (3 preceding siblings ...)
  2023-05-31  6:05 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: implement internal workqueues (rev3) Patchwork
@ 2023-05-31  6:05 ` Patchwork
  2023-05-31  6:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  5 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2023-05-31  6:05 UTC (permalink / raw)
  To: Luca Coelho; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: implement internal workqueues (rev3)
URL   : https://patchwork.freedesktop.org/series/117618/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
                   ` (4 preceding siblings ...)
  2023-05-31  6:05 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2023-05-31  6:24 ` Patchwork
  2023-06-05 15:06   ` Jani Nikula
  5 siblings, 1 reply; 13+ messages in thread
From: Patchwork @ 2023-05-31  6:24 UTC (permalink / raw)
  To: Luca Coelho; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 6836 bytes --]

== Series Details ==

Series: drm/i915: implement internal workqueues (rev3)
URL   : https://patchwork.freedesktop.org/series/117618/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_13203 -> Patchwork_117618v3
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_117618v3 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_117618v3, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/index.html

Participating hosts (38 -> 37)
------------------------------

  Missing    (1): bat-mtlp-6 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_117618v3:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_close_race@basic-process:
    - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
    - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
    - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html

  * igt@i915_selftest@live@evict:
    - bat-adlp-9:         [PASS][7] -> [ABORT][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
    - bat-rpls-2:         [PASS][9] -> [ABORT][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
    - bat-adlm-1:         [PASS][11] -> [ABORT][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
    - bat-rpls-1:         [PASS][13] -> [ABORT][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@evict:
    - {bat-mtlp-8}:       [PASS][15] -> [ABORT][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-mtlp-8/igt@i915_selftest@live@evict.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-mtlp-8/igt@i915_selftest@live@evict.html

  
Known issues
------------

  Here are the changes found in Patchwork_117618v3 that come from known issues:

### CI changes ###

#### Issues hit ####

  * boot:
    - fi-kbl-8809g:       [PASS][17] -> [FAIL][18] ([i915#8293] / [i915#8298])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-kbl-8809g/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-kbl-8809g/boot.html

  

### IGT changes ###

#### Issues hit ####

  * igt@i915_module_load@reload:
    - fi-kbl-soraka:      [PASS][19] -> [DMESG-WARN][20] ([i915#1982] / [i915#8189]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-kbl-soraka/igt@i915_module_load@reload.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-kbl-soraka/igt@i915_module_load@reload.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-kbl-soraka:      [PASS][21] -> [DMESG-FAIL][22] ([i915#5334] / [i915#7872])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-skl-guc:         [DMESG-FAIL][23] ([i915#5334]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-skl-guc/igt@i915_selftest@live@gt_heartbeat.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-skl-guc/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@migrate:
    - {bat-mtlp-8}:       [DMESG-FAIL][25] ([i915#7699]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-mtlp-8/igt@i915_selftest@live@migrate.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-mtlp-8/igt@i915_selftest@live@migrate.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#4423]: https://gitlab.freedesktop.org/drm/intel/issues/4423
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#7699]: https://gitlab.freedesktop.org/drm/intel/issues/7699
  [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872
  [i915#8189]: https://gitlab.freedesktop.org/drm/intel/issues/8189
  [i915#8293]: https://gitlab.freedesktop.org/drm/intel/issues/8293
  [i915#8298]: https://gitlab.freedesktop.org/drm/intel/issues/8298


Build changes
-------------

  * Linux: CI_DRM_13203 -> Patchwork_117618v3

  CI-20190529: 20190529
  CI_DRM_13203: f01a870c3a528f64447611d03cf9bdba51b6d481 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7311: c031030f39aff973330668a5a2f1593408da78ae @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_117618v3: f01a870c3a528f64447611d03cf9bdba51b6d481 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

d305c6732492 drm/i915/selftests: add local workqueue for SW fence selftest
c45c28fe0aee drm/i915: add a dedicated workqueue inside drm_i915_private
4b0cbd6b9089 drm/i915: use pointer to i915 instead of rpm in wakeref

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/index.html

[-- Attachment #2: Type: text/html, Size: 7761 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-05-31  6:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2023-06-05 15:06   ` Jani Nikula
  2023-06-06 10:06     ` Tvrtko Ursulin
  0 siblings, 1 reply; 13+ messages in thread
From: Jani Nikula @ 2023-06-05 15:06 UTC (permalink / raw)
  To: Luca Coelho, Tvrtko Ursulin; +Cc: intel-gfx

On Wed, 31 May 2023, Patchwork <patchwork@emeril.freedesktop.org> wrote:
> #### Possible regressions ####
>
>   * igt@gem_close_race@basic-process:
>     - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
>     - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
>    [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
>     - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
>    [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
>    [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html
>
>   * igt@i915_selftest@live@evict:
>     - bat-adlp-9:         [PASS][7] -> [ABORT][8]
>    [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
>    [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
>     - bat-rpls-2:         [PASS][9] -> [ABORT][10]
>    [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
>    [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
>     - bat-adlm-1:         [PASS][11] -> [ABORT][12]
>    [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
>    [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
>     - bat-rpls-1:         [PASS][13] -> [ABORT][14]
>    [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
>    [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html

This still fails consistently, I have no clue why, and the above aren't
even remotely related to display.

What now? Tvrtko?

BR,
Jani.



-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-06-05 15:06   ` Jani Nikula
@ 2023-06-06 10:06     ` Tvrtko Ursulin
  2023-06-06 11:06       ` Coelho, Luciano
  0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2023-06-06 10:06 UTC (permalink / raw)
  To: Jani Nikula, Luca Coelho; +Cc: intel-gfx


On 05/06/2023 16:06, Jani Nikula wrote:
> On Wed, 31 May 2023, Patchwork <patchwork@emeril.freedesktop.org> wrote:
>> #### Possible regressions ####
>>
>>    * igt@gem_close_race@basic-process:
>>      - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
>>     [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
>>     [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
>>      - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
>>     [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
>>     [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
>>      - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
>>     [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
>>     [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html
>>
>>    * igt@i915_selftest@live@evict:
>>      - bat-adlp-9:         [PASS][7] -> [ABORT][8]
>>     [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
>>     [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
>>      - bat-rpls-2:         [PASS][9] -> [ABORT][10]
>>     [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
>>     [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
>>      - bat-adlm-1:         [PASS][11] -> [ABORT][12]
>>     [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
>>     [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
>>      - bat-rpls-1:         [PASS][13] -> [ABORT][14]
>>     [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
>>     [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html
> 
> This still fails consistently, I have no clue why, and the above aren't
> even remotely related to display.
> 
> What now? Tvrtko?

Hmm..

<4> [46.782321] Chain exists of:
   (wq_completion)i915 --> (work_completion)(&i915->mm.free_work) --> &vm->mutex
<4> [46.782329]  Possible unsafe locking scenario:
<4> [46.782332]        CPU0                    CPU1
<4> [46.782334]        ----                    ----
<4> [46.782337]   lock(&vm->mutex);
<4> [46.782340]                                lock((work_completion)(&i915->mm.free_work));
<4> [46.782344]                                lock(&vm->mutex);
<4> [46.782348]   lock((wq_completion)i915);


"(wq_completion)i915"

So it's not about the new wq even. Perhaps it is this hunk:

--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -75,7 +75,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
  
  	/* Assume we are not in process context and so cannot sleep. */
  	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
-		mod_delayed_work(system_wq, &wf->work,
+		mod_delayed_work(wf->i915->wq, &wf->work,

Transformation from this patch otherwise is system_wq with the new unordered wq, so I'd try that first.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-06-06 10:06     ` Tvrtko Ursulin
@ 2023-06-06 11:06       ` Coelho, Luciano
  2023-06-06 13:33         ` Tvrtko Ursulin
  0 siblings, 1 reply; 13+ messages in thread
From: Coelho, Luciano @ 2023-06-06 11:06 UTC (permalink / raw)
  To: tvrtko.ursulin@linux.intel.com, jani.nikula@linux.intel.com
  Cc: intel-gfx@lists.freedesktop.org

On Tue, 2023-06-06 at 11:06 +0100, Tvrtko Ursulin wrote:
> On 05/06/2023 16:06, Jani Nikula wrote:
> > On Wed, 31 May 2023, Patchwork <patchwork@emeril.freedesktop.org> wrote:
> > > #### Possible regressions ####
> > > 
> > >    * igt@gem_close_race@basic-process:
> > >      - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
> > >     [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
> > >     [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
> > >      - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
> > >     [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
> > >     [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
> > >      - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
> > >     [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
> > >     [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html
> > > 
> > >    * igt@i915_selftest@live@evict:
> > >      - bat-adlp-9:         [PASS][7] -> [ABORT][8]
> > >     [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
> > >     [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
> > >      - bat-rpls-2:         [PASS][9] -> [ABORT][10]
> > >     [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
> > >     [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
> > >      - bat-adlm-1:         [PASS][11] -> [ABORT][12]
> > >     [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
> > >     [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
> > >      - bat-rpls-1:         [PASS][13] -> [ABORT][14]
> > >     [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
> > >     [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html
> > 
> > This still fails consistently, I have no clue why, and the above aren't
> > even remotely related to display.
> > 
> > What now? Tvrtko?
> 
> Hmm..
> 
> <4> [46.782321] Chain exists of:
>    (wq_completion)i915 --> (work_completion)(&i915->mm.free_work) --> &vm->mutex
> <4> [46.782329]  Possible unsafe locking scenario:
> <4> [46.782332]        CPU0                    CPU1
> <4> [46.782334]        ----                    ----
> <4> [46.782337]   lock(&vm->mutex);
> <4> [46.782340]                                lock((work_completion)(&i915->mm.free_work));
> <4> [46.782344]                                lock(&vm->mutex);
> <4> [46.782348]   lock((wq_completion)i915);
> 
> 
> "(wq_completion)i915"
> 
> So it's not about the new wq even. Perhaps it is this hunk:
> 
> --- a/drivers/gpu/drm/i915/intel_wakeref.c
> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> @@ -75,7 +75,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
>   
>   	/* Assume we are not in process context and so cannot sleep. */
>   	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
> -		mod_delayed_work(system_wq, &wf->work,
> +		mod_delayed_work(wf->i915->wq, &wf->work,
> 
> Transformation from this patch otherwise is system_wq with the new unordered wq, so I'd try that first.

Indeed this seems to be exactly the block that is causing the issue.  I
was sort of bisecting through all these changes and reverting this one
prevents the lockdep splat from happening...

So there's something that needs to be synced with the system_wq here,
but what? I need to dig into it.

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-06-06 11:06       ` Coelho, Luciano
@ 2023-06-06 13:33         ` Tvrtko Ursulin
  2023-06-06 14:30           ` Coelho, Luciano
  0 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2023-06-06 13:33 UTC (permalink / raw)
  To: Coelho, Luciano, jani.nikula@linux.intel.com
  Cc: intel-gfx@lists.freedesktop.org


On 06/06/2023 12:06, Coelho, Luciano wrote:
> On Tue, 2023-06-06 at 11:06 +0100, Tvrtko Ursulin wrote:
>> On 05/06/2023 16:06, Jani Nikula wrote:
>>> On Wed, 31 May 2023, Patchwork <patchwork@emeril.freedesktop.org> wrote:
>>>> #### Possible regressions ####
>>>>
>>>>     * igt@gem_close_race@basic-process:
>>>>       - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
>>>>      [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
>>>>      [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
>>>>       - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
>>>>      [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
>>>>      [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
>>>>       - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
>>>>      [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
>>>>      [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html
>>>>
>>>>     * igt@i915_selftest@live@evict:
>>>>       - bat-adlp-9:         [PASS][7] -> [ABORT][8]
>>>>      [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
>>>>      [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
>>>>       - bat-rpls-2:         [PASS][9] -> [ABORT][10]
>>>>      [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
>>>>      [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
>>>>       - bat-adlm-1:         [PASS][11] -> [ABORT][12]
>>>>      [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
>>>>      [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
>>>>       - bat-rpls-1:         [PASS][13] -> [ABORT][14]
>>>>      [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
>>>>      [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html
>>>
>>> This still fails consistently, I have no clue why, and the above aren't
>>> even remotely related to display.
>>>
>>> What now? Tvrtko?
>>
>> Hmm..
>>
>> <4> [46.782321] Chain exists of:
>>     (wq_completion)i915 --> (work_completion)(&i915->mm.free_work) --> &vm->mutex
>> <4> [46.782329]  Possible unsafe locking scenario:
>> <4> [46.782332]        CPU0                    CPU1
>> <4> [46.782334]        ----                    ----
>> <4> [46.782337]   lock(&vm->mutex);
>> <4> [46.782340]                                lock((work_completion)(&i915->mm.free_work));
>> <4> [46.782344]                                lock(&vm->mutex);
>> <4> [46.782348]   lock((wq_completion)i915);
>>
>>
>> "(wq_completion)i915"
>>
>> So it's not about the new wq even. Perhaps it is this hunk:
>>
>> --- a/drivers/gpu/drm/i915/intel_wakeref.c
>> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
>> @@ -75,7 +75,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
>>    
>>    	/* Assume we are not in process context and so cannot sleep. */
>>    	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
>> -		mod_delayed_work(system_wq, &wf->work,
>> +		mod_delayed_work(wf->i915->wq, &wf->work,
>>
>> Transformation from this patch otherwise is system_wq with the new unordered wq, so I'd try that first.
> 
> Indeed this seems to be exactly the block that is causing the issue.  I
> was sort of bisecting through all these changes and reverting this one
> prevents the lockdep splat from happening...
> 
> So there's something that needs to be synced with the system_wq here,
> but what? I need to dig into it.

AFAICT it is saying that i915->mm.free_work and engine->wakeref.work 
must not be on the same ordered wq. Otherwise execbuf call trace 
flushing under vm->mutex can deadlock against the free worker trying to 
grab vm->mutex. If engine->wakeref.work is on a separate unordered wq it 
would be safe since then execution will not be serialized with the 
free_work. So just using the new i915->unordered_wq for this hunk should 
work.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-06-06 13:33         ` Tvrtko Ursulin
@ 2023-06-06 14:30           ` Coelho, Luciano
  2023-06-06 16:22             ` Coelho, Luciano
  0 siblings, 1 reply; 13+ messages in thread
From: Coelho, Luciano @ 2023-06-06 14:30 UTC (permalink / raw)
  To: tvrtko.ursulin@linux.intel.com, jani.nikula@linux.intel.com
  Cc: intel-gfx@lists.freedesktop.org

On Tue, 2023-06-06 at 14:33 +0100, Tvrtko Ursulin wrote:
> On 06/06/2023 12:06, Coelho, Luciano wrote:
> > On Tue, 2023-06-06 at 11:06 +0100, Tvrtko Ursulin wrote:
> > > On 05/06/2023 16:06, Jani Nikula wrote:
> > > > On Wed, 31 May 2023, Patchwork <patchwork@emeril.freedesktop.org> wrote:
> > > > > #### Possible regressions ####
> > > > > 
> > > > >     * igt@gem_close_race@basic-process:
> > > > >       - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
> > > > >      [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
> > > > >      [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
> > > > >       - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
> > > > >      [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
> > > > >      [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
> > > > >       - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
> > > > >      [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
> > > > >      [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html
> > > > > 
> > > > >     * igt@i915_selftest@live@evict:
> > > > >       - bat-adlp-9:         [PASS][7] -> [ABORT][8]
> > > > >      [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
> > > > >      [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
> > > > >       - bat-rpls-2:         [PASS][9] -> [ABORT][10]
> > > > >      [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
> > > > >      [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
> > > > >       - bat-adlm-1:         [PASS][11] -> [ABORT][12]
> > > > >      [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
> > > > >      [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
> > > > >       - bat-rpls-1:         [PASS][13] -> [ABORT][14]
> > > > >      [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
> > > > >      [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html
> > > > 
> > > > This still fails consistently, I have no clue why, and the above aren't
> > > > even remotely related to display.
> > > > 
> > > > What now? Tvrtko?
> > > 
> > > Hmm..
> > > 
> > > <4> [46.782321] Chain exists of:
> > >     (wq_completion)i915 --> (work_completion)(&i915->mm.free_work) --> &vm->mutex
> > > <4> [46.782329]  Possible unsafe locking scenario:
> > > <4> [46.782332]        CPU0                    CPU1
> > > <4> [46.782334]        ----                    ----
> > > <4> [46.782337]   lock(&vm->mutex);
> > > <4> [46.782340]                                lock((work_completion)(&i915->mm.free_work));
> > > <4> [46.782344]                                lock(&vm->mutex);
> > > <4> [46.782348]   lock((wq_completion)i915);
> > > 
> > > 
> > > "(wq_completion)i915"
> > > 
> > > So it's not about the new wq even. Perhaps it is this hunk:
> > > 
> > > --- a/drivers/gpu/drm/i915/intel_wakeref.c
> > > +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> > > @@ -75,7 +75,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
> > >    
> > >    	/* Assume we are not in process context and so cannot sleep. */
> > >    	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
> > > -		mod_delayed_work(system_wq, &wf->work,
> > > +		mod_delayed_work(wf->i915->wq, &wf->work,
> > > 
> > > Transformation from this patch otherwise is system_wq with the new unordered wq, so I'd try that first.
> > 
> > Indeed this seems to be exactly the block that is causing the issue.  I
> > was sort of bisecting through all these changes and reverting this one
> > prevents the lockdep splat from happening...
> > 
> > So there's something that needs to be synced with the system_wq here,
> > but what? I need to dig into it.
> 
> AFAICT it is saying that i915->mm.free_work and engine->wakeref.work 
> must not be on the same ordered wq. Otherwise execbuf call trace 
> flushing under vm->mutex can deadlock against the free worker trying to 
> grab vm->mutex. If engine->wakeref.work is on a separate unordered wq it 
> would be safe since then execution will not be serialized with the 
> free_work. So just using the new i915->unordered_wq for this hunk should 
> work.

Ah, great, thanks for the insight! I'll try it now and see how it goes.

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for drm/i915: implement internal workqueues (rev3)
  2023-06-06 14:30           ` Coelho, Luciano
@ 2023-06-06 16:22             ` Coelho, Luciano
  0 siblings, 0 replies; 13+ messages in thread
From: Coelho, Luciano @ 2023-06-06 16:22 UTC (permalink / raw)
  To: tvrtko.ursulin@linux.intel.com, jani.nikula@linux.intel.com
  Cc: intel-gfx@lists.freedesktop.org

On Tue, 2023-06-06 at 14:30 +0000, Coelho, Luciano wrote:
> On Tue, 2023-06-06 at 14:33 +0100, Tvrtko Ursulin wrote:
> > On 06/06/2023 12:06, Coelho, Luciano wrote:
> > > On Tue, 2023-06-06 at 11:06 +0100, Tvrtko Ursulin wrote:
> > > > On 05/06/2023 16:06, Jani Nikula wrote:
> > > > > On Wed, 31 May 2023, Patchwork <patchwork@emeril.freedesktop.org> wrote:
> > > > > > #### Possible regressions ####
> > > > > > 
> > > > > >     * igt@gem_close_race@basic-process:
> > > > > >       - fi-blb-e6850:       [PASS][1] -> [ABORT][2]
> > > > > >      [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-blb-e6850/igt@gem_close_race@basic-process.html
> > > > > >      [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-blb-e6850/igt@gem_close_race@basic-process.html
> > > > > >       - fi-hsw-4770:        [PASS][3] -> [ABORT][4]
> > > > > >      [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-hsw-4770/igt@gem_close_race@basic-process.html
> > > > > >      [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-hsw-4770/igt@gem_close_race@basic-process.html
> > > > > >       - fi-elk-e7500:       [PASS][5] -> [ABORT][6]
> > > > > >      [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/fi-elk-e7500/igt@gem_close_race@basic-process.html
> > > > > >      [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/fi-elk-e7500/igt@gem_close_race@basic-process.html
> > > > > > 
> > > > > >     * igt@i915_selftest@live@evict:
> > > > > >       - bat-adlp-9:         [PASS][7] -> [ABORT][8]
> > > > > >      [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlp-9/igt@i915_selftest@live@evict.html
> > > > > >      [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlp-9/igt@i915_selftest@live@evict.html
> > > > > >       - bat-rpls-2:         [PASS][9] -> [ABORT][10]
> > > > > >      [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-2/igt@i915_selftest@live@evict.html
> > > > > >      [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-2/igt@i915_selftest@live@evict.html
> > > > > >       - bat-adlm-1:         [PASS][11] -> [ABORT][12]
> > > > > >      [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-adlm-1/igt@i915_selftest@live@evict.html
> > > > > >      [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-adlm-1/igt@i915_selftest@live@evict.html
> > > > > >       - bat-rpls-1:         [PASS][13] -> [ABORT][14]
> > > > > >      [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13203/bat-rpls-1/igt@i915_selftest@live@evict.html
> > > > > >      [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_117618v3/bat-rpls-1/igt@i915_selftest@live@evict.html
> > > > > 
> > > > > This still fails consistently, I have no clue why, and the above aren't
> > > > > even remotely related to display.
> > > > > 
> > > > > What now? Tvrtko?
> > > > 
> > > > Hmm..
> > > > 
> > > > <4> [46.782321] Chain exists of:
> > > >     (wq_completion)i915 --> (work_completion)(&i915->mm.free_work) --> &vm->mutex
> > > > <4> [46.782329]  Possible unsafe locking scenario:
> > > > <4> [46.782332]        CPU0                    CPU1
> > > > <4> [46.782334]        ----                    ----
> > > > <4> [46.782337]   lock(&vm->mutex);
> > > > <4> [46.782340]                                lock((work_completion)(&i915->mm.free_work));
> > > > <4> [46.782344]                                lock(&vm->mutex);
> > > > <4> [46.782348]   lock((wq_completion)i915);
> > > > 
> > > > 
> > > > "(wq_completion)i915"
> > > > 
> > > > So it's not about the new wq even. Perhaps it is this hunk:
> > > > 
> > > > --- a/drivers/gpu/drm/i915/intel_wakeref.c
> > > > +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> > > > @@ -75,7 +75,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
> > > >    
> > > >    	/* Assume we are not in process context and so cannot sleep. */
> > > >    	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
> > > > -		mod_delayed_work(system_wq, &wf->work,
> > > > +		mod_delayed_work(wf->i915->wq, &wf->work,
> > > > 
> > > > Transformation from this patch otherwise is system_wq with the new unordered wq, so I'd try that first.
> > > 
> > > Indeed this seems to be exactly the block that is causing the issue.  I
> > > was sort of bisecting through all these changes and reverting this one
> > > prevents the lockdep splat from happening...
> > > 
> > > So there's something that needs to be synced with the system_wq here,
> > > but what? I need to dig into it.
> > 
> > AFAICT it is saying that i915->mm.free_work and engine->wakeref.work 
> > must not be on the same ordered wq. Otherwise execbuf call trace 
> > flushing under vm->mutex can deadlock against the free worker trying to 
> > grab vm->mutex. If engine->wakeref.work is on a separate unordered wq it 
> > would be safe since then execution will not be serialized with the 
> > free_work. So just using the new i915->unordered_wq for this hunk should 
> > work.
> 
> Ah, great, thanks for the insight! I'll try it now and see how it goes.

This works now.  It was quite obviously wrong, but I was completely
blind to it.  Thanks a lot for the catch, Tvrtko!

v4 coming in a sec.

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-06-06 16:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-30 11:15 [Intel-gfx] [PATCH v3 0/3] drm/i915: implement internal workqueues Luca Coelho
2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 1/3] drm/i915: use pointer to i915 instead of rpm in wakeref Luca Coelho
2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 2/3] drm/i915: add a dedicated workqueue inside drm_i915_private Luca Coelho
2023-05-30 11:15 ` [Intel-gfx] [PATCH v3 3/3] drm/i915/selftests: add local workqueue for SW fence selftest Luca Coelho
2023-05-31  6:05 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: implement internal workqueues (rev3) Patchwork
2023-05-31  6:05 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-05-31  6:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2023-06-05 15:06   ` Jani Nikula
2023-06-06 10:06     ` Tvrtko Ursulin
2023-06-06 11:06       ` Coelho, Luciano
2023-06-06 13:33         ` Tvrtko Ursulin
2023-06-06 14:30           ` Coelho, Luciano
2023-06-06 16:22             ` Coelho, Luciano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.