public inbox for intel-xe@lists.freedesktop.org
From: Ville Syrjala <ville.syrjala@linux.intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	"Simona Vetter" <simona.vetter@ffwll.ch>,
	"Christian König" <christian.koenig@amd.com>,
	"Jani Nikula" <jani.nikula@intel.com>,
	"Jouni Högander" <jouni.hogander@intel.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>
Subject: [PATCH 5/6] drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence
Date: Thu,  9 Apr 2026 02:34:57 +0300
Message-ID: <20260408233458.22666-6-ville.syrjala@linux.intel.com>
In-Reply-To: <20260408233458.22666-1-ville.syrjala@linux.intel.com>

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

The old display vs. GPU reset deadlock is back, more or less.
The old (working) solution to the problem was originally
introduced in commit 9db529aac938 ("drm/i915: More surgically
unbreak the modeset vs reset deadlock"), but it got nuked with
commit d59cf7bb73f3 ("drm/i915/display: Use dma_fence interfaces
instead of i915_sw_fence").

Apparently no one looked hard enough to see that things didn't
work quite properly anymore. What is still saving us for the most
part is that we have a timeout on the fence wait
(CONFIG_DRM_I915_FENCE_TIMEOUT, 10 seconds by default). But
people are perhaps trying to get rid of that timeout, so we may
need another solution, and 10 seconds is a bit slow anyway.

Re-solve the problem yet again with a custom dma-fence that gets
signaled just prior to a GPU reset, and have the atomic commit wait
for either that or the real fence using dma_fence_wait_any_timeout().
Whichever signals first will let the commit proceed. We create a new
"reset fence" whenever someone needs one, and keep it until the next
GPU reset has completed. After that the next guy will again get a
fresh unsignaled "reset fence".
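
For illustration only, here is a rough userspace model of the
"reset fence" lifecycle described above. It is a sketch under
assumptions, not the driver code: every model_* name is hypothetical,
the mutex is omitted, and the blocking dma_fence_wait_any_timeout()
call is replaced by a non-blocking check for the first signaled fence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct dma_fence. */
struct model_fence {
	int refcount;
	bool signaled;
};

/* Hypothetical stand-in for display->reset (locking omitted). */
struct model_display {
	struct model_fence *reset_fence;
};

static struct model_fence *model_fence_get(struct model_fence *fence)
{
	if (fence)
		fence->refcount++;
	return fence;
}

static void model_fence_put(struct model_fence *fence)
{
	if (!fence)
		return;

	assert(fence->refcount > 0);
	if (--fence->refcount == 0)
		free(fence);
}

/* Create the reset fence on demand and hand out a new reference. */
static struct model_fence *model_reset_fence_get(struct model_display *display)
{
	if (!display->reset_fence) {
		display->reset_fence = calloc(1, sizeof(*display->reset_fence));
		if (display->reset_fence)
			display->reset_fence->refcount = 1;
	}

	return model_fence_get(display->reset_fence);
}

/* Just prior to the GPU reset: signal whatever reset fence exists. */
static void model_reset_prepare(struct model_display *display)
{
	struct model_fence *fence = model_reset_fence_get(display);

	if (fence) {
		fence->signaled = true;
		model_fence_put(fence);
	}
}

/* After the reset: drop the display's reference so the next get is fresh. */
static void model_reset_finish(struct model_display *display)
{
	model_fence_put(display->reset_fence);
	display->reset_fence = NULL;
}

/* Index of the first signaled fence, or -1 if none has signaled yet. */
static int model_first_signaled(struct model_fence **fences, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (fences[i] && fences[i]->signaled)
			return (int)i;
	}

	return -1;
}
```

In this model a commit holding { plane fence, reset fence } sees
nothing signaled while the GPU is hung, sees index 1 once
model_reset_prepare() has run, and the first
model_reset_fence_get() after model_reset_finish() returns a
new unsignaled fence.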

Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Christian König <christian.koenig@amd.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Jouni Högander <jouni.hogander@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c  | 34 +++++---
 .../gpu/drm/i915/display/intel_display_core.h |  6 ++
 .../drm/i915/display/intel_display_driver.c   |  5 ++
 .../drm/i915/display/intel_display_reset.c    | 77 +++++++++++++++++++
 .../drm/i915/display/intel_display_reset.h    |  4 +
 drivers/gpu/drm/xe/Makefile                   |  1 +
 6 files changed, 117 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 58a654ca0d20..83ccf13c4b16 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -72,6 +72,7 @@
 #include "intel_display_driver.h"
 #include "intel_display_power.h"
 #include "intel_display_regs.h"
+#include "intel_display_reset.h"
 #include "intel_display_rpm.h"
 #include "intel_display_types.h"
 #include "intel_display_utils.h"
@@ -7149,22 +7150,35 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
 
 static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state)
 {
-	struct drm_plane *plane;
+	struct intel_display *display = to_intel_display(intel_state);
 	struct drm_plane_state *new_plane_state;
-	long ret;
+	struct dma_fence *reset_fence;
+	struct drm_plane *plane;
 	int i;
 
+	reset_fence = intel_display_reset_fence_get(display);
+
 	for_each_new_plane_in_state(&intel_state->base, plane, new_plane_state, i) {
-		if (new_plane_state->fence) {
-			ret = dma_fence_wait_timeout(new_plane_state->fence, false,
-						     i915_fence_timeout());
-			if (ret <= 0)
-				break;
+		struct dma_fence *fences[2] = {
+			[0] = new_plane_state->fence,
+			[1] = reset_fence,
+		};
+		long ret;
 
-			dma_fence_put(new_plane_state->fence);
-			new_plane_state->fence = NULL;
-		}
+		if (!new_plane_state->fence)
+			continue;
+
+		ret = dma_fence_wait_any_timeout(fences, reset_fence ? 2 : 1, false,
+						 i915_fence_timeout(), NULL);
+		if (ret <= 0)
+			break;
+
+		dma_fence_put(new_plane_state->fence);
+		new_plane_state->fence = NULL;
 	}
+
+	if (reset_fence)
+		dma_fence_put(reset_fence);
 }
 
 static void intel_atomic_dsb_wait_commit(struct intel_crtc_state *crtc_state)
diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h b/drivers/gpu/drm/i915/display/intel_display_core.h
index 9e77003addd0..6687b658c51d 100644
--- a/drivers/gpu/drm/i915/display/intel_display_core.h
+++ b/drivers/gpu/drm/i915/display/intel_display_core.h
@@ -556,6 +556,12 @@ struct intel_display {
 		unsigned long mask;
 	} quirks;
 
+	struct {
+		/* protects reset.fence */
+		struct mutex mutex;
+		struct dma_fence *fence;
+	} reset;
+
 	struct {
 		/* restore state for suspend/resume and display reset */
 		struct drm_atomic_state *modeset_state;
diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c
index 23bfecc983e8..fcd31722c731 100644
--- a/drivers/gpu/drm/i915/display/intel_display_driver.c
+++ b/drivers/gpu/drm/i915/display/intel_display_driver.c
@@ -34,6 +34,7 @@
 #include "intel_display_driver.h"
 #include "intel_display_irq.h"
 #include "intel_display_power.h"
+#include "intel_display_reset.h"
 #include "intel_display_types.h"
 #include "intel_display_utils.h"
 #include "intel_display_wa.h"
@@ -257,6 +258,8 @@ int intel_display_driver_probe_noirq(struct intel_display *display)
 
 	intel_mode_config_init(display);
 
+	intel_display_reset_fence_init(display);
+
 	ret = intel_cdclk_init(display);
 	if (ret)
 		goto cleanup_wq_unordered;
@@ -584,6 +587,8 @@ void intel_display_driver_remove(struct intel_display *display)
 	if (!HAS_DISPLAY(display))
 		return;
 
+	intel_display_reset_fence_discard(display);
+
 	flush_workqueue(display->wq.flip);
 	flush_workqueue(display->wq.modeset);
 	flush_workqueue(display->wq.cleanup);
diff --git a/drivers/gpu/drm/i915/display/intel_display_reset.c b/drivers/gpu/drm/i915/display/intel_display_reset.c
index ca15dc18ef0f..80dd2ea8a0c2 100644
--- a/drivers/gpu/drm/i915/display/intel_display_reset.c
+++ b/drivers/gpu/drm/i915/display/intel_display_reset.c
@@ -3,6 +3,8 @@
  * Copyright © 2023 Intel Corporation
  */
 
+#include <linux/dma-fence.h>
+
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_print.h>
 
@@ -16,6 +18,72 @@
 #include "intel_hotplug.h"
 #include "intel_pps.h"
 
+static const char *intel_display_reset_fence_get_driver_name(struct dma_fence *fence)
+{
+	return "intel_display";
+}
+
+static const char *intel_display_reset_fence_get_timeline_name(struct dma_fence *fence)
+{
+	return "reset";
+}
+
+static const struct dma_fence_ops intel_display_reset_fence_ops = {
+	.get_driver_name = intel_display_reset_fence_get_driver_name,
+	.get_timeline_name = intel_display_reset_fence_get_timeline_name,
+};
+
+static void intel_display_reset_create(struct intel_display *display)
+{
+	struct dma_fence *fence;
+
+	fence = kzalloc_obj(*fence);
+	if (!fence)
+		return;
+
+	dma_fence_init(fence, &intel_display_reset_fence_ops, NULL, 0, 0);
+
+	display->reset.fence = fence;
+}
+
+struct dma_fence *intel_display_reset_fence_get(struct intel_display *display)
+{
+	struct dma_fence *fence;
+
+	mutex_lock(&display->reset.mutex);
+
+	if (!display->reset.fence)
+		intel_display_reset_create(display);
+
+	fence = display->reset.fence;
+	if (fence)
+		dma_fence_get(fence);
+
+	mutex_unlock(&display->reset.mutex);
+
+	return fence;
+}
+
+void intel_display_reset_fence_discard(struct intel_display *display)
+{
+	struct dma_fence *fence;
+
+	mutex_lock(&display->reset.mutex);
+
+	fence = display->reset.fence;
+	if (fence)
+		dma_fence_put(fence);
+
+	display->reset.fence = NULL;
+
+	mutex_unlock(&display->reset.mutex);
+}
+
+void intel_display_reset_fence_init(struct intel_display *display)
+{
+	mutex_init(&display->reset.mutex);
+}
+
 bool intel_display_reset_supported(struct intel_display *display)
 {
 	return HAS_DISPLAY(display);
@@ -31,8 +99,15 @@ void intel_display_reset_prepare(struct intel_display *display)
 {
 	struct drm_modeset_acquire_ctx *ctx = &display->restore.reset_ctx;
 	struct drm_atomic_state *state;
+	struct dma_fence *reset_fence;
 	int ret;
 
+	reset_fence = intel_display_reset_fence_get(display);
+	if (reset_fence) {
+		dma_fence_signal(reset_fence);
+		dma_fence_put(reset_fence);
+	}
+
 	/*
 	 * Need mode_config.mutex so that we don't
 	 * trample ongoing ->detect() and whatnot.
@@ -110,6 +185,8 @@ void intel_display_reset_finish(struct intel_display *display, bool test_only)
 
 	drm_atomic_state_put(state);
 unlock:
+	intel_display_reset_fence_discard(display);
+
 	drm_modeset_drop_locks(ctx);
 	drm_modeset_acquire_fini(ctx);
 	mutex_unlock(&display->drm->mode_config.mutex);
diff --git a/drivers/gpu/drm/i915/display/intel_display_reset.h b/drivers/gpu/drm/i915/display/intel_display_reset.h
index a8aa7729d33f..c36a075c6b4d 100644
--- a/drivers/gpu/drm/i915/display/intel_display_reset.h
+++ b/drivers/gpu/drm/i915/display/intel_display_reset.h
@@ -10,6 +10,10 @@
 
 struct intel_display;
 
+struct dma_fence *intel_display_reset_fence_get(struct intel_display *display);
+void intel_display_reset_fence_discard(struct intel_display *display);
+void intel_display_reset_fence_init(struct intel_display *display);
+
 bool intel_display_reset_supported(struct intel_display *display);
 bool intel_display_reset_test(struct intel_display *display);
 void intel_display_reset_prepare(struct intel_display *display);
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 110fef511fe2..1a85dfe457f0 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -262,6 +262,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
 	i915-display/intel_display_power.o \
 	i915-display/intel_display_power_map.o \
 	i915-display/intel_display_power_well.o \
+	i915-display/intel_display_reset.o \
 	i915-display/intel_display_rpm.o \
 	i915-display/intel_display_rps.o \
 	i915-display/intel_display_trace.o \
-- 
2.52.0


Thread overview: 18+ messages
2026-04-08 23:34 [PATCH 0/6] drm/i915/reset: Solve display vs. GPU reset deadlock, again Ville Syrjala
2026-04-08 23:34 ` [PATCH 1/6] dma-buf: Remove old lies about dma_fence_wait_any_timeout() not accepting some fences Ville Syrjala
2026-04-09  8:09   ` Jani Nikula
2026-04-09 10:39   ` Christian König
2026-04-08 23:34 ` [PATCH 2/6] drm/i915/reset: Reorganize display reset code Ville Syrjala
2026-04-09  8:13   ` Jani Nikula
2026-04-08 23:34 ` [PATCH 3/6] drm/i915/reset: Move pending_fb_pin handling to i915 Ville Syrjala
2026-04-09  8:17   ` Jani Nikula
2026-04-08 23:34 ` [PATCH 4/6] drm/xe/display: Add init_clock_gating.h stubs Ville Syrjala
2026-04-09  8:19   ` Jani Nikula
2026-04-08 23:34 ` Ville Syrjala [this message]
2026-04-09 10:37   ` [PATCH 5/6] drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence Jani Nikula
2026-04-09 10:46   ` Christian König
2026-04-09 11:19     ` Ville Syrjälä
2026-04-09 12:17       ` Christian König
2026-04-08 23:34 ` [PATCH 6/6] drm/i915/display: Make fence timeout infinite Ville Syrjala
2026-04-09 10:51   ` Jani Nikula
2026-04-08 23:42 ` ✗ CI.KUnit: failure for drm/i915/reset: Solve display vs. GPU reset deadlock, again Patchwork
