From: Ville Syrjala <ville.syrjala@linux.intel.com>
To: intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
"Simona Vetter" <simona.vetter@ffwll.ch>,
"Christian König" <christian.koenig@amd.com>,
"Jani Nikula" <jani.nikula@intel.com>,
"Jouni Högander" <jouni.hogander@intel.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>
Subject: [PATCH 5/6] drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence
Date: Thu, 9 Apr 2026 02:34:57 +0300
Message-ID: <20260408233458.22666-6-ville.syrjala@linux.intel.com>
In-Reply-To: <20260408233458.22666-1-ville.syrjala@linux.intel.com>
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

The old display vs. GPU reset deadlock is back, more or less.
The old (working) solution to the problem was originally
introduced in commit 9db529aac938 ("drm/i915: More surgically
unbreak the modeset vs reset deadlock"), but it got nuked with
commit d59cf7bb73f3 ("drm/i915/display: Use dma_fence interfaces
instead of i915_sw_fence").

Apparently no one looked hard enough to see that things didn't
work quite properly anymore. What still saves us for the most
part is the timeout on the fence wait
(CONFIG_DRM_I915_FENCE_TIMEOUT, 10 seconds by default). But
people are perhaps trying to get rid of that timeout, so we may
need another solution, and 10 seconds is a bit slow anyway.

Re-solve the problem yet again with a custom dma-fence that gets
signaled just prior to a GPU reset, and have the atomic commit wait
for either that or the real fence using dma_fence_wait_any_timeout().
Whichever signals first will let the commit proceed. We create a new
"reset fence" whenever someone needs one, and keep it until the next
GPU reset has completed. After that the next caller will again get a
fresh unsignaled "reset fence".

Cc: Simona Vetter <simona.vetter@ffwll.ch>
Cc: Christian König <christian.koenig@amd.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Jouni Högander <jouni.hogander@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
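Not part of the patch: a minimal userspace sketch of the scheme
described in the commit message, for readers who want to see the
lifecycle in isolation. It is a single-threaded simulation, not kernel
code. Every toy_* name is hypothetical, "ticks" stand in for jiffies,
and the polling loop only imitates the return convention of
dma_fence_wait_any_timeout() (remaining time on success, 0 on timeout).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_fence: a refcount plus a flag. */
struct toy_fence {
	int refcount;
	bool signaled;
};

/* Hypothetical stand-in for display->reset; single-threaded, so no mutex. */
struct toy_display {
	struct toy_fence *reset_fence;
	int ticks; /* simulated time, used by the demo only */
};

static void toy_fence_put(struct toy_fence *f)
{
	if (f && --f->refcount == 0)
		free(f);
}

/* Create the reset fence on demand and return an extra reference to it. */
static struct toy_fence *toy_reset_fence_get(struct toy_display *d)
{
	if (!d->reset_fence) {
		d->reset_fence = calloc(1, sizeof(*d->reset_fence));
		if (d->reset_fence)
			d->reset_fence->refcount = 1; /* held by the slot */
	}
	if (d->reset_fence)
		d->reset_fence->refcount++;
	return d->reset_fence;
}

/* A reset is about to start: signal the fence so that waiters bail out. */
static void toy_reset_prepare(struct toy_display *d)
{
	struct toy_fence *f = toy_reset_fence_get(d);

	if (f) {
		f->signaled = true;
		toy_fence_put(f);
	}
}

/* The reset is done: drop the slot's reference so the next get is fresh. */
static void toy_reset_finish(struct toy_display *d)
{
	toy_fence_put(d->reset_fence);
	d->reset_fence = NULL;
}

/*
 * Poll for up to `timeout` ticks, calling tick() between polls to let the
 * simulated world advance. Returns the remaining ticks (> 0) as soon as
 * any fence has signaled, or 0 on timeout.
 */
static long toy_wait_any_timeout(struct toy_fence **fences, size_t count,
				 long timeout, void (*tick)(void *), void *data)
{
	for (long left = timeout; left > 0; left--) {
		for (size_t i = 0; i < count; i++)
			if (fences[i]->signaled)
				return left;
		tick(data);
	}
	return 0;
}

/* In this demo the "GPU reset" kicks in on the third tick. */
static void toy_tick(void *data)
{
	struct toy_display *d = data;

	if (++d->ticks == 3)
		toy_reset_prepare(d);
}

/* A commit waits on a plane fence that never signals (hung GPU). */
static long toy_commit_wait_demo(void)
{
	struct toy_display display = { 0 };
	struct toy_fence plane_fence = { .refcount = 1 };
	struct toy_fence *fences[2] = {
		&plane_fence, toy_reset_fence_get(&display),
	};
	long ret;

	ret = toy_wait_any_timeout(fences, fences[1] ? 2 : 1, 10,
				   toy_tick, &display);
	toy_fence_put(fences[1]);
	toy_reset_finish(&display);
	return ret;
}
```

In this model toy_commit_wait_demo() returns a positive value well
before the 10 tick timeout, because the reset fence signals even
though the plane fence never does. The real code differs in the
obvious ways: it sleeps instead of polling, and it takes
display->reset.mutex around the fence slot.
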
drivers/gpu/drm/i915/display/intel_display.c | 34 +++++---
.../gpu/drm/i915/display/intel_display_core.h | 6 ++
.../drm/i915/display/intel_display_driver.c | 5 ++
.../drm/i915/display/intel_display_reset.c | 77 +++++++++++++++++++
.../drm/i915/display/intel_display_reset.h | 4 +
drivers/gpu/drm/xe/Makefile | 1 +
6 files changed, 117 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 58a654ca0d20..83ccf13c4b16 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -72,6 +72,7 @@
#include "intel_display_driver.h"
#include "intel_display_power.h"
#include "intel_display_regs.h"
+#include "intel_display_reset.h"
#include "intel_display_rpm.h"
#include "intel_display_types.h"
#include "intel_display_utils.h"
@@ -7149,22 +7150,35 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state)
{
- struct drm_plane *plane;
+ struct intel_display *display = to_intel_display(intel_state);
struct drm_plane_state *new_plane_state;
- long ret;
+ struct dma_fence *reset_fence;
+ struct drm_plane *plane;
int i;
+ reset_fence = intel_display_reset_fence_get(display);
+
for_each_new_plane_in_state(&intel_state->base, plane, new_plane_state, i) {
- if (new_plane_state->fence) {
- ret = dma_fence_wait_timeout(new_plane_state->fence, false,
- i915_fence_timeout());
- if (ret <= 0)
- break;
+ struct dma_fence *fences[2] = {
+ [0] = new_plane_state->fence,
+ [1] = reset_fence,
+ };
+ long ret;
- dma_fence_put(new_plane_state->fence);
- new_plane_state->fence = NULL;
- }
+ if (!new_plane_state->fence)
+ continue;
+
+ ret = dma_fence_wait_any_timeout(fences, reset_fence ? 2 : 1, false,
+ i915_fence_timeout(), NULL);
+ if (ret <= 0)
+ break;
+
+ dma_fence_put(new_plane_state->fence);
+ new_plane_state->fence = NULL;
}
+
+ if (reset_fence)
+ dma_fence_put(reset_fence);
}
static void intel_atomic_dsb_wait_commit(struct intel_crtc_state *crtc_state)
diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h b/drivers/gpu/drm/i915/display/intel_display_core.h
index 9e77003addd0..6687b658c51d 100644
--- a/drivers/gpu/drm/i915/display/intel_display_core.h
+++ b/drivers/gpu/drm/i915/display/intel_display_core.h
@@ -556,6 +556,12 @@ struct intel_display {
unsigned long mask;
} quirks;
+ struct {
+ /* protects reset.fence */
+ struct mutex mutex;
+ struct dma_fence *fence;
+ } reset;
+
struct {
/* restore state for suspend/resume and display reset */
struct drm_atomic_state *modeset_state;
diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c
index 23bfecc983e8..fcd31722c731 100644
--- a/drivers/gpu/drm/i915/display/intel_display_driver.c
+++ b/drivers/gpu/drm/i915/display/intel_display_driver.c
@@ -34,6 +34,7 @@
#include "intel_display_driver.h"
#include "intel_display_irq.h"
#include "intel_display_power.h"
+#include "intel_display_reset.h"
#include "intel_display_types.h"
#include "intel_display_utils.h"
#include "intel_display_wa.h"
@@ -257,6 +258,8 @@ int intel_display_driver_probe_noirq(struct intel_display *display)
intel_mode_config_init(display);
+ intel_display_reset_fence_init(display);
+
ret = intel_cdclk_init(display);
if (ret)
goto cleanup_wq_unordered;
@@ -584,6 +587,8 @@ void intel_display_driver_remove(struct intel_display *display)
if (!HAS_DISPLAY(display))
return;
+ intel_display_reset_fence_discard(display);
+
flush_workqueue(display->wq.flip);
flush_workqueue(display->wq.modeset);
flush_workqueue(display->wq.cleanup);
diff --git a/drivers/gpu/drm/i915/display/intel_display_reset.c b/drivers/gpu/drm/i915/display/intel_display_reset.c
index ca15dc18ef0f..80dd2ea8a0c2 100644
--- a/drivers/gpu/drm/i915/display/intel_display_reset.c
+++ b/drivers/gpu/drm/i915/display/intel_display_reset.c
@@ -3,6 +3,8 @@
* Copyright © 2023 Intel Corporation
*/
+#include <linux/dma-fence.h>
+
#include <drm/drm_atomic_helper.h>
#include <drm/drm_print.h>
@@ -16,6 +18,72 @@
#include "intel_hotplug.h"
#include "intel_pps.h"
+static const char *intel_display_reset_fence_get_driver_name(struct dma_fence *fence)
+{
+ return "intel_display";
+}
+
+static const char *intel_display_reset_fence_get_timeline_name(struct dma_fence *fence)
+{
+ return "reset";
+}
+
+static const struct dma_fence_ops intel_display_reset_fence_ops = {
+ .get_driver_name = intel_display_reset_fence_get_driver_name,
+ .get_timeline_name = intel_display_reset_fence_get_timeline_name,
+};
+
+static void intel_display_reset_create(struct intel_display *display)
+{
+ struct dma_fence *fence;
+
+ fence = kzalloc_obj(*fence);
+ if (!fence)
+ return;
+
+ dma_fence_init(fence, &intel_display_reset_fence_ops, NULL, 0, 0);
+
+ display->reset.fence = fence;
+}
+
+struct dma_fence *intel_display_reset_fence_get(struct intel_display *display)
+{
+ struct dma_fence *fence;
+
+ mutex_lock(&display->reset.mutex);
+
+ if (!display->reset.fence)
+ intel_display_reset_create(display);
+
+ fence = display->reset.fence;
+ if (fence)
+ dma_fence_get(fence);
+
+ mutex_unlock(&display->reset.mutex);
+
+ return fence;
+}
+
+void intel_display_reset_fence_discard(struct intel_display *display)
+{
+ struct dma_fence *fence;
+
+ mutex_lock(&display->reset.mutex);
+
+ fence = display->reset.fence;
+ if (fence)
+ dma_fence_put(fence);
+
+ display->reset.fence = NULL;
+
+ mutex_unlock(&display->reset.mutex);
+}
+
+void intel_display_reset_fence_init(struct intel_display *display)
+{
+ mutex_init(&display->reset.mutex);
+}
+
bool intel_display_reset_supported(struct intel_display *display)
{
return HAS_DISPLAY(display);
@@ -31,8 +99,15 @@ void intel_display_reset_prepare(struct intel_display *display)
{
struct drm_modeset_acquire_ctx *ctx = &display->restore.reset_ctx;
struct drm_atomic_state *state;
+ struct dma_fence *reset_fence;
int ret;
+ reset_fence = intel_display_reset_fence_get(display);
+ if (reset_fence) {
+ dma_fence_signal(reset_fence);
+ dma_fence_put(reset_fence);
+ }
+
/*
* Need mode_config.mutex so that we don't
* trample ongoing ->detect() and whatnot.
@@ -110,6 +185,8 @@ void intel_display_reset_finish(struct intel_display *display, bool test_only)
drm_atomic_state_put(state);
unlock:
+ intel_display_reset_fence_discard(display);
+
drm_modeset_drop_locks(ctx);
drm_modeset_acquire_fini(ctx);
mutex_unlock(&display->drm->mode_config.mutex);
diff --git a/drivers/gpu/drm/i915/display/intel_display_reset.h b/drivers/gpu/drm/i915/display/intel_display_reset.h
index a8aa7729d33f..c36a075c6b4d 100644
--- a/drivers/gpu/drm/i915/display/intel_display_reset.h
+++ b/drivers/gpu/drm/i915/display/intel_display_reset.h
@@ -10,6 +10,10 @@
struct intel_display;
+struct dma_fence *intel_display_reset_fence_get(struct intel_display *display);
+void intel_display_reset_fence_discard(struct intel_display *display);
+void intel_display_reset_fence_init(struct intel_display *display);
+
bool intel_display_reset_supported(struct intel_display *display);
bool intel_display_reset_test(struct intel_display *display);
void intel_display_reset_prepare(struct intel_display *display);
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 110fef511fe2..1a85dfe457f0 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -262,6 +262,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
i915-display/intel_display_power.o \
i915-display/intel_display_power_map.o \
i915-display/intel_display_power_well.o \
+ i915-display/intel_display_reset.o \
i915-display/intel_display_rpm.o \
i915-display/intel_display_rps.o \
i915-display/intel_display_trace.o \
--
2.52.0
Thread overview: 18+ messages
2026-04-08 23:34 [PATCH 0/6] drm/i915/reset: Solve display vs. GPU reset deadlock, again Ville Syrjala
2026-04-08 23:34 ` [PATCH 1/6] dma-buf: Remove old lies about dma_fence_wait_any_timeout() not accepting some fences Ville Syrjala
2026-04-09 8:09 ` Jani Nikula
2026-04-09 10:39 ` Christian König
2026-04-08 23:34 ` [PATCH 2/6] drm/i915/reset: Reorganize display reset code Ville Syrjala
2026-04-09 8:13 ` Jani Nikula
2026-04-08 23:34 ` [PATCH 3/6] drm/i915/reset: Move pending_fb_pin handling to i915 Ville Syrjala
2026-04-09 8:17 ` Jani Nikula
2026-04-08 23:34 ` [PATCH 4/6] drm/xe/display: Add init_clock_gating.h stubs Ville Syrjala
2026-04-09 8:19 ` Jani Nikula
2026-04-08 23:34 ` Ville Syrjala [this message]
2026-04-09 10:37 ` [PATCH 5/6] drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence Jani Nikula
2026-04-09 10:46 ` Christian König
2026-04-09 11:19 ` Ville Syrjälä
2026-04-09 12:17 ` Christian König
2026-04-08 23:34 ` [PATCH 6/6] drm/i915/display: Make fence timeout infinite Ville Syrjala
2026-04-09 10:51 ` Jani Nikula
2026-04-08 23:42 ` ✗ CI.KUnit: failure for drm/i915/reset: Solve display vs. GPU reset deadlock, again Patchwork