From: Jani Nikula <jani.nikula@intel.com>
To: Ville Syrjala <ville.syrjala@linux.intel.com>,
intel-gfx@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
"Simona Vetter" <simona.vetter@ffwll.ch>,
"Christian König" <christian.koenig@amd.com>,
"Jouni Högander" <jouni.hogander@intel.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>
Subject: Re: [PATCH 5/6] drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence
Date: Thu, 09 Apr 2026 13:37:56 +0300
Message-ID: <b914af7aed6d2d19fbb12e7a4f06b705a7f597d1@intel.com>
In-Reply-To: <20260408233458.22666-6-ville.syrjala@linux.intel.com>
On Thu, 09 Apr 2026, Ville Syrjala <ville.syrjala@linux.intel.com> wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> The old display vs. GPU reset deadlock is back more or less.
> The old (working) solution to the problem was originally
> introduced in commit 9db529aac938 ("drm/i915: More surgically
> unbreak the modeset vs reset deadlock"), but it got nuked with
> commit d59cf7bb73f3 ("drm/i915/display: Use dma_fence interfaces
> instead of i915_sw_fence").
>
> Apparently no one looked hard enough to see that things didn't
> work quite properly anymore. What is still saving us for the most
> part is that we have a timeout on the fence wait
> (CONFIG_DRM_I915_FENCE_TIMEOUT, 10 seconds by default). But
> people are perhaps trying to get rid of that so we may need
> another solution, and 10 seconds is a bit slow.
>
> Re-solve the problem yet again with a custom dma-fence that gets
> signaled just prior to a GPU reset, and have the atomic commit wait
> for either that or the real fence using dma_fence_wait_any_timeout().
> Whichever signals first will let the commit proceed. We create a new
> "reset fence" whenever someone needs one, and keep it until the next
> GPU reset has completed. After that the next guy will again get a
> fresh unsignaled "reset fence".
>
> Cc: Simona Vetter <simona.vetter@ffwll.ch>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Jouni Högander <jouni.hogander@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
This all makes sense to me, but I'd like to solicit additional review from
Simona, Christian and/or Maarten.
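
As a reading aid, the "wait for either fence, whichever signals first"
idea from the commit message can be sketched in userspace C. This is
only an illustration under stated assumptions, not the kernel code:
struct toy_fence, toy_fence_signal(), toy_fence_wait_any() and
toy_demo() are hypothetical names, pthreads stands in for the dma-fence
machinery, and a never-signaled "real" fence plays the part of a
hung GPU.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical userspace stand-in for a dma-fence: just a flag. */
struct toy_fence {
	bool signaled;
};

/* One lock and one condition variable shared by every toy fence. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t toy_cond = PTHREAD_COND_INITIALIZER;

static void toy_fence_signal(struct toy_fence *fence)
{
	pthread_mutex_lock(&toy_lock);
	fence->signaled = true;
	pthread_cond_broadcast(&toy_cond);
	pthread_mutex_unlock(&toy_lock);
}

/*
 * Wait until any fence in the array is signaled or timeout_ms elapses.
 * Returns the index of the first signaled fence found, -1 on timeout.
 */
static int toy_fence_wait_any(struct toy_fence **fences, size_t count,
			      long timeout_ms)
{
	struct timespec deadline;
	bool expired = false;
	int ret = -1;
	size_t i;

	assert(count > 0);

	/* Default condvars measure their deadline against CLOCK_REALTIME. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&toy_lock);
	for (;;) {
		for (i = 0; i < count; i++) {
			if (fences[i]->signaled) {
				ret = (int)i;
				goto out;
			}
		}
		/* Re-check the flags once after the deadline, then give up. */
		if (expired)
			break;
		if (pthread_cond_timedwait(&toy_cond, &toy_lock, &deadline))
			expired = true;
	}
out:
	pthread_mutex_unlock(&toy_lock);
	return ret;
}

/* Plays the role of the reset path: signal the reset fence. */
static void *toy_reset_thread(void *arg)
{
	toy_fence_signal(arg);
	return NULL;
}

/* Returns the index of the fence that let the "commit" proceed. */
static int toy_demo(void)
{
	struct toy_fence real = { false };	/* never signals: "hung GPU" */
	struct toy_fence reset = { false };
	struct toy_fence *fences[2] = { &real, &reset };
	pthread_t thread;
	int ret;

	pthread_create(&thread, NULL, toy_reset_thread, &reset);
	ret = toy_fence_wait_any(fences, 2, 5000);
	pthread_join(thread, NULL);

	return ret;
}
```

In this sketch toy_demo() returns 1: the waiter is released by the
reset fence even though the real fence never signals. Passing a count
of 1 mirrors the patch's fallback when no reset fence could be
allocated, in which case only the timeout bounds the wait.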
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display.c | 34 +++++---
> .../gpu/drm/i915/display/intel_display_core.h | 6 ++
> .../drm/i915/display/intel_display_driver.c | 5 ++
> .../drm/i915/display/intel_display_reset.c | 77 +++++++++++++++++++
> .../drm/i915/display/intel_display_reset.h | 4 +
> drivers/gpu/drm/xe/Makefile | 1 +
> 6 files changed, 117 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 58a654ca0d20..83ccf13c4b16 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -72,6 +72,7 @@
> #include "intel_display_driver.h"
> #include "intel_display_power.h"
> #include "intel_display_regs.h"
> +#include "intel_display_reset.h"
> #include "intel_display_rpm.h"
> #include "intel_display_types.h"
> #include "intel_display_utils.h"
> @@ -7149,22 +7150,35 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>
> static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state)
> {
> - struct drm_plane *plane;
> + struct intel_display *display = to_intel_display(intel_state);
> struct drm_plane_state *new_plane_state;
> - long ret;
> + struct dma_fence *reset_fence;
> + struct drm_plane *plane;
> int i;
>
> + reset_fence = intel_display_reset_fence_get(display);
> +
> for_each_new_plane_in_state(&intel_state->base, plane, new_plane_state, i) {
> - if (new_plane_state->fence) {
> - ret = dma_fence_wait_timeout(new_plane_state->fence, false,
> - i915_fence_timeout());
> - if (ret <= 0)
> - break;
> + struct dma_fence *fences[2] = {
> + [0] = new_plane_state->fence,
> + [1] = reset_fence,
> + };
> + long ret;
>
> - dma_fence_put(new_plane_state->fence);
> - new_plane_state->fence = NULL;
> - }
> + if (!new_plane_state->fence)
> + continue;
> +
> + ret = dma_fence_wait_any_timeout(fences, reset_fence ? 2 : 1, false,
> + i915_fence_timeout(), NULL);
> + if (ret <= 0)
> + break;
> +
> + dma_fence_put(new_plane_state->fence);
> + new_plane_state->fence = NULL;
> }
> +
> + if (reset_fence)
> + dma_fence_put(reset_fence);
> }
>
> static void intel_atomic_dsb_wait_commit(struct intel_crtc_state *crtc_state)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_core.h b/drivers/gpu/drm/i915/display/intel_display_core.h
> index 9e77003addd0..6687b658c51d 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_core.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_core.h
> @@ -556,6 +556,12 @@ struct intel_display {
> unsigned long mask;
> } quirks;
>
> + struct {
> + /* protects reset.fence */
> + struct mutex mutex;
> + struct dma_fence *fence;
> + } reset;
> +
> struct {
> /* restore state for suspend/resume and display reset */
> struct drm_atomic_state *modeset_state;
> diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c
> index 23bfecc983e8..fcd31722c731 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_driver.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_driver.c
> @@ -34,6 +34,7 @@
> #include "intel_display_driver.h"
> #include "intel_display_irq.h"
> #include "intel_display_power.h"
> +#include "intel_display_reset.h"
> #include "intel_display_types.h"
> #include "intel_display_utils.h"
> #include "intel_display_wa.h"
> @@ -257,6 +258,8 @@ int intel_display_driver_probe_noirq(struct intel_display *display)
>
> intel_mode_config_init(display);
>
> + intel_display_reset_fence_init(display);
> +
> ret = intel_cdclk_init(display);
> if (ret)
> goto cleanup_wq_unordered;
> @@ -584,6 +587,8 @@ void intel_display_driver_remove(struct intel_display *display)
> if (!HAS_DISPLAY(display))
> return;
>
> + intel_display_reset_fence_discard(display);
> +
> flush_workqueue(display->wq.flip);
> flush_workqueue(display->wq.modeset);
> flush_workqueue(display->wq.cleanup);
> diff --git a/drivers/gpu/drm/i915/display/intel_display_reset.c b/drivers/gpu/drm/i915/display/intel_display_reset.c
> index ca15dc18ef0f..80dd2ea8a0c2 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_reset.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_reset.c
> @@ -3,6 +3,8 @@
> * Copyright © 2023 Intel Corporation
> */
>
> +#include <linux/dma-fence.h>
> +
> #include <drm/drm_atomic_helper.h>
> #include <drm/drm_print.h>
>
> @@ -16,6 +18,72 @@
> #include "intel_hotplug.h"
> #include "intel_pps.h"
>
> +static const char *intel_display_reset_fence_get_driver_name(struct dma_fence *fence)
> +{
> + return "intel_display";
> +}
> +
> +static const char *intel_display_reset_fence_get_timeline_name(struct dma_fence *fence)
> +{
> + return "reset";
> +}
> +
> +static const struct dma_fence_ops intel_display_reset_fence_ops = {
> + .get_driver_name = intel_display_reset_fence_get_driver_name,
> + .get_timeline_name = intel_display_reset_fence_get_timeline_name,
> +};
> +
> +static void intel_display_reset_create(struct intel_display *display)
> +{
> + struct dma_fence *fence;
> +
> + fence = kzalloc_obj(*fence);
> + if (!fence)
> + return;
> +
> + dma_fence_init(fence, &intel_display_reset_fence_ops, NULL, 0, 0);
> +
> + display->reset.fence = fence;
> +}
> +
> +struct dma_fence *intel_display_reset_fence_get(struct intel_display *display)
> +{
> + struct dma_fence *fence;
> +
> + mutex_lock(&display->reset.mutex);
> +
> + if (!display->reset.fence)
> + intel_display_reset_create(display);
> +
> + fence = display->reset.fence;
> + if (fence)
> + dma_fence_get(fence);
> +
> + mutex_unlock(&display->reset.mutex);
> +
> + return fence;
> +}
> +
> +void intel_display_reset_fence_discard(struct intel_display *display)
> +{
> + struct dma_fence *fence;
> +
> + mutex_lock(&display->reset.mutex);
> +
> + fence = display->reset.fence;
> + if (fence)
> + dma_fence_put(fence);
> +
> + display->reset.fence = NULL;
> +
> + mutex_unlock(&display->reset.mutex);
> +}
> +
> +void intel_display_reset_fence_init(struct intel_display *display)
> +{
> + mutex_init(&display->reset.mutex);
> +}
> +
> bool intel_display_reset_supported(struct intel_display *display)
> {
> return HAS_DISPLAY(display);
> @@ -31,8 +99,15 @@ void intel_display_reset_prepare(struct intel_display *display)
> {
> struct drm_modeset_acquire_ctx *ctx = &display->restore.reset_ctx;
> struct drm_atomic_state *state;
> + struct dma_fence *reset_fence;
> int ret;
>
> + reset_fence = intel_display_reset_fence_get(display);
> + if (reset_fence) {
> + dma_fence_signal(reset_fence);
> + dma_fence_put(reset_fence);
> + }
> +
> /*
> * Need mode_config.mutex so that we don't
> * trample ongoing ->detect() and whatnot.
> @@ -110,6 +185,8 @@ void intel_display_reset_finish(struct intel_display *display, bool test_only)
>
> drm_atomic_state_put(state);
> unlock:
> + intel_display_reset_fence_discard(display);
> +
> drm_modeset_drop_locks(ctx);
> drm_modeset_acquire_fini(ctx);
> mutex_unlock(&display->drm->mode_config.mutex);
> diff --git a/drivers/gpu/drm/i915/display/intel_display_reset.h b/drivers/gpu/drm/i915/display/intel_display_reset.h
> index a8aa7729d33f..c36a075c6b4d 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_reset.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_reset.h
> @@ -10,6 +10,10 @@
>
> struct intel_display;
>
> +struct dma_fence *intel_display_reset_fence_get(struct intel_display *display);
> +void intel_display_reset_fence_discard(struct intel_display *display);
> +void intel_display_reset_fence_init(struct intel_display *display);
> +
> bool intel_display_reset_supported(struct intel_display *display);
> bool intel_display_reset_test(struct intel_display *display);
> void intel_display_reset_prepare(struct intel_display *display);
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 110fef511fe2..1a85dfe457f0 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -262,6 +262,7 @@ xe-$(CONFIG_DRM_XE_DISPLAY) += \
> i915-display/intel_display_power.o \
> i915-display/intel_display_power_map.o \
> i915-display/intel_display_power_well.o \
> + i915-display/intel_display_reset.o \
> i915-display/intel_display_rpm.o \
> i915-display/intel_display_rps.o \
> i915-display/intel_display_trace.o \
--
Jani Nikula, Intel
Thread overview: 17+ messages
2026-04-08 23:34 [PATCH 0/6] drm/i915/reset: Solve display vs. GPU reset deadlock, again Ville Syrjala
2026-04-08 23:34 ` [PATCH 1/6] dma-buf: Remove old lies about dma_fence_wait_any_timeout() not accepting some fences Ville Syrjala
2026-04-09 8:09 ` Jani Nikula
2026-04-09 10:39 ` Christian König
2026-04-08 23:34 ` [PATCH 2/6] drm/i915/reset: Reorganize display reset code Ville Syrjala
2026-04-09 8:13 ` Jani Nikula
2026-04-08 23:34 ` [PATCH 3/6] drm/i915/reset: Move pending_fb_pin handling to i915 Ville Syrjala
2026-04-09 8:17 ` Jani Nikula
2026-04-08 23:34 ` [PATCH 4/6] drm/xe/display: Add init_clock_gating.h stubs Ville Syrjala
2026-04-09 8:19 ` Jani Nikula
2026-04-08 23:34 ` [PATCH 5/6] drm/i915/reset: Handle the display vs. GPU reset deadlock using a custom dma-fence Ville Syrjala
2026-04-09 10:37 ` Jani Nikula [this message]
2026-04-09 10:46 ` Christian König
2026-04-09 11:19 ` Ville Syrjälä
2026-04-09 12:17 ` Christian König
2026-04-08 23:34 ` [PATCH 6/6] drm/i915/display: Make fence timeout infinite Ville Syrjala
2026-04-09 10:51 ` Jani Nikula