AMD-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Hamza Mahfooz <someguy@effective-light.com>,
	dri-devel@lists.freedesktop.org
Cc: "Michel Dänzer" <michel.daenzer@mailbox.org>,
	"Harry Wentland" <harry.wentland@amd.com>,
	"Leo Li" <sunpeng.li@amd.com>,
	"Rodrigo Siqueira" <siqueira@igalia.com>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"Mario Limonciello" <mario.limonciello@amd.com>,
	"Alex Hung" <alex.hung@amd.com>, "Wayne Lin" <Wayne.Lin@amd.com>,
	"Aurabindo Pillai" <aurabindo.pillai@amd.com>,
	"Ivan Lipski" <ivan.lipski@amd.com>,
	"Timur Kristóf" <timur.kristof@gmail.com>,
	"Dominik Kaszewski" <dominik.kaszewski@amd.com>,
	amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] drm: introduce KMS recovery mechanism
Date: Wed, 18 Feb 2026 10:31:24 +0100
Message-ID: <27b9da4f-bd6f-4426-a5fe-94e0afdbae70@amd.com>
In-Reply-To: <20260212230905.688006-1-someguy@effective-light.com>

On 2/13/26 00:08, Hamza Mahfooz wrote:
> There should be a mechanism for drivers to respond to flip_done
> timeouts. As it stands, the display can stall indefinitely,
> necessitating a hard reset. So, introduce a new mechanism that tries
> various methods of recovery with increasing aggression, in the
> following order:
> 
> 1. Force a full modeset (have the compositor reprogram the state from
>    scratch).
> 2. As a last resort, have the driver attempt a vendor specific reset
>    (which they can do by reading the return value of
>    drm_atomic_helper_wait_for_flip_done()).
> 
> Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
> ---
> v2: new to the series
> v3: get rid of page_flip_timeout() and have
>     drm_atomic_helper_wait_for_flip_done() return an error.
> ---
>  drivers/gpu/drm/drm_atomic_helper.c | 45 +++++++++++++++++++++++++----
>  include/drm/drm_atomic_helper.h     |  4 +--
>  include/drm/drm_device.h            | 24 +++++++++++++++
>  3 files changed, 66 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 5840e9cc6f66..6ae1234b9e20 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -42,6 +42,7 @@
>  #include <drm/drm_gem_atomic_helper.h>
>  #include <drm/drm_panic.h>
>  #include <drm/drm_print.h>
> +#include <drm/drm_probe_helper.h>
>  #include <drm/drm_self_refresh_helper.h>
>  #include <drm/drm_vblank.h>
>  #include <drm/drm_writeback.h>
> @@ -1864,11 +1865,15 @@ EXPORT_SYMBOL(drm_atomic_helper_wait_for_vblanks);
>   *
>   * This requires that drivers use the nonblocking commit tracking support
>   * initialized using drm_atomic_helper_setup_commit().
> + *
> + * Returns:
> + * -ETIMEDOUT to indicate that drivers can attempt a vendor reset, 0 otherwise.
>   */
> -void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> -					  struct drm_atomic_state *state)
> +int drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> +					 struct drm_atomic_state *state)
>  {
>  	struct drm_crtc *crtc;
> +	int ret = 0;
>  	int i;
>  
>  	for (i = 0; i < dev->mode_config.num_crtc; i++) {
> @@ -1881,13 +1886,43 @@ void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
>  			continue;
>  
>  		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
> -		if (ret == 0)
> -			drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
> -				crtc->base.id, crtc->name);
> +		if (!ret) {
> +			switch (dev->reset_phase) {
> +			case DRM_KMS_RESET_NONE:

That state machine is driver-specific, so it doesn't belong in DRM.

So please nuke it completely and just return the error code to the driver.

Regards,
Christian.

> +				drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
> +					crtc->base.id, crtc->name);
> +				dev->reset_phase = DRM_KMS_RESET_FORCE_MODESET;
> +				drm_kms_helper_hotplug_event(dev);
> +				break;
> +			case DRM_KMS_RESET_FORCE_MODESET:
> +				drm_err(dev, "[CRTC:%d:%s] force full modeset failed\n",
> +					crtc->base.id, crtc->name);
> +				dev->reset_phase = DRM_KMS_RESET_VENDOR;
> +				ret = -ETIMEDOUT;
> +				break;
> +			case DRM_KMS_RESET_VENDOR:
> +				drm_err(dev, "[CRTC:%d:%s] KMS recovery failed!\n",
> +					crtc->base.id, crtc->name);
> +				dev->reset_phase = DRM_KMS_RESET_GIVE_UP;
> +				break;
> +			default:
> +				break;
> +			}
> +
> +			goto exit;
> +		}
> +	}
> +
> +	if (dev->reset_phase) {
> +		drm_info(dev, "KMS recovery succeeded!\n");
> +		dev->reset_phase = DRM_KMS_RESET_NONE;
>  	}
>  
> +exit:
>  	if (state->fake_commit)
>  		complete_all(&state->fake_commit->flip_done);
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL(drm_atomic_helper_wait_for_flip_done);
>  
> diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h
> index 53382fe93537..298c8dff3993 100644
> --- a/include/drm/drm_atomic_helper.h
> +++ b/include/drm/drm_atomic_helper.h
> @@ -79,8 +79,8 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
>  void drm_atomic_helper_wait_for_vblanks(struct drm_device *dev,
>  					struct drm_atomic_state *old_state);
>  
> -void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> -					  struct drm_atomic_state *old_state);
> +int drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> +					 struct drm_atomic_state *old_state);
>  
>  void
>  drm_atomic_helper_update_legacy_modeset_state(struct drm_device *dev,
> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> index bc78fb77cc27..1244d7527e7b 100644
> --- a/include/drm/drm_device.h
> +++ b/include/drm/drm_device.h
> @@ -66,6 +66,23 @@ enum switch_power_state {
>  	DRM_SWITCH_POWER_DYNAMIC_OFF = 3,
>  };
>  
> +/**
> + * enum drm_kms_reset_phase - reset phase of drm device
> + */
> +enum drm_kms_reset_phase {
> +	/** @DRM_KMS_RESET_NONE: Not currently attempting recovery */
> +	DRM_KMS_RESET_NONE,
> +
> +	/** @DRM_KMS_RESET_FORCE_MODESET: Force a full modeset */
> +	DRM_KMS_RESET_FORCE_MODESET,
> +
> +	/** @DRM_KMS_RESET_VENDOR: Attempt a vendor reset */
> +	DRM_KMS_RESET_VENDOR,
> +
> +	/** @DRM_KMS_RESET_GIVE_UP: All recovery methods failed */
> +	DRM_KMS_RESET_GIVE_UP,
> +};
> +
>  /**
>   * struct drm_device - DRM device structure
>   *
> @@ -375,6 +392,13 @@ struct drm_device {
>  	 * Root directory for debugfs files.
>  	 */
>  	struct dentry *debugfs_root;
> +
> +	/**
> +	 * @reset_phase:
> +	 *
> +	 * Reset phase that the device is in.
> +	 */
> +	enum drm_kms_reset_phase reset_phase;
>  };
>  
>  void drm_dev_set_dma_dev(struct drm_device *dev, struct device *dma_dev);
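
To illustrate the review comment above, here is a minimal user-space sketch of
what "just return the error code to the driver" could look like: the helper
only reports a timeout, and the escalation policy lives in driver-private
state. All names here (drv_device, drv_reset_phase, drv_handle_flip_result,
mock_wait_for_flip_done) are hypothetical and are not part of the patch or of
DRM. The mock stands in for drm_atomic_helper_wait_for_flip_done() and is
assumed to return -ETIMEDOUT on every timeout, which differs from the v3 patch.

```c
#include <errno.h>

/* Hypothetical driver-private recovery phases (not the DRM enum). */
enum drv_reset_phase {
	DRV_RESET_NONE,
	DRV_RESET_FORCE_MODESET,
	DRV_RESET_VENDOR,
	DRV_RESET_GIVE_UP,
};

/* Hypothetical driver-private device state. */
struct drv_device {
	enum drv_reset_phase reset_phase;
};

/*
 * Stand-in for the core helper. It carries no policy: it only reports
 * whether the flip completed (0) or timed out (-ETIMEDOUT).
 */
static int mock_wait_for_flip_done(int flip_completed)
{
	return flip_completed ? 0 : -ETIMEDOUT;
}

/*
 * Driver-side policy: escalate one step per timeout and return to
 * DRV_RESET_NONE as soon as a flip completes. Returns the new phase.
 */
static enum drv_reset_phase drv_handle_flip_result(struct drv_device *drv,
						   int ret)
{
	if (ret != -ETIMEDOUT) {
		drv->reset_phase = DRV_RESET_NONE;
		return drv->reset_phase;
	}

	switch (drv->reset_phase) {
	case DRV_RESET_NONE:
		/* First timeout: ask for a full modeset. */
		drv->reset_phase = DRV_RESET_FORCE_MODESET;
		break;
	case DRV_RESET_FORCE_MODESET:
		/* Modeset did not help: try the vendor-specific reset. */
		drv->reset_phase = DRV_RESET_VENDOR;
		break;
	default:
		/* Vendor reset failed, or we had already given up. */
		drv->reset_phase = DRV_RESET_GIVE_UP;
		break;
	}

	return drv->reset_phase;
}
```

A driver's commit tail would call drv_handle_flip_result() with the helper's
return value and act on the returned phase. What each phase actually does
(hotplug event, hardware reset, and so on) is left out because it is
driver-specific.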


