Intel-XE Archive on lore.kernel.org
From: Raag Jadav <raag.jadav@intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>,
	intel-xe@lists.freedesktop.org, matthew.brost@intel.com,
	riana.tauro@intel.com, michal.wajdeczko@intel.com,
	matthew.d.roper@intel.com, lukasz.laguna@intel.com
Subject: Re: [PATCH v1] drm/xe: Send unknown recovery method for XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET
Date: Mon, 16 Feb 2026 11:30:17 +0100
Message-ID: <aZLxubuW7UrLaP1D@black.igk.intel.com>
In-Reply-To: <aY3dVn4pSgw-wNZD@intel.com>

On Thu, Feb 12, 2026 at 09:01:58AM -0500, Rodrigo Vivi wrote:
> On Thu, Feb 12, 2026 at 06:28:34AM +0100, Raag Jadav wrote:
> > On Wed, Feb 11, 2026 at 12:46:10PM -0500, Rodrigo Vivi wrote:
> > > On Fri, Feb 06, 2026 at 07:32:08AM +0100, Raag Jadav wrote:
> > > > On Thu, Feb 05, 2026 at 05:54:29PM -0500, Rodrigo Vivi wrote:
> > > > > On Thu, Feb 05, 2026 at 04:48:35PM +0530, Raag Jadav wrote:
> > > > > > XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET is intended for debugging hangs,
> > > > > > so wedge the device without any recovery method (unknown) and have it
> > > > > > available to the user for debugging.
> > > > > > 
> > > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/xe/xe_device.c | 9 ++++++++-
> > > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > > > index b1241fa4c3d6..815f0b0c9dfd 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > > > @@ -1326,8 +1326,15 @@ void xe_device_declare_wedged(struct xe_device *xe)
> > > > > >  		xe_gt_declare_wedged(gt);
> > > > > >  
> > > > > >  	if (xe_device_wedged(xe)) {
> > > > > > +		/*
> > > > > > +		 * XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET is intended for debugging hangs,
> > > > > > +		 * so wedge the device without any recovery method and have it available
> > > > > > +		 * to the user for debugging.
> > > > > 
> > > > > agree....
> > > > > 
> > > > > > +		 */
> > > > > > +		if (xe->wedged.mode == XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET)
> > > > > > +			xe_device_set_wedged_method(xe, 0);
> > > > > 
> > > > > but why not use the already defined:
> > > > > 
> > > > > #define DRM_WEDGE_RECOVERY_NONE    BIT(0)  /* optional telemetry collection */
> > > > 
> > > > We originally added this for the AMD use case, and strictly speaking
> > > > it doesn't mean 'wedged'.
> > > > 
> > > > Documentation/gpu/drm-uapi.rst +441
> > > > 
> > > > "The only exception to this is ``WEDGED=none``, which signifies that the device
> > > > was temporarily 'wedged' at some point but was recovered from driver context
> > > > using device specific methods like reset."
> > > 
> > > Well, so, why not change that to a more generic meaning then?!
> > > 
> > > 'none' should mean no recovery help is needed. Go away, user space.
> > > Regardless of whether it is temporary or permanent...
> > 
> > A few things,
> > 
> > 1. I'm doubtful if Christian will allow it since they've built a lot of
> > infrastructure around it.
> 
> there's only one way to know that...
> 
> Cc: Christian König <christian.koenig@amd.com>

What do you think, Chris? Any objections?

Raag

> > 2. "Debugging" != "go away userspace" IMO since we ultimately do need the
> > recovery, it just won't be automated.
> 
> exactly in my view: 'none' = 'no automation needed'
> 
> much easier and more meaningfully aligned than 'unknown'
> 
> > 
> > 3. I had debug cases in mind at the time and have already kept a provision
> > for them.
> > 
> > Documentation/gpu/drm-uapi.rst +533
> > 
> > "Consumers can also choose to have the device available for debugging or
> > telemetry collection and base their recovery decision on the findings.
> > This is useful especially when the driver is unsure about recovery or
> > method is unknown."
> 
> Okay, so perhaps we need to update that. Because in my view, the driver
> knows, and is pretty sure, that no automated recovery should take place in
> this case.
> 
> > > > > >  		/* If no wedge recovery method is set, use default */
> > > > > > -		if (!xe->wedged.method)
> > > > > > +		else if (!xe->wedged.method)
> > > > > >  			xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_REBIND |
> > > > > >  						    DRM_WEDGE_RECOVERY_BUS_RESET);
> > > > > >  
> > > > > > -- 
> > > > > > 2.43.0
> > > > > > 


Thread overview: 10+ messages
2026-02-05 11:18 [PATCH v1] drm/xe: Send unknown recovery method for XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET Raag Jadav
2026-02-05 12:43 ` ✓ CI.KUnit: success for " Patchwork
2026-02-05 13:22 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-05 22:54 ` [PATCH v1] " Rodrigo Vivi
2026-02-06  6:32   ` Raag Jadav
2026-02-11 17:46     ` Rodrigo Vivi
2026-02-12  5:28       ` Raag Jadav
2026-02-12 14:01         ` Rodrigo Vivi
2026-02-16 10:30           ` Raag Jadav [this message]
2026-02-06  9:53 ` ✗ Xe.CI.FULL: failure for " Patchwork
