From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: "Raag Jadav" <raag.jadav@intel.com>,
"Christian König" <christian.koenig@amd.com>
Cc: <intel-xe@lists.freedesktop.org>, <matthew.brost@intel.com>,
<riana.tauro@intel.com>, <michal.wajdeczko@intel.com>,
<matthew.d.roper@intel.com>, <lukasz.laguna@intel.com>
Subject: Re: [PATCH v1] drm/xe: Send unknown recovery method for XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET
Date: Thu, 12 Feb 2026 09:01:58 -0500
Message-ID: <aY3dVn4pSgw-wNZD@intel.com>
In-Reply-To: <aY1lApm0_ZDMoyK5@black.igk.intel.com>
On Thu, Feb 12, 2026 at 06:28:34AM +0100, Raag Jadav wrote:
> On Wed, Feb 11, 2026 at 12:46:10PM -0500, Rodrigo Vivi wrote:
> > On Fri, Feb 06, 2026 at 07:32:08AM +0100, Raag Jadav wrote:
> > > On Thu, Feb 05, 2026 at 05:54:29PM -0500, Rodrigo Vivi wrote:
> > > > On Thu, Feb 05, 2026 at 04:48:35PM +0530, Raag Jadav wrote:
> > > > > XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET is intended for debugging hangs,
> > > > > so wedge the device without any recovery method (unknown) and have it
> > > > > available to the user for debugging.
> > > > >
> > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_device.c | 9 ++++++++-
> > > > > 1 file changed, 8 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > > index b1241fa4c3d6..815f0b0c9dfd 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > > @@ -1326,8 +1326,15 @@ void xe_device_declare_wedged(struct xe_device *xe)
> > > > > xe_gt_declare_wedged(gt);
> > > > >
> > > > > if (xe_device_wedged(xe)) {
> > > > > + /*
> > > > > + * XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET is intended for debugging hangs,
> > > > > + * so wedge the device without any recovery method and have it available
> > > > > + * to the user for debugging.
> > > >
> > > > agree....
> > > >
> > > > > + */
> > > > > + if (xe->wedged.mode == XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET)
> > > > > + xe_device_set_wedged_method(xe, 0);
> > > >
> > > > > but why not use the already defined:
> > > >
> > > > #define DRM_WEDGE_RECOVERY_NONE BIT(0) /* optional telemetry collection */
> > >
> > > > We originally added this for the AMD use case, and strictly speaking it
> > > > doesn't mean 'wedged'.
> > >
> > > Documentation/gpu/drm-uapi.rst +441
> > >
> > > "The only exception to this is ``WEDGED=none``, which signifies that the device
> > > was temporarily 'wedged' at some point but was recovered from driver context
> > > using device specific methods like reset."
> >
> > > Well, so, why not change that to a more generic meaning then?!
> >
> > 'none' should mean, no recovery help is needed. go away user space.
> > regardless if it is temporary or permanent...
>
> A few things,
>
> 1. I'm doubtful if Christian will allow it since they've built a lot of
> infrastructure around it.
there's only one way to know that...
Cc: Christian König <christian.koenig@amd.com>
>
> 2. "Debugging" != "go away userspace" IMO since we ultimately do need the
> recovery, it just won't be automated.
exactly in my view: 'none' = 'no automation needed'
much easier and more meaningfully aligned than 'unknown'
>
> 3. I had debug cases in mind at the time and have already kept a provision
> for them.
>
> Documentation/gpu/drm-uapi.rst +533
>
> "Consumers can also choose to have the device available for debugging or
> telemetry collection and base their recovery decision on the findings.
> This is useful especially when the driver is unsure about recovery or
> method is unknown."
Okay, so perhaps we need to update that. Because in my view, the driver knows,
and is pretty sure, that no automated recovery should take place in this
case.
>
> Raag
>
> > > > > /* If no wedge recovery method is set, use default */
> > > > > - if (!xe->wedged.method)
> > > > > + else if (!xe->wedged.method)
> > > > > xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_REBIND |
> > > > > DRM_WEDGE_RECOVERY_BUS_RESET);
> > > > >
> > > > > --
> > > > > 2.43.0
> > > > >