From: Raag Jadav <raag.jadav@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: airlied@gmail.com, daniel@ffwll.ch, lucas.demarchi@intel.com,
thomas.hellstrom@linux.intel.com, rodrigo.vivi@intel.com,
jani.nikula@linux.intel.com, joonas.lahtinen@linux.intel.com,
tursulin@ursulin.net, intel-xe@lists.freedesktop.org,
intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
himal.prasad.ghimiray@intel.com, francois.dugast@intel.com,
aravind.iddamsetty@linux.intel.com, anshuman.gupta@intel.com
Subject: Re: [PATCH v4 1/3] drm: Introduce device wedged event
Date: Tue, 10 Sep 2024 18:49:00 +0300
Message-ID: <ZuBqbFA8_d0khPCY@black.fi.intel.com>
In-Reply-To: <20240909215323.GC5774@mdroper-desk1.amr.corp.intel.com>

On Mon, Sep 09, 2024 at 02:53:23PM -0700, Matt Roper wrote:
> On Fri, Sep 06, 2024 at 03:12:23PM +0530, Raag Jadav wrote:
> > Introduce device wedged event, which will notify userspace of wedged
> > (hanged/unusable) state of the DRM device through a uevent. This is
> > useful especially in cases where the device is in unrecoverable state
> > and requires userspace intervention for recovery.
> >
> > Purpose of this implementation is to be vendor agnostic. Userspace
> > consumers (sysadmin) can define udev rules to parse this event and
> > take respective action to recover the device.
> >
> > Consumer expectations:
> > ----------------------
> > 1) Unbind driver
> > 2) Reset bus device
> > 3) Re-bind driver
> >
> > v4: s/drm_dev_wedged/drm_dev_wedged_event
> > Use drm_info() (Jani)
> > Kernel doc adjustment (Aravind)
> >
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> > drivers/gpu/drm/drm_drv.c | 20 ++++++++++++++++++++
> > include/drm/drm_drv.h | 1 +
> > 2 files changed, 21 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 93543071a500..cca5d8295eb7 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -499,6 +499,26 @@ void drm_dev_unplug(struct drm_device *dev)
> > }
> > EXPORT_SYMBOL(drm_dev_unplug);
> >
> > +/**
> > + * drm_dev_wedged_event - generate a device wedged uevent
> > + * @dev: DRM device
> > + *
> > + * This generates a device wedged uevent for the DRM device specified by @dev,
> > + * on the basis of which, userspace may take respective action to recover the
> > + * device. Currently we only set WEDGED=1 in the uevent environment, but this
> > + * can be expanded in the future.
>
> Just to clarify, is "wedged" intended to always mean "the entire device
> is unusable" or are there cases where it would also get sent if only
> part of the device is in a bad state? For example, using i915/Xe
> terminology, maybe the GT is dead but display is still working. Or one
> GT is dead, but another is still alive.
The idea is to give drivers a way to recover through userspace intervention.
It is up to each driver to decide when it needs recovery and how it wants to
recover.
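To make the consumer side concrete, here is a rough userspace sketch (not
part of this series) of how a listener might detect the event. It assumes
the payload arrives in the usual kernel uevent layout of NUL-separated
KEY=VALUE entries and that WEDGED=1 is the only key of interest, as in the
current kernel-doc. The helper name uevent_is_wedged() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan a NUL-separated uevent payload and report
 * whether it carries the exact entry "WEDGED=1".
 * Returns 1 if found, 0 otherwise.
 */
static int uevent_is_wedged(const char *buf, size_t len)
{
	static const char key[] = "WEDGED=1";
	const size_t key_len = sizeof(key) - 1;
	size_t off = 0;

	while (off < len) {
		const char *entry = buf + off;
		const char *end = memchr(entry, '\0', len - off);
		/* The last entry may lack a terminating NUL. */
		size_t n = end ? (size_t)(end - entry) : len - off;

		/* Require an exact match so that e.g. WEDGED=10 is rejected. */
		if (n == key_len && memcmp(entry, key, key_len) == 0)
			return 1;

		off += n + 1;
	}

	return 0;
}
```

A consumer would call this on each buffer it reads from its uevent source
and, on a match, start the unbind, bus reset and re-bind sequence listed in
the commit message.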
> Basically, is this event intended as a signal that userspace should stop
> trying to do _anything_ with the device, or just that the device has
> degraded functionality in some way (and maybe userspace can still do
> something useful if it's lucky)? It would be good to clarify that in
> the docs here in case different drivers have different ideas about how
> this is expected to work.
And hence the open discussion. Improvements are welcome :)
Raag