Intel-GFX Archive on lore.kernel.org
* [PATCH v12 0/5] Introduce DRM device wedged event
@ 2025-02-04  7:05 Raag Jadav
  2025-02-04  7:05 ` [PATCH v12 1/5] drm: Introduce " Raag Jadav
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Raag Jadav @ 2025-02-04  7:05 UTC
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen, Raag Jadav

This series introduces a device wedged event in the DRM subsystem and uses
it in the xe, i915 and amdgpu drivers. See the commit messages for a
detailed description.

This was earlier attempted as an xe-specific uevent in v1 and v2 at [1].
Similar work by André Almeida on [2].
Wedged event support for amdgpu by André Almeida on [3].
Consumer implementation by Xaver Hugl on [4].

 [1] https://patchwork.freedesktop.org/series/136909/
 [2] https://lore.kernel.org/dri-devel/20221125175203.52481-1-andrealmeid@igalia.com/
 [3] https://lore.kernel.org/dri-devel/20241216162104.58241-1-andrealmeid@igalia.com/
 [4] https://invent.kde.org/plasma/kwin/-/merge_requests/7027

 v2: Change authorship to Himal (Aravind)
     Add uevent for all device wedged cases (Aravind)

 v3: Generic implementation in DRM subsystem (Lucas)

 v4: s/drm_dev_wedged/drm_dev_wedged_event
     Use drm_info() (Jani)
     Kernel doc adjustment (Aravind)
     Change authorship to Raag (Aravind)

 v5: Send recovery method with uevent (Lina)
     Expose supported recovery methods via sysfs (Lucas)

 v6: Access wedge_recovery_opts[] using helper function (Jani)
     Use snprintf() (Jani)

 v7: Convert recovery helpers into regular functions (Andy, Jani)
     Aesthetic adjustments (Andy)
     Handle invalid recovery method
     Add documentation to drm-uapi.rst (Sima)

 v8: Drop sysfs and allow sending multiple methods with uevent (Lucas, Michal)
     Improve documentation (Christian, Rodrigo)
     static_assert() globally (Andy)

 v9: Document prerequisites section (Christian)
     Provide 'none' method for device reset (Christian)
     Provide recovery opts using switch cases

v10: Clarify mmap cleanup and consumer prerequisites (Christian, Aravind)

v11: Log device reset (André)
     Reference wedged event in device reset chapter (André)
     Wedged event support for amdgpu (André)

v12: Refine consumer expectations and terminologies (Xaver, Pekka)

André Almeida (1):
  drm/amdgpu: Use device wedged event

Raag Jadav (4):
  drm: Introduce device wedged event
  drm/doc: Document device wedged event
  drm/xe: Use device wedged event
  drm/i915: Use device wedged event

 Documentation/gpu/drm-uapi.rst             | 116 ++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   4 +
 drivers/gpu/drm/drm_drv.c                  |  68 ++++++++++++
 drivers/gpu/drm/i915/gt/intel_reset.c      |   3 +
 drivers/gpu/drm/xe/xe_device.c             |   7 +-
 include/drm/drm_device.h                   |   8 ++
 include/drm/drm_drv.h                      |   1 +
 7 files changed, 203 insertions(+), 4 deletions(-)

-- 
2.34.1



* [PATCH v12 1/5] drm: Introduce device wedged event
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
@ 2025-02-04  7:05 ` Raag Jadav
  2025-02-04 15:50   ` Christian König
  2025-02-04  7:05 ` [PATCH v12 2/5] drm/doc: Document " Raag Jadav
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Raag Jadav @ 2025-02-04  7:05 UTC
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen, Raag Jadav

Introduce a device wedged event, which notifies userspace of the 'wedged'
(hung/unusable) state of a DRM device through a uevent. This is especially
useful when the device is no longer operating as expected and has become
unrecoverable from driver context. The purpose of this implementation is
to give drivers a generic way to recover the device with the help of
userspace intervention, without taking any drastic measures in the driver
(like resetting or re-enumerating the full bus on which the underlying
physical device sits).

A 'wedged' device is basically a device that is declared dead by the
driver after exhausting all possible attempts to recover it from driver
context. The uevent is the notification sent to userspace, along with a
hint about what could be attempted from userspace to recover the device
and bring it back to a usable state. Different drivers may have different
ideas of a 'wedged' device depending on the hardware implementation of the
underlying physical device, hence the vendor-agnostic nature of the event.
It is up to each driver to decide when it needs device recovery and which
of the available methods it wants to be recovered with.

Driver prerequisites
--------------------

The driver, before opting for recovery, needs to make sure that the
'wedged' device doesn't harm the system as a whole by taking care of the
prerequisites. Necessary actions must include disabling DMA to system
memory as well as any communication channels with other devices. Further,
the driver must ensure that all dma_fences are signalled and any device
state that the core kernel might depend on is cleaned up. All existing
mmaps should be invalidated and page faults should be redirected to a
dummy page. Once the event is sent, the device must be kept in the 'wedged'
state until recovery is performed. New accesses to the device (IOCTLs)
should be rejected, preferably with an error code that reflects the type
of failure the device has encountered. This signals the reason for
wedging, which can be reported to the application if needed.

Recovery
--------

The current implementation defines three recovery methods, of which
drivers can use any one, several or none. The chosen method(s) are sent in
the uevent environment as ``WEDGED=<method1>[,..,<methodN>]``, ordered from
fewer to more side-effects. If the driver is unsure about recovery or the
method is unknown (for example a soft/hard system reboot, firmware
flashing, physical device replacement or any other procedure that can't be
attempted on the fly), ``WEDGED=unknown`` is sent instead.

Userspace consumers can parse this event and attempt recovery as per the
following expectations.

    =============== ========================================
    Recovery method Consumer expectations
    =============== ========================================
    none            optional telemetry collection
    rebind          unbind + bind driver
    bus-reset       unbind + bus reset/re-enumeration + bind
    unknown         consumer policy
    =============== ========================================

The only exception to this is ``WEDGED=none``, which signifies that the
device was temporarily 'wedged' at some point but was recovered from driver
context using device specific methods like reset. No explicit recovery is
expected from the consumer in this case, but it can still take additional
steps like gathering telemetry information (devcoredump, syslog). This is
useful because the first hang is usually the most critical one, which can
lead to subsequent hangs or complete wedging.
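
As a rough sketch of the consumer side, the comma-separated value can be
split with plain POSIX shell. The helper names below are hypothetical and
not part of this series; the only things assumed from the text above are
the ``WEDGED=<method1>[,..,<methodN>]`` format and its ordering from less
to more side-effects.

```sh
#!/bin/sh
# Hypothetical helpers (not part of this series) for consumers of the
# WEDGED uevent property, e.g. "rebind,bus-reset".

# Print one recovery method per line, preserving the order in which the
# kernel listed them (less to more side-effects).
wedged_methods() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Print only the first method, i.e. the one with the fewest side-effects.
wedged_first_method() {
    printf '%s\n' "${1%%,*}"
}
```

For example, ``wedged_first_method "rebind,bus-reset"`` prints ``rebind``,
and a single-method value such as ``unknown`` is printed unchanged.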

Consumer prerequisites
----------------------

It is the responsibility of the consumer to make sure that the device or
its resources are not in use by any process before attempting recovery.
With IOCTLs erroring out, all device memory should be unmapped and file
descriptors should be closed to prevent leaks or undefined behaviour. The
idea here is to clear the device of all user context beforehand and set
the stage for a clean recovery.

Example
-------

Udev rule::

    SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]", \
    RUN+="/path/to/rebind.sh $env{DEVPATH}"

Recovery script::

    #!/bin/sh

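    # $1 is the DEVPATH of the DRM card, as passed by the udev rule
    DEVPATH=$(readlink -f "/sys/$1/device")
    DEVICE=$(basename "$DEVPATH")
    DRIVER=$(readlink -f "$DEVPATH/driver")

    # Unbind the device from its driver, then bind it again
    printf '%s' "$DEVICE" > "$DRIVER/unbind"
    printf '%s' "$DEVICE" > "$DRIVER/bind"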

Customization
-------------

Although basic recovery is possible with a simple script, consumers can
define custom policies around recovery. For example, if the driver supports
multiple recovery methods, consumers can opt for the suitable one depending
on scenarios like repeat offences or vendor specific failures. Consumers
can also choose to have the device available for debugging or telemetry
collection and base their recovery decision on the findings. This is useful
especially when the driver is unsure about recovery or method is unknown.
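
One possible policy for repeat offences, sketched in POSIX shell. It
assumes the consumer keeps its own count of earlier failed recovery
attempts for the device; the function name ``choose_recovery`` and the
escalation rule are illustrative only and not part of this series.

```sh
#!/bin/sh
# Hypothetical policy helper (not part of this series).
# Usage: choose_recovery WEDGED_VALUE PRIOR_ATTEMPTS
# Prints the method to try next: one step further down the advertised
# list per earlier failed attempt, clamped to the last (most intrusive)
# method. "none" and "unknown" are passed through for the caller to handle.
choose_recovery() {
    value=$1
    attempts=$2

    case $value in
    none|unknown)
        printf '%s\n' "$value"
        return 0
        ;;
    esac

    # Split the value on commas into the positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $value
    IFS=$old_ifs

    # Skip one method per earlier attempt, but never past the last one.
    while [ "$attempts" -gt 0 ] && [ "$#" -gt 1 ]; do
        shift
        attempts=$((attempts - 1))
    done
    printf '%s\n' "$1"
}
```

With ``WEDGED=rebind,bus-reset``, a first attempt would pick ``rebind`` and
any later attempt would escalate to ``bus-reset``.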

 v4: s/drm_dev_wedged/drm_dev_wedged_event
     Use drm_info() (Jani)
     Kernel doc adjustment (Aravind)
 v5: Send recovery method with uevent (Lina)
 v6: Access wedge_recovery_opts[] using helper function (Jani)
     Use snprintf() (Jani)
 v7: Convert recovery helpers into regular functions (Andy, Jani)
     Aesthetic adjustments (Andy)
     Handle invalid recovery method
 v8: Allow sending multiple methods with uevent (Lucas, Michal)
     static_assert() globally (Andy)
 v9: Provide 'none' method for device reset (Christian)
     Provide recovery opts using switch cases
v11: Log device reset (André)

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
---
 drivers/gpu/drm/drm_drv.c | 68 +++++++++++++++++++++++++++++++++++++++
 include/drm/drm_device.h  |  8 +++++
 include/drm/drm_drv.h     |  1 +
 3 files changed, 77 insertions(+)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 3cf440eee8a2..17fc5dc708f4 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -26,6 +26,7 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#include <linux/bitops.h>
 #include <linux/cgroup_dmem.h>
 #include <linux/debugfs.h>
 #include <linux/fs.h>
@@ -34,6 +35,7 @@
 #include <linux/mount.h>
 #include <linux/pseudo_fs.h>
 #include <linux/slab.h>
+#include <linux/sprintf.h>
 #include <linux/srcu.h>
 #include <linux/xarray.h>
 
@@ -498,6 +500,72 @@ void drm_dev_unplug(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_dev_unplug);
 
+/*
+ * Available recovery methods for wedged device. To be sent along with device
+ * wedged uevent.
+ */
+static const char *drm_get_wedge_recovery(unsigned int opt)
+{
+	switch (BIT(opt)) {
+	case DRM_WEDGE_RECOVERY_NONE:
+		return "none";
+	case DRM_WEDGE_RECOVERY_REBIND:
+		return "rebind";
+	case DRM_WEDGE_RECOVERY_BUS_RESET:
+		return "bus-reset";
+	default:
+		return NULL;
+	}
+}
+
+/**
+ * drm_dev_wedged_event - generate a device wedged uevent
+ * @dev: DRM device
+ * @method: method(s) to be used for recovery
+ *
+ * This generates a device wedged uevent for the DRM device specified by @dev.
+ * Recovery @method\(s) of choice will be sent in the uevent environment as
+ * ``WEDGED=<method1>[,..,<methodN>]`` in order of less to more side-effects.
+ * If caller is unsure about recovery or @method is unknown (0),
+ * ``WEDGED=unknown`` will be sent instead.
+ *
+ * Refer to "Device Wedging" chapter in Documentation/gpu/drm-uapi.rst for more
+ * details.
+ *
+ * Returns: 0 on success, negative error code otherwise.
+ */
+int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
+{
+	const char *recovery = NULL;
+	unsigned int len, opt;
+	/* Event string length up to 28+ characters with available methods */
+	char event_string[32];
+	char *envp[] = { event_string, NULL };
+
+	len = scnprintf(event_string, sizeof(event_string), "%s", "WEDGED=");
+
+	for_each_set_bit(opt, &method, BITS_PER_TYPE(method)) {
+		recovery = drm_get_wedge_recovery(opt);
+		if (drm_WARN_ONCE(dev, !recovery, "invalid recovery method %u\n", opt))
+			break;
+
+		len += scnprintf(event_string + len, sizeof(event_string) - len, "%s,", recovery);
+	}
+
+	if (recovery)
+		/* Get rid of trailing comma */
+		event_string[len - 1] = '\0';
+	else
+		/* Caller is unsure about recovery, do the best we can at this point. */
+		snprintf(event_string, sizeof(event_string), "%s", "WEDGED=unknown");
+
+	drm_info(dev, "device wedged, %s\n", method == DRM_WEDGE_RECOVERY_NONE ?
+		 "but recovered through reset" : "needs recovery");
+
+	return kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_dev_wedged_event);
+
 /*
  * DRM internal mount
  * We want to be able to allocate our own "struct address_space" to control
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index c91f87b5242d..6ea54a578cda 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -21,6 +21,14 @@ struct inode;
 struct pci_dev;
 struct pci_controller;
 
+/*
+ * Recovery methods for wedged device in order of less to more side-effects.
+ * To be used with drm_dev_wedged_event() as recovery @method. Callers can
+ * use any one, multiple (or'd) or none depending on their needs.
+ */
+#define DRM_WEDGE_RECOVERY_NONE		BIT(0)	/* optional telemetry collection */
+#define DRM_WEDGE_RECOVERY_REBIND	BIT(1)	/* unbind + bind driver */
+#define DRM_WEDGE_RECOVERY_BUS_RESET	BIT(2)	/* unbind + reset bus device + bind */
 
 /**
  * enum switch_power_state - power state of drm device
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 9952b846c170..a43d707b5f36 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -482,6 +482,7 @@ void drm_put_dev(struct drm_device *dev);
 bool drm_dev_enter(struct drm_device *dev, int *idx);
 void drm_dev_exit(int idx);
 void drm_dev_unplug(struct drm_device *dev);
+int drm_dev_wedged_event(struct drm_device *dev, unsigned long method);
 
 /**
  * drm_dev_is_unplugged - is a DRM device unplugged
-- 
2.34.1



* [PATCH v12 2/5] drm/doc: Document device wedged event
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
  2025-02-04  7:05 ` [PATCH v12 1/5] drm: Introduce " Raag Jadav
@ 2025-02-04  7:05 ` Raag Jadav
  2025-02-04  7:05 ` [PATCH v12 3/5] drm/xe: Use " Raag Jadav
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Raag Jadav @ 2025-02-04  7:05 UTC
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen, Raag Jadav

Add documentation for the device wedged event in a new "Device Wedging"
chapter. It covers basic definitions, driver and consumer prerequisites
and consumer expectations, along with an example.

 v8: Improve introduction (Christian, Rodrigo)
 v9: Add prerequisites section (Christian)
v10: Clarify mmap cleanup and consumer prerequisites (Christian, Aravind)
v11: Reference wedged event in device reset chapter (André)
v12: Refine consumer expectations and terminologies (Xaver, Pekka)

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
---
 Documentation/gpu/drm-uapi.rst | 116 ++++++++++++++++++++++++++++++++-
 1 file changed, 113 insertions(+), 3 deletions(-)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index b75cc9a70d1f..69f72e71a96e 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -371,9 +371,119 @@ Reporting causes of resets
 
 Apart from propagating the reset through the stack so apps can recover, it's
 really useful for driver developers to learn more about what caused the reset in
-the first place. DRM devices should make use of devcoredump to store relevant
-information about the reset, so this information can be added to user bug
-reports.
+the first place. For this, drivers can make use of devcoredump to store relevant
+information about the reset and send device wedged event with ``none`` recovery
+method (as explained in "Device Wedging" chapter) to notify userspace, so this
+information can be collected and added to user bug reports.
+
+Device Wedging
+==============
+
+Drivers can optionally make use of device wedged event (implemented as
+drm_dev_wedged_event() in DRM subsystem), which notifies userspace of 'wedged'
+(hung/unusable) state of the DRM device through a uevent. This is useful
+especially in cases where the device is no longer operating as expected and has
+become unrecoverable from driver context. Purpose of this implementation is to
+provide drivers a generic way to recover the device with the help of userspace
+intervention, without taking any drastic measures (like resetting or
+re-enumerating the full bus, on which the underlying physical device is sitting)
+in the driver.
+
+A 'wedged' device is basically a device that is declared dead by the driver
+after exhausting all possible attempts to recover it from driver context. The
+uevent is the notification that is sent to userspace along with a hint about
+what could possibly be attempted to recover the device from userspace and bring
+it back to usable state. Different drivers may have different ideas of a
+'wedged' device depending on hardware implementation of the underlying physical
+device, and hence the vendor agnostic nature of the event. It is up to the
+drivers to decide when they see the need for device recovery and how they want
+to recover from the available methods.
+
+Driver prerequisites
+--------------------
+
+The driver, before opting for recovery, needs to make sure that the 'wedged'
+device doesn't harm the system as a whole by taking care of the prerequisites.
+Necessary actions must include disabling DMA to system memory as well as any
+communication channels with other devices. Further, the driver must ensure
+that all dma_fences are signalled and any device state that the core kernel
+might depend on is cleaned up. All existing mmaps should be invalidated and
+page faults should be redirected to a dummy page. Once the event is sent, the
+device must be kept in 'wedged' state until the recovery is performed. New
+accesses to the device (IOCTLs) should be rejected, preferably with an error
+code that resembles the type of failure the device has encountered. This will
+signify the reason for wedging, which can be reported to the application if
+needed.
+
+Recovery
+--------
+
+Current implementation defines three recovery methods, out of which, drivers
+can use any one, multiple or none. Method(s) of choice will be sent in the
+uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in order of less to
+more side-effects. If driver is unsure about recovery or method is unknown
+(like soft/hard system reboot, firmware flashing, physical device replacement
+or any other procedure which can't be attempted on the fly), ``WEDGED=unknown``
+will be sent instead.
+
+Userspace consumers can parse this event and attempt recovery as per the
+following expectations.
+
+    =============== ========================================
+    Recovery method Consumer expectations
+    =============== ========================================
+    none            optional telemetry collection
+    rebind          unbind + bind driver
+    bus-reset       unbind + bus reset/re-enumeration + bind
+    unknown         consumer policy
+    =============== ========================================
+
+The only exception to this is ``WEDGED=none``, which signifies that the device
+was temporarily 'wedged' at some point but was recovered from driver context
+using device specific methods like reset. No explicit recovery is expected from
+the consumer in this case, but it can still take additional steps like gathering
+telemetry information (devcoredump, syslog). This is useful because the first
+hang is usually the most critical one, which can lead to subsequent hangs or
+complete wedging.
+
+Consumer prerequisites
+----------------------
+
+It is the responsibility of the consumer to make sure that the device or its
+resources are not in use by any process before attempting recovery. With IOCTLs
+erroring out, all device memory should be unmapped and file descriptors should
+be closed to prevent leaks or undefined behaviour. The idea here is to clear the
+device of all user context beforehand and set the stage for a clean recovery.
+
+Example
+-------
+
+Udev rule::
+
+    SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]", \
+    RUN+="/path/to/rebind.sh $env{DEVPATH}"
+
+Recovery script::
+
+    #!/bin/sh
+
+    DEVPATH=$(readlink -f "/sys/$1/device")
+    DEVICE=$(basename "$DEVPATH")
+    DRIVER=$(readlink -f "$DEVPATH/driver")
+
+    echo -n "$DEVICE" > "$DRIVER/unbind"
+    echo -n "$DEVICE" > "$DRIVER/bind"
+
+Customization
+-------------
+
+Although basic recovery is possible with a simple script, consumers can define
+custom policies around recovery. For example, if the driver supports multiple
+recovery methods, consumers can opt for the suitable one depending on scenarios
+like repeat offences or vendor specific failures. Consumers can also choose to
+have the device available for debugging or telemetry collection and base their
+recovery decision on the findings. This is useful especially when the driver is
+unsure about recovery or method is unknown.
 
 .. _drm_driver_ioctl:
 
-- 
2.34.1



* [PATCH v12 3/5] drm/xe: Use device wedged event
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
  2025-02-04  7:05 ` [PATCH v12 1/5] drm: Introduce " Raag Jadav
  2025-02-04  7:05 ` [PATCH v12 2/5] drm/doc: Document " Raag Jadav
@ 2025-02-04  7:05 ` Raag Jadav
  2025-02-04 17:23   ` Rodrigo Vivi
  2025-02-04  7:05 ` [PATCH v12 4/5] drm/i915: " Raag Jadav
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Raag Jadav @ 2025-02-04  7:05 UTC
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen, Raag Jadav

This was previously attempted as an xe-specific reset uevent but was
dropped in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now")
as part of refactoring.

Now that the DRM core provides a device wedged event, make use of it and
support both driver rebind and bus-reset based recovery. With this in
place, userspace is notified of a wedged device and may then take the
respective action to recover it.

$ udevadm monitor --property --kernel
monitor will print the received events for:
KERNEL - the kernel uevent

KERNEL[265.802982] change   /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
SUBSYSTEM=drm
WEDGED=rebind,bus-reset
DEVNAME=/dev/dri/card0
DEVTYPE=drm_minor
SEQNUM=5208
MAJOR=226
MINOR=0
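
A consumer that reads such a property dump itself, rather than relying on
a udev rule, might pull out the ``WEDGED`` value along these lines. This is
a minimal POSIX shell sketch; ``wedged_from_env`` is a hypothetical helper
and not part of this patch.

```sh
#!/bin/sh
# Hypothetical helper (not part of this patch): read KEY=VALUE lines, such
# as the property dump above, on stdin and print the value of WEDGED.
# Returns non-zero when the property is absent.
wedged_from_env() {
    while IFS= read -r line; do
        case $line in
        WEDGED=*)
            # Strip the key and keep the comma-separated method list.
            printf '%s\n' "${line#WEDGED=}"
            return 0
            ;;
        esac
    done
    return 1
}
```

Fed the dump above, it would print ``rebind,bus-reset``.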

v2: Change authorship to Himal (Aravind)
    Add uevent for all device wedged cases (Aravind)
v3: Generic implementation in DRM subsystem (Lucas)
v4: Change authorship to Raag (Aravind)

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
---
 drivers/gpu/drm/xe/xe_device.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index f30f8f668dee..1cacefcb5afe 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -1120,7 +1120,8 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg)
  * re-probe (unbind + bind).
  * In this state every IOCTL will be blocked so the GT cannot be used.
  * In general it will be called upon any critical error such as gt reset
- * failure or guc loading failure.
+ * failure or guc loading failure. Userspace will be notified of this state
+ * through device wedged uevent.
  * If xe.wedged module parameter is set to 2, this function will be called
  * on every single execution timeout (a.k.a. GPU hang) right after devcoredump
  * snapshot capture. In this mode, GT reset won't be attempted so the state of
@@ -1150,6 +1151,10 @@ void xe_device_declare_wedged(struct xe_device *xe)
 			"IOCTLs and executions are blocked. Only a rebind may clear the failure\n"
 			"Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
 			dev_name(xe->drm.dev));
+
+		/* Notify userspace of wedged device */
+		drm_dev_wedged_event(&xe->drm,
+				     DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
 	}
 
 	for_each_gt(gt, xe, id)
-- 
2.34.1



* [PATCH v12 4/5] drm/i915: Use device wedged event
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
                   ` (2 preceding siblings ...)
  2025-02-04  7:05 ` [PATCH v12 3/5] drm/xe: Use " Raag Jadav
@ 2025-02-04  7:05 ` Raag Jadav
  2025-02-04 17:24   ` Rodrigo Vivi
  2025-02-04  7:05 ` [PATCH v12 5/5] drm/amdgpu: " Raag Jadav
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Raag Jadav @ 2025-02-04  7:05 UTC
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen, Raag Jadav

Now that the DRM core provides a device wedged event, make use of it and
support both driver rebind and bus-reset based recovery. With this in
place, userspace will be notified of a wedged device on GT reset failure.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index aae5a081cb53..d6dc12fd87c1 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1422,6 +1422,9 @@ static void intel_gt_reset_global(struct intel_gt *gt,
 
 	if (!test_bit(I915_WEDGED, &gt->reset.flags))
 		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
+	else
+		drm_dev_wedged_event(&gt->i915->drm,
+				     DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
 }
 
 /**
-- 
2.34.1



* [PATCH v12 5/5] drm/amdgpu: Use device wedged event
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
                   ` (3 preceding siblings ...)
  2025-02-04  7:05 ` [PATCH v12 4/5] drm/i915: " Raag Jadav
@ 2025-02-04  7:05 ` Raag Jadav
  2025-02-04  7:53 ` ✗ Fi.CI.CHECKPATCH: warning for Introduce DRM device wedged event (rev10) Patchwork
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Raag Jadav @ 2025-02-04  7:05 UTC
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen, Raag Jadav, Shashank Sharma

From: André Almeida <andrealmeid@igalia.com>

Use DRM's device wedged event to notify userspace that a reset has
happened. For now, only use the `none` method, which is meant for
telemetry capture.

In the future we might want to report a recovery method if the reset didn't
succeed.
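
On the consumer side, handling `none` could be as simple as saving any
pending devcoredump. A minimal POSIX shell sketch, assuming dumps show up
as devcd*/data files under /sys/class/devcoredump; the function name, the
.bin naming and the optional second argument are made up for illustration
and are not part of this patch.

```sh
#!/bin/sh
# Hypothetical telemetry helper (not part of this patch).
# Usage: collect_devcoredumps DEST_DIR [CLASS_DIR]
# Saves every <CLASS_DIR>/devcd*/data file into DEST_DIR and prints the
# number of dumps saved. CLASS_DIR defaults to the devcoredump class
# directory in sysfs; it is a parameter so the logic can be exercised
# against a scratch directory.
collect_devcoredumps() {
    dest=$1
    class_dir=${2:-/sys/class/devcoredump}
    count=0

    mkdir -p "$dest" || return 1
    for data in "$class_dir"/devcd*/data; do
        # An unmatched glob stays literal, so skip anything unreadable.
        [ -r "$data" ] || continue
        name=$(basename "$(dirname "$data")")
        cat "$data" > "$dest/$name.bin" && count=$((count + 1))
    done
    printf '%s\n' "$count"
}
```

A udev rule matching ``ENV{WEDGED}=="none"`` could then run a script that
calls ``collect_devcoredumps`` with a destination directory of the
consumer's choosing.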

Acked-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d100bb7a137c..9e7219bff0c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6116,6 +6116,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 		dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
 
 	atomic_set(&adev->reset_domain->reset_res, r);
+
+	if (!r)
+		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
+
 	return r;
 }
 
-- 
2.34.1



* ✗ Fi.CI.CHECKPATCH: warning for Introduce DRM device wedged event (rev10)
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
                   ` (4 preceding siblings ...)
  2025-02-04  7:05 ` [PATCH v12 5/5] drm/amdgpu: " Raag Jadav
@ 2025-02-04  7:53 ` Patchwork
  2025-02-04  7:53 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2025-02-04  7:53 UTC
  To: Raag Jadav; +Cc: intel-gfx

== Series Details ==

Series: Introduce DRM device wedged event (rev10)
URL   : https://patchwork.freedesktop.org/series/138069/
State : warning

== Summary ==

Error: dim checkpatch failed
9e92dc048adb drm: Introduce device wedged event
-:198: WARNING:STATIC_CONST_CHAR_ARRAY: char * array declaration might be better as static const
#198: FILE: drivers/gpu/drm/drm_drv.c:543:
+	char *envp[] = { event_string, NULL };

total: 0 errors, 1 warnings, 0 checks, 107 lines checked
13908eadb66e drm/doc: Document device wedged event
c89d21c38fb5 drm/xe: Use device wedged event
-:20: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#20: 
KERNEL[265.802982] change   /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)

total: 0 errors, 1 warnings, 0 checks, 19 lines checked
3548c4964b04 drm/i915: Use device wedged event
c0a600de029d drm/amdgpu: Use device wedged event




* ✗ Fi.CI.SPARSE: warning for Introduce DRM device wedged event (rev10)
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
                   ` (5 preceding siblings ...)
  2025-02-04  7:53 ` ✗ Fi.CI.CHECKPATCH: warning for Introduce DRM device wedged event (rev10) Patchwork
@ 2025-02-04  7:53 ` Patchwork
  2025-02-04  8:11 ` ✗ i915.CI.BAT: failure " Patchwork
  2025-02-12  6:57 ` [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
  8 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2025-02-04  7:53 UTC
  To: Raag Jadav; +Cc: intel-gfx

== Series Details ==

Series: Introduce DRM device wedged event (rev10)
URL   : https://patchwork.freedesktop.org/series/138069/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.




* ✗ i915.CI.BAT: failure for Introduce DRM device wedged event (rev10)
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
                   ` (6 preceding siblings ...)
  2025-02-04  7:53 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2025-02-04  8:11 ` Patchwork
  2025-02-12  6:57 ` [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
  8 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2025-02-04  8:11 UTC (permalink / raw)
  To: Raag Jadav; +Cc: intel-gfx


== Series Details ==

Series: Introduce DRM device wedged event (rev10)
URL   : https://patchwork.freedesktop.org/series/138069/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_16059 -> Patchwork_138069v10
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_138069v10 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_138069v10, please notify your bug team (I915-ci-infra@lists.freedesktop.org) to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/index.html

Participating hosts (44 -> 42)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (3): fi-blb-e6850 fi-glk-j4005 fi-snb-2520m 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_138069v10:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@gt_heartbeat:
    - bat-twl-1:          [PASS][1] -> [ABORT][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-twl-1/igt@i915_selftest@live@gt_heartbeat.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-twl-1/igt@i915_selftest@live@gt_heartbeat.html

  
Known issues
------------

  Here are the changes found in Patchwork_138069v10 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@dmabuf@all-tests:
    - fi-pnv-d510:        NOTRUN -> [INCOMPLETE][3] ([i915#12904]) +1 other test incomplete
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/fi-pnv-d510/igt@dmabuf@all-tests.html

  * igt@i915_module_load@load:
    - bat-twl-1:          [PASS][4] -> [DMESG-WARN][5] ([i915#1982])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-twl-1/igt@i915_module_load@load.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-twl-1/igt@i915_module_load@load.html

  * igt@i915_pm_rpm@module-reload:
    - bat-dg1-7:          [PASS][6] -> [FAIL][7] ([i915#13401])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-dg1-7/igt@i915_pm_rpm@module-reload.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-dg1-7/igt@i915_pm_rpm@module-reload.html
    - bat-rpls-4:         [PASS][8] -> [FAIL][9] ([i915#13401])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-rpls-4/igt@i915_pm_rpm@module-reload.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-rpls-4/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live:
    - bat-arlh-3:         [PASS][10] -> [DMESG-FAIL][11] ([i915#12061] / [i915#12435])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-arlh-3/igt@i915_selftest@live.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-arlh-3/igt@i915_selftest@live.html
    - bat-twl-1:          [PASS][12] -> [ABORT][13] ([i915#12919] / [i915#13503])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-twl-1/igt@i915_selftest@live.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-twl-1/igt@i915_selftest@live.html

  * igt@i915_selftest@live@workarounds:
    - bat-arlh-3:         [PASS][14] -> [DMESG-FAIL][15] ([i915#12061])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-arlh-3/igt@i915_selftest@live@workarounds.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-arlh-3/igt@i915_selftest@live@workarounds.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
    - bat-dg2-11:         [PASS][16] -> [SKIP][17] ([i915#9197]) +2 other tests skip
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html

  * igt@kms_psr@psr-primary-mmap-gtt:
    - fi-pnv-d510:        NOTRUN -> [SKIP][18] +36 other tests skip
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/fi-pnv-d510/igt@kms_psr@psr-primary-mmap-gtt.html

  
#### Possible fixes ####

  * igt@dmabuf@all-tests:
    - bat-apl-1:          [INCOMPLETE][19] ([i915#12904]) -> [PASS][20] +1 other test pass
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-apl-1/igt@dmabuf@all-tests.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-apl-1/igt@dmabuf@all-tests.html

  * igt@gem_exec_fence@basic-wait:
    - bat-rpls-4:         [DMESG-WARN][21] ([i915#13400]) -> [PASS][22] +1 other test pass
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-rpls-4/igt@gem_exec_fence@basic-wait.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-rpls-4/igt@gem_exec_fence@basic-wait.html

  * igt@i915_selftest@live:
    - bat-adlp-6:         [ABORT][23] ([i915#13399]) -> [PASS][24] +1 other test pass
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-adlp-6/igt@i915_selftest@live.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-adlp-6/igt@i915_selftest@live.html

  * igt@i915_selftest@live@late_gt_pm:
    - fi-cfl-8109u:       [DMESG-WARN][25] ([i915#11621]) -> [PASS][26] +132 other tests pass
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/fi-cfl-8109u/igt@i915_selftest@live@late_gt_pm.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/fi-cfl-8109u/igt@i915_selftest@live@late_gt_pm.html

  * igt@i915_selftest@live@workarounds:
    - {bat-arls-6}:       [DMESG-FAIL][27] ([i915#12061]) -> [PASS][28] +1 other test pass
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16059/bat-arls-6/igt@i915_selftest@live@workarounds.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/bat-arls-6/igt@i915_selftest@live@workarounds.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#11621]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11621
  [i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
  [i915#12435]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12435
  [i915#12904]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
  [i915#12919]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12919
  [i915#13399]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399
  [i915#13400]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13400
  [i915#13401]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13401
  [i915#13503]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13503
  [i915#1982]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/1982
  [i915#9197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9197


Build changes
-------------

  * Linux: CI_DRM_16059 -> Patchwork_138069v10

  CI-20190529: 20190529
  CI_DRM_16059: e300f8946bc0ce873e4c4bc1a2cd05e7b617b1db @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_8221: ad1f57286d15d083b08c94f3d93600db85f9945b @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_138069v10: e300f8946bc0ce873e4c4bc1a2cd05e7b617b1db @ git://anongit.freedesktop.org/gfx-ci/linux

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_138069v10/index.html



* Re: [PATCH v12 1/5] drm: Introduce device wedged event
  2025-02-04  7:05 ` [PATCH v12 1/5] drm: Introduce " Raag Jadav
@ 2025-02-04 15:50   ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2025-02-04 15:50 UTC (permalink / raw)
  To: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
	jani.nikula, alexander.deucher
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen

Am 04.02.25 um 08:05 schrieb Raag Jadav:
> Introduce device wedged event, which notifies userspace of 'wedged'
> (hanged/unusable) state of the DRM device through a uevent. This is
> useful especially in cases where the device is no longer operating as
> expected and has become unrecoverable from driver context. Purpose of
> this implementation is to provide drivers a generic way to recover the
> device with the help of userspace intervention without taking any drastic
> measures (like resetting or re-enumerating the full bus, on which the
> underlying physical device is sitting) in the driver.
>
> A 'wedged' device is basically a device that is declared dead by the
> driver after exhausting all possible attempts to recover it from driver
> context. The uevent is the notification that is sent to userspace along
> with a hint about what could possibly be attempted to recover the device
> from userspace and bring it back to usable state. Different drivers may
> have different ideas of a 'wedged' device depending on hardware
> implementation of the underlying physical device, and hence the vendor
> agnostic nature of the event. It is up to the drivers to decide when they
> see the need for device recovery and how they want to recover from the
> available methods.
>
> Driver prerequisites
> --------------------
>
> The driver, before opting for recovery, needs to make sure that the
> 'wedged' device doesn't harm the system as a whole by taking care of the
> prerequisites. Necessary actions must include disabling DMA to system
> memory as well as any communication channels with other devices. Further,
> the driver must ensure that all dma_fences are signalled and any device
> state that the core kernel might depend on is cleaned up. All existing
> mmaps should be invalidated and page faults should be redirected to a
> dummy page. Once the event is sent, the device must be kept in 'wedged'
> state until the recovery is performed. New accesses to the device
> (IOCTLs) should be rejected, preferably with an error code that resembles
> the type of failure the device has encountered. This will signify the
> reason for wedging, which can be reported to the application if needed.
>
> Recovery
> --------
>
> Current implementation defines three recovery methods, out of which,
> drivers can use any one, multiple or none. Method(s) of choice will be
> sent in the uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in
> order of less to more side-effects. If driver is unsure about recovery
> or method is unknown (like soft/hard system reboot, firmware flashing,
> physical device replacement or any other procedure which can't be
> attempted on the fly), ``WEDGED=unknown`` will be sent instead.
>
> Userspace consumers can parse this event and attempt recovery as per the
> following expectations.
>
>      =============== ========================================
>      Recovery method Consumer expectations
>      =============== ========================================
>      none            optional telemetry collection
>      rebind          unbind + bind driver
>      bus-reset       unbind + bus reset/re-enumeration + bind
>      unknown         consumer policy
>      =============== ========================================
>
> The only exception to this is ``WEDGED=none``, which signifies that the
> device was temporarily 'wedged' at some point but was recovered from driver
> context using device specific methods like reset. No explicit recovery is
> expected from the consumer in this case, but it can still take additional
> steps like gathering telemetry information (devcoredump, syslog). This is
> useful because the first hang is usually the most critical one which can
> result in consequential hangs or complete wedging.
>
> Consumer prerequisites
> ----------------------
>
> It is the responsibility of the consumer to make sure that the device or
> its resources are not in use by any process before attempting recovery.
> With IOCTLs erroring out, all device memory should be unmapped and file
> descriptors should be closed to prevent leaks or undefined behaviour. The
> idea here is to clear the device of all user context beforehand and set
> the stage for a clean recovery.
>
> Example
> -------
>
> Udev rule::
>
>      SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]",
>      RUN+="/path/to/rebind.sh $env{DEVPATH}"
>
> Recovery script::
>
>      #!/bin/sh
>
>      DEVPATH=$(readlink -f /sys/$1/device)
>      DEVICE=$(basename $DEVPATH)
>      DRIVER=$(readlink -f $DEVPATH/driver)
>
>      echo -n $DEVICE > $DRIVER/unbind
>      echo -n $DEVICE > $DRIVER/bind
>
> Customization
> -------------
>
> Although basic recovery is possible with a simple script, consumers can
> define custom policies around recovery. For example, if the driver supports
> multiple recovery methods, consumers can opt for the suitable one depending
> on scenarios like repeat offences or vendor specific failures. Consumers
> can also choose to have the device available for debugging or telemetry
> collection and base their recovery decision on the findings. This is useful
> especially when the driver is unsure about recovery or method is unknown.
>
>   v4: s/drm_dev_wedged/drm_dev_wedged_event
>       Use drm_info() (Jani)
>       Kernel doc adjustment (Aravind)
>   v5: Send recovery method with uevent (Lina)
>   v6: Access wedge_recovery_opts[] using helper function (Jani)
>       Use snprintf() (Jani)
>   v7: Convert recovery helpers into regular functions (Andy, Jani)
>       Aesthetic adjustments (Andy)
>       Handle invalid recovery method
>   v8: Allow sending multiple methods with uevent (Lucas, Michal)
>       static_assert() globally (Andy)
>   v9: Provide 'none' method for device reset (Christian)
>       Provide recovery opts using switch cases
> v11: Log device reset (André)
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> Reviewed-by: André Almeida <andrealmeid@igalia.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/drm_drv.c | 68 +++++++++++++++++++++++++++++++++++++++
>   include/drm/drm_device.h  |  8 +++++
>   include/drm/drm_drv.h     |  1 +
>   3 files changed, 77 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 3cf440eee8a2..17fc5dc708f4 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -26,6 +26,7 @@
>    * DEALINGS IN THE SOFTWARE.
>    */
>   
> +#include <linux/bitops.h>
>   #include <linux/cgroup_dmem.h>
>   #include <linux/debugfs.h>
>   #include <linux/fs.h>
> @@ -34,6 +35,7 @@
>   #include <linux/mount.h>
>   #include <linux/pseudo_fs.h>
>   #include <linux/slab.h>
> +#include <linux/sprintf.h>
>   #include <linux/srcu.h>
>   #include <linux/xarray.h>
>   
> @@ -498,6 +500,72 @@ void drm_dev_unplug(struct drm_device *dev)
>   }
>   EXPORT_SYMBOL(drm_dev_unplug);
>   
> +/*
> + * Available recovery methods for wedged device. To be sent along with device
> + * wedged uevent.
> + */
> +static const char *drm_get_wedge_recovery(unsigned int opt)
> +{
> +	switch (BIT(opt)) {
> +	case DRM_WEDGE_RECOVERY_NONE:
> +		return "none";
> +	case DRM_WEDGE_RECOVERY_REBIND:
> +		return "rebind";
> +	case DRM_WEDGE_RECOVERY_BUS_RESET:
> +		return "bus-reset";
> +	default:
> +		return NULL;
> +	}
> +}
> +
> +/**
> + * drm_dev_wedged_event - generate a device wedged uevent
> + * @dev: DRM device
> + * @method: method(s) to be used for recovery
> + *
> + * This generates a device wedged uevent for the DRM device specified by @dev.
> + * Recovery @method\(s) of choice will be sent in the uevent environment as
> + * ``WEDGED=<method1>[,..,<methodN>]`` in order of less to more side-effects.
> + * If caller is unsure about recovery or @method is unknown (0),
> + * ``WEDGED=unknown`` will be sent instead.
> + *
> + * Refer to "Device Wedging" chapter in Documentation/gpu/drm-uapi.rst for more
> + * details.
> + *
> + * Returns: 0 on success, negative error code otherwise.
> + */
> +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
> +{
> +	const char *recovery = NULL;
> +	unsigned int len, opt;
> +	/* Event string length up to 28+ characters with available methods */
> +	char event_string[32];
> +	char *envp[] = { event_string, NULL };
> +
> +	len = scnprintf(event_string, sizeof(event_string), "%s", "WEDGED=");
> +
> +	for_each_set_bit(opt, &method, BITS_PER_TYPE(method)) {
> +		recovery = drm_get_wedge_recovery(opt);
> +		if (drm_WARN_ONCE(dev, !recovery, "invalid recovery method %u\n", opt))
> +			break;
> +
> +		len += scnprintf(event_string + len, sizeof(event_string) - len, "%s,", recovery);
> +	}
> +
> +	if (recovery)
> +		/* Get rid of trailing comma */
> +		event_string[len - 1] = '\0';
> +	else
> +		/* Caller is unsure about recovery, do the best we can at this point. */
> +		snprintf(event_string, sizeof(event_string), "%s", "WEDGED=unknown");
> +
> +	drm_info(dev, "device wedged, %s\n", method == DRM_WEDGE_RECOVERY_NONE ?
> +		 "but recovered through reset" : "needs recovery");
> +
> +	return kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +EXPORT_SYMBOL(drm_dev_wedged_event);
> +
>   /*
>    * DRM internal mount
>    * We want to be able to allocate our own "struct address_space" to control
> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> index c91f87b5242d..6ea54a578cda 100644
> --- a/include/drm/drm_device.h
> +++ b/include/drm/drm_device.h
> @@ -21,6 +21,14 @@ struct inode;
>   struct pci_dev;
>   struct pci_controller;
>   
> +/*
> + * Recovery methods for wedged device in order of less to more side-effects.
> + * To be used with drm_dev_wedged_event() as recovery @method. Callers can
> + * use any one, multiple (or'd) or none depending on their needs.
> + */
> +#define DRM_WEDGE_RECOVERY_NONE		BIT(0)	/* optional telemetry collection */
> +#define DRM_WEDGE_RECOVERY_REBIND	BIT(1)	/* unbind + bind driver */
> +#define DRM_WEDGE_RECOVERY_BUS_RESET	BIT(2)	/* unbind + reset bus device + bind */
>   
>   /**
>    * enum switch_power_state - power state of drm device
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index 9952b846c170..a43d707b5f36 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -482,6 +482,7 @@ void drm_put_dev(struct drm_device *dev);
>   bool drm_dev_enter(struct drm_device *dev, int *idx);
>   void drm_dev_exit(int idx);
>   void drm_dev_unplug(struct drm_device *dev);
> +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method);
>   
>   /**
>    * drm_dev_is_unplugged - is a DRM device unplugged
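
For reference, the string-building logic in drm_dev_wedged_event() can be
mirrored in userspace. The following is a minimal sketch under stated
assumptions, not the kernel implementation: the WEDGE_RECOVERY_* macros copy
the bit values of the DRM_WEDGE_RECOVERY_* defines from the patch, while
wedge_method_name() and wedge_event_string() are hypothetical names introduced
only for illustration. Each append is bounded by the space remaining in the
buffer, and a zero or unrecognised mask falls back to "WEDGED=unknown", as the
kernel-doc describes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Bit values mirror the DRM_WEDGE_RECOVERY_* defines quoted in the patch. */
#define WEDGE_RECOVERY_NONE		(1UL << 0)
#define WEDGE_RECOVERY_REBIND		(1UL << 1)
#define WEDGE_RECOVERY_BUS_RESET	(1UL << 2)

/* Hypothetical helper: map one bit position to its method name. */
static const char *wedge_method_name(unsigned int bit)
{
	switch (1UL << bit) {
	case WEDGE_RECOVERY_NONE:
		return "none";
	case WEDGE_RECOVERY_REBIND:
		return "rebind";
	case WEDGE_RECOVERY_BUS_RESET:
		return "bus-reset";
	default:
		return NULL;
	}
}

/*
 * Hypothetical helper: write "WEDGED=<m1>[,<m2>...]" into buf, lowest bit
 * first. A zero mask or an unrecognised bit yields "WEDGED=unknown".
 * Returns 0 on success, -1 if buf is too small.
 */
static int wedge_event_string(unsigned long method, char *buf, size_t size)
{
	static const char unknown[] = "WEDGED=unknown";
	const char *name = NULL;
	unsigned int bit;
	size_t len;
	int n;

	assert(buf != NULL);

	n = snprintf(buf, size, "WEDGED=");
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	for (bit = 0; bit < sizeof(method) * CHAR_BIT; bit++) {
		if (!(method & (1UL << bit)))
			continue;

		name = wedge_method_name(bit);
		if (!name)
			break;

		/* Bound the append by the space left, not the full buffer. */
		n = snprintf(buf + len, size - len, "%s,", name);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}

	if (name) {
		/* Get rid of the trailing comma. */
		buf[len - 1] = '\0';
		return 0;
	}

	if (size < sizeof(unknown))
		return -1;
	memcpy(buf, unknown, sizeof(unknown));
	return 0;
}
```

Called with WEDGE_RECOVERY_REBIND | WEDGE_RECOVERY_BUS_RESET and a 32-byte
buffer, the helper fills the buffer with "WEDGED=rebind,bus-reset".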



* Re: [PATCH v12 3/5] drm/xe: Use device wedged event
  2025-02-04  7:05 ` [PATCH v12 3/5] drm/xe: Use " Raag Jadav
@ 2025-02-04 17:23   ` Rodrigo Vivi
  0 siblings, 0 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2025-02-04 17:23 UTC (permalink / raw)
  To: Raag Jadav
  Cc: airlied, simona, lucas.demarchi, jani.nikula, christian.koenig,
	alexander.deucher, intel-xe, intel-gfx, dri-devel,
	himal.prasad.ghimiray, aravind.iddamsetty, anshuman.gupta,
	andriy.shevchenko, lina, michal.wajdeczko, andrealmeid, amd-gfx,
	kernel-dev, xaver.hugl, pekka.paalanen

On Tue, Feb 04, 2025 at 12:35:26PM +0530, Raag Jadav wrote:
> This was previously attempted as xe specific reset uevent but dropped
> in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now")
> as part of refactoring.
> 
> Now that we have device wedged event provided by DRM core, make use
> of it and support both driver rebind and bus-reset based recovery.
> With this in place userspace will be notified of wedged device, on
> the basis of which, userspace may take respective action to recover
> the device.
> 
> $ udevadm monitor --property --kernel
> monitor will print the received events for:
> KERNEL - the kernel uevent
> 
> KERNEL[265.802982] change   /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
> ACTION=change
> DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
> SUBSYSTEM=drm
> WEDGED=rebind,bus-reset
> DEVNAME=/dev/dri/card0
> DEVTYPE=drm_minor
> SEQNUM=5208
> MAJOR=226
> MINOR=0
> 
> v2: Change authorship to Himal (Aravind)
>     Add uevent for all device wedged cases (Aravind)
> v3: Generic implementation in DRM subsystem (Lucas)
> v4: Change authorship to Raag (Aravind)
> 
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>


Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

to merge this through drm-misc-next

> ---
>  drivers/gpu/drm/xe/xe_device.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index f30f8f668dee..1cacefcb5afe 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -1120,7 +1120,8 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg)
>   * re-probe (unbind + bind).
>   * In this state every IOCTL will be blocked so the GT cannot be used.
>   * In general it will be called upon any critical error such as gt reset
> - * failure or guc loading failure.
> + * failure or guc loading failure. Userspace will be notified of this state
> + * through device wedged uevent.
>   * If xe.wedged module parameter is set to 2, this function will be called
>   * on every single execution timeout (a.k.a. GPU hang) right after devcoredump
>   * snapshot capture. In this mode, GT reset won't be attempted so the state of
> @@ -1150,6 +1151,10 @@ void xe_device_declare_wedged(struct xe_device *xe)
>  			"IOCTLs and executions are blocked. Only a rebind may clear the failure\n"
>  			"Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
>  			dev_name(xe->drm.dev));
> +
> +		/* Notify userspace of wedged device */
> +		drm_dev_wedged_event(&xe->drm,
> +				     DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
>  	}
>  
>  	for_each_gt(gt, xe, id)
> -- 
> 2.34.1
> 
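
On the consumer side, the WEDGED value shown in the udevadm output has to be
split before a recovery policy can be applied. A minimal sketch follows,
assuming the consumer already holds the property string (for example from a
udev monitor) and wants the least disruptive method. enum wedge_action and
wedge_first_method() are hypothetical names, not part of any existing library.
The sketch relies only on the documented ordering: methods are listed from
fewer to more side-effects, so the first entry is the mildest one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical consumer-side classification of the first listed method. */
enum wedge_action {
	WEDGE_ACT_UNKNOWN,	/* "unknown" or unrecognised: consumer policy */
	WEDGE_ACT_NONE,		/* optional telemetry collection */
	WEDGE_ACT_REBIND,	/* unbind + bind driver */
	WEDGE_ACT_BUS_RESET,	/* unbind + bus reset/re-enumeration + bind */
};

/*
 * Hypothetical helper: take a "WEDGED=<m1>[,<m2>...]" property string and
 * return the action for the first (least side-effects) method.
 */
static enum wedge_action wedge_first_method(const char *prop)
{
	static const char prefix[] = "WEDGED=";
	static const struct {
		const char *name;
		enum wedge_action act;
	} methods[] = {
		{ "none",	WEDGE_ACT_NONE },
		{ "rebind",	WEDGE_ACT_REBIND },
		{ "bus-reset",	WEDGE_ACT_BUS_RESET },
	};
	const char *val;
	size_t n, i;

	assert(prop != NULL);

	if (strncmp(prop, prefix, sizeof(prefix) - 1) != 0)
		return WEDGE_ACT_UNKNOWN;

	/* Length of the first comma-separated token after the prefix. */
	val = prop + sizeof(prefix) - 1;
	n = strcspn(val, ",");

	for (i = 0; i < sizeof(methods) / sizeof(methods[0]); i++) {
		if (strlen(methods[i].name) == n &&
		    memcmp(val, methods[i].name, n) == 0)
			return methods[i].act;
	}

	return WEDGE_ACT_UNKNOWN;
}
```

A consumer that cannot perform the first method could walk the remaining
comma-separated tokens the same way. Which method to prefer after repeated
failures is left to consumer policy.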


* Re: [PATCH v12 4/5] drm/i915: Use device wedged event
  2025-02-04  7:05 ` [PATCH v12 4/5] drm/i915: " Raag Jadav
@ 2025-02-04 17:24   ` Rodrigo Vivi
  2025-02-04 17:40     ` Tvrtko Ursulin
  0 siblings, 1 reply; 16+ messages in thread
From: Rodrigo Vivi @ 2025-02-04 17:24 UTC (permalink / raw)
  To: Raag Jadav, Joonas Lahtinen, Tvrtko Ursulin
  Cc: airlied, simona, lucas.demarchi, jani.nikula, christian.koenig,
	alexander.deucher, intel-xe, intel-gfx, dri-devel,
	himal.prasad.ghimiray, aravind.iddamsetty, anshuman.gupta,
	andriy.shevchenko, lina, michal.wajdeczko, andrealmeid, amd-gfx,
	kernel-dev, xaver.hugl, pekka.paalanen

On Tue, Feb 04, 2025 at 12:35:27PM +0530, Raag Jadav wrote:
> Now that we have device wedged event provided by DRM core, make use
> of it and support both driver rebind and bus-reset based recovery.
> With this in place, userspace will be notified of wedged device on
> gt reset failure.
> 
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c

Joonas, Tvrtko, ack on getting this through drm-misc-next?

> index aae5a081cb53..d6dc12fd87c1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1422,6 +1422,9 @@ static void intel_gt_reset_global(struct intel_gt *gt,
>  
>  	if (!test_bit(I915_WEDGED, &gt->reset.flags))
>  		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
> +	else
> +		drm_dev_wedged_event(&gt->i915->drm,
> +				     DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
>  }
>  
>  /**
> -- 
> 2.34.1
> 


* Re: [PATCH v12 4/5] drm/i915: Use device wedged event
  2025-02-04 17:24   ` Rodrigo Vivi
@ 2025-02-04 17:40     ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2025-02-04 17:40 UTC (permalink / raw)
  To: Rodrigo Vivi, Raag Jadav, Joonas Lahtinen
  Cc: airlied, simona, lucas.demarchi, jani.nikula, christian.koenig,
	alexander.deucher, intel-xe, intel-gfx, dri-devel,
	himal.prasad.ghimiray, aravind.iddamsetty, anshuman.gupta,
	andriy.shevchenko, lina, michal.wajdeczko, andrealmeid, amd-gfx,
	kernel-dev, xaver.hugl, pekka.paalanen


On 04/02/2025 17:24, Rodrigo Vivi wrote:
> On Tue, Feb 04, 2025 at 12:35:27PM +0530, Raag Jadav wrote:
>> Now that we have device wedged event provided by DRM core, make use
>> of it and support both driver rebind and bus-reset based recovery.
>> With this in place, userspace will be notified of wedged device on
>> gt reset failure.
>>
>> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
>> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_reset.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> 
> Joonas, Tvrtko, ack on getting this through drm-misc-next?

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

Regards,

Tvrtko

>> index aae5a081cb53..d6dc12fd87c1 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>> @@ -1422,6 +1422,9 @@ static void intel_gt_reset_global(struct intel_gt *gt,
>>   
>>   	if (!test_bit(I915_WEDGED, &gt->reset.flags))
>>   		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
>> +	else
>> +		drm_dev_wedged_event(&gt->i915->drm,
>> +				     DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
>>   }
>>   
>>   /**
>> -- 
>> 2.34.1
>>


* Re: [PATCH v12 0/5] Introduce DRM device wedged event
  2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
                   ` (7 preceding siblings ...)
  2025-02-04  8:11 ` ✗ i915.CI.BAT: failure " Patchwork
@ 2025-02-12  6:57 ` Raag Jadav
  2025-02-12 18:25   ` Rodrigo Vivi
  8 siblings, 1 reply; 16+ messages in thread
From: Raag Jadav @ 2025-02-12  6:57 UTC (permalink / raw)
  To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
	christian.koenig, alexander.deucher, maarten.lankhorst, mripard,
	tzimmermann
  Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen

On Tue, Feb 04, 2025 at 12:35:23PM +0530, Raag Jadav wrote:
> This series introduces device wedged event in DRM subsystem and uses it
> in xe, i915 and amdgpu drivers. Detailed description in commit message.
> 
> This was earlier attempted as xe specific uevent in v1 and v2 on [1].
> Similar work by André Almeida on [2].
> Wedged event support for amdgpu by André Almeida on [3].
> Consumer implementation by Xaver Hugl on [4].
> 
>  [1] https://patchwork.freedesktop.org/series/136909/
>  [2] https://lore.kernel.org/dri-devel/20221125175203.52481-1-andrealmeid@igalia.com/
>  [3] https://lore.kernel.org/dri-devel/20241216162104.58241-1-andrealmeid@igalia.com/
>  [4] https://invent.kde.org/plasma/kwin/-/merge_requests/7027

Bump. Anything I can do to move this forward?

Raag


* Re: [PATCH v12 0/5] Introduce DRM device wedged event
  2025-02-12  6:57 ` [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
@ 2025-02-12 18:25   ` Rodrigo Vivi
  2025-02-13 17:16     ` Rodrigo Vivi
  0 siblings, 1 reply; 16+ messages in thread
From: Rodrigo Vivi @ 2025-02-12 18:25 UTC (permalink / raw)
  To: Raag Jadav, Simona Vetter, Dave Airlie
  Cc: airlied, simona, lucas.demarchi, jani.nikula, christian.koenig,
	alexander.deucher, maarten.lankhorst, mripard, tzimmermann,
	intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen

On Wed, Feb 12, 2025 at 08:57:10AM +0200, Raag Jadav wrote:
> On Tue, Feb 04, 2025 at 12:35:23PM +0530, Raag Jadav wrote:
> > This series introduces device wedged event in DRM subsystem and uses it
> > in xe, i915 and amdgpu drivers. Detailed description in commit message.
> > 
> > This was earlier attempted as xe specific uevent in v1 and v2 on [1].
> > Similar work by André Almeida on [2].
> > Wedged event support for amdgpu by André Almeida on [3].
> > Consumer implementation by Xaver Hugl on [4].
> > 
> >  [1] https://patchwork.freedesktop.org/series/136909/
> >  [2] https://lore.kernel.org/dri-devel/20221125175203.52481-1-andrealmeid@igalia.com/
> >  [3] https://lore.kernel.org/dri-devel/20241216162104.58241-1-andrealmeid@igalia.com/
> >  [4] https://invent.kde.org/plasma/kwin/-/merge_requests/7027
> 
> Bump. Anything I can do to move this forward?

Well, it would be great if we could get that merge request to remove
the draft and move forward with this approach. But, based on our last
discussion on #dri-devel, I don't see that as a blocker in any way.

And we also have all the reviews and acks needed to move forward with
this on drm-misc.

So, if there are no other objections, I'm planning to push this to
drm-misc-next tomorrow.

> 
> Raag


* Re: [PATCH v12 0/5] Introduce DRM device wedged event
  2025-02-12 18:25   ` Rodrigo Vivi
@ 2025-02-13 17:16     ` Rodrigo Vivi
  0 siblings, 0 replies; 16+ messages in thread
From: Rodrigo Vivi @ 2025-02-13 17:16 UTC (permalink / raw)
  To: Raag Jadav, Simona Vetter, Dave Airlie
  Cc: simona, lucas.demarchi, jani.nikula, christian.koenig,
	alexander.deucher, maarten.lankhorst, mripard, tzimmermann,
	intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
	aravind.iddamsetty, anshuman.gupta, andriy.shevchenko, lina,
	michal.wajdeczko, andrealmeid, amd-gfx, kernel-dev, xaver.hugl,
	pekka.paalanen

On Wed, Feb 12, 2025 at 01:25:05PM -0500, Rodrigo Vivi wrote:
> On Wed, Feb 12, 2025 at 08:57:10AM +0200, Raag Jadav wrote:
> > On Tue, Feb 04, 2025 at 12:35:23PM +0530, Raag Jadav wrote:
> > > This series introduces device wedged event in DRM subsystem and uses it
> > > in xe, i915 and amdgpu drivers. Detailed description in commit message.
> > > 
> > > This was earlier attempted as xe specific uevent in v1 and v2 on [1].
> > > Similar work by André Almeida on [2].
> > > Wedged event support for amdgpu by André Almeida on [3].
> > > Consumer implementation by Xaver Hugl on [4].
> > > 
> > >  [1] https://patchwork.freedesktop.org/series/136909/
> > >  [2] https://lore.kernel.org/dri-devel/20221125175203.52481-1-andrealmeid@igalia.com/
> > >  [3] https://lore.kernel.org/dri-devel/20241216162104.58241-1-andrealmeid@igalia.com/
> > >  [4] https://invent.kde.org/plasma/kwin/-/merge_requests/7027
> > 
> > Bump. Anything I can do to move this forward?
> 
> Well, it would be great if we could get that merge request to remove
> the draft and move forward with this approach. But, based on our last
> discussion on #dri-devel, I don't see that as a blocker in any way.
> 
> And we also have all the reviews and acks needed to move forward with
> this on drm-misc.
> 
> So, if there are no other objections, I'm planning to push this to
> drm-misc-next tomorrow.

done. pushed to drm-misc-next

> 
> > 
> > Raag


Thread overview: 16+ messages
2025-02-04  7:05 [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
2025-02-04  7:05 ` [PATCH v12 1/5] drm: Introduce " Raag Jadav
2025-02-04 15:50   ` Christian König
2025-02-04  7:05 ` [PATCH v12 2/5] drm/doc: Document " Raag Jadav
2025-02-04  7:05 ` [PATCH v12 3/5] drm/xe: Use " Raag Jadav
2025-02-04 17:23   ` Rodrigo Vivi
2025-02-04  7:05 ` [PATCH v12 4/5] drm/i915: " Raag Jadav
2025-02-04 17:24   ` Rodrigo Vivi
2025-02-04 17:40     ` Tvrtko Ursulin
2025-02-04  7:05 ` [PATCH v12 5/5] drm/amdgpu: " Raag Jadav
2025-02-04  7:53 ` ✗ Fi.CI.CHECKPATCH: warning for Introduce DRM device wedged event (rev10) Patchwork
2025-02-04  7:53 ` ✗ Fi.CI.SPARSE: " Patchwork
2025-02-04  8:11 ` ✗ i915.CI.BAT: failure " Patchwork
2025-02-12  6:57 ` [PATCH v12 0/5] Introduce DRM device wedged event Raag Jadav
2025-02-12 18:25   ` Rodrigo Vivi
2025-02-13 17:16     ` Rodrigo Vivi
