* [PATCH v9 1/4] drm: Introduce device wedged event
From: Raag Jadav @ 2024-11-15 5:07 UTC
To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
aravind.iddamsetty, anshuman.gupta, alexander.deucher,
andrealmeid, amd-gfx, kernel-dev, Raag Jadav
Introduce device wedged event, which notifies userspace of 'wedged'
(hanged/unusable) state of the DRM device through a uevent. This is
useful especially in cases where the device is no longer operating as
expected and has become unrecoverable from driver context. Purpose of
this implementation is to provide drivers a generic way to recover with
the help of userspace intervention without taking any drastic measures
in the driver.
A 'wedged' device is basically a dead device that needs attention. The
uevent is the notification that is sent to userspace along with a hint
about what could possibly be attempted to recover the device and bring
it back to usable state. Different drivers may have different ideas of
a 'wedged' device depending on their hardware implementation, and hence
the vendor agnostic nature of the event. It is up to the drivers to
decide when they see the need for recovery and how they want to recover
from the available methods.
Prerequisites
-------------
The driver, before opting for recovery, needs to make sure that the
'wedged' device doesn't harm the system as a whole by taking care of the
prerequisites. Necessary actions must include disabling DMA to system
memory as well as any communication channels with other devices. Further,
the driver must ensure that all dma_fences are signalled and any device
state that the core kernel might depend on is cleaned up. Once the event
is sent, the device must be kept in 'wedged' state until the recovery is
performed. New accesses to the device (IOCTLs) should be blocked,
preferably with an error code that resembles the type of failure the
device has encountered. This will signify the reason for wedging, which
can be reported to the application if needed.
Recovery
--------
Current implementation defines three recovery methods, out of which,
drivers can use any one, multiple or none. Method(s) of choice will be
sent in the uevent environment as ``WEDGED=<method1>[,<method2>]`` in
order of less to more side-effects. If driver is unsure about recovery
or method is unknown (like soft/hard reboot, firmware flashing, hardware
replacement or any other procedure which can't be attempted on the fly),
``WEDGED=unknown`` will be sent instead.
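As a rough illustration of the format described above, here is a userspace model of how the ``WEDGED=`` string could be assembled from a method bitmask. It is a sketch, not the kernel code: the ``WEDGE_*`` macros and ``wedged_format()`` are hypothetical names that only mirror the bit layout proposed in this patch, and plain ``snprintf()`` stands in for the kernel's ``scnprintf()``.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace mirror of the flags proposed in this patch. */
#define WEDGE_NONE      (1u << 0)
#define WEDGE_REBIND    (1u << 1)
#define WEDGE_BUS_RESET (1u << 2)
#define WEDGE_ALL       (WEDGE_NONE | WEDGE_REBIND | WEDGE_BUS_RESET)

/*
 * Build "WEDGED=<method1>[,<method2>]" in order of less to more
 * side-effects. An empty mask or any unrecognised bit yields
 * "WEDGED=unknown". Returns 0 on success, -1 if buf is too small.
 */
int wedged_format(unsigned int methods, char *buf, size_t size)
{
	static const char *const names[] = { "none", "rebind", "bus-reset" };
	size_t len;
	unsigned int i;
	int n;

	if (!methods || (methods & ~WEDGE_ALL)) {
		n = snprintf(buf, size, "WEDGED=unknown");
		return (n < 0 || (size_t)n >= size) ? -1 : 0;
	}

	n = snprintf(buf, size, "WEDGED=");
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	for (i = 0; i < 3; i++) {
		if (!(methods & (1u << i)))
			continue;
		/* Always pass the space that is left, not the full size. */
		n = snprintf(buf + len, size - len, "%s,", names[i]);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}

	buf[len - 1] = '\0';	/* drop the trailing comma */
	return 0;
}
```

With a 32-byte buffer, ``wedged_format(WEDGE_REBIND | WEDGE_BUS_RESET, buf, sizeof(buf))`` leaves ``WEDGED=rebind,bus-reset`` in ``buf``. The longest possible string, with all three methods set, is 28 characters plus the terminator.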
Userspace consumers can parse this event and attempt recovery as per the
following expectations.
=============== ================================
Recovery method Consumer expectations
=============== ================================
none            optional telemetry collection
rebind          unbind + bind driver
bus-reset       unbind + reset bus device + bind
unknown         admin/user policy
=============== ================================
The only exception to this is ``WEDGED=none``, which signifies that the
device was temporarily 'wedged' at some point but was able to recover
using device specific methods like reset. No explicit action is expected
from userspace consumers in this case, but they can still take additional
steps like gathering telemetry information (devcoredump, syslog). This is
useful because the first hang is usually the most critical one which can
result in consequential hangs or complete wedging.
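For consumers that do not want to rely on udev string matching, the value of ``WEDGED`` can also be parsed directly. The following is a minimal sketch under stated assumptions: ``wedged_parse()``, the ``WEDGE_*`` macros and the ``WEDGE_UNKNOWN`` marker are hypothetical userspace names, not part of this patch, and the function expects only the part after ``WEDGED=``.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace mirror of the flags proposed in this patch. */
#define WEDGE_NONE      (1u << 0)
#define WEDGE_REBIND    (1u << 1)
#define WEDGE_BUS_RESET (1u << 2)
/* Not a kernel flag: marks the literal "unknown" value. */
#define WEDGE_UNKNOWN   (1u << 31)

/* True if the n-byte token at tok is exactly name. */
static int token_is(const char *tok, size_t n, const char *name)
{
	return strlen(name) == n && strncmp(tok, name, n) == 0;
}

/*
 * Parse a comma-separated method list such as "rebind,bus-reset"
 * into a bitmask. Returns 0 for an empty value or if any token is
 * not recognised, so callers can fall back to admin/user policy.
 */
unsigned int wedged_parse(const char *value)
{
	unsigned int mask = 0;

	while (*value) {
		size_t n = strcspn(value, ",");

		if (token_is(value, n, "none"))
			mask |= WEDGE_NONE;
		else if (token_is(value, n, "rebind"))
			mask |= WEDGE_REBIND;
		else if (token_is(value, n, "bus-reset"))
			mask |= WEDGE_BUS_RESET;
		else if (token_is(value, n, "unknown"))
			mask |= WEDGE_UNKNOWN;
		else
			return 0;

		value += n;
		if (*value == ',')
			value++;
	}

	return mask;
}
```

A consumer would typically read the value from the uevent environment and treat a return of 0 the same way as ``unknown``, that is, leave the decision to admin/user policy.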
Example
-------
Udev rule::
    SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]", \
    RUN+="/path/to/rebind.sh $env{DEVPATH}"
Recovery script::
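    #!/bin/sh

    # $1 is the DRM card DEVPATH passed in by the udev rule
    DEVPATH=$(readlink -f "/sys/$1/device")
    DEVICE=$(basename "$DEVPATH")
    DRIVER=$(readlink -f "$DEVPATH/driver")

    # Unbind, give the driver a moment to tear down, then bind again
    echo -n "$DEVICE" > "$DRIVER/unbind"
    sleep 1
    echo -n "$DEVICE" > "$DRIVER/bind"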
Customization
-------------
Although basic recovery is possible with a simple script, admin/users can
define custom policies around recovery action. For example, if the driver
supports multiple recovery methods, consumers can opt for the suitable one
based on policy definition. Consumers can also choose to have the device
available for debugging or additional data collection before performing
the recovery. This is useful especially when the driver is unsure about
recovery or method is unknown.
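One possible shape for such a policy, sketched under the assumption that the consumer keeps a bitmask of methods the admin permits: pick the least intrusive method that is both offered by the driver and allowed by policy. ``wedged_pick()`` and the ``WEDGE_*`` macros are hypothetical names that only mirror the bit order proposed in this patch.

```c
/* Hypothetical userspace mirror of the flags proposed in this patch. */
#define WEDGE_NONE      (1u << 0)
#define WEDGE_REBIND    (1u << 1)
#define WEDGE_BUS_RESET (1u << 2)

/*
 * Return the single method to attempt, or 0 if policy permits none
 * of the offered methods. Because the flags are ordered from less
 * to more side-effects, the lowest set bit is the gentlest option.
 */
unsigned int wedged_pick(unsigned int offered, unsigned int allowed)
{
	unsigned int usable = offered & allowed;

	/* Isolate the lowest set bit (0 stays 0). */
	return usable & (0u - usable);
}
```

For example, if the driver offers ``rebind,bus-reset`` but the policy only permits a bus reset, the function returns ``WEDGE_BUS_RESET``. If both are permitted it returns ``WEDGE_REBIND``.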
v4: s/drm_dev_wedged/drm_dev_wedged_event
Use drm_info() (Jani)
Kernel doc adjustment (Aravind)
v5: Send recovery method with uevent (Lina)
v6: Access wedge_recovery_opts[] using helper function (Jani)
Use snprintf() (Jani)
v7: Convert recovery helpers into regular functions (Andy, Jani)
Aesthetic adjustments (Andy)
Handle invalid method cases
v8: Allow sending multiple methods with uevent (Lucas, Michal)
static_assert() globally (Andy)
v9: Provide 'none' method for reset cases (Christian)
Provide recovery opts using switch cases
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
drivers/gpu/drm/drm_drv.c | 63 +++++++++++++++++++++++++++++++++++++++
include/drm/drm_device.h | 8 +++++
include/drm/drm_drv.h | 1 +
3 files changed, 72 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index c2c172eb25df..115e1d1c80ea 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -26,6 +26,7 @@
* DEALINGS IN THE SOFTWARE.
*/
+#include <linux/bitops.h>
#include <linux/debugfs.h>
#include <linux/fs.h>
#include <linux/module.h>
@@ -33,6 +34,7 @@
#include <linux/mount.h>
#include <linux/pseudo_fs.h>
#include <linux/slab.h>
+#include <linux/sprintf.h>
#include <linux/srcu.h>
#include <linux/xarray.h>
@@ -497,6 +499,67 @@ void drm_dev_unplug(struct drm_device *dev)
}
EXPORT_SYMBOL(drm_dev_unplug);
+/*
+ * Available recovery methods for wedged device. To be sent along with device
+ * wedged uevent.
+ */
+static const char *drm_get_wedge_recovery(unsigned int opt)
+{
+ switch (BIT(opt)) {
+ case DRM_WEDGE_RECOVERY_NONE:
+ return "none";
+ case DRM_WEDGE_RECOVERY_REBIND:
+ return "rebind";
+ case DRM_WEDGE_RECOVERY_BUS_RESET:
+ return "bus-reset";
+ default:
+ return NULL;
+ }
+}
+
+/**
+ * drm_dev_wedged_event - generate a device wedged uevent
+ * @dev: DRM device
+ * @method: method(s) to be used for recovery
+ *
+ * This generates a device wedged uevent for the DRM device specified by @dev.
+ * Recovery @method\(s) of choice will be sent in the uevent environment as
+ * ``WEDGED=<method1>[,<method2>]`` in order of less to more side-effects.
+ * If caller is unsure about recovery or @method is unknown (0),
+ * ``WEDGED=unknown`` will be sent instead.
+ *
+ * Returns: 0 on success, negative error code otherwise.
+ */
+int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
+{
+ const char *recovery = NULL;
+ unsigned int len, opt;
+ /* Event string length up to 28+ characters with available methods */
+ char event_string[32];
+ char *envp[] = { event_string, NULL };
+
+ len = scnprintf(event_string, sizeof(event_string), "%s", "WEDGED=");
+
+ for_each_set_bit(opt, &method, BITS_PER_TYPE(method)) {
+ recovery = drm_get_wedge_recovery(opt);
+ if (drm_WARN(dev, !recovery, "device wedged, invalid recovery method %u\n", opt))
+ break;
+
+ len += scnprintf(event_string + len, sizeof(event_string) - len, "%s,", recovery);
+ }
+
+ if (recovery)
+ /* Get rid of trailing comma */
+ event_string[len - 1] = '\0';
+ else
+ /* Caller is unsure about recovery, do the best we can at this point. */
+ snprintf(event_string, sizeof(event_string), "%s", "WEDGED=unknown");
+
+ drm_info(dev, "device wedged, needs recovery\n");
+ return kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_dev_wedged_event);
+
/*
* DRM internal mount
* We want to be able to allocate our own "struct address_space" to control
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index c91f87b5242d..6ea54a578cda 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -21,6 +21,14 @@ struct inode;
struct pci_dev;
struct pci_controller;
+/*
+ * Recovery methods for wedged device in order of less to more side-effects.
+ * To be used with drm_dev_wedged_event() as recovery @method. Callers can
+ * use any one, multiple (or'd) or none depending on their needs.
+ */
+#define DRM_WEDGE_RECOVERY_NONE BIT(0) /* optional telemetry collection */
+#define DRM_WEDGE_RECOVERY_REBIND BIT(1) /* unbind + bind driver */
+#define DRM_WEDGE_RECOVERY_BUS_RESET BIT(2) /* unbind + reset bus device + bind */
/**
* enum switch_power_state - power state of drm device
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 1bbbcb8e2d23..f41a82839e28 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -479,6 +479,7 @@ void drm_put_dev(struct drm_device *dev);
bool drm_dev_enter(struct drm_device *dev, int *idx);
void drm_dev_exit(int idx);
void drm_dev_unplug(struct drm_device *dev);
+int drm_dev_wedged_event(struct drm_device *dev, unsigned long method);
/**
* drm_dev_is_unplugged - is a DRM device unplugged
--
2.34.1
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
From: Aravind Iddamsetty @ 2024-11-18 14:56 UTC
To: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko,
christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On 15/11/24 10:37, Raag Jadav wrote:
> Introduce device wedged event, which notifies userspace of 'wedged'
> (hanged/unusable) state of the DRM device through a uevent. This is
> useful especially in cases where the device is no longer operating as
> expected and has become unrecoverable from driver context. Purpose of
> this implementation is to provide drivers a generic way to recover with
> the help of userspace intervention without taking any drastic measures
> in the driver.
>
> A 'wedged' device is basically a dead device that needs attention. The
> uevent is the notification that is sent to userspace along with a hint
> about what could possibly be attempted to recover the device and bring
> it back to usable state. Different drivers may have different ideas of
> a 'wedged' device depending on their hardware implementation, and hence
> the vendor agnostic nature of the event. It is up to the drivers to
> decide when they see the need for recovery and how they want to recover
> from the available methods.
>
> Prerequisites
> -------------
>
> The driver, before opting for recovery, needs to make sure that the
> 'wedged' device doesn't harm the system as a whole by taking care of the
> prerequisites. Necessary actions must include disabling DMA to system
> memory as well as any communication channels with other devices. Further,
> the driver must ensure that all dma_fences are signalled and any device
> state that the core kernel might depend on are cleaned up. Once the event
> is sent, the device must be kept in 'wedged' state until the recovery is
> performed. New accesses to the device (IOCTLs) should be blocked,
> preferably with an error code that resembles the type of failure the
> device has encountered. This will signify the reason for wedging which
> can be reported to the application if needed.
should we even drop the mmaps we created?
Thanks,
Aravind.
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
From: Raag Jadav @ 2024-11-22 7:07 UTC
To: Aravind Iddamsetty
Cc: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, christian.koenig,
intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
> On 15/11/24 10:37, Raag Jadav wrote:
> > Introduce device wedged event, which notifies userspace of 'wedged'
> > (hanged/unusable) state of the DRM device through a uevent. This is
> > useful especially in cases where the device is no longer operating as
> > expected and has become unrecoverable from driver context. Purpose of
> > this implementation is to provide drivers a generic way to recover with
> > the help of userspace intervention without taking any drastic measures
> > in the driver.
> >
> > A 'wedged' device is basically a dead device that needs attention. The
> > uevent is the notification that is sent to userspace along with a hint
> > about what could possibly be attempted to recover the device and bring
> > it back to usable state. Different drivers may have different ideas of
> > a 'wedged' device depending on their hardware implementation, and hence
> > the vendor agnostic nature of the event. It is up to the drivers to
> > decide when they see the need for recovery and how they want to recover
> > from the available methods.
> >
> > Prerequisites
> > -------------
> >
> > The driver, before opting for recovery, needs to make sure that the
> > 'wedged' device doesn't harm the system as a whole by taking care of the
> > prerequisites. Necessary actions must include disabling DMA to system
> > memory as well as any communication channels with other devices. Further,
> > the driver must ensure that all dma_fences are signalled and any device
> > state that the core kernel might depend on are cleaned up. Once the event
> > is sent, the device must be kept in 'wedged' state until the recovery is
> > performed. New accesses to the device (IOCTLs) should be blocked,
> > preferably with an error code that resembles the type of failure the
> > device has encountered. This will signify the reason for wedging which
> > can be reported to the application if needed.
>
> should we even drop the mmaps we created?
Whatever is required for a clean recovery, yes.
Although how would this play out? Do we risk losing display?
Or any other possible side-effects?
Raag
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
From: Christian König @ 2024-11-22 10:09 UTC
To: Raag Jadav, Aravind Iddamsetty
Cc: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, intel-xe, intel-gfx,
dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev
Am 22.11.24 um 08:07 schrieb Raag Jadav:
> On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
>> On 15/11/24 10:37, Raag Jadav wrote:
>>> Introduce device wedged event, which notifies userspace of 'wedged'
>>> (hanged/unusable) state of the DRM device through a uevent. This is
>>> useful especially in cases where the device is no longer operating as
>>> expected and has become unrecoverable from driver context. Purpose of
>>> this implementation is to provide drivers a generic way to recover with
>>> the help of userspace intervention without taking any drastic measures
>>> in the driver.
>>>
>>> A 'wedged' device is basically a dead device that needs attention. The
>>> uevent is the notification that is sent to userspace along with a hint
>>> about what could possibly be attempted to recover the device and bring
>>> it back to usable state. Different drivers may have different ideas of
>>> a 'wedged' device depending on their hardware implementation, and hence
>>> the vendor agnostic nature of the event. It is up to the drivers to
>>> decide when they see the need for recovery and how they want to recover
>>> from the available methods.
>>>
>>> Prerequisites
>>> -------------
>>>
>>> The driver, before opting for recovery, needs to make sure that the
>>> 'wedged' device doesn't harm the system as a whole by taking care of the
>>> prerequisites. Necessary actions must include disabling DMA to system
>>> memory as well as any communication channels with other devices. Further,
>>> the driver must ensure that all dma_fences are signalled and any device
>>> state that the core kernel might depend on are cleaned up. Once the event
>>> is sent, the device must be kept in 'wedged' state until the recovery is
>>> performed. New accesses to the device (IOCTLs) should be blocked,
>>> preferably with an error code that resembles the type of failure the
>>> device has encountered. This will signify the reason for wedging which
>>> can be reported to the application if needed.
>> should we even drop the mmaps we created?
> Whatever is required for a clean recovery, yes.
>
> Although how would this play out? Do we risk losing display?
> Or any other possible side-effects?
Before sending a wedge event all DMA transfers of the device have to be
blocked.
So yes, all display, mmap() and file descriptor connections you had with
the device would need to be re-created.
Regards,
Christian.
>
> Raag
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
From: Raag Jadav @ 2024-11-22 16:02 UTC
To: Christian König
Cc: Aravind Iddamsetty, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko, intel-xe,
intel-gfx, dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev
On Fri, Nov 22, 2024 at 11:09:32AM +0100, Christian König wrote:
> Am 22.11.24 um 08:07 schrieb Raag Jadav:
> > On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
> > > On 15/11/24 10:37, Raag Jadav wrote:
> > > > Introduce device wedged event, which notifies userspace of 'wedged'
> > > > (hanged/unusable) state of the DRM device through a uevent. This is
> > > > useful especially in cases where the device is no longer operating as
> > > > expected and has become unrecoverable from driver context. Purpose of
> > > > this implementation is to provide drivers a generic way to recover with
> > > > the help of userspace intervention without taking any drastic measures
> > > > in the driver.
> > > >
> > > > A 'wedged' device is basically a dead device that needs attention. The
> > > > uevent is the notification that is sent to userspace along with a hint
> > > > about what could possibly be attempted to recover the device and bring
> > > > it back to usable state. Different drivers may have different ideas of
> > > > a 'wedged' device depending on their hardware implementation, and hence
> > > > the vendor agnostic nature of the event. It is up to the drivers to
> > > > decide when they see the need for recovery and how they want to recover
> > > > from the available methods.
> > > >
> > > > Prerequisites
> > > > -------------
> > > >
> > > > The driver, before opting for recovery, needs to make sure that the
> > > > 'wedged' device doesn't harm the system as a whole by taking care of the
> > > > prerequisites. Necessary actions must include disabling DMA to system
> > > > memory as well as any communication channels with other devices. Further,
> > > > the driver must ensure that all dma_fences are signalled and any device
> > > > state that the core kernel might depend on are cleaned up. Once the event
> > > > is sent, the device must be kept in 'wedged' state until the recovery is
> > > > performed. New accesses to the device (IOCTLs) should be blocked,
> > > > preferably with an error code that resembles the type of failure the
> > > > device has encountered. This will signify the reason for wedging which
> > > > can be reported to the application if needed.
> > > should we even drop the mmaps we created?
> > Whatever is required for a clean recovery, yes.
> >
> > Although how would this play out? Do we risk losing display?
> > Or any other possible side-effects?
>
> Before sending a wedge event all DMA transfers of the device have to be
> blocked.
>
> So yes, all display, mmap() and file descriptor connections you had with the
> device would need to be re-created.
Does it mean we'd have to rely on userspace to unmap()?
Raag
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
From: Aravind Iddamsetty @ 2024-11-25 5:56 UTC
To: Raag Jadav
Cc: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, intel-xe, intel-gfx,
dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev,
Christian König
On 22/11/24 21:32, Raag Jadav wrote:
> On Fri, Nov 22, 2024 at 11:09:32AM +0100, Christian König wrote:
>> Am 22.11.24 um 08:07 schrieb Raag Jadav:
>>> On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
>>>> On 15/11/24 10:37, Raag Jadav wrote:
>>>>> Introduce device wedged event, which notifies userspace of 'wedged'
>>>>> (hanged/unusable) state of the DRM device through a uevent. This is
>>>>> useful especially in cases where the device is no longer operating as
>>>>> expected and has become unrecoverable from driver context. Purpose of
>>>>> this implementation is to provide drivers a generic way to recover with
>>>>> the help of userspace intervention without taking any drastic measures
>>>>> in the driver.
>>>>>
>>>>> A 'wedged' device is basically a dead device that needs attention. The
>>>>> uevent is the notification that is sent to userspace along with a hint
>>>>> about what could possibly be attempted to recover the device and bring
>>>>> it back to usable state. Different drivers may have different ideas of
>>>>> a 'wedged' device depending on their hardware implementation, and hence
>>>>> the vendor agnostic nature of the event. It is up to the drivers to
>>>>> decide when they see the need for recovery and how they want to recover
>>>>> from the available methods.
>>>>>
>>>>> Prerequisites
>>>>> -------------
>>>>>
>>>>> The driver, before opting for recovery, needs to make sure that the
>>>>> 'wedged' device doesn't harm the system as a whole by taking care of the
>>>>> prerequisites. Necessary actions must include disabling DMA to system
>>>>> memory as well as any communication channels with other devices. Further,
>>>>> the driver must ensure that all dma_fences are signalled and any device
>>>>> state that the core kernel might depend on are cleaned up. Once the event
>>>>> is sent, the device must be kept in 'wedged' state until the recovery is
>>>>> performed. New accesses to the device (IOCTLs) should be blocked,
>>>>> preferably with an error code that resembles the type of failure the
>>>>> device has encountered. This will signify the reason for wedging which
>>>>> can be reported to the application if needed.
>>>> should we even drop the mmaps we created?
>>> Whatever is required for a clean recovery, yes.
>>>
>>> Although how would this play out? Do we risk losing display?
>>> Or any other possible side-effects?
>> Before sending a wedge event all DMA transfers of the device have to be
>> blocked.
>>
>> So yes, all display, mmap() and file descriptor connections you had with the
>> device would need to be re-created.
> Does it mean we'd have to rely on userspace to unmap()?
I'm not sure of display, but at least all user mappings can be destroyed
using drm_vma_node_unmap.
Thanks,
Aravind.
>
> Raag
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
From: Christian König @ 2024-11-25 10:27 UTC
To: Aravind Iddamsetty, Raag Jadav
Cc: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, intel-xe, intel-gfx,
dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev
Am 25.11.24 um 06:56 schrieb Aravind Iddamsetty:
> On 22/11/24 21:32, Raag Jadav wrote:
>> On Fri, Nov 22, 2024 at 11:09:32AM +0100, Christian König wrote:
>>> Am 22.11.24 um 08:07 schrieb Raag Jadav:
>>>> On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
>>>>> On 15/11/24 10:37, Raag Jadav wrote:
>>>>>> Introduce device wedged event, which notifies userspace of 'wedged'
>>>>>> (hanged/unusable) state of the DRM device through a uevent. This is
>>>>>> useful especially in cases where the device is no longer operating as
>>>>>> expected and has become unrecoverable from driver context. Purpose of
>>>>>> this implementation is to provide drivers a generic way to recover with
>>>>>> the help of userspace intervention without taking any drastic measures
>>>>>> in the driver.
>>>>>>
>>>>>> A 'wedged' device is basically a dead device that needs attention. The
>>>>>> uevent is the notification that is sent to userspace along with a hint
>>>>>> about what could possibly be attempted to recover the device and bring
>>>>>> it back to usable state. Different drivers may have different ideas of
>>>>>> a 'wedged' device depending on their hardware implementation, and hence
>>>>>> the vendor agnostic nature of the event. It is up to the drivers to
>>>>>> decide when they see the need for recovery and how they want to recover
>>>>>> from the available methods.
>>>>>>
>>>>>> Prerequisites
>>>>>> -------------
>>>>>>
>>>>>> The driver, before opting for recovery, needs to make sure that the
>>>>>> 'wedged' device doesn't harm the system as a whole by taking care of the
>>>>>> prerequisites. Necessary actions must include disabling DMA to system
>>>>>> memory as well as any communication channels with other devices. Further,
>>>>>> the driver must ensure that all dma_fences are signalled and any device
>>>>>> state that the core kernel might depend on are cleaned up. Once the event
>>>>>> is sent, the device must be kept in 'wedged' state until the recovery is
>>>>>> performed. New accesses to the device (IOCTLs) should be blocked,
>>>>>> preferably with an error code that resembles the type of failure the
>>>>>> device has encountered. This will signify the reason for wegeding which
>>>>>> can be reported to the application if needed.
>>>>> should we even drop the mmaps we created?
>>>> Whatever is required for a clean recovery, yes.
>>>>
>>>> Although how would this play out? Do we risk loosing display?
>>>> Or any other possible side-effects?
>>> Before sending a wedge event all DMA transfers of the device have to be
>>> blocked.
>>>
>>> So yes, all display, mmap() and file descriptor connections you had with the
>>> device would need to be re-created.
>> Does it mean we'd have to rely on userspace to unmap()?
>
> I'm not sure of display, but at least all user mappings can be destroyed
> using drm_vma_node_unmap.
That is not really correct. The mappings are not destroyed, they are
invalidated.
On access a page fault is generated and TTM should redirect the access
to the dummy page.
The userspace application still has to unmap the VMA to destroy it.
Regards,
Christian.
>
> Thanks,
> Aravind.
>> Raag
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
2024-11-22 16:02 ` Raag Jadav
2024-11-25 5:56 ` Aravind Iddamsetty
@ 2024-11-25 9:32 ` Christian König
2024-11-26 6:38 ` Raag Jadav
1 sibling, 1 reply; 24+ messages in thread
From: Christian König @ 2024-11-25 9:32 UTC (permalink / raw)
To: Raag Jadav
Cc: Aravind Iddamsetty, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko, intel-xe,
intel-gfx, dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev
Am 22.11.24 um 17:02 schrieb Raag Jadav:
> On Fri, Nov 22, 2024 at 11:09:32AM +0100, Christian König wrote:
>> Am 22.11.24 um 08:07 schrieb Raag Jadav:
>>> On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
>>>> On 15/11/24 10:37, Raag Jadav wrote:
>>>>> Introduce device wedged event, which notifies userspace of 'wedged'
>>>>> (hanged/unusable) state of the DRM device through a uevent. This is
>>>>> useful especially in cases where the device is no longer operating as
>>>>> expected and has become unrecoverable from driver context. Purpose of
>>>>> this implementation is to provide drivers a generic way to recover with
>>>>> the help of userspace intervention without taking any drastic measures
>>>>> in the driver.
>>>>>
>>>>> A 'wedged' device is basically a dead device that needs attention. The
>>>>> uevent is the notification that is sent to userspace along with a hint
>>>>> about what could possibly be attempted to recover the device and bring
>>>>> it back to usable state. Different drivers may have different ideas of
>>>>> a 'wedged' device depending on their hardware implementation, and hence
>>>>> the vendor agnostic nature of the event. It is up to the drivers to
>>>>> decide when they see the need for recovery and how they want to recover
>>>>> from the available methods.
>>>>>
>>>>> Prerequisites
>>>>> -------------
>>>>>
>>>>> The driver, before opting for recovery, needs to make sure that the
>>>>> 'wedged' device doesn't harm the system as a whole by taking care of the
>>>>> prerequisites. Necessary actions must include disabling DMA to system
>>>>> memory as well as any communication channels with other devices. Further,
>>>>> the driver must ensure that all dma_fences are signalled and any device
>>>>> state that the core kernel might depend on are cleaned up. Once the event
>>>>> is sent, the device must be kept in 'wedged' state until the recovery is
>>>>> performed. New accesses to the device (IOCTLs) should be blocked,
>>>>> preferably with an error code that resembles the type of failure the
>>>>> device has encountered. This will signify the reason for wegeding which
>>>>> can be reported to the application if needed.
>>>> should we even drop the mmaps we created?
>>> Whatever is required for a clean recovery, yes.
>>>
>>> Although how would this play out? Do we risk loosing display?
>>> Or any other possible side-effects?
>> Before sending a wedge event all DMA transfers of the device have to be
>> blocked.
>>
>> So yes, all display, mmap() and file descriptor connections you had with the
>> device would need to be re-created.
> Does it mean we'd have to rely on userspace to unmap()?
Yes and no :)
The handling should be similar to how hotplug is handled. E.g. the
device becomes inaccessible by normal applications and all mappings
become invalid.
But we don't send a SIGBUS or similar on access, instead all mappings are
redirected to a dummy page which basically swallows all writes and gives
undefined data on reads.
On IOCTLs the applications should get an error code and eventually
restart or at least unmap all their mappings.
Regards,
Christian.
>
> Raag
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
2024-11-25 9:32 ` Christian König
@ 2024-11-26 6:38 ` Raag Jadav
2024-11-26 8:12 ` Christian König
0 siblings, 1 reply; 24+ messages in thread
From: Raag Jadav @ 2024-11-26 6:38 UTC (permalink / raw)
To: Christian König
Cc: Aravind Iddamsetty, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko, intel-xe,
intel-gfx, dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev
On Mon, Nov 25, 2024 at 10:32:42AM +0100, Christian König wrote:
> Am 22.11.24 um 17:02 schrieb Raag Jadav:
> > On Fri, Nov 22, 2024 at 11:09:32AM +0100, Christian König wrote:
> > > Am 22.11.24 um 08:07 schrieb Raag Jadav:
> > > > On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
> > > > > On 15/11/24 10:37, Raag Jadav wrote:
> > > > > > Introduce device wedged event, which notifies userspace of 'wedged'
> > > > > > (hanged/unusable) state of the DRM device through a uevent. This is
> > > > > > useful especially in cases where the device is no longer operating as
> > > > > > expected and has become unrecoverable from driver context. Purpose of
> > > > > > this implementation is to provide drivers a generic way to recover with
> > > > > > the help of userspace intervention without taking any drastic measures
> > > > > > in the driver.
> > > > > >
> > > > > > A 'wedged' device is basically a dead device that needs attention. The
> > > > > > uevent is the notification that is sent to userspace along with a hint
> > > > > > about what could possibly be attempted to recover the device and bring
> > > > > > it back to usable state. Different drivers may have different ideas of
> > > > > > a 'wedged' device depending on their hardware implementation, and hence
> > > > > > the vendor agnostic nature of the event. It is up to the drivers to
> > > > > > decide when they see the need for recovery and how they want to recover
> > > > > > from the available methods.
> > > > > >
> > > > > > Prerequisites
> > > > > > -------------
> > > > > >
> > > > > > The driver, before opting for recovery, needs to make sure that the
> > > > > > 'wedged' device doesn't harm the system as a whole by taking care of the
> > > > > > prerequisites. Necessary actions must include disabling DMA to system
> > > > > > memory as well as any communication channels with other devices. Further,
> > > > > > the driver must ensure that all dma_fences are signalled and any device
> > > > > > state that the core kernel might depend on are cleaned up. Once the event
> > > > > > is sent, the device must be kept in 'wedged' state until the recovery is
> > > > > > performed. New accesses to the device (IOCTLs) should be blocked,
> > > > > > preferably with an error code that resembles the type of failure the
> > > > > > device has encountered. This will signify the reason for wegeding which
> > > > > > can be reported to the application if needed.
> > > > > should we even drop the mmaps we created?
> > > > Whatever is required for a clean recovery, yes.
> > > >
> > > > Although how would this play out? Do we risk loosing display?
> > > > Or any other possible side-effects?
> > > Before sending a wedge event all DMA transfers of the device have to be
> > > blocked.
> > >
> > > So yes, all display, mmap() and file descriptor connections you had with the
> > > device would need to be re-created.
> > Does it mean we'd have to rely on userspace to unmap()?
>
> Yes and no :)
>
> The handling should be similar to how hotplug is handled. E.g. the device
> becomes inaccessible by normal applications all mappings become invalid.
Isn't that just unbind (which is already part of recovery)?
> But we don't send a SIGBUS or similar on access, instead all mappings
> redirected to a dummy page which basically shallows all writes and gives
> undefined data on reads.
>
> On IOCTLs the applications should get an error code and eventually restart
> or at least unmap all their mappings.
Thanks for the detailed explanation.
Rethinking this, the criterion set for the prerequisites is to not do
anything that could possibly harm the system. So I think the important
question is,
with fences signalled and ioctls already blocked, is a live mmap on a wedged
device capable of producing harmful behaviour or unintended side-effects
(at least until the application has the opportunity to unmap() as part of
recovery)?
Raag
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v9 1/4] drm: Introduce device wedged event
2024-11-26 6:38 ` Raag Jadav
@ 2024-11-26 8:12 ` Christian König
0 siblings, 0 replies; 24+ messages in thread
From: Christian König @ 2024-11-26 8:12 UTC (permalink / raw)
To: Raag Jadav
Cc: Aravind Iddamsetty, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko, intel-xe,
intel-gfx, dri-devel, himal.prasad.ghimiray, anshuman.gupta,
alexander.deucher, andrealmeid, amd-gfx, kernel-dev
Am 26.11.24 um 07:38 schrieb Raag Jadav:
> On Mon, Nov 25, 2024 at 10:32:42AM +0100, Christian König wrote:
>> Am 22.11.24 um 17:02 schrieb Raag Jadav:
>>> On Fri, Nov 22, 2024 at 11:09:32AM +0100, Christian König wrote:
>>>> Am 22.11.24 um 08:07 schrieb Raag Jadav:
>>>>> On Mon, Nov 18, 2024 at 08:26:37PM +0530, Aravind Iddamsetty wrote:
>>>>>> On 15/11/24 10:37, Raag Jadav wrote:
>>>>>>> Introduce device wedged event, which notifies userspace of 'wedged'
>>>>>>> (hanged/unusable) state of the DRM device through a uevent. This is
>>>>>>> useful especially in cases where the device is no longer operating as
>>>>>>> expected and has become unrecoverable from driver context. Purpose of
>>>>>>> this implementation is to provide drivers a generic way to recover with
>>>>>>> the help of userspace intervention without taking any drastic measures
>>>>>>> in the driver.
>>>>>>>
>>>>>>> A 'wedged' device is basically a dead device that needs attention. The
>>>>>>> uevent is the notification that is sent to userspace along with a hint
>>>>>>> about what could possibly be attempted to recover the device and bring
>>>>>>> it back to usable state. Different drivers may have different ideas of
>>>>>>> a 'wedged' device depending on their hardware implementation, and hence
>>>>>>> the vendor agnostic nature of the event. It is up to the drivers to
>>>>>>> decide when they see the need for recovery and how they want to recover
>>>>>>> from the available methods.
>>>>>>>
>>>>>>> Prerequisites
>>>>>>> -------------
>>>>>>>
>>>>>>> The driver, before opting for recovery, needs to make sure that the
>>>>>>> 'wedged' device doesn't harm the system as a whole by taking care of the
>>>>>>> prerequisites. Necessary actions must include disabling DMA to system
>>>>>>> memory as well as any communication channels with other devices. Further,
>>>>>>> the driver must ensure that all dma_fences are signalled and any device
>>>>>>> state that the core kernel might depend on are cleaned up. Once the event
>>>>>>> is sent, the device must be kept in 'wedged' state until the recovery is
>>>>>>> performed. New accesses to the device (IOCTLs) should be blocked,
>>>>>>> preferably with an error code that resembles the type of failure the
>>>>>>> device has encountered. This will signify the reason for wegeding which
>>>>>>> can be reported to the application if needed.
>>>>>> should we even drop the mmaps we created?
>>>>> Whatever is required for a clean recovery, yes.
>>>>>
>>>>> Although how would this play out? Do we risk loosing display?
>>>>> Or any other possible side-effects?
>>>> Before sending a wedge event all DMA transfers of the device have to be
>>>> blocked.
>>>>
>>>> So yes, all display, mmap() and file descriptor connections you had with the
>>>> device would need to be re-created.
>>> Does it mean we'd have to rely on userspace to unmap()?
>> Yes and no :)
>>
>> The handling should be similar to how hotplug is handled. E.g. the device
>> becomes inaccessible by normal applications all mappings become invalid.
> Isn't that just unbind (which is already part of recovery)?
No, unbind just invalidates all mappings but it doesn't catch any page
faults which would validate them again.
The driver or framework must make sure that page faults now get
redirected to a dummy page. See ttm_bo_vm_dummy_page() for how TTM
handles that for all drivers using it.
Not sure about i915, since it never deals with device memory it can
potentially just keep the access to the allocated system memory intact.
>> But we don't send a SIGBUS or similar on access, instead all mappings
>> redirected to a dummy page which basically shallows all writes and gives
>> undefined data on reads.
>>
>> On IOCTLs the applications should get an error code and eventually restart
>> or at least unmap all their mappings.
> Thanks for the detailed explanation.
>
> Rethinking about this, the criteria set for prerequisites is to not do
> anything that could possibly harm the system. So I think the important
> question is,
>
> with fences signalled and ioctls already blocked, is live mmap on a wedged
> device capable of producing harmful behaviour or unintended side-effects
> (atleast until the application has the opportunity to unmap() as part of
> recovery)?
I think we are already rather good there.
The potential options are to redirect everything to a dummy page or to
crash the application by sending a SIGBUS.
Redirecting everything to the dummy page sounds like the more defensive
approach.
Regards,
Christian.
>
> Raag
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v9 2/4] drm/doc: Document device wedged event
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
2024-11-15 5:07 ` [PATCH v9 1/4] drm: Introduce " Raag Jadav
@ 2024-11-15 5:07 ` Raag Jadav
2024-11-15 9:19 ` Christian König
2024-11-15 5:07 ` [PATCH v9 3/4] drm/xe: Use " Raag Jadav
` (5 subsequent siblings)
7 siblings, 1 reply; 24+ messages in thread
From: Raag Jadav @ 2024-11-15 5:07 UTC (permalink / raw)
To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
aravind.iddamsetty, anshuman.gupta, alexander.deucher,
andrealmeid, amd-gfx, kernel-dev, Raag Jadav
Add documentation for device wedged event in a new 'Device wedging'
chapter. This describes basic definitions and consumer expectations
along with an example.
v8: Improve documentation (Christian, Rodrigo)
v9: Add prerequisites section (Christian)
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
Documentation/gpu/drm-uapi.rst | 102 ++++++++++++++++++++++++++++++++-
1 file changed, 99 insertions(+), 3 deletions(-)
diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index b75cc9a70d1f..33d9c253d4d6 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -371,9 +371,105 @@ Reporting causes of resets
Apart from propagating the reset through the stack so apps can recover, it's
really useful for driver developers to learn more about what caused the reset in
-the first place. DRM devices should make use of devcoredump to store relevant
-information about the reset, so this information can be added to user bug
-reports.
+the first place. For this, drivers can make use of devcoredump to store relevant
+information about the reset and send device wedged event without recovery method
+(as explained in next chapter) to notify userspace, so this information can be
+collected and added to user bug reports.
+
+Device wedging
+==============
+
+Drivers can optionally make use of device wedged event (implemented as
+drm_dev_wedged_event() in DRM subsystem), which notifies userspace of 'wedged'
+(hanged/unusable) state of the DRM device through a uevent. This is useful
+especially in cases where the device is no longer operating as expected and
+has become unrecoverable from driver context. Purpose of this implementation
+is to provide drivers a generic way to recover with the help of userspace
+intervention without taking any drastic measures in the driver.
+
+A 'wedged' device is basically a dead device that needs attention. The
+uevent is the notification that is sent to userspace along with a hint about
+what could possibly be attempted to recover the device and bring it back to
+usable state. Different drivers may have different ideas of a 'wedged' device
+depending on their hardware implementation, and hence the vendor agnostic
+nature of the event. It is up to the drivers to decide when they see the need
+for recovery and how they want to recover from the available methods.
+
+Prerequisites
+-------------
+
+The driver, before opting for recovery, needs to make sure that the 'wedged'
+device doesn't harm the system as a whole by taking care of the prerequisites.
+Necessary actions must include disabling DMA to system memory as well as any
+communication channels with other devices. Further, the driver must ensure
+that all dma_fences are signalled and any device state that the core kernel
+might depend on is cleaned up. Once the event is sent, the device must be
+kept in 'wedged' state until the recovery is performed. New accesses to the
+device (IOCTLs) should be blocked, preferably with an error code that
+resembles the type of failure the device has encountered. This will signify
+the reason for wedging, which can be reported to the application if needed.
+
+Recovery
+--------
+
+Current implementation defines three recovery methods, out of which, drivers
+can use any one, multiple or none. Method(s) of choice will be sent in the
+uevent environment as ``WEDGED=<method1>[,<method2>]`` in order of less to
+more side-effects. If driver is unsure about recovery or method is unknown
+(like soft/hard reboot, firmware flashing, hardware replacement or any other
+procedure which can't be attempted on the fly), ``WEDGED=unknown`` will be
+sent instead.
+
+Userspace consumers can parse this event and attempt recovery as per the
+following expectations.
+
+ =============== ================================
+ Recovery method Consumer expectations
+ =============== ================================
+ none optional telemetry collection
+ rebind unbind + bind driver
+ bus-reset unbind + reset bus device + bind
+ unknown admin/user policy
+ =============== ================================
+
+The only exception to this is ``WEDGED=none``, which signifies that the
+device was temporarily 'wedged' at some point but was able to recover using
+device specific methods like reset. No explicit action is expected from
+userspace consumers in this case, but they can still take additional steps
+like gathering telemetry information (devcoredump, syslog). This is useful
+because the first hang is usually the most critical one which can result in
+consequential hangs or complete wedging.
+
+Example
+-------
+
+Udev rule::
+
+ SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]",
+ RUN+="/path/to/rebind.sh $env{DEVPATH}"
+
+Recovery script::
+
+ #!/bin/sh
+
+ DEVPATH=$(readlink -f /sys/$1/device)
+ DEVICE=$(basename $DEVPATH)
+ DRIVER=$(readlink -f $DEVPATH/driver)
+
+ echo -n $DEVICE > $DRIVER/unbind
+ sleep 1
+ echo -n $DEVICE > $DRIVER/bind
+
+Customization
+-------------
+
+Although basic recovery is possible with a simple script, admin/users can
+define custom policies around recovery action. For example, if the driver
+supports multiple recovery methods, consumers can opt for the suitable one
+based on policy definition. Consumers can also choose to have the device
+available for debugging or additional data collection before performing the
+recovery. This is useful especially when the driver is unsure about recovery
+or method is unknown.
.. _drm_driver_ioctl:
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v9 2/4] drm/doc: Document device wedged event
2024-11-15 5:07 ` [PATCH v9 2/4] drm/doc: Document " Raag Jadav
@ 2024-11-15 9:19 ` Christian König
2024-11-15 11:44 ` Andy Shevchenko
0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2024-11-15 9:19 UTC (permalink / raw)
To: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
aravind.iddamsetty, anshuman.gupta, alexander.deucher,
andrealmeid, amd-gfx, kernel-dev
Am 15.11.24 um 06:07 schrieb Raag Jadav:
> Add documentation for device wedged event in a new 'Device wedging'
> chapter. The describes basic definitions and consumer expectations
> along with an example.
>
> v8: Improve documentation (Christian, Rodrigo)
> v9: Add prerequisites section (Christian)
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Sounds totally sane to me, but I'm not a native speaker of English so
others should probably look at it as well.
Anyway feel free to add Reviewed-by: Christian König
<christian.koenig@amd.com>.
Regards,
Christian.
> ---
> Documentation/gpu/drm-uapi.rst | 102 ++++++++++++++++++++++++++++++++-
> 1 file changed, 99 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index b75cc9a70d1f..33d9c253d4d6 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -371,9 +371,105 @@ Reporting causes of resets
>
> Apart from propagating the reset through the stack so apps can recover, it's
> really useful for driver developers to learn more about what caused the reset in
> -the first place. DRM devices should make use of devcoredump to store relevant
> -information about the reset, so this information can be added to user bug
> -reports.
> +the first place. For this, drivers can make use of devcoredump to store relevant
> +information about the reset and send device wedged event without recovery method
> +(as explained in next chapter) to notify userspace, so this information can be
> +collected and added to user bug reports.
> +
> +Device wedging
> +==============
> +
> +Drivers can optionally make use of device wedged event (implemented as
> +drm_dev_wedged_event() in DRM subsystem), which notifies userspace of 'wedged'
> +(hanged/unusable) state of the DRM device through a uevent. This is useful
> +especially in cases where the device is no longer operating as expected and
> +has become unrecoverable from driver context. Purpose of this implementation
> +is to provide drivers a generic way to recover with the help of userspace
> +intervention without taking any drastic measures in the driver.
> +
> +A 'wedged' device is basically a dead device that needs attention. The
> +uevent is the notification that is sent to userspace along with a hint about
> +what could possibly be attempted to recover the device and bring it back to
> +usable state. Different drivers may have different ideas of a 'wedged' device
> +depending on their hardware implementation, and hence the vendor agnostic
> +nature of the event. It is up to the drivers to decide when they see the need
> +for recovery and how they want to recover from the available methods.
> +
> +Prerequisites
> +-------------
> +
> +The driver, before opting for recovery, needs to make sure that the 'wedged'
> +device doesn't harm the system as a whole by taking care of the prerequisites.
> +Necessary actions must include disabling DMA to system memory as well as any
> +communication channels with other devices. Further, the driver must ensure
> +that all dma_fences are signalled and any device state that the core kernel
> +might depend on are cleaned up. Once the event is sent, the device must be
> +kept in 'wedged' state until the recovery is performed. New accesses to the
> +device (IOCTLs) should be blocked, preferably with an error code that
> +resembles the type of failure the device has encountered. This will signify
> +the reason for wegeding which can be reported to the application if needed.
> +
> +Recovery
> +--------
> +
> +Current implementation defines three recovery methods, out of which, drivers
> +can use any one, multiple or none. Method(s) of choice will be sent in the
> +uevent environment as ``WEDGED=<method1>[,<method2>]`` in order of less to
> +more side-effects. If driver is unsure about recovery or method is unknown
> +(like soft/hard reboot, firmware flashing, hardware replacement or any other
> +procedure which can't be attempted on the fly), ``WEDGED=unknown`` will be
> +sent instead.
> +
> +Userspace consumers can parse this event and attempt recovery as per the
> +following expectations.
> +
> + =============== ================================
> + Recovery method Consumer expectations
> + =============== ================================
> + none optional telemetry collection
> + rebind unbind + bind driver
> + bus-reset unbind + reset bus device + bind
> + unknown admin/user policy
> + =============== ================================
> +
> +The only exception to this is ``WEDGED=none``, which signifies that the
> +device was temporarily 'wedged' at some point but was able to recover using
> +device specific methods like reset. No explicit action is expected from
> +userspace consumers in this case, but they can still take additional steps
> +like gathering telemetry information (devcoredump, syslog). This is useful
> +because the first hang is usually the most critical one which can result in
> +consequential hangs or complete wedging.
> +
> +Example
> +-------
> +
> +Udev rule::
> +
> + SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]",
> + RUN+="/path/to/rebind.sh $env{DEVPATH}"
> +
> +Recovery script::
> +
> + #!/bin/sh
> +
> + DEVPATH=$(readlink -f /sys/$1/device)
> + DEVICE=$(basename $DEVPATH)
> + DRIVER=$(readlink -f $DEVPATH/driver)
> +
> + echo -n $DEVICE > $DRIVER/unbind
> + sleep 1
> + echo -n $DEVICE > $DRIVER/bind
> +
> +Customization
> +-------------
> +
> +Although basic recovery is possible with a simple script, admin/users can
> +define custom policies around recovery action. For example, if the driver
> +supports multiple recovery methods, consumers can opt for the suitable one
> +based on policy definition. Consumers can also choose to have the device
> +available for debugging or additional data collection before performing the
> +recovery. This is useful especially when the driver is unsure about recovery
> +or method is unknown.
>
> .. _drm_driver_ioctl:
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v9 2/4] drm/doc: Document device wedged event
2024-11-15 9:19 ` Christian König
@ 2024-11-15 11:44 ` Andy Shevchenko
0 siblings, 0 replies; 24+ messages in thread
From: Andy Shevchenko @ 2024-11-15 11:44 UTC (permalink / raw)
To: Christian König
Cc: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, lina, michal.wajdeczko, intel-xe, intel-gfx,
dri-devel, himal.prasad.ghimiray, aravind.iddamsetty,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On Fri, Nov 15, 2024 at 10:19:42AM +0100, Christian König wrote:
> Am 15.11.24 um 06:07 schrieb Raag Jadav:
> > Add documentation for device wedged event in a new 'Device wedging'
> > chapter. The describes basic definitions and consumer expectations
> > along with an example.
> >
> > v8: Improve documentation (Christian, Rodrigo)
> > v9: Add prerequisites section (Christian)
> >
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
>
> Sounds totally sane to me, but I'm not a native speaker of English so other
> should probably look at it as well.
> Anyway feel free to add Reviewed-by: Christian König
> <christian.koenig@amd.com>.
Side note: I don't believe tools support embedded tags, so we usually give
one tag per line, not embedded in a sentence. Otherwise it adds a manual job
to harvest them and ensure no typos are made during that.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v9 3/4] drm/xe: Use device wedged event
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
2024-11-15 5:07 ` [PATCH v9 1/4] drm: Introduce " Raag Jadav
2024-11-15 5:07 ` [PATCH v9 2/4] drm/doc: Document " Raag Jadav
@ 2024-11-15 5:07 ` Raag Jadav
2024-11-19 3:28 ` Aravind Iddamsetty
2024-11-19 4:55 ` Ghimiray, Himal Prasad
2024-11-15 5:07 ` [PATCH v9 4/4] drm/i915: " Raag Jadav
` (4 subsequent siblings)
7 siblings, 2 replies; 24+ messages in thread
From: Raag Jadav @ 2024-11-15 5:07 UTC (permalink / raw)
To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
aravind.iddamsetty, anshuman.gupta, alexander.deucher,
andrealmeid, amd-gfx, kernel-dev, Raag Jadav
This was previously attempted as an xe-specific reset uevent but was
dropped in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now")
as part of refactoring.
Now that the DRM core provides a device wedged event, make use of it
and support both driver rebind and bus-reset based recovery. With this
in place, userspace is notified when the device is wedged and can take
the corresponding action to recover it.
$ udevadm monitor --property --kernel
monitor will print the received events for:
KERNEL - the kernel uevent
KERNEL[265.802982] change /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
SUBSYSTEM=drm
WEDGED=rebind,bus-reset
DEVNAME=/dev/dri/card0
DEVTYPE=drm_minor
SEQNUM=5208
MAJOR=226
MINOR=0
v2: Change authorship to Himal (Aravind)
Add uevent for all device wedged cases (Aravind)
v3: Generic re-implementation in DRM subsystem (Lucas)
v4: Change authorship to Raag (Aravind)
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
drivers/gpu/drm/xe/xe_device.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 0e2dd691bdae..5878b331e35c 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -989,11 +989,12 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg)
* xe_device_declare_wedged - Declare device wedged
* @xe: xe device instance
*
- * This is a final state that can only be cleared with a mudule
+ * This is a final state that can only be cleared with a module
* re-probe (unbind + bind).
* In this state every IOCTL will be blocked so the GT cannot be used.
* In general it will be called upon any critical error such as gt reset
- * failure or guc loading failure.
+ * failure or guc loading failure. Userspace will be notified of this state
+ * by a DRM uevent.
* If xe.wedged module parameter is set to 2, this function will be called
* on every single execution timeout (a.k.a. GPU hang) right after devcoredump
* snapshot capture. In this mode, GT reset won't be attempted so the state of
@@ -1023,6 +1024,10 @@ void xe_device_declare_wedged(struct xe_device *xe)
"IOCTLs and executions are blocked. Only a rebind may clear the failure\n"
"Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
dev_name(xe->drm.dev));
+
+ /* Notify userspace of wedged device */
+ drm_dev_wedged_event(&xe->drm,
+ DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
}
for_each_gt(gt, xe, id)
--
2.34.1
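Illustration only, not part of this patch: the commit message above shows the uevent carrying `WEDGED=rebind,bus-reset`. A userspace consumer might turn that property into a set of flags before deciding which recovery to attempt. The sketch below assumes the value is a comma-separated list of method names, as in the sample output. The function name and the flag values are hypothetical and are not taken from any kernel or udev header.

```c
#include <string.h>

/* Hypothetical flag values for this sketch; not kernel definitions. */
enum wedge_method {
	WEDGE_METHOD_REBIND    = 1 << 0,
	WEDGE_METHOD_BUS_RESET = 1 << 1,
	WEDGE_METHOD_UNKNOWN   = 1 << 2,
};

/*
 * Parse one uevent property line such as "WEDGED=rebind,bus-reset".
 * Returns a bitmask of wedge_method flags, or 0 if the line is not a
 * WEDGED property. Unrecognised method names set WEDGE_METHOD_UNKNOWN.
 */
unsigned int parse_wedged_property(const char *line)
{
	static const char key[] = "WEDGED=";
	unsigned int methods = 0;
	const char *p;

	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return 0;

	p = line + sizeof(key) - 1;
	while (*p) {
		/* Length of the current token, up to the next comma or end. */
		size_t len = strcspn(p, ",");

		if (len == 6 && strncmp(p, "rebind", len) == 0)
			methods |= WEDGE_METHOD_REBIND;
		else if (len == 9 && strncmp(p, "bus-reset", len) == 0)
			methods |= WEDGE_METHOD_BUS_RESET;
		else if (len > 0)
			methods |= WEDGE_METHOD_UNKNOWN;

		p += len;
		if (*p == ',')
			p++;
	}
	return methods;
}
```

A consumer would call `parse_wedged_property()` on each property line of a `change` event from the drm subsystem and treat a zero result as "not a wedged event".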
* Re: [PATCH v9 3/4] drm/xe: Use device wedged event
2024-11-15 5:07 ` [PATCH v9 3/4] drm/xe: Use " Raag Jadav
@ 2024-11-19 3:28 ` Aravind Iddamsetty
2024-11-19 4:55 ` Ghimiray, Himal Prasad
1 sibling, 0 replies; 24+ messages in thread
From: Aravind Iddamsetty @ 2024-11-19 3:28 UTC (permalink / raw)
To: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko,
christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On 15/11/24 10:37, Raag Jadav wrote:
> This was previously attempted as xe specific reset uevent but dropped
> in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now")
> as part of refactoring.
>
> Now that we have device wedged event provided by DRM core, make use
> of it and support both driver rebind and bus-reset based recovery.
> With this in place userspace will be notified of wedged device, on
> the basis of which, userspace may take respective action to recover
> the device.
>
> $ udevadm monitor --property --kernel
> monitor will print the received events for:
> KERNEL - the kernel uevent
>
> KERNEL[265.802982] change /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
> ACTION=change
> DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
> SUBSYSTEM=drm
> WEDGED=rebind,bus-reset
> DEVNAME=/dev/dri/card0
> DEVTYPE=drm_minor
> SEQNUM=5208
> MAJOR=226
> MINOR=0
LGTM.
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Thanks,
Aravind.
>
> v2: Change authorship to Himal (Aravind)
> Add uevent for all device wedged cases (Aravind)
> v3: Generic re-implementation in DRM subsystem (Lucas)
> v4: Change authorship to Raag (Aravind)
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 0e2dd691bdae..5878b331e35c 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -989,11 +989,12 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg)
> * xe_device_declare_wedged - Declare device wedged
> * @xe: xe device instance
> *
> - * This is a final state that can only be cleared with a mudule
> + * This is a final state that can only be cleared with a module
> * re-probe (unbind + bind).
> * In this state every IOCTL will be blocked so the GT cannot be used.
> * In general it will be called upon any critical error such as gt reset
> - * failure or guc loading failure.
> + * failure or guc loading failure. Userspace will be notified of this state
> + * by a DRM uevent.
> * If xe.wedged module parameter is set to 2, this function will be called
> * on every single execution timeout (a.k.a. GPU hang) right after devcoredump
> * snapshot capture. In this mode, GT reset won't be attempted so the state of
> @@ -1023,6 +1024,10 @@ void xe_device_declare_wedged(struct xe_device *xe)
> "IOCTLs and executions are blocked. Only a rebind may clear the failure\n"
> "Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
> dev_name(xe->drm.dev));
> +
> + /* Notify userspace of wedged device */
> + drm_dev_wedged_event(&xe->drm,
> + DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
> }
>
> for_each_gt(gt, xe, id)
* Re: [PATCH v9 3/4] drm/xe: Use device wedged event
2024-11-15 5:07 ` [PATCH v9 3/4] drm/xe: Use " Raag Jadav
2024-11-19 3:28 ` Aravind Iddamsetty
@ 2024-11-19 4:55 ` Ghimiray, Himal Prasad
2024-11-20 7:26 ` Raag Jadav
1 sibling, 1 reply; 24+ messages in thread
From: Ghimiray, Himal Prasad @ 2024-11-19 4:55 UTC (permalink / raw)
To: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko,
christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, aravind.iddamsetty,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On 15-11-2024 10:37, Raag Jadav wrote:
> This was previously attempted as xe specific reset uevent but dropped
> in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now")
> as part of refactoring.
>
> Now that we have device wedged event provided by DRM core, make use
> of it and support both driver rebind and bus-reset based recovery.
> With this in place userspace will be notified of wedged device, on
> the basis of which, userspace may take respective action to recover
> the device.
>
> $ udevadm monitor --property --kernel
> monitor will print the received events for:
> KERNEL - the kernel uevent
>
> KERNEL[265.802982] change /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
> ACTION=change
> DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
> SUBSYSTEM=drm
> WEDGED=rebind,bus-reset
> DEVNAME=/dev/dri/card0
> DEVTYPE=drm_minor
> SEQNUM=5208
> MAJOR=226
> MINOR=0
The patch in itself looks good. Do we have IGT tests in place to
validate this uevent?
BR
Himal
>
> v2: Change authorship to Himal (Aravind)
> Add uevent for all device wedged cases (Aravind)
> v3: Generic re-implementation in DRM subsystem (Lucas)
> v4: Change authorship to Raag (Aravind)
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 0e2dd691bdae..5878b331e35c 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -989,11 +989,12 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg)
> * xe_device_declare_wedged - Declare device wedged
> * @xe: xe device instance
> *
> - * This is a final state that can only be cleared with a mudule
> + * This is a final state that can only be cleared with a module
> * re-probe (unbind + bind).
> * In this state every IOCTL will be blocked so the GT cannot be used.
> * In general it will be called upon any critical error such as gt reset
> - * failure or guc loading failure.
> + * failure or guc loading failure. Userspace will be notified of this state
> + * by a DRM uevent.
> * If xe.wedged module parameter is set to 2, this function will be called
> * on every single execution timeout (a.k.a. GPU hang) right after devcoredump
> * snapshot capture. In this mode, GT reset won't be attempted so the state of
> @@ -1023,6 +1024,10 @@ void xe_device_declare_wedged(struct xe_device *xe)
> "IOCTLs and executions are blocked. Only a rebind may clear the failure\n"
> "Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n",
> dev_name(xe->drm.dev));
> +
> + /* Notify userspace of wedged device */
> + drm_dev_wedged_event(&xe->drm,
> + DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
> }
>
> for_each_gt(gt, xe, id)
* Re: [PATCH v9 3/4] drm/xe: Use device wedged event
2024-11-19 4:55 ` Ghimiray, Himal Prasad
@ 2024-11-20 7:26 ` Raag Jadav
0 siblings, 0 replies; 24+ messages in thread
From: Raag Jadav @ 2024-11-20 7:26 UTC (permalink / raw)
To: Ghimiray, Himal Prasad
Cc: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, christian.koenig,
intel-xe, intel-gfx, dri-devel, aravind.iddamsetty,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On Tue, Nov 19, 2024 at 10:25:10AM +0530, Ghimiray, Himal Prasad wrote:
> On 15-11-2024 10:37, Raag Jadav wrote:
> > This was previously attempted as xe specific reset uevent but dropped
> > in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now")
> > as part of refactoring.
> >
> > Now that we have device wedged event provided by DRM core, make use
> > of it and support both driver rebind and bus-reset based recovery.
> > With this in place userspace will be notified of wedged device, on
> > the basis of which, userspace may take respective action to recover
> > the device.
> >
> > $ udevadm monitor --property --kernel
> > monitor will print the received events for:
> > KERNEL - the kernel uevent
> >
> > KERNEL[265.802982] change /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
> > ACTION=change
> > DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
> > SUBSYSTEM=drm
> > WEDGED=rebind,bus-reset
> > DEVNAME=/dev/dri/card0
> > DEVTYPE=drm_minor
> > SEQNUM=5208
> > MAJOR=226
> > MINOR=0
>
>
> The patch in itself looks good. Do we have IGT tests in place to validate
> this uevent?
I unit tested it with the documented example, which seems to work. We can put
an IGT in place once the series has acceptance from the community.
Raag
* [PATCH v9 4/4] drm/i915: Use device wedged event
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
` (2 preceding siblings ...)
2024-11-15 5:07 ` [PATCH v9 3/4] drm/xe: Use " Raag Jadav
@ 2024-11-15 5:07 ` Raag Jadav
2024-11-19 3:43 ` Aravind Iddamsetty
2024-11-15 5:15 ` ✓ CI.Patch_applied: success for Introduce DRM device wedged event (rev7) Patchwork
` (3 subsequent siblings)
7 siblings, 1 reply; 24+ messages in thread
From: Raag Jadav @ 2024-11-15 5:07 UTC (permalink / raw)
To: airlied, simona, lucas.demarchi, rodrigo.vivi, jani.nikula,
andriy.shevchenko, lina, michal.wajdeczko, christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
aravind.iddamsetty, anshuman.gupta, alexander.deucher,
andrealmeid, amd-gfx, kernel-dev, Raag Jadav
Now that the DRM core provides a device wedged event, make use of it
and support both driver rebind and bus-reset based recovery. With this
in place, userspace will be notified of a wedged device on GT reset
failure.
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index f42f21632306..18cf50a1e84d 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1418,6 +1418,9 @@ static void intel_gt_reset_global(struct intel_gt *gt,
if (!test_bit(I915_WEDGED, &gt->reset.flags))
kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
+ else
+ drm_dev_wedged_event(&gt->i915->drm,
+ DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
}
/**
--
2.34.1
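Illustration only, not part of this patch: both driver patches pass a bitmask of recovery methods, and the sample uevent earlier in the thread shows them as `WEDGED=rebind,bus-reset`. The sketch below is a userspace model of how such a mask could be rendered into that string. It is not the DRM core implementation. The `SKETCH_*` bit values and the function name are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bit assignments for this sketch only. */
#define SKETCH_RECOVERY_REBIND    (1ul << 0)
#define SKETCH_RECOVERY_BUS_RESET (1ul << 1)

/*
 * Render a recovery-method bitmask as "WEDGED=a,b" into buf.
 * Returns 0 on success, -1 if the mask is empty, contains unknown
 * bits, or buf is too small to hold the full string.
 */
int format_wedged_event(unsigned long methods, char *buf, size_t size)
{
	static const struct {
		unsigned long bit;
		const char *name;
	} table[] = {
		{ SKETCH_RECOVERY_REBIND, "rebind" },
		{ SKETCH_RECOVERY_BUS_RESET, "bus-reset" },
	};
	const size_t count = sizeof(table) / sizeof(table[0]);
	unsigned long known = 0;
	int first = 1;
	size_t len, i;
	int n;

	for (i = 0; i < count; i++)
		known |= table[i].bit;
	if (methods == 0 || (methods & ~known))
		return -1;

	n = snprintf(buf, size, "WEDGED=");
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	for (i = 0; i < count; i++) {
		if (!(methods & table[i].bit))
			continue;
		/* Separate every method after the first with a comma. */
		n = snprintf(buf + len, size - len, "%s%s",
			     first ? "" : ",", table[i].name);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
		first = 0;
	}
	return 0;
}
```

Rejecting an empty or unknown mask up front keeps the model from emitting a `WEDGED=` property with no usable hint.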
* Re: [PATCH v9 4/4] drm/i915: Use device wedged event
2024-11-15 5:07 ` [PATCH v9 4/4] drm/i915: " Raag Jadav
@ 2024-11-19 3:43 ` Aravind Iddamsetty
0 siblings, 0 replies; 24+ messages in thread
From: Aravind Iddamsetty @ 2024-11-19 3:43 UTC (permalink / raw)
To: Raag Jadav, airlied, simona, lucas.demarchi, rodrigo.vivi,
jani.nikula, andriy.shevchenko, lina, michal.wajdeczko,
christian.koenig
Cc: intel-xe, intel-gfx, dri-devel, himal.prasad.ghimiray,
anshuman.gupta, alexander.deucher, andrealmeid, amd-gfx,
kernel-dev
On 15/11/24 10:37, Raag Jadav wrote:
> Now that we have device wedged event provided by DRM core, make use
> of it and support both driver rebind and bus-reset based recovery.
> With this in place, userspace will be notified of wedged device on
> gt reset failure.
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index f42f21632306..18cf50a1e84d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1418,6 +1418,9 @@ static void intel_gt_reset_global(struct intel_gt *gt,
>
> if (!test_bit(I915_WEDGED, &gt->reset.flags))
> kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
> + else
> + drm_dev_wedged_event(&gt->i915->drm,
> + DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Thanks,
Aravind.
> }
>
> /**
* ✓ CI.Patch_applied: success for Introduce DRM device wedged event (rev7)
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
` (3 preceding siblings ...)
2024-11-15 5:07 ` [PATCH v9 4/4] drm/i915: " Raag Jadav
@ 2024-11-15 5:15 ` Patchwork
2024-11-15 5:15 ` ✗ CI.checkpatch: warning " Patchwork
` (2 subsequent siblings)
7 siblings, 0 replies; 24+ messages in thread
From: Patchwork @ 2024-11-15 5:15 UTC (permalink / raw)
To: Raag Jadav; +Cc: intel-xe
== Series Details ==
Series: Introduce DRM device wedged event (rev7)
URL : https://patchwork.freedesktop.org/series/138070/
State : success
== Summary ==
=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: 36fec0eb8786 drm-tip: 2024y-11m-14d-23h-49m-25s UTC integration manifest
=== git am output follows ===
Applying: drm: Introduce device wedged event
Applying: drm/doc: Document device wedged event
Applying: drm/xe: Use device wedged event
Applying: drm/i915: Use device wedged event
* ✗ CI.checkpatch: warning for Introduce DRM device wedged event (rev7)
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
` (4 preceding siblings ...)
2024-11-15 5:15 ` ✓ CI.Patch_applied: success for Introduce DRM device wedged event (rev7) Patchwork
@ 2024-11-15 5:15 ` Patchwork
2024-11-15 5:16 ` ✓ CI.KUnit: success " Patchwork
2024-11-15 5:25 ` ✗ CI.Build: failure " Patchwork
7 siblings, 0 replies; 24+ messages in thread
From: Patchwork @ 2024-11-15 5:15 UTC (permalink / raw)
To: Raag Jadav; +Cc: intel-xe
== Series Details ==
Series: Introduce DRM device wedged event (rev7)
URL : https://patchwork.freedesktop.org/series/138070/
State : warning
== Summary ==
+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
30ab6715fc09baee6cc14cb3c89ad8858688d474
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit ba8560b1efbd075ce9fbbfb0957ce9a247bff73a
Author: Raag Jadav <raag.jadav@intel.com>
Date: Fri Nov 15 10:37:33 2024 +0530
drm/i915: Use device wedged event
Now that we have device wedged event provided by DRM core, make use
of it and support both driver rebind and bus-reset based recovery.
With this in place, userspace will be notified of wedged device on
gt reset failure.
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
+ /mt/dim checkpatch 36fec0eb87867bca47f8829c9e5dbf5b3e2b3aaf drm-intel
0a84e7bbf738 drm: Introduce device wedged event
-:177: WARNING:STATIC_CONST_CHAR_ARRAY: char * array declaration might be better as static const
#177: FILE: drivers/gpu/drm/drm_drv.c:539:
+ char *envp[] = { event_string, NULL };
total: 0 errors, 1 warnings, 0 checks, 102 lines checked
fa99e27443cf drm/doc: Document device wedged event
424c8105293b drm/xe: Use device wedged event
-:20: WARNING:COMMIT_LOG_LONG_LINE: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
#20:
KERNEL[265.802982] change /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
total: 0 errors, 1 warnings, 0 checks, 24 lines checked
ba8560b1efbd drm/i915: Use device wedged event
* ✓ CI.KUnit: success for Introduce DRM device wedged event (rev7)
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
` (5 preceding siblings ...)
2024-11-15 5:15 ` ✗ CI.checkpatch: warning " Patchwork
@ 2024-11-15 5:16 ` Patchwork
2024-11-15 5:25 ` ✗ CI.Build: failure " Patchwork
7 siblings, 0 replies; 24+ messages in thread
From: Patchwork @ 2024-11-15 5:16 UTC (permalink / raw)
To: Raag Jadav; +Cc: intel-xe
== Series Details ==
Series: Introduce DRM device wedged event (rev7)
URL : https://patchwork.freedesktop.org/series/138070/
State : success
== Summary ==
+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[05:15:21] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[05:15:25] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json ARCH=um O=.kunit --jobs=48
../lib/iomap.c:156:5: warning: no previous prototype for ‘ioread64_lo_hi’ [-Wmissing-prototypes]
156 | u64 ioread64_lo_hi(const void __iomem *addr)
| ^~~~~~~~~~~~~~
../lib/iomap.c:163:5: warning: no previous prototype for ‘ioread64_hi_lo’ [-Wmissing-prototypes]
163 | u64 ioread64_hi_lo(const void __iomem *addr)
| ^~~~~~~~~~~~~~
../lib/iomap.c:170:5: warning: no previous prototype for ‘ioread64be_lo_hi’ [-Wmissing-prototypes]
170 | u64 ioread64be_lo_hi(const void __iomem *addr)
| ^~~~~~~~~~~~~~~~
../lib/iomap.c:178:5: warning: no previous prototype for ‘ioread64be_hi_lo’ [-Wmissing-prototypes]
178 | u64 ioread64be_hi_lo(const void __iomem *addr)
| ^~~~~~~~~~~~~~~~
../lib/iomap.c:264:6: warning: no previous prototype for ‘iowrite64_lo_hi’ [-Wmissing-prototypes]
264 | void iowrite64_lo_hi(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~
../lib/iomap.c:272:6: warning: no previous prototype for ‘iowrite64_hi_lo’ [-Wmissing-prototypes]
272 | void iowrite64_hi_lo(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~
../lib/iomap.c:280:6: warning: no previous prototype for ‘iowrite64be_lo_hi’ [-Wmissing-prototypes]
280 | void iowrite64be_lo_hi(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~~~
../lib/iomap.c:288:6: warning: no previous prototype for ‘iowrite64be_hi_lo’ [-Wmissing-prototypes]
288 | void iowrite64be_hi_lo(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~~~
[05:15:54] Starting KUnit Kernel (1/1)...
[05:15:54] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[05:15:54] =================== guc_dbm (7 subtests) ===================
[05:15:54] [PASSED] test_empty
[05:15:54] [PASSED] test_default
[05:15:54] ======================== test_size ========================
[05:15:54] [PASSED] 4
[05:15:54] [PASSED] 8
[05:15:54] [PASSED] 32
[05:15:54] [PASSED] 256
[05:15:54] ==================== [PASSED] test_size ====================
[05:15:54] ======================= test_reuse ========================
[05:15:54] [PASSED] 4
[05:15:54] [PASSED] 8
[05:15:54] [PASSED] 32
[05:15:54] [PASSED] 256
[05:15:54] =================== [PASSED] test_reuse ====================
[05:15:54] =================== test_range_overlap ====================
[05:15:54] [PASSED] 4
[05:15:54] [PASSED] 8
[05:15:54] [PASSED] 32
[05:15:54] [PASSED] 256
[05:15:54] =============== [PASSED] test_range_overlap ================
[05:15:54] =================== test_range_compact ====================
[05:15:54] [PASSED] 4
[05:15:54] [PASSED] 8
[05:15:54] [PASSED] 32
[05:15:54] [PASSED] 256
[05:15:54] =============== [PASSED] test_range_compact ================
[05:15:54] ==================== test_range_spare =====================
[05:15:54] [PASSED] 4
[05:15:54] [PASSED] 8
[05:15:54] [PASSED] 32
[05:15:54] [PASSED] 256
[05:15:54] ================ [PASSED] test_range_spare =================
[05:15:54] ===================== [PASSED] guc_dbm =====================
[05:15:54] =================== guc_idm (6 subtests) ===================
[05:15:54] [PASSED] bad_init
[05:15:54] [PASSED] no_init
[05:15:54] [PASSED] init_fini
[05:15:54] [PASSED] check_used
[05:15:54] [PASSED] check_quota
[05:15:54] [PASSED] check_all
[05:15:54] ===================== [PASSED] guc_idm =====================
[05:15:54] ================== no_relay (3 subtests) ===================
[05:15:54] [PASSED] xe_drops_guc2pf_if_not_ready
[05:15:54] [PASSED] xe_drops_guc2vf_if_not_ready
[05:15:54] [PASSED] xe_rejects_send_if_not_ready
[05:15:54] ==================== [PASSED] no_relay =====================
[05:15:54] ================== pf_relay (14 subtests) ==================
[05:15:54] [PASSED] pf_rejects_guc2pf_too_short
[05:15:54] [PASSED] pf_rejects_guc2pf_too_long
[05:15:54] [PASSED] pf_rejects_guc2pf_no_payload
[05:15:54] [PASSED] pf_fails_no_payload
[05:15:54] [PASSED] pf_fails_bad_origin
[05:15:54] [PASSED] pf_fails_bad_type
[05:15:54] [PASSED] pf_txn_reports_error
[05:15:54] [PASSED] pf_txn_sends_pf2guc
[05:15:54] [PASSED] pf_sends_pf2guc
[05:15:54] [SKIPPED] pf_loopback_nop
[05:15:54] [SKIPPED] pf_loopback_echo
[05:15:54] [SKIPPED] pf_loopback_fail
[05:15:54] [SKIPPED] pf_loopback_busy
[05:15:54] [SKIPPED] pf_loopback_retry
[05:15:54] ==================== [PASSED] pf_relay =====================
[05:15:54] ================== vf_relay (3 subtests) ===================
[05:15:54] [PASSED] vf_rejects_guc2vf_too_short
[05:15:54] [PASSED] vf_rejects_guc2vf_too_long
[05:15:54] [PASSED] vf_rejects_guc2vf_no_payload
[05:15:54] ==================== [PASSED] vf_relay =====================
[05:15:54] ================= pf_service (11 subtests) =================
[05:15:54] [PASSED] pf_negotiate_any
[05:15:54] [PASSED] pf_negotiate_base_match
[05:15:54] [PASSED] pf_negotiate_base_newer
[05:15:54] [PASSED] pf_negotiate_base_next
[05:15:54] [SKIPPED] pf_negotiate_base_older
[05:15:54] [PASSED] pf_negotiate_base_prev
[05:15:54] [PASSED] pf_negotiate_latest_match
[05:15:54] [PASSED] pf_negotiate_latest_newer
[05:15:54] [PASSED] pf_negotiate_latest_next
[05:15:54] [SKIPPED] pf_negotiate_latest_older
[05:15:54] [SKIPPED] pf_negotiate_latest_prev
[05:15:54] =================== [PASSED] pf_service ====================
[05:15:54] ===================== lmtt (1 subtest) =====================
[05:15:54] ======================== test_ops =========================
[05:15:54] [PASSED] 2-level
[05:15:54] [PASSED] multi-level
[05:15:54] ==================== [PASSED] test_ops =====================
[05:15:54] ====================== [PASSED] lmtt =======================
[05:15:54] =================== xe_mocs (2 subtests) ===================
[05:15:54] ================ xe_live_mocs_kernel_kunit ================
[05:15:54] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[05:15:54] ================ xe_live_mocs_reset_kunit =================
[05:15:54] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[05:15:54] ==================== [SKIPPED] xe_mocs =====================
[05:15:54] ================= xe_migrate (2 subtests) ==================
[05:15:54] ================= xe_migrate_sanity_kunit =================
[05:15:54] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[05:15:54] ================== xe_validate_ccs_kunit ==================
[05:15:54] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[05:15:54] =================== [SKIPPED] xe_migrate ===================
[05:15:54] ================== xe_dma_buf (1 subtest) ==================
[05:15:54] ==================== xe_dma_buf_kunit =====================
[05:15:54] ================ [SKIPPED] xe_dma_buf_kunit ================
[05:15:54] =================== [SKIPPED] xe_dma_buf ===================
[05:15:54] ==================== xe_bo (3 subtests) ====================
[05:15:54] ================== xe_ccs_migrate_kunit ===================
[05:15:54] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[05:15:54] ==================== xe_bo_evict_kunit ====================
[05:15:54] =============== [SKIPPED] xe_bo_evict_kunit ================
[05:15:54] =================== xe_bo_shrink_kunit ====================
[05:15:54] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[05:15:54] ===================== [SKIPPED] xe_bo ======================
[05:15:54] ==================== args (11 subtests) ====================
[05:15:54] [PASSED] count_args_test
[05:15:54] [PASSED] call_args_example
[05:15:54] [PASSED] call_args_test
[05:15:54] [PASSED] drop_first_arg_example
[05:15:54] [PASSED] drop_first_arg_test
[05:15:54] [PASSED] first_arg_example
[05:15:54] [PASSED] first_arg_test
[05:15:54] [PASSED] last_arg_example
[05:15:54] [PASSED] last_arg_test
[05:15:54] [PASSED] pick_arg_example
[05:15:54] [PASSED] sep_comma_example
stty: 'standard input': Inappropriate ioctl for device
[05:15:54] ====================== [PASSED] args =======================
[05:15:54] =================== xe_pci (2 subtests) ====================
[05:15:54] [PASSED] xe_gmdid_graphics_ip
[05:15:54] [PASSED] xe_gmdid_media_ip
[05:15:54] ===================== [PASSED] xe_pci ======================
[05:15:54] =================== xe_rtp (2 subtests) ====================
[05:15:54] =============== xe_rtp_process_to_sr_tests ================
[05:15:54] [PASSED] coalesce-same-reg
[05:15:54] [PASSED] no-match-no-add
[05:15:54] [PASSED] match-or
[05:15:54] [PASSED] match-or-xfail
[05:15:54] [PASSED] no-match-no-add-multiple-rules
[05:15:54] [PASSED] two-regs-two-entries
[05:15:54] [PASSED] clr-one-set-other
[05:15:54] [PASSED] set-field
[05:15:54] [PASSED] conflict-duplicate
[05:15:54] [PASSED] conflict-not-disjoint
[05:15:54] [PASSED] conflict-reg-type
[05:15:54] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[05:15:54] ================== xe_rtp_process_tests ===================
[05:15:54] [PASSED] active1
[05:15:54] [PASSED] active2
[05:15:54] [PASSED] active-inactive
[05:15:54] [PASSED] inactive-active
[05:15:54] [PASSED] inactive-1st_or_active-inactive
[05:15:54] [PASSED] inactive-2nd_or_active-inactive
[05:15:54] [PASSED] inactive-last_or_active-inactive
[05:15:54] [PASSED] inactive-no_or_active-inactive
[05:15:54] ============== [PASSED] xe_rtp_process_tests ===============
[05:15:54] ===================== [PASSED] xe_rtp ======================
[05:15:54] ==================== xe_wa (1 subtest) =====================
[05:15:54] ======================== xe_wa_gt =========================
[05:15:54] [PASSED] TIGERLAKE (B0)
[05:15:54] [PASSED] DG1 (A0)
[05:15:54] [PASSED] DG1 (B0)
[05:15:54] [PASSED] ALDERLAKE_S (A0)
[05:15:54] [PASSED] ALDERLAKE_S (B0)
[05:15:54] [PASSED] ALDERLAKE_S (C0)
[05:15:54] [PASSED] ALDERLAKE_S (D0)
[05:15:54] [PASSED] ALDERLAKE_P (A0)
[05:15:54] [PASSED] ALDERLAKE_P (B0)
[05:15:54] [PASSED] ALDERLAKE_P (C0)
[05:15:54] [PASSED] ALDERLAKE_S_RPLS (D0)
[05:15:54] [PASSED] ALDERLAKE_P_RPLU (E0)
[05:15:54] [PASSED] DG2_G10 (C0)
[05:15:54] [PASSED] DG2_G11 (B1)
[05:15:54] [PASSED] DG2_G12 (A1)
[05:15:54] [PASSED] METEORLAKE (g:A0, m:A0)
[05:15:54] [PASSED] METEORLAKE (g:A0, m:A0)
[05:15:54] [PASSED] METEORLAKE (g:A0, m:A0)
[05:15:54] [PASSED] LUNARLAKE (g:A0, m:A0)
[05:15:54] [PASSED] LUNARLAKE (g:B0, m:A0)
[05:15:54] [PASSED] BATTLEMAGE (g:A0, m:A1)
[05:15:54] ==================== [PASSED] xe_wa_gt =====================
[05:15:54] ====================== [PASSED] xe_wa ======================
[05:15:54] ============================================================
[05:15:54] Testing complete. Ran 122 tests: passed: 106, skipped: 16
[05:15:54] Elapsed time: 32.773s total, 4.421s configuring, 28.086s building, 0.222s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[05:15:54] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[05:15:56] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json ARCH=um O=.kunit --jobs=48
../lib/iomap.c:156:5: warning: no previous prototype for ‘ioread64_lo_hi’ [-Wmissing-prototypes]
156 | u64 ioread64_lo_hi(const void __iomem *addr)
| ^~~~~~~~~~~~~~
../lib/iomap.c:163:5: warning: no previous prototype for ‘ioread64_hi_lo’ [-Wmissing-prototypes]
163 | u64 ioread64_hi_lo(const void __iomem *addr)
| ^~~~~~~~~~~~~~
../lib/iomap.c:170:5: warning: no previous prototype for ‘ioread64be_lo_hi’ [-Wmissing-prototypes]
170 | u64 ioread64be_lo_hi(const void __iomem *addr)
| ^~~~~~~~~~~~~~~~
../lib/iomap.c:178:5: warning: no previous prototype for ‘ioread64be_hi_lo’ [-Wmissing-prototypes]
178 | u64 ioread64be_hi_lo(const void __iomem *addr)
| ^~~~~~~~~~~~~~~~
../lib/iomap.c:264:6: warning: no previous prototype for ‘iowrite64_lo_hi’ [-Wmissing-prototypes]
264 | void iowrite64_lo_hi(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~
../lib/iomap.c:272:6: warning: no previous prototype for ‘iowrite64_hi_lo’ [-Wmissing-prototypes]
272 | void iowrite64_hi_lo(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~
../lib/iomap.c:280:6: warning: no previous prototype for ‘iowrite64be_lo_hi’ [-Wmissing-prototypes]
280 | void iowrite64be_lo_hi(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~~~
../lib/iomap.c:288:6: warning: no previous prototype for ‘iowrite64be_hi_lo’ [-Wmissing-prototypes]
288 | void iowrite64be_hi_lo(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~~~
[05:16:19] Starting KUnit Kernel (1/1)...
[05:16:19] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[05:16:19] ================== drm_buddy (7 subtests) ==================
[05:16:19] [PASSED] drm_test_buddy_alloc_limit
[05:16:19] [PASSED] drm_test_buddy_alloc_optimistic
[05:16:19] [PASSED] drm_test_buddy_alloc_pessimistic
[05:16:19] [PASSED] drm_test_buddy_alloc_pathological
[05:16:19] [PASSED] drm_test_buddy_alloc_contiguous
[05:16:19] [PASSED] drm_test_buddy_alloc_clear
[05:16:19] [PASSED] drm_test_buddy_alloc_range_bias
[05:16:19] ==================== [PASSED] drm_buddy ====================
[05:16:19] ============= drm_cmdline_parser (40 subtests) =============
[05:16:19] [PASSED] drm_test_cmdline_force_d_only
[05:16:19] [PASSED] drm_test_cmdline_force_D_only_dvi
[05:16:19] [PASSED] drm_test_cmdline_force_D_only_hdmi
[05:16:19] [PASSED] drm_test_cmdline_force_D_only_not_digital
[05:16:19] [PASSED] drm_test_cmdline_force_e_only
[05:16:19] [PASSED] drm_test_cmdline_res
[05:16:19] [PASSED] drm_test_cmdline_res_vesa
[05:16:19] [PASSED] drm_test_cmdline_res_vesa_rblank
[05:16:19] [PASSED] drm_test_cmdline_res_rblank
[05:16:19] [PASSED] drm_test_cmdline_res_bpp
[05:16:19] [PASSED] drm_test_cmdline_res_refresh
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[05:16:19] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[05:16:19] [PASSED] drm_test_cmdline_res_margins_force_on
[05:16:19] [PASSED] drm_test_cmdline_res_vesa_margins
[05:16:19] [PASSED] drm_test_cmdline_name
[05:16:19] [PASSED] drm_test_cmdline_name_bpp
[05:16:19] [PASSED] drm_test_cmdline_name_option
[05:16:19] [PASSED] drm_test_cmdline_name_bpp_option
[05:16:19] [PASSED] drm_test_cmdline_rotate_0
[05:16:19] [PASSED] drm_test_cmdline_rotate_90
[05:16:19] [PASSED] drm_test_cmdline_rotate_180
[05:16:19] [PASSED] drm_test_cmdline_rotate_270
[05:16:19] [PASSED] drm_test_cmdline_hmirror
[05:16:19] [PASSED] drm_test_cmdline_vmirror
[05:16:19] [PASSED] drm_test_cmdline_margin_options
[05:16:19] [PASSED] drm_test_cmdline_multiple_options
[05:16:19] [PASSED] drm_test_cmdline_bpp_extra_and_option
[05:16:19] [PASSED] drm_test_cmdline_extra_and_option
[05:16:19] [PASSED] drm_test_cmdline_freestanding_options
[05:16:19] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[05:16:19] [PASSED] drm_test_cmdline_panel_orientation
[05:16:19] ================ drm_test_cmdline_invalid =================
[05:16:19] [PASSED] margin_only
[05:16:19] [PASSED] interlace_only
[05:16:19] [PASSED] res_missing_x
[05:16:19] [PASSED] res_missing_y
[05:16:19] [PASSED] res_bad_y
[05:16:19] [PASSED] res_missing_y_bpp
[05:16:19] [PASSED] res_bad_bpp
[05:16:19] [PASSED] res_bad_refresh
[05:16:19] [PASSED] res_bpp_refresh_force_on_off
[05:16:19] [PASSED] res_invalid_mode
[05:16:19] [PASSED] res_bpp_wrong_place_mode
[05:16:19] [PASSED] name_bpp_refresh
[05:16:19] [PASSED] name_refresh
[05:16:19] [PASSED] name_refresh_wrong_mode
[05:16:19] [PASSED] name_refresh_invalid_mode
[05:16:19] [PASSED] rotate_multiple
[05:16:19] [PASSED] rotate_invalid_val
[05:16:19] [PASSED] rotate_truncated
[05:16:19] [PASSED] invalid_option
[05:16:19] [PASSED] invalid_tv_option
[05:16:19] [PASSED] truncated_tv_option
[05:16:19] ============ [PASSED] drm_test_cmdline_invalid =============
[05:16:19] =============== drm_test_cmdline_tv_options ===============
[05:16:19] [PASSED] NTSC
[05:16:19] [PASSED] NTSC_443
[05:16:19] [PASSED] NTSC_J
[05:16:19] [PASSED] PAL
[05:16:19] [PASSED] PAL_M
[05:16:19] [PASSED] PAL_N
[05:16:19] [PASSED] SECAM
[05:16:19] [PASSED] MONO_525
[05:16:19] [PASSED] MONO_625
[05:16:19] =========== [PASSED] drm_test_cmdline_tv_options ===========
[05:16:19] =============== [PASSED] drm_cmdline_parser ================
[05:16:19] ========== drmm_connector_hdmi_init (19 subtests) ==========
[05:16:19] [PASSED] drm_test_connector_hdmi_init_valid
[05:16:19] [PASSED] drm_test_connector_hdmi_init_bpc_8
[05:16:19] [PASSED] drm_test_connector_hdmi_init_bpc_10
[05:16:19] [PASSED] drm_test_connector_hdmi_init_bpc_12
[05:16:19] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[05:16:19] [PASSED] drm_test_connector_hdmi_init_bpc_null
[05:16:19] [PASSED] drm_test_connector_hdmi_init_formats_empty
[05:16:19] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[05:16:19] [PASSED] drm_test_connector_hdmi_init_null_ddc
[05:16:19] [PASSED] drm_test_connector_hdmi_init_null_product
[05:16:19] [PASSED] drm_test_connector_hdmi_init_null_vendor
[05:16:19] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[05:16:19] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[05:16:19] [PASSED] drm_test_connector_hdmi_init_product_valid
[05:16:19] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[05:16:19] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[05:16:19] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[05:16:19] ========= drm_test_connector_hdmi_init_type_valid =========
[05:16:19] [PASSED] HDMI-A
[05:16:19] [PASSED] HDMI-B
[05:16:19] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[05:16:19] ======== drm_test_connector_hdmi_init_type_invalid ========
[05:16:19] [PASSED] Unknown
[05:16:19] [PASSED] VGA
[05:16:19] [PASSED] DVI-I
[05:16:19] [PASSED] DVI-D
[05:16:19] [PASSED] DVI-A
[05:16:19] [PASSED] Composite
[05:16:19] [PASSED] SVIDEO
[05:16:19] [PASSED] LVDS
[05:16:19] [PASSED] Component
[05:16:19] [PASSED] DIN
[05:16:19] [PASSED] DP
[05:16:19] [PASSED] TV
[05:16:19] [PASSED] eDP
[05:16:19] [PASSED] Virtual
[05:16:19] [PASSED] DSI
[05:16:19] [PASSED] DPI
[05:16:19] [PASSED] Writeback
[05:16:19] [PASSED] SPI
[05:16:19] [PASSED] USB
[05:16:19] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[05:16:19] ============ [PASSED] drmm_connector_hdmi_init =============
[05:16:19] ============= drmm_connector_init (3 subtests) =============
[05:16:19] [PASSED] drm_test_drmm_connector_init
[05:16:19] [PASSED] drm_test_drmm_connector_init_null_ddc
[05:16:19] ========= drm_test_drmm_connector_init_type_valid =========
[05:16:19] [PASSED] Unknown
[05:16:19] [PASSED] VGA
[05:16:19] [PASSED] DVI-I
[05:16:19] [PASSED] DVI-D
[05:16:19] [PASSED] DVI-A
[05:16:19] [PASSED] Composite
[05:16:19] [PASSED] SVIDEO
[05:16:19] [PASSED] LVDS
[05:16:19] [PASSED] Component
[05:16:19] [PASSED] DIN
[05:16:19] [PASSED] DP
[05:16:19] [PASSED] HDMI-A
[05:16:19] [PASSED] HDMI-B
[05:16:19] [PASSED] TV
[05:16:19] [PASSED] eDP
[05:16:19] [PASSED] Virtual
[05:16:19] [PASSED] DSI
[05:16:19] [PASSED] DPI
[05:16:19] [PASSED] Writeback
[05:16:19] [PASSED] SPI
[05:16:19] [PASSED] USB
[05:16:19] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[05:16:19] =============== [PASSED] drmm_connector_init ===============
[05:16:19] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[05:16:19] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[05:16:19] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[05:16:19] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[05:16:19] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[05:16:19] ========== drm_test_get_tv_mode_from_name_valid ===========
[05:16:19] [PASSED] NTSC
[05:16:19] [PASSED] NTSC-443
[05:16:19] [PASSED] NTSC-J
[05:16:19] [PASSED] PAL
[05:16:19] [PASSED] PAL-M
[05:16:19] [PASSED] PAL-N
[05:16:19] [PASSED] SECAM
[05:16:19] [PASSED] Mono
[05:16:19] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[05:16:19] [PASSED] drm_test_get_tv_mode_from_name_truncated
[05:16:19] ============ [PASSED] drm_get_tv_mode_from_name ============
[05:16:19] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[05:16:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[05:16:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[05:16:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[05:16:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[05:16:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[05:16:19] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[05:16:19] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid =
[05:16:19] [PASSED] VIC 96
[05:16:19] [PASSED] VIC 97
[05:16:19] [PASSED] VIC 101
[05:16:19] [PASSED] VIC 102
[05:16:19] [PASSED] VIC 106
[05:16:19] [PASSED] VIC 107
[05:16:19] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[05:16:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[05:16:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[05:16:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[05:16:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[05:16:19] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[05:16:19] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[05:16:19] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[05:16:19] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name ====
[05:16:19] [PASSED] Automatic
[05:16:19] [PASSED] Full
[05:16:19] [PASSED] Limited 16:235
[05:16:19] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[05:16:19] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[05:16:19] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[05:16:19] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[05:16:19] === drm_test_drm_hdmi_connector_get_output_format_name ====
[05:16:19] [PASSED] RGB
[05:16:19] [PASSED] YUV 4:2:0
[05:16:19] [PASSED] YUV 4:2:2
[05:16:19] [PASSED] YUV 4:4:4
[05:16:19] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[05:16:19] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[05:16:19] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[05:16:19] ============= drm_damage_helper (21 subtests) ==============
[05:16:19] [PASSED] drm_test_damage_iter_no_damage
[05:16:19] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[05:16:19] [PASSED] drm_test_damage_iter_no_damage_src_moved
[05:16:19] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[05:16:19] [PASSED] drm_test_damage_iter_no_damage_not_visible
[05:16:19] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[05:16:19] [PASSED] drm_test_damage_iter_no_damage_no_fb
[05:16:19] [PASSED] drm_test_damage_iter_simple_damage
[05:16:19] [PASSED] drm_test_damage_iter_single_damage
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_outside_src
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_src_moved
[05:16:19] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[05:16:19] [PASSED] drm_test_damage_iter_damage
[05:16:19] [PASSED] drm_test_damage_iter_damage_one_intersect
[05:16:19] [PASSED] drm_test_damage_iter_damage_one_outside
[05:16:19] [PASSED] drm_test_damage_iter_damage_src_moved
[05:16:19] [PASSED] drm_test_damage_iter_damage_not_visible
[05:16:19] ================ [PASSED] drm_damage_helper ================
[05:16:19] ============== drm_dp_mst_helper (3 subtests) ==============
[05:16:19] ============== drm_test_dp_mst_calc_pbn_mode ==============
[05:16:19] [PASSED] Clock 154000 BPP 30 DSC disabled
[05:16:19] [PASSED] Clock 234000 BPP 30 DSC disabled
[05:16:19] [PASSED] Clock 297000 BPP 24 DSC disabled
[05:16:19] [PASSED] Clock 332880 BPP 24 DSC enabled
[05:16:19] [PASSED] Clock 324540 BPP 24 DSC enabled
[05:16:19] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[05:16:19] ============== drm_test_dp_mst_calc_pbn_div ===============
[05:16:19] [PASSED] Link rate 2000000 lane count 4
[05:16:19] [PASSED] Link rate 2000000 lane count 2
[05:16:19] [PASSED] Link rate 2000000 lane count 1
[05:16:19] [PASSED] Link rate 1350000 lane count 4
[05:16:19] [PASSED] Link rate 1350000 lane count 2
[05:16:19] [PASSED] Link rate 1350000 lane count 1
[05:16:19] [PASSED] Link rate 1000000 lane count 4
[05:16:19] [PASSED] Link rate 1000000 lane count 2
[05:16:19] [PASSED] Link rate 1000000 lane count 1
[05:16:19] [PASSED] Link rate 810000 lane count 4
[05:16:19] [PASSED] Link rate 810000 lane count 2
[05:16:19] [PASSED] Link rate 810000 lane count 1
[05:16:19] [PASSED] Link rate 540000 lane count 4
[05:16:19] [PASSED] Link rate 540000 lane count 2
[05:16:19] [PASSED] Link rate 540000 lane count 1
[05:16:19] [PASSED] Link rate 270000 lane count 4
[05:16:19] [PASSED] Link rate 270000 lane count 2
[05:16:19] [PASSED] Link rate 270000 lane count 1
[05:16:19] [PASSED] Link rate 162000 lane count 4
[05:16:19] [PASSED] Link rate 162000 lane count 2
[05:16:19] [PASSED] Link rate 162000 lane count 1
[05:16:19] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[05:16:19] ========= drm_test_dp_mst_sideband_msg_req_decode =========
[05:16:19] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[05:16:19] [PASSED] DP_POWER_UP_PHY with port number
[05:16:19] [PASSED] DP_POWER_DOWN_PHY with port number
[05:16:19] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[05:16:19] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[05:16:19] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[05:16:19] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[05:16:19] [PASSED] DP_QUERY_PAYLOAD with port number
[05:16:19] [PASSED] DP_QUERY_PAYLOAD with VCPI
[05:16:19] [PASSED] DP_REMOTE_DPCD_READ with port number
[05:16:19] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[05:16:19] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[05:16:19] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[05:16:19] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[05:16:19] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[05:16:19] [PASSED] DP_REMOTE_I2C_READ with port number
[05:16:19] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[05:16:19] [PASSED] DP_REMOTE_I2C_READ with transactions array
[05:16:19] [PASSED] DP_REMOTE_I2C_WRITE with port number
[05:16:19] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[05:16:19] [PASSED] DP_REMOTE_I2C_WRITE with data array
[05:16:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[05:16:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[05:16:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[05:16:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[05:16:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[05:16:19] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[05:16:19] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[05:16:19] ================ [PASSED] drm_dp_mst_helper ================
[05:16:19] ================== drm_exec (7 subtests) ===================
[05:16:19] [PASSED] sanitycheck
[05:16:19] [PASSED] test_lock
[05:16:19] [PASSED] test_lock_unlock
[05:16:19] [PASSED] test_duplicates
[05:16:19] [PASSED] test_prepare
[05:16:19] [PASSED] test_prepare_array
[05:16:19] [PASSED] test_multiple_loops
[05:16:19] ==================== [PASSED] drm_exec =====================
[05:16:19] =========== drm_format_helper_test (17 subtests) ===========
[05:16:19] ============== drm_test_fb_xrgb8888_to_gray8 ==============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[05:16:19] ============= drm_test_fb_xrgb8888_to_rgb332 ==============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[05:16:19] ============= drm_test_fb_xrgb8888_to_rgb565 ==============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[05:16:19] ============ drm_test_fb_xrgb8888_to_xrgb1555 =============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[05:16:19] ============ drm_test_fb_xrgb8888_to_argb1555 =============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[05:16:19] ============ drm_test_fb_xrgb8888_to_rgba5551 =============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[05:16:19] ============= drm_test_fb_xrgb8888_to_rgb888 ==============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[05:16:19] ============ drm_test_fb_xrgb8888_to_argb8888 =============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[05:16:19] =========== drm_test_fb_xrgb8888_to_xrgb2101010 ===========
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[05:16:19] =========== drm_test_fb_xrgb8888_to_argb2101010 ===========
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[05:16:19] ============== drm_test_fb_xrgb8888_to_mono ===============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[05:16:19] ==================== drm_test_fb_swab =====================
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ================ [PASSED] drm_test_fb_swab =================
[05:16:19] ============ drm_test_fb_xrgb8888_to_xbgr8888 =============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[05:16:19] ============ drm_test_fb_xrgb8888_to_abgr8888 =============
[05:16:19] [PASSED] single_pixel_source_buffer
[05:16:19] [PASSED] single_pixel_clip_rectangle
[05:16:19] [PASSED] well_known_colors
[05:16:19] [PASSED] destination_pitch
[05:16:19] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[05:16:19] ================= drm_test_fb_clip_offset =================
[05:16:19] [PASSED] pass through
[05:16:19] [PASSED] horizontal offset
[05:16:19] [PASSED] vertical offset
[05:16:19] [PASSED] horizontal and vertical offset
[05:16:19] [PASSED] horizontal offset (custom pitch)
[05:16:19] [PASSED] vertical offset (custom pitch)
[05:16:19] [PASSED] horizontal and vertical offset (custom pitch)
[05:16:19] ============= [PASSED] drm_test_fb_clip_offset =============
[05:16:19] ============== drm_test_fb_build_fourcc_list ==============
[05:16:19] [PASSED] no native formats
[05:16:19] [PASSED] XRGB8888 as native format
[05:16:19] [PASSED] remove duplicates
[05:16:19] [PASSED] convert alpha formats
[05:16:19] [PASSED] random formats
[05:16:19] ========== [PASSED] drm_test_fb_build_fourcc_list ==========
[05:16:19] =================== drm_test_fb_memcpy ====================
[05:16:19] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[05:16:19] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[05:16:19] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[05:16:19] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[05:16:19] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[05:16:19] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[05:16:19] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[05:16:19] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[05:16:19] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[05:16:19] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[05:16:19] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[05:16:19] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[05:16:19] =============== [PASSED] drm_test_fb_memcpy ================
[05:16:19] ============= [PASSED] drm_format_helper_test ==============
[05:16:19] ================= drm_format (18 subtests) =================
[05:16:19] [PASSED] drm_test_format_block_width_invalid
[05:16:19] [PASSED] drm_test_format_block_width_one_plane
[05:16:19] [PASSED] drm_test_format_block_width_two_plane
[05:16:19] [PASSED] drm_test_format_block_width_three_plane
[05:16:19] [PASSED] drm_test_format_block_width_tiled
[05:16:19] [PASSED] drm_test_format_block_height_invalid
[05:16:19] [PASSED] drm_test_format_block_height_one_plane
[05:16:19] [PASSED] drm_test_format_block_height_two_plane
[05:16:19] [PASSED] drm_test_format_block_height_three_plane
[05:16:19] [PASSED] drm_test_format_block_height_tiled
[05:16:19] [PASSED] drm_test_format_min_pitch_invalid
[05:16:19] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[05:16:19] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[05:16:19] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[05:16:19] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[05:16:19] [PASSED] drm_test_format_min_pitch_two_plane
[05:16:19] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[05:16:19] [PASSED] drm_test_format_min_pitch_tiled
[05:16:19] =================== [PASSED] drm_format ====================
[05:16:19] ============== drm_framebuffer (10 subtests) ===============
[05:16:19] ========== drm_test_framebuffer_check_src_coords ==========
[05:16:19] [PASSED] Success: source fits into fb
[05:16:19] [PASSED] Fail: overflowing fb with x-axis coordinate
[05:16:19] [PASSED] Fail: overflowing fb with y-axis coordinate
[05:16:19] [PASSED] Fail: overflowing fb with source width
[05:16:19] [PASSED] Fail: overflowing fb with source height
[05:16:19] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[05:16:19] [PASSED] drm_test_framebuffer_cleanup
[05:16:19] =============== drm_test_framebuffer_create ===============
[05:16:19] [PASSED] ABGR8888 normal sizes
[05:16:19] [PASSED] ABGR8888 max sizes
[05:16:19] [PASSED] ABGR8888 pitch greater than min required
[05:16:19] [PASSED] ABGR8888 pitch less than min required
[05:16:19] [PASSED] ABGR8888 Invalid width
[05:16:19] [PASSED] ABGR8888 Invalid buffer handle
[05:16:19] [PASSED] No pixel format
[05:16:19] [PASSED] ABGR8888 Width 0
[05:16:19] [PASSED] ABGR8888 Height 0
[05:16:19] [PASSED] ABGR8888 Out of bound height * pitch combination
[05:16:19] [PASSED] ABGR8888 Large buffer offset
[05:16:19] [PASSED] ABGR8888 Buffer offset for inexistent plane
[05:16:19] [PASSED] ABGR8888 Invalid flag
[05:16:19] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[05:16:19] [PASSED] ABGR8888 Valid buffer modifier
[05:16:19] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[05:16:19] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] NV12 Normal sizes
[05:16:19] [PASSED] NV12 Max sizes
[05:16:19] [PASSED] NV12 Invalid pitch
[05:16:19] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[05:16:19] [PASSED] NV12 different modifier per-plane
[05:16:19] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[05:16:19] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] NV12 Modifier for inexistent plane
[05:16:19] [PASSED] NV12 Handle for inexistent plane
[05:16:19] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[05:16:19] [PASSED] YVU420 Normal sizes
[05:16:19] [PASSED] YVU420 Max sizes
[05:16:19] [PASSED] YVU420 Invalid pitch
[05:16:19] [PASSED] YVU420 Different pitches
[05:16:19] [PASSED] YVU420 Different buffer offsets/pitches
[05:16:19] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[05:16:19] [PASSED] YVU420 Valid modifier
[05:16:19] [PASSED] YVU420 Different modifiers per plane
[05:16:19] [PASSED] YVU420 Modifier for inexistent plane
[05:16:19] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[05:16:19] [PASSED] X0L2 Normal sizes
[05:16:19] [PASSED] X0L2 Max sizes
[05:16:19] [PASSED] X0L2 Invalid pitch
[05:16:19] [PASSED] X0L2 Pitch greater than minimum required
[05:16:19] [PASSED] X0L2 Handle for inexistent plane
[05:16:19] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[05:16:19] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[05:16:19] [PASSED] X0L2 Valid modifier
[05:16:19] [PASSED] X0L2 Modifier for inexistent plane
[05:16:19] =========== [PASSED] drm_test_framebuffer_create ===========
[05:16:19] [PASSED] drm_test_framebuffer_free
[05:16:19] [PASSED] drm_test_framebuffer_init
[05:16:19] [PASSED] drm_test_framebuffer_init_bad_format
[05:16:19] [PASSED] drm_test_framebuffer_init_dev_mismatch
[05:16:19] [PASSED] drm_test_framebuffer_lookup
[05:16:19] [PASSED] drm_test_framebuffer_lookup_inexistent
[05:16:19] [PASSED] drm_test_framebuffer_modifiers_not_supported
[05:16:19] ================= [PASSED] drm_framebuffer =================
[05:16:19] ================ drm_gem_shmem (8 subtests) ================
[05:16:19] [PASSED] drm_gem_shmem_test_obj_create
[05:16:19] [PASSED] drm_gem_shmem_test_obj_create_private
[05:16:19] [PASSED] drm_gem_shmem_test_pin_pages
[05:16:19] [PASSED] drm_gem_shmem_test_vmap
[05:16:19] [PASSED] drm_gem_shmem_test_get_pages_sgt
[05:16:19] [PASSED] drm_gem_shmem_test_get_sg_table
[05:16:19] [PASSED] drm_gem_shmem_test_madvise
[05:16:19] [PASSED] drm_gem_shmem_test_purge
[05:16:19] ================== [PASSED] drm_gem_shmem ==================
[05:16:19] === drm_atomic_helper_connector_hdmi_check (22 subtests) ===
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[05:16:19] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[05:16:19] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback
[05:16:19] [PASSED] drm_test_check_max_tmds_rate_format_fallback
[05:16:19] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[05:16:19] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[05:16:19] [PASSED] drm_test_check_output_bpc_dvi
[05:16:19] [PASSED] drm_test_check_output_bpc_format_vic_1
[05:16:19] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[05:16:19] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[05:16:19] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[05:16:19] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[05:16:19] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[05:16:19] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[05:16:19] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[05:16:19] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[05:16:19] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[05:16:19] [PASSED] drm_test_check_broadcast_rgb_value
[05:16:19] [PASSED] drm_test_check_bpc_8_value
[05:16:19] [PASSED] drm_test_check_bpc_10_value
[05:16:19] [PASSED] drm_test_check_bpc_12_value
[05:16:19] [PASSED] drm_test_check_format_value
[05:16:19] [PASSED] drm_test_check_tmds_char_value
[05:16:19] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[05:16:19] ================= drm_managed (2 subtests) =================
[05:16:19] [PASSED] drm_test_managed_release_action
[05:16:19] [PASSED] drm_test_managed_run_action
[05:16:19] =================== [PASSED] drm_managed ===================
[05:16:19] =================== drm_mm (6 subtests) ====================
[05:16:19] [PASSED] drm_test_mm_init
[05:16:19] [PASSED] drm_test_mm_debug
[05:16:19] [PASSED] drm_test_mm_align32
[05:16:19] [PASSED] drm_test_mm_align64
[05:16:19] [PASSED] drm_test_mm_lowest
[05:16:19] [PASSED] drm_test_mm_highest
[05:16:19] ===================== [PASSED] drm_mm ======================
[05:16:19] ============= drm_modes_analog_tv (5 subtests) =============
[05:16:19] [PASSED] drm_test_modes_analog_tv_mono_576i
[05:16:19] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[05:16:19] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[05:16:19] [PASSED] drm_test_modes_analog_tv_pal_576i
[05:16:19] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[05:16:19] =============== [PASSED] drm_modes_analog_tv ===============
stty: 'standard input': Inappropriate ioctl for device
[05:16:19] ============== drm_plane_helper (2 subtests) ===============
[05:16:19] =============== drm_test_check_plane_state ================
[05:16:19] [PASSED] clipping_simple
[05:16:19] [PASSED] clipping_rotate_reflect
[05:16:19] [PASSED] positioning_simple
[05:16:19] [PASSED] upscaling
[05:16:19] [PASSED] downscaling
[05:16:19] [PASSED] rounding1
[05:16:19] [PASSED] rounding2
[05:16:19] [PASSED] rounding3
[05:16:19] [PASSED] rounding4
[05:16:19] =========== [PASSED] drm_test_check_plane_state ============
[05:16:19] =========== drm_test_check_invalid_plane_state ============
[05:16:19] [PASSED] positioning_invalid
[05:16:19] [PASSED] upscaling_invalid
[05:16:19] [PASSED] downscaling_invalid
[05:16:19] ======= [PASSED] drm_test_check_invalid_plane_state ========
[05:16:19] ================ [PASSED] drm_plane_helper =================
[05:16:19] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[05:16:19] ====== drm_test_connector_helper_tv_get_modes_check =======
[05:16:19] [PASSED] None
[05:16:19] [PASSED] PAL
[05:16:19] [PASSED] NTSC
[05:16:19] [PASSED] Both, NTSC Default
[05:16:19] [PASSED] Both, PAL Default
[05:16:19] [PASSED] Both, NTSC Default, with PAL on command-line
[05:16:19] [PASSED] Both, PAL Default, with NTSC on command-line
[05:16:19] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[05:16:19] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[05:16:19] ================== drm_rect (9 subtests) ===================
[05:16:19] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[05:16:19] [PASSED] drm_test_rect_clip_scaled_not_clipped
[05:16:19] [PASSED] drm_test_rect_clip_scaled_clipped
[05:16:19] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[05:16:19] ================= drm_test_rect_intersect =================
[05:16:19] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[05:16:19] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[05:16:19] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[05:16:19] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[05:16:19] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[05:16:19] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[05:16:19] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[05:16:19] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[05:16:19] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[05:16:19] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[05:16:19] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[05:16:19] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[05:16:19] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[05:16:19] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[05:16:19] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[05:16:19] ============= [PASSED] drm_test_rect_intersect =============
[05:16:19] ================ drm_test_rect_calc_hscale ================
[05:16:19] [PASSED] normal use
[05:16:19] [PASSED] out of max range
[05:16:19] [PASSED] out of min range
[05:16:19] [PASSED] zero dst
[05:16:19] [PASSED] negative src
[05:16:19] [PASSED] negative dst
[05:16:19] ============ [PASSED] drm_test_rect_calc_hscale ============
[05:16:19] ================ drm_test_rect_calc_vscale ================
[05:16:19] [PASSED] normal use
[05:16:19] [PASSED] out of max range
[05:16:19] [PASSED] out of min range
[05:16:19] [PASSED] zero dst
[05:16:19] [PASSED] negative src
[05:16:19] [PASSED] negative dst
[05:16:19] ============ [PASSED] drm_test_rect_calc_vscale ============
[05:16:19] ================== drm_test_rect_rotate ===================
[05:16:19] [PASSED] reflect-x
[05:16:19] [PASSED] reflect-y
[05:16:19] [PASSED] rotate-0
[05:16:19] [PASSED] rotate-90
[05:16:19] [PASSED] rotate-180
[05:16:19] [PASSED] rotate-270
[05:16:19] ============== [PASSED] drm_test_rect_rotate ===============
[05:16:19] ================ drm_test_rect_rotate_inv =================
[05:16:19] [PASSED] reflect-x
[05:16:19] [PASSED] reflect-y
[05:16:19] [PASSED] rotate-0
[05:16:19] [PASSED] rotate-90
[05:16:19] [PASSED] rotate-180
[05:16:19] [PASSED] rotate-270
[05:16:19] ============ [PASSED] drm_test_rect_rotate_inv =============
[05:16:19] ==================== [PASSED] drm_rect =====================
[05:16:19] ============================================================
[05:16:19] Testing complete. Ran 526 tests: passed: 526
[05:16:19] Elapsed time: 24.808s total, 1.645s configuring, 22.995s building, 0.166s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[05:16:19] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[05:16:21] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json ARCH=um O=.kunit --jobs=48
[05:16:28] Starting KUnit Kernel (1/1)...
[05:16:28] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[05:16:28] ================= ttm_device (5 subtests) ==================
[05:16:28] [PASSED] ttm_device_init_basic
[05:16:28] [PASSED] ttm_device_init_multiple
[05:16:28] [PASSED] ttm_device_fini_basic
[05:16:28] [PASSED] ttm_device_init_no_vma_man
[05:16:28] ================== ttm_device_init_pools ==================
[05:16:28] [PASSED] No DMA allocations, no DMA32 required
[05:16:28] [PASSED] DMA allocations, DMA32 required
[05:16:28] [PASSED] No DMA allocations, DMA32 required
[05:16:28] [PASSED] DMA allocations, no DMA32 required
[05:16:28] ============== [PASSED] ttm_device_init_pools ==============
[05:16:28] =================== [PASSED] ttm_device ====================
[05:16:28] ================== ttm_pool (8 subtests) ===================
[05:16:28] ================== ttm_pool_alloc_basic ===================
[05:16:28] [PASSED] One page
[05:16:28] [PASSED] More than one page
[05:16:28] [PASSED] Above the allocation limit
[05:16:28] [PASSED] One page, with coherent DMA mappings enabled
[05:16:28] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[05:16:28] ============== [PASSED] ttm_pool_alloc_basic ===============
[05:16:28] ============== ttm_pool_alloc_basic_dma_addr ==============
[05:16:28] [PASSED] One page
[05:16:28] [PASSED] More than one page
[05:16:28] [PASSED] Above the allocation limit
[05:16:28] [PASSED] One page, with coherent DMA mappings enabled
[05:16:28] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[05:16:28] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[05:16:28] [PASSED] ttm_pool_alloc_order_caching_match
[05:16:28] [PASSED] ttm_pool_alloc_caching_mismatch
[05:16:28] [PASSED] ttm_pool_alloc_order_mismatch
[05:16:28] [PASSED] ttm_pool_free_dma_alloc
[05:16:28] [PASSED] ttm_pool_free_no_dma_alloc
[05:16:28] [PASSED] ttm_pool_fini_basic
[05:16:28] ==================== [PASSED] ttm_pool =====================
[05:16:28] ================ ttm_resource (8 subtests) =================
[05:16:28] ================= ttm_resource_init_basic =================
[05:16:28] [PASSED] Init resource in TTM_PL_SYSTEM
[05:16:28] [PASSED] Init resource in TTM_PL_VRAM
[05:16:28] [PASSED] Init resource in a private placement
[05:16:28] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[05:16:28] ============= [PASSED] ttm_resource_init_basic =============
[05:16:28] [PASSED] ttm_resource_init_pinned
[05:16:28] [PASSED] ttm_resource_fini_basic
[05:16:28] [PASSED] ttm_resource_manager_init_basic
[05:16:28] [PASSED] ttm_resource_manager_usage_basic
[05:16:28] [PASSED] ttm_resource_manager_set_used_basic
[05:16:28] [PASSED] ttm_sys_man_alloc_basic
[05:16:28] [PASSED] ttm_sys_man_free_basic
[05:16:28] ================== [PASSED] ttm_resource ===================
[05:16:28] =================== ttm_tt (15 subtests) ===================
[05:16:28] ==================== ttm_tt_init_basic ====================
[05:16:28] [PASSED] Page-aligned size
[05:16:28] [PASSED] Extra pages requested
[05:16:28] ================ [PASSED] ttm_tt_init_basic ================
[05:16:28] [PASSED] ttm_tt_init_misaligned
[05:16:28] [PASSED] ttm_tt_fini_basic
[05:16:28] [PASSED] ttm_tt_fini_sg
[05:16:28] [PASSED] ttm_tt_fini_shmem
[05:16:28] [PASSED] ttm_tt_create_basic
[05:16:28] [PASSED] ttm_tt_create_invalid_bo_type
[05:16:28] [PASSED] ttm_tt_create_ttm_exists
[05:16:28] [PASSED] ttm_tt_create_failed
[05:16:28] [PASSED] ttm_tt_destroy_basic
[05:16:28] [PASSED] ttm_tt_populate_null_ttm
[05:16:28] [PASSED] ttm_tt_populate_populated_ttm
[05:16:28] [PASSED] ttm_tt_unpopulate_basic
[05:16:28] [PASSED] ttm_tt_unpopulate_empty_ttm
[05:16:28] [PASSED] ttm_tt_swapin_basic
[05:16:28] ===================== [PASSED] ttm_tt ======================
[05:16:28] =================== ttm_bo (14 subtests) ===================
[05:16:28] =========== ttm_bo_reserve_optimistic_no_ticket ===========
[05:16:28] [PASSED] Cannot be interrupted and sleeps
[05:16:28] [PASSED] Cannot be interrupted, locks straight away
[05:16:28] [PASSED] Can be interrupted, sleeps
[05:16:28] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[05:16:28] [PASSED] ttm_bo_reserve_locked_no_sleep
[05:16:28] [PASSED] ttm_bo_reserve_no_wait_ticket
[05:16:28] [PASSED] ttm_bo_reserve_double_resv
[05:16:28] [PASSED] ttm_bo_reserve_interrupted
[05:16:28] [PASSED] ttm_bo_reserve_deadlock
[05:16:28] [PASSED] ttm_bo_unreserve_basic
[05:16:28] [PASSED] ttm_bo_unreserve_pinned
[05:16:28] [PASSED] ttm_bo_unreserve_bulk
[05:16:28] [PASSED] ttm_bo_put_basic
[05:16:28] [PASSED] ttm_bo_put_shared_resv
[05:16:28] [PASSED] ttm_bo_pin_basic
[05:16:28] [PASSED] ttm_bo_pin_unpin_resource
[05:16:28] [PASSED] ttm_bo_multiple_pin_one_unpin
[05:16:28] ===================== [PASSED] ttm_bo ======================
[05:16:28] ============== ttm_bo_validate (22 subtests) ===============
[05:16:28] ============== ttm_bo_init_reserved_sys_man ===============
[05:16:28] [PASSED] Buffer object for userspace
[05:16:28] [PASSED] Kernel buffer object
[05:16:28] [PASSED] Shared buffer object
[05:16:28] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[05:16:28] ============== ttm_bo_init_reserved_mock_man ==============
[05:16:28] [PASSED] Buffer object for userspace
[05:16:28] [PASSED] Kernel buffer object
[05:16:28] [PASSED] Shared buffer object
[05:16:28] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[05:16:28] [PASSED] ttm_bo_init_reserved_resv
[05:16:28] ================== ttm_bo_validate_basic ==================
[05:16:28] [PASSED] Buffer object for userspace
[05:16:28] [PASSED] Kernel buffer object
[05:16:28] [PASSED] Shared buffer object
[05:16:28] ============== [PASSED] ttm_bo_validate_basic ==============
[05:16:28] [PASSED] ttm_bo_validate_invalid_placement
[05:16:28] ============= ttm_bo_validate_same_placement ==============
[05:16:28] [PASSED] System manager
[05:16:28] [PASSED] VRAM manager
[05:16:28] ========= [PASSED] ttm_bo_validate_same_placement ==========
[05:16:28] [PASSED] ttm_bo_validate_failed_alloc
[05:16:28] [PASSED] ttm_bo_validate_pinned
[05:16:28] [PASSED] ttm_bo_validate_busy_placement
[05:16:28] ================ ttm_bo_validate_multihop =================
[05:16:28] [PASSED] Buffer object for userspace
[05:16:28] [PASSED] Kernel buffer object
[05:16:28] [PASSED] Shared buffer object
[05:16:28] ============ [PASSED] ttm_bo_validate_multihop =============
[05:16:28] ========== ttm_bo_validate_no_placement_signaled ==========
[05:16:28] [PASSED] Buffer object in system domain, no page vector
[05:16:28] [PASSED] Buffer object in system domain with an existing page vector
[05:16:28] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[05:16:28] ======== ttm_bo_validate_no_placement_not_signaled ========
[05:16:28] [PASSED] Buffer object for userspace
[05:16:28] [PASSED] Kernel buffer object
[05:16:28] [PASSED] Shared buffer object
[05:16:28] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[05:16:28] [PASSED] ttm_bo_validate_move_fence_signaled
[05:16:28] ========= ttm_bo_validate_move_fence_not_signaled =========
[05:16:28] [PASSED] Waits for GPU
[05:16:28] [PASSED] Tries to lock straight away
[05:16:29] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[05:16:29] [PASSED] ttm_bo_validate_swapout
[05:16:29] [PASSED] ttm_bo_validate_happy_evict
[05:16:29] [PASSED] ttm_bo_validate_all_pinned_evict
[05:16:29] [PASSED] ttm_bo_validate_allowed_only_evict
[05:16:29] [PASSED] ttm_bo_validate_deleted_evict
[05:16:29] [PASSED] ttm_bo_validate_busy_domain_evict
[05:16:29] [PASSED] ttm_bo_validate_evict_gutting
[05:16:29] [PASSED] ttm_bo_validate_recrusive_evict
stty: 'standard input': Inappropriate ioctl for device
[05:16:29] ================= [PASSED] ttm_bo_validate =================
[05:16:29] ============================================================
[05:16:29] Testing complete. Ran 102 tests: passed: 102
[05:16:29] Elapsed time: 9.924s total, 1.683s configuring, 7.624s building, 0.535s running
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
^ permalink raw reply [flat|nested] 24+ messages in thread
* ✗ CI.Build: failure for Introduce DRM device wedged event (rev7)
2024-11-15 5:07 [PATCH v9 0/4] Introduce DRM device wedged event Raag Jadav
` (6 preceding siblings ...)
2024-11-15 5:16 ` ✓ CI.KUnit: success " Patchwork
@ 2024-11-15 5:25 ` Patchwork
7 siblings, 0 replies; 24+ messages in thread
From: Patchwork @ 2024-11-15 5:25 UTC (permalink / raw)
To: Raag Jadav; +Cc: intel-xe
== Series Details ==
Series: Introduce DRM device wedged event (rev7)
URL : https://patchwork.freedesktop.org/series/138070/
State : failure
== Summary ==
lib/modules/6.12.0-rc7-xe/kernel/crypto/ecrdsa_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/xcbc.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/serpent_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/aria_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/crypto_simd.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/adiantum.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/tcrypt.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/crypto_engine.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/zstd.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/asymmetric_keys/
lib/modules/6.12.0-rc7-xe/kernel/crypto/asymmetric_keys/pkcs7_test_key.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/asymmetric_keys/pkcs8_key_parser.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/des_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/xctr.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/authenc.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/sm4_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/keywrap.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/camellia_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/sm3.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/pcrypt.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/aegis128.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/af_alg.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/algif_aead.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/cmac.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/sm3_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/aes_ti.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/chacha_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/poly1305_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/nhpoly1305.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/crc32_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/essiv.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/ccm.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/wp512.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/streebog_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/authencesn.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/echainiv.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/lrw.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/cryptd.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/crypto_user.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/algif_hash.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/vmac.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/polyval-generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/hctr2.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/842.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/pcbc.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/ansi_cprng.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/cast6_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/twofish_common.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/twofish_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/lz4hc.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/blowfish_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/md4.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/chacha20poly1305.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/curve25519-generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/lz4.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/rmd160.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/algif_skcipher.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/cast5_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/fcrypt.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/ecdsa_generic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/sm4.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/cast_common.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/blowfish_common.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/michael_mic.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/async_tx/
lib/modules/6.12.0-rc7-xe/kernel/crypto/async_tx/async_xor.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/async_tx/async_tx.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/async_tx/async_memcpy.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/async_tx/async_pq.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/async_tx/async_raid6_recov.ko
lib/modules/6.12.0-rc7-xe/kernel/crypto/algif_rng.ko
lib/modules/6.12.0-rc7-xe/kernel/block/
lib/modules/6.12.0-rc7-xe/kernel/block/bfq.ko
lib/modules/6.12.0-rc7-xe/kernel/block/kyber-iosched.ko
lib/modules/6.12.0-rc7-xe/build
lib/modules/6.12.0-rc7-xe/modules.alias.bin
lib/modules/6.12.0-rc7-xe/modules.builtin
lib/modules/6.12.0-rc7-xe/modules.softdep
lib/modules/6.12.0-rc7-xe/modules.alias
lib/modules/6.12.0-rc7-xe/modules.order
lib/modules/6.12.0-rc7-xe/modules.symbols
lib/modules/6.12.0-rc7-xe/modules.dep.bin
+ mv kernel.tar.gz ..
+ cd ..
+ rm -rf archive
++ date +%s
section_end:1731648332:package_x86_64
+ echo -e '\e[0Ksection_end:1731648332:package_x86_64\r\e[0K'
++ date +%s
+ echo -e '\e[0Ksection_start:1731648332:build_x86_64_nodebug[collapsed=true]\r\e[0KBuild x86-64 NoDebug'
+ mkdir -p build64-nodebug
section_start:1731648332:build_x86_64_nodebug[collapsed=true]
Build x86-64 NoDebug
+ KCONFIG_CONFIG=build64-nodebug/.config
+ ./scripts/kconfig/merge_config.sh -m .ci/kernel/kconfig .ci/kernel/nodebug.fragment
Using .ci/kernel/kconfig as base
Merging .ci/kernel/nodebug.fragment
The merge file '.ci/kernel/nodebug.fragment' does not exist. Exit.
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
^ permalink raw reply [flat|nested] 24+ messages in thread