* [PATCH v6 0/3] drm: Create a task info option for wedge events
@ 2025-05-21 15:33 André Almeida
2025-05-21 15:33 ` [PATCH v6 1/3] " André Almeida
` (5 more replies)
0 siblings, 6 replies; 8+ messages in thread
From: André Almeida @ 2025-05-21 15:33 UTC (permalink / raw)
To: Alex Deucher, Christian König, siqueira, airlied, simona,
Raag Jadav, rodrigo.vivi, jani.nikula, Xaver Hugl,
Krzysztof Karas
Cc: dri-devel, linux-kernel, kernel-dev, amd-gfx, intel-xe, intel-gfx,
André Almeida
This patchset implements a request made by Xaver Hugl about wedge events:
"I'd really like to have the PID of the client that triggered the GPU
reset, so that we can kill it if multiple resets are triggered in a
row (or switch to software rendering if it's KWin itself) and show a
user-friendly notification about why their app(s) crashed, but that
can be added later."
From https://lore.kernel.org/dri-devel/CAFZQkGwJ4qgHV8WTp2=svJ_VXhb-+Y8_VNtKB=jLsk6DqMYp9w@mail.gmail.com/
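A minimal userspace sketch of how a compositor or daemon might pick the task information out of a wedge uevent. It assumes the payload arrives as a run of NUL-terminated KEY=value strings, as it does on the kernel uevent netlink socket. struct wedge_event, match_key() and parse_wedge_event() are hypothetical names used only for illustration and are not part of this series; PID= and TASK= are the keys added by patch 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace container; not part of the kernel patches. */
struct wedge_event {
	char recovery[32];	/* value of WEDGED=, e.g. "none" */
	bool has_task;		/* true only if both PID= and TASK= were seen */
	long pid;
	char task[32];		/* value of TASK= */
};

/* Return the value part if the n-byte entry starts with key, else NULL. */
static const char *match_key(const char *kv, size_t n, const char *key)
{
	size_t klen = strlen(key);

	return (n >= klen && memcmp(kv, key, klen) == 0) ? kv + klen : NULL;
}

/*
 * Parse a uevent payload made of NUL-terminated "KEY=value" strings,
 * len bytes in total. Returns true if a WEDGED= entry was found.
 */
static bool parse_wedge_event(const char *buf, size_t len, struct wedge_event *out)
{
	bool wedged = false, has_pid = false, has_name = false;
	size_t off = 0;

	assert(buf && out);
	memset(out, 0, sizeof(*out));

	while (off < len) {
		const char *kv = buf + off;
		const char *nul = memchr(kv, '\0', len - off);
		size_t n = nul ? (size_t)(nul - kv) : len - off;
		const char *val;

		if ((val = match_key(kv, n, "WEDGED="))) {
			snprintf(out->recovery, sizeof(out->recovery), "%.*s",
				 (int)(n - 7), val);
			wedged = true;
		} else if ((val = match_key(kv, n, "PID="))) {
			char tmp[16], *end;

			/* Copy first so strtol() always sees a terminated string. */
			snprintf(tmp, sizeof(tmp), "%.*s", (int)(n - 4), val);
			out->pid = strtol(tmp, &end, 10);
			has_pid = end != tmp && *end == '\0';
		} else if ((val = match_key(kv, n, "TASK="))) {
			snprintf(out->task, sizeof(out->task), "%.*s",
				 (int)(n - 5), val);
			has_name = out->task[0] != '\0';
		}
		off += n + 1;	/* skip the entry and its terminator */
	}

	/* The task info is optional, so both keys may be absent. */
	out->has_task = has_pid && has_name;
	return wedged;
}
```

Entries the parser does not know (ACTION=, DEVPATH= and so on) are skipped, so the same function handles events from drivers that pass no task info.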
For testing, I've used amdgpu's debug_mask options debug_disable_soft_recovery
and debug_disable_gpu_ring_reset to test both wedge event paths in the driver.
To trigger a ring timeout, I've used this app:
https://gitlab.freedesktop.org/andrealmeid/gpu-timeout
Thanks!
Changelog:
v6:
- Check if PID >= 0 for displaying the task info
- s/app/task in a comment
v5:
- Change from app to task also in structs, commit message and docs
- Add a check for NULL or empty task name string
v4:
- Change from APP to TASK
- Add defines for event_string and pid_string length
v3:
- Make comm_string and pid_string empty when there's no app info
- Change "app that caused ..." to "app involved ..."
- Clarify that devcoredump has more information about what happened
v2:
- Rebased on top of drm/drm-next
- Added new patch for documentation
André Almeida (3):
drm: Create a task info option for wedge events
drm/doc: Add a section about "Task information" for the wedge API
drm/amdgpu: Make use of drm_wedge_task_info
Documentation/gpu/drm-uapi.rst | 17 +++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++++++++++++++++--
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++++-
drivers/gpu/drm/drm_drv.c | 19 +++++++++++++++----
drivers/gpu/drm/i915/gt/intel_reset.c | 3 ++-
drivers/gpu/drm/xe/xe_device.c | 3 ++-
include/drm/drm_device.h | 8 ++++++++
include/drm/drm_drv.h | 3 ++-
8 files changed, 68 insertions(+), 10 deletions(-)
--
2.49.0
* [PATCH v6 1/3] drm: Create a task info option for wedge events
2025-05-21 15:33 [PATCH v6 0/3] drm: Create a task info option for wedge events André Almeida
@ 2025-05-21 15:33 ` André Almeida
2025-05-23 2:45 ` Raag Jadav
2025-05-21 15:33 ` [PATCH v6 2/3] drm/doc: Add a section about "Task information" for the wedge API André Almeida
` (4 subsequent siblings)
5 siblings, 1 reply; 8+ messages in thread
From: André Almeida @ 2025-05-21 15:33 UTC (permalink / raw)
To: Alex Deucher, Christian König, siqueira, airlied, simona,
Raag Jadav, rodrigo.vivi, jani.nikula, Xaver Hugl,
Krzysztof Karas
Cc: dri-devel, linux-kernel, kernel-dev, amd-gfx, intel-xe, intel-gfx,
André Almeida
When a device gets wedged, it might be caused by a guilty application.
For userspace, knowing which task was the cause can be useful in some
situations, such as implementing a policy, logging, or giving the
compositor a chance to let the user know which task caused the problem.
This is an optional argument: when the task info is not available, the
PID and TASK strings won't appear in the event string.
Sometimes the PID alone isn't enough, given that the task might already
be dead by the time userspace tries to look up the name for this PID.
To make life easier, also report the task's name in the user event.
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (for i915 and xe)
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
v6:
- s/app/task in a comment
- add PID >= 0 check
v5:
- s/app/task for struct and commit message as well
- move defines to drm_drv.c
- validate that comm is not NULL and not empty
v4: s/APP/TASK
v3: Make comm_string and pid_string empty when there's no app info
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +-
drivers/gpu/drm/drm_drv.c | 19 +++++++++++++++----
drivers/gpu/drm/i915/gt/intel_reset.c | 3 ++-
drivers/gpu/drm/xe/xe_device.c | 3 ++-
include/drm/drm_device.h | 8 ++++++++
include/drm/drm_drv.h | 3 ++-
7 files changed, 31 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4d1b54f58495..d27091d5929c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6363,7 +6363,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
atomic_set(&adev->reset_domain->reset_res, r);
if (!r)
- drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
+ drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL);
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index acb21fc8b3ce..a47b2eb301e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -166,7 +166,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
if (amdgpu_ring_sched_ready(ring))
drm_sched_start(&ring->sched, 0);
dev_err(adev->dev, "Ring %s reset succeeded\n", ring->sched.name);
- drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
+ drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL);
goto exit;
}
dev_err(adev->dev, "Ring %s reset failure\n", ring->sched.name);
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 3dc7acd56b1d..db0cfa367b4e 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -538,10 +538,14 @@ static const char *drm_get_wedge_recovery(unsigned int opt)
}
}
+#define WEDGE_STR_LEN 32
+#define PID_LEN 15
+
/**
* drm_dev_wedged_event - generate a device wedged uevent
* @dev: DRM device
* @method: method(s) to be used for recovery
+ * @info: optional information about the guilty task
*
* This generates a device wedged uevent for the DRM device specified by @dev.
* Recovery @method\(s) of choice will be sent in the uevent environment as
@@ -554,13 +558,13 @@ static const char *drm_get_wedge_recovery(unsigned int opt)
*
* Returns: 0 on success, negative error code otherwise.
*/
-int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
+int drm_dev_wedged_event(struct drm_device *dev, unsigned long method,
+ struct drm_wedge_task_info *info)
{
const char *recovery = NULL;
unsigned int len, opt;
- /* Event string length up to 28+ characters with available methods */
- char event_string[32];
- char *envp[] = { event_string, NULL };
+ char event_string[WEDGE_STR_LEN], pid_string[PID_LEN] = "", comm_string[TASK_COMM_LEN] = "";
+ char *envp[] = { event_string, NULL, NULL, NULL };
len = scnprintf(event_string, sizeof(event_string), "%s", "WEDGED=");
@@ -582,6 +586,13 @@ int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
drm_info(dev, "device wedged, %s\n", method == DRM_WEDGE_RECOVERY_NONE ?
"but recovered through reset" : "needs recovery");
+ if (info && ((info->comm && info->comm[0] != '\0')) && (info->pid >= 0)) {
+ snprintf(pid_string, sizeof(pid_string), "PID=%u", info->pid);
+ snprintf(comm_string, sizeof(comm_string), "TASK=%s", info->comm);
+ envp[1] = pid_string;
+ envp[2] = comm_string;
+ }
+
return kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
}
EXPORT_SYMBOL(drm_dev_wedged_event);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index dbdcfe130ad4..ba1d8fdc3c7b 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1448,7 +1448,8 @@ static void intel_gt_reset_global(struct intel_gt *gt,
kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
else
drm_dev_wedged_event(&gt->i915->drm,
- DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
+ DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET,
+ NULL);
}
/**
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index c02c4c4e9412..f329613e061f 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -1168,7 +1168,8 @@ void xe_device_declare_wedged(struct xe_device *xe)
/* Notify userspace of wedged device */
drm_dev_wedged_event(&xe->drm,
- DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET);
+ DRM_WEDGE_RECOVERY_REBIND | DRM_WEDGE_RECOVERY_BUS_RESET,
+ NULL);
}
for_each_gt(gt, xe, id)
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index e2f894f1b90a..91931301355e 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -30,6 +30,14 @@ struct pci_controller;
#define DRM_WEDGE_RECOVERY_REBIND BIT(1) /* unbind + bind driver */
#define DRM_WEDGE_RECOVERY_BUS_RESET BIT(2) /* unbind + reset bus device + bind */
+/**
+ * struct drm_wedge_task_info - information about the guilty task of a wedge dev
+ */
+struct drm_wedge_task_info {
+ pid_t pid;
+ char *comm;
+};
+
/**
* enum switch_power_state - power state of drm device
*/
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index a43d707b5f36..ac45ed855321 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -482,7 +482,8 @@ void drm_put_dev(struct drm_device *dev);
bool drm_dev_enter(struct drm_device *dev, int *idx);
void drm_dev_exit(int idx);
void drm_dev_unplug(struct drm_device *dev);
-int drm_dev_wedged_event(struct drm_device *dev, unsigned long method);
+int drm_dev_wedged_event(struct drm_device *dev, unsigned long method,
+ struct drm_wedge_task_info *info);
/**
* drm_dev_is_unplugged - is a DRM device unplugged
--
2.49.0
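
A userspace model of the environment-building step in the patch above, offered as a sketch for experimenting with the buffer sizes rather than as kernel code. struct wedge_task_info and build_wedge_env() are hypothetical stand-ins. PID_LEN and the validity check mirror the diff, and TASK_COMM_LEN is assumed to be 16, its value in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define PID_LEN 15
#define TASK_COMM_LEN 16	/* assumed: same value as in the kernel */

/* Userspace stand-in for struct drm_wedge_task_info from the patch. */
struct wedge_task_info {
	pid_t pid;
	const char *comm;
};

/*
 * Fill envp (4 slots) the way the patched function does and return the
 * number of entries set. event is the already formatted "WEDGED=..."
 * string; pid_string and comm_string must hold PID_LEN and TASK_COMM_LEN
 * bytes respectively.
 */
static int build_wedge_env(char *envp[4], char *event, char *pid_string,
			   char *comm_string,
			   const struct wedge_task_info *info)
{
	int n = 0;

	assert(envp && event && pid_string && comm_string);
	envp[n++] = event;
	envp[1] = envp[2] = envp[3] = NULL;

	/* Same condition as the patch: a name is required and PID must be >= 0. */
	if (info && info->comm && info->comm[0] != '\0' && info->pid >= 0) {
		snprintf(pid_string, PID_LEN, "PID=%d", (int)info->pid);
		snprintf(comm_string, TASK_COMM_LEN, "TASK=%s", info->comm);
		envp[n++] = pid_string;
		envp[n++] = comm_string;
	}
	return n;
}
```

In this model the 16-byte comm_string also carries the 5-byte "TASK=" prefix, so at most 10 characters of the name survive. A consumer should therefore be prepared for shortened task names. The two extra buffers are only read when the check passes, since envp[1] and envp[2] stay NULL otherwise.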
* [PATCH v6 2/3] drm/doc: Add a section about "Task information" for the wedge API
2025-05-21 15:33 [PATCH v6 0/3] drm: Create a task info option for wedge events André Almeida
2025-05-21 15:33 ` [PATCH v6 1/3] " André Almeida
@ 2025-05-21 15:33 ` André Almeida
2025-05-21 15:33 ` [PATCH v6 3/3] drm/amdgpu: Make use of drm_wedge_task_info André Almeida
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: André Almeida @ 2025-05-21 15:33 UTC (permalink / raw)
To: Alex Deucher, Christian König, siqueira, airlied, simona,
Raag Jadav, rodrigo.vivi, jani.nikula, Xaver Hugl,
Krzysztof Karas
Cc: dri-devel, linux-kernel, kernel-dev, amd-gfx, intel-xe, intel-gfx,
André Almeida
Add a section about "Task information" for the wedge API.
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
v5:
- Change app to task in the text as well
v4:
- Change APP to TASK
v3:
- Change "app that caused ..." to "app involved ..."
- Clarify that devcoredump has more information about what happened
- Update that PID and APP will be empty if there's no app info
---
Documentation/gpu/drm-uapi.rst | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 69f72e71a96e..24aa9f320ebc 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -446,6 +446,23 @@ telemetry information (devcoredump, syslog). This is useful because the first
hang is usually the most critical one which can result in consequential hangs or
complete wedging.
+Task information
+----------------
+
+The information about which application (if any) was involved in the device
+wedging is useful for userspace if it wants to notify the user about what
+happened (e.g. the compositor displays a message to the user "The <task name>
+caused a graphical error and the system recovered") or to implement policies
+(e.g. the daemon may "ban" a task that keeps resetting the device). If the task
+information is available, the uevent will include ``PID=<pid>`` and
+``TASK=<task name>``. Otherwise, ``PID`` and ``TASK`` will not appear in the
+event string.
+
+The reliability of this information is driver and hardware specific, and its
+precision should be treated with caution. To get the big picture of what
+really happened, the devcoredump file should have much more detailed
+information about the device state and about the event.
+
Consumer prerequisites
----------------------
--
2.49.0
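
The "ban" policy mentioned in the documentation above could be sketched in userspace roughly as follows. Everything here is hypothetical (the names, the table size, the limit of 3 events within 60 seconds); it only illustrates counting wedge events per task. Since the documentation warns that the task info is driver and hardware specific, a real daemon would likely want to treat the result as a hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_TRACKED 16		/* hypothetical table size */
#define RESET_LIMIT 3		/* hypothetical: events tolerated per window */
#define WINDOW_SECS 60		/* hypothetical window length */

struct offender {
	long pid;
	char task[32];
	int count;
	time_t first_seen;
};

struct ban_policy {
	struct offender slots[MAX_TRACKED];
	int used;
};

/*
 * Record one wedge event attributed to (pid, task) at time now.
 * Returns true once the task reaches RESET_LIMIT events within WINDOW_SECS.
 */
static bool policy_record(struct ban_policy *p, long pid, const char *task,
			  time_t now)
{
	struct offender *o = NULL;

	assert(p && task);
	for (int i = 0; i < p->used; i++) {
		/* Match on PID and name, since PIDs can be reused. */
		if (p->slots[i].pid == pid &&
		    strcmp(p->slots[i].task, task) == 0) {
			o = &p->slots[i];
			break;
		}
	}
	if (!o) {
		if (p->used == MAX_TRACKED)
			return false;	/* table full: not tracked in this sketch */
		o = &p->slots[p->used++];
		memset(o, 0, sizeof(*o));
		o->pid = pid;
		snprintf(o->task, sizeof(o->task), "%s", task);
		o->first_seen = now;
	}
	if (now - o->first_seen > WINDOW_SECS) {
		o->count = 0;		/* window expired: start counting again */
		o->first_seen = now;
	}
	return ++o->count >= RESET_LIMIT;
}
```

Keying on both PID and task name is a design choice for the sketch: the PID alone may already belong to a different process by the time the next event arrives.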
* [PATCH v6 3/3] drm/amdgpu: Make use of drm_wedge_task_info
2025-05-21 15:33 [PATCH v6 0/3] drm: Create a task info option for wedge events André Almeida
2025-05-21 15:33 ` [PATCH v6 1/3] " André Almeida
2025-05-21 15:33 ` [PATCH v6 2/3] drm/doc: Add a section about "Task information" for the wedge API André Almeida
@ 2025-05-21 15:33 ` André Almeida
2025-05-21 15:57 ` ✓ CI.Patch_applied: success for drm: Create a task info option for wedge events Patchwork
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: André Almeida @ 2025-05-21 15:33 UTC (permalink / raw)
To: Alex Deucher, Christian König, siqueira, airlied, simona,
Raag Jadav, rodrigo.vivi, jani.nikula, Xaver Hugl,
Krzysztof Karas
Cc: dri-devel, linux-kernel, kernel-dev, amd-gfx, intel-xe, intel-gfx,
André Almeida
To notify userspace about which task (if any) caused the device to enter
a wedged state, make use of the drm_wedge_task_info parameter, filling it
with the task PID and name.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++++++++++++++++--
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++++-
2 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index d27091d5929c..c29c924aa506 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6362,8 +6362,23 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
atomic_set(&adev->reset_domain->reset_res, r);
- if (!r)
- drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL);
+ if (!r) {
+ struct drm_wedge_task_info aux, *info = NULL;
+
+ if (job) {
+ struct amdgpu_task_info *ti;
+
+ ti = amdgpu_vm_get_task_info_pasid(adev, job->pasid);
+ if (ti) {
+ aux.pid = ti->pid;
+ aux.comm = ti->process_name;
+ info = &aux;
+ amdgpu_vm_put_task_info(ti);
+ }
+ }
+
+ drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, info);
+ }
return r;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index a47b2eb301e5..5cb17e62df57 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -89,6 +89,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
{
struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
struct amdgpu_job *job = to_amdgpu_job(s_job);
+ struct drm_wedge_task_info aux, *info = NULL;
struct amdgpu_task_info *ti;
struct amdgpu_device *adev = ring->adev;
int idx;
@@ -127,6 +128,9 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
dev_err(adev->dev,
"Process information: process %s pid %d thread %s pid %d\n",
ti->process_name, ti->tgid, ti->task_name, ti->pid);
+ aux.pid = ti->pid;
+ aux.comm = ti->process_name;
+ info = &aux;
amdgpu_vm_put_task_info(ti);
}
@@ -166,7 +170,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
if (amdgpu_ring_sched_ready(ring))
drm_sched_start(&ring->sched, 0);
dev_err(adev->dev, "Ring %s reset succeeded\n", ring->sched.name);
- drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, NULL);
+ drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE, info);
goto exit;
}
dev_err(adev->dev, "Ring %s reset failure\n", ring->sched.name);
--
2.49.0
* ✓ CI.Patch_applied: success for drm: Create a task info option for wedge events
2025-05-21 15:33 [PATCH v6 0/3] drm: Create a task info option for wedge events André Almeida
` (2 preceding siblings ...)
2025-05-21 15:33 ` [PATCH v6 3/3] drm/amdgpu: Make use of drm_wedge_task_info André Almeida
@ 2025-05-21 15:57 ` Patchwork
2025-05-21 15:57 ` ✗ CI.checkpatch: warning " Patchwork
2025-05-21 15:57 ` ✗ CI.KUnit: failure " Patchwork
5 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2025-05-21 15:57 UTC (permalink / raw)
To: André Almeida; +Cc: intel-xe
== Series Details ==
Series: drm: Create a task info option for wedge events
URL : https://patchwork.freedesktop.org/series/149332/
State : success
== Summary ==
=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: 2d8dc23f1e6c drm-tip: 2025y-05m-21d-13h-29m-52s UTC integration manifest
=== git am output follows ===
Applying: drm: Create a task info option for wedge events
Applying: drm/doc: Add a section about "Task information" for the wedge API
Applying: drm/amdgpu: Make use of drm_wedge_task_info
* ✗ CI.checkpatch: warning for drm: Create a task info option for wedge events
2025-05-21 15:33 [PATCH v6 0/3] drm: Create a task info option for wedge events André Almeida
` (3 preceding siblings ...)
2025-05-21 15:57 ` ✓ CI.Patch_applied: success for drm: Create a task info option for wedge events Patchwork
@ 2025-05-21 15:57 ` Patchwork
2025-05-21 15:57 ` ✗ CI.KUnit: failure " Patchwork
5 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2025-05-21 15:57 UTC (permalink / raw)
To: André Almeida; +Cc: intel-xe
== Series Details ==
Series: drm: Create a task info option for wedge events
URL : https://patchwork.freedesktop.org/series/149332/
State : warning
== Summary ==
+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
202708c00696422fd217223bb679a353a5936e23
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 31e76dc1708a823bacf3fe8faccf756db7a0069d
Author: André Almeida <andrealmeid@igalia.com>
Date: Wed May 21 12:33:23 2025 -0300
drm/amdgpu: Make use of drm_wedge_task_info
To notify userspace about which task (if any) made the device get in a
wedge state, make use of drm_wedge_task_info parameter, filling it with
the task PID and name.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
+ /mt/dim checkpatch 2d8dc23f1e6ce72fe9e45fd0db71076d90a4e1aa drm-intel
0048782eb049 drm: Create a task info option for wedge events
-:84: WARNING:STATIC_CONST_CHAR_ARRAY: char * array declaration might be better as static const
#84: FILE: drivers/gpu/drm/drm_drv.c:567:
+ char *envp[] = { event_string, NULL, NULL, NULL };
-:92: CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'info->pid >= 0'
#92: FILE: drivers/gpu/drm/drm_drv.c:589:
+ if (info && ((info->comm && info->comm[0] != '\0')) && (info->pid >= 0)) {
total: 0 errors, 1 warnings, 1 checks, 101 lines checked
61cf01402432 drm/doc: Add a section about "Task information" for the wedge API
31e76dc1708a drm/amdgpu: Make use of drm_wedge_task_info
* ✗ CI.KUnit: failure for drm: Create a task info option for wedge events
2025-05-21 15:33 [PATCH v6 0/3] drm: Create a task info option for wedge events André Almeida
` (4 preceding siblings ...)
2025-05-21 15:57 ` ✗ CI.checkpatch: warning " Patchwork
@ 2025-05-21 15:57 ` Patchwork
5 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2025-05-21 15:57 UTC (permalink / raw)
To: André Almeida; +Cc: intel-xe
== Series Details ==
Series: drm: Create a task info option for wedge events
URL : https://patchwork.freedesktop.org/series/149332/
State : failure
== Summary ==
+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[15:57:29] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[15:57:33] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
ERROR:root:../drivers/gpu/drm/drm_bridge.c:1406:6: error: redefinition of ‘devm_drm_put_bridge’
1406 | void devm_drm_put_bridge(struct device *dev, struct drm_bridge *bridge)
| ^~~~~~~~~~~~~~~~~~~
In file included from ../drivers/gpu/drm/drm_bridge.c:31:
../include/drm/drm_bridge.h:1314:20: note: previous definition of ‘devm_drm_put_bridge’ with type ‘void(struct device *, struct drm_bridge *)’
1314 | static inline void devm_drm_put_bridge(struct device *dev, struct drm_bridge *bridge) {}
| ^~~~~~~~~~~~~~~~~~~
make[6]: *** [../scripts/Makefile.build:203: drivers/gpu/drm/drm_bridge.o] Error 1
make[6]: *** Waiting for unfinished jobs....
make[5]: *** [../scripts/Makefile.build:461: drivers/gpu/drm] Error 2
make[4]: *** [../scripts/Makefile.build:461: drivers/gpu] Error 2
make[3]: *** [../scripts/Makefile.build:461: drivers] Error 2
make[2]: *** [/kernel/Makefile:2003: .] Error 2
make[1]: *** [/kernel/Makefile:248: __sub-make] Error 2
make: *** [Makefile:248: __sub-make] Error 2
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
* Re: [PATCH v6 1/3] drm: Create a task info option for wedge events
2025-05-21 15:33 ` [PATCH v6 1/3] " André Almeida
@ 2025-05-23 2:45 ` Raag Jadav
0 siblings, 0 replies; 8+ messages in thread
From: Raag Jadav @ 2025-05-23 2:45 UTC (permalink / raw)
To: André Almeida
Cc: Alex Deucher, Christian König, siqueira, airlied, simona,
rodrigo.vivi, jani.nikula, Xaver Hugl, Krzysztof Karas, dri-devel,
linux-kernel, kernel-dev, amd-gfx, intel-xe, intel-gfx
On Wed, May 21, 2025 at 12:33:21PM -0300, André Almeida wrote:
> When a device get wedged, it might be caused by a guilty application.
> For userspace, knowing which task was the cause can be useful for some
s/cause/involved
> situations, like for implementing a policy, logs or for giving a chance
> for the compositor to let the user know what task caused the problem.
Ditto
> This is an optional argument, when the task info is not available, the
> PID and TASK string won't appear in the event string.
>
> Sometimes just the PID isn't enough giving that the task might be already
> dead by the time userspace will try to check what was this PID's name,
> so to make the life easier also notify what's the task's name in the user
> event.
...
> -int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
> +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method,
> + struct drm_wedge_task_info *info)
> {
> const char *recovery = NULL;
> unsigned int len, opt;
> - /* Event string length up to 28+ characters with available methods */
> - char event_string[32];
> - char *envp[] = { event_string, NULL };
> + char event_string[WEDGE_STR_LEN], pid_string[PID_LEN] = "", comm_string[TASK_COMM_LEN] = "";
Most likely there's no need to initialize these.
With above changes,
Reviewed-by: Raag Jadav <raag.jadav@intel.com>