From: Raag Jadav <raag.jadav@intel.com>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: "Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
siqueira@igalia.com, airlied@gmail.com, simona@ffwll.ch,
rodrigo.vivi@intel.com, jani.nikula@linux.intel.com,
"Xaver Hugl" <xaver.hugl@gmail.com>,
"Krzysztof Karas" <krzysztof.karas@intel.com>,
dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
kernel-dev@igalia.com, amd-gfx@lists.freedesktop.org,
intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v7 2/5] drm: Create a task info option for wedge events
Date: Sun, 15 Jun 2025 14:01:40 +0300
Message-ID: <aE6oFOBbQ_3oRwtB@black.fi.intel.com>
In-Reply-To: <20250613184348.1761020-3-andrealmeid@igalia.com>
On Fri, Jun 13, 2025 at 03:43:45PM -0300, André Almeida wrote:
> When a device get wedged, it might be caused by a guilty application.
> For userspace, knowing which task was involved can be useful for some
> situations, like for implementing a policy, logs or for giving a chance
> for the compositor to let the user know what task was involved in the
> problem. This is an optional argument, when the task info is not
> available, the PID and TASK string won't appear in the event string.
>
> Sometimes just the PID isn't enough giving that the task might be already
> dead by the time userspace will try to check what was this PID's name,
> so to make the life easier also notify what's the task's name in the user
> event.
>
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (for i915 and xe)
> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
> Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Although I'm okay with this version, a few aesthetic nits below.
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
...
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 56dd61f8e05a..eba99a081ec1 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -538,10 +538,15 @@ static const char *drm_get_wedge_recovery(unsigned int opt)
> }
> }
>
> +#define WEDGE_STR_LEN 32
> +#define PID_STR_LEN 15
> +#define COMM_STR_LEN (TASK_COMM_LEN + 5)
Align the values using tabs for readability. Also, since you're using
TASK_COMM_LEN here, please include <linux/sched.h> directly instead of
relying on an intermediate header, which is not guaranteed to pull it in
on other archs and randconfigs.
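As a rough userspace sketch of what I mean (not the actual patch): the
values are tab-aligned, and TASK_COMM_LEN is stubbed to 16 as a stand-in
for the definition that <linux/sched.h> would provide in-tree. The
pid_str_fits() and comm_str_fits() helpers are hypothetical and exist
only to check the buffer arithmetic behind the two lengths.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in only: in-tree this would come from #include <linux/sched.h>. */
#define TASK_COMM_LEN	16

#define WEDGE_STR_LEN	32
#define PID_STR_LEN	15
#define COMM_STR_LEN	(TASK_COMM_LEN + 5)

/* "PID=" (4) + up to 10 digits of a 32-bit value + NUL = 15 bytes. */
static_assert(PID_STR_LEN >= sizeof("PID=4294967295"),
	      "PID_STR_LEN too small for a 32-bit PID");

/* Hypothetical helper: 1 if "PID=<pid>" fits without truncation. */
static int pid_str_fits(unsigned int pid)
{
	char buf[PID_STR_LEN];
	int n = snprintf(buf, sizeof(buf), "PID=%u", pid);

	return n >= 0 && (size_t)n < sizeof(buf);
}

/* Hypothetical helper: 1 if "TASK=<comm>" fits without truncation. */
static int comm_str_fits(const char *comm)
{
	char buf[COMM_STR_LEN];
	int n = snprintf(buf, sizeof(buf), "TASK=%s", comm);

	return n >= 0 && (size_t)n < sizeof(buf);
}
```

A comm of up to TASK_COMM_LEN - 1 characters fits; anything longer would
be truncated, which should not happen for a real task name.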
> +
> /**
> * drm_dev_wedged_event - generate a device wedged uevent
> * @dev: DRM device
> * @method: method(s) to be used for recovery
> + * @info: optional information about the guilty task
> *
> * This generates a device wedged uevent for the DRM device specified by @dev.
> * Recovery @method\(s) of choice will be sent in the uevent environment as
> @@ -554,13 +559,13 @@ static const char *drm_get_wedge_recovery(unsigned int opt)
> *
> * Returns: 0 on success, negative error code otherwise.
> */
> -int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
> +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method,
> + struct drm_wedge_task_info *info)
> {
> const char *recovery = NULL;
> unsigned int len, opt;
> - /* Event string length up to 28+ characters with available methods */
> - char event_string[32];
> - char *envp[] = { event_string, NULL };
> + char event_string[WEDGE_STR_LEN], pid_string[PID_STR_LEN], comm_string[COMM_STR_LEN];
> + char *envp[] = { event_string, NULL, NULL, NULL };
Let's declare the locals in reverse xmas tree order (longest line first),
consistent with the other helpers in this file.
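For illustration, a userspace mock of the ordering I have in mind. This
is only a sketch under assumptions: struct wedge_task_info and
build_wedge_env() are hypothetical stand-ins, TASK_COMM_LEN is stubbed,
"none" is a placeholder recovery value, and the envp entries are joined
into one string purely so the result can be inspected. Note that envp
still has to follow event_string because its initializer refers to it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN	16	/* stand-in for <linux/sched.h> */
#define WEDGE_STR_LEN	32
#define PID_STR_LEN	15
#define COMM_STR_LEN	(TASK_COMM_LEN + 5)

/* Hypothetical userspace stand-in for the task info struct. */
struct wedge_task_info {
	int pid;
	char comm[TASK_COMM_LEN];
};

/*
 * Hypothetical mock: builds the env strings and joins them with spaces
 * into @out. Returns the number of entries, or -1 if @out is too small.
 */
static int build_wedge_env(const struct wedge_task_info *info, char *out, size_t out_len)
{
	char event_string[WEDGE_STR_LEN], pid_string[PID_STR_LEN], comm_string[COMM_STR_LEN];
	char *envp[] = { event_string, NULL, NULL, NULL };
	size_t used = 0;
	int i;

	if (!out || !out_len)
		return -1;

	/* Placeholder recovery method for the sketch. */
	snprintf(event_string, sizeof(event_string), "WEDGED=%s", "none");

	/* Task info is optional: only add it when it looks valid. */
	if (info && info->comm[0] != '\0' && info->pid >= 0) {
		snprintf(pid_string, sizeof(pid_string), "PID=%u", (unsigned int)info->pid);
		snprintf(comm_string, sizeof(comm_string), "TASK=%s", info->comm);
		envp[1] = pid_string;
		envp[2] = comm_string;
	}

	out[0] = '\0';
	for (i = 0; envp[i]; i++) {
		int n = snprintf(out + used, out_len - used, "%s%s", i ? " " : "", envp[i]);

		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += n;
	}

	return i;
}
```

With task info the mock yields three entries, and without it only the
WEDGED entry, matching the optional behaviour described in the commit
message.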
> len = scnprintf(event_string, sizeof(event_string), "%s", "WEDGED=");
>
> @@ -582,6 +587,13 @@ int drm_dev_wedged_event(struct drm_device *dev, unsigned long method)
> drm_info(dev, "device wedged, %s\n", method == DRM_WEDGE_RECOVERY_NONE ?
> "but recovered through reset" : "needs recovery");
>
> + if (info && (info->comm[0] != '\0') && (info->pid >= 0)) {
> + snprintf(pid_string, sizeof(pid_string), "PID=%u", info->pid);
> + snprintf(comm_string, sizeof(comm_string), "TASK=%s", info->comm);
> + envp[1] = pid_string;
> + envp[2] = comm_string;
> + }
> +
> return kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> }
> EXPORT_SYMBOL(drm_dev_wedged_event);
...
> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> index e2f894f1b90a..729e1c6da138 100644
> --- a/include/drm/drm_device.h
> +++ b/include/drm/drm_device.h
> @@ -30,6 +30,14 @@ struct pci_controller;
> #define DRM_WEDGE_RECOVERY_REBIND BIT(1) /* unbind + bind driver */
> #define DRM_WEDGE_RECOVERY_BUS_RESET BIT(2) /* unbind + reset bus device + bind */
>
> +/**
> + * struct drm_wedge_task_info - information about the guilty task of a wedge dev
> + */
> +struct drm_wedge_task_info {
> + pid_t pid;
> + char comm[TASK_COMM_LEN];
Ditto: include <linux/sched.h> here as well for TASK_COMM_LEN.
Raag
> +};
> +
> /**
> * enum switch_power_state - power state of drm device
> */
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index 63b51942d606..3f76a32d6b84 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -487,7 +487,8 @@ void drm_put_dev(struct drm_device *dev);
> bool drm_dev_enter(struct drm_device *dev, int *idx);
> void drm_dev_exit(int idx);
> void drm_dev_unplug(struct drm_device *dev);
> -int drm_dev_wedged_event(struct drm_device *dev, unsigned long method);
> +int drm_dev_wedged_event(struct drm_device *dev, unsigned long method,
> + struct drm_wedge_task_info *info);
>
> /**
> * drm_dev_is_unplugged - is a DRM device unplugged
> --
> 2.49.0
>