All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Riana Tauro" <riana.tauro@intel.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 1/2] drm/drm_ras: Add drm_ras netlink error event
Date: Fri, 12 Jun 2026 05:22:23 +0000	[thread overview]
Message-ID: <20260612052224.1D6A31F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260611052144.784969-5-riana.tauro@intel.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [Medium] drm_ras_nl_error_event() restricts netlink multicast notifications to the initial network namespace, breaking netnsok = true functionality.

Pre-existing issues:
- [Critical] Missing RCU read lock or refcounting during xarray iteration and node lookup causes a Use-After-Free vulnerability.
- [High] Uninitialized stack variable `ret` is returned to userspace in netlink dumpit operations.
--

commit 162404af4b01c208febc1891d2805e74285b3e40
Author: Riana Tauro <riana.tauro@intel.com>

drm/drm_ras: Add drm_ras netlink error event

Define a new netlink event 'error-event' and a new multicast group
'error-notify' in drm_ras. Each event contains device name, node and
error information to identify the error triggering the event.

> diff --git a/drivers/gpu/drm/drm_ras.c b/drivers/gpu/drm/drm_ras.c
> index d6eab29a1394c..503d555e067c0 100644
> --- a/drivers/gpu/drm/drm_ras.c
> +++ b/drivers/gpu/drm/drm_ras.c

[Severity: Critical]
This is a pre-existing issue, but missing RCU read locks or refcounting
during xarray iteration and node lookup might cause a use-after-free.

Looking at get_node_error_counter(), drm_ras_nl_clear_error_counter_doit(),
and drm_ras_nl_list_nodes_dumpit(), they retrieve a node pointer from
drm_ras_xa:

	node = xa_load(&drm_ras_xa, node_id);
	if (!node || !node->query_error_counter)
		return -ENOENT;
	...
	return node->query_error_counter(node, error_id, name, value);

Since xa_load() releases the RCU read lock before returning, and the caller
does not hold an outer rcu_read_lock(), could a concurrent call to
drm_ras_node_unregister() free the node and result in a use-after-free
when accessing node fields?

[Severity: High]
This isn't a bug introduced by this patch, but returning an uninitialized
stack variable could cause issues in the netlink dumpit operations.

In drm_ras_nl_list_nodes_dumpit() and drm_ras_nl_get_error_counter_dumpit(),
the local variable ret is declared without initialization:

	int ret;

	xa_for_each_start(&drm_ras_xa, id, node, ctx->restart) {
	...
	}

	if (ret == -EMSGSIZE)
		ctx->restart = id;

	return ret;

If the xarray is empty or ctx->restart is beyond the last element (such as
on the final dump iteration), the loop is entirely skipped.

Could this return an uninitialized ret to userspace, potentially leaking
stack memory or causing netlink_dump() to loop indefinitely if ret happens
to be a positive value?

[ ... ]

> @@ -222,6 +255,61 @@ static int doit_reply_value(struct genl_info *info, u32 node_id,
>  	return genlmsg_reply(msg, info);
>  }
>  
> +/**
> + * drm_ras_nl_error_event() - Notify listeners of an error event
> + * @node: Node structure
> + * @error_id: ID of the error
> + * @error_name: Name of the error
> + * @value: Value associated with the error
> + * @flags: GFP flags for memory allocation
> + *
> + * Sends a notification to all listeners about an error event on a specific
> + * RAS node.
> + *
> + * Return: 0 on success, or negative errno on failure.
> + */
> +int drm_ras_nl_error_event(struct drm_ras_node *node, u32 error_id, const char *error_name,
> +			   u32 value, gfp_t flags)
> +{
> +	struct genl_info info;
> +	struct sk_buff *msg;
> +	struct nlattr *hdr;
> +	int ret;
> +
> +	if (!error_name)
> +		return -EINVAL;
> +
> +	if (!genl_has_listeners(&drm_ras_nl_family, &init_net, DRM_RAS_NLGRP_ERROR_NOTIFY))
> +		return 0;

[Severity: Medium]
Since the drm_ras_nl_family is configured with .netnsok = true, does this
check inadvertently restrict multicast notifications to only the initial
network namespace?

By checking &init_net and subsequently using genlmsg_multicast(), it
appears that subscribers in other network namespaces might not receive
these error events.

Would it be better to use genlmsg_multicast_allns() here to properly
broadcast to all namespaces?

> +
> +	genl_info_init_ntf(&info, &drm_ras_nl_family, DRM_RAS_CMD_ERROR_EVENT);
> +
> +	msg = genlmsg_new(NLMSG_GOODSIZE, flags);
> +	if (!msg)
> +		return -ENOMEM;
> +
> +	hdr = genlmsg_iput(msg, &info);
> +	if (!hdr) {
> +		ret = -EMSGSIZE;
> +		goto free_msg;
> +	}
> +
> +	ret = msg_put_error_event_attrs(msg, node, error_id, error_name, value);
> +	if (ret)
> +		goto cancel_msg;
> +
> +	genlmsg_end(msg, hdr);
> +	genlmsg_multicast(&drm_ras_nl_family, msg, 0, DRM_RAS_NLGRP_ERROR_NOTIFY, flags);
> +	return 0;
> +
> +cancel_msg:
> +	genlmsg_cancel(msg, hdr);
> +free_msg:
> +	nlmsg_free(msg);
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_ras_nl_error_event);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260611052144.784969-4-riana.tauro@intel.com?part=1

  reply	other threads:[~2026-06-12  5:22 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-11  5:21 [PATCH v2 0/2] Add drm_ras netlink error event support Riana Tauro
2026-06-11  5:21 ` [PATCH v2 1/2] drm/drm_ras: Add drm_ras netlink error event Riana Tauro
2026-06-12  5:22   ` sashiko-bot [this message]
2026-06-18 10:06   ` Raag Jadav
2026-06-11  5:21 ` [PATCH v2 2/2] drm/xe/xe_drm_ras: Add error-event support in XE drm_ras Riana Tauro
2026-06-12  5:22   ` sashiko-bot
2026-06-18 10:07   ` Raag Jadav
2026-06-11  5:29 ` ✓ CI.KUnit: success for Add drm_ras netlink error event support (rev2) Patchwork
2026-06-11  6:20 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-11 13:45 ` ✓ Xe.CI.FULL: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260612052224.1D6A31F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=riana.tauro@intel.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.