From: Riana Tauro <riana.tauro@intel.com>
To: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
<intel-xe@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>,
<aravind.iddamsetty@linux.intel.com>, <anshuman.gupta@intel.com>,
<joonas.lahtinen@linux.intel.com>, <lukas@wunner.de>,
<simona.vetter@ffwll.ch>, <airlied@gmail.com>,
<pratik.bari@intel.com>, <joshua.santosh.ranjan@intel.com>,
<ashwin.kumar.kulkarni@intel.com>, <shubham.kumar@intel.com>,
Lijo Lazar <lijo.lazar@amd.com>,
Hawking Zhang <Hawking.Zhang@amd.com>,
"David S. Miller" <davem@davemloft.net>,
Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>, <netdev@vger.kernel.org>,
Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Subject: Re: [PATCH v3 1/4] drm/ras: Introduce the DRM RAS infrastructure over generic netlink
Date: Fri, 16 Jan 2026 11:26:36 +0530 [thread overview]
Message-ID: <3297a59b-a788-43aa-945d-e89592c9ba8d@intel.com> (raw)
In-Reply-To: <919c0b7e-83d7-45e8-ae96-d9fb7a10995c@oss.qualcomm.com>
On 1/16/2026 5:09 AM, Zack McKevitt wrote:
>
>
> On 1/13/2026 1:20 AM, Riana Tauro wrote:
>>>>>> diff --git a/Documentation/netlink/specs/drm_ras.yaml b/
>>>>>> Documentation/netlink/specs/drm_ras.yaml
>>>>>> new file mode 100644
>>>>>> index 000000000000..be0e379c5bc9
>>>>>> --- /dev/null
>>>>>> +++ b/Documentation/netlink/specs/drm_ras.yaml
>>>>>> @@ -0,0 +1,130 @@
>>>>>> +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR
>>>>>> BSD-3-Clause)
>>>>>> +---
>>>>>> +name: drm-ras
>>>>>> +protocol: genetlink
>>>>>> +uapi-header: drm/drm_ras.h
>>>>>> +
>>>>>> +doc: >-
>>>>>> + DRM RAS (Reliability, Availability, Serviceability) over
>>>>>> Generic Netlink.
>>>>>> + Provides a standardized mechanism for DRM drivers to register
>>>>>> "nodes"
>>>>>> + representing hardware/software components capable of reporting
>>>>>> error counters.
>>>>>> + Userspace tools can query the list of nodes or individual error
>>>>>> counters
>>>>>> + via the Generic Netlink interface.
>>>>>> +
>>>>>> +definitions:
>>>>>> + -
>>>>>> + type: enum
>>>>>> + name: node-type
>>>>>> + value-start: 1
>>>>>> + entries: [error-counter]
>>>>>> + doc: >-
>>>>>> + Type of the node. Currently, only error-counter nodes are
>>>>>> + supported, which expose reliability counters for a
>>>>>> hardware/software
>>>>>> + component.
>>>>>> +
>>>>>> +attribute-sets:
>>>>>> + -
>>>>>> + name: node-attrs
>>>>>> + attributes:
>>>>>> + -
>>>>>> + name: node-id
>>>>>> + type: u32
>>>>>> + doc: >-
>>>>>> + Unique identifier for the node.
>>>>>> + Assigned dynamically by the DRM RAS core upon
>>>>>> registration.
>>>>>> + -
>>>>>> + name: device-name
>>>>>> + type: string
>>>>>> + doc: >-
>>>>>> + Device name chosen by the driver at registration.
>>>>>> + Can be a PCI BDF, UUID, or module name if unique.
>>>>>> + -
>>>>>> + name: node-name
>>>>>> + type: string
>>>>>> + doc: >-
>>>>>> + Node name chosen by the driver at registration.
>>>>>> + Can be an IP block name, or any name that identifies
>>>>>> the
>>>>>> + RAS node inside the device.
>>>>>> + -
>>>>>> + name: node-type
>>>>>> + type: u32
>>>>>> + doc: Type of this node, identifying its function.
>>>>>> + enum: node-type
>>>>>> + -
>>>>>> + name: error-counter-attrs
>>>>>> + attributes:
>>>>>> + -
>>>>>> + name: node-id
>>>>>> + type: u32
>>>>>> + doc: Node ID targeted by this error counter operation.
>>>>>> + -
>>>>>> + name: error-id
>>>>>> + type: u32
>>>>>> + doc: Unique identifier for a specific error counter
>>>>>> within an node.
>>>>>> + -
>>>>>> + name: error-name
>>>>>> + type: string
>>>>>> + doc: Name of the error.
>>>>>> + -
>>>>>> + name: error-value
>>>>>> + type: u32
>>>>>> + doc: Current value of the requested error counter.
>>>>>> +
>>>>>> +operations:
>>>>>> + list:
>>>>>> + -
>>>>>> + name: list-nodes
>>>>>> + doc: >-
>>>>>> + Retrieve the full list of currently registered DRM RAS
>>>>>> nodes.
>>>>>> + Each node includes its dynamically assigned ID, name,
>>>>>> and type.
>>>>>> + **Important:** User space must call this operation
>>>>>> first to obtain
>>>>>> + the node IDs. These IDs are required for all subsequent
>>>>>> + operations on nodes, such as querying error counters.
>>>>
>>>> I am curious about security implications of this design.
>>>
>>> hmm... very good point you are raising here.
>>>
>>> This current design relies entirely in the CAP_NET_ADMIN.
>>> No driver would have the flexibility to choose anything differently.
>>> Please notice that the flag admin-perm is hardcoded in this yaml file.
>>>
>>>> If the complete
>>>> list of RAS nodes is visible for any process on the system (and one
>>>> wants to
>>>> avoid requiring CAP_NET_ADMIN), there should be some way to enforce
>>>> permission checks when performing these operations if desired.
>>>
>>> Right now, there's no way that the driver would choose not avoid
>>> requiring
>>> CAP_NET_ADMIN...
>>>
>>> Only way would be the admin to give the cap_net_admin to the tool with:
>>>
>>> $ sudo setcap cap_net_admin+ep /bin/drm_ras_tool
>>>
>>> but not ideal and not granular anyway...
>>>
>>>>
>>>> For example, this might be implemented in the driver's definition of
>>>> callback functions like query_error_counter; some drivers may want
>>>> to ensure
>>>> that the process can in fact open the file descriptor corresponding
>>>> to the
>>>> queried device before serving a netlink request. Is it enough for a
>>>> driver
>>>> to simply return -EPERM in this case? Any driver that doesnt wish to
>>>> protect
>>>> its RAS nodes need not implement checks in their callbacks.
>>>
>>> Fair enough. If we want to give the option to the drivers, then we need:
>>>
>>> 1. to first remove all the admin-perm flags below and leave the
>>> driver to
>>> pick up their policy on when to return something or -EPERM.
>>> 2. Document this security responsibility and list a few possibilities.
>>> 3. In our Xe case here I believe the easiest option is to use
>>> something like:
>>>
>>> struct scm_creds *creds = NETLINK_CREDS(cb->skb);
>>> if (!gid_eq(creds->gid, GLOBAL_ROOT_GID))
>>> return -EPERM
>>
>> The driver currently does not have access to the callback or the
>> skbuffer. Sending these details as param to driver won't be right as
>> drm_ras needs to handle all the netlink buffers.
>>
>> How about using pre_doit & start calls? If driver has a pre callback ,
>> it's the responsibility of the driver to check permissions/any-pre
>> conditions, else the CAP_NET_ADMIN permission will be checked.
>>
>> @Zack / @Rodrigo thoughts?
>> @Zack Will this work for your usecase?
>>
>> yaml
>> + dump:
>> + pre: drm-ras-nl-pre-list-nodes
>>
>>
>> drm_ras.c :
>>
>> + if (node->pre_list_nodes)
>> + return node->pre_list_nodes(node);
>> +
>> + return check_permissions(cb->skb); //Checks creds
>>
>> Thanks
>> Riana
>>
>
> I agree that a pre_doit is probably the best solution for this.
>
> Not entirely sure what a driver specific implementation would look like
> yet, but I think that as long as the driver callback has a way to access
> the 'current' task_struct pointer corresponding to the user process then
> this approach seems like the best option from the netlink side.
>
> Since this is mostly a concern for our specific use case, perhaps we can
> incorporate this functionality in our change down the road when we
> expand the interface for telemetry?
Yeah using pre_doit we can allow driver to make decisions based on
the private data or driver specific permissions. However we will need to
check the CAP_NET_ADMIN when no driver callback is implemented in the
netlink layer as a default .
Thank you, you can incorporate the changes when you add telemetry nodes.
For now, I will retain the admin-perm in flags.
I will address the rest of the review comments and send out a new
revision shortly.
Thank you
Riana
>
> Let me know what you think.
>
> Zack
>
>>>
>>> or something like that?!
>>>
>>> perhaps drivers could implement some form of cookie or pre-
>>> authorization with
>>> ioctls or sysfs, and then store in the priv?
>>>
>>> Thoughts?
>>> Other options?
>>>
>>>>
>>>> I dont see any such permissions checks in your driver implementation
>>>> which
>>>> is understandable given that it may not be necessary for your use
>>>> cases.
>>>> However, this would be a concern for our driver if we were to adopt
>>>> this
>>>> interface.
>>>
>>> yeap, this case was entirely with admin-perm, so not needed at all...
>>> But I see your point and this is really not giving any flexibility to
>>> other drivers.
>>>
>
>>>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Zack
>>>>
>>
>
next prev parent reply other threads:[~2026-01-16 5:56 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20251205083934.3602030-6-riana.tauro@intel.com>
2025-12-05 8:39 ` [PATCH v3 1/4] drm/ras: Introduce the DRM RAS infrastructure over generic netlink Riana Tauro
2025-12-09 21:35 ` Rodrigo Vivi
2026-01-08 22:36 ` Zack McKevitt
2026-01-09 20:57 ` Rodrigo Vivi
2026-01-13 8:20 ` Riana Tauro
2026-01-15 23:39 ` Zack McKevitt
2026-01-16 5:56 ` Riana Tauro [this message]
2026-01-16 20:26 ` Rodrigo Vivi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3297a59b-a788-43aa-945d-e89592c9ba8d@intel.com \
--to=riana.tauro@intel.com \
--cc=Hawking.Zhang@amd.com \
--cc=airlied@gmail.com \
--cc=anshuman.gupta@intel.com \
--cc=aravind.iddamsetty@linux.intel.com \
--cc=ashwin.kumar.kulkarni@intel.com \
--cc=davem@davemloft.net \
--cc=dri-devel@lists.freedesktop.org \
--cc=edumazet@google.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jeff.hugo@oss.qualcomm.com \
--cc=joonas.lahtinen@linux.intel.com \
--cc=joshua.santosh.ranjan@intel.com \
--cc=kuba@kernel.org \
--cc=lijo.lazar@amd.com \
--cc=lukas@wunner.de \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pratik.bari@intel.com \
--cc=rodrigo.vivi@intel.com \
--cc=shubham.kumar@intel.com \
--cc=simona.vetter@ffwll.ch \
--cc=zachary.mckevitt@oss.qualcomm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox