public inbox for netdev@vger.kernel.org
From: "Tauro, Riana" <riana.tauro@intel.com>
To: Raag Jadav <raag.jadav@intel.com>,
	<aravind.iddamsetty@linux.intel.com>,
	Jakub Kicinski <kuba@kernel.org>, <rodrigo.vivi@intel.com>
Cc: <dri-devel@lists.freedesktop.org>, <netdev@vger.kernel.org>,
	<anshuman.gupta@intel.com>, <joonas.lahtinen@linux.intel.com>,
	<simona.vetter@ffwll.ch>, <airlied@gmail.com>,
	<pratik.bari@intel.com>, <joshua.santosh.ranjan@intel.com>,
	<ashwin.kumar.kulkarni@intel.com>, <shubham.kumar@intel.com>,
	<ravi.kishore.koppuravuri@intel.com>, <anvesh.bakwad@intel.com>,
	<maarten.lankhorst@linux.intel.com>,
	Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>,
	Lijo Lazar <lijo.lazar@amd.com>,
	"Hawking Zhang" <Hawking.Zhang@amd.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Paolo Abeni" <pabeni@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 3/4] drm/drm_ras: Add DRM RAS netlink error event notification
Date: Thu, 23 Apr 2026 11:22:45 +0530
Message-ID: <65c6839f-a92d-4122-b347-a1250dde961d@intel.com>
In-Reply-To: <69c061b4-eb40-4e2d-aeaa-5dfa47b2db8c@intel.com>

Hi Jakub

We have some questions regarding events in netlink.

1) According to the netlink spec, "Events are considered less idiomatic for
netlink and notifications should be preferred."
We currently don't have a get operation whose response matches this
message, so we cannot use notify. Would using an event be acceptable?

2) Is there a way to check if there are subscribers to a group before
creating the event message?
    Currently the subscriber check happens in netlink_broadcast_filtered,
    but a reviewer suggested we could optimize by skipping message creation
    if there are no subscribers.

Thanks
Riana

On 4/10/2026 11:50 AM, Tauro, Riana wrote:
>
> On 4/9/2026 11:05 AM, Raag Jadav wrote:
>> On Wed, Apr 08, 2026 at 07:59:33PM +0530, Tauro, Riana wrote:
>>> On 3/25/2026 7:01 PM, Raag Jadav wrote:
>>>> On Wed, Mar 11, 2026 at 03:59:17PM +0530, Riana Tauro wrote:
>> ...
>>
>>>>> +Example: Listen to error events
>>>>> +
>>>>> +.. code-block:: bash
>>>>> +
>>>>> +    sudo ynl --family drm_ras --subscribe error-notify
>>>>> +    {'msg': {'error-id': 1, 'node-id': 1}, 'name': 'error-event'}
>>>> Can we also have error-name and node-name? I'd be pulling my hair out
>>>> if I need to remember all the ids.
>>> Yeah, makes sense. We can add node_name and error_name.
>>> Adding device_name to the event would also be useful.
>>>
>>> @Rodrigo/@aravind thoughts?
>>>
> I tried adding all parameters, but the event response seems overloaded.
> I think node-name and error-name are not necessary, since this will mostly
> be used by tools and scripts that get the nodes and error-ids prior to
> subscribing.
>
> Let me know your thoughts
>
> $ sudo ./tools/net/ynl/pyynl/cli.py --family drm_ras --subscribe error-notify
> {'msg': {'device-name': '0000:03:00.0', 'error-id': 1, 'error-name': 'core-compute', 'node-id': 3, 'node-name': 'uncorrectable-errors'}, 'name': 'error-event'}
> {'msg': {'device-name': '0000:04:00.0', 'error-id': 1, 'error-name': 'core-compute', 'node-id': 1, 'node-name': 'uncorrectable-errors'}, 'name': 'error-event'}
>
> Thanks
> Riana
>
>>>
>>>> On that note, I think it'll be good to have them as part of request
>>>> attributes as an alternative to ids (also for existing commands), but
>>>> that can be done as a follow-up.
>>>>
>>> We cannot use names as an alternative because it won't work for
>>> multiple cards.
>>> Example in xe: suppose there are 2 cards and each has 2 nodes. We
>>> cannot query using node_name+error_name, since the same names exist on
>>> both cards.
>>> Also, most netlink implementations use ids as unique identifiers.
>>>
>>> $ sudo ./cli.py --family drm_ras --dump list-nodes
>>> [{'device-name': 'bdf_1', 'node-id': 0, 'node-name': 'correctable-errors', 'node-type': 'error-counter'},
>>>  {'device-name': 'bdf_1', 'node-id': 1, 'node-name': 'uncorrectable-errors', 'node-type': 'error-counter'},
>>>  {'device-name': 'bdf_2', 'node-id': 2, 'node-name': 'correctable-errors', 'node-type': 'error-counter'},
>>>  {'device-name': 'bdf_2', 'node-id': 3, 'node-name': 'uncorrectable-errors', 'node-type': 'error-counter'}]
>> This means they don't persist, and the user needs to figure out all the
>> ids before anything can happen. In the device node world we have
>> /dev/dri/by-path/<bdf>, which makes it much easier.
>>
>> Also, I'm not well informed about the history, and it's still unclear to
>> me what problem netlink solves here that cannot be solved by anything
>> else. But we're too late for that discussion, and again, not my call.
>>
>>>> Also, what if I have multiple devices with multiple nodes? Do they
>>>> need separate subscriptions?
>>>>
>>> No, we subscribe only to the group, not the nodes. In this case the
>>> group is 'error-notify'.
>>>
>>> $ sudo ./cli.py --family drm_ras --subscribe error-notify
>>> {'msg': {'error-id': 1, 'node-id': 1}, 'name': 'error-event'}
>>> {'msg': {'error-id': 1, 'node-id': 3}, 'name': 'error-event'}
>> Hm, perhaps I need to spend some time wrapping my head around the new
>> concept. Let's catch up sometime this week.
>>
>> Raag
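
For scripts that consume these events, the id-to-name lookup discussed
above can be done on the client side. The following is a minimal Python
sketch, assuming the tool has already captured a list-nodes dump and the
error events as Python dicts in the shape printed by the ynl CLI in this
thread. The helper names build_node_index and describe_event are
hypothetical and are not part of ynl or the drm_ras family; the sample data
is copied from the outputs quoted above.

```python
# Sketch only: resolve the ids carried by an error event into names using a
# previously captured list-nodes dump. Helper names are hypothetical.

# Sample data in the shape shown by "--dump list-nodes" in this thread.
NODES = [
    {'device-name': 'bdf_1', 'node-id': 0, 'node-name': 'correctable-errors',
     'node-type': 'error-counter'},
    {'device-name': 'bdf_1', 'node-id': 1, 'node-name': 'uncorrectable-errors',
     'node-type': 'error-counter'},
    {'device-name': 'bdf_2', 'node-id': 2, 'node-name': 'correctable-errors',
     'node-type': 'error-counter'},
    {'device-name': 'bdf_2', 'node-id': 3, 'node-name': 'uncorrectable-errors',
     'node-type': 'error-counter'},
]

# Sample events in the shape shown by "--subscribe error-notify".
EVENTS = [
    {'msg': {'error-id': 1, 'node-id': 1}, 'name': 'error-event'},
    {'msg': {'error-id': 1, 'node-id': 3}, 'name': 'error-event'},
]


def build_node_index(nodes):
    """Map node-id -> (device-name, node-name) from a list-nodes dump."""
    return {n['node-id']: (n['device-name'], n['node-name']) for n in nodes}


def describe_event(event, node_index):
    """Return a one-line description of an event, with ids resolved."""
    msg = event['msg']
    # Fall back to placeholders if the node-id was not in the dump.
    device, node = node_index.get(
        msg['node-id'], ('unknown-device', 'unknown-node'))
    return f"{device} {node} error-id={msg['error-id']}"


if __name__ == "__main__":
    index = build_node_index(NODES)
    for ev in EVENTS:
        print(describe_event(ev, index))
```

Because node ids are unique across devices in the dump above, a single
dictionary keyed by node-id is enough to recover both the device and the
node name. The sketch does not resolve error-id to an error name, since no
error list output appears in this thread.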

Thread overview: 13+ messages
2026-03-11 10:29 [PATCH 0/4] Add support for clear counter and error event in DRM RAS Riana Tauro
2026-03-11 10:29 ` [PATCH 1/4] drm/drm_ras: Add clear-error-counter netlink command to drm_ras Riana Tauro
2026-03-12  0:29   ` Jakub Kicinski
2026-03-25 12:40   ` Raag Jadav
2026-03-11 10:29 ` [PATCH 2/4] drm/xe/xe_drm_ras: Add support for clear-error-counter in XE DRM RAS Riana Tauro
2026-03-12 10:17   ` Raag Jadav
2026-03-11 10:29 ` [PATCH 3/4] drm/drm_ras: Add DRM RAS netlink error event notification Riana Tauro
2026-03-25 13:31   ` Raag Jadav
2026-04-08 14:29     ` Tauro, Riana
2026-04-09  5:35       ` Raag Jadav
2026-04-10  6:20         ` Tauro, Riana
2026-04-23  5:52           ` Tauro, Riana [this message]
2026-03-11 10:29 ` [PATCH 4/4] drm/xe/xe_drm_ras: Add error-event support in XE DRM RAS Riana Tauro
