Intel-XE Archive on lore.kernel.org
From: Riana Tauro <riana.tauro@intel.com>
To: Raag Jadav <raag.jadav@intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>,
	<aravind.iddamsetty@linux.intel.com>, <anshuman.gupta@intel.com>,
	<joonas.lahtinen@linux.intel.com>, <simona.vetter@ffwll.ch>,
	<airlied@gmail.com>, <pratik.bari@intel.com>,
	<joshua.santosh.ranjan@intel.com>,
	<ashwin.kumar.kulkarni@intel.com>, <shubham.kumar@intel.com>
Subject: Re: [PATCH v3 2/4] drm/xe/xe_drm_ras: Add support for drm ras
Date: Mon, 12 Jan 2026 11:43:16 +0530
Message-ID: <fb4fed2d-27e6-4533-a37e-8a8d24f20aae@intel.com>
In-Reply-To: <aWEljhdVF10_70Cz@black.igk.intel.com>



On 1/9/2026 9:28 PM, Raag Jadav wrote:
> On Fri, Jan 09, 2026 at 09:13:31AM -0500, Rodrigo Vivi wrote:
>> On Fri, Jan 09, 2026 at 01:38:44PM +0530, Riana Tauro wrote:
>>> Hi Raag
>>>
>>> Thank you for the review
>>>
>>> On 12/9/2025 1:52 PM, Raag Jadav wrote:
>>>> On Fri, Dec 05, 2025 at 02:09:34PM +0530, Riana Tauro wrote:
>>>>> Allocate correctable, nonfatal and fatal nodes per xe device.
>>>>> Each node contains error classes, counters and respective
>>>>> query counter functions.
>>>>>
>>>>> Add basic functionality to create and register drm nodes.
>>>>> Below operations can be performed using Generic netlink DRM RAS interface
> 
> ...
> 
>>>>> Query Error counter:
>>>>>
>>>>> $ sudo ynl --family drm_ras --do query-error-counter  --json '{"node-id":1, "error-id":1}'
>>>>> {'error-id': 1, 'error-name': 'Core Compute Error', 'error-value': 0}
>>>>
>>>> One more (sorry): So this means graphics will be a different id? Or do they
>>>> overlap? How does it work?
>>>>
>>>
>>> Did not get this question.
> 
> This gives the impression that it's specific to the compute engine, so I was
> hoping for something more generic like "execution unit" or simply "core",
> but I couldn't come up with anything better than this, so up to you.

Perhaps just GT. Let me check

> 
>>>> Also,
>>>>
>>>> [*] I'm not well informed about the history here, but the 'error' term
>>>> seems slapped onto almost everything. We already know it's RAS, so perhaps
>>>> we add it only where it makes sense and try to simplify some of the naming?
> 
> ...
> 
>>>>> +/**
>>>>> + * enum drm_xe_ras_error_class - Supported drm ras error classes.
>>>>> + */
>>>>> +enum drm_xe_ras_error_class {
>>>>> +	/** @DRM_XE_RAS_ERROR_CORE_COMPUTE: GT and Media Error */
>>>>> +	DRM_XE_RAS_ERROR_CORE_COMPUTE = 1,
>>>>> +	/** @DRM_XE_RAS_ERROR_SOC_INTERNAL: SOC Error */
>>>>> +	DRM_XE_RAS_ERROR_SOC_INTERNAL,
>>>>> +	/** @DRM_XE_RAS_ERROR_CLASS_MAX: Max Error */
>>>>> +	DRM_XE_RAS_ERROR_CLASS_MAX,	/* non-ABI */
>>>>> +};
>>>>
>>>> Also, all of the enums share the same DRM_XE_RAS_ERROR_* prefix, so let's try
>>>> to have distinguishable naming. Perhaps [*] would be useful here as well ;)
>>>
>>> DRM_XE_RAS_ERROR_SEVERITY_* will cause longer names. Any suggestions?
> 
> Already mentioned above[*], the key is to not overuse 'error' ;)
> 
> DRM_XE_RAS_SEVERITY_*
> DRM_XE_RAS_COMPONENT_*

Interest has been expressed in adding telemetry nodes as well.

https://patchwork.freedesktop.org/patch/666138/?series=118435&rev=5

I have kept the prefix (DRM_XE_RAS_ERROR) consistent with the first
patch (type - ERROR_COUNTER) for alignment.

From my perspective, retaining the ERROR prefix would be beneficial for
differentiation if other node types are added.

Can you please have a look at the link and let me know if you still
think the same?

For differentiation, I will add SEVERITY and CLASS/COMPONENT.
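
As a rough sketch of how that split could look with the ERROR prefix
retained: every identifier below is hypothetical and only illustrates the
SEVERITY/COMPONENT naming. The severity values and the name lookup helper
are assumptions, not taken from the posted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: a naming sketch only, not the posted uAPI. */
enum drm_xe_ras_error_severity {
	DRM_XE_RAS_ERROR_SEVERITY_CORRECTABLE = 0,
	DRM_XE_RAS_ERROR_SEVERITY_NONFATAL,
	DRM_XE_RAS_ERROR_SEVERITY_FATAL,
	DRM_XE_RAS_ERROR_SEVERITY_MAX,	/* non-ABI */
};

/* Same members as the quoted error-class enum, with COMPONENT added. */
enum drm_xe_ras_error_component {
	DRM_XE_RAS_ERROR_COMPONENT_CORE_COMPUTE = 1,
	DRM_XE_RAS_ERROR_COMPONENT_SOC_INTERNAL,
	DRM_XE_RAS_ERROR_COMPONENT_MAX,	/* non-ABI */
};

/* Illustrative table mapping each severity to a printable name. */
static const char * const ras_severity_names[] = {
	[DRM_XE_RAS_ERROR_SEVERITY_CORRECTABLE] = "correctable",
	[DRM_XE_RAS_ERROR_SEVERITY_NONFATAL]    = "nonfatal",
	[DRM_XE_RAS_ERROR_SEVERITY_FATAL]       = "fatal",
};

/* Compile-time check that the table covers every severity. */
_Static_assert(sizeof(ras_severity_names) / sizeof(ras_severity_names[0]) ==
	       DRM_XE_RAS_ERROR_SEVERITY_MAX,
	       "severity name table out of sync with enum");

/* Returns the name for a severity, or NULL when out of range. */
const char *ras_severity_name(enum drm_xe_ras_error_severity sev)
{
	if ((unsigned int)sev >= DRM_XE_RAS_ERROR_SEVERITY_MAX)
		return NULL;
	return ras_severity_names[sev];
}
```

With names of this length the designated initializers still fit on one
line each, which is the line-limit concern raised in the review.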

Thanks
Riana

> 
> and so on ...
> 
>> Try this full version first and see what the outcome looks like...
>> If we are still respecting the line limits without ugly cuts, then let's go with it.
>> Otherwise try something shorter, ERR_SEV_ ... or something like that...
> 
> ... which can be further shortened with this idea.
> 
> Side note: I'm already using these on my local branch.
> 
> Raag


