public inbox for netdev@vger.kernel.org
* [PATCH 0/4] Add support for clear counter and error event in DRM RAS
@ 2026-03-11 10:29 Riana Tauro
  2026-03-11 10:29 ` [PATCH 1/4] drm/drm_ras: Add clear-error-counter netlink command to drm_ras Riana Tauro
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Riana Tauro @ 2026-03-11 10:29 UTC (permalink / raw)
  To: intel-xe, dri-devel, netdev
  Cc: aravind.iddamsetty, anshuman.gupta, rodrigo.vivi, joonas.lahtinen,
	simona.vetter, airlied, pratik.bari, joshua.santosh.ranjan,
	ashwin.kumar.kulkarni, shubham.kumar, ravi.kishore.koppuravuri,
	raag.jadav, anvesh.bakwad, maarten.lankhorst, Riana Tauro

Clear Error Counter: Add a clear-error-counter command to DRM RAS that
clears a specific error counter of a node. Implement the callback in the
XE driver to demonstrate usage.

Usage with both get-error-counter and clear-error-counter:

$ sudo ynl --family drm_ras  --dump get-error-counter --json '{"node-id":1}'
[{'error-id': 1, 'error-name': 'core-compute', 'error-value': 0},
 {'error-id': 2, 'error-name': 'soc-internal', 'error-value': 3}]

$ sudo ynl --family drm_ras  --do clear-error-counter --json \
'{"node-id":1, "error-id":2}'
None

$ sudo ynl --family drm_ras  --dump get-error-counter --json '{"node-id":1}'
[{'error-id': 1, 'error-name': 'core-compute', 'error-value': 0},
 {'error-id': 2, 'error-name': 'soc-internal', 'error-value': 0}]
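For scripted checks, the two dumps above can be compared programmatically.
The following is only a sketch, assuming the ynl output is the
Python-literal list shown above; parse_counters() and changed_counters()
are hypothetical helper names and not part of ynl or of this series.

```python
import ast


def parse_counters(dump_text):
    """Parse a get-error-counter dump (Python-literal list, as printed
    above) into a {error-id: error-value} mapping."""
    entries = ast.literal_eval(dump_text)
    return {e["error-id"]: e["error-value"] for e in entries}


def changed_counters(before, after):
    """Return {error-id: (old, new)} for counters whose value differs."""
    return {
        eid: (before[eid], after[eid])
        for eid in before
        if eid in after and before[eid] != after[eid]
    }


# Sample text copied from the dumps in this cover letter.
BEFORE = """[{'error-id': 1, 'error-name': 'core-compute', 'error-value': 0},
 {'error-id': 2, 'error-name': 'soc-internal', 'error-value': 3}]"""
AFTER = """[{'error-id': 1, 'error-name': 'core-compute', 'error-value': 0},
 {'error-id': 2, 'error-name': 'soc-internal', 'error-value': 0}]"""

before = parse_counters(BEFORE)
after = parse_counters(AFTER)
print(changed_counters(before, after))  # {2: (3, 0)}
```

Only error-id 2 is reported as changed, which matches the intent of
clearing a single counter while leaving the others untouched.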

Error Event Support: Introduce `error-event` support in DRM RAS to notify
userspace whenever an error occurs.

Each notification includes the node-id and error-id to identify
the source and type of the error. To receive notifications,
userspace must subscribe to the 'error-notify' multicast group:

$ sudo ynl --family drm_ras --subscribe error-notify
{'msg': {'error-id': 2, 'node-id': 1}, 'name': 'error-event'}
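A listener would typically react to such an event by re-reading the
counters of the affected node. The sketch below assumes the event is
printed as the Python-literal dict shown above; followup_dump_argv() is a
hypothetical helper, and it only builds the argument list for the
get-error-counter dump already used in this cover letter without running
it.

```python
import ast
import json


def followup_dump_argv(event_text):
    """Turn one printed error-event line into the argv of a follow-up
    get-error-counter dump for the node that raised the event.

    Returns None for anything that is not an error-event."""
    event = ast.literal_eval(event_text)
    if event.get("name") != "error-event":
        return None
    node_id = event["msg"]["node-id"]
    query = json.dumps({"node-id": node_id})
    # Same flags as the get-error-counter example in this cover letter.
    return ["ynl", "--family", "drm_ras",
            "--dump", "get-error-counter", "--json", query]


# Sample line copied from the subscribe output above.
line = "{'msg': {'error-id': 2, 'node-id': 1}, 'name': 'error-event'}"
argv = followup_dump_argv(line)
print(argv)
```

The resulting list can be passed to subprocess.run() by a caller that
has ynl in its PATH and sufficient privileges.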

Riana Tauro (4):
  drm/drm_ras: Add clear-error-counter netlink command to drm_ras
  drm/xe/xe_drm_ras: Add support for clear-error-counter in XE DRM RAS
  drm/drm_ras: Add DRM RAS netlink error event notification
  drm/xe/xe_drm_ras: Add error-event support in XE DRM RAS

 Documentation/gpu/drm-ras.rst            | 17 +++++
 Documentation/netlink/specs/drm_ras.yaml | 27 ++++++-
 drivers/gpu/drm/drm_ras.c                | 91 +++++++++++++++++++++++-
 drivers/gpu/drm/drm_ras_nl.c             | 19 +++++
 drivers/gpu/drm/drm_ras_nl.h             |  6 ++
 drivers/gpu/drm/xe/xe_drm_ras.c          | 52 +++++++++++++-
 drivers/gpu/drm/xe/xe_drm_ras.h          |  7 ++
 drivers/gpu/drm/xe/xe_hw_error.c         |  5 ++
 include/drm/drm_ras.h                    | 13 ++++
 include/uapi/drm/drm_ras.h               |  4 ++
 10 files changed, 237 insertions(+), 4 deletions(-)

-- 
2.47.1



Thread overview: 9+ messages
2026-03-11 10:29 [PATCH 0/4] Add support for clear counter and error event in DRM RAS Riana Tauro
2026-03-11 10:29 ` [PATCH 1/4] drm/drm_ras: Add clear-error-counter netlink command to drm_ras Riana Tauro
2026-03-12  0:29   ` Jakub Kicinski
2026-03-25 12:40   ` Raag Jadav
2026-03-11 10:29 ` [PATCH 2/4] drm/xe/xe_drm_ras: Add support for clear-error-counter in XE DRM RAS Riana Tauro
2026-03-12 10:17   ` Raag Jadav
2026-03-11 10:29 ` [PATCH 3/4] drm/drm_ras: Add DRM RAS netlink error event notification Riana Tauro
2026-03-25 13:31   ` Raag Jadav
2026-03-11 10:29 ` [PATCH 4/4] drm/xe/xe_drm_ras: Add error-event support in XE DRM RAS Riana Tauro
