* [PATCH v4 0/1] drm/xe: Add support for GPU health indicator
@ 2026-06-10  9:33 Soham Purkait
  2026-06-10  9:33 ` [PATCH v4 1/1] drm/xe/xe_ras: Add RAS " Soham Purkait
From: Soham Purkait @ 2026-06-10  9:33 UTC (permalink / raw)
  To: intel-xe, riana.tauro, anshuman.gupta, aravind.iddamsetty,
	badal.nilawar, raag.jadav, ravi.kishore.koppuravuri,
	mallesh.koujalagi, andi.shyti, rodrigo.vivi
  Cc: soham.purkait, anoop.c.vijay

GPU management commonly relies on reactive health monitoring. The Xe
GPU health indicator is intended to fit into such monitoring flows,
where management tools can use it to fetch and update the GPU health
status.

This patch adds Xe GPU health indicator support as a RAS feature.
It introduces the health command IDs and request/response structures
used by the System Controller mailbox, and integrates the feature
into Xe through the gpu_health sysfs interface.

The sysfs file, gpu_health, is created at the device level and
provides a simple interface for observing and updating the reported
GPU health state. It is read-only for non-admin users; write access
is restricted to admin users. It behaves as follows:

  $ cat /sys/.../device/gpu_health
  ok

  $ echo critical > /sys/.../device/gpu_health

  $ cat /sys/.../device/gpu_health
  critical
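
As an illustration of how a management tool might consume this file, here
is a minimal sketch rather than part of the patch. The function name
gpu_health_status and the exit-code mapping are hypothetical. The path is
passed in by the caller, since the full sysfs path is elided above. Only
the "ok" and "critical" values shown above are recognised; any other
content is reported as unknown.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): map the contents of a
# gpu_health file to an exit status.
#   0 = ok, 1 = unrecognised value, 2 = critical, 3 = file not readable
# The caller supplies the path; this sketch does not assume where the
# attribute lives in sysfs.
gpu_health_status() {
    health_file=$1

    # Fail early if the attribute is missing or unreadable.
    [ -r "$health_file" ] || return 3

    # Command substitution strips the trailing newline from the value.
    state=$(cat "$health_file") || return 3

    case $state in
        ok)       return 0 ;;
        critical) return 2 ;;
        *)        return 1 ;;   # a value not shown in the cover letter
    esac
}
```

A caller could run `gpu_health_status "$path"` and branch on `$?`, for
example to raise an alert on status 2.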

Soham Purkait (1):
  drm/xe/xe_ras: Add RAS GPU health indicator

 .../ABI/testing/sysfs-driver-intel-xe-ras     |  30 +++
 drivers/gpu/drm/xe/xe_ras.c                   | 177 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_ras.h                   |   1 +
 drivers/gpu/drm/xe/xe_ras_types.h             |  60 ++++++
 drivers/gpu/drm/xe/xe_sysctrl_mailbox.c       |  28 +++
 drivers/gpu/drm/xe/xe_sysctrl_mailbox.h       |   3 +
 drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h |   4 +
 7 files changed, 303 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-intel-xe-ras

-- 
2.43.0


Thread overview: 10+ messages
2026-06-10  9:33 [PATCH v4 0/1] drm/xe: Add support for GPU health indicator Soham Purkait
2026-06-10  9:33 ` [PATCH v4 1/1] drm/xe/xe_ras: Add RAS " Soham Purkait
2026-06-10 10:11   ` Gupta, Anshuman
2026-06-15 12:25   ` Andi Shyti
2026-06-17  8:35   ` Nilawar, Badal
2026-06-17 18:48     ` Purkait, Soham
2026-06-10  9:39 ` ✗ CI.checkpatch: warning for drm/xe: Add support for " Patchwork
2026-06-10  9:41 ` ✓ CI.KUnit: success " Patchwork
2026-06-10 10:25 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-10 16:14 ` ✓ Xe.CI.FULL: " Patchwork
