public inbox for intel-xe@lists.freedesktop.org
* [PATCH v2 0/2] drm/xe: Add support for GPU health indicator
From: Soham Purkait @ 2026-04-23 17:39 UTC
  To: intel-xe, riana.tauro, anshuman.gupta, aravind.iddamsetty,
	badal.nilawar, raag.jadav, ravi.kishore.koppuravuri,
	mallesh.koujalagi, andi.shyti, rodrigo.vivi
  Cc: soham.purkait, anoop.c.vijay

GPU deployments commonly rely on reactive health monitoring. The
Xe GPU health indicator is intended to fit into such monitoring
flows, where management tools can use it to fetch and update the
GPU health status.

This series adds Xe GPU health indicator support as a RAS feature.
It introduces the health command IDs and request/response structures
used by the System Controller mailbox, and integrates the feature
into Xe through the gpu_health sysfs interface.

The sysfs file, gpu_health, is created at the device level and
provides a simple interface for observing and updating the reported
GPU health state. It is exposed as read-write on PF/native functions
and read-only on VFs.

Example usage:

$ cat /sys/.../device/gpu_health
ok

$ echo critical > /sys/.../device/gpu_health

$ cat /sys/.../device/gpu_health
critical
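
For illustration, a management tool could wrap the same read and write
steps roughly as follows. This is only a sketch and not part of the
series: the helper names are hypothetical, the caller must supply the
full sysfs path (it is elided above), and only the "ok" and "critical"
values shown above are assumed to exist. Treating a missing write
permission bit as "read-only, as on a VF" is likewise an assumption.

```python
import os
import stat
from pathlib import Path


def read_gpu_health(path):
    """Return the reported health state, e.g. "ok", without the newline."""
    return Path(path).read_text().strip()


def can_update(path):
    """Check the file mode for any write bit.

    Assumption: a read-only attribute (as described for VFs) shows up
    as a mode with no write bits set.
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def write_gpu_health(path, state):
    """Try to write a new state; return False if the write is rejected."""
    try:
        with open(path, "w") as f:
            f.write(state)
        return True
    except OSError:
        # Covers a missing file, missing permissions and I/O errors.
        return False
```

A caller would typically check can_update() first and fall back to
read-only polling with read_gpu_health() when it returns False.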

Soham Purkait (2):
  drm/xe/xe_ras: Add types and commands for RAS GPU health indicator
  drm/xe/xe_ras: Add RAS support for GPU health indicator

 .../ABI/testing/sysfs-driver-intel-xe-ras     |  33 +++
 drivers/gpu/drm/xe/Makefile                   |   1 +
 drivers/gpu/drm/xe/xe_device.c                |   3 +
 drivers/gpu/drm/xe/xe_ras.c                   | 202 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_ras.h                   |  13 ++
 drivers/gpu/drm/xe/xe_ras_types.h             |  83 +++++++
 drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h |  15 ++
 7 files changed, 350 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-intel-xe-ras
 create mode 100644 drivers/gpu/drm/xe/xe_ras.c
 create mode 100644 drivers/gpu/drm/xe/xe_ras.h
 create mode 100644 drivers/gpu/drm/xe/xe_ras_types.h

-- 
2.34.1



Thread overview: 8+ messages
2026-04-23 17:39 [PATCH v2 0/2] drm/xe: Add support for GPU health indicator Soham Purkait
2026-04-23 17:39 ` [PATCH v2 1/2] drm/xe/xe_ras: Add types and commands for RAS " Soham Purkait
2026-04-23 17:39 ` [PATCH v2 2/2] drm/xe/xe_ras: Add RAS support for " Soham Purkait
2026-04-27 22:16   ` Rodrigo Vivi
2026-04-23 17:52 ` ✗ CI.checkpatch: warning for drm/xe: Add " Patchwork
2026-04-23 17:54 ` ✓ CI.KUnit: success " Patchwork
2026-04-23 19:02 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-24  2:52 ` ✓ Xe.CI.FULL: " Patchwork
