From: Andi Shyti <andi.shyti@kernel.org>
To: Soham Purkait <soham.purkait@intel.com>
Cc: intel-xe@lists.freedesktop.org, riana.tauro@intel.com,
	anshuman.gupta@intel.com, aravind.iddamsetty@linux.intel.com,
	badal.nilawar@intel.com, raag.jadav@intel.com,
	ravi.kishore.koppuravuri@intel.com, mallesh.koujalagi@intel.com,
	andi.shyti@intel.com, rodrigo.vivi@intel.com,
	anoop.c.vijay@intel.com
Subject: Re: [PATCH v2 2/2] drm/xe/xe_ras: Add RAS support for GPU health indicator
Date: Tue, 28 Apr 2026 15:47:37 +0200
Message-ID: <afC1NX-PvO6qqBU6@zenone.zhora.eu>
In-Reply-To: <20260423173925.699486-3-soham.purkait@intel.com>

Hi Soham,

...

> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-ras b/Documentation/ABI/testing/sysfs-driver-intel-xe-ras
> new file mode 100644
> index 000000000000..085cb79a6e00
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-ras

Thanks for adding the documentation!

> @@ -0,0 +1,33 @@
> +What:		/sys/bus/pci/drivers/.../gpu_health
> +Date:		April 2026
> +KernelVersion:	7.0
> +Contact:	intel-xe@lists.freedesktop.org
> +Description:
> +		This file exposes the current GPU health state and, for Physical
> +		Functions (PFs), allows GPU health state to be updated.
> +
> +		This sysfs file is only accessible to administrative users and is
> +		present only on Intel Xe platforms that support the GPU health
> +		indicator interface for RAS.
> +
> +		For Physical Functions (PFs), the file is read-write, while for
> +		Virtual Functions (VFs), it is read-only and does not support GPU
> +		health state updates.
> +
> +		Read return a single line containing one of the valid values for

/Read/Reads/ or /Read return/A read returns/

> +		the current device health state. Only for PFs, writing one of the
> +		valid values updates the current device health state.
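For whoever consumes this from userspace, here is a minimal sketch of how the single-line value could be parsed. The three strings are the ones in gpu_health_states[] from this patch; the enum, the table name and parse_gpu_health() are hypothetical names made up for illustration, and the sysfs path is intentionally left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace mirror of the states in gpu_health_states[]. */
enum gpu_health {
	GPU_HEALTH_OK,
	GPU_HEALTH_WARNING,
	GPU_HEALTH_CRITICAL,
	GPU_HEALTH_COUNT
};

static const char * const gpu_health_names[GPU_HEALTH_COUNT] = {
	[GPU_HEALTH_OK]		= "ok",
	[GPU_HEALTH_WARNING]	= "warning",
	[GPU_HEALTH_CRITICAL]	= "critical",
};

/*
 * Parse one line as read from the gpu_health file. A single trailing
 * newline is ignored. Returns the matching index, or -1 if the value
 * is not one of the known states.
 */
static int parse_gpu_health(const char *line)
{
	size_t len;
	int i;

	assert(line != NULL);

	len = strlen(line);
	if (len > 0 && line[len - 1] == '\n')
		len--;

	for (i = 0; i < GPU_HEALTH_COUNT; i++) {
		if (strlen(gpu_health_names[i]) == len &&
		    strncmp(gpu_health_names[i], line, len) == 0)
			return i;
	}

	return -1;
}
```

A caller would read the file into a buffer and pass it to parse_gpu_health(), treating -1 as an unexpected value.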

...

> +static const char * const gpu_health_states[] = {
> +	[XE_RAS_HEALTH_STATUS_OK]		= "ok",
> +	[XE_RAS_HEALTH_STATUS_WARNING]		= "warning",
> +	[XE_RAS_HEALTH_STATUS_CRITICAL]		= "critical"
> +};

Thanks for making it one word, it makes much more sense to me.

...

> +static ssize_t gpu_health_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> +	struct xe_device *xe = kdev_to_xe_device(dev);
> +	struct xe_sysctrl_mailbox_command command = {0};
> +	struct xe_ras_health_get_response response = {0};
> +	struct xe_ras_health_get_input request = {0};
> +	enum xe_sysctrl_mailbox_command_id cmd = XE_SYSCTRL_CMD_GET_HEALTH;

do we need 'cmd' here?
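In the quoted function 'cmd' only feeds the prepare_sysctrl_command() call, so the constant could be passed directly. A rough standalone sketch of what I mean follows. The types and the prepare_sysctrl_command() body below are mock stand-ins: the names follow the patch and the argument order follows the call site, but the enum values, the struct layout and get_health_command_id() are invented for illustration and are not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins: names follow the patch, values and layout are invented. */
enum xe_sysctrl_mailbox_command_id {
	XE_SYSCTRL_CMD_GET_HEALTH = 1,
	XE_SYSCTRL_CMD_SET_HEALTH = 2,
};

struct xe_sysctrl_mailbox_command {
	enum xe_sysctrl_mailbox_command_id id;
	const void *in;
	size_t in_len;
	void *out;
	size_t out_len;
};

/* Mock helper using the argument order seen at the call site in the patch. */
static void prepare_sysctrl_command(struct xe_sysctrl_mailbox_command *command,
				    enum xe_sysctrl_mailbox_command_id id,
				    const void *in, size_t in_len,
				    void *out, size_t out_len)
{
	assert(command != NULL);

	command->id = id;
	command->in = in;
	command->in_len = in_len;
	command->out = out;
	command->out_len = out_len;
}

/* Builds a GET_HEALTH command without a local 'cmd' and returns the id set. */
static enum xe_sysctrl_mailbox_command_id get_health_command_id(void)
{
	struct xe_sysctrl_mailbox_command command = {0};
	unsigned char request = 0, response = 0;	/* dummy payloads */

	prepare_sysctrl_command(&command, XE_SYSCTRL_CMD_GET_HEALTH,
				&request, sizeof(request),
				&response, sizeof(response));

	return command.id;
}
```

This drops one declaration per function and keeps the command id visible right at the call.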

> +	enum xe_ras_health_status health;
> +	int ret;
> +	size_t rlen = 0;
> +
> +	prepare_sysctrl_command(&command, cmd, &request,
> +				sizeof(request), &response, sizeof(response));
> +	guard(xe_pm_runtime)(xe);
> +	ret = xe_sysctrl_send_command(&xe->sc, &command, &rlen);
> +	if (ret)
> +		return ret;
> +
> +	if (rlen != sizeof(response)) {
> +		xe_err(xe,
> +		       "[RAS][GET_HEALTH]: invalid Sysctrl response length %zu (expected %zu)\n",
> +		       rlen, sizeof(response));
> +		return -EPROTO;
> +	}
> +	if (response.current_health > XE_RAS_HEALTH_STATUS_CRITICAL) {
> +		xe_err(xe, "[RAS][GET_HEALTH]: invalid health state %u from Sysctrl\n",
> +		       response.current_health);
> +		return -EPROTO;
> +	}
> +
> +	health = (enum xe_ras_health_status)response.current_health;
> +
> +	xe_dbg(xe, "[RAS][GET_HEALTH]: current GPU health state = %d (%s)\n",
> +	       health, gpu_health_states[health]);
> +
> +	return sysfs_emit(buf, "%s\n", gpu_health_states[health]);
> +}
> +
> +static ssize_t gpu_health_store(struct device *dev, struct device_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	struct xe_device *xe = kdev_to_xe_device(dev);
> +	struct xe_sysctrl_mailbox_command command = {0};
> +	struct xe_ras_health_set_input request = {0};
> +	struct xe_ras_health_set_response response = {0};
> +	enum xe_sysctrl_mailbox_command_id cmd = XE_SYSCTRL_CMD_SET_HEALTH;

Same question here: do we need 'cmd'?

Andi

> +	enum xe_ras_health_status health;
> +	int ret;
> +	size_t rlen = 0;
> +	int state;
> +	int ras_status;
> +
> +	state = sysfs_match_string(gpu_health_states,
> +				   buf);
> +	if (state < 0)
> +		return -EINVAL;
> +
> +	request.new_health = (u8)state;
> +
> +	prepare_sysctrl_command(&command, cmd, &request,
> +				sizeof(request), &response, sizeof(response));
