All of lore.kernel.org
 help / color / mirror / Atom feed
From: Raag Jadav <raag.jadav@intel.com>
To: "Purkait, Soham" <soham.purkait@intel.com>
Cc: "Tauro, Riana" <riana.tauro@intel.com>,
	intel-xe@lists.freedesktop.org, anshuman.gupta@intel.com,
	aravind.iddamsetty@linux.intel.com, badal.nilawar@intel.com,
	ravi.kishore.koppuravuri@intel.com, mallesh.koujalagi@intel.com,
	andi.shyti@intel.com, rodrigo.vivi@intel.com,
	anoop.c.vijay@intel.com
Subject: Re: [PATCH v2 1/2] drm/xe/xe_ras: Add types and commands for RAS GPU health indicator
Date: Wed, 29 Apr 2026 07:34:03 +0200	[thread overview]
Message-ID: <afGYS7oBCw4-QQoG@black.igk.intel.com> (raw)
In-Reply-To: <3bbb3db0-28be-49ac-a828-1167a1f2ab96@intel.com>

On Wed, Apr 29, 2026 at 10:54:35AM +0530, Purkait, Soham wrote:
> Hi Riana,
> 
> On 28-04-2026 14:26, Tauro, Riana wrote:
> > 
> > On 4/23/2026 11:09 PM, Soham Purkait wrote:
> > > Add System Controller GPU health status values, RAS response codes
> > > and mailbox payload types for querying and updating GPU health.
> > > 
> > > GPU health states are encoded as numeric values defined by
> > > enum xe_ras_health_status for use in System Controller commands.
> > > The GET_HEALTH command fetches the current health state, while
> > > the SET_HEALTH command updates it through the System Controller
> > > mailbox.
> > > 
> > > struct xe_ras_health_get_input and struct xe_ras_health_get_response
> > > describe the GET_HEALTH request and response payloads, respectively,
> > > while struct xe_ras_health_set_input and
> > > struct xe_ras_health_set_response describe the SET_HEALTH request
> > > and response payloads, including the operation status and current
> > > health state in the response.
> > 
> > This is extremely detailed commit message. Avoid using structs in commit
> > message
> > 
> > My suggestion is to split patches into get/set health and add structures
> > in the patch they
> > are being used. But upto you.
> 
> In my opinion its best to keep this intact in Instead of splitting in
> multiple patches as functionality wise it represent a single feature.

Documentation/process/submitting-patches.rst +81

"Solve only one problem per patch.  If your description starts to get
long, that's a sign that you probably need to split up your patch."

Raag

> > > v2:
> > >   - Add enum for health status instead of xe_ras_health_status_t.
> > > (Andi), (Rodrigo)
> > > 
> > > Signed-off-by: Soham Purkait <soham.purkait@intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_ras_types.h             | 83 +++++++++++++++++++
> > >   drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h | 15 ++++
> > >   2 files changed, 98 insertions(+)
> > >   create mode 100644 drivers/gpu/drm/xe/xe_ras_types.h
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_ras_types.h
> > > b/drivers/gpu/drm/xe/xe_ras_types.h
> > > new file mode 100644
> > > index 000000000000..5f884d6e24de
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_ras_types.h
> > > @@ -0,0 +1,83 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright © 2026 Intel Corporation
> > > + */
> > > +
> > > +#ifndef _XE_RAS_TYPES_H_
> > > +#define _XE_RAS_TYPES_H_
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +/* RAS response status codes */
> > > +enum xe_ras_response_status {
> > > +    XE_RAS_STATUS_SUCCESS = 0,
> > > +    XE_RAS_STATUS_INVALID_PARAM,
> > > +    XE_RAS_STATUS_OP_NOT_SUPPORTED,
> > > +    XE_RAS_STATUS_TIMEOUT,
> > > +    XE_RAS_STATUS_HARDWARE_FAILURE,
> > > +    XE_RAS_STATUS_INSUFFICIENT_RESOURCES,
> > > +    XE_RAS_STATUS_UNKNOWN_ERROR,
> > > +};
> > > +
> > > +/**
> > > + * enum xe_ras_health_status - Device health status values
> > > + *
> > > + * Health indicator status denoted by numeric values to be used in
> > > system
> > > + * controller mailbox commands.
> > > + *
> > > + * @XE_RAS_HEALTH_STATUS_OK: The device is healthy and operating
> > > within normal
> > > + * parameters.
> > > + * @XE_RAS_HEALTH_STATUS_WARNING: The device is experiencing minor
> > > issues but is
> > > + * still operational.
> > > + * @XE_RAS_HEALTH_STATUS_CRITICAL: The device is in a critical
> > > state and may not
> > > + * be operational.
> > > + */
> > > +enum xe_ras_health_status {
> > > +    XE_RAS_HEALTH_STATUS_OK = 0,
> > > +    XE_RAS_HEALTH_STATUS_WARNING,
> > > +    XE_RAS_HEALTH_STATUS_CRITICAL,
> > > +};
> > > +
> > > +/**
> > > + * struct xe_ras_health_get_input - Input for XE_SYSCTRL_CMD_GET_HEALTH
> > > + */
> > > +struct xe_ras_health_get_input {
> > > +    /** @reserved: Reserved for future use, must be 0 */
> > > +    u32 reserved[2];
> > > +} __packed;
> > > +
> > > +/**
> > > + * struct xe_ras_health_get_response - Response for
> > > XE_SYSCTRL_CMD_GET_HEALTH
> > > + */
> > > +struct xe_ras_health_get_response {
> > > +    /** @current_health: Current health status (OK/WARNING/CRITICAL) */
> > > +    u8 current_health;
> > > +    /** @reserved: Reserved for alignment */
> > Future use in case any other fields are added
> > > +    u8 reserved[3];
> > > +} __packed;
> > > +
> > > +/**
> > > + * struct xe_ras_health_set_input - Input for XE_SYSCTRL_CMD_SET_HEALTH
> > > + */
> > > +struct xe_ras_health_set_input {
> > > +    /** @new_health: New health status to set */
> > > +    u8 new_health;
> > > +    /** @reserved: Reserved for alignment */
> > > +    u8 reserved[3];
> > > +} __packed;
> > > +
> > > +/**
> > > + * struct xe_ras_health_set_response - Response for
> > > XE_SYSCTRL_CMD_SET_HEALTH
> > > + */
> > > +struct xe_ras_health_set_response {
> > > +    /** @operation_status: Status of set operation (RAS_STATUS) */
> > use enum or XE_RAS_STATUS
> > > +    u32 operation_status;
> > > +    /** @current_health: Health status after this change */
> > > +    u8 current_health;
> > > +    /** @reserved: Reserved for alignment */
> > > +    u8 reserved[3];
> > > +    /** @reserved_2: Reserved for future expansion */
> > > +    u32 reserved_2[2];
> > 
> > Nit: reserved1
> > 
> > > +} __packed;
> > > +
> > > +#endif /* _XE_RAS_TYPES_H_ */
> > > diff --git a/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > > b/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > > index 89456aec6097..fc73e02c3202 100644
> > > --- a/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > > @@ -10,6 +10,21 @@
> > >     #include "abi/xe_sysctrl_abi.h"
> > >   +/**
> > > + * enum xe_sysctrl_mailbox_command_id - RAS Command ID's for GFSP group
> > > + *
> > > + * @XE_SYSCTRL_CMD_GET_HEALTH: Get current health status
> > > + * @XE_SYSCTRL_CMD_SET_HEALTH: Set new health status
> > > + */
> > > +enum xe_sysctrl_mailbox_command_id {
> > > +    XE_SYSCTRL_CMD_GET_HEALTH = 0x0B,
> > > +    XE_SYSCTRL_CMD_SET_HEALTH = 0x0C
> > 
> > Add comma so that following patches do not have to add.
> > 
> > > +};
> > > +
> > > +enum xe_sysctrl_group {
> > > +    XE_SYSCTRL_GROUP_GFSP = 1
> > 
> > Use 0x01. Also make this consistent with the correctable series by Raag.
> > 
> > Thanks
> > Riana
> > 
> > > +};
> > > +
> > >   /**
> > >    * struct xe_sysctrl_mailbox_command - System Controller mailbox
> > > command
> > >    */

  reply	other threads:[~2026-04-29  5:34 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-23 17:39 [PATCH v2 0/2] drm/xe: Add support for GPU health indicator Soham Purkait
2026-04-23 17:39 ` [PATCH v2 1/2] drm/xe/xe_ras: Add types and commands for RAS " Soham Purkait
2026-04-28  8:56   ` Tauro, Riana
2026-04-29  5:24     ` Purkait, Soham
2026-04-29  5:34       ` Raag Jadav [this message]
2026-04-28 13:19   ` Andi Shyti
2026-04-23 17:39 ` [PATCH v2 2/2] drm/xe/xe_ras: Add RAS support for " Soham Purkait
2026-04-27 22:16   ` Rodrigo Vivi
2026-04-28  8:24   ` Tauro, Riana
2026-04-28 12:57     ` Andi Shyti
2026-04-29  6:07     ` Purkait, Soham
2026-04-28 13:47   ` Andi Shyti
2026-04-29  5:39   ` Raag Jadav
2026-04-23 17:52 ` ✗ CI.checkpatch: warning for drm/xe: Add " Patchwork
2026-04-23 17:54 ` ✓ CI.KUnit: success " Patchwork
2026-04-23 19:02 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-24  2:52 ` ✓ Xe.CI.FULL: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afGYS7oBCw4-QQoG@black.igk.intel.com \
    --to=raag.jadav@intel.com \
    --cc=andi.shyti@intel.com \
    --cc=anoop.c.vijay@intel.com \
    --cc=anshuman.gupta@intel.com \
    --cc=aravind.iddamsetty@linux.intel.com \
    --cc=badal.nilawar@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=mallesh.koujalagi@intel.com \
    --cc=ravi.kishore.koppuravuri@intel.com \
    --cc=riana.tauro@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=soham.purkait@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.