Intel-XE Archive on lore.kernel.org
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Riana Tauro <riana.tauro@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <anshuman.gupta@intel.com>,
	<lucas.demarchi@intel.com>
Subject: Re: [PATCH] drm/xe/xe_survivability: Add support for survivability mode v2
Date: Thu, 16 Oct 2025 15:17:54 -0400
Message-ID: <aPFE4hIJKUKh1pjz@intel.com>
In-Reply-To: <20251014053257.3417575-2-riana.tauro@intel.com>

On Tue, Oct 14, 2025 at 11:02:58AM +0530, Riana Tauro wrote:
> v2 survivability breadcrumbs introduce a new mode called
> SPI Flash Descriptor Override (FDO) mode. It is enabled by
> PCODE when MEI itself fails and firmware cannot be updated via
> MEI using igsc. This mode provides the ability to update
> the firmware directly via the SPI driver.
> 
> Xe KMD initializes the nvm aux driver if FDO mode is enabled.
> 
> Userspace should check FDO mode entry in survivability sysfs before
> using the SPI driver to update firmware.
> 
> v2 also supports survivability mode for critical boot errors.
> 
> 	cat /sys/bus/pci/devices/0000\:03\:00.0/survivability_mode
> 
>                Capability Info: 0x138320 - 0x2001ae06
>                Postcode Info: 0x138324 - 0x0
>                Overflow Info: 0x138328 - 0x0
>                Auxiliary Info 0: 0x13832c - 0x0

I am truly sorry here, but although I was the one who designed this,
looking at it now, I realize that it breaks the sysfs rule of one
value per file and no fancy formatting. That is only allowed in
debugfs.

We need to change this ASAP, in coordination with any tool that
might already be consuming it.

>                FDO Mode: enabled

After we fix that, we can come back and add this.

About our options: I don't believe debugfs is an option
without the drm card, right?

Perhaps what we need is to turn survivability_mode into
a directory, with each entry becoming a file in that directory.

Sorry,
Rodrigo.
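
From the consumer side, a directory of single-value files could be read
roughly as in the minimal userspace sketch below. This is only an
illustration of the "one value per file" idea: read_sysfs_value() is a
helper invented here, not an existing API, and the thread has not settled
on any directory layout or file names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing API): read the single value stored
 * in "<dir>/<name>" into buf and strip the trailing newline.
 * Returns 0 on success, -1 on any failure.
 */
static int read_sysfs_value(const char *dir, const char *name,
			    char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	if (len == 0)
		return -1;

	/* Build the attribute path and reject truncated paths */
	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* One value per file: a single line is all we expect to read */
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Drop the newline that sysfs_emit()-style output usually ends with */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A tool could then call, for example,
read_sysfs_value("/sys/bus/pci/devices/0000:03:00.0/survivability_mode",
"fdo_mode", buf, sizeof(buf)) and compare the result with "enabled",
assuming a hypothetical file named fdo_mode exists under that directory.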

> 
> Signed-off-by: Riana Tauro <riana.tauro@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_pcode_api.h             |  2 ++
>  drivers/gpu/drm/xe/xe_survivability_mode.c    | 32 +++++++++++++++++--
>  .../gpu/drm/xe/xe_survivability_mode_types.h  |  6 ++++
>  3 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pcode_api.h b/drivers/gpu/drm/xe/xe_pcode_api.h
> index 92bfcba51e19..d41f07f9194d 100644
> --- a/drivers/gpu/drm/xe/xe_pcode_api.h
> +++ b/drivers/gpu/drm/xe/xe_pcode_api.h
> @@ -77,11 +77,13 @@
>  
>  #define PCODE_SCRATCH(x)		XE_REG(0x138320 + ((x) * 4))
>  /* PCODE_SCRATCH0 */
> +#define   BREADCRUMB_VERSION		REG_GENMASK(31, 29)
>  #define   AUXINFO_REG_OFFSET		REG_GENMASK(17, 15)
>  #define   OVERFLOW_REG_OFFSET		REG_GENMASK(14, 12)
>  #define   HISTORY_TRACKING		REG_BIT(11)
>  #define   OVERFLOW_SUPPORT		REG_BIT(10)
>  #define   AUXINFO_SUPPORT		REG_BIT(9)
> +#define   FDO_MODE			REG_BIT(4)
>  #define   BOOT_STATUS			REG_GENMASK(3, 1)
>  #define      CRITICAL_FAILURE		4
>  #define      NON_CRITICAL_FAILURE	7
> diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
> index 1662bfddd4bc..1c9421651548 100644
> --- a/drivers/gpu/drm/xe/xe_survivability_mode.c
> +++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
> @@ -16,6 +16,7 @@
>  #include "xe_heci_gsc.h"
>  #include "xe_i2c.h"
>  #include "xe_mmio.h"
> +#include "xe_nvm.h"
>  #include "xe_pcode_api.h"
>  #include "xe_vsec.h"
>  
> @@ -61,6 +62,12 @@
>   *	Provides history of previous failures
>   * Auxiliary Information
>   *	Certain failures may have information in addition to postcode information
> + * FDO Mode
> + *	To allow recovery in scenarios where MEI itself fails, a new SPI Flash Descriptor
> + *	Override (FDO) mode is added in v2 survivability breadcrumbs. This mode is enabled
> + *	by PCODE and provides the ability to directly update the firmware via SPI Driver without
> + *	any dependency on MEI.
> + *	Xe KMD initializes the nvm aux driver if FDO mode is enabled.
>   *
>   * Runtime Survivability
>   * =====================
> @@ -105,6 +112,11 @@ static void populate_survivability_info(struct xe_device *xe)
>  	set_survivability_info(mmio, info, id, "Capability Info");
>  	reg_value = info[id].value;
>  
> +	survivability->version = REG_FIELD_GET(BREADCRUMB_VERSION, reg_value);
> +	/* FDO mode is exposed only from version 2 */
> +	if (survivability->version >= 2)
> +		survivability->fdo_mode = REG_FIELD_GET(FDO_MODE, reg_value);
> +
>  	if (reg_value & HISTORY_TRACKING) {
>  		id++;
>  		set_survivability_info(mmio, info, id, "Postcode Info");
> @@ -171,6 +183,9 @@ static ssize_t survivability_mode_show(struct device *dev,
>  					       info[index].reg, info[index].value);
>  	}
>  
> +	if (survivability->version >= 2)
> +		count += sysfs_emit_at(buff, count, "FDO Mode: %s\n",
> +				       str_enabled_disabled(survivability->fdo_mode));
>  	return count;
>  }
>  
> @@ -179,9 +194,13 @@ static DEVICE_ATTR_ADMIN_RO(survivability_mode);
>  static void xe_survivability_mode_fini(void *arg)
>  {
>  	struct xe_device *xe = arg;
> +	struct xe_survivability *survivability = &xe->survivability;
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>  	struct device *dev = &pdev->dev;
>  
> +	if (survivability->fdo_mode)
> +		xe_nvm_fini(xe);
> +
>  	sysfs_remove_file(&dev->kobj, &dev_attr_survivability_mode.attr);
>  }
>  
> @@ -230,11 +249,18 @@ static int enable_boot_survivability_mode(struct pci_dev *pdev)
>  	if (ret)
>  		goto err;
>  
> +	if (survivability->fdo_mode) {
> +		ret = xe_nvm_init(xe);
> +		if (ret)
> +			goto err;
> +	}
> +
>  	dev_err(dev, "In Survivability Mode\n");
>  
>  	return 0;
>  
>  err:
> +	dev_err(dev, "Failed to enable Survivability Mode\n");
>  	survivability->mode = false;
>  	return ret;
>  }
> @@ -365,8 +391,10 @@ int xe_survivability_mode_boot_enable(struct xe_device *xe)
>  	if (ret)
>  		return ret;
>  
> -	/* Log breadcrumbs but do not enter survivability mode for Critical boot errors */
> -	if (survivability->boot_status == CRITICAL_FAILURE) {
> +	/*
> +	 * v2 supports survivability mode for critical errors
> +	 */
> +	if (survivability->version < 2 && survivability->boot_status == CRITICAL_FAILURE) {
>  		log_survivability_info(pdev);
>  		return -ENXIO;
>  	}
> diff --git a/drivers/gpu/drm/xe/xe_survivability_mode_types.h b/drivers/gpu/drm/xe/xe_survivability_mode_types.h
> index cd65a5d167c9..379d90759c28 100644
> --- a/drivers/gpu/drm/xe/xe_survivability_mode_types.h
> +++ b/drivers/gpu/drm/xe/xe_survivability_mode_types.h
> @@ -38,6 +38,12 @@ struct xe_survivability {
>  
>  	/** @type: survivability type */
>  	enum xe_survivability_type type;
> +
> +	/** @fdo_mode: indicates if FDO mode is enabled */
> +	bool fdo_mode;
> +
> +	/** @version: breadcrumb version of survivability mode  */
> +	u8 version;
>  };
>  
>  #endif /* _XE_SURVIVABILITY_MODE_TYPES_H_ */
> -- 
> 2.47.1
> 

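The PCODE_SCRATCH0 field layout in the quoted xe_pcode_api.h hunk can be
checked against the sample "Capability Info" value from the commit message
with a small userspace sketch. GENMASK32() and FIELD_GET32() are stand-ins
written here for the kernel's REG_GENMASK() and REG_FIELD_GET(), and
struct cap_info and decode_cap_info() are hypothetical names used only for
illustration. The version check mirrors the patch: the FDO bit is read
only when the breadcrumb version is 2 or higher.

```c
#include <stdint.h>

/* Userspace stand-ins for REG_GENMASK()/REG_FIELD_GET(); illustration only */
#define GENMASK32(h, l) \
	((0xffffffffu >> (31 - (h))) & (0xffffffffu << (l)))
/* Dividing by the lowest set bit of the mask shifts the field down to bit 0 */
#define FIELD_GET32(mask, val) \
	(((val) & (mask)) / ((mask) & (~(mask) + 1u)))

/* Bit positions copied from the quoted PCODE_SCRATCH0 definitions */
#define BREADCRUMB_VERSION	GENMASK32(31, 29)
#define AUXINFO_REG_OFFSET	GENMASK32(17, 15)
#define OVERFLOW_REG_OFFSET	GENMASK32(14, 12)
#define HISTORY_TRACKING	GENMASK32(11, 11)
#define OVERFLOW_SUPPORT	GENMASK32(10, 10)
#define AUXINFO_SUPPORT		GENMASK32(9, 9)
#define FDO_MODE		GENMASK32(4, 4)
#define BOOT_STATUS		GENMASK32(3, 1)

/* Hypothetical container for the decoded fields */
struct cap_info {
	unsigned int version;
	unsigned int aux_offset;
	unsigned int overflow_offset;
	unsigned int history;
	unsigned int overflow_support;
	unsigned int aux_support;
	unsigned int fdo_mode;
	unsigned int boot_status;
};

static struct cap_info decode_cap_info(uint32_t v)
{
	struct cap_info c = {0};

	c.version = FIELD_GET32(BREADCRUMB_VERSION, v);
	c.aux_offset = FIELD_GET32(AUXINFO_REG_OFFSET, v);
	c.overflow_offset = FIELD_GET32(OVERFLOW_REG_OFFSET, v);
	c.history = FIELD_GET32(HISTORY_TRACKING, v);
	c.overflow_support = FIELD_GET32(OVERFLOW_SUPPORT, v);
	c.aux_support = FIELD_GET32(AUXINFO_SUPPORT, v);
	c.boot_status = FIELD_GET32(BOOT_STATUS, v);

	/* As in the patch, the FDO bit is only consulted from version 2 on */
	if (c.version >= 2)
		c.fdo_mode = FIELD_GET32(FDO_MODE, v);

	return c;
}
```

For the sample value 0x2001ae06, this decoding gives history, overflow and
auxiliary support all set, with overflow and auxiliary offsets of 2 and 3.
That lines up with the 0x138328 and 0x13832c addresses in the sample
output, given PCODE_SCRATCH(x) = 0x138320 + x * 4. The version field of
that sample reads as 1, so under this sketch the FDO bit is not consulted
for it.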
Thread overview: 10+ messages
2025-10-14  5:32 [PATCH] drm/xe/xe_survivability: Add support for survivability mode v2 Riana Tauro
2025-10-14  6:10 ` ✓ CI.KUnit: success for " Patchwork
2025-10-14  6:46 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-14 14:09 ` ✗ Xe.CI.Full: failure " Patchwork
2025-10-16 19:17 ` Rodrigo Vivi [this message]
2025-10-19 15:55   ` [PATCH] " Raag Jadav
2025-10-26 18:59     ` Raag Jadav
2025-10-22 12:38   ` Riana Tauro
2025-11-03  8:05     ` Riana Tauro
2025-11-04 18:16       ` Rodrigo Vivi
