From: Raag Jadav <raag.jadav@intel.com>
To: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
Cc: intel-xe@lists.freedesktop.org, rodrigo.vivi@intel.com,
matthew.brost@intel.com, anshuman.gupta@intel.com,
badal.nilawar@intel.com, riana.tauro@intel.com,
karthik.poosa@intel.com, sk.anirban@intel.com
Subject: Re: [PATCH v2] drm/xe/xe_survivability: Fix runtime survivability error handling
Date: Thu, 16 Apr 2026 17:48:03 +0200
Message-ID: <aeEEs4oi6zK-W3SZ@black.igk.intel.com>
In-Reply-To: <20260414172641.570653-2-mallesh.koujalagi@intel.com>

On Tue, Apr 14, 2026 at 10:56:42PM +0530, Mallesh Koujalagi wrote:
> xe_survivability_mode_runtime_enable() returns an int, but its caller
> csc_hw_error_work() cannot take any meaningful recovery action on
> failure. The function already handles all internal errors via dev_err()
> and proceeds to enable survivability mode regardless of sysfs creation
> failure.
>
> Change the return type to void and drop unnecessary error handling
> in csc_hw_error_work().
>
> v2:
> - Return is not required after sysfs creation fails. (Rodrigo/Riana)
> - Change int to void return type. (Rodrigo)
> - Remove extra message from csc_hw_error_work().
>
> Fixes: a2ca0633a0fe ("drm/xe/xe_survivability: Add support for Runtime survivability mode")
> Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
> ---
> drivers/gpu/drm/xe/xe_hw_error.c | 5 +----
> drivers/gpu/drm/xe/xe_survivability_mode.c | 8 ++------
> drivers/gpu/drm/xe/xe_survivability_mode.h | 2 +-
> 3 files changed, 4 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_hw_error.c b/drivers/gpu/drm/xe/xe_hw_error.c
> index 2a31b430570e..64d2260e761b 100644
> --- a/drivers/gpu/drm/xe/xe_hw_error.c
> +++ b/drivers/gpu/drm/xe/xe_hw_error.c
> @@ -169,11 +169,8 @@ static void csc_hw_error_work(struct work_struct *work)
> {
> struct xe_tile *tile = container_of(work, typeof(*tile), csc_hw_error_work);
> struct xe_device *xe = tile_to_xe(tile);
> - int ret;
>
> - ret = xe_survivability_mode_runtime_enable(xe);
> - if (ret)
> - drm_err(&xe->drm, "Failed to enable runtime survivability mode\n");
> + xe_survivability_mode_runtime_enable(xe);
> }
>
> static void csc_hw_error_handler(struct xe_tile *tile, const enum hardware_error hw_err)
> diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
> index db64cac39c94..c68b789d93c7 100644
> --- a/drivers/gpu/drm/xe/xe_survivability_mode.c
> +++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
> @@ -396,10 +396,8 @@ bool xe_survivability_mode_is_requested(struct xe_device *xe)
> * Runtime survivability mode is enabled when certain errors cause the device to be
> * in non-recoverable state. The device is declared wedged with the appropriate
> * recovery method and survivability mode sysfs exposed to userspace
> - *
> - * Return: 0 if runtime survivability mode is enabled, negative error code otherwise.
> */
> -int xe_survivability_mode_runtime_enable(struct xe_device *xe)
> +void xe_survivability_mode_runtime_enable(struct xe_device *xe)
> {
> struct xe_survivability *survivability = &xe->survivability;
> struct pci_dev *pdev = to_pci_dev(xe->drm.dev);

Drop the ret variable here as well, so it doesn't mislead anyone into
thinking it needs to be returned.

Raag

> @@ -407,7 +405,7 @@ int xe_survivability_mode_runtime_enable(struct xe_device *xe)
>
> if (!IS_DGFX(xe) || IS_SRIOV_VF(xe) || xe->info.platform < XE_BATTLEMAGE) {
> dev_err(&pdev->dev, "Runtime Survivability Mode not supported\n");
> - return -EINVAL;
> + return;
> }
>
> populate_survivability_info(xe);
> @@ -422,8 +420,6 @@ int xe_survivability_mode_runtime_enable(struct xe_device *xe)
> xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_VENDOR);
> xe_device_declare_wedged(xe);
> dev_err(&pdev->dev, "Firmware flash required, Please refer to the userspace documentation for more details!\n");
> -
> - return 0;
> }
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.h b/drivers/gpu/drm/xe/xe_survivability_mode.h
> index 1cc94226aa82..cd040e4d18bb 100644
> --- a/drivers/gpu/drm/xe/xe_survivability_mode.h
> +++ b/drivers/gpu/drm/xe/xe_survivability_mode.h
> @@ -11,7 +11,7 @@
> struct xe_device;
>
> int xe_survivability_mode_boot_enable(struct xe_device *xe);
> -int xe_survivability_mode_runtime_enable(struct xe_device *xe);
> +void xe_survivability_mode_runtime_enable(struct xe_device *xe);
> bool xe_survivability_mode_is_boot_enabled(struct xe_device *xe);
> bool xe_survivability_mode_is_requested(struct xe_device *xe);
>
> --
> 2.34.1
>
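
The review comment about dropping ret can be illustrated with a minimal, self-contained sketch. It is not the driver code: the part of xe_survivability_mode_runtime_enable() that uses ret is not shown in the quoted hunks, so the sketch assumes ret only holds the result of the sysfs creation step. The names fake_device, fake_create_sysfs() and fake_runtime_enable(), and the second error string, are hypothetical stand-ins introduced for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct xe_device; not the real driver type. */
struct fake_device {
	bool supported;	/* models the platform check in the patch */
	bool sysfs_ok;	/* controls whether the fake sysfs step fails */
	bool wedged;	/* set once the device is declared wedged */
};

/* Hypothetical stub for the sysfs creation step: 0 on success, negative on failure. */
static int fake_create_sysfs(struct fake_device *dev)
{
	return dev->sysfs_ok ? 0 : -1;
}

/* void return type and no local ret: nothing suggests a value is propagated. */
static void fake_runtime_enable(struct fake_device *dev)
{
	if (!dev->supported) {
		fprintf(stderr, "Runtime Survivability Mode not supported\n");
		return;
	}

	/* Test the call result directly instead of storing it in ret. */
	if (fake_create_sysfs(dev))
		fprintf(stderr, "Failed to create survivability sysfs\n"); /* assumed message */

	/* Proceed regardless of the sysfs failure, as the commit message describes. */
	dev->wedged = true;
}
```

With the result checked inline, a reader of a void function has no leftover variable hinting that an error code should reach the caller.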
Thread overview: 5+ messages
2026-04-14 17:26 [PATCH v2] drm/xe/xe_survivability: Fix runtime survivability error handling Mallesh Koujalagi
2026-04-14 17:36 ` ✓ CI.KUnit: success for drm/xe/xe_survivability: Fix runtime survivability error handling (rev2) Patchwork
2026-04-14 19:04 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-14 20:38 ` ✓ Xe.CI.FULL: " Patchwork
2026-04-16 15:48 ` Raag Jadav [this message]