public inbox for intel-xe@lists.freedesktop.org
* [PATCH v4] drm/xe/xe_survivability: Fix runtime survivability error handling
@ 2026-04-20  2:00 Mallesh Koujalagi
  2026-04-20 21:45 ` ✓ CI.KUnit: success for drm/xe/xe_survivability: Fix runtime survivability error handling (rev4) Patchwork
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Mallesh Koujalagi @ 2026-04-20  2:00 UTC
  To: intel-xe, rodrigo.vivi, matthew.brost
  Cc: anshuman.gupta, badal.nilawar, riana.tauro, karthik.poosa,
	sk.anirban, raag.jadav, Mallesh Koujalagi

xe_survivability_mode_runtime_enable() returns an int, but its caller
csc_hw_error_work() cannot take any meaningful recovery action on
failure. The function already handles all internal errors via dev_err()
and proceeds to enable survivability mode regardless of sysfs creation
failure.

Change the return type to void and drop unnecessary error handling
in csc_hw_error_work().

v2:
- Return is not required after sysfs creation fails. (Rodrigo/Riana)
- Change int to void return type. (Rodrigo)
- Remove extra message from csc_hw_error_work().

v3:
- Remove ret variable. (Raag)

v4:
- Drop ret variable from other parts of the code.

Fixes: a2ca0633a0fe ("drm/xe/xe_survivability: Add support for Runtime survivability mode")
Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
---
 drivers/gpu/drm/xe/xe_hw_error.c           |  5 +----
 drivers/gpu/drm/xe/xe_survivability_mode.c | 14 ++++----------
 drivers/gpu/drm/xe/xe_survivability_mode.h |  2 +-
 3 files changed, 6 insertions(+), 15 deletions(-)
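
The log-and-continue behaviour described in the commit message above can be
sketched as a small user-space program. This is only an illustration under
stated assumptions: `struct fake_device`, `log_err()`, `create_sysfs()` and
`runtime_enable()` are hypothetical stand-ins invented for this note and do
not exist in the xe driver. The sketch shows that a failure in the sysfs step
is logged but does not stop the device from entering runtime mode and being
wedged, which is why the caller has nothing useful to do with a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct xe_device; not the driver's real layout. */
struct fake_device {
	bool supported;		/* models the platform/DGFX/VF support check */
	bool sysfs_fails;	/* forces the sysfs creation step to fail */
	bool runtime_mode;	/* models type == XE_SURVIVABILITY_TYPE_RUNTIME */
	bool wedged;		/* models the device being declared wedged */
	int errors_logged;	/* counts dev_err()-style messages */
};

/* Hypothetical logger: counts and prints, loosely mirroring dev_err(). */
static void log_err(struct fake_device *dev, const char *msg)
{
	dev->errors_logged++;
	fprintf(stderr, "error: %s\n", msg);
}

/* Hypothetical sysfs step: returns a placeholder negative value on failure. */
static int create_sysfs(const struct fake_device *dev)
{
	return dev->sysfs_fails ? -1 : 0;
}

/*
 * void return type: every failure is reported where it happens, so there is
 * no status left for the caller to act on.
 */
static void runtime_enable(struct fake_device *dev)
{
	assert(dev != NULL);

	if (!dev->supported) {
		log_err(dev, "Runtime Survivability Mode not supported");
		return;
	}

	/* A sysfs failure is logged, then execution continues. */
	if (create_sysfs(dev))
		log_err(dev, "Failed to create survivability sysfs");

	dev->runtime_mode = true;
	log_err(dev, "Runtime Survivability mode enabled");

	dev->wedged = true;
	log_err(dev, "Firmware flash required");
}
```

Calling `runtime_enable()` on a `struct fake_device` with `.supported = true`
and `.sysfs_fails = true` still leaves `runtime_mode` and `wedged` set, while
an unsupported device returns early with neither flag set.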

diff --git a/drivers/gpu/drm/xe/xe_hw_error.c b/drivers/gpu/drm/xe/xe_hw_error.c
index 2a31b430570e..64d2260e761b 100644
--- a/drivers/gpu/drm/xe/xe_hw_error.c
+++ b/drivers/gpu/drm/xe/xe_hw_error.c
@@ -169,11 +169,8 @@ static void csc_hw_error_work(struct work_struct *work)
 {
 	struct xe_tile *tile = container_of(work, typeof(*tile), csc_hw_error_work);
 	struct xe_device *xe = tile_to_xe(tile);
-	int ret;
 
-	ret = xe_survivability_mode_runtime_enable(xe);
-	if (ret)
-		drm_err(&xe->drm, "Failed to enable runtime survivability mode\n");
+	xe_survivability_mode_runtime_enable(xe);
 }
 
 static void csc_hw_error_handler(struct xe_tile *tile, const enum hardware_error hw_err)
diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
index db64cac39c94..427afd144f3a 100644
--- a/drivers/gpu/drm/xe/xe_survivability_mode.c
+++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
@@ -396,25 +396,21 @@ bool xe_survivability_mode_is_requested(struct xe_device *xe)
  * Runtime survivability mode is enabled when certain errors cause the device to be
  * in non-recoverable state. The device is declared wedged with the appropriate
  * recovery method and survivability mode sysfs exposed to userspace
- *
- * Return: 0 if runtime survivability mode is enabled, negative error code otherwise.
  */
-int xe_survivability_mode_runtime_enable(struct xe_device *xe)
+void xe_survivability_mode_runtime_enable(struct xe_device *xe)
 {
 	struct xe_survivability *survivability = &xe->survivability;
 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
-	int ret;
 
 	if (!IS_DGFX(xe) || IS_SRIOV_VF(xe) || xe->info.platform < XE_BATTLEMAGE) {
 		dev_err(&pdev->dev, "Runtime Survivability Mode not supported\n");
-		return -EINVAL;
+		return;
 	}
 
 	populate_survivability_info(xe);
 
-	ret = create_survivability_sysfs(pdev);
-	if (ret)
-		dev_err(&pdev->dev, "Failed to create survivability mode sysfs\n");
+	if (create_survivability_sysfs(pdev))
+		dev_err(&pdev->dev, "Failed to create survivability sysfs\n");
 
 	survivability->type = XE_SURVIVABILITY_TYPE_RUNTIME;
 	dev_err(&pdev->dev, "Runtime Survivability mode enabled\n");
@@ -422,8 +418,6 @@ int xe_survivability_mode_runtime_enable(struct xe_device *xe)
 	xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_VENDOR);
 	xe_device_declare_wedged(xe);
 	dev_err(&pdev->dev, "Firmware flash required, Please refer to the userspace documentation for more details!\n");
-
-	return 0;
 }
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.h b/drivers/gpu/drm/xe/xe_survivability_mode.h
index 1cc94226aa82..cd040e4d18bb 100644
--- a/drivers/gpu/drm/xe/xe_survivability_mode.h
+++ b/drivers/gpu/drm/xe/xe_survivability_mode.h
@@ -11,7 +11,7 @@
 struct xe_device;
 
 int xe_survivability_mode_boot_enable(struct xe_device *xe);
-int xe_survivability_mode_runtime_enable(struct xe_device *xe);
+void xe_survivability_mode_runtime_enable(struct xe_device *xe);
 bool xe_survivability_mode_is_boot_enabled(struct xe_device *xe);
 bool xe_survivability_mode_is_requested(struct xe_device *xe);
 
-- 
2.34.1




Thread overview: 5+ messages
2026-04-20  2:00 [PATCH v4] drm/xe/xe_survivability: Fix runtime survivability error handling Mallesh Koujalagi
2026-04-20 21:45 ` ✓ CI.KUnit: success for drm/xe/xe_survivability: Fix runtime survivability error handling (rev4) Patchwork
2026-04-20 22:32 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-21  1:29 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-04-21  7:57 ` [PATCH v4] drm/xe/xe_survivability: Fix runtime survivability error handling Raag Jadav
