From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
To: Riana Tauro <riana.tauro@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <anshuman.gupta@intel.com>,
<rodrigo.vivi@intel.com>, <lucas.demarchi@intel.com>,
<aravind.iddamsetty@linux.intel.com>, <raag.jadav@intel.com>,
<frank.scarbrough@intel.com>, <sk.anirban@intel.com>
Subject: Re: [PATCH v4 9/9] drm/xe/xe_hw_error: Add fault injection to trigger csc error handler
Date: Fri, 11 Jul 2025 10:41:40 -0700
Message-ID: <aHFM1MSvgrn4U1iF@unerlige-desk.amr.corp.intel.com>
In-Reply-To: <20250709112024.1053710-10-riana.tauro@intel.com>
On Wed, Jul 09, 2025 at 04:50:21PM +0530, Riana Tauro wrote:
>Add a debugfs fault handler to trigger csc error handler that
>wedges the device and sends drm uevent
>
>Signed-off-by: Riana Tauro <riana.tauro@intel.com>
>---
> drivers/gpu/drm/xe/xe_debugfs.c | 2 ++
> drivers/gpu/drm/xe/xe_hw_error.c | 11 +++++++++++
> 2 files changed, 13 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
>index d83cd6ed3fa8..134610437aea 100644
>--- a/drivers/gpu/drm/xe/xe_debugfs.c
>+++ b/drivers/gpu/drm/xe/xe_debugfs.c
>@@ -29,6 +29,7 @@
> #endif
>
> DECLARE_FAULT_ATTR(gt_reset_failure);
>+DECLARE_FAULT_ATTR(inject_csc_hw_error);
>
> static struct xe_device *node_to_xe(struct drm_info_node *node)
> {
>@@ -273,4 +274,5 @@ void xe_debugfs_register(struct xe_device *xe)
> xe_pxp_debugfs_register(xe->pxp);
>
> fault_create_debugfs_attr("fail_gt_reset", root, &gt_reset_failure);
>+ fault_create_debugfs_attr("inject_csc_hw_error", root, &inject_csc_hw_error);
Maybe create this attribute only for BMG, since on other platforms the
worker will bail out with an error anyway when it runs? Or are you
expecting to see the log message that says "runtime survivability not
supported"?
The absence of this attribute in debugfs can also be sufficient to
indicate that it's not supported.
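
A rough userspace sketch of that gating, to make the suggestion
concrete. It is not driver code: every model_* name below is a
hypothetical stand-in (the real platform check and debugfs helpers in xe
are not reproduced here), and model_create_attr() only records a name
where the driver would call fault_create_debugfs_attr().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real xe types. */
enum model_platform { MODEL_PLATFORM_OTHER, MODEL_PLATFORM_BMG };

#define MODEL_MAX_ATTRS 4

struct model_device {
	enum model_platform platform;
	const char *attrs[MODEL_MAX_ATTRS];	/* names of "created" attributes */
	size_t nr_attrs;
};

/* Assumption: models fault_create_debugfs_attr() by recording the name. */
static void model_create_attr(struct model_device *dev, const char *name)
{
	if (dev->nr_attrs < MODEL_MAX_ATTRS)
		dev->attrs[dev->nr_attrs++] = name;
}

/* Returns true if an attribute with this name was created for the device. */
static bool model_has_attr(const struct model_device *dev, const char *name)
{
	for (size_t i = 0; i < dev->nr_attrs; i++)
		if (strcmp(dev->attrs[i], name) == 0)
			return true;
	return false;
}

/* The existing attribute is always created; the CSC one only on BMG. */
static void model_debugfs_register(struct model_device *dev)
{
	model_create_attr(dev, "fail_gt_reset");

	if (dev->platform == MODEL_PLATFORM_BMG)
		model_create_attr(dev, "inject_csc_hw_error");
}
```

With this shape, a missing inject_csc_hw_error entry is itself the
signal that the platform does not support it, and the worker's error
path is never reached through injection on such platforms.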
Thanks,
Umesh
> }
>diff --git a/drivers/gpu/drm/xe/xe_hw_error.c b/drivers/gpu/drm/xe/xe_hw_error.c
>index 7cc9b8a7fa1a..2d56a93b3a71 100644
>--- a/drivers/gpu/drm/xe/xe_hw_error.c
>+++ b/drivers/gpu/drm/xe/xe_hw_error.c
>@@ -3,6 +3,8 @@
> * Copyright © 2025 Intel Corporation
> */
>
>+#include <linux/fault-inject.h>
>+
> #include "regs/xe_gsc_regs.h"
> #include "regs/xe_hw_error_regs.h"
> #include "regs/xe_irq_regs.h"
>@@ -13,6 +15,7 @@
> #include "xe_survivability_mode.h"
>
> #define HEC_UNCORR_FW_ERR_BITS 4
>+extern struct fault_attr inject_csc_hw_error;
>
> /* Error categories reported by hardware */
> enum hardware_error {
>@@ -43,6 +46,11 @@ static const char *hw_error_to_str(const enum hardware_error hw_err)
> }
> }
>
>+static bool fault_inject_csc_hw_error(void)
>+{
>+ return should_fail(&inject_csc_hw_error, 1);
>+}
>+
> static void csc_hw_error_work(struct work_struct *work)
> {
> struct xe_tile *tile = container_of(work, typeof(*tile), csc_hw_error_work);
>@@ -134,6 +142,9 @@ void xe_hw_error_irq_handler(struct xe_tile *tile, const u32 master_ctl)
> {
> enum hardware_error hw_err;
>
>+ if (fault_inject_csc_hw_error())
>+ schedule_work(&tile->csc_hw_error_work);
>+
> for (hw_err = 0; hw_err < HARDWARE_ERROR_MAX; hw_err++)
> if (master_ctl & ERROR_IRQ(hw_err))
> hw_error_source_handler(tile, hw_err);
>--
>2.47.1
>