From: "Tauro, Riana" <riana.tauro@intel.com>
To: Raag Jadav <raag.jadav@intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Mallesh Koujalagi <mallesh.koujalagi@intel.com>,
<intel-xe@lists.freedesktop.org>, <matthew.brost@intel.com>,
<anshuman.gupta@intel.com>, <badal.nilawar@intel.com>,
<karthik.poosa@intel.com>, <sk.anirban@intel.com>
Subject: Re: [PATCH] drm/xe/xe_survivability: Fix runtime survivability error handling
Date: Tue, 14 Apr 2026 20:05:17 +0530
Message-ID: <c3bfef08-37b4-47c2-bfad-6084d16c54c1@intel.com>
In-Reply-To: <ad5K96woeG-_19SY@black.igk.intel.com>
On 4/14/2026 7:41 PM, Raag Jadav wrote:
> On Tue, Apr 14, 2026 at 09:58:18AM -0400, Rodrigo Vivi wrote:
>> On Tue, Apr 14, 2026 at 06:14:27PM +0530, Mallesh Koujalagi wrote:
>>> When enabling survivability mode at runtime, the code tries to create
>>> a sysfs entry. If that step fails, the error is only logged, but the
>>> function still reports success. This makes it look like survivability
>>> mode was enabled even though part of it failed.
>>>
>>> Fixes: a2ca0633a0fe ("drm/xe/xe_survivability: Add support for Runtime survivability mode")
>>> Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_survivability_mode.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_survivability_mode.c b/drivers/gpu/drm/xe/xe_survivability_mode.c
>>> index db64cac39c94..c2dfc7ea7b83 100644
>>> --- a/drivers/gpu/drm/xe/xe_survivability_mode.c
>>> +++ b/drivers/gpu/drm/xe/xe_survivability_mode.c
>>> @@ -413,8 +413,10 @@ int xe_survivability_mode_runtime_enable(struct xe_device *xe)
>>> populate_survivability_info(xe);
>>>
>>> ret = create_survivability_sysfs(pdev);
>>> - if (ret)
>>> + if (ret) {
>>> dev_err(&pdev->dev, "Failed to create survivability mode sysfs\n");
>>> + return ret;
>> Perhaps this is intentional?
>> But if so, this function needs to be changed to void and
>> the extra msg removed from csc_hw_error_work()
>>
>> Riana, Raag?
Yeah, this was intentional. Even if the sysfs creation fails, the device
still needs to be wedged, because there is a critical CSC error: the
firmware needs to be updated and the device shouldn't be used.

Currently, runtime survivability mode is indicated by a combination of
uevent + sysfs and dmesg. Returning here would remove all indications of
how to recover the card.
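
To illustrate the point above, here is a minimal userspace sketch of the
"log and continue" flow. It is not the driver code: struct mock_device,
mock_create_sysfs() and mock_runtime_enable() are hypothetical stand-ins,
and the wedge/uevent steps are modelled as plain flags on the assumption
that they follow the sysfs step in the enable path. The function is void,
along the lines Rodrigo suggested, since the sysfs failure is not
propagated.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for driver state; not the real xe structures. */
struct mock_device {
	bool sysfs_created;	/* survivability sysfs entry exists */
	bool wedged;		/* device has been declared wedged */
	bool uevent_sent;	/* userspace was notified via uevent */
};

/* Simulated sysfs creation; fail_sysfs forces the error path. */
static int mock_create_sysfs(struct mock_device *dev, bool fail_sysfs)
{
	if (fail_sysfs)
		return -ENOMEM;

	dev->sysfs_created = true;
	return 0;
}

/*
 * Sketch of the intended behaviour: a sysfs failure is logged, but the
 * function keeps going so the device is still wedged and the remaining
 * indications (uevent, log message) are still produced.
 */
static void mock_runtime_enable(struct mock_device *dev, bool fail_sysfs)
{
	int ret;

	assert(dev != NULL);

	ret = mock_create_sysfs(dev, fail_sysfs);
	if (ret)
		fprintf(stderr, "Failed to create survivability mode sysfs (%d)\n", ret);

	/* No early return: the critical error still has to stop device use. */
	dev->wedged = true;
	dev->uevent_sent = true;
	fprintf(stderr, "Runtime Survivability mode enabled\n");
}
```

With an early return on the sysfs error, wedged and uevent_sent would both
stay false in this sketch, which is the loss of indications described above.
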
Yeah, the error message in the CSC worker can be removed. Will RB [1].
Thanks
Riana
> I've done[1] it already, but perhaps a separate fix is also harmless.
>
> [1] https://lore.kernel.org/intel-xe/20260402174229.1062874-4-raag.jadav@intel.com/
>
> Raag
>
>>> survivability->type = XE_SURVIVABILITY_TYPE_RUNTIME;
>>> dev_err(&pdev->dev, "Runtime Survivability mode enabled\n");
>>> --
>>> 2.34.1
>>>
Thread overview: 7+ messages
2026-04-14 12:44 [PATCH] drm/xe/xe_survivability: Fix runtime survivability error handling Mallesh Koujalagi
2026-04-14 12:54 ` ✓ CI.KUnit: success for " Patchwork
2026-04-14 13:58 ` [PATCH] " Rodrigo Vivi
2026-04-14 14:11 ` Raag Jadav
2026-04-14 14:35 ` Tauro, Riana [this message]
2026-04-14 14:00 ` ✗ Xe.CI.BAT: failure for " Patchwork
2026-04-14 14:55 ` ✗ Xe.CI.FULL: " Patchwork