From: Raag Jadav <raag.jadav@intel.com>
To: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Cc: "Nilawar, Badal" <badal.nilawar@intel.com>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Nikula, Jani" <jani.nikula@intel.com>,
"ville.syrjala@linux.intel.com" <ville.syrjala@linux.intel.com>,
"Roper, Matthew D" <matthew.d.roper@intel.com>,
"Brost, Matthew" <matthew.brost@intel.com>,
"dev@lankhorst.se" <dev@lankhorst.se>,
"Shankar, Uma" <uma.shankar@intel.com>,
"Poosa, Karthik" <karthik.poosa@intel.com>,
"Wajdeczko, Michal" <Michal.Wajdeczko@intel.com>
Subject: Re: [PATCH v2] drm/xe/pm: Handle GT resume failure
Date: Sat, 10 Jan 2026 08:35:29 +0100
Message-ID: <aWIBQTLkpx2D-2n5@black.igk.intel.com>
In-Reply-To: <a8d6db36025d026a07eb7c1de07dfff376b69cc2.camel@intel.com>

On Fri, Jan 09, 2026 at 08:20:12PM +0530, Vivi, Rodrigo wrote:
> On Fri, 2026-01-09 at 14:43 +0100, Raag Jadav wrote:
> > On Fri, Jan 09, 2026 at 04:37:39PM +0530, Nilawar, Badal wrote:
> > > On 09-01-2026 11:28, Raag Jadav wrote:
> > > > On Fri, Jan 09, 2026 at 08:14:05AM +0530, Nilawar, Badal wrote:
> > > > > On 20-12-2025 13:06, Raag Jadav wrote:
> > > > > > We've been historically ignoring GT resume failure. Since the
> > > > > > function
> > > > > > can return error, handle it properly.
> > > > > >
> > > > > > v2: Bring up display before bailing (Matt Roper, Rodrigo)
> > > > > >
> > > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/xe/xe_pm.c | 26 ++++++++++++++++++++++----
> > > > > > 1 file changed, 22 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c
> > > > > > b/drivers/gpu/drm/xe/xe_pm.c
> > > > > > index 4390ba69610d..559cf5490ac0 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > > > @@ -260,10 +260,19 @@ int xe_pm_resume(struct xe_device *xe)
> > > > > > xe_irq_resume(xe);
> > > > > > - for_each_gt(gt, xe, id)
> > > > > > - xe_gt_resume(gt);
> > > > > > + for_each_gt(gt, xe, id) {
> > > > > > + err = xe_gt_resume(gt);
> > > > > > + if (err)
> > > > > > + break;
> > > > > GT and SAMedia are different entities (even if both are treated
> > > > > as GTs in
> > > > > software), should we not continue attempting to resume the
> > > > > remaining GT even
> > > > > if resuming first one fails.
> > > > My limited understanding is that GUI needs render engine, which
> > > > the user
> > > > won't be getting back either way.
> > >
> > > Ok, but how about multi-tile platform like PVC which doesn't have
> > > Render
> > > engine and Media?
> >
> > I'm assuming we don't need display to be able to debug them?
> >
> > Regardless, a semi-working hardware would cause even more problems
> > than
> > solve IMHO but I'll leave to the experts to decide.
>
> What I liked in this Raag's version is that this is the most generic
> best-effort attempt we can get to still bring some display for some
> debug or information instead of only a blank screen or only keep
> moving as if nothing had happened.
>
> A more complete version would be checks per platform and per type of
> GT etc. We can still attempt to try this as a follow up.
>
> For server platforms such as PVC it is a bogus discussion. System
> Suspend is not a valid case there. Period.
>
> For some discrete like BMG where render is there in the main GT,
> but display is fused off, we can simply return the error.

Display helpers already handle fused off cases.

	if (!xe->info.probe_display)
		return;

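To make the ordering concrete, here is a minimal standalone sketch of the flow in the patch, assuming mock types rather than the real driver structures. Everything prefixed with mock_ is hypothetical and exists only for illustration; the only parts taken from the driver are the probe_display early return and the "break on first GT failure, still resume display, then bail" ordering.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures, not the real xe types. */
struct mock_gt {
	int resume_err;		/* error this GT reports on resume, 0 on success */
	bool resumed;
};

struct mock_xe {
	bool probe_display;	/* false when display is fused off */
	bool display_resumed;
	struct mock_gt gt[2];
	size_t gt_count;
};

/* Mirrors the early return quoted above: no display, nothing to do. */
static void mock_display_pm_resume(struct mock_xe *xe)
{
	if (!xe->probe_display)
		return;

	xe->display_resumed = true;
}

static int mock_gt_resume(struct mock_gt *gt)
{
	if (gt->resume_err)
		return gt->resume_err;

	gt->resumed = true;
	return 0;
}

/* Same ordering as the patch: stop at the first GT failure, try display, then bail. */
static int mock_pm_resume(struct mock_xe *xe)
{
	int err = 0;
	size_t i;

	for (i = 0; i < xe->gt_count; i++) {
		err = mock_gt_resume(&xe->gt[i]);
		if (err)
			break;
	}

	/* Best effort: bring up display even if a GT failed to resume. */
	mock_display_pm_resume(xe);

	return err;
}
```

With display fused off the display call is a no-op, so the caller simply gets the GT error back without any extra platform check.
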
> For the case where main GT failed, we still can try to bring display
> up and get some output like Raag's attempt.
>
> For the case where media GT failed we could proceed with the resume,
> but kind of a partial wedge of the media gt...

We can try to introduce some kind of per-GT wedging, but that's pretty
much a driver-wide redesign.
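
Purely as an illustration of the idea, and not something the driver has today, a per-GT wedge could look roughly like the sketch below. The sketch_ types, the wedged flag and the is_media field are all assumptions made up for this example; the real work would be teaching every user of a GT to honour such a flag, which is where the redesign comes in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-GT state; the wedged flag does not exist in this form. */
struct sketch_gt {
	int resume_err;		/* error this GT reports on resume, 0 on success */
	bool is_media;
	bool wedged;
};

static int sketch_gt_resume(struct sketch_gt *gt)
{
	return gt->resume_err;
}

/*
 * Resume every GT instead of stopping at the first failure. A failed GT is
 * marked wedged; only a primary (non-media) GT failure is reported upward.
 */
static int sketch_resume_all(struct sketch_gt *gts, size_t count)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		int err = sketch_gt_resume(&gts[i]);

		if (!err)
			continue;

		gts[i].wedged = true;
		if (!gts[i].is_media && !ret)
			ret = err;
	}

	return ret;
}
```

Here a media GT failure leaves the primary GT usable and the resume succeeds, while a primary GT failure still propagates the error.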

Raag

> > > > > > + }
> > > > > > + /*
> > > > > > + * Try to bring up display before bailing from GT
> > > > > > resume failure,
> > > > > > + * so we don't leave the user clueless with a blank
> > > > > > screen.
> > > > > > + */
> > > > > > xe_display_pm_resume(xe);
> > > > > > + if (err)
> > > > > > + goto err;
> > > > > > err = xe_bo_restore_late(xe);
> > > > > > if (err)
> > > > > > @@ -656,10 +665,19 @@ int xe_pm_runtime_resume(struct
> > > > > > xe_device *xe)
> > > > > > xe_irq_resume(xe);
> > > > > > - for_each_gt(gt, xe, id)
> > > > > > - xe->d3cold.allowed ? xe_gt_resume(gt) :
> > > > > > xe_gt_runtime_resume(gt);
> > > > > > + for_each_gt(gt, xe, id) {
> > > > > > + err = xe->d3cold.allowed ? xe_gt_resume(gt)
> > > > > > : xe_gt_runtime_resume(gt);
> > > > > > + if (err)
> > > > > > + break;
> > > > > > + }
> > > > > > + /*
> > > > > > + * Try to bring up display before bailing from GT
> > > > > > resume failure,
> > > > > > + * so we don't leave the user clueless with a blank
> > > > > > screen.
> > > > > > + */
> > > > > > xe_display_pm_runtime_resume(xe);
> > > > > > + if (err)
> > > > > > + goto out;
> > > > > > if (xe->d3cold.allowed) {
> > > > > > err = xe_bo_restore_late(xe);

Thread overview: 14+ messages
2025-12-20 7:36 [PATCH v2] drm/xe/pm: Handle GT resume failure Raag Jadav
2025-12-21 3:08 ` Vivi, Rodrigo
2026-01-08 11:54 ` Raag Jadav
2026-01-08 20:34 ` Rodrigo Vivi
2025-12-22 10:30 ` ✓ CI.KUnit: success for drm/xe/pm: Handle GT resume failure (rev2) Patchwork
2025-12-22 11:05 ` ✓ Xe.CI.BAT: " Patchwork
2025-12-22 11:49 ` ✓ CI.KUnit: " Patchwork
2025-12-22 12:28 ` ✓ Xe.CI.Full: " Patchwork
2026-01-09 2:44 ` [PATCH v2] drm/xe/pm: Handle GT resume failure Nilawar, Badal
2026-01-09 5:58 ` Raag Jadav
2026-01-09 11:07 ` Nilawar, Badal
2026-01-09 13:43 ` Raag Jadav
2026-01-09 14:50 ` Vivi, Rodrigo
2026-01-10 7:35 ` Raag Jadav [this message]