Intel-XE Archive on lore.kernel.org
From: Raag Jadav <raag.jadav@intel.com>
To: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Cc: "Nilawar, Badal" <badal.nilawar@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Nikula, Jani" <jani.nikula@intel.com>,
	"ville.syrjala@linux.intel.com" <ville.syrjala@linux.intel.com>,
	"Roper, Matthew D" <matthew.d.roper@intel.com>,
	"Brost, Matthew" <matthew.brost@intel.com>,
	"dev@lankhorst.se" <dev@lankhorst.se>,
	"Shankar, Uma" <uma.shankar@intel.com>,
	"Poosa, Karthik" <karthik.poosa@intel.com>,
	"Wajdeczko, Michal" <Michal.Wajdeczko@intel.com>
Subject: Re: [PATCH v2] drm/xe/pm: Handle GT resume failure
Date: Sat, 10 Jan 2026 08:35:29 +0100
Message-ID: <aWIBQTLkpx2D-2n5@black.igk.intel.com>
In-Reply-To: <a8d6db36025d026a07eb7c1de07dfff376b69cc2.camel@intel.com>

On Fri, Jan 09, 2026 at 08:20:12PM +0530, Vivi, Rodrigo wrote:
> On Fri, 2026-01-09 at 14:43 +0100, Raag Jadav wrote:
> > On Fri, Jan 09, 2026 at 04:37:39PM +0530, Nilawar, Badal wrote:
> > > On 09-01-2026 11:28, Raag Jadav wrote:
> > > > On Fri, Jan 09, 2026 at 08:14:05AM +0530, Nilawar, Badal wrote:
> > > > > On 20-12-2025 13:06, Raag Jadav wrote:
> > > > > > > We've been historically ignoring GT resume failure. Since the function
> > > > > > > can return error, handle it properly.
> > > > > > 
> > > > > > v2: Bring up display before bailing (Matt Roper, Rodrigo)
> > > > > > 
> > > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > > > > > ---
> > > > > >    drivers/gpu/drm/xe/xe_pm.c | 26 ++++++++++++++++++++++----
> > > > > >    1 file changed, 22 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > > > > index 4390ba69610d..559cf5490ac0 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > > > @@ -260,10 +260,19 @@ int xe_pm_resume(struct xe_device *xe)
> > > > > >    	xe_irq_resume(xe);
> > > > > > -	for_each_gt(gt, xe, id)
> > > > > > -		xe_gt_resume(gt);
> > > > > > +	for_each_gt(gt, xe, id) {
> > > > > > +		err = xe_gt_resume(gt);
> > > > > > +		if (err)
> > > > > > +			break;
> > > > > > GT and SAMedia are different entities (even if both are treated as GTs in
> > > > > > software). Should we not continue attempting to resume the remaining GTs
> > > > > > even if resuming the first one fails?
> > > > > My limited understanding is that GUI needs render engine, which the user
> > > > > won't be getting back either way.
> > > 
> > > > Ok, but how about a multi-tile platform like PVC, which doesn't have a
> > > > Render engine or Media?
> > 
> > I'm assuming we don't need display to be able to debug them?
> > 
> > > Regardless, semi-working hardware would cause more problems than it
> > > solves IMHO, but I'll leave it to the experts to decide.
> 
> > What I like about Raag's version is that it is the most generic
> > best-effort attempt we can make to still bring up some display for
> > debug or information, instead of showing only a blank screen or
> > carrying on as if nothing had happened.
> 
> A more complete version would be checks per platform and per type of
> GT etc. We can still attempt to try this as a follow up.
> 
> For server platforms such as PVC it is a bogus discussion. System
> Suspend is not a valid case there. Period.
> 
> For some discrete like BMG where render is there in the main GT,
> but display is fused off, we can simply return the error.

Display helpers already handle fused off cases.

	if (!xe->info.probe_display)
		return;
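
To illustrate why the unconditional display call in the patch is safe, here
is a minimal userspace sketch. It is not the driver code: the mock_* types
and functions are hypothetical stand-ins, and only the probe_display guard
and the "break on first GT error, then still try display" ordering are taken
from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real xe structures. */
struct mock_gt {
	int resume_err;		/* error the mock resume reports, 0 on success */
	bool resumed;
};

struct mock_device {
	bool probe_display;	/* false when display is fused off or not probed */
	bool display_resumed;
	struct mock_gt gt[2];
	size_t gt_count;
};

static int mock_gt_resume(struct mock_gt *gt)
{
	if (gt->resume_err)
		return gt->resume_err;

	gt->resumed = true;
	return 0;
}

/* Mirrors the guard quoted above: a no-op when there is no display. */
static void mock_display_pm_resume(struct mock_device *dev)
{
	if (!dev->probe_display)
		return;

	dev->display_resumed = true;
}

static int mock_pm_resume(struct mock_device *dev)
{
	int err = 0;
	size_t i;

	/* Stop at the first GT that fails to resume. */
	for (i = 0; i < dev->gt_count; i++) {
		err = mock_gt_resume(&dev->gt[i]);
		if (err)
			break;
	}

	/* Attempt display even after a GT failure, then report the error. */
	mock_display_pm_resume(dev);

	return err;
}
```

With a failing first GT and probe_display set, mock_pm_resume() returns the
GT error but display is still marked as resumed. With probe_display clear,
the display call does nothing and the caller simply gets the GT result.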

> For the case where main GT failed, we still can try to bring display
> up and get some output like Raag's attempt.
> 
> For the case where media GT failed we could proceed with the resume,
> but kind of a partial wedge of the media gt...

We can try to introduce some kind of per-GT wedging, but that's pretty
much a driver-wide redesign.
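
For comparison, the "keep going" alternative raised earlier in the thread
could look roughly like the sketch below. This only shows the bookkeeping
(try every GT, remember the first error, flag the ones that failed). The
sketch_* names and the wedged flag are hypothetical, nothing like this
exists in the driver today, and it says nothing about how the rest of the
driver would have to treat a flagged GT.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-GT state for illustration; not the real xe_gt. */
struct sketch_gt {
	int resume_err;		/* error the mock resume reports, 0 on success */
	bool resumed;
	bool wedged;		/* assumed per-GT flag, not an existing field */
};

static int sketch_gt_resume(struct sketch_gt *gt)
{
	if (gt->resume_err)
		return gt->resume_err;

	gt->resumed = true;
	return 0;
}

/* Try every GT instead of breaking; return the first error seen. */
static int sketch_resume_all_gts(struct sketch_gt *gts, size_t count)
{
	int first_err = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		int err = sketch_gt_resume(&gts[i]);

		if (err) {
			gts[i].wedged = true;
			if (!first_err)
				first_err = err;
		}
	}

	return first_err;
}
```

The difference from the patch is only in the loop: a failure on one GT no
longer prevents the attempt on the next one, and the caller still sees a
non-zero return if anything failed.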

Raag

> > > > > > +	}
> > > > > > +	/*
> > > > > > > +	 * Try to bring up display before bailing from GT resume failure,
> > > > > > > +	 * so we don't leave the user clueless with a blank screen.
> > > > > > +	 */
> > > > > >    	xe_display_pm_resume(xe);
> > > > > > +	if (err)
> > > > > > +		goto err;
> > > > > >    	err = xe_bo_restore_late(xe);
> > > > > >    	if (err)
> > > > > > @@ -656,10 +665,19 @@ int xe_pm_runtime_resume(struct
> > > > > > xe_device *xe)
> > > > > >    	xe_irq_resume(xe);
> > > > > > -	for_each_gt(gt, xe, id)
> > > > > > > -		xe->d3cold.allowed ? xe_gt_resume(gt) : xe_gt_runtime_resume(gt);
> > > > > > +	for_each_gt(gt, xe, id) {
> > > > > > > +		err = xe->d3cold.allowed ? xe_gt_resume(gt) : xe_gt_runtime_resume(gt);
> > > > > > +		if (err)
> > > > > > +			break;
> > > > > > +	}
> > > > > > +	/*
> > > > > > > +	 * Try to bring up display before bailing from GT resume failure,
> > > > > > > +	 * so we don't leave the user clueless with a blank screen.
> > > > > > +	 */
> > > > > >    	xe_display_pm_runtime_resume(xe);
> > > > > > +	if (err)
> > > > > > +		goto out;
> > > > > >    	if (xe->d3cold.allowed) {
> > > > > >    		err = xe_bo_restore_late(xe);


Thread overview: 14+ messages
2025-12-20  7:36 [PATCH v2] drm/xe/pm: Handle GT resume failure Raag Jadav
2025-12-21  3:08 ` Vivi, Rodrigo
2026-01-08 11:54   ` Raag Jadav
2026-01-08 20:34     ` Rodrigo Vivi
2025-12-22 10:30 ` ✓ CI.KUnit: success for drm/xe/pm: Handle GT resume failure (rev2) Patchwork
2025-12-22 11:05 ` ✓ Xe.CI.BAT: " Patchwork
2025-12-22 11:49 ` ✓ CI.KUnit: " Patchwork
2025-12-22 12:28 ` ✓ Xe.CI.Full: " Patchwork
2026-01-09  2:44 ` [PATCH v2] drm/xe/pm: Handle GT resume failure Nilawar, Badal
2026-01-09  5:58   ` Raag Jadav
2026-01-09 11:07     ` Nilawar, Badal
2026-01-09 13:43       ` Raag Jadav
2026-01-09 14:50         ` Vivi, Rodrigo
2026-01-10  7:35           ` Raag Jadav [this message]
