Intel-XE Archive on lore.kernel.org
From: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
To: "Nilawar, Badal" <badal.nilawar@intel.com>,
	"Jadav, Raag" <raag.jadav@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Nikula, Jani" <jani.nikula@intel.com>,
	"ville.syrjala@linux.intel.com" <ville.syrjala@linux.intel.com>,
	"Roper, Matthew D" <matthew.d.roper@intel.com>,
	"Brost, Matthew" <matthew.brost@intel.com>,
	"dev@lankhorst.se" <dev@lankhorst.se>,
	"Shankar, Uma" <uma.shankar@intel.com>,
	"Poosa, Karthik" <karthik.poosa@intel.com>,
	"Wajdeczko, Michal" <Michal.Wajdeczko@intel.com>
Subject: Re: [PATCH v2] drm/xe/pm: Handle GT resume failure
Date: Fri, 9 Jan 2026 14:50:12 +0000
Message-ID: <a8d6db36025d026a07eb7c1de07dfff376b69cc2.camel@intel.com>
In-Reply-To: <aWEF-MRhF5PiGTkP@black.igk.intel.com>

On Fri, 2026-01-09 at 14:43 +0100, Raag Jadav wrote:
> On Fri, Jan 09, 2026 at 04:37:39PM +0530, Nilawar, Badal wrote:
> > On 09-01-2026 11:28, Raag Jadav wrote:
> > > On Fri, Jan 09, 2026 at 08:14:05AM +0530, Nilawar, Badal wrote:
> > > > On 20-12-2025 13:06, Raag Jadav wrote:
> > > > > > We've been historically ignoring GT resume failure. Since the function
> > > > > > can return error, handle it properly.
> > > > > 
> > > > > v2: Bring up display before bailing (Matt Roper, Rodrigo)
> > > > > 
> > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > > > > ---
> > > > >    drivers/gpu/drm/xe/xe_pm.c | 26 ++++++++++++++++++++++----
> > > > >    1 file changed, 22 insertions(+), 4 deletions(-)
> > > > > 
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > > > index 4390ba69610d..559cf5490ac0 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > > @@ -260,10 +260,19 @@ int xe_pm_resume(struct xe_device *xe)
> > > > >    	xe_irq_resume(xe);
> > > > > -	for_each_gt(gt, xe, id)
> > > > > -		xe_gt_resume(gt);
> > > > > +	for_each_gt(gt, xe, id) {
> > > > > +		err = xe_gt_resume(gt);
> > > > > +		if (err)
> > > > > +			break;
> > > > GT and SAMedia are different entities (even if both are treated as GTs
> > > > in software). Should we not continue attempting to resume the remaining
> > > > GT even if resuming the first one fails?
> > > My limited understanding is that GUI needs render engine, which the
> > > user won't be getting back either way.
> > 
> > Ok, but how about multi-tile platform like PVC which doesn't have
> > Render engine and Media?
> 
> I'm assuming we don't need display to be able to debug them?
> 
> Regardless, semi-working hardware would cause more problems than it
> solves IMHO, but I'll leave it to the experts to decide.

What I like about Raag's version is that it is the most generic
best-effort attempt we can make to still bring up some display for
debug or information, instead of showing only a blank screen or
carrying on as if nothing had happened.
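
As a rough illustration of that ordering (not the driver code), the
sketch below models the v2 flow with hypothetical mock_* types and
functions standing in for xe_gt_resume(), xe_display_pm_resume() and
xe_bo_restore_late(). It assumes only what the quoted patch shows: the
loop stops at the first failing GT, display is brought up regardless,
and the error is returned before the late restore step.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GT; not the driver's struct xe_gt. */
struct mock_gt {
	int resume_err;	/* error the mock resume returns, 0 on success */
	bool resumed;	/* set once the mock resume succeeded */
};

/* Hypothetical stand-in for the device; not struct xe_device. */
struct mock_device {
	struct mock_gt *gts;
	size_t num_gts;
	bool display_up;	/* models xe_display_pm_resume() having run */
	bool bos_restored;	/* models xe_bo_restore_late() having run */
};

static int mock_gt_resume(struct mock_gt *gt)
{
	if (gt->resume_err)
		return gt->resume_err;
	gt->resumed = true;
	return 0;
}

int mock_pm_resume(struct mock_device *dev)
{
	int err = 0;
	size_t i;

	/* Stop at the first GT that fails to resume. */
	for (i = 0; i < dev->num_gts; i++) {
		err = mock_gt_resume(&dev->gts[i]);
		if (err)
			break;
	}

	/* Bring up display even on failure, so the screen is not blank. */
	dev->display_up = true;
	if (err)
		return err;

	/* Only reached when every GT resumed. */
	dev->bos_restored = true;
	return 0;
}
```

With two mock GTs where the first one fails, mock_pm_resume() returns
that error with display_up set, bos_restored clear, and the second GT
never attempted.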

A more complete version would add checks per platform and per GT
type. We can still attempt that as a follow-up.

For server platforms such as PVC it is a bogus discussion. System
Suspend is not a valid case there. Period.

For some discrete parts like BMG, where render is in the main GT
but display is fused off, we can simply return the error.

For the case where the main GT failed, we can still try to bring
display up and get some output, as Raag's patch attempts.

For the case where the media GT failed, we could proceed with the
resume, but with a kind of partial wedge of the media GT...
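
To make the cases above concrete, here is a minimal sketch of how such
a follow-up policy could be expressed as a pure decision function. All
names (gt_kind, resume_action, gt_resume_failure_action) are
hypothetical and do not exist in the driver; the mapping only encodes
the three cases listed above, and server platforms are left out since
system suspend is not a valid case there.

```c
#include <stdbool.h>

/* Hypothetical GT classification; the driver has its own types. */
enum gt_kind {
	GT_MAIN,
	GT_MEDIA,
};

/* Hypothetical outcomes for a GT resume failure. */
enum resume_action {
	RESUME_FAIL,			/* simply return the error */
	RESUME_FAIL_AFTER_DISPLAY,	/* bring display up, then return the error */
	RESUME_CONTINUE_DEGRADED,	/* proceed, treating the media GT as wedged */
};

enum resume_action gt_resume_failure_action(enum gt_kind failed,
					    bool has_display)
{
	/* Media GT failure: resume can proceed with a partial wedge. */
	if (failed == GT_MEDIA)
		return RESUME_CONTINUE_DEGRADED;

	/*
	 * Main GT failure: still try display for some output, unless
	 * display is fused off, in which case just return the error.
	 */
	return has_display ? RESUME_FAIL_AFTER_DISPLAY : RESUME_FAIL;
}
```

Keeping the decision separate from the resume path is only one possible
design; it makes each case easy to check in isolation.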

> 
> Raag
> 
> > > > > +	}
> > > > > +	/*
> > > > > +	 * Try to bring up display before bailing from GT
> > > > > resume failure,
> > > > > +	 * so we don't leave the user clueless with a blank
> > > > > screen.
> > > > > +	 */
> > > > >    	xe_display_pm_resume(xe);
> > > > > +	if (err)
> > > > > +		goto err;
> > > > >    	err = xe_bo_restore_late(xe);
> > > > >    	if (err)
> > > > > @@ -656,10 +665,19 @@ int xe_pm_runtime_resume(struct
> > > > > xe_device *xe)
> > > > >    	xe_irq_resume(xe);
> > > > > -	for_each_gt(gt, xe, id)
> > > > > -		xe->d3cold.allowed ? xe_gt_resume(gt) :
> > > > > xe_gt_runtime_resume(gt);
> > > > > +	for_each_gt(gt, xe, id) {
> > > > > +		err = xe->d3cold.allowed ? xe_gt_resume(gt)
> > > > > : xe_gt_runtime_resume(gt);
> > > > > +		if (err)
> > > > > +			break;
> > > > > +	}
> > > > > +	/*
> > > > > +	 * Try to bring up display before bailing from GT
> > > > > resume failure,
> > > > > +	 * so we don't leave the user clueless with a blank
> > > > > screen.
> > > > > +	 */
> > > > >    	xe_display_pm_runtime_resume(xe);
> > > > > +	if (err)
> > > > > +		goto out;
> > > > >    	if (xe->d3cold.allowed) {
> > > > >    		err = xe_bo_restore_late(xe);
