Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: "Cavitt, Jonathan" <jonathan.cavitt@intel.com>
Cc: "Maslak, Jan" <jan.maslak@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Patelczyk, Maciej" <maciej.patelczyk@intel.com>,
	"joonas.lahtinen@linux.intel.com"
	<joonas.lahtinen@linux.intel.com>
Subject: Re: [PATCH 1/1] drm/xe: Restore engine registers before restarting schedulers after GT reset
Date: Wed, 10 Dec 2025 10:47:20 -0800
Message-ID: <aTnAONCgMwwXlIrb@lstrano-desk.jf.intel.com>
In-Reply-To: <CH0PR11MB54448758C46AF56B5C81C089E5A0A@CH0PR11MB5444.namprd11.prod.outlook.com>

On Wed, Dec 10, 2025 at 11:25:43AM -0700, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Jan Maslak
> Sent: Wednesday, December 10, 2025 6:56 AM
> To: intel-xe@lists.freedesktop.org
> Cc: Patelczyk, Maciej <maciej.patelczyk@intel.com>; joonas.lahtinen@linux.intel.com; Brost, Matthew <matthew.brost@intel.com>; Maslak, Jan <jan.maslak@intel.com>
> Subject: [PATCH 1/1] drm/xe: Restore engine registers before restarting schedulers after GT reset
> > 
> > During GT reset recovery in do_gt_restart(), xe_uc_start() was called
> > before xe_reg_sr_apply_mmio() restored engine-specific registers. This
> > created a race window where the scheduler could run jobs before hardware
> > state was fully restored.
> > 
> > This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-
> > waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE)
> > wasn't restored before jobs started executing. Breakpoints would fail to
> > trigger SIP entry because the debug enable bit wasn't set yet.
> > 
> > Fix by moving xe_uc_start() after all MMIO register restoration,
> > including engine registers and CCS mode configuration, ensuring all
> > hardware state is fully restored before any jobs can be scheduled.
> > 
> > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > Signed-off-by: Jan Maslak <jan.maslak@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gt.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > index 7caf781ba9e8..313ce83ab0e5 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.c
> > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > @@ -771,9 +771,6 @@ static int do_gt_restart(struct xe_gt *gt)
> >  		xe_gt_sriov_pf_init_hw(gt);
> >  
> >  	xe_mocs_init(gt);
> > -	err = xe_uc_start(&gt->uc);
> > -	if (err)
> > -		return err;
> >  
> >  	for_each_hw_engine(hwe, gt, id)
> >  		xe_reg_sr_apply_mmio(&hwe->reg_sr, gt);
> > @@ -781,6 +778,10 @@ static int do_gt_restart(struct xe_gt *gt)
> >  	/* Get CCS mode in sync between sw/hw */
> >  	xe_gt_apply_ccs_mode(gt);
> >  
> > +	err = xe_uc_start(&gt->uc);
> > +	if (err)
> > +		return err;
> > +
> 
> This probably doesn't need to go after xe_gt_sanitize_freq (or, in other
> words, this is probably the correct place to move this to), so:
> 

Yes, xe_gt_sanitize_freq relies on the GuC/CTs being live, so this is
the correct place.

With that:
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> 
> -Jonathan Cavitt
> 
> >  	/* Restore GT freq to expected values */
> >  	xe_gt_sanitize_freq(gt);
> >  
> > -- 
> > 2.34.1
> > 
> > 
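
The ordering the patch arrives at can be sketched outside the driver as
follows. This is a minimal illustration, not the code in xe_gt.c: every
stub_* function, sketch_gt_restart(), record() and step_index() is a
hypothetical stand-in that only logs its name, and the real
do_gt_restart() contains more steps than the diff context shows. The
only point is that the uC start step runs after the engine register and
CCS mode restore, and before the frequency sanitize step that depends
on the GuC/CTs being live.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-ins for the driver calls named in the patch.
 * They do not touch hardware; each one only records that it ran.
 */
#define MAX_STEPS 8

static const char *steps[MAX_STEPS];
static int nsteps;

void record(const char *name)
{
	if (nsteps < MAX_STEPS)
		steps[nsteps++] = name;
}

void stub_mocs_init(void)         { record("mocs_init"); }
void stub_reg_sr_apply_mmio(void) { record("reg_sr_apply_mmio"); }
void stub_apply_ccs_mode(void)    { record("apply_ccs_mode"); }
int  stub_uc_start(void)          { record("uc_start"); return 0; }
void stub_sanitize_freq(void)     { record("sanitize_freq"); }

/* Restart order after the patch, reduced to the steps visible in the diff. */
int sketch_gt_restart(void)
{
	int err;

	stub_mocs_init();

	/* The real driver applies this once per hardware engine. */
	stub_reg_sr_apply_mmio();

	/* Get CCS mode in sync between sw/hw */
	stub_apply_ccs_mode();

	/* Only now may the scheduler start running jobs. */
	err = stub_uc_start();
	if (err)
		return err;

	/* Needs the GuC/CTs to be live, so it stays after the uC start. */
	stub_sanitize_freq();

	return 0;
}

/* Position of a step in the recorded order, or -1 if it never ran. */
int step_index(const char *name)
{
	for (int i = 0; i < nsteps; i++)
		if (strcmp(steps[i], name) == 0)
			return i;
	return -1;
}

/* Print the recorded order, one step per line. */
void print_steps(void)
{
	for (int i = 0; i < nsteps; i++)
		printf("%d: %s\n", i, steps[i]);
}
```

Calling sketch_gt_restart() and then print_steps() from a small main()
lists the steps in the order they ran; comparing step_index() values
gives a quick check that the register restore precedes the uC start.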


Thread overview: 10+ messages
2025-12-10 14:56 [PATCH 0/1] drm/xe: Restore engine registers before restarting schedulers after GT reset Jan Maslak
2025-12-10 14:56 ` [PATCH 1/1] " Jan Maslak
2025-12-10 18:25   ` Cavitt, Jonathan
2025-12-10 18:47     ` Matthew Brost [this message]
2025-12-11 10:30 ` ✓ CI.KUnit: success for drm/xe: Restore engine registers before restarting schedulers after GT reset (rev2) Patchwork
2025-12-11 11:46 ` ✓ Xe.CI.BAT: " Patchwork
2025-12-11 21:19 ` ✗ Xe.CI.Full: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2025-12-05  4:55 [PATCH 0/1] drm/xe: Restore engine registers before restarting schedulers after GT reset Jan Maslak
2025-12-05  4:55 ` [PATCH 1/1] " Jan Maslak
2025-12-05 20:36   ` Matthew Brost
2025-12-05 20:37     ` Matthew Brost
