Intel-XE Archive on lore.kernel.org
From: "Vivekanandan, Balasubramani" <balasubramani.vivekanandan@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH] drm/xe/device: Discard check for lmem_init
Date: Fri, 19 Dec 2025 19:57:40 +0530
Message-ID: <aUVg3EtDPVJ9hnm7@bvivekan-mobl1>
In-Reply-To: <20251217231229.GC1180203@mdroper-desk1.amr.corp.intel.com>

On 17.12.2025 15:12, Matt Roper wrote:
> On Wed, Dec 17, 2025 at 06:21:43PM +0530, Balasubramani Vivekanandan wrote:
> > Prior to lmem init check, driver is waiting for the pcode uncore_init
> > status. uncore_init status will be asserted after the complete boot and
> > initialization of the SoC by the pcode. uncore_init confirms that lmem
> > init and mmio unblock has been already completed.
> > It makes no sense to check for lmem init after the pcode uncore_init
> > check. So it can be removed.
> 
> While I think this should be fine on our current platforms, one thing
> that worries me is that we'll bypass xe_pcode_ready() if we ever have a
> device that sets skip_pcode in xe_pci.c.  No such device exists today,
> but if one shows up in the future it may not be obvious when enabling
> the platform that we'd need to add back the GU_CNTL check (or something
> equivalent).

I agree.

> 
> A couple thoughts:
> 
>  - Maybe we should have an initial patch that drops 'skip_pcode' from
>    xe_device_desc since it's not being used today.  If it becomes
>    necessary in the future, then we can easily re-add it, and the
>    process of doing so may help remind us that we also need to do other
>    checks to make sure the device/lmem is fully initialized and ready to
>    use.

We may not be able to drop skip_pcode, because it is used for SR-IOV and
is going to be used for a future platform. So I am keeping skip_pcode in
my upcoming revision.

> 
>  - Maybe we should replace wait_for_lmem_ready() with an
>    "assert_lmem_ready()" function that will just do a quick sanity check
>    on debug builds.
> 
>         static void assert_lmem_ready(struct xe_device *xe) {
>                 if (!IS_DGFX(xe) || IS_SRIOV_VF(xe))
>                         return;
> 
>                 xe_assert(xe, xe_mmio_read32(xe_root_tile_mmio(xe), GU_CNTL) & LMEM_INIT);
>         }

Thanks, this looks good. I will update.
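
For reference, here is a minimal userspace sketch of the single-read check
suggested above. It is not the driver code: struct mock_xe_device,
MOCK_LMEM_INIT (including its bit position) and both function names are
hypothetical stand-ins for xe_device, LMEM_INIT, IS_DGFX(), IS_SRIOV_VF()
and the GU_CNTL read, and the standard assert() stands in for xe_assert().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for LMEM_INIT; the bit position is assumed. */
#define MOCK_LMEM_INIT (1u << 7)

/* Hypothetical stand-in for the parts of struct xe_device used here. */
struct mock_xe_device {
	bool is_dgfx;      /* models IS_DGFX(xe) */
	bool is_sriov_vf;  /* models IS_SRIOV_VF(xe) */
	uint32_t gu_cntl;  /* models the value read back from GU_CNTL */
};

/*
 * Returns true when the check does not apply (integrated GPU or
 * SR-IOV VF) or when the LMEM_INIT bit is already set.
 */
bool mock_lmem_check_passes(const struct mock_xe_device *xe)
{
	if (!xe->is_dgfx || xe->is_sriov_vf)
		return true;

	return (xe->gu_cntl & MOCK_LMEM_INIT) != 0;
}

/*
 * One read and one assertion, with no polling loop. assert() expands
 * to nothing when NDEBUG is defined, which mirrors the "debug builds
 * only" behaviour described for xe_assert().
 */
void mock_assert_lmem_ready(const struct mock_xe_device *xe)
{
	assert(mock_lmem_check_passes(xe));
	(void)xe; /* avoid an unused-parameter warning under NDEBUG */
}
```

The predicate is kept separate from the assertion only so that the
sketch can be exercised without aborting; the quoted suggestion folds
both into one function.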

Regards,
Bala

> 
>    That eliminates all the looping/polling logic, but still helps make
>    sure we don't miss anything if we ever need to skip the pcode step on
>    a future platform (or if the init flows change and our ordering
>    assumptions are no longer true).  And since it's an xe_assert() it's
>    only active on debug/CI builds and will be compiled out on release
>    builds.
> 
> 
> Matt
> 
> > 
> > Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_device.c | 67 +++-------------------------------
> >  1 file changed, 5 insertions(+), 62 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 1197f914ef77..3818d0cccb0e 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -8,7 +8,6 @@
> >  #include <linux/aperture.h>
> >  #include <linux/delay.h>
> >  #include <linux/fault-inject.h>
> > -#include <linux/iopoll.h>
> >  #include <linux/units.h>
> >  
> >  #include <drm/drm_atomic_helper.h>
> > @@ -630,63 +629,6 @@ static int xe_set_dma_info(struct xe_device *xe)
> >  	return err;
> >  }
> >  
> > -static int lmem_initializing(struct xe_device *xe)
> > -{
> > -	if (xe_mmio_read32(xe_root_tile_mmio(xe), GU_CNTL) & LMEM_INIT)
> > -		return 0;
> > -
> > -	if (signal_pending(current))
> > -		return -EINTR;
> > -
> > -	return 1;
> > -}
> > -
> > -static int wait_for_lmem_ready(struct xe_device *xe)
> > -{
> > -	const unsigned long TIMEOUT_SEC = 60;
> > -	unsigned long prev_jiffies;
> > -	int initializing;
> > -
> > -	if (!IS_DGFX(xe))
> > -		return 0;
> > -
> > -	if (IS_SRIOV_VF(xe))
> > -		return 0;
> > -
> > -	if (!lmem_initializing(xe))
> > -		return 0;
> > -
> > -	drm_dbg(&xe->drm, "Waiting for lmem initialization\n");
> > -	prev_jiffies = jiffies;
> > -
> > -	/*
> > -	 * The boot firmware initializes local memory and
> > -	 * assesses its health. If memory training fails,
> > -	 * the punit will have been instructed to keep the GT powered
> > -	 * down.we won't be able to communicate with it
> > -	 *
> > -	 * If the status check is done before punit updates the register,
> > -	 * it can lead to the system being unusable.
> > -	 * use a timeout and defer the probe to prevent this.
> > -	 */
> > -	poll_timeout_us(initializing = lmem_initializing(xe),
> > -			initializing <= 0,
> > -			20 * USEC_PER_MSEC, TIMEOUT_SEC * USEC_PER_SEC, true);
> > -	if (initializing < 0)
> > -		return initializing;
> > -
> > -	if (initializing) {
> > -		drm_dbg(&xe->drm, "lmem not initialized by firmware\n");
> > -		return -EPROBE_DEFER;
> > -	}
> > -
> > -	drm_dbg(&xe->drm, "lmem ready after %ums",
> > -		jiffies_to_msecs(jiffies - prev_jiffies));
> > -
> > -	return 0;
> > -}
> > -ALLOW_ERROR_INJECTION(wait_for_lmem_ready, ERRNO); /* See xe_pci_probe() */
> > -
> >  static void vf_update_device_info(struct xe_device *xe)
> >  {
> >  	xe_assert(xe, IS_SRIOV_VF(xe));
> > @@ -740,6 +682,11 @@ int xe_device_probe_early(struct xe_device *xe)
> >  	if (IS_SRIOV_VF(xe))
> >  		vf_update_device_info(xe);
> >  
> > +	/*
> > +	 * Check for pcode uncore_init status to confirm if the SoC
> > +	 * initialization is complete. Until done, any MMIO or lmem access from
> > +	 * the driver will be blocked
> > +	 */
> >  	err = xe_pcode_probe_early(xe);
> >  	if (err || xe_survivability_mode_is_requested(xe)) {
> >  		int save_err = err;
> > @@ -756,10 +703,6 @@ int xe_device_probe_early(struct xe_device *xe)
> >  		return save_err;
> >  	}
> >  
> > -	err = wait_for_lmem_ready(xe);
> > -	if (err)
> > -		return err;
> > -
> >  	xe->wedged.mode = xe_modparam.wedged_mode;
> >  
> >  	err = xe_device_vram_alloc(xe);
> > -- 
> > 2.43.0
> > 
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation
