Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Riana Tauro <riana.tauro@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <anshuman.gupta@intel.com>,
	<aravind.iddamsetty@intel.com>, <badal.nilawar@intel.com>,
	<matthew.d.roper@intel.com>
Subject: Re: [PATCH 2/2] RFC drm/xe: re-order lmem init check and wait for initialization to complete
Date: Fri, 8 Mar 2024 09:42:30 -0500	[thread overview]
Message-ID: <Zesj1vgtwk877OhR@intel.com> (raw)
In-Reply-To: <20240308085517.2030484-3-riana.tauro@intel.com>

On Fri, Mar 08, 2024 at 02:25:17PM +0530, Riana Tauro wrote:
> Lmem init check should be done only after pcode initialization
> status is complete. Move lmem init check after pcode status
> check. Also wait for a short while after pcode status check
> to allow completion of the task.
> 
> Failing to do so, can lead to aborting the module load
> leaving the system unusable. Wait until the lmem initialization
> is complete within a timeout (60s) or till the user aborts.
> 
> Signed-off-by: Riana Tauro <riana.tauro@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_device.c | 53 +++++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/xe/xe_mmio.c   | 29 -------------------
>  2 files changed, 52 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 83dd60f68566..4806e7806be5 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -413,12 +413,59 @@ static int xe_set_dma_info(struct xe_device *xe)
>  	return err;
>  }
>  
> +static int verify_lmem_ready(struct xe_gt *gt)

maybe bool?

> +{
> +	return xe_mmio_read32(gt, GU_CNTL) & LMEM_INIT;
> +}
> +
> +static int wait_for_lmem_ready(struct xe_device *xe)
> +{
> +	struct xe_gt *gt = xe_root_mmio_gt(xe);
> +	unsigned long timeout, start;
> +
> +	if (!IS_DGFX(xe))
> +		return 0;
> +
> +	if (IS_SRIOV_VF(xe))
> +		return 0;
> +	/*
> +	 * The boot firmware initializes local memory and assesses its health.
> +	 * If memory training fails, the punit will have been instructed to
> +	 * keep the GT powered down; we won't be able to communicate with it
> +	 * and we should not continue with driver initialization.
> +	 */

the comment is negative in a positive outcome.
I mean, one reading this comment above might conclude that we are returning below
because we won't be able to communicate with the GT.

but the code is right and we need this change. thanks for taking care of it.

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> +	if (verify_lmem_ready(gt))
> +		return 0;
> +
> +	drm_dbg(&xe->drm, "Waiting for lmem initialisation\n");
> +
> +	start = jiffies;
> +	timeout = start + msecs_to_jiffies(60 * 1000); /* 60 sec! */
> +
> +	do {
> +		if (signal_pending(current))
> +			return -EINTR;
> +
> +		if (time_after(jiffies, timeout))
> +			return -EPROBE_DEFER;
> +
> +		msleep(20);
> +
> +	} while (!verify_lmem_ready(gt));
> +
> +	drm_dbg(&xe->drm, "lmem ready after %ums",
> +		jiffies_to_msecs(jiffies - start));
> +
> +	return 0;
> +}
> +
>  /**
>   * xe_device_probe_early: Device early probe
>   * @xe: xe device instance
>   *
>   * Initialize MMIO resources that don't require any
> - * knowledge about tile count. Also initialize pcode
> + * knowledge about tile count. Also initialize pcode and
> + * check vram initialization on root tile.
>   *
>   * Return: 0 on success, error code on failure
>   */
> @@ -438,6 +485,10 @@ int xe_device_probe_early(struct xe_device *xe)
>  	if (err)
>  		return err;
>  
> +	err = wait_for_lmem_ready(xe);
> +	if (err)
> +		return err;
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index 7ba2477452d7..7fc0c5453b21 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -360,30 +360,6 @@ static void mmio_fini(struct drm_device *drm, void *arg)
>  		iounmap(xe->mem.vram.mapping);
>  }
>  
> -static int xe_verify_lmem_ready(struct xe_device *xe)
> -{
> -	struct xe_gt *gt = xe_root_mmio_gt(xe);
> -
> -	if (!IS_DGFX(xe))
> -		return 0;
> -
> -	if (IS_SRIOV_VF(xe))
> -		return 0;
> -
> -	/*
> -	 * The boot firmware initializes local memory and assesses its health.
> -	 * If memory training fails, the punit will have been instructed to
> -	 * keep the GT powered down; we won't be able to communicate with it
> -	 * and we should not continue with driver initialization.
> -	 */
> -	if (!(xe_mmio_read32(gt, GU_CNTL) & LMEM_INIT)) {
> -		drm_err(&xe->drm, "VRAM not initialized by firmware\n");
> -		return -ENODEV;
> -	}
> -
> -	return 0;
> -}
> -
>  int xe_mmio_init(struct xe_device *xe)
>  {
>  	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> @@ -407,16 +383,11 @@ int xe_mmio_init(struct xe_device *xe)
>  int xe_mmio_root_tile_init(struct xe_device *xe)
>  {
>  	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
> -	int err;
>  
>  	/* Setup first tile; other tiles (if present) will be setup later. */
>  	root_tile->mmio.size = SZ_16M;
>  	root_tile->mmio.regs = xe->mmio.regs;
>  
> -	err = xe_verify_lmem_ready(xe);
> -	if (err)
> -		return err;
> -
>  	return 0;
>  }
>  
> -- 
> 2.40.0
> 

  reply	other threads:[~2024-03-08 14:42 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-08  8:55 [PATCH 0/2] Pcode init status and lmem check Riana Tauro
2024-03-08  8:49 ` ✓ CI.Patch_applied: success for " Patchwork
2024-03-08  8:49 ` ✓ CI.checkpatch: " Patchwork
2024-03-08  8:50 ` ✓ CI.KUnit: " Patchwork
2024-03-08  8:55 ` [PATCH 1/2] RFC drm/xe: check pcode init status only on root gt of root tile Riana Tauro
2024-03-08 14:37   ` Rodrigo Vivi
2024-03-11  5:11     ` Riana Tauro
2024-03-11 10:18   ` Ghimiray, Himal Prasad
2024-03-11 10:58     ` Ghimiray, Himal Prasad
2024-03-08  8:55 ` [PATCH 2/2] RFC drm/xe: re-order lmem init check and wait for initialization to complete Riana Tauro
2024-03-08 14:42   ` Rodrigo Vivi [this message]
2024-03-11  5:23     ` Riana Tauro
2024-03-11 14:40     ` Lucas De Marchi
2024-03-11 16:41       ` Rodrigo Vivi
2024-03-11 11:08   ` Ghimiray, Himal Prasad
2024-03-08  9:06 ` ✓ CI.Build: success for Pcode init status and lmem check Patchwork
2024-03-08  9:06 ` ✓ CI.Hooks: " Patchwork
2024-03-08  9:08 ` ✓ CI.checksparse: " Patchwork
2024-03-08  9:37 ` ✗ CI.BAT: failure " Patchwork
2024-03-08 12:59 ` ✓ CI.Patch_applied: success for Pcode init status and lmem check (rev2) Patchwork
2024-03-08 13:00 ` ✓ CI.checkpatch: " Patchwork
2024-03-08 13:01 ` ✓ CI.KUnit: " Patchwork
2024-03-08 13:11 ` ✓ CI.Build: " Patchwork
2024-03-08 13:12 ` ✓ CI.Hooks: " Patchwork
2024-03-08 13:13 ` ✓ CI.checksparse: " Patchwork
2024-03-08 13:45 ` ✓ CI.BAT: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zesj1vgtwk877OhR@intel.com \
    --to=rodrigo.vivi@intel.com \
    --cc=anshuman.gupta@intel.com \
    --cc=aravind.iddamsetty@intel.com \
    --cc=badal.nilawar@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.d.roper@intel.com \
    --cc=riana.tauro@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox