Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Teres Alexis, Alan Previn" <alan.previn.teres.alexis@intel.com>
To: "Harrison, John C" <john.c.harrison@intel.com>,
	"Intel-GFX@Lists.FreeDesktop.Org"
	<Intel-GFX@Lists.FreeDesktop.Org>
Cc: "DRI-Devel@Lists.FreeDesktop.Org" <DRI-Devel@Lists.FreeDesktop.Org>
Subject: Re: [Intel-gfx] [PATCH 2/7] drm/i915/guc: Fix capture size warning and bump the size
Date: Tue, 2 Aug 2022 17:46:09 +0000	[thread overview]
Message-ID: <b3c7738db6403d951527d9a065ef2ba8c1d4c9f3.camel@intel.com> (raw)
In-Reply-To: <20220728022028.2190627-3-John.C.Harrison@Intel.com>


Straight forward change - LGTM.

Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>


On Wed, 2022-07-27 at 19:20 -0700, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> There was a size check to warn if the GuC error state capture buffer
> allocation would be too small to fit a reasonable amount of capture
> data for the current platform. Unfortunately, the test was done too
> early in the boot sequence and was actually testing 'if(-ENODEV >
> size)'.
> 
> Move the check to be later. The check is only used to print a warning
> message, so it doesn't really matter how early or late it is done.
> Note that it is not possible to dynamically size the buffer because
> the allocation needs to be done before the engine information is
> available (at least, it would be in the intended two-phase GuC init
> process).
> 
> Now that the check works, it is reporting size too small for newer
> platforms. The check includes a 3x oversample multiplier to allow for
> multiple error captures to be bufferd by GuC before i915 has a chance
> to read them out. This is less important than simply being big enough
> to fit the first capture.
> 
> So a) bump the default size to be large enough for one capture minimum
> and b) make the warning only if one capture won't fit, instead use a
> notice for the 3x size.
> 
> Note that the size estimate is a worst case scenario. Actual captures
> will likely be smaller.
> 
> Lastly, use drm_warn istead of DRM_WARN as the former provides more
> infmration and the latter is deprecated.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_capture.c    | 40 ++++++++++++++-----
>  .../gpu/drm/i915/gt/uc/intel_guc_capture.h    |  1 -
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    |  4 --
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.h    |  4 +-
>  4 files changed, 31 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> index 75257bd20ff01..b54b7883320b1 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> @@ -600,10 +600,8 @@ intel_guc_capture_getnullheader(struct intel_guc *guc,
>  	return 0;
>  }
>  
> -#define GUC_CAPTURE_OVERBUFFER_MULTIPLIER 3
> -
> -int
> -intel_guc_capture_output_min_size_est(struct intel_guc *guc)
> +static int
> +guc_capture_output_min_size_est(struct intel_guc *guc)
>  {
>  	struct intel_gt *gt = guc_to_gt(guc);
>  	struct intel_engine_cs *engine;
> @@ -623,13 +621,8 @@ intel_guc_capture_output_min_size_est(struct intel_guc *guc)
>  	 * For each engine instance, there would be 1 x guc_state_capture_group_t output
>  	 * followed by 3 x guc_state_capture_t lists. The latter is how the register
>  	 * dumps are split across different register types (where the '3' are global vs class
> -	 * vs instance). Finally, let's multiply the whole thing by 3x (just so we are
> -	 * not limited to just 1 round of data in a worst case full register dump log)
> -	 *
> -	 * NOTE: intel_guc_log that allocates the log buffer would round this size up to
> -	 * a power of two.
> +	 * vs instance).
>  	 */
> -
>  	for_each_engine(engine, gt, id) {
>  		worst_min_size += sizeof(struct guc_state_capture_group_header_t) +
>  					 (3 * sizeof(struct guc_state_capture_header_t));
> @@ -649,7 +642,30 @@ intel_guc_capture_output_min_size_est(struct intel_guc *guc)
>  
>  	worst_min_size += (num_regs * sizeof(struct guc_mmio_reg));
>  
> -	return (worst_min_size * GUC_CAPTURE_OVERBUFFER_MULTIPLIER);
> +	return worst_min_size;
> +}
> +
> +/*
> + * Add on a 3x multiplier to allow for multiple back-to-back captures occurring
> + * before the i915 can read the data out and process it
> + */
> +#define GUC_CAPTURE_OVERBUFFER_MULTIPLIER 3
> +
> +static void check_guc_capture_size(struct intel_guc *guc)
> +{
> +	struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
> +	int min_size = guc_capture_output_min_size_est(guc);
> +	int spare_size = min_size * GUC_CAPTURE_OVERBUFFER_MULTIPLIER;
> +
> +	if (min_size < 0)
> +		drm_warn(&i915->drm, "Failed to calculate GuC error state capture buffer minimum size: %d!\n",
> +			 min_size);
> +	else if (min_size > CAPTURE_BUFFER_SIZE)
> +		drm_warn(&i915->drm, "GuC error state capture buffer is too small: %d < %d\n",
> +			 CAPTURE_BUFFER_SIZE, min_size);
> +	else if (spare_size > CAPTURE_BUFFER_SIZE)
> +		drm_notice(&i915->drm, "GuC error state capture buffer maybe too small: %d < %d (min = %d)\n",
> +			   CAPTURE_BUFFER_SIZE, spare_size, min_size);
>  }
>  
>  /*
> @@ -1580,5 +1596,7 @@ int intel_guc_capture_init(struct intel_guc *guc)
>  	INIT_LIST_HEAD(&guc->capture->outlist);
>  	INIT_LIST_HEAD(&guc->capture->cachelist);
>  
> +	check_guc_capture_size(guc);
> +
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h
> index d3d7bd0b6db64..fbd3713c7832d 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h
> @@ -21,7 +21,6 @@ int intel_guc_capture_print_engine_node(struct drm_i915_error_state_buf *m,
>  void intel_guc_capture_get_matching_node(struct intel_gt *gt, struct intel_engine_coredump *ee,
>  					 struct intel_context *ce);
>  void intel_guc_capture_process(struct intel_guc *guc);
> -int intel_guc_capture_output_min_size_est(struct intel_guc *guc);
>  int intel_guc_capture_getlist(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
>  			      void **outptr);
>  int intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 classid,
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> index 492bbf419d4df..991d4a02248dc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> @@ -487,10 +487,6 @@ int intel_guc_log_create(struct intel_guc_log *log)
>  
>  	GEM_BUG_ON(log->vma);
>  
> -	if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE)
> -		DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n",
> -			 CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc));
> -
>  	guc_log_size = intel_guc_log_size(log);
>  
>  	vma = intel_guc_allocate_vma(guc, guc_log_size);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
> index 18007e639be99..dc9715411d626 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
> @@ -22,11 +22,11 @@ struct intel_guc;
>  #elif defined(CONFIG_DRM_I915_DEBUG_GEM)
>  #define CRASH_BUFFER_SIZE	SZ_1M
>  #define DEBUG_BUFFER_SIZE	SZ_2M
> -#define CAPTURE_BUFFER_SIZE	SZ_1M
> +#define CAPTURE_BUFFER_SIZE	SZ_4M
>  #else
>  #define CRASH_BUFFER_SIZE	SZ_8K
>  #define DEBUG_BUFFER_SIZE	SZ_64K
> -#define CAPTURE_BUFFER_SIZE	SZ_16K
> +#define CAPTURE_BUFFER_SIZE	SZ_2M
>  #endif
>  
>  /*
> -- 
> 2.37.1
> 


  reply	other threads:[~2022-08-02 17:46 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-28  2:20 [Intel-gfx] [PATCH 0/7] Fixes and improvements to GuC logging and error capture John.C.Harrison
2022-07-28  2:20 ` [Intel-gfx] [PATCH 1/7] drm/i915/guc: Add a helper for log buffer size John.C.Harrison
2022-08-02 17:37   ` Teres Alexis, Alan Previn
2022-08-03  0:29     ` John Harrison
2022-07-28  2:20 ` [Intel-gfx] [PATCH 2/7] drm/i915/guc: Fix capture size warning and bump the size John.C.Harrison
2022-08-02 17:46   ` Teres Alexis, Alan Previn [this message]
2022-07-28  2:20 ` [Intel-gfx] [PATCH 3/7] drm/i915/guc: Add GuC <-> kernel time stamp translation information John.C.Harrison
2022-08-05  0:40   ` Teres Alexis, Alan Previn
2022-08-08 18:43     ` John Harrison
2022-08-15  4:55       ` Teres Alexis, Alan Previn
2022-08-19 10:45   ` Jani Nikula
2022-08-19 21:02     ` John Harrison
2022-08-23 10:09       ` Jani Nikula
2022-07-28  2:20 ` [Intel-gfx] [PATCH 4/7] drm/i915/guc: Record CTB info in error logs John.C.Harrison
2022-08-02 18:27   ` Teres Alexis, Alan Previn
2022-08-03  0:20     ` John Harrison
2022-07-28  2:20 ` [Intel-gfx] [PATCH 5/7] drm/i915/guc: Use streaming loads to speed up dumping the guc log John.C.Harrison
2022-08-02 18:48   ` Teres Alexis, Alan Previn
2022-08-03  0:14     ` John Harrison
2022-07-28  2:20 ` [Intel-gfx] [PATCH 6/7] drm/i915/guc: Make GuC log sizes runtime configurable John.C.Harrison
2022-08-15  5:43   ` Teres Alexis, Alan Previn
2022-08-24  9:01   ` Joonas Lahtinen
     [not found]     ` <4bd7b51a-caf0-d987-c7df-6cfb24f36597@intel.com>
2022-08-25  7:15       ` Joonas Lahtinen
2022-08-25 16:31         ` John Harrison
2022-08-26  6:23           ` Joonas Lahtinen
2022-09-12  7:12             ` Joonas Lahtinen
2022-09-12 23:46               ` John Harrison
2022-07-28  2:20 ` [Intel-gfx] [PATCH 7/7] drm/i915/guc: Reduce spam from error capture John.C.Harrison
2022-08-02 18:54   ` Teres Alexis, Alan Previn
2022-07-28  2:37 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Fixes and improvements to GuC logging and " Patchwork
2022-07-28  2:37 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-07-28  2:57 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-07-28  9:31 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2022-08-16  0:53   ` John Harrison

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3c7738db6403d951527d9a065ef2ba8c1d4c9f3.camel@intel.com \
    --to=alan.previn.teres.alexis@intel.com \
    --cc=DRI-Devel@Lists.FreeDesktop.Org \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=john.c.harrison@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox