public inbox for dri-devel@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>,
	<intel-gfx@lists.freedesktop.org>
Cc: "Matthew Brost" <matthew.brost@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	dri-devel@lists.freedesktop.org,
	"John Harrison" <John.C.Harrison@Intel.com>
Subject: Re: [PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset
Date: Tue, 1 Feb 2022 14:42:20 -0800	[thread overview]
Message-ID: <6a5c4a0f-2e71-7dd4-7f15-4e076cbdbe62@intel.com> (raw)
In-Reply-To: <20220126203702.1784589-17-lucas.demarchi@intel.com>



On 1/26/2022 12:36 PM, Lucas De Marchi wrote:
> The ADS initialitazion was using 2 passes to calculate the regset sent
> to GuC to initialize each engine: the first pass to just have the final
> object size and the second to set each register in place in the final
> gem object.
>
> However in order to maintain an ordered set of registers to pass to guc,
> each register needs to be added and moved in the final array. The second
> phase may actually happen in IO memory rather than system memory and
> accessing IO memory by simply dereferencing the pointer doesn't work on
> all architectures. Other places of the ADS initializaition were
> converted to use the dma_buf_map API, but here there may be a lot more
> accesses to IO memory. So, instead of following that same approach,
> convert the regset initialization to calculate the final array in 1
> pass and in the second pass that array is just copied to its final
> location, updating the pointers for each engine written to the ADS blob.
>
> One important thing is that struct temp_regset now have
> different semantics: `registers` continues to track the registers of a
> single engine, however the other fields are updated together, according
> to the newly added `storage`, which tracks the memory allocated for
> all the registers. So rename some of these fields and add a
> __mmio_reg_add(): this function (possibly) allocates memory and operates
> on the storage pointer while guc_mmio_reg_add() continues to manage the
> registers pointer.
>
> On a Tiger Lake system using enable_guc=3, the following log message is
> now seen:
>
> 	[  187.334310] i915 0000:00:02.0: [drm:intel_guc_ads_create [i915]] Used 4 KB for temporary ADS regset
>
> This change has also been tested on an ARM64 host with DG2 and other
> discrete graphics cards.
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc.h     |   7 ++
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 117 +++++++++++++--------
>   2 files changed, 79 insertions(+), 45 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index e2e0df1c3d91..4c852eee3ad8 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -152,6 +152,13 @@ struct intel_guc {
>   	struct dma_buf_map ads_map;
>   	/** @ads_regset_size: size of the save/restore regsets in the ADS */
>   	u32 ads_regset_size;
> +	/**
> +	 * @ads_regset_count: number of save/restore registers in the ADS for
> +	 * each engine
> +	 */
> +	u32 ads_regset_count[I915_NUM_ENGINES];
> +	/** @ads_regset: save/restore regsets in the ADS */
> +	struct guc_mmio_reg *ads_regset;
>   	/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
>   	u32 ads_golden_ctxt_size;
>   	/** @ads_engine_usage_size: size of engine usage in the ADS */
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index 73ca34de44f7..390101ee3661 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -226,14 +226,13 @@ static void guc_mapping_table_init(struct intel_gt *gt,
>   
>   /*
>    * The save/restore register list must be pre-calculated to a temporary
> - * buffer of driver defined size before it can be generated in place
> - * inside the ADS.
> + * buffer before it can be copied inside the ADS.
>    */
> -#define MAX_MMIO_REGS	128	/* Arbitrary size, increase as needed */
>   struct temp_regset {
>   	struct guc_mmio_reg *registers;
> -	u32 used;
> -	u32 size;
> +	struct guc_mmio_reg *storage;

I think this could use a comment to distinguish between registers and 
storage. Something like.:

/* ptr to the base of the allocated storage for all engines */
struct guc_mmio_reg *storage;

/* ptr to the section of the storage for the engine currently being 
worked on */
struct guc_mmio_reg *registers;


> +	u32 storage_used;
> +	u32 storage_max;
>   };
>   
>   static int guc_mmio_reg_cmp(const void *a, const void *b)
> @@ -244,18 +243,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b)
>   	return (int)ra->offset - (int)rb->offset;
>   }
>   
> +static struct guc_mmio_reg * __must_check
> +__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
> +{
> +	u32 pos = regset->storage_used;
> +	struct guc_mmio_reg *slot;
> +
> +	if (pos >= regset->storage_max) {
> +		size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE);
> +		struct guc_mmio_reg *r = krealloc(regset->storage,
> +						  size, GFP_KERNEL);
> +		if (!r) {
> +			WARN_ONCE(1, "Incomplete regset list: can't add register (%d)\n",
> +				  -ENOMEM);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +
> +		regset->registers = r + (regset->registers - regset->storage);
> +		regset->storage = r;
> +		regset->storage_max = size / sizeof(*slot);
> +	}
> +
> +	slot = &regset->storage[pos];
> +	regset->storage_used++;
> +	*slot = *reg;
> +
> +	return slot;
> +}
> +
>   static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
>   					  u32 offset, u32 flags)
>   {
> -	u32 count = regset->used;
> +	u32 count = regset->storage_used - (regset->registers - regset->storage);
>   	struct guc_mmio_reg reg = {
>   		.offset = offset,
>   		.flags = flags,
>   	};
>   	struct guc_mmio_reg *slot;
>   
> -	GEM_BUG_ON(count >= regset->size);
> -
>   	/*
>   	 * The mmio list is built using separate lists within the driver.
>   	 * It's possible that at some point we may attempt to add the same
> @@ -266,9 +291,9 @@ static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
>   		    sizeof(reg), guc_mmio_reg_cmp))
>   		return 0;
>   
> -	slot = &regset->registers[count];
> -	regset->used++;
> -	*slot = reg;
> +	slot = __mmio_reg_add(regset, &reg);
> +	if (IS_ERR(slot))
> +		return PTR_ERR(slot);
>   
>   	while (slot-- > regset->registers) {
>   		GEM_BUG_ON(slot[0].offset == slot[1].offset);
> @@ -295,7 +320,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
>   	unsigned int i;
>   	int ret = 0;
>   
> -	regset->used = 0;
> +	/*
> +	 * Each engine's registers point to a new start relative to
> +	 * storage
> +	 */
> +	regset->registers = regset->storage + regset->storage_used;
>   
>   	ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
>   	ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
> @@ -317,32 +346,28 @@ static int guc_mmio_regset_init(struct temp_regset *regset,
>   	return ret ? -1 : 0;
>   }
>   
> -static int guc_mmio_reg_state_query(struct intel_guc *guc)
> +static long guc_mmio_reg_state_create(struct intel_guc *guc)
>   {
>   	struct intel_gt *gt = guc_to_gt(guc);
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	struct temp_regset temp_set;
> -	u32 total;
> +	struct temp_regset temp_set = {};
> +	long total = 0;
>   
> -	/*
> -	 * Need to actually build the list in order to filter out
> -	 * duplicates and other such data dependent constructions.
> -	 */
> -	temp_set.size = MAX_MMIO_REGS;
> -	temp_set.registers = kmalloc_array(temp_set.size,
> -					   sizeof(*temp_set.registers),
> -					   GFP_KERNEL);
> -	if (!temp_set.registers)
> -		return -ENOMEM;
> -
> -	total = 0;
>   	for_each_engine(engine, gt, id) {
> -		guc_mmio_regset_init(&temp_set, engine);
> -		total += temp_set.used;
> +		u32 used = temp_set.storage_used;
> +
> +		if (guc_mmio_regset_init(&temp_set, engine) < 0)
> +			return -1;

If you fail here you're leaking temp_set.storage.
Also, any reason not to just return the return code from 
guc_mmio_regset_init?

Apart from these minor comments, the change LGTM. IMO we could also 
merge this patch on its own ahead of the rest of the dma_buf code, 
because not having to recreate the regset on every reset/resume is still 
helpful.

Daniele

> +
> +		guc->ads_regset_count[id] = temp_set.storage_used - used;
> +		total += guc->ads_regset_count[id];
>   	}
>   
> -	kfree(temp_set.registers);
> +	guc->ads_regset = temp_set.storage;
> +
> +	drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
> +		(temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
>   
>   	return total * sizeof(struct guc_mmio_reg);
>   }
> @@ -352,40 +377,38 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc,
>   {
>   	struct intel_gt *gt = guc_to_gt(guc);
>   	struct intel_engine_cs *engine;
> +	struct guc_mmio_reg *ads_registers;
>   	enum intel_engine_id id;
> -	struct temp_regset temp_set;
> -	struct guc_mmio_reg_set *ads_reg_set;
>   	u32 addr_ggtt, offset;
> -	u8 guc_class;
>   
>   	offset = guc_ads_regset_offset(guc);
>   	addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
> -	temp_set.registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
> -	temp_set.size = guc->ads_regset_size / sizeof(temp_set.registers[0]);
> +	ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
> +
> +	memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size);
>   
>   	for_each_engine(engine, gt, id) {
> +		u32 count = guc->ads_regset_count[id];
> +		struct guc_mmio_reg_set *ads_reg_set;
> +		u8 guc_class;
> +
>   		/* Class index is checked in class converter */
>   		GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
>   
>   		guc_class = engine_class_to_guc_class(engine->class);
>   		ads_reg_set = &blob->ads.reg_state_list[guc_class][engine->instance];
>   
> -		guc_mmio_regset_init(&temp_set, engine);
> -		if (!temp_set.used) {
> +		if (!count) {
>   			ads_reg_set->address = 0;
>   			ads_reg_set->count = 0;
>   			continue;
>   		}
>   
>   		ads_reg_set->address = addr_ggtt;
> -		ads_reg_set->count = temp_set.used;
> +		ads_reg_set->count = count;
>   
> -		temp_set.size -= temp_set.used;
> -		temp_set.registers += temp_set.used;
> -		addr_ggtt += temp_set.used * sizeof(struct guc_mmio_reg);
> +		addr_ggtt += count * sizeof(struct guc_mmio_reg);
>   	}
> -
> -	GEM_BUG_ON(temp_set.size);
>   }
>   
>   static void fill_engine_enable_masks(struct intel_gt *gt,
> @@ -634,8 +657,11 @@ int intel_guc_ads_create(struct intel_guc *guc)
>   
>   	GEM_BUG_ON(guc->ads_vma);
>   
> -	/* Need to calculate the reg state size dynamically: */
> -	ret = guc_mmio_reg_state_query(guc);
> +	/*
> +	 * Create reg state size dynamically on system memory to be copied to
> +	 * the final ads blob on gt init/reset
> +	 */
> +	ret = guc_mmio_reg_state_create(guc);
>   	if (ret < 0)
>   		return ret;
>   	guc->ads_regset_size = ret;
> @@ -681,6 +707,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc)
>   	i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
>   	guc->ads_blob = NULL;
>   	dma_buf_map_clear(&guc->ads_map);
> +	kfree(guc->ads_regset);
>   }
>   
>   static void guc_ads_private_data_reset(struct intel_guc *guc)


  parent reply	other threads:[~2022-02-01 22:42 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-26 20:36 [PATCH 00/19] drm/i915/guc: Refactor ADS access to use dma_buf_map Lucas De Marchi
2022-01-26 20:36 ` [PATCH 01/19] dma-buf-map: Add read/write helpers Lucas De Marchi
2022-01-27  7:24   ` Christian König
2022-01-27  7:36     ` Matthew Brost
2022-01-27  7:59       ` Christian König
2022-01-27  9:02         ` [Intel-gfx] " Daniel Vetter
2022-01-27 14:26   ` Thomas Zimmermann
2022-01-27 16:34     ` Lucas De Marchi
2022-01-28  8:32       ` Thomas Zimmermann
2022-01-26 20:36 ` [PATCH 02/19] dma-buf-map: Add helper to initialize second map Lucas De Marchi
2022-01-27  7:27   ` Christian König
2022-01-27  7:57     ` Lucas De Marchi
2022-01-27  8:02       ` Christian König
2022-01-27  8:18         ` [Intel-gfx] " Lucas De Marchi
2022-01-27  8:55           ` Christian König
2022-01-27  9:12             ` Lucas De Marchi
2022-01-27  9:21               ` Christian König
2022-01-27  8:57         ` Daniel Vetter
2022-01-27  9:33           ` [Intel-gfx] " Lucas De Marchi
2022-01-27 10:00             ` Daniel Vetter
2022-01-27 10:21               ` Christian König
2022-01-27 11:16                 ` Daniel Vetter
2022-01-27 11:44                   ` [Linaro-mm-sig] " Christian König
2022-01-27 11:56                     ` Daniel Vetter
2022-01-27 16:13                     ` Lucas De Marchi
2022-01-27 14:52                 ` Thomas Zimmermann
2022-01-27 16:12                 ` Lucas De Marchi
2022-01-27 14:33   ` Thomas Zimmermann
2022-01-27 15:59     ` [Intel-gfx] " Lucas De Marchi
2022-01-28  8:15       ` Thomas Zimmermann
2022-01-28  8:34         ` Thomas Zimmermann
2022-01-26 20:36 ` [PATCH 03/19] drm/i915/gt: Add helper for shmem copy to dma_buf_map Lucas De Marchi
2022-01-26 20:36 ` [PATCH 04/19] drm/i915/guc: Keep dma_buf_map of ads_blob around Lucas De Marchi
2022-01-26 20:36 ` [PATCH 05/19] drm/i915/guc: Add read/write helpers for ADS blob Lucas De Marchi
2022-01-26 20:36 ` [PATCH 06/19] drm/i915/guc: Convert golden context init to dma_buf_map Lucas De Marchi
2022-01-26 20:36 ` [PATCH 07/19] drm/i915/guc: Convert policies update " Lucas De Marchi
2022-01-26 20:36 ` [PATCH 08/19] drm/i915/guc: Convert engine record " Lucas De Marchi
2022-01-26 20:36 ` [PATCH 09/19] dma-buf-map: Add wrapper over memset Lucas De Marchi
2022-01-27  7:28   ` Christian König
2022-01-27 14:54   ` Thomas Zimmermann
2022-01-27 15:38     ` [Intel-gfx] " Lucas De Marchi
2022-01-27 15:47       ` Thomas Zimmermann
2022-01-26 20:36 ` [PATCH 10/19] drm/i915/guc: Convert guc_ads_private_data_reset to dma_buf_map Lucas De Marchi
2022-01-26 20:36 ` [PATCH 11/19] drm/i915/guc: Convert golden context prep " Lucas De Marchi
2022-01-26 20:36 ` [PATCH 12/19] drm/i915/guc: Replace check for golden context size Lucas De Marchi
2022-01-26 20:36 ` [PATCH 13/19] drm/i915/guc: Convert mapping table to dma_buf_map Lucas De Marchi
2022-01-26 20:36 ` [PATCH 14/19] drm/i915/guc: Convert capture list " Lucas De Marchi
2022-01-26 20:36 ` [PATCH 15/19] drm/i915/guc: Prepare for error propagation Lucas De Marchi
2022-01-26 20:36 ` [PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset Lucas De Marchi
2022-01-27  0:29   ` kernel test robot
2022-01-27  2:02   ` kernel test robot
2022-01-27  4:37   ` kernel test robot
2022-02-01 22:42   ` Daniele Ceraolo Spurio [this message]
2022-02-03 23:44     ` Lucas De Marchi
2022-01-26 20:37 ` [PATCH 17/19] drm/i915/guc: Convert guc_mmio_reg_state_init to dma_buf_map Lucas De Marchi
2022-01-26 20:37 ` [PATCH 18/19] drm/i915/guc: Convert __guc_ads_init " Lucas De Marchi
2022-01-26 20:37 ` [PATCH 19/19] drm/i915/guc: Remove plain ads_blob pointer Lucas De Marchi
  -- strict thread matches above, loose matches on Subject: below --
2022-02-04 17:44 [PATCH 00/19] drm/i915/guc: Refactor ADS access to use iosys_map Lucas De Marchi
2022-02-04 17:44 ` [PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset Lucas De Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6a5c4a0f-2e71-7dd4-7f15-4e076cbdbe62@intel.com \
    --to=daniele.ceraolospurio@intel.com \
    --cc=John.C.Harrison@Intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=lucas.demarchi@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox