public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/3] Disable automatic load CCS load balancing
@ 2024-03-13 20:19 Andi Shyti
  2024-03-13 20:19 ` [PATCH v6 1/3] drm/i915/gt: Disable HW load balancing for CCS Andi Shyti
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Andi Shyti @ 2024-03-13 20:19 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Chris Wilson, Joonas Lahtinen, Matt Roper, John Harrison, stable,
	Andi Shyti, Andi Shyti, Tvrtko Ursulin

Hi,

this series does basically two things:

1. Disables automatic load balancing as adviced by the hardware
   workaround.

2. Assigns all the CCS slices to one single user engine. The user
   will then be able to query only one CCS engine

From v5 I have created a new file, gt/intel_gt_ccs_mode.c where
I added the intel_gt_apply_ccs_mode(). In the upcoming patches,
this file will contain the implementation for dynamic CCS mode
setting.

Thanks Tvrtko, Matt, John and Joonas for your reviews!

Andi

Changelog
=========
v5 -> v6 (thanks Matt for the suggestions in v6)
 - Remove the refactoring and the for_each_available_engine()
   macro and instead do not create the intel_engine_cs structure
   at all.
 - In patch 1 just a trivial reordering of the bit definitions.

v4 -> v5
 - Use the workaround framework to do all the CCS balancing
   settings in order to always apply the modes also when the
   engine resets. Put everything in its own specific function to
   be executed for the first CCS engine encountered. (Thanks
   Matt)
 - Calculate the CCS ID for the CCS mode as the first available
   CCS among all the engines (Thanks Matt)
 - create the intel_gt_ccs_mode.c function to host the CCS
   configuration. We will have it ready for the next series.
 - Fix a selftest that was failing because could not set CCS2.
 - Add the for_each_available_engine() macro to exclude CCS1+ and
   start using it in the hangcheck selftest.

v3 -> v4
 - Reword correctly the comment in the workaround
 - Fix a buffer overflow (Thanks Joonas)
 - Handle properly the fused engines when setting the CCS mode.

v2 -> v3
 - Simplified the algorithm for creating the list of the exported
   uabi engines. (Patch 1) (Thanks, Tvrtko)
 - Consider the fused engines when creating the uabi engine list
   (Patch 2) (Thanks, Matt)
 - Patch 4 now uses a the refactoring from patch 1, in a cleaner
   outcome.

v1 -> v2
 - In Patch 1 use the correct workaround number (thanks Matt).
 - In Patch 2 do not add the extra CCS engines to the exposed
   UABI engine list and adapt the engine counting accordingly
   (thanks Tvrtko).
 - Reword the commit of Patch 2 (thanks John).

Andi Shyti (3):
  drm/i915/gt: Disable HW load balancing for CCS
  drm/i915/gt: Do not generate the command streamer for all the CCS
  drm/i915/gt: Enable only one CCS for compute workload

 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 20 ++++++++---
 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++++++++++++++--
 6 files changed, 103 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h

-- 
2.43.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 1/3] drm/i915/gt: Disable HW load balancing for CCS
  2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
@ 2024-03-13 20:19 ` Andi Shyti
  2024-03-13 20:19 ` [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS Andi Shyti
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Andi Shyti @ 2024-03-13 20:19 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Chris Wilson, Joonas Lahtinen, Matt Roper, John Harrison, stable,
	Andi Shyti, Andi Shyti, Tvrtko Ursulin

The hardware should not dynamically balance the load between CCS
engines. Wa_14019159160 recommends disabling it across all
platforms.

Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 23 +++++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 50962cfd1353..31b102604e3d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1477,6 +1477,7 @@
 #define   ECOBITS_PPGTT_CACHE4B			(0 << 8)
 
 #define GEN12_RCU_MODE				_MMIO(0x14800)
+#define   XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE	REG_BIT(1)
 #define   GEN12_RCU_MODE_CCS_ENABLE		REG_BIT(0)
 
 #define CHV_FUSE_GT				_MMIO(VLV_GUNIT_BASE + 0x2168)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index b079cbbc1897..9963e5725ae5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -51,7 +51,8 @@
  *   registers belonging to BCS, VCS or VECS should be implemented in
  *   xcs_engine_wa_init(). Workarounds for registers not belonging to a specific
  *   engine's MMIO range but that are part of of the common RCS/CCS reset domain
- *   should be implemented in general_render_compute_wa_init().
+ *   should be implemented in general_render_compute_wa_init(). The settings
+ *   about the CCS load balancing should be added in ccs_engine_wa_mode().
  *
  * - GT workarounds: the list of these WAs is applied whenever these registers
  *   revert to their default values: on GPU reset, suspend/resume [1]_, etc.
@@ -2854,6 +2855,22 @@ add_render_compute_tuning_settings(struct intel_gt *gt,
 		wa_write_clr(wal, GEN8_GARBCNTL, GEN12_BUS_HASH_CTL_BIT_EXC);
 }
 
+static void ccs_engine_wa_mode(struct intel_engine_cs *engine, struct i915_wa_list *wal)
+{
+	struct intel_gt *gt = engine->gt;
+
+	if (!IS_DG2(gt->i915))
+		return;
+
+	/*
+	 * Wa_14019159160: This workaround, along with others, leads to
+	 * significant challenges in utilizing load balancing among the
+	 * CCS slices. Consequently, an architectural decision has been
+	 * made to completely disable automatic CCS load balancing.
+	 */
+	wa_masked_en(wal, GEN12_RCU_MODE, XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE);
+}
+
 /*
  * The workarounds in this function apply to shared registers in
  * the general render reset domain that aren't tied to a
@@ -3000,8 +3017,10 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
 	 * to a single RCS/CCS engine's workaround list since
 	 * they're reset as part of the general render domain reset.
 	 */
-	if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE)
+	if (engine->flags & I915_ENGINE_FIRST_RENDER_COMPUTE) {
 		general_render_compute_wa_init(engine, wal);
+		ccs_engine_wa_mode(engine, wal);
+	}
 
 	if (engine->class == COMPUTE_CLASS)
 		ccs_engine_wa_init(engine, wal);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS
  2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
  2024-03-13 20:19 ` [PATCH v6 1/3] drm/i915/gt: Disable HW load balancing for CCS Andi Shyti
@ 2024-03-13 20:19 ` Andi Shyti
  2024-03-26 16:03   ` Matt Roper
  2024-03-13 20:19 ` [PATCH v6 3/3] drm/i915/gt: Enable only one CCS for compute workload Andi Shyti
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Andi Shyti @ 2024-03-13 20:19 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Chris Wilson, Joonas Lahtinen, Matt Roper, John Harrison, stable,
	Andi Shyti, Andi Shyti, Tvrtko Ursulin

We want a fixed load CCS balancing consisting in all slices
sharing one single user engine. For this reason do not create the
intel_engine_cs structure with its dedicated command streamer for
CCS slices beyond the first.

Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index f553cf4e6449..c4fb31bb6e72 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -966,6 +966,7 @@ int intel_engines_init_mmio(struct intel_gt *gt)
 	const unsigned int engine_mask = init_engine_mask(gt);
 	unsigned int mask = 0;
 	unsigned int i, class;
+	u8 ccs_instance = 0;
 	u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
 	int err;
 
@@ -986,6 +987,19 @@ int intel_engines_init_mmio(struct intel_gt *gt)
 			    !HAS_ENGINE(gt, i))
 				continue;
 
+			/*
+			 * Do not create the command streamer for CCS slices
+			 * beyond the first. All the workload submitted to the
+			 * first engine will be shared among all the slices.
+			 *
+			 * Once the user will be allowed to customize the CCS
+			 * mode, then this check needs to be removed.
+			 */
+			if (IS_DG2(i915) &&
+			    class == COMPUTE_CLASS &&
+			    ccs_instance++)
+				continue;
+
 			err = intel_engine_setup(gt, i,
 						 logical_ids[instance]);
 			if (err)
@@ -996,11 +1010,9 @@ int intel_engines_init_mmio(struct intel_gt *gt)
 	}
 
 	/*
-	 * Catch failures to update intel_engines table when the new engines
-	 * are added to the driver by a warning and disabling the forgotten
-	 * engines.
+	 * Update the intel_engines table.
 	 */
-	if (drm_WARN_ON(&i915->drm, mask != engine_mask))
+	if (mask != engine_mask)
 		gt->info.engine_mask = mask;
 
 	gt->info.num_engines = hweight32(mask);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v6 3/3] drm/i915/gt: Enable only one CCS for compute workload
  2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
  2024-03-13 20:19 ` [PATCH v6 1/3] drm/i915/gt: Disable HW load balancing for CCS Andi Shyti
  2024-03-13 20:19 ` [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS Andi Shyti
@ 2024-03-13 20:19 ` Andi Shyti
  2024-03-26 16:06   ` Matt Roper
  2024-03-20 15:06 ` [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Andi Shyti @ 2024-03-13 20:19 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: Chris Wilson, Joonas Lahtinen, Matt Roper, John Harrison, stable,
	Andi Shyti, Andi Shyti, Tvrtko Ursulin

Enable only one CCS engine by default with all the compute sices
allocated to it.

While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance.

This change can be tested with igt i915_query.

Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
---
 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 +++
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  7 ++++
 5 files changed, 65 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3ef6ed41e62b..a6885a1d41a1 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -118,6 +118,7 @@ gt-y += \
 	gt/intel_ggtt_fencing.o \
 	gt/intel_gt.o \
 	gt/intel_gt_buffer_pool.o \
+	gt/intel_gt_ccs_mode.o \
 	gt/intel_gt_clock_utils.o \
 	gt/intel_gt_debugfs.o \
 	gt/intel_gt_engines_debugfs.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
new file mode 100644
index 000000000000..044219c5960a
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_gt.h"
+#include "intel_gt_ccs_mode.h"
+#include "intel_gt_regs.h"
+
+void intel_gt_apply_ccs_mode(struct intel_gt *gt)
+{
+	int cslice;
+	u32 mode = 0;
+	int first_ccs = __ffs(CCS_MASK(gt));
+
+	if (!IS_DG2(gt->i915))
+		return;
+
+	/* Build the value for the fixed CCS load balancing */
+	for (cslice = 0; cslice < I915_MAX_CCS; cslice++) {
+		if (CCS_MASK(gt) & BIT(cslice))
+			/*
+			 * If available, assign the cslice
+			 * to the first available engine...
+			 */
+			mode |= XEHP_CCS_MODE_CSLICE(cslice, first_ccs);
+
+		else
+			/*
+			 * ... otherwise, mark the cslice as
+			 * unavailable if no CCS dispatches here
+			 */
+			mode |= XEHP_CCS_MODE_CSLICE(cslice,
+						     XEHP_CCS_MODE_CSLICE_MASK);
+	}
+
+	intel_uncore_write(gt->uncore, XEHP_CCS_MODE, mode);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
new file mode 100644
index 000000000000..9e5549caeb26
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#ifndef __INTEL_GT_CCS_MODE_H__
+#define __INTEL_GT_CCS_MODE_H__
+
+struct intel_gt;
+
+void intel_gt_apply_ccs_mode(struct intel_gt *gt);
+
+#endif /* __INTEL_GT_CCS_MODE_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 31b102604e3d..743fe3566722 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1480,6 +1480,11 @@
 #define   XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE	REG_BIT(1)
 #define   GEN12_RCU_MODE_CCS_ENABLE		REG_BIT(0)
 
+#define XEHP_CCS_MODE				_MMIO(0x14804)
+#define   XEHP_CCS_MODE_CSLICE_MASK		REG_GENMASK(2, 0) /* CCS0-3 + rsvd */
+#define   XEHP_CCS_MODE_CSLICE_WIDTH		ilog2(XEHP_CCS_MODE_CSLICE_MASK + 1)
+#define   XEHP_CCS_MODE_CSLICE(cslice, ccs)	(ccs << (cslice * XEHP_CCS_MODE_CSLICE_WIDTH))
+
 #define CHV_FUSE_GT				_MMIO(VLV_GUNIT_BASE + 0x2168)
 #define   CHV_FGT_DISABLE_SS0			(1 << 10)
 #define   CHV_FGT_DISABLE_SS1			(1 << 11)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9963e5725ae5..8188c9f0b5ce 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -10,6 +10,7 @@
 #include "intel_engine_regs.h"
 #include "intel_gpu_commands.h"
 #include "intel_gt.h"
+#include "intel_gt_ccs_mode.h"
 #include "intel_gt_mcr.h"
 #include "intel_gt_print.h"
 #include "intel_gt_regs.h"
@@ -2869,6 +2870,12 @@ static void ccs_engine_wa_mode(struct intel_engine_cs *engine, struct i915_wa_li
 	 * made to completely disable automatic CCS load balancing.
 	 */
 	wa_masked_en(wal, GEN12_RCU_MODE, XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE);
+
+	/*
+	 * After having disabled automatic load balancing we need to
+	 * assign all slices to a single CCS. We will call it CCS mode 1
+	 */
+	intel_gt_apply_ccs_mode(gt);
 }
 
 /*
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/3] Disable automatic load CCS load balancing
  2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
                   ` (2 preceding siblings ...)
  2024-03-13 20:19 ` [PATCH v6 3/3] drm/i915/gt: Enable only one CCS for compute workload Andi Shyti
@ 2024-03-20 15:06 ` Andi Shyti
  2024-03-20 15:40   ` Tvrtko Ursulin
  2024-03-26 16:21 ` Andi Shyti
  2024-03-26 16:24 ` Andi Shyti
  5 siblings, 1 reply; 15+ messages in thread
From: Andi Shyti @ 2024-03-20 15:06 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen, Matt Roper,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

Ping! Any thoughts here?

Andi

On Wed, Mar 13, 2024 at 09:19:48PM +0100, Andi Shyti wrote:
> Hi,
> 
> this series does basically two things:
> 
> 1. Disables automatic load balancing as adviced by the hardware
>    workaround.
> 
> 2. Assigns all the CCS slices to one single user engine. The user
>    will then be able to query only one CCS engine
> 
> >From v5 I have created a new file, gt/intel_gt_ccs_mode.c where
> I added the intel_gt_apply_ccs_mode(). In the upcoming patches,
> this file will contain the implementation for dynamic CCS mode
> setting.
> 
> Thanks Tvrtko, Matt, John and Joonas for your reviews!
> 
> Andi
> 
> Changelog
> =========
> v5 -> v6 (thanks Matt for the suggestions in v6)
>  - Remove the refactoring and the for_each_available_engine()
>    macro and instead do not create the intel_engine_cs structure
>    at all.
>  - In patch 1 just a trivial reordering of the bit definitions.
> 
> v4 -> v5
>  - Use the workaround framework to do all the CCS balancing
>    settings in order to always apply the modes also when the
>    engine resets. Put everything in its own specific function to
>    be executed for the first CCS engine encountered. (Thanks
>    Matt)
>  - Calculate the CCS ID for the CCS mode as the first available
>    CCS among all the engines (Thanks Matt)
>  - create the intel_gt_ccs_mode.c function to host the CCS
>    configuration. We will have it ready for the next series.
>  - Fix a selftest that was failing because could not set CCS2.
>  - Add the for_each_available_engine() macro to exclude CCS1+ and
>    start using it in the hangcheck selftest.
> 
> v3 -> v4
>  - Reword correctly the comment in the workaround
>  - Fix a buffer overflow (Thanks Joonas)
>  - Handle properly the fused engines when setting the CCS mode.
> 
> v2 -> v3
>  - Simplified the algorithm for creating the list of the exported
>    uabi engines. (Patch 1) (Thanks, Tvrtko)
>  - Consider the fused engines when creating the uabi engine list
>    (Patch 2) (Thanks, Matt)
>  - Patch 4 now uses a the refactoring from patch 1, in a cleaner
>    outcome.
> 
> v1 -> v2
>  - In Patch 1 use the correct workaround number (thanks Matt).
>  - In Patch 2 do not add the extra CCS engines to the exposed
>    UABI engine list and adapt the engine counting accordingly
>    (thanks Tvrtko).
>  - Reword the commit of Patch 2 (thanks John).
> 
> Andi Shyti (3):
>   drm/i915/gt: Disable HW load balancing for CCS
>   drm/i915/gt: Do not generate the command streamer for all the CCS
>   drm/i915/gt: Enable only one CCS for compute workload
> 
>  drivers/gpu/drm/i915/Makefile               |  1 +
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 20 ++++++++---
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++++++++++++++--
>  6 files changed, 103 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> 
> -- 
> 2.43.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/3] Disable automatic load CCS load balancing
  2024-03-20 15:06 ` [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
@ 2024-03-20 15:40   ` Tvrtko Ursulin
  2024-03-20 18:02     ` Andi Shyti
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2024-03-20 15:40 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen, Matt Roper,
	John Harrison, stable, Andi Shyti


On 20/03/2024 15:06, Andi Shyti wrote:
> Ping! Any thoughts here?

I only casually observed the discussion after I saw Matt suggested 
further simplifications. As I understood it, you will bring back the 
uabi engine games when adding the dynamic behaviour and that is fine by me.

Regards,

Tvrtko

> On Wed, Mar 13, 2024 at 09:19:48PM +0100, Andi Shyti wrote:
>> Hi,
>>
>> this series does basically two things:
>>
>> 1. Disables automatic load balancing as adviced by the hardware
>>     workaround.
>>
>> 2. Assigns all the CCS slices to one single user engine. The user
>>     will then be able to query only one CCS engine
>>
>> >From v5 I have created a new file, gt/intel_gt_ccs_mode.c where
>> I added the intel_gt_apply_ccs_mode(). In the upcoming patches,
>> this file will contain the implementation for dynamic CCS mode
>> setting.
>>
>> Thanks Tvrtko, Matt, John and Joonas for your reviews!
>>
>> Andi
>>
>> Changelog
>> =========
>> v5 -> v6 (thanks Matt for the suggestions in v6)
>>   - Remove the refactoring and the for_each_available_engine()
>>     macro and instead do not create the intel_engine_cs structure
>>     at all.
>>   - In patch 1 just a trivial reordering of the bit definitions.
>>
>> v4 -> v5
>>   - Use the workaround framework to do all the CCS balancing
>>     settings in order to always apply the modes also when the
>>     engine resets. Put everything in its own specific function to
>>     be executed for the first CCS engine encountered. (Thanks
>>     Matt)
>>   - Calculate the CCS ID for the CCS mode as the first available
>>     CCS among all the engines (Thanks Matt)
>>   - create the intel_gt_ccs_mode.c function to host the CCS
>>     configuration. We will have it ready for the next series.
>>   - Fix a selftest that was failing because could not set CCS2.
>>   - Add the for_each_available_engine() macro to exclude CCS1+ and
>>     start using it in the hangcheck selftest.
>>
>> v3 -> v4
>>   - Reword correctly the comment in the workaround
>>   - Fix a buffer overflow (Thanks Joonas)
>>   - Handle properly the fused engines when setting the CCS mode.
>>
>> v2 -> v3
>>   - Simplified the algorithm for creating the list of the exported
>>     uabi engines. (Patch 1) (Thanks, Tvrtko)
>>   - Consider the fused engines when creating the uabi engine list
>>     (Patch 2) (Thanks, Matt)
>>   - Patch 4 now uses a the refactoring from patch 1, in a cleaner
>>     outcome.
>>
>> v1 -> v2
>>   - In Patch 1 use the correct workaround number (thanks Matt).
>>   - In Patch 2 do not add the extra CCS engines to the exposed
>>     UABI engine list and adapt the engine counting accordingly
>>     (thanks Tvrtko).
>>   - Reword the commit of Patch 2 (thanks John).
>>
>> Andi Shyti (3):
>>    drm/i915/gt: Disable HW load balancing for CCS
>>    drm/i915/gt: Do not generate the command streamer for all the CCS
>>    drm/i915/gt: Enable only one CCS for compute workload
>>
>>   drivers/gpu/drm/i915/Makefile               |  1 +
>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 20 ++++++++---
>>   drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
>>   drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
>>   drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
>>   drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++++++++++++++--
>>   6 files changed, 103 insertions(+), 6 deletions(-)
>>   create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
>>   create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
>>
>> -- 
>> 2.43.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/3] Disable automatic load CCS load balancing
  2024-03-20 15:40   ` Tvrtko Ursulin
@ 2024-03-20 18:02     ` Andi Shyti
  0 siblings, 0 replies; 15+ messages in thread
From: Andi Shyti @ 2024-03-20 18:02 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Andi Shyti, intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	Matt Roper, John Harrison, stable, Andi Shyti

Hi Tvrtko,

On Wed, Mar 20, 2024 at 03:40:18PM +0000, Tvrtko Ursulin wrote:
> On 20/03/2024 15:06, Andi Shyti wrote:
> > Ping! Any thoughts here?
> 
> I only casually observed the discussion after I saw Matt suggested further
> simplifications. As I understood it, you will bring back the uabi engine
> games when adding the dynamic behaviour and that is fine by me.

yes, the refactoring suggested by you will come later.

Thanks,
Andi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS
  2024-03-13 20:19 ` [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS Andi Shyti
@ 2024-03-26 16:03   ` Matt Roper
  2024-03-26 18:42     ` Andi Shyti
  0 siblings, 1 reply; 15+ messages in thread
From: Matt Roper @ 2024-03-26 16:03 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> We want a fixed load CCS balancing consisting in all slices
> sharing one single user engine. For this reason do not create the
> intel_engine_cs structure with its dedicated command streamer for
> CCS slices beyond the first.
> 
> Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: <stable@vger.kernel.org> # v6.2+
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index f553cf4e6449..c4fb31bb6e72 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -966,6 +966,7 @@ int intel_engines_init_mmio(struct intel_gt *gt)
>  	const unsigned int engine_mask = init_engine_mask(gt);
>  	unsigned int mask = 0;
>  	unsigned int i, class;
> +	u8 ccs_instance = 0;
>  	u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
>  	int err;
>  
> @@ -986,6 +987,19 @@ int intel_engines_init_mmio(struct intel_gt *gt)
>  			    !HAS_ENGINE(gt, i))
>  				continue;
>  
> +			/*
> +			 * Do not create the command streamer for CCS slices
> +			 * beyond the first. All the workload submitted to the
> +			 * first engine will be shared among all the slices.
> +			 *
> +			 * Once the user will be allowed to customize the CCS
> +			 * mode, then this check needs to be removed.
> +			 */
> +			if (IS_DG2(i915) &&
> +			    class == COMPUTE_CLASS &&
> +			    ccs_instance++)
> +				continue;

Wouldn't it be more intuitive to drop the non-lowest CCS engines in
init_engine_mask() since that's the function that's dedicated to
building the list of engines we'll use?  Then we don't need to kill the
assertion farther down either.


Matt

> +
>  			err = intel_engine_setup(gt, i,
>  						 logical_ids[instance]);
>  			if (err)
> @@ -996,11 +1010,9 @@ int intel_engines_init_mmio(struct intel_gt *gt)
>  	}
>  
>  	/*
> -	 * Catch failures to update intel_engines table when the new engines
> -	 * are added to the driver by a warning and disabling the forgotten
> -	 * engines.
> +	 * Update the intel_engines table.
>  	 */
> -	if (drm_WARN_ON(&i915->drm, mask != engine_mask))
> +	if (mask != engine_mask)
>  		gt->info.engine_mask = mask;
>  
>  	gt->info.num_engines = hweight32(mask);
> -- 
> 2.43.0
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 3/3] drm/i915/gt: Enable only one CCS for compute workload
  2024-03-13 20:19 ` [PATCH v6 3/3] drm/i915/gt: Enable only one CCS for compute workload Andi Shyti
@ 2024-03-26 16:06   ` Matt Roper
  0 siblings, 0 replies; 15+ messages in thread
From: Matt Roper @ 2024-03-26 16:06 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

On Wed, Mar 13, 2024 at 09:19:51PM +0100, Andi Shyti wrote:
> Enable only one CCS engine by default with all the compute sices
> allocated to it.
> 
> While generating the list of UABI engines to be exposed to the
> user, exclude any additional CCS engines beyond the first
> instance.
> 
> This change can be tested with igt i915_query.
> 
> Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: <stable@vger.kernel.org> # v6.2+

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/Makefile               |  1 +
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  5 +++
>  drivers/gpu/drm/i915/gt/intel_workarounds.c |  7 ++++
>  5 files changed, 65 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 3ef6ed41e62b..a6885a1d41a1 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -118,6 +118,7 @@ gt-y += \
>  	gt/intel_ggtt_fencing.o \
>  	gt/intel_gt.o \
>  	gt/intel_gt_buffer_pool.o \
> +	gt/intel_gt_ccs_mode.o \
>  	gt/intel_gt_clock_utils.o \
>  	gt/intel_gt_debugfs.o \
>  	gt/intel_gt_engines_debugfs.o \
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
> new file mode 100644
> index 000000000000..044219c5960a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_gt.h"
> +#include "intel_gt_ccs_mode.h"
> +#include "intel_gt_regs.h"
> +
> +void intel_gt_apply_ccs_mode(struct intel_gt *gt)
> +{
> +	int cslice;
> +	u32 mode = 0;
> +	int first_ccs = __ffs(CCS_MASK(gt));
> +
> +	if (!IS_DG2(gt->i915))
> +		return;
> +
> +	/* Build the value for the fixed CCS load balancing */
> +	for (cslice = 0; cslice < I915_MAX_CCS; cslice++) {
> +		if (CCS_MASK(gt) & BIT(cslice))
> +			/*
> +			 * If available, assign the cslice
> +			 * to the first available engine...
> +			 */
> +			mode |= XEHP_CCS_MODE_CSLICE(cslice, first_ccs);
> +
> +		else
> +			/*
> +			 * ... otherwise, mark the cslice as
> +			 * unavailable if no CCS dispatches here
> +			 */
> +			mode |= XEHP_CCS_MODE_CSLICE(cslice,
> +						     XEHP_CCS_MODE_CSLICE_MASK);
> +	}
> +
> +	intel_uncore_write(gt->uncore, XEHP_CCS_MODE, mode);
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> new file mode 100644
> index 000000000000..9e5549caeb26
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +#ifndef __INTEL_GT_CCS_MODE_H__
> +#define __INTEL_GT_CCS_MODE_H__
> +
> +struct intel_gt;
> +
> +void intel_gt_apply_ccs_mode(struct intel_gt *gt);
> +
> +#endif /* __INTEL_GT_CCS_MODE_H__ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 31b102604e3d..743fe3566722 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -1480,6 +1480,11 @@
>  #define   XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE	REG_BIT(1)
>  #define   GEN12_RCU_MODE_CCS_ENABLE		REG_BIT(0)
>  
> +#define XEHP_CCS_MODE				_MMIO(0x14804)
> +#define   XEHP_CCS_MODE_CSLICE_MASK		REG_GENMASK(2, 0) /* CCS0-3 + rsvd */
> +#define   XEHP_CCS_MODE_CSLICE_WIDTH		ilog2(XEHP_CCS_MODE_CSLICE_MASK + 1)
> +#define   XEHP_CCS_MODE_CSLICE(cslice, ccs)	(ccs << (cslice * XEHP_CCS_MODE_CSLICE_WIDTH))
> +
>  #define CHV_FUSE_GT				_MMIO(VLV_GUNIT_BASE + 0x2168)
>  #define   CHV_FGT_DISABLE_SS0			(1 << 10)
>  #define   CHV_FGT_DISABLE_SS1			(1 << 11)
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 9963e5725ae5..8188c9f0b5ce 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -10,6 +10,7 @@
>  #include "intel_engine_regs.h"
>  #include "intel_gpu_commands.h"
>  #include "intel_gt.h"
> +#include "intel_gt_ccs_mode.h"
>  #include "intel_gt_mcr.h"
>  #include "intel_gt_print.h"
>  #include "intel_gt_regs.h"
> @@ -2869,6 +2870,12 @@ static void ccs_engine_wa_mode(struct intel_engine_cs *engine, struct i915_wa_li
>  	 * made to completely disable automatic CCS load balancing.
>  	 */
>  	wa_masked_en(wal, GEN12_RCU_MODE, XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE);
> +
> +	/*
> +	 * After having disabled automatic load balancing we need to
> +	 * assign all slices to a single CCS. We will call it CCS mode 1
> +	 */
> +	intel_gt_apply_ccs_mode(gt);
>  }
>  
>  /*
> -- 
> 2.43.0
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/3] Disable automatic load CCS load balancing
  2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
                   ` (3 preceding siblings ...)
  2024-03-20 15:06 ` [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
@ 2024-03-26 16:21 ` Andi Shyti
  2024-03-26 16:24 ` Andi Shyti
  5 siblings, 0 replies; 15+ messages in thread
From: Andi Shyti @ 2024-03-26 16:21 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen, Matt Roper,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

Joonas,

> 1. Disables automatic load balancing as adviced by the hardware
>    workaround.

do we need a documentation update here?

Andi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 0/3] Disable automatic load CCS load balancing
  2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
                   ` (4 preceding siblings ...)
  2024-03-26 16:21 ` Andi Shyti
@ 2024-03-26 16:24 ` Andi Shyti
  2024-03-26 16:25   ` Mrozek, Michal
  5 siblings, 1 reply; 15+ messages in thread
From: Andi Shyti @ 2024-03-26 16:24 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen, Matt Roper,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin, michal.mrozek,
	mark.j.buxton

Hi Michal, Mark,

can you please ack from your side this first batch of changes?

Thanks,
Andi

On Wed, Mar 13, 2024 at 09:19:48PM +0100, Andi Shyti wrote:
> Hi,
> 
> this series does basically two things:
> 
> 1. Disables automatic load balancing as adviced by the hardware
>    workaround.
> 
> 2. Assigns all the CCS slices to one single user engine. The user
>    will then be able to query only one CCS engine
> 
> >From v5 I have created a new file, gt/intel_gt_ccs_mode.c where
> I added the intel_gt_apply_ccs_mode(). In the upcoming patches,
> this file will contain the implementation for dynamic CCS mode
> setting.
> 
> Thanks Tvrtko, Matt, John and Joonas for your reviews!
> 
> Andi
> 
> Changelog
> =========
> v5 -> v6 (thanks Matt for the suggestions in v6)
>  - Remove the refactoring and the for_each_available_engine()
>    macro and instead do not create the intel_engine_cs structure
>    at all.
>  - In patch 1 just a trivial reordering of the bit definitions.
> 
> v4 -> v5
>  - Use the workaround framework to do all the CCS balancing
>    settings in order to always apply the modes also when the
>    engine resets. Put everything in its own specific function to
>    be executed for the first CCS engine encountered. (Thanks
>    Matt)
>  - Calculate the CCS ID for the CCS mode as the first available
>    CCS among all the engines (Thanks Matt)
>  - create the intel_gt_ccs_mode.c function to host the CCS
>    configuration. We will have it ready for the next series.
>  - Fix a selftest that was failing because could not set CCS2.
>  - Add the for_each_available_engine() macro to exclude CCS1+ and
>    start using it in the hangcheck selftest.
> 
> v3 -> v4
>  - Reword correctly the comment in the workaround
>  - Fix a buffer overflow (Thanks Joonas)
>  - Handle properly the fused engines when setting the CCS mode.
> 
> v2 -> v3
>  - Simplified the algorithm for creating the list of the exported
>    uabi engines. (Patch 1) (Thanks, Tvrtko)
>  - Consider the fused engines when creating the uabi engine list
>    (Patch 2) (Thanks, Matt)
>  - Patch 4 now uses a the refactoring from patch 1, in a cleaner
>    outcome.
> 
> v1 -> v2
>  - In Patch 1 use the correct workaround number (thanks Matt).
>  - In Patch 2 do not add the extra CCS engines to the exposed
>    UABI engine list and adapt the engine counting accordingly
>    (thanks Tvrtko).
>  - Reword the commit of Patch 2 (thanks John).
> 
> Andi Shyti (3):
>   drm/i915/gt: Disable HW load balancing for CCS
>   drm/i915/gt: Do not generate the command streamer for all the CCS
>   drm/i915/gt: Enable only one CCS for compute workload
> 
>  drivers/gpu/drm/i915/Makefile               |  1 +
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 20 ++++++++---
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++++++++++++++--
>  6 files changed, 103 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> 
> -- 
> 2.43.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v6 0/3] Disable automatic load CCS load balancing
  2024-03-26 16:24 ` Andi Shyti
@ 2024-03-26 16:25   ` Mrozek, Michal
  0 siblings, 0 replies; 15+ messages in thread
From: Mrozek, Michal @ 2024-03-26 16:25 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	Roper, Matthew D, Harrison, John C, stable@vger.kernel.org,
	Andi Shyti, Tvrtko Ursulin, Buxton, Mark J

On Wed, Mar 13, 2024 at 09:19:48PM +0100, Andi Shyti wrote:
> Hi,
> 
> this series does basically two things:
> 
> 1. Disables automatic load balancing as adviced by the hardware
>    workaround.
> 
> 2. Assigns all the CCS slices to one single user engine. The user
>    will then be able to query only one CCS engine
> 
> >From v5 I have created a new file, gt/intel_gt_ccs_mode.c where
> I added the intel_gt_apply_ccs_mode(). In the upcoming patches, this 
> file will contain the implementation for dynamic CCS mode setting.
> 
> Thanks Tvrtko, Matt, John and Joonas for your reviews!
> 
> Andi
> 
> Changelog
> =========
> v5 -> v6 (thanks Matt for the suggestions in v6)
>  - Remove the refactoring and the for_each_available_engine()
>    macro and instead do not create the intel_engine_cs structure
>    at all.
>  - In patch 1 just a trivial reordering of the bit definitions.
> 
> v4 -> v5
>  - Use the workaround framework to do all the CCS balancing
>    settings in order to always apply the modes also when the
>    engine resets. Put everything in its own specific function to
>    be executed for the first CCS engine encountered. (Thanks
>    Matt)
>  - Calculate the CCS ID for the CCS mode as the first available
>    CCS among all the engines (Thanks Matt)
>  - create the intel_gt_ccs_mode.c function to host the CCS
>    configuration. We will have it ready for the next series.
>  - Fix a selftest that was failing because could not set CCS2.
>  - Add the for_each_available_engine() macro to exclude CCS1+ and
>    start using it in the hangcheck selftest.
> 
> v3 -> v4
>  - Reword correctly the comment in the workaround
>  - Fix a buffer overflow (Thanks Joonas)
>  - Handle properly the fused engines when setting the CCS mode.
> 
> v2 -> v3
>  - Simplified the algorithm for creating the list of the exported
>    uabi engines. (Patch 1) (Thanks, Tvrtko)
>  - Consider the fused engines when creating the uabi engine list
>    (Patch 2) (Thanks, Matt)
>  - Patch 4 now uses a the refactoring from patch 1, in a cleaner
>    outcome.
> 
> v1 -> v2
>  - In Patch 1 use the correct workaround number (thanks Matt).
>  - In Patch 2 do not add the extra CCS engines to the exposed
>    UABI engine list and adapt the engine counting accordingly
>    (thanks Tvrtko).
>  - Reword the commit of Patch 2 (thanks John).
> 
> Andi Shyti (3):
>   drm/i915/gt: Disable HW load balancing for CCS
>   drm/i915/gt: Do not generate the command streamer for all the CCS
>   drm/i915/gt: Enable only one CCS for compute workload
> 
>  drivers/gpu/drm/i915/Makefile               |  1 +
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 20 ++++++++---
>  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 
> +++++++++++++++++++++  drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++++++++++++++--
>  6 files changed, 103 insertions(+), 6 deletions(-)  create mode 
> 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
> 
> --
> 2.43.0

Acked-by: Michal Mrozek <michal.mrozek@intel.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS
  2024-03-26 16:03   ` Matt Roper
@ 2024-03-26 18:42     ` Andi Shyti
  2024-03-26 21:30       ` Matt Roper
  0 siblings, 1 reply; 15+ messages in thread
From: Andi Shyti @ 2024-03-26 18:42 UTC (permalink / raw)
  To: Matt Roper
  Cc: Andi Shyti, intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

Hi Matt,

On Tue, Mar 26, 2024 at 09:03:10AM -0700, Matt Roper wrote:
> On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> > +			/*
> > +			 * Do not create the command streamer for CCS slices
> > +			 * beyond the first. All the workload submitted to the
> > +			 * first engine will be shared among all the slices.
> > +			 *
> > +			 * Once the user will be allowed to customize the CCS
> > +			 * mode, then this check needs to be removed.
> > +			 */
> > +			if (IS_DG2(i915) &&
> > +			    class == COMPUTE_CLASS &&
> > +			    ccs_instance++)
> > +				continue;
> 
> Wouldn't it be more intuitive to drop the non-lowest CCS engines in
> init_engine_mask() since that's the function that's dedicated to
> building the list of engines we'll use?  Then we don't need to kill the
> assertion farther down either.

Because we don't check the result of init_engine_mask() while
creating the engine's structure. We check it only after and
indeed I removed the drm_WARN_ON() check.

I think the whole process of creating the engine's structure in
the intel_engines_init_mmio() can be simplified, but this goes
beyong the scope of the series.

Or am I missing something?

Thanks,
Andi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS
  2024-03-26 18:42     ` Andi Shyti
@ 2024-03-26 21:30       ` Matt Roper
  2024-03-26 23:16         ` Andi Shyti
  0 siblings, 1 reply; 15+ messages in thread
From: Matt Roper @ 2024-03-26 21:30 UTC (permalink / raw)
  To: Andi Shyti
  Cc: intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

On Tue, Mar 26, 2024 at 07:42:34PM +0100, Andi Shyti wrote:
> Hi Matt,
> 
> On Tue, Mar 26, 2024 at 09:03:10AM -0700, Matt Roper wrote:
> > On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> > > +			/*
> > > +			 * Do not create the command streamer for CCS slices
> > > +			 * beyond the first. All the workload submitted to the
> > > +			 * first engine will be shared among all the slices.
> > > +			 *
> > > +			 * Once the user will be allowed to customize the CCS
> > > +			 * mode, then this check needs to be removed.
> > > +			 */
> > > +			if (IS_DG2(i915) &&
> > > +			    class == COMPUTE_CLASS &&
> > > +			    ccs_instance++)
> > > +				continue;
> > 
> > Wouldn't it be more intuitive to drop the non-lowest CCS engines in
> > init_engine_mask() since that's the function that's dedicated to
> > building the list of engines we'll use?  Then we don't need to kill the
> > assertion farther down either.
> 
> Because we don't check the result of init_engine_mask() while
> creating the engine's structure. We check it only after and
> indeed I removed the drm_WARN_ON() check.
> 
> I think the whole process of creating the engine's structure in
> the intel_engines_init_mmio() can be simplified, but this goes
> beyong the scope of the series.
> 
> Or am I missing something?

The important part of init_engine_mask isn't the return value, but
rather that it's what sets up gt->info.engine_mask.  The HAS_ENGINE()
check that intel_engines_init_mmio() uses is based on the value stored
there, so updating that function will also ensure that we skip the
engines we don't want in the loop.


Matt

> 
> Thanks,
> Andi

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS
  2024-03-26 21:30       ` Matt Roper
@ 2024-03-26 23:16         ` Andi Shyti
  0 siblings, 0 replies; 15+ messages in thread
From: Andi Shyti @ 2024-03-26 23:16 UTC (permalink / raw)
  To: Matt Roper
  Cc: Andi Shyti, intel-gfx, dri-devel, Chris Wilson, Joonas Lahtinen,
	John Harrison, stable, Andi Shyti, Tvrtko Ursulin

Hi Matt,

On Tue, Mar 26, 2024 at 02:30:33PM -0700, Matt Roper wrote:
> On Tue, Mar 26, 2024 at 07:42:34PM +0100, Andi Shyti wrote:
> > On Tue, Mar 26, 2024 at 09:03:10AM -0700, Matt Roper wrote:
> > > On Wed, Mar 13, 2024 at 09:19:50PM +0100, Andi Shyti wrote:
> > > > +			/*
> > > > +			 * Do not create the command streamer for CCS slices
> > > > +			 * beyond the first. All the workload submitted to the
> > > > +			 * first engine will be shared among all the slices.
> > > > +			 *
> > > > +			 * Once the user will be allowed to customize the CCS
> > > > +			 * mode, then this check needs to be removed.
> > > > +			 */
> > > > +			if (IS_DG2(i915) &&
> > > > +			    class == COMPUTE_CLASS &&
> > > > +			    ccs_instance++)
> > > > +				continue;
> > > 
> > > Wouldn't it be more intuitive to drop the non-lowest CCS engines in
> > > init_engine_mask() since that's the function that's dedicated to
> > > building the list of engines we'll use?  Then we don't need to kill the
> > > assertion farther down either.
> > 
> > Because we don't check the result of init_engine_mask() while
> > creating the engine's structure. We check it only after and
> > indeed I removed the drm_WARN_ON() check.
> > 
> > I think the whole process of creating the engine's structure in
> > the intel_engines_init_mmio() can be simplified, but this goes
> > beyong the scope of the series.
> > 
> > Or am I missing something?
> 
> The important part of init_engine_mask isn't the return value, but
> rather that it's what sets up gt->info.engine_mask.  The HAS_ENGINE()
> check that intel_engines_init_mmio() uses is based on the value stored
> there, so updating that function will also ensure that we skip the
> engines we don't want in the loop.

Yes, can do like this, as well. After all this is done I'm going
to do some cleanup here, as well.

Thanks,
Andi

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-03-26 23:16 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-13 20:19 [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
2024-03-13 20:19 ` [PATCH v6 1/3] drm/i915/gt: Disable HW load balancing for CCS Andi Shyti
2024-03-13 20:19 ` [PATCH v6 2/3] drm/i915/gt: Do not generate the command streamer for all the CCS Andi Shyti
2024-03-26 16:03   ` Matt Roper
2024-03-26 18:42     ` Andi Shyti
2024-03-26 21:30       ` Matt Roper
2024-03-26 23:16         ` Andi Shyti
2024-03-13 20:19 ` [PATCH v6 3/3] drm/i915/gt: Enable only one CCS for compute workload Andi Shyti
2024-03-26 16:06   ` Matt Roper
2024-03-20 15:06 ` [PATCH v6 0/3] Disable automatic load CCS load balancing Andi Shyti
2024-03-20 15:40   ` Tvrtko Ursulin
2024-03-20 18:02     ` Andi Shyti
2024-03-26 16:21 ` Andi Shyti
2024-03-26 16:24 ` Andi Shyti
2024-03-26 16:25   ` Mrozek, Michal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox