public inbox for intel-gfx@lists.freedesktop.org
* [PATCH v3 0/3] drm/i915: Tune the GPU L3 SQ credits on CHV
@ 2016-05-03 12:54 Imre Deak
  2016-05-03 12:54 ` [PATCH v3 1/3] drm/i915/bdw: Add missing delay during L3 SQC credit programming Imre Deak
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Imre Deak @ 2016-05-03 12:54 UTC (permalink / raw)
  To: intel-gfx

This is v3 of patchset [1]. It addresses comments from Ville and drops
the BXT change, since that was added separately in [2]. The place where
[2] programs the WA is not ideal: it is applied only when an RCS context
is submitted, not for the other engines. My solution was to program the
WA during init_clock_gating(), but that loses the setting across a GPU
reset. The way it's done in [2] is probably still the better solution in
practice.

[1]
https://lists.freedesktop.org/archives/intel-gfx/2016-April/093923.html
[2]
https://lists.freedesktop.org/archives/intel-gfx/2016-April/093480.html

Imre Deak (3):
  drm/i915/bdw: Add missing delay during L3 SQC credit programming
  drm/i915: Clean up L3 SQC register field definitions
  drm/i915/chv: Tune L3 SQC credits based on actual latencies

 drivers/gpu/drm/i915/i915_reg.h         | 10 ++++++--
 drivers/gpu/drm/i915/intel_pm.c         | 41 +++++++++++++++++++++++++--------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  3 ++-
 3 files changed, 42 insertions(+), 12 deletions(-)

-- 
2.5.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH v3 1/3] drm/i915/bdw: Add missing delay during L3 SQC credit programming
  2016-05-03 12:54 [PATCH v3 0/3] drm/i915: Tune the GPU L3 SQ credits on CHV Imre Deak
@ 2016-05-03 12:54 ` Imre Deak
  2016-05-03 12:54 ` [PATCH v3 2/3] drm/i915: Clean up L3 SQC register field definitions Imre Deak
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Imre Deak @ 2016-05-03 12:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: stable

BSpec requires us to wait ~100 clocks before re-enabling clock gating,
so make sure we do this.
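
As a rough illustration of the ordering this patch enforces, here is a
userspace sketch that models the sequence against mock registers. Every
name in it (mock_hw, mock_write(), program_l3sqc(), the bit position of
the gating flag) is a hypothetical stand-in and not the driver's API; it
only shows that the credit write happens with DOP clock gating disabled
and that a delay precedes re-enabling it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two MMIO registers involved. */
enum mock_reg { MOCK_MISCCPCTL, MOCK_L3SQCREG1, MOCK_REG_COUNT };

/* Assumed bit position of the DOP clock gating enable flag. */
#define MOCK_DOP_CLOCK_GATE_ENABLE	(1u << 0)

struct mock_hw {
	uint32_t regs[MOCK_REG_COUNT];
	int gated_writes;	/* L3SQCREG1 writes seen while gating was on */
	int delay_us;		/* total delay requested so far */
	int delay_at_reenable;	/* delay accumulated when gating came back */
};

static uint32_t mock_read(struct mock_hw *hw, enum mock_reg reg)
{
	return hw->regs[reg];
}

static void mock_write(struct mock_hw *hw, enum mock_reg reg, uint32_t val)
{
	uint32_t gating = hw->regs[MOCK_MISCCPCTL] & MOCK_DOP_CLOCK_GATE_ENABLE;

	if (reg == MOCK_L3SQCREG1 && gating)
		hw->gated_writes++;
	if (reg == MOCK_MISCCPCTL && (val & MOCK_DOP_CLOCK_GATE_ENABLE))
		hw->delay_at_reenable = hw->delay_us;
	hw->regs[reg] = val;
}

static void mock_udelay(struct mock_hw *hw, int us)
{
	hw->delay_us += us;	/* only records the request, does not sleep */
}

/* Same order of operations as the patched init_clock_gating() hunk. */
static void program_l3sqc(struct mock_hw *hw, uint32_t val)
{
	uint32_t misccpctl = mock_read(hw, MOCK_MISCCPCTL);

	mock_write(hw, MOCK_MISCCPCTL, misccpctl & ~MOCK_DOP_CLOCK_GATE_ENABLE);
	mock_write(hw, MOCK_L3SQCREG1, val);
	(void)mock_read(hw, MOCK_L3SQCREG1);	/* stands in for the posting read */
	mock_udelay(hw, 1);
	mock_write(hw, MOCK_MISCCPCTL, misccpctl);	/* restore original value */
}
```

Calling program_l3sqc() on a mock whose MISCCPCTL has the gating bit set
leaves that register at its original value, records no gated write to
L3SQCREG1 and records the 1 us delay before gating is restored.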

CC: stable@vger.kernel.org
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2422ac3..227cd2d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6738,6 +6738,12 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
 	I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
+	/*
+	 * Wait at least 100 clocks before re-enabling clock gating. See
+	 * the definition of L3SQCREG1 in BSpec.
+	 */
+	POSTING_READ(GEN8_L3SQCREG1);
+	udelay(1);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 
 	/*
-- 
2.5.0

* [PATCH v3 2/3] drm/i915: Clean up L3 SQC register field definitions
  2016-05-03 12:54 [PATCH v3 0/3] drm/i915: Tune the GPU L3 SQ credits on CHV Imre Deak
  2016-05-03 12:54 ` [PATCH v3 1/3] drm/i915/bdw: Add missing delay during L3 SQC credit programming Imre Deak
@ 2016-05-03 12:54 ` Imre Deak
  2016-05-03 12:54 ` [PATCH v3 3/3] drm/i915/chv: Tune L3 SQC credits based on actual latencies Imre Deak
  2016-05-03 13:18 ` ✗ Fi.CI.BAT: failure for drm/i915: Tune the GPU L3 SQ credits on CHV Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Imre Deak @ 2016-05-03 12:54 UTC (permalink / raw)
  To: intel-gfx

There is no need to hard-code the register value; the corresponding
fields are properly defined in BSpec.

No functional change.
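
The "no functional change" claim can be checked with a small standalone
sketch. The two field macros and the two removed constants are copied
from the diff below; l3sqc_value() and check_no_functional_change() are
hypothetical helpers added only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field macros as introduced by this patch. */
#define L3_GENERAL_PRIO_CREDITS(x)	(((x) >> 1) << 19)
#define L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)

/* Hard-coded constants removed by this patch. */
#define BDW_WA_L3SQCREG1_DEFAULT	0x784000
#define BXT_WA_L3SQCREG1_DEFAULT	0xF84000

/* Hypothetical helper: combine both credit fields into a register value. */
static uint32_t l3sqc_value(uint32_t general, uint32_t high)
{
	return L3_GENERAL_PRIO_CREDITS(general) | L3_HIGH_PRIO_CREDITS(high);
}

/* The new field-based values must equal the old constants. */
static void check_no_functional_change(void)
{
	assert(l3sqc_value(30, 2) == BDW_WA_L3SQCREG1_DEFAULT);
	assert(l3sqc_value(62, 2) == BXT_WA_L3SQCREG1_DEFAULT);
}
```

For BDW, 30 general credits encode as 15 << 19 = 0x780000 and 2 high
priority credits as 1 << 14 = 0x4000, giving 0x784000. For BXT, 62
general credits encode as 31 << 19 = 0xF80000, giving 0xF84000.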

v2:
- Rebased on BXT L3 SQC tuning patch merged meanwhile.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
---
 drivers/gpu/drm/i915/i915_reg.h         | 4 ++--
 drivers/gpu/drm/i915/intel_pm.c         | 3 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 ++-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index fd19f57..543f440 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6091,8 +6091,8 @@ enum skl_disp_power_wells {
 #define  VLV_B0_WA_L3SQCREG1_VALUE		0x00D30000
 
 #define GEN8_L3SQCREG1				_MMIO(0xB100)
-#define  BDW_WA_L3SQCREG1_DEFAULT		0x784000
-#define  BXT_WA_L3SQCREG1_DEFAULT		0xF84000
+#define  L3_GENERAL_PRIO_CREDITS(x)		(((x) >> 1) << 19)
+#define  L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 
 #define GEN7_L3CNTLREG1				_MMIO(0xB01C)
 #define  GEN7_WA_FOR_GEN7_L3_CONTROL			0x3C47FF8C
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 227cd2d..6a48f40 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6737,7 +6737,8 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
 	 */
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-	I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
+	I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(30) |
+				   L3_HIGH_PRIO_CREDITS(2));
 	/*
 	 * Wait at least 100 clocks before re-enabling clock gating. See
 	 * the definition of L3SQCREG1 in BSpec.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 70738a5..8f3eb30 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1182,7 +1182,8 @@ static int bxt_init_workarounds(struct intel_engine_cs *engine)
 
 	/* WaProgramL3SqcReg1DefaultForPerf:bxt */
 	if (IS_BXT_REVID(dev, BXT_REVID_B0, REVID_FOREVER))
-		I915_WRITE(GEN8_L3SQCREG1, BXT_WA_L3SQCREG1_DEFAULT);
+		I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
+					   L3_HIGH_PRIO_CREDITS(2));
 
 	return 0;
 }
-- 
2.5.0

* [PATCH v3 3/3] drm/i915/chv: Tune L3 SQC credits based on actual latencies
  2016-05-03 12:54 [PATCH v3 0/3] drm/i915: Tune the GPU L3 SQ credits on CHV Imre Deak
  2016-05-03 12:54 ` [PATCH v3 1/3] drm/i915/bdw: Add missing delay during L3 SQC credit programming Imre Deak
  2016-05-03 12:54 ` [PATCH v3 2/3] drm/i915: Clean up L3 SQC register field definitions Imre Deak
@ 2016-05-03 12:54 ` Imre Deak
  2016-05-03 13:18 ` ✗ Fi.CI.BAT: failure for drm/i915: Tune the GPU L3 SQ credits on CHV Patchwork
  3 siblings, 0 replies; 6+ messages in thread
From: Imre Deak @ 2016-05-03 12:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

While browsing BSpec I bumped into a note saying we need to tune these
values based on actual measurements done after initial enabling. I've
checked that it indeed improves things on BXT. I haven't checked this on
CHV, but here it is if someone wants to give it a go.

v2:
- Add note about the discrepancy with respect to the spec in the formula
  calculating the credit encodings. (Mika, Ville)
- Move the WA comment to the new function. (Ville)
v3:
- Keep the comment about the SQC WA in the caller. (Ville)
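
For reference, a minimal sketch of what the CHV credit values used by
this patch (38 general, 2 high priority) encode to with the field macros
from i915_reg.h. l3sqc_value() is a hypothetical helper, not driver
code. The sketch also shows a consequence of the `>> 1` in the macros:
the low bit of a credit count is dropped, so an odd count encodes the
same as the even count below it.

```c
#include <assert.h>
#include <stdint.h>

/* Field macros as defined in i915_reg.h by this series. */
#define L3_GENERAL_PRIO_CREDITS(x)	(((x) >> 1) << 19)
#define L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)

/* Hypothetical helper: combine both credit fields into a register value. */
static uint32_t l3sqc_value(uint32_t general, uint32_t high)
{
	return L3_GENERAL_PRIO_CREDITS(general) | L3_HIGH_PRIO_CREDITS(high);
}

/* CHV: 38 >> 1 = 19, 19 << 19 = 0x980000; 2 >> 1 = 1, 1 << 14 = 0x4000. */
static uint32_t chv_l3sqc_value(void)
{
	return l3sqc_value(38, 2);
}
```

With these macros chv_l3sqc_value() evaluates to 0x984000, and
l3sqc_value(39, 3) yields the same value as l3sqc_value(38, 2).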

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h |  6 ++++++
 drivers/gpu/drm/i915/intel_pm.c | 48 +++++++++++++++++++++++++++--------------
 2 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 543f440..54ce0b1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6091,6 +6091,12 @@ enum skl_disp_power_wells {
 #define  VLV_B0_WA_L3SQCREG1_VALUE		0x00D30000
 
 #define GEN8_L3SQCREG1				_MMIO(0xB100)
+/*
+ * Note that on CHV the following has an off-by-one error wrt. BSpec.
+ * Using the formula in BSpec leads to a hang, while the formula here works
+ * fine and matches the formulas for all other platforms. A BSpec change
+ * request has been filed to clarify this.
+ */
 #define  L3_GENERAL_PRIO_CREDITS(x)		(((x) >> 1) << 19)
 #define  L3_HIGH_PRIO_CREDITS(x)		(((x) >> 1) << 14)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 6a48f40..017c431 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6696,11 +6696,33 @@ static void lpt_suspend_hw(struct drm_device *dev)
 	}
 }
 
+static void gen8_set_l3sqc_credits(struct drm_i915_private *dev_priv,
+				   int general_prio_credits,
+				   int high_prio_credits)
+{
+	u32 misccpctl;
+
+	/* WaTempDisableDOPClkGating:bdw */
+	misccpctl = I915_READ(GEN7_MISCCPCTL);
+	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+
+	I915_WRITE(GEN8_L3SQCREG1,
+		   L3_GENERAL_PRIO_CREDITS(general_prio_credits) |
+		   L3_HIGH_PRIO_CREDITS(high_prio_credits));
+
+	/*
+	 * Wait at least 100 clocks before re-enabling clock gating.
+	 * See the definition of L3SQCREG1 in BSpec.
+	 */
+	POSTING_READ(GEN8_L3SQCREG1);
+	udelay(1);
+	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+}
+
 static void broadwell_init_clock_gating(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	enum pipe pipe;
-	uint32_t misccpctl;
 
 	ilk_init_lp_watermarks(dev);
 
@@ -6731,21 +6753,8 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
 
-	/*
-	 * WaProgramL3SqcReg1Default:bdw
-	 * WaTempDisableDOPClkGating:bdw
-	 */
-	misccpctl = I915_READ(GEN7_MISCCPCTL);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-	I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(30) |
-				   L3_HIGH_PRIO_CREDITS(2));
-	/*
-	 * Wait at least 100 clocks before re-enabling clock gating. See
-	 * the definition of L3SQCREG1 in BSpec.
-	 */
-	POSTING_READ(GEN8_L3SQCREG1);
-	udelay(1);
-	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+	/* WaProgramL3SqcReg1Default:bdw */
+	gen8_set_l3sqc_credits(dev_priv, 30, 2);
 
 	/*
 	 * WaGttCachingOffByDefault:bdw
@@ -7016,6 +7025,13 @@ static void cherryview_init_clock_gating(struct drm_device *dev)
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
 
 	/*
+	 * WaProgramL3SqcReg1Default:chv
+	 * See gfxspecs/Related Documents/Performance Guide/
+	 * LSQC Setting Recommendations.
+	 */
+	gen8_set_l3sqc_credits(dev_priv, 38, 2);
+
+	/*
 	 * GTT cache may not work with big pages, so if those
 	 * are ever enabled GTT cache may need to be disabled.
 	 */
-- 
2.5.0

* ✗ Fi.CI.BAT: failure for drm/i915: Tune the GPU L3 SQ credits on CHV
  2016-05-03 12:54 [PATCH v3 0/3] drm/i915: Tune the GPU L3 SQ credits on CHV Imre Deak
                   ` (2 preceding siblings ...)
  2016-05-03 12:54 ` [PATCH v3 3/3] drm/i915/chv: Tune L3 SQC credits based on actual latencies Imre Deak
@ 2016-05-03 13:18 ` Patchwork
  2016-05-03 13:55   ` Imre Deak
  3 siblings, 1 reply; 6+ messages in thread
From: Patchwork @ 2016-05-03 13:18 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Tune the GPU L3 SQ credits on CHV
URL   : https://patchwork.freedesktop.org/series/6663/
State : failure

== Summary ==

Series 6663v1 drm/i915: Tune the GPU L3 SQ credits on CHV
http://patchwork.freedesktop.org/api/1.0/series/6663/revisions/1/mbox/

Test gem_exec_flush:
        Subgroup basic-uc-pro-default-interruptible:
                pass       -> FAIL       (byt-nuc)
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                pass       -> DMESG-WARN (skl-i7k-2)
Test kms_force_connector_basic:
        Subgroup force-load-detect:
                skip       -> PASS       (snb-x220t)
        Subgroup prune-stale-modes:
                skip       -> PASS       (snb-x220t)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                incomplete -> PASS       (hsw-gt2)

bdw-nuci7-2      total:221  pass:209  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:221  pass:196  dwarn:0   dfail:0   fail:0   skip:25 
byt-nuc          total:220  pass:175  dwarn:0   dfail:0   fail:4   skip:41 
hsw-brixbox      total:221  pass:195  dwarn:0   dfail:0   fail:0   skip:26 
hsw-gt2          total:221  pass:199  dwarn:0   dfail:0   fail:1   skip:21 
ilk-hp8440p      total:221  pass:158  dwarn:0   dfail:0   fail:2   skip:61 
ivb-t430s        total:221  pass:190  dwarn:0   dfail:0   fail:0   skip:31 
skl-i7k-2        total:221  pass:193  dwarn:1   dfail:0   fail:0   skip:27 
skl-nuci5        total:221  pass:210  dwarn:0   dfail:0   fail:0   skip:11 
snb-dellxps      total:221  pass:179  dwarn:0   dfail:0   fail:0   skip:42 
snb-x220t        total:221  pass:179  dwarn:0   dfail:0   fail:1   skip:41 

Results at /archive/results/CI_IGT_test/Patchwork_2129/

4c6b0d9cea0a81653fc290fe64d5c43e7d5c5762 drm-intel-nightly: 2016y-05m-03d-08h-18m-32s UTC integration manifest
4e31095 drm/i915/chv: Tune L3 SQC credits based on actual latencies
e0e32f8 drm/i915: Clean up L3 SQC register field definitions
a11eee8 drm/i915/bdw: Add missing delay during L3 SQC credit programming

* Re: ✗ Fi.CI.BAT: failure for drm/i915: Tune the GPU L3 SQ credits on CHV
  2016-05-03 13:18 ` ✗ Fi.CI.BAT: failure for drm/i915: Tune the GPU L3 SQ credits on CHV Patchwork
@ 2016-05-03 13:55   ` Imre Deak
  0 siblings, 0 replies; 6+ messages in thread
From: Imre Deak @ 2016-05-03 13:55 UTC (permalink / raw)
  To: intel-gfx, Ville Syrjälä, Mika Kuoppala

On Tue, 2016-05-03 at 13:18 +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Tune the GPU L3 SQ credits on CHV
> URL   : https://patchwork.freedesktop.org/series/6663/
> State : failure
> 
> == Summary ==
>
> Series 6663v1 drm/i915: Tune the GPU L3 SQ credits on CHV
> http://patchwork.freedesktop.org/api/1.0/series/6663/revisions/1/mbox/
> 
> Test gem_exec_flush:
>         Subgroup basic-uc-pro-default-interruptible:
>                 pass       -> FAIL       (byt-nuc)

Pre-existing issue, assumed to be fixed by Chris' recent clflush fix.

> Test kms_flip:
>         Subgroup basic-flip-vs-modeset:
>                 pass       -> DMESG-WARN (skl-i7k-2)

Unrelated platform, pre-existing issue. Stuck pageflip, vblank timeout
during atomic commit:
https://bugs.freedesktop.org/show_bug.cgi?id=94572
https://bugs.freedesktop.org/show_bug.cgi?id=94993

Thanks for the review, I pushed the patches to -dinq.

> Test kms_force_connector_basic:
>         Subgroup force-load-detect:
>                 skip       -> PASS       (snb-x220t)
>         Subgroup prune-stale-modes:
>                 skip       -> PASS       (snb-x220t)
> Test kms_pipe_crc_basic:
>         Subgroup suspend-read-crc-pipe-c:
>                 incomplete -> PASS       (hsw-gt2)
> 
> bdw-nuci7-2      total:221  pass:209  dwarn:0   dfail:0   fail:0   skip:12
> bdw-ultra        total:221  pass:196  dwarn:0   dfail:0   fail:0   skip:25
> byt-nuc          total:220  pass:175  dwarn:0   dfail:0   fail:4   skip:41
> hsw-brixbox      total:221  pass:195  dwarn:0   dfail:0   fail:0   skip:26
> hsw-gt2          total:221  pass:199  dwarn:0   dfail:0   fail:1   skip:21
> ilk-hp8440p      total:221  pass:158  dwarn:0   dfail:0   fail:2   skip:61
> ivb-t430s        total:221  pass:190  dwarn:0   dfail:0   fail:0   skip:31
> skl-i7k-2        total:221  pass:193  dwarn:1   dfail:0   fail:0   skip:27
> skl-nuci5        total:221  pass:210  dwarn:0   dfail:0   fail:0   skip:11
> snb-dellxps      total:221  pass:179  dwarn:0   dfail:0   fail:0   skip:42
> snb-x220t        total:221  pass:179  dwarn:0   dfail:0   fail:1   skip:41
> 
> Results at /archive/results/CI_IGT_test/Patchwork_2129/
> 
> 4c6b0d9cea0a81653fc290fe64d5c43e7d5c5762 drm-intel-nightly: 2016y-05m-03d-08h-18m-32s UTC integration manifest
> 4e31095 drm/i915/chv: Tune L3 SQC credits based on actual latencies
> e0e32f8 drm/i915: Clean up L3 SQC register field definitions
> a11eee8 drm/i915/bdw: Add missing delay during L3 SQC credit programming
> 