public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
@ 2015-10-07 11:44 Francisco Jerez
  2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
  To: intel-gfx

This programs the L3 configuration based on the sizes given for each
partition as arguments.  The relevant register writes are added to the
workaround list so that they are re-applied to each context while it's
initialized, preventing state leaks from other userspace processes
which may have modified the L3 partitioning from its boot-up state,
since all relevant registers are part of the software and hardware
command checker whitelists.

Some userspace clients (DDX and current versions of Mesa not patched
with my L3 partitioning series [1]) assume that the L3 configuration,
in particular the URB size, comes up in certain state when a context
is created, but nothing in the kernel guarantees this assumption, the
registers that control the partitioning of the L3 cache were being
left untouched.

Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
the latter will be removed in a future commit.

[1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
 drivers/gpu/drm/i915/i915_reg.h         | 13 ++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 80 +++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a7c9e8c..663bc8f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5920,11 +5920,21 @@ enum skl_disp_power_wells {
 # define CHV_HZ_8X8_MODE_IN_1X				(1<<15)
 # define BDW_HIZ_POWER_COMPILER_CLOCK_GATING_DISABLE	(1<<3)
 
+#define GEN8_L3CNTLREG				0x7034
+#define  GEN8_L3CNTLREG_URB_ALLOC(n)		((n) << 1)
+#define  GEN8_L3CNTLREG_RO_ALLOC(n)		((n) << 11)
+#define  GEN8_L3CNTLREG_DC_ALLOC(n)		((n) << 18)
+#define  GEN8_L3CNTLREG_ALL_ALLOC(n)		((n) << 25)
+
 #define GEN9_SLICE_COMMON_ECO_CHICKEN0		0x7308
 #define  DISABLE_PIXEL_MASK_CAMMING		(1<<14)
 
 #define GEN7_L3SQCREG1				0xB010
 #define  VLV_B0_WA_L3SQCREG1_VALUE		0x00D30000
+#define  IVB_L3SQCREG1_SQGHPCI_DEFAULT		0x00730000
+#define  VLV_L3SQCREG1_SQGHPCI_DEFAULT		0x00D30000
+#define  HSW_L3SQCREG1_SQGHPCI_DEFAULT		0x00610000
+#define  GEN7_L3SQCREG1_CONV_DC_UC		(1 << 24)
 
 #define GEN8_L3SQCREG1				0xB100
 #define  BDW_WA_L3SQCREG1_DEFAULT		0x784000
@@ -5933,6 +5943,9 @@ enum skl_disp_power_wells {
 #define  GEN7_WA_FOR_GEN7_L3_CONTROL			0x3C47FF8C
 #define  GEN7_L3AGDIS				(1<<19)
 #define GEN7_L3CNTLREG2				0xB020
+#define  GEN7_L3CNTLREG2_URB_ALLOC(n)		((n) << 1)
+#define  GEN7_L3CNTLREG2_RO_ALLOC(n)		((n) << 14)
+#define  GEN7_L3CNTLREG2_DC_ALLOC(n)		((n) << 21)
 #define GEN7_L3CNTLREG3				0xB024
 
 #define GEN7_L3_CHICKEN_MODE_REGISTER		0xB030
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 9035f8c..54ca344 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -800,6 +800,86 @@ static int wa_add(struct drm_i915_private *dev_priv,
 
 #define WA_WRITE(addr, val) WA_REG(addr, 0xffffffff, val)
 
+/**
+ * init_l3_partitioning_workarounds - Add L3 partitioning set-up to the WA list.
+ *
+ * @ring - Ring to program the L3 partitioning for.
+ * @n_urb - Number of ways to allocate for the URB.
+ * @n_ro - Number of ways to allocate for read-only L3 clients.
+ * @n_dc - Number of ways to allocate for the DC read-write L3 client.
+ * @n_all - Number of ways to allocate for the common pool shared
+ *          among all L3 clients.
+ *
+ * Note that for this to work correctly the L3 cache must be
+ * completely flushed whenever the workaround list is applied to a
+ * context.
+ */
+static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
+					    unsigned int n_urb,
+					    unsigned int n_ro,
+					    unsigned int n_dc,
+					    unsigned int n_all)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (INTEL_INFO(dev)->gen >= 8) {
+		/*
+		 * The ALL partition may not be used simultaneously
+		 * with RO and DC.
+		 */
+		BUG_ON(n_all && (n_ro || n_dc));
+
+		/* Just need to set up the L3 partitioning. */
+		WA_WRITE(GEN8_L3CNTLREG,
+			 GEN8_L3CNTLREG_URB_ALLOC(n_urb) |
+			 GEN8_L3CNTLREG_RO_ALLOC(n_ro) |
+			 GEN8_L3CNTLREG_DC_ALLOC(n_dc) |
+			 GEN8_L3CNTLREG_ALL_ALLOC(n_all));
+
+	} else if (INTEL_INFO(dev)->gen >= 7) {
+		/*
+		 * Offset applied by the hardware to the number of
+		 * ways allocated to the URB, which is also the
+		 * minimum legal URB allocation.
+		 */
+		const unsigned int n0_urb = (IS_VALLEYVIEW(dev) ? 32 : 0);
+		BUG_ON(n_urb < n0_urb);
+
+		/* The ALL partition is not supported on Gen7. */
+		BUG_ON(n_all);
+
+		/*
+		 * Init the L3SQ General and high priority credit
+		 * initialization value to the hardware defaults
+		 * (except for VLV B0 which supposedly defaults to a
+		 * value different to the one we set here), and demote
+		 * the DC to LLC if it has no ways assigned.
+		 *
+		 * WaIncreaseL3CreditsForVLVB0:vlv
+		 */
+		WA_WRITE(GEN7_L3SQCREG1,
+			 (IS_HASWELL(dev) ? HSW_L3SQCREG1_SQGHPCI_DEFAULT :
+			  IS_VALLEYVIEW(dev) ? VLV_L3SQCREG1_SQGHPCI_DEFAULT :
+			  IVB_L3SQCREG1_SQGHPCI_DEFAULT) |
+			 (n_dc ? 0 : GEN7_L3SQCREG1_CONV_DC_UC));
+
+		/* Set up the L3 partitioning. */
+		WA_WRITE(GEN7_L3CNTLREG2,
+			 GEN7_L3CNTLREG2_URB_ALLOC(n_urb - n0_urb) |
+			 GEN7_L3CNTLREG2_RO_ALLOC(n_ro) |
+			 GEN7_L3CNTLREG2_DC_ALLOC(n_dc));
+
+		WA_WRITE(GEN7_L3CNTLREG3, 0);
+
+	} else {
+		/* No L3 on pre-Gen7 hardware. */
+		BUG();
+	}
+
+	return 0;
+}
+
 static int gen8_init_workarounds(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
-- 
2.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty.
  2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
  2015-10-07 14:37   ` Daniel Vetter
  2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
  To: intel-gfx

It's not an error for the workaround list to be empty if no
workarounds are needed.  This will avoid spamming the logs
unnecessarily on Gen6 after the workaround list is hooked up on
pre-Gen8 hardware by the following commits.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 54ca344..3373516 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -717,7 +717,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_workarounds *w = &dev_priv->workarounds;
 
-	if (WARN_ON_ONCE(w->count == 0))
+	if (w->count == 0)
 		return 0;
 
 	ring->gpu_caches_dirty = true;
-- 
2.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7.
  2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
  2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
  2015-10-07 14:38   ` Daniel Vetter
  2015-10-07 11:44 ` [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init Francisco Jerez
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
  To: intel-gfx

intel_rcs_ctx_init() emits all workaround register writes on the list
to the ring, in addition to calling i915_gem_render_state_init().  The
workaround list is currently empty on Gen6-7 so this shouldn't cause
any functional changes.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
This is based on top of Daniel's "drm/i915: Resurrect golden context
on gen6/7". [1]

[1] https://patchwork.freedesktop.org/patch/60870/

 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3373516..e2e40d0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2703,7 +2703,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			GEN8_RING_SEMAPHORE_INIT;
 		}
 	} else if (INTEL_INFO(dev)->gen >= 6) {
-		ring->init_context = i915_gem_render_state_init;
+		ring->init_context = intel_rcs_ctx_init;
 		ring->add_request = gen6_add_request;
 		ring->flush = gen7_render_ring_flush;
 		if (INTEL_INFO(dev)->gen == 6)
-- 
2.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init.
  2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
  2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
  2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
  2015-10-07 11:44 ` [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list Francisco Jerez
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
  To: intel-gfx

Use init_l3_partitioning_workarounds() to set up the L3 partitioning
on context creation according to the hardware boot-up defaults for
each device.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 53 +++++++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e2e40d0..c4c39c4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -880,6 +880,36 @@ static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
 	return 0;
 }
 
+static int ivb_init_workarounds(struct intel_engine_cs *ring)
+{
+	/*
+	 * Set up the default L3 partitioning of the hardware of 32
+	 * ways (i.e. 256 KB on GT2 parts) for the URB and another 32
+	 * ways for the RO partition.
+	 */
+	return init_l3_partitioning_workarounds(ring, 32, 32, 0, 0);
+}
+
+static int vlv_init_workarounds(struct intel_engine_cs *ring)
+{
+	/*
+	 * Set up the default L3 partitioning of the hardware of 64
+	 * ways (i.e. 128 KB) for the URB, 16 ways (i.e. 32 KB) for
+	 * the RO partition and another 16 ways for the DC.
+	 */
+	return init_l3_partitioning_workarounds(ring, 64, 16, 16, 0);
+}
+
+static int hsw_init_workarounds(struct intel_engine_cs *ring)
+{
+	/*
+	 * Set up the default L3 partitioning of the hardware of 32
+	 * ways (i.e. 256 KB on GT2 parts) for the URB and another 32
+	 * ways for the RO partition.
+	 */
+	return init_l3_partitioning_workarounds(ring, 32, 32, 0, 0);
+}
+
 static int gen8_init_workarounds(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
@@ -929,7 +959,12 @@ static int gen8_init_workarounds(struct intel_engine_cs *ring)
 			    GEN6_WIZ_HASHING_MASK,
 			    GEN6_WIZ_HASHING_16x4);
 
-	return 0;
+	/*
+	 * Set up the default L3 partitioning of the hardware of 48
+	 * ways (i.e. 384 KB on GT2 parts, 192 KB on GT1 parts and
+	 * CHV) for the URB and another 48 ways for the RO partition.
+	 */
+	return init_l3_partitioning_workarounds(ring, 48, 0, 0, 48);
 }
 
 static int bdw_init_workarounds(struct intel_engine_cs *ring)
@@ -1053,7 +1088,12 @@ static int gen9_init_workarounds(struct intel_engine_cs *ring)
 	/* WaDisableSTUnitPowerOptimization:skl,bxt */
 	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
 
-	return 0;
+	/*
+	 * Set up the default L3 partitioning of the hardware of 48
+	 * ways (i.e. 384 KB on GT2 parts, 192 KB on GT1 parts and
+	 * BXT) for the URB and another 48 ways for the RO partition.
+	 */
+	return init_l3_partitioning_workarounds(ring, 48, 0, 0, 48);
 }
 
 static int skl_tune_iz_hashing(struct intel_engine_cs *ring)
@@ -1175,6 +1215,15 @@ int init_workarounds_ring(struct intel_engine_cs *ring)
 
 	dev_priv->workarounds.count = 0;
 
+	if (IS_IVYBRIDGE(dev))
+		return ivb_init_workarounds(ring);
+
+	if (IS_VALLEYVIEW(dev))
+		return vlv_init_workarounds(ring);
+
+	if (IS_HASWELL(dev))
+		return hsw_init_workarounds(ring);
+
 	if (IS_BROADWELL(dev))
 		return bdw_init_workarounds(ring);
 
-- 
2.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list.
  2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
                   ` (2 preceding siblings ...)
  2015-10-07 11:44 ` [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
  2015-10-07 11:44 ` [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating Francisco Jerez
  2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
  5 siblings, 0 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
  To: intel-gfx

This makes sure that the workaround is not accidentally undone by some
process (which is possible because the HSW_SCRATCH1 and
HSW_ROW_CHICKEN3 registers are partially whitelisted), what could
cause the next context to be created to hang if it wasn't expecting L3
atomics to be enabled.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
 drivers/gpu/drm/i915/intel_pm.c         |  5 -----
 drivers/gpu/drm/i915/intel_ringbuffer.c | 12 ++++++++++++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 60d120c..d495043 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6585,11 +6585,6 @@ static void haswell_init_clock_gating(struct drm_device *dev)
 
 	ilk_init_lp_watermarks(dev);
 
-	/* L3 caching of data atomics doesn't work -- disable it. */
-	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
-	I915_WRITE(HSW_ROW_CHICKEN3,
-		   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
-
 	/* This is required by WaCatErrorRejectionIssue:hsw */
 	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
 			I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c4c39c4..4cb9f5c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -902,6 +902,18 @@ static int vlv_init_workarounds(struct intel_engine_cs *ring)
 
 static int hsw_init_workarounds(struct intel_engine_cs *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	/*
+	 * L3 caching of data atomics can easily cause a full-system
+	 * hang if not used with extreme care -- Disable it on all
+	 * contexts by default, userspace is free to re-enable them if
+	 * it knows what it's doing.
+	 */
+	WA_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
+	WA_WRITE(HSW_ROW_CHICKEN3,
+		 _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
+
 	/*
 	 * Set up the default L3 partitioning of the hardware of 32
 	 * ways (i.e. 256 KB on GT2 parts) for the URB and another 32
-- 
2.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating.
  2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
                   ` (3 preceding siblings ...)
  2015-10-07 11:44 ` [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
  2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
  5 siblings, 0 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
  To: intel-gfx

The same work-arond is applied per-context in
init_l3_partitioning_workarounds(), so the direct MMIO write of
GEN7_L3SQCREG1 should be redundant now.  Applying the work-around at
context creation time also makes sure that the MMIO writes are not
accidentally undone by userspace.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
 drivers/gpu/drm/i915/i915_reg.h | 1 -
 drivers/gpu/drm/i915/intel_pm.c | 6 ------
 2 files changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 663bc8f..b445c93 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5930,7 +5930,6 @@ enum skl_disp_power_wells {
 #define  DISABLE_PIXEL_MASK_CAMMING		(1<<14)
 
 #define GEN7_L3SQCREG1				0xB010
-#define  VLV_B0_WA_L3SQCREG1_VALUE		0x00D30000
 #define  IVB_L3SQCREG1_SQGHPCI_DEFAULT		0x00730000
 #define  VLV_L3SQCREG1_SQGHPCI_DEFAULT		0x00D30000
 #define  HSW_L3SQCREG1_SQGHPCI_DEFAULT		0x00610000
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d495043..ccb91d3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6810,12 +6810,6 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
 
 	/*
-	 * WaIncreaseL3CreditsForVLVB0:vlv
-	 * This is the hardware default actually.
-	 */
-	I915_WRITE(GEN7_L3SQCREG1, VLV_B0_WA_L3SQCREG1_VALUE);
-
-	/*
 	 * WaDisableVLVClockGating_VBIIssue:vlv
 	 * Disable clock gating on th GCFG unit to prevent a delay
 	 * in the reporting of vblank events.
-- 
2.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
  2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
                   ` (4 preceding siblings ...)
  2015-10-07 11:44 ` [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating Francisco Jerez
@ 2015-10-07 12:42 ` Chris Wilson
  2015-10-07 13:30   ` Francisco Jerez
  5 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2015-10-07 12:42 UTC (permalink / raw)
  To: Francisco Jerez; +Cc: intel-gfx

On Wed, Oct 07, 2015 at 02:44:00PM +0300, Francisco Jerez wrote:
> This programs the L3 configuration based on the sizes given for each
> partition as arguments.  The relevant register writes are added to the
> workaround list so that they are re-applied to each context while it's
> initialized, preventing state leaks from other userspace processes
> which may have modified the L3 partitioning from its boot-up state,
> since all relevant registers are part of the software and hardware
> command checker whitelists.
> 
> Some userspace clients (DDX and current versions of Mesa not patched
> with my L3 partitioning series [1]) assume that the L3 configuration,
> in particular the URB size, comes up in certain state when a context
> is created, but nothing in the kernel guarantees this assumption, the
> registers that control the partitioning of the L3 cache were being
> left untouched.
> 
> Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
> same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
> the latter will be removed in a future commit.
> 
> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html

Merge this with 1 + 5 (or was it 4?)- not only does it introduce a
temporary unused function, but we have to jump between patches to see
whether the function is safe (especially given those BUGs), and you add
all w/a in the same patch so bisection is not improved.

Looking at the function itself, I am not convinced that it actually adds
anything over calling setting up the WA from the vfuncs - at least the
bdw/gen7 split is redundant (we split at the vfunc then call one
function where we replit again, but with nasty constraints on the
interface of the function for different gen, it's not a function for the
faint of heart).

> +static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
> +					    unsigned int n_urb,
> +					    unsigned int n_ro,
> +					    unsigned int n_dc,
> +					    unsigned int n_all)
> +{
> +	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	if (INTEL_INFO(dev)->gen >= 8) {

Why use dev here? You already have dev_priv, so why chase the pointer
again?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
  2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
@ 2015-10-07 13:30   ` Francisco Jerez
  2015-10-08  9:17     ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 13:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 3919 bytes --]

Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Wed, Oct 07, 2015 at 02:44:00PM +0300, Francisco Jerez wrote:
>> This programs the L3 configuration based on the sizes given for each
>> partition as arguments.  The relevant register writes are added to the
>> workaround list so that they are re-applied to each context while it's
>> initialized, preventing state leaks from other userspace processes
>> which may have modified the L3 partitioning from its boot-up state,
>> since all relevant registers are part of the software and hardware
>> command checker whitelists.
>> 
>> Some userspace clients (DDX and current versions of Mesa not patched
>> with my L3 partitioning series [1]) assume that the L3 configuration,
>> in particular the URB size, comes up in certain state when a context
>> is created, but nothing in the kernel guarantees this assumption, the
>> registers that control the partitioning of the L3 cache were being
>> left untouched.
>> 
>> Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
>> same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
>> the latter will be removed in a future commit.
>> 
>> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html
>
> Merge this with 1 + 5 (or was it 4?)- not only does it introduce a
> temporary unused function, but we have to jump between patches to see
> whether the function is safe (especially given those BUGs), and you add
> all w/a in the same patch so bisection is not improved.
>
What do you want me to merge it with?  This *is* PATCH 1, and PATCH 5 is
largely unrelated.  I wouldn't have any objection to squashing PATCH 4
into this commit though.

> Looking at the function itself, I am not convinced that it actually adds
> anything over calling setting up the WA from the vfuncs - at least the
> bdw/gen7 split is redundant (we split at the vfunc then call one
> function where we replit again, but with nasty constraints on the
> interface of the function for different gen, it's not a function for the
> faint of heart).
>
The constraints are just the hardware's constraints on the L3
partitioning.  The function is meant to implement the details of
programming the L3 configuration, which is different for different gens
even though the overall structure of the L3 partitioning is the same.
Of course the constraints set by specific hardware on the partition
sizes cannot be abstracted out.

I guess that splitting this out into two functions (one for gen7 and
another for gen8) wouldn't hurt much, but open-coding the function in
all its uses (5) would involve duplicating quite some code.

Assuming I split the function into gen7 and gen8 variants, would you
like them to implement a common consistent interface (i.e. the same
prototype that my init_l3_partitioning_workarounds() implements), or
would you like the variant for each gen to implement an ad-hoc interface
according to the partition configs actually needed on each gen?  I
suspect the former would be cleaner.

>> +static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
>> +					    unsigned int n_urb,
>> +					    unsigned int n_ro,
>> +					    unsigned int n_dc,
>> +					    unsigned int n_all)
>> +{
>> +	struct drm_device *dev = ring->dev;
>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>> +
>> +	if (INTEL_INFO(dev)->gen >= 8) {
>
> Why use dev here? You already have dev_priv, so why chase the pointer
> again?

Sorry?  Does INTEL_INFO() take a drm_device or a drm_i915_private
pointer as argument?  The two types don't seem to be related by an
inheritance relationship or anything similar AFAICT, and most other uses
in this file seem to pass a drm_device as argument even where a
drm_i915_private is available.

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 212 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty.
  2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
@ 2015-10-07 14:37   ` Daniel Vetter
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2015-10-07 14:37 UTC (permalink / raw)
  To: Francisco Jerez; +Cc: intel-gfx

On Wed, Oct 07, 2015 at 02:44:01PM +0300, Francisco Jerez wrote:
> It's not an error for the workaround list to be empty if no
> workarounds are needed.  This will avoid spamming the logs
> unnecessarily on Gen6 after the workaround list is hooked up on
> pre-Gen8 hardware by the following commits.
> 
> Signed-off-by: Francisco Jerez <currojerez@riseup.net>

I think the idea of this was to not forget this when bringing up new
platforms. I think we wrestle workarounds often enough to have a low risk
for that ;-)

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 54ca344..3373516 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -717,7 +717,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct i915_workarounds *w = &dev_priv->workarounds;
>  
> -	if (WARN_ON_ONCE(w->count == 0))
> +	if (w->count == 0)
>  		return 0;
>  
>  	ring->gpu_caches_dirty = true;
> -- 
> 2.5.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7.
  2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
@ 2015-10-07 14:38   ` Daniel Vetter
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2015-10-07 14:38 UTC (permalink / raw)
  To: Francisco Jerez; +Cc: intel-gfx

On Wed, Oct 07, 2015 at 02:44:02PM +0300, Francisco Jerez wrote:
> intel_rcs_ctx_init() emits all workaround register writes on the list
> to the ring, in addition to calling i915_gem_render_state_init().  The
> workaround list is currently empty on Gen6-7 so this shouldn't cause
> any functional changes.
> 
> Signed-off-by: Francisco Jerez <currojerez@riseup.net>

Queued for -next, thanks for the patch.
-Daniel

> ---
> This is based on top of Daniel's "drm/i915: Resurrect golden context
> on gen6/7". [1]
> 
> [1] https://patchwork.freedesktop.org/patch/60870/
> 
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3373516..e2e40d0 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2703,7 +2703,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  			GEN8_RING_SEMAPHORE_INIT;
>  		}
>  	} else if (INTEL_INFO(dev)->gen >= 6) {
> -		ring->init_context = i915_gem_render_state_init;
> +		ring->init_context = intel_rcs_ctx_init;
>  		ring->add_request = gen6_add_request;
>  		ring->flush = gen7_render_ring_flush;
>  		if (INTEL_INFO(dev)->gen == 6)
> -- 
> 2.5.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
  2015-10-07 13:30   ` Francisco Jerez
@ 2015-10-08  9:17     ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2015-10-08  9:17 UTC (permalink / raw)
  To: Francisco Jerez; +Cc: intel-gfx

On Wed, Oct 07, 2015 at 04:30:35PM +0300, Francisco Jerez wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > On Wed, Oct 07, 2015 at 02:44:00PM +0300, Francisco Jerez wrote:
> >> This programs the L3 configuration based on the sizes given for each
> >> partition as arguments.  The relevant register writes are added to the
> >> workaround list so that they are re-applied to each context while it's
> >> initialized, preventing state leaks from other userspace processes
> >> which may have modified the L3 partitioning from its boot-up state,
> >> since all relevant registers are part of the software and hardware
> >> command checker whitelists.
> >> 
> >> Some userspace clients (DDX and current versions of Mesa not patched
> >> with my L3 partitioning series [1]) assume that the L3 configuration,
> >> in particular the URB size, comes up in certain state when a context
> >> is created, but nothing in the kernel guarantees this assumption, the
> >> registers that control the partitioning of the L3 cache were being
> >> left untouched.
> >> 
> >> Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
> >> same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
> >> the latter will be removed in a future commit.
> >> 
> >> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html
> >
> > Merge this with 1 + 5 (or was it 4?)- not only does it introduce a
> > temporary unused function, but we have to jump between patches to see
> > whether the function is safe (especially given those BUGs), and you add
> > all w/a in the same patch so bisection is not improved.
> >
> What do you want me to merge it with?  This *is* PATCH 1, and PATCH 5 is
> largely unrelated.  I wouldn't have any objection to squashing PATCH 4
> into this commit though.

1+4. That just highlights the issue of having two very tightly coupled
patches arbitrary split - the split here doesn't even improve bisection.
 
> > Looking at the function itself, I am not convinced that it actually adds
> > anything over calling setting up the WA from the vfuncs - at least the
> > bdw/gen7 split is redundant (we split at the vfunc then call one
> > function where we replit again, but with nasty constraints on the
> > interface of the function for different gen, it's not a function for the
> > faint of heart).
> >
> The constraints are just the hardware's constraints on the L3
> partitioning.  The function is meant to implement the details of
> programming the L3 configuration, which is different for different gens
> even though the overall structure of the L3 partitioning is the same.
> Of course the constraints set by specific hardware on the partition
> sizes cannot be abstracted out.
> 
> I guess that splitting this out into two functions (one for gen7 and
> another for gen8) wouldn't hurt much, but open-coding the function in
> all its uses (5) would involve duplicating quite some code.

Having slept on it, and I think a gen7/gen8 split is best.

My issue with the current combined function is that the interface is
different based on generation, which says to me that it is multiple
functions. It scores very low on the hard-to-misuse ranking

http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html

And instead of the multiline multiple ternary operators, just have
an if-else-chain for hsw/vlv/ivb (for setting the cmd value).
 
> Assuming I split the function into gen7 and gen8 variants, would you
> like them to implement a common consistent interface (i.e. the same
> prototype that my init_l3_partitioning_workarounds() implements), or
> would you like the variant for each gen to implement an ad-hoc interface
> according to the partition configs actually needed on each gen?  I
> suspect the former would be cleaner.

No, the BUGs tell me that the registers do not have a consistent
interface.
 
> >> +static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
> >> +					    unsigned int n_urb,
> >> +					    unsigned int n_ro,
> >> +					    unsigned int n_dc,
> >> +					    unsigned int n_all)
> >> +{
> >> +	struct drm_device *dev = ring->dev;
> >> +	struct drm_i915_private *dev_priv = dev->dev_private;
> >> +
> >> +	if (INTEL_INFO(dev)->gen >= 8) {
> >
> > Why use dev here? You already have dev_priv, so why chase the pointer
> > again?
> 
> Sorry?  Does INTEL_INFO() take a drm_device or a drm_i915_private
> pointer as argument?  The two types don't seem to be related by an
> inheritance relationship or anything similar AFAICT, and most other uses
> in this file seem to pass a drm_device as argument even where a
> drm_i915_private is available.

drm_i915_private is our primary internal pointer. We can shave over 8KiB
of object code by removing the pointer dance we have from still passing
around drm_device and jumping to drm_i915_private. New code is supposed
to be using drm_i915_private consistently for internal interfaces (we
can then clearly see the external entry points).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-10-08  9:17 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
2015-10-07 14:37   ` Daniel Vetter
2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
2015-10-07 14:38   ` Daniel Vetter
2015-10-07 11:44 ` [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init Francisco Jerez
2015-10-07 11:44 ` [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list Francisco Jerez
2015-10-07 11:44 ` [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating Francisco Jerez
2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
2015-10-07 13:30   ` Francisco Jerez
2015-10-08  9:17     ` Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox