public inbox for intel-gfx@lists.freedesktop.org
* [PATCH 0/6] Rendering Specific HW Workarounds for VLV
@ 2014-03-24 17:30 sourab.gupta
  2014-03-24 17:30 ` [PATCH v4 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' sourab.gupta
                   ` (5 more replies)
  0 siblings, 6 replies; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch series adds rendering-specific HW workarounds for the VLV platform.
These patches lead to stable behavior on VLV, especially
when running 3D apps and benchmarks.

This series consolidates the earlier patch set into a clean thread
and adds the in-patch changelogs that were missing earlier.
The comments received on the earlier patches have been addressed.

Akash Goel (6):
  drm/i915/vlv: Added a rendering specific Hw WA
    'WaTlbInvalidateStoreDataBefore'
  drm/i915/vlv: Added a rendering specific Hw WA
    'WaSendDummy3dPrimitveAfterSetContext'
  drm/i915: Enabling the TLB invalidate bit in GFX Mode register
  drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE
    reg
  drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv

 drivers/gpu/drm/i915/i915_gem_context.c | 54 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h         |  4 +++
 drivers/gpu/drm/i915/intel_pm.c         | 10 ++++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 39 ++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 5 files changed, 102 insertions(+), 6 deletions(-)

-- 
1.8.5.1

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
@ 2014-03-24 17:30 ` sourab.gupta
  2014-03-24 17:30 ` [PATCH v4 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' sourab.gupta
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

Added a new rendering-specific workaround, 'WaTlbInvalidateStoreDataBefore'.
This WA requires two MI store data commands to be emitted before any
PIPE_CONTROL that has TLB invalidate set.

v2: Modified the WA comment (Ville)

v3: Added the vlv identifier with WA name (Damien)

v4: Reworked based on Chris' comments (WA moved to gen7 ring flush func,
sending 6 dwords instead of 8)

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 87d1a2d..75cac4e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -309,6 +309,29 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
 	int ret;
 
+	if (invalidate_domains && IS_VALLEYVIEW(ring->dev)) {
+		/*
+		 * WaTlbInvalidateStoreDataBefore:vlv
+		 * This workaround is applicable in case the flush call has
+		 * arrived in context of invalidate_all_caches function.
+		 * Before pipecontrol with TLB invalidate set, need 2 store
+		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
+		 * Without this, hardware cannot guarantee the command after the
+		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
+		 * FIXME: should apply to snb, ivb
+		 */
+		int i;
+		ret = intel_ring_begin(ring, 3 * 2);
+		if (ret)
+			return ret;
+		for (i = 0; i < 2; i++) {
+			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR);
+			intel_ring_emit(ring, 0);
+		}
+		intel_ring_advance(ring);
+	}
+
 	/*
 	 * Ensure that any following seqno writes only happen when the render
 	 * cache is indeed flushed.
-- 
1.8.5.1
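
The dword accounting in the workaround above (two store commands of three dwords each, against a reservation of `3 * 2`) can be checked with a small mock. This is only a sketch: `struct mock_ring`, `mock_ring_begin()`, `mock_ring_emit()` and the `FAKE_*` values are hypothetical stand-ins and not the driver's real ring API or opcode encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder values; the real encodings live in i915_reg.h. */
#define FAKE_MI_STORE_DWORD_INDEX 0x21u
#define FAKE_HWS_SCRATCH_ADDR     0x100u

#define RING_CAPACITY 64

struct mock_ring {
	uint32_t buf[RING_CAPACITY];
	size_t tail;     /* index of the next dword to write */
	size_t reserved; /* dwords still owed to the last begin() */
};

/* Reserve room for n dwords; returns 0 on success, -1 if they do not fit. */
static int mock_ring_begin(struct mock_ring *ring, size_t n)
{
	if (ring->tail + n > RING_CAPACITY)
		return -1;
	ring->reserved = n;
	return 0;
}

/* Write one dword and consume one unit of the reservation. */
static void mock_ring_emit(struct mock_ring *ring, uint32_t dword)
{
	ring->buf[ring->tail++] = dword;
	ring->reserved--;
}

/*
 * Mirrors the shape of the workaround: two store commands, three dwords
 * each. Returns the unused reservation (0 when the budget matches exactly),
 * or -1 if the reservation failed.
 */
static int emit_tlb_inv_store_wa(struct mock_ring *ring)
{
	int i;

	if (mock_ring_begin(ring, 3 * 2))
		return -1;
	for (i = 0; i < 2; i++) {
		mock_ring_emit(ring, FAKE_MI_STORE_DWORD_INDEX);
		mock_ring_emit(ring, FAKE_HWS_SCRATCH_ADDR);
		mock_ring_emit(ring, 0);
	}
	return (int)ring->reserved;
}
```

Calling `emit_tlb_inv_store_wa()` on a zero-initialised `struct mock_ring` should leave `tail` at 6 and return 0; a non-zero return would indicate a mismatch between the reservation and the number of dwords emitted.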


* [PATCH v4 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
  2014-03-24 17:30 ` [PATCH v4 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' sourab.gupta
@ 2014-03-24 17:30 ` sourab.gupta
  2014-04-08  4:41   ` Gupta, Sourab
  2014-03-24 17:30 ` [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register sourab.gupta
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

This workaround is needed on VLV for the HW context feature.
It is applied after the MI_SET_CONTEXT command has been added to the ring
buffer for a HW context switch. As per the spec:
"The software must send a pipe_control with a CS stall and a post sync
operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
PIPELINE_SELECT that is enabling 3D mode".

v2: Modified the WA comment. (Ville)

v3: Added the vlv identifier with the WA name

v4: Check removed for scratch page initialization. (Chris/Daniel)

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 54 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h         |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6043062..83bf89e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -584,6 +584,47 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static inline void
+mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
+{
+	u32 scratch_addr;
+	u32 flags = 0;
+
+	/* Actual scratch location is at 128 bytes offset */
+	scratch_addr = intel_get_pipe_control_scratch_addr(ring) + 128;
+
+	/*
+	 * WaSendDummy3dPrimitveAfterSetContext:vlv
+	 * Software must send a pipe_control with a CS stall
+	 * and a post sync operation and then a dummy DRAW after
+	 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
+	 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
+	 * with Indirect Parameter Enable set to 0, UAV Coherency
+	 * Required set to 0, Predicate Enable set to 0,
+	 * End Offset Enable set to 0, and Vertex Count Per Instance
+	 * set to 0, All other parameters are a don't care.
+	 */
+
+	/*
+	 * Add a pipe control with CS Stall and postsync op
+	 * before dummy 3D_PRIMITIVE
+	 */
+	flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
+	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
+	intel_ring_emit(ring, flags);
+	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
+	intel_ring_emit(ring, 0);
+
+	/* Add a dummy 3D_PRIMITVE */
+	intel_ring_emit(ring, GFX_OP_3DPRIMITIVE);
+	intel_ring_emit(ring, 4); /* PrimTopoType*/
+	intel_ring_emit(ring, 0); /* VertexCountPerInstance */
+	intel_ring_emit(ring, 0); /* StartVertexLocation */
+	intel_ring_emit(ring, 0); /* InstanceCount */
+	intel_ring_emit(ring, 0); /* StartInstanceLocation */
+	intel_ring_emit(ring, 0); /* BaseVertexLocation  */
+}
+
 static inline int
 mi_set_context(struct intel_ring_buffer *ring,
 	       struct i915_hw_context *new_context,
@@ -602,7 +643,10 @@ mi_set_context(struct intel_ring_buffer *ring,
 			return ret;
 	}
 
-	ret = intel_ring_begin(ring, 6);
+	if (IS_VALLEYVIEW(ring->dev))
+		ret = intel_ring_begin(ring, 6+4+8);
+	else
+		ret = intel_ring_begin(ring, 6);
 	if (ret)
 		return ret;
 
@@ -626,7 +670,13 @@ mi_set_context(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, MI_NOOP);
 
 	if (IS_GEN7(ring->dev))
-		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+		if (IS_VALLEYVIEW(ring->dev)) {
+			/* FIXME, should also apply to ivb */
+			mi_set_context_dummy3d_prim_wa(ring);
+			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+			intel_ring_emit(ring, MI_NOOP);
+		} else
+			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 	else
 		intel_ring_emit(ring, MI_NOOP);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index adcb9c7..b922e38 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -348,6 +348,7 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
+#define GFX_OP_3DPRIMITIVE ((0x3<<29)|(0x3<<27)|(0x3<<24)|(7-2))
 
 /*
  * Reset registers
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 75cac4e..bace089 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -583,6 +583,15 @@ err:
 	return ret;
 }
 
+u32
+intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
+{
+	if (ring->scratch.obj == NULL)
+		return 0;
+
+	return ring->scratch.gtt_offset;
+}
+
 static int init_render_ring(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f11ceb2..e38ca82 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -294,6 +294,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
 
 u32 intel_ring_get_active_head(struct intel_ring_buffer *ring);
 void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
+u32 intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
 
 static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
 {
-- 
1.8.5.1
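
As a cross-check of the numbers in this patch, the sketch below decodes the `GFX_OP_3DPRIMITIVE` header word and recomputes the `6+4+8` ring reservation. The macro is copied from the patch; the helper names are hypothetical, and the field names in the comments (command type, sub-type, opcode, sub-opcode) are my reading of the usual 3D command header layout rather than something stated in this thread.

```c
#include <stdint.h>

/* Copied from the patch above. */
#define GFX_OP_3DPRIMITIVE ((0x3<<29)|(0x3<<27)|(0x3<<24)|(7-2))

/* Extract bits hi:lo of a command header (hi - lo must be below 31). */
static uint32_t cmd_field(uint32_t header, unsigned hi, unsigned lo)
{
	return (header >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

/*
 * The low byte stores the length biased by 2, which is why the macro
 * writes (7-2) for a 7-dword command.
 */
static unsigned cmd_total_dwords(uint32_t header)
{
	return (header & 0xffu) + 2u;
}

/*
 * Ring budget on VLV as used in mi_set_context():
 *   6 dwords  original MI_SET_CONTEXT sequence
 *   4 dwords  PIPE_CONTROL with CS stall and post-sync write
 *   7 dwords  dummy 3DPRIMITIVE
 *   1 dword   extra MI_NOOP emitted after MI_ARB_ON_OFF
 */
static unsigned vlv_set_context_budget(void)
{
	return 6 + 4 + cmd_total_dwords(GFX_OP_3DPRIMITIVE) + 1;
}
```

Under these assumptions the header evaluates to 0x7B000005: bits 31:29 (command type), 28:27 (sub-type) and 26:24 (opcode) are each 3, bits 23:16 (sub-opcode) are 0, and the length field of 5 gives 7 dwords in total, so the budget comes to 18.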


* [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register
  2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
  2014-03-24 17:30 ` [PATCH v4 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' sourab.gupta
  2014-03-24 17:30 ` [PATCH v4 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' sourab.gupta
@ 2014-03-24 17:30 ` sourab.gupta
  2014-04-01  5:01   ` Gupta, Sourab
  2014-04-02 11:34   ` Ville Syrjälä
  2014-03-24 17:30 ` [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg sourab.gupta
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

This patch enables the TLB invalidate bit in the GFX Mode register
for Gen7.

According to the bspec, when enabled this bit limits TLB invalidation
to batch buffer boundaries, to pipe_control commands that have the
TLB invalidation bit set, and to sync flushes.
If disabled, the TLB caches are flushed on every full flush of
the pipeline.

We tested only on the vlv platform; Chris has tested on ivb and hsw
platforms.

v2: Adding the explicit enabling of this bit for all Gen7 platforms
instead of only vlv (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> #ivb, hsw -Chris
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bace089..eb4811a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -617,7 +617,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 
 	if (IS_GEN7(dev))
 		I915_WRITE(GFX_MODE_GEN7,
-			   _MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
+			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
 			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
 
 	if (INTEL_INFO(dev)->gen >= 5) {
-- 
1.8.5.1
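
The one-line change above relies on how masked registers behave: the upper 16 bits of the written value select which of the lower 16 bits are updated. The following is a minimal model of that behaviour. The two macros have the same shape as the driver's `_MASKED_BIT_ENABLE` / `_MASKED_BIT_DISABLE` helpers; `masked_reg_write()` and the `DEMO_*` bit positions are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* High 16 bits act as a per-bit write mask for the low 16 bits. */
#define MASKED_BIT_ENABLE(a)  ((uint32_t)(((a) << 16) | (a)))
#define MASKED_BIT_DISABLE(a) ((uint32_t)((a) << 16))

/* Hypothetical bit positions, for illustration only. */
#define DEMO_TLB_INVALIDATE_EXPLICIT (1u << 13)
#define DEMO_REPLAY_MODE             (1u << 11)

/*
 * Model of a masked register write: only bits whose mask bit
 * (value >> 16) is set are updated; every other bit keeps its
 * previous state.
 */
static uint32_t masked_reg_write(uint32_t old, uint32_t value)
{
	uint32_t mask = value >> 16;
	uint32_t bits = value & 0xffffu;

	return (old & ~mask & 0xffffu) | (bits & mask);
}
```

In this model, writing `MASKED_BIT_DISABLE(x) | MASKED_BIT_ENABLE(y)` clears `x`, sets `y`, and leaves unrelated bits alone, while switching the first term to `MASKED_BIT_ENABLE(x)` sets both bits, which is the effect of the patch.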


* [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
                   ` (2 preceding siblings ...)
  2014-03-24 17:30 ` [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register sourab.gupta
@ 2014-03-24 17:30 ` sourab.gupta
  2014-03-24 17:47   ` Chris Wilson
  2014-03-24 17:30 ` [PATCH v2 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush' sourab.gupta
  2014-03-24 17:30 ` [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv sourab.gupta
  5 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel

From: Akash Goel <akash.goel@intel.com>

Removing the VS_TIMER_DISPATCH bit enable for MI MODE reg for
VLV platform as it is not required.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index eb4811a..1512a71 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -599,7 +599,10 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	int ret = init_ring_common(ring);
 
 	if (INTEL_INFO(dev)->gen > 3)
-		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
+		/* FIXME, should also apply to ivb */
+		if (!IS_VALLEYVIEW(dev))
+			I915_WRITE(MI_MODE,
+					_MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
 
 	/* We need to disable the AsyncFlip performance optimisations in order
 	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
-- 
1.8.5.1


* [PATCH v2 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
                   ` (3 preceding siblings ...)
  2014-03-24 17:30 ` [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg sourab.gupta
@ 2014-03-24 17:30 ` sourab.gupta
  2014-04-01 10:51   ` Ville Syrjälä
  2014-03-24 17:30 ` [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv sourab.gupta
  5 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

In Valleyview, Operational flush cannot be enabled on
BWG A0 [Errata BWT006]

v2: Corrected the code regarding the wrong usage of
MASKED_BIT_DISABLE (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h | 3 +++
 drivers/gpu/drm/i915/intel_pm.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b922e38..266bfa1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -972,6 +972,9 @@ enum punit_power_well {
 #define   ECO_GATING_CX_ONLY	(1<<3)
 #define   ECO_FLIP_DONE		(1<<0)
 
+#define GEN7_CACHE_MODE_0	0x07000 /* IVB+ only */
+#define GEN7_RC_OP_FLUSH_ENABLE (1<<0)
+
 #define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
 #define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1		0x7004 /* IVB+ */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index fd68f93..c3a8554 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5068,6 +5068,9 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
 				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:vlv */
+	I915_WRITE(GEN7_CACHE_MODE_0, _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE));
+
 	/* WaForceL3Serialization:vlv */
 	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
 		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
-- 
1.8.5.1


* [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
                   ` (4 preceding siblings ...)
  2014-03-24 17:30 ` [PATCH v2 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush' sourab.gupta
@ 2014-03-24 17:30 ` sourab.gupta
  2014-03-24 17:56   ` Damien Lespiau
  2014-05-27 14:27   ` Damien Lespiau
  5 siblings, 2 replies; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 17:30 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

To disable L3 clock gating we need to set bit 25 of MMIO
register 940c. Earlier this was done by writing 1 into bit 25
while resetting all other bits to 0.
This patch changes the write to a read-modify-write of the
register, so that the values of the other bits are preserved.

v2: Modifying the comments and the patch commit message (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c3a8554..af4bb8e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5093,8 +5093,11 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN6_UCGCTL2,
 		   GEN6_RCZUNIT_CLOCK_GATE_DISABLE);
 
-	/* WaDisableL3Bank2xClockGate:vlv */
-	I915_WRITE(GEN7_UCGCTL4, GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
+	/* WaDisableL3Bank2xClockGate:vlv
+	 * Disabling L3 clock gating- MMIO 940c[25] = 1
+	 * Set bit 25, to disable L3_BANK_2x_CLK_GATING */
+	I915_WRITE(GEN7_UCGCTL4,
+			I915_READ(GEN7_UCGCTL4) | GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
 
 	I915_WRITE(MI_ARB_VLV, MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE);
 
-- 
1.8.5.1


* Re: [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-24 17:30 ` [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg sourab.gupta
@ 2014-03-24 17:47   ` Chris Wilson
  2014-03-24 17:55     ` Gupta, Sourab
  0 siblings, 1 reply; 37+ messages in thread
From: Chris Wilson @ 2014-03-24 17:47 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, Daniel Vetter, intel-gfx

On Mon, Mar 24, 2014 at 11:00:05PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Removing the VS_TIMER_DISPATCH bit enable for MI MODE reg for
> VLV platform as it is not required.
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>

I've been running with this on ivb and hsw, and have not yet seen a
difference. So a tentative,
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-24 17:47   ` Chris Wilson
@ 2014-03-24 17:55     ` Gupta, Sourab
  2014-03-24 18:01       ` Chris Wilson
  0 siblings, 1 reply; 37+ messages in thread
From: Gupta, Sourab @ 2014-03-24 17:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Goel, Akash, Daniel Vetter, intel-gfx@lists.freedesktop.org

On Mon, 2014-03-24 at 17:47 +0000, Chris Wilson wrote:
> On Mon, Mar 24, 2014 at 11:00:05PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > Removing the VS_TIMER_DISPATCH bit enable for MI MODE reg for
> > VLV platform as it is not required.
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> 
> I've been running with this on ivb and hsw, and have not yet seen a
> difference. So a tentative,
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw
> -Chris
> 
Hi Chris,
Right now this patch applies to vlv. So, in that case, should this be
applicable to Gen7 per se?
I'll float a new version, if reqd.

Regards,
Sourab


* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-03-24 17:30 ` [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv sourab.gupta
@ 2014-03-24 17:56   ` Damien Lespiau
  2014-03-25  6:52     ` Gupta, Sourab
  2014-05-27 14:27   ` Damien Lespiau
  1 sibling, 1 reply; 37+ messages in thread
From: Damien Lespiau @ 2014-03-24 17:56 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Daniel Vetter, intel-gfx, Akash Goel

On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> For disabling L3 clock gating we need to set bit 25 of MMIO
> register 940c. Earlier this was being done by just writing 1
> into bit 25 and resetting all other bits.
> This patch modifies the routine to read-modify-write of the
> register, so that the values of other bits are not destroyed.
> 
> v2: Modifying the comments and the patch commit message (Chris)

This patch commit message lacks the most important information: which
bits are we resetting to 0 that we shouldn't, and why is that
important? We do direct writes to other registers in that function (for
instance, MI_ARB_VLV just below).

-- 
Damien


* Re: [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-24 17:55     ` Gupta, Sourab
@ 2014-03-24 18:01       ` Chris Wilson
  2014-03-24 18:28         ` [PATCH v2 " sourab.gupta
  0 siblings, 1 reply; 37+ messages in thread
From: Chris Wilson @ 2014-03-24 18:01 UTC (permalink / raw)
  To: Gupta, Sourab; +Cc: Goel, Akash, Daniel Vetter, intel-gfx@lists.freedesktop.org

On Mon, Mar 24, 2014 at 05:55:22PM +0000, Gupta, Sourab wrote:
> On Mon, 2014-03-24 at 17:47 +0000, Chris Wilson wrote:
> > On Mon, Mar 24, 2014 at 11:00:05PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > Removing the VS_TIMER_DISPATCH bit enable for MI MODE reg for
> > > VLV platform as it is not required.
> > > 
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > 
> > I've been running with this on ivb and hsw, and have not yet seen a
> > difference. So a tentative,
> > Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw
> > -Chris
> > 
> Hi Chris,
> Right now this patch applies to vlv. So, in that case, should this be
> applicable to Gen7 per se?

Yes, meant to say, I was testing this applied to ivb and hsw (instead of
vlv).

> I'll float a new version, if reqd.

Thanks,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* [PATCH v2 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-24 18:01       ` Chris Wilson
@ 2014-03-24 18:28         ` sourab.gupta
  2014-03-25 11:33           ` Ville Syrjälä
  0 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-03-24 18:28 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

Removing the VS_TIMER_DISPATCH bit enable for MI MODE reg for
Gen7 platform as it is not required.

v2: Enhancing the scope of the patch to full Gen7 (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index eb4811a..9983802 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -599,7 +599,9 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	int ret = init_ring_common(ring);
 
 	if (INTEL_INFO(dev)->gen > 3)
-		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
+		if (!IS_GEN7(dev))
+			I915_WRITE(MI_MODE,
+					_MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
 
 	/* We need to disable the AsyncFlip performance optimisations in order
 	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
-- 
1.8.5.1


* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-03-24 17:56   ` Damien Lespiau
@ 2014-03-25  6:52     ` Gupta, Sourab
  2014-04-01  5:22       ` Gupta, Sourab
  0 siblings, 1 reply; 37+ messages in thread
From: Gupta, Sourab @ 2014-03-25  6:52 UTC (permalink / raw)
  To: Lespiau, Damien
  Cc: Daniel Vetter, intel-gfx@lists.freedesktop.org, Goel, Akash

On Mon, 2014-03-24 at 17:56 +0000, Lespiau, Damien wrote:
> On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > For disabling L3 clock gating we need to set bit 25 of MMIO
> > register 940c. Earlier this was being done by just writing 1
> > into bit 25 and resetting all other bits.
> > This patch modifies the routine to read-modify-write of the
> > register, so that the values of other bits are not destroyed.
> > 
> > v2: Modifying the comments and the patch commit message (Chris)
> 
> This patch commit message lacks the most important information: which
> bit are we setting back to 0 and we shouldn't, and why is that
> important? We do direct writes to other registers in that function (for
> instance (MI_ARB_VLV just below).
> 
Hi Damien,
The reset value of the register is 0x00F80003. Therefore, if we directly
write only bit 25 as 1, without preserving the other bits, the bits that
are set at reset (bits 1:0 and bits 23:19) will be cleared.
This is not the case for the other registers we write directly
(e.g. MI_ARB_VLV), whose default value is 0.
So, with this commit we set only the bit that we actually
want to change.

Regards,
Sourab
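
The arithmetic in the reply above can be worked through directly. The sketch below uses the reset value 0x00F80003 and bit 25 as quoted in this thread; the helper names and the `DEMO_` prefixes are hypothetical, and no real MMIO access is involved.

```c
#include <stdint.h>

#define DEMO_UCGCTL4_RESET_VALUE         0x00F80003u /* reset value quoted above */
#define DEMO_L3BANK2X_CLOCK_GATE_DISABLE (1u << 25)  /* bit 25, per the commit message */

/* Old behaviour: the register ends up holding only the written bit. */
static uint32_t direct_write(uint32_t old, uint32_t bit)
{
	(void)old;
	return bit;
}

/* New behaviour: OR the bit into whatever the register already holds. */
static uint32_t read_modify_write(uint32_t old, uint32_t bit)
{
	return old | bit;
}

/* Bits that were set before the write and are clear afterwards. */
static uint32_t bits_lost(uint32_t before, uint32_t after)
{
	return before & ~after;
}
```

Starting from the reset value, the direct write yields 0x02000000 and loses 0x00F80003 (bits 23:19 and bits 1:0), whereas the read-modify-write yields 0x02F80003 and loses nothing.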


* Re: [PATCH v2 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-24 18:28         ` [PATCH v2 " sourab.gupta
@ 2014-03-25 11:33           ` Ville Syrjälä
  2014-03-25 12:31             ` [PATCH v3 4/6] drm/i915: " sourab.gupta
  0 siblings, 1 reply; 37+ messages in thread
From: Ville Syrjälä @ 2014-03-25 11:33 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, Daniel Vetter, intel-gfx

On Mon, Mar 24, 2014 at 11:58:22PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Removing the VS_TIMER_DISPATCH bit enable for MI MODE reg for
> Gen7 platform as it is not required.
> 
> v2: Enhancing the scope of the patch to full Gen7 (Chris)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index eb4811a..9983802 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -599,7 +599,9 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  	int ret = init_ring_common(ring);
>  
>  	if (INTEL_INFO(dev)->gen > 3)
> -		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
> +		if (!IS_GEN7(dev))

We shouldn't enable this on gen8 either, and while doing that you could
avoid the extra indentation by rewriting it as something like this:
if (INTEL_INFO(dev)->gen >= 4 && INTEL_INFO(dev)->gen < 7)

Also you could add the appropriate w/a note while you're touching the
code:
WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb

> +			I915_WRITE(MI_MODE,
> +					_MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
>  
>  	/* We need to disable the AsyncFlip performance optimisations in order
>  	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC


* [PATCH v3 4/6] drm/i915: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-25 11:33           ` Ville Syrjälä
@ 2014-03-25 12:31             ` sourab.gupta
  2014-03-25 13:11               ` Ville Syrjälä
  0 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-03-25 12:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

This patch removes the enabling of the VS_TIMER_DISPATCH bit in the MI_MODE
register for platforms > Gen6.
Enabling the VS_TIMER_DISPATCH bit was earlier required as part of
WA 'WaTimedSingleVertexDispatch', which is now applicable only to
platforms < Gen7.

v2: Enhancing the scope of the patch to full Gen7 (Chris)

v3: Modifying the WA condition to cover the applicable platforms,
and adding the WA name in comments. (Ville)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 816137f..2ad5fe7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -605,7 +605,8 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret = init_ring_common(ring);
 
-	if (INTEL_INFO(dev)->gen > 3)
+	/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
+	if (INTEL_INFO(dev)->gen >= 4 && INTEL_INFO(dev)->gen < 7)
 		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
 
 	/* We need to disable the AsyncFlip performance optimisations in order
-- 
1.8.5.1
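
To make the change in scope concrete, the two conditions from the diff above can be compared side by side. This is only an illustration: the function names are hypothetical and the generation is passed as a plain integer rather than read via `INTEL_INFO(dev)->gen`.

```c
#include <stdbool.h>

/* Condition before this patch: every generation above 3. */
static bool vs_timer_dispatch_before(int gen)
{
	return gen > 3;
}

/* Condition after this patch: Gen4 through Gen6 only. */
static bool vs_timer_dispatch_after(int gen)
{
	return gen >= 4 && gen < 7;
}
```

The two predicates agree for Gen3 and below (false) and for Gen4 to Gen6 (true); they differ only from Gen7 upwards, where the bit is no longer enabled.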


* Re: [PATCH v3 4/6] drm/i915: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-25 12:31             ` [PATCH v3 4/6] drm/i915: " sourab.gupta
@ 2014-03-25 13:11               ` Ville Syrjälä
  2014-03-25 15:41                 ` Daniel Vetter
  0 siblings, 1 reply; 37+ messages in thread
From: Ville Syrjälä @ 2014-03-25 13:11 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Daniel Vetter, intel-gfx, Akash Goel

On Tue, Mar 25, 2014 at 06:01:50PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> This patch Removes the VS_TIMER_DISPATCH bit enable in MI MODE reg for
> platforms > Gen6.
> VS_TIMER_DISPATCH bit enable was earlier required as a part of
> WA 'WaTimedSingleVertexDispatch', which is now applicable only to
> platforms < Gen7.
> 
> v2: Enhancing the scope of the patch to full Gen7 (Chris)
> 
> v3: Modifying the WA condition to the cover the applicable platforms,
> and adding the WA name in comments. (Ville)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 816137f..2ad5fe7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -605,7 +605,8 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int ret = init_ring_common(ring);
>  
> -	if (INTEL_INFO(dev)->gen > 3)
> +	/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
> +	if (INTEL_INFO(dev)->gen >= 4 && INTEL_INFO(dev)->gen < 7)
>  		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
>  
>  	/* We need to disable the AsyncFlip performance optimisations in order
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 4/6] drm/i915: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg
  2014-03-25 13:11               ` Ville Syrjälä
@ 2014-03-25 15:41                 ` Daniel Vetter
  0 siblings, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2014-03-25 15:41 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Akash Goel, Daniel Vetter, sourab.gupta, intel-gfx

On Tue, Mar 25, 2014 at 03:11:10PM +0200, Ville Syrjälä wrote:
> On Tue, Mar 25, 2014 at 06:01:50PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > This patch Removes the VS_TIMER_DISPATCH bit enable in MI MODE reg for
> > platforms > Gen6.
> > VS_TIMER_DISPATCH bit enable was earlier required as a part of
> > WA 'WaTimedSingleVertexDispatch', which is now applicable only to
> > platforms < Gen7.
> > 
> > v2: Enhancing the scope of the patch to full Gen7 (Chris)
> > 
> > v3: Modifying the WA condition to the cover the applicable platforms,
> > and adding the WA name in comments. (Ville)
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris
> 
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register
  2014-03-24 17:30 ` [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register sourab.gupta
@ 2014-04-01  5:01   ` Gupta, Sourab
  2014-04-02 11:34   ` Ville Syrjälä
  1 sibling, 0 replies; 37+ messages in thread
From: Gupta, Sourab @ 2014-04-01  5:01 UTC (permalink / raw)
  To: intel-gfx@lists.freedesktop.org; +Cc: Daniel Vetter, Goel, Akash

On Mon, 2014-03-24 at 17:30 +0000, Gupta, Sourab wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> This patch Enables the bit for TLB invalidate in GFX Mode register
> for Gen7.
> 
> According to bspec,  When enabled this bit limits the invalidation
> of the TLB only to batch buffer boundaries, to pipe_control
> commands which have the TLB invalidation bit set and sync flushes.
> If disabled, the TLB caches are flushed for every full flush of
> the pipeline.
> 
> Tested only on vlv platform. Chris has tested on ivb and hsw
> platforms.
> 
> v2: Adding the explicit enabling of this bit for all Gen7 platforms
> instead of only vlv (Chris)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> #ivb, hsw -Chris
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index bace089..eb4811a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -617,7 +617,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  
>  	if (IS_GEN7(dev))
>  		I915_WRITE(GFX_MODE_GEN7,
> -			   _MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
> +			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
>  			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
>  
>  	if (INTEL_INFO(dev)->gen >= 5) {

Hi Chris,

Could you please review this patch?
Thanks,
Sourab
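
As a rough sketch of what the one-line change does, the masked write can be
modelled in plain C. The macros below are local stand-ins for the i915
helpers, written under the assumption that the upper 16 bits of the written
value select which of the lower 16 bits are updated. The bit positions (13
and 11) and the function names are assumptions made for this example and
should be checked against i915_reg.h.

```c
#include <stdint.h>

/* Local stand-ins for the i915 masked-bit helpers (assumed semantics). */
#define MASKED_BIT_ENABLE(a)	((uint32_t)(((a) << 16) | (a)))
#define MASKED_BIT_DISABLE(a)	((uint32_t)((a) << 16))

/* Bit positions are assumptions for this sketch. */
#define TLB_INVALIDATE_EXPLICIT	(1u << 13)
#define REPLAY_MODE		(1u << 11)

/* Model of a masked register: only bits whose mask bit is set can change. */
static uint32_t masked_write(uint32_t old, uint32_t val)
{
	uint32_t mask = val >> 16;

	return (old & ~mask & 0xffffu) | (val & mask);
}

/* Value of the register after the write done before this patch. */
static uint32_t gfx_mode_before_patch(uint32_t old)
{
	return masked_write(old, MASKED_BIT_DISABLE(TLB_INVALIDATE_EXPLICIT) |
				 MASKED_BIT_ENABLE(REPLAY_MODE));
}

/* Value of the register after the write done with this patch applied. */
static uint32_t gfx_mode_after_patch(uint32_t old)
{
	return masked_write(old, MASKED_BIT_ENABLE(TLB_INVALIDATE_EXPLICIT) |
				 MASKED_BIT_ENABLE(REPLAY_MODE));
}
```

In this model both variants leave every other bit of the register alone, and
the only difference is whether the explicit TLB invalidation bit ends up
cleared or set.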

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-03-25  6:52     ` Gupta, Sourab
@ 2014-04-01  5:22       ` Gupta, Sourab
  2014-04-14 10:22         ` Gupta, Sourab
  0 siblings, 1 reply; 37+ messages in thread
From: Gupta, Sourab @ 2014-04-01  5:22 UTC (permalink / raw)
  To: Lespiau, Damien
  Cc: Daniel Vetter, intel-gfx@lists.freedesktop.org, Goel, Akash

On Tue, 2014-03-25 at 12:23 +0530, sourab gupta wrote:
> On Mon, 2014-03-24 at 17:56 +0000, Lespiau, Damien wrote:
> > On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > For disabling L3 clock gating we need to set bit 25 of MMIO
> > > register 940c. Earlier this was being done by just writing 1
> > > into bit 25 and resetting all other bits.
> > > This patch modifies the routine to read-modify-write of the
> > > register, so that the values of other bits are not destroyed.
> > > 
> > > v2: Modifying the comments and the patch commit message (Chris)
> > 
> > This patch commit message lacks the most important information: which
> > bit are we setting back to 0 and we shouldn't, and why is that
> > important? We do direct writes to other registers in that function (for
> > instance (MI_ARB_VLV just below).
> > 
> Hi Damien,
> The reset value of the register is 0x00F80003. Therefore, if we directly
> set only bit 25 to 1, without caring about other bits, the following reg
> bits will be affected (bits 1:0, bits 23:19).
> This doesn't seem to be the case with other regs where we are writing
> directly (MI_ARB_VLV ) whose default value is 0.
> So, by this commit we're just trying to set only the bit which we really
> want to change.
> 
> Regards,
> Sourab
> 
> 
Hi Damien,
Could you please comment on the above explanation? If it looks okay, I'll
add this information to the commit message.

Thanks,
Sourab
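
To make the arithmetic above concrete, here is a minimal C sketch that
compares a direct write with a read-modify-write, starting from the quoted
reset value 0x00F80003. The macro and function names are hypothetical and
the register is modelled as a plain integer rather than real MMIO.

```c
#include <stdint.h>

#define L3_CLKGATE_BIT25	(1u << 25)	/* hypothetical name for bit 25 */
#define REG_RESET_VALUE		0x00F80003u	/* reset value quoted above */

/* Direct write: every bit other than bit 25 ends up as 0. */
static uint32_t direct_write(uint32_t old)
{
	(void)old;
	return L3_CLKGATE_BIT25;
}

/* Read-modify-write: only bit 25 changes, the rest is preserved. */
static uint32_t read_modify_write(uint32_t old)
{
	return old | L3_CLKGATE_BIT25;
}

/* Bits that were set in the old value but are clear in the new one. */
static uint32_t bits_lost(uint32_t old, uint32_t new_val)
{
	return old & ~new_val;
}
```

With the reset value as input, the direct write yields 0x02000000 and loses
0x00F80003 (bits 1:0 and 23:19), while the read-modify-write yields
0x02F80003 and loses nothing.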

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-03-24 17:30 ` [PATCH v2 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush' sourab.gupta
@ 2014-04-01 10:51   ` Ville Syrjälä
  2014-04-03  4:42     ` [PATCH v3 " sourab.gupta
  0 siblings, 1 reply; 37+ messages in thread
From: Ville Syrjälä @ 2014-04-01 10:51 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, Daniel Vetter, intel-gfx

On Mon, Mar 24, 2014 at 11:00:06PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> In Valleyview, Operational flush cannot be enabled on
> BWG A0 [Errata BWT006]

Apparently this has been busted ever since gen4. So I think either we
should disable it for all gen4+ platforms or we should trust that the
default value is 0. The default does seem to be disabled for everything,
but since there's a w/a name for it I'm not sure if we can trust that.

BDW is a bit weird since the bit has been repurposed for some GT3 thing.
Not sure what the deal is with !GT3.

> 
> v2: Corrected the code regarding the wrong usage of
> MASKED_BIT_DISABLE (Chris)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h | 3 +++
>  drivers/gpu/drm/i915/intel_pm.c | 3 +++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index b922e38..266bfa1 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -972,6 +972,9 @@ enum punit_power_well {
>  #define   ECO_GATING_CX_ONLY	(1<<3)
>  #define   ECO_FLIP_DONE		(1<<0)
>  
> +#define GEN7_CACHE_MODE_0	0x07000 /* IVB+ only */
> +#define GEN7_RC_OP_FLUSH_ENABLE (1<<0)
> +
>  #define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
>  #define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
>  #define CACHE_MODE_1		0x7004 /* IVB+ */
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index fd68f93..c3a8554 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5068,6 +5068,9 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
>  				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
>  
> +	/* WaDisable_RenderCache_OperationalFlush:vlv */
> +	I915_WRITE(GEN7_CACHE_MODE_0, _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE));
> +
>  	/* WaForceL3Serialization:vlv */
>  	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
>  		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register
  2014-03-24 17:30 ` [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register sourab.gupta
  2014-04-01  5:01   ` Gupta, Sourab
@ 2014-04-02 11:34   ` Ville Syrjälä
  2014-04-02 11:55     ` Daniel Vetter
  1 sibling, 1 reply; 37+ messages in thread
From: Ville Syrjälä @ 2014-04-02 11:34 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, Daniel Vetter, intel-gfx

On Mon, Mar 24, 2014 at 11:00:04PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> This patch Enables the bit for TLB invalidate in GFX Mode register
> for Gen7.
> 
> According to bspec,  When enabled this bit limits the invalidation
> of the TLB only to batch buffer boundaries, to pipe_control
> commands which have the TLB invalidation bit set and sync flushes.
> If disabled, the TLB caches are flushed for every full flush of
> the pipeline.
> 
> Tested only on vlv platform. Chris has tested on ivb and hsw
> platforms.
> 
> v2: Adding the explicit enabling of this bit for all Gen7 platforms
> instead of only vlv (Chris)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> #ivb, hsw -Chris

Could I trouble you to add the w/a note?
WaBCSVCSTlbInvalidationMode:ivb,vlv,hsw

No idea why it mentions only BCS and VCS, but it does seem to say that
it's essentially a new name for WaEnableFlushTlbInvalidationMode:snb.

With that:
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index bace089..eb4811a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -617,7 +617,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  
>  	if (IS_GEN7(dev))
>  		I915_WRITE(GFX_MODE_GEN7,
> -			   _MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
> +			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
>  			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
>  
>  	if (INTEL_INFO(dev)->gen >= 5) {
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register
  2014-04-02 11:34   ` Ville Syrjälä
@ 2014-04-02 11:55     ` Daniel Vetter
  0 siblings, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2014-04-02 11:55 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Daniel Vetter, sourab.gupta, Akash Goel, intel-gfx

On Wed, Apr 02, 2014 at 02:34:59PM +0300, Ville Syrjälä wrote:
> On Mon, Mar 24, 2014 at 11:00:04PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > This patch Enables the bit for TLB invalidate in GFX Mode register
> > for Gen7.
> > 
> > According to bspec,  When enabled this bit limits the invalidation
> > of the TLB only to batch buffer boundaries, to pipe_control
> > commands which have the TLB invalidation bit set and sync flushes.
> > If disabled, the TLB caches are flushed for every full flush of
> > the pipeline.
> > 
> > Tested only on vlv platform. Chris has tested on ivb and hsw
> > platforms.
> > 
> > v2: Adding the explicit enabling of this bit for all Gen7 platforms
> > instead of only vlv (Chris)
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Tested-by: Chris Wilson <chris@chris-wilson.co.uk> #ivb, hsw -Chris
> 
> Could I trouble you to add the w/a note?
> WaBCSVCSTlbInvalidationMode:ivb,vlv,hsw
> 
> No idea why it mentions only BCS and VCS, but it does seem to say that
> it's essentially a new name for WaEnableFlushTlbInvalidationMode:snb.

Done for both the gen6 and gen7 version of this.
> 
> With that:
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index bace089..eb4811a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -617,7 +617,7 @@ static int init_render_ring(struct intel_ring_buffer *ring)
> >  
> >  	if (IS_GEN7(dev))
> >  		I915_WRITE(GFX_MODE_GEN7,
> > -			   _MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
> > +			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
> >  			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
> >  
> >  	if (INTEL_INFO(dev)->gen >= 5) {
> > -- 
> > 1.8.5.1
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v3 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-01 10:51   ` Ville Syrjälä
@ 2014-04-03  4:42     ` sourab.gupta
  2014-04-04 11:17       ` Ville Syrjälä
  0 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-04-03  4:42 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

On Gen4+ platforms (except BDW), Render Cache Operational flush
cannot be enabled.
This WA is apparently required for all Gen4+ platforms, except BDW.
On BDW, the bit has been repurposed.
This has been tested only on vlv.

v2: Corrected the code regarding the wrong usage of
MASKED_BIT_DISABLE (Chris)

v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h |  1 +
 drivers/gpu/drm/i915/intel_pm.c | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 393f93e..366c0bf 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1060,6 +1060,7 @@ enum punit_power_well {
 #define   ECO_FLIP_DONE		(1<<0)
 
 #define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
+#define RC_OP_FLUSH_ENABLE (1<<0)
 #define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1		0x7004 /* IVB+ */
 #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE	(1<<6)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1454777..d181735 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4624,6 +4624,9 @@ static void ironlake_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(CACHE_MODE_0,
 		   _MASKED_BIT_ENABLE(CM0_PIPELINED_RENDER_FLUSH_DISABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:ilk */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	g4x_disable_trickle_feed(dev);
 
 	ibx_init_clock_gating(dev);
@@ -4699,6 +4702,9 @@ static void gen6_init_clock_gating(struct drm_device *dev)
 		I915_WRITE(GEN6_GT_MODE,
 			   _MASKED_BIT_ENABLE(GEN6_TD_FOUR_ROW_DISPATCH_DISABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:snb */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/*
 	 * BSpec recoomends 8x4 when MSAA is used,
 	 * however in practice 16x4 seems fastest.
@@ -4938,6 +4944,9 @@ static void haswell_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN7_FF_THREAD_MODE,
 		   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
 
+	/* WaDisable_RenderCache_OperationalFlush:hsw */
+	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/* enable HiZ Raw Stall Optimization */
 	I915_WRITE(CACHE_MODE_0_GEN7,
 		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
@@ -4990,6 +4999,9 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
 		I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
 			   _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:ivb */
+	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/* Apply the WaDisableRHWOOptimizationForRenderHang:ivb workaround. */
 	I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
 		   GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
@@ -5107,6 +5119,9 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
 				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:vlv */
+	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/* WaForceL3Serialization:vlv */
 	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
 		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-03  4:42     ` [PATCH v3 " sourab.gupta
@ 2014-04-04 11:17       ` Ville Syrjälä
  2014-04-04 11:44         ` [PATCH v4 " sourab.gupta
  0 siblings, 1 reply; 37+ messages in thread
From: Ville Syrjälä @ 2014-04-04 11:17 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Daniel Vetter, intel-gfx, Akash Goel

On Thu, Apr 03, 2014 at 10:12:14AM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> On Gen4+ platforms (except BDW), Render Cache Operational flush
> cannot be enabled.
> This WA is apparently required for all Gen4+ platforms,except BDW.
> In BDW, the bit has been repurposed otherwise.
> This has been tested only on vlv.
> 
> v2: Corrected the code regarding the wrong usage of
> MASKED_BIT_DISABLE (Chris)
> 
> v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)

Actually you missed g4x, crestline, broadwater. Add it into those as
well, and you can add:
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h |  1 +
>  drivers/gpu/drm/i915/intel_pm.c | 15 +++++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 393f93e..366c0bf 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1060,6 +1060,7 @@ enum punit_power_well {
>  #define   ECO_FLIP_DONE		(1<<0)
>  
>  #define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
> +#define RC_OP_FLUSH_ENABLE (1<<0)
>  #define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
>  #define CACHE_MODE_1		0x7004 /* IVB+ */
>  #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE	(1<<6)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1454777..d181735 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4624,6 +4624,9 @@ static void ironlake_init_clock_gating(struct drm_device *dev)
>  	I915_WRITE(CACHE_MODE_0,
>  		   _MASKED_BIT_ENABLE(CM0_PIPELINED_RENDER_FLUSH_DISABLE));
>  
> +	/* WaDisable_RenderCache_OperationalFlush:ilk */
> +	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
>  	g4x_disable_trickle_feed(dev);
>  
>  	ibx_init_clock_gating(dev);
> @@ -4699,6 +4702,9 @@ static void gen6_init_clock_gating(struct drm_device *dev)
>  		I915_WRITE(GEN6_GT_MODE,
>  			   _MASKED_BIT_ENABLE(GEN6_TD_FOUR_ROW_DISPATCH_DISABLE));
>  
> +	/* WaDisable_RenderCache_OperationalFlush:snb */
> +	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
>  	/*
>  	 * BSpec recoomends 8x4 when MSAA is used,
>  	 * however in practice 16x4 seems fastest.
> @@ -4938,6 +4944,9 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>  	I915_WRITE(GEN7_FF_THREAD_MODE,
>  		   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
>  
> +	/* WaDisable_RenderCache_OperationalFlush:hsw */
> +	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
>  	/* enable HiZ Raw Stall Optimization */
>  	I915_WRITE(CACHE_MODE_0_GEN7,
>  		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> @@ -4990,6 +4999,9 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
>  		I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
>  			   _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
>  
> +	/* WaDisable_RenderCache_OperationalFlush:ivb */
> +	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
>  	/* Apply the WaDisableRHWOOptimizationForRenderHang:ivb workaround. */
>  	I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
>  		   GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
> @@ -5107,6 +5119,9 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
>  				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
>  
> +	/* WaDisable_RenderCache_OperationalFlush:vlv */
> +	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
> +
>  	/* WaForceL3Serialization:vlv */
>  	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
>  		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-04 11:17       ` Ville Syrjälä
@ 2014-04-04 11:44         ` sourab.gupta
  2014-04-04 15:24           ` Chris Wilson
  0 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-04-04 11:44 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

On Gen4+ platforms (except BDW), Render Cache Operational flush
cannot be enabled.
This WA is apparently required for all Gen4+ platforms, except BDW.
On BDW, the bit has been repurposed.
This has been tested only on vlv.

v2: Corrected the code regarding the wrong usage of
MASKED_BIT_DISABLE (Chris)

v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)

v4: Adding WA for g4x, crestline, broadwater (Ville)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h |  1 +
 drivers/gpu/drm/i915/intel_pm.c | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 393f93e..366c0bf 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1060,6 +1060,7 @@ enum punit_power_well {
 #define   ECO_FLIP_DONE		(1<<0)
 
 #define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
+#define RC_OP_FLUSH_ENABLE (1<<0)
 #define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1		0x7004 /* IVB+ */
 #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE	(1<<6)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1454777..17ff36e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4624,6 +4624,9 @@ static void ironlake_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(CACHE_MODE_0,
 		   _MASKED_BIT_ENABLE(CM0_PIPELINED_RENDER_FLUSH_DISABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:ilk */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	g4x_disable_trickle_feed(dev);
 
 	ibx_init_clock_gating(dev);
@@ -4699,6 +4702,9 @@ static void gen6_init_clock_gating(struct drm_device *dev)
 		I915_WRITE(GEN6_GT_MODE,
 			   _MASKED_BIT_ENABLE(GEN6_TD_FOUR_ROW_DISPATCH_DISABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:snb */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/*
 	 * BSpec recoomends 8x4 when MSAA is used,
 	 * however in practice 16x4 seems fastest.
@@ -4938,6 +4944,9 @@ static void haswell_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN7_FF_THREAD_MODE,
 		   I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
 
+	/* WaDisable_RenderCache_OperationalFlush:hsw */
+	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/* enable HiZ Raw Stall Optimization */
 	I915_WRITE(CACHE_MODE_0_GEN7,
 		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
@@ -4990,6 +4999,9 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
 		I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
 			   _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:ivb */
+	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/* Apply the WaDisableRHWOOptimizationForRenderHang:ivb workaround. */
 	I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
 		   GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
@@ -5107,6 +5119,9 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
 				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:vlv */
+	I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	/* WaForceL3Serialization:vlv */
 	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
 		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
@@ -5176,6 +5191,9 @@ static void g4x_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(CACHE_MODE_0,
 		   _MASKED_BIT_ENABLE(CM0_PIPELINED_RENDER_FLUSH_DISABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:g4x */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+
 	g4x_disable_trickle_feed(dev);
 }
 
@@ -5190,6 +5208,9 @@ static void crestline_init_clock_gating(struct drm_device *dev)
 	I915_WRITE16(DEUC, 0);
 	I915_WRITE(MI_ARB_STATE,
 		   _MASKED_BIT_ENABLE(MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE));
+
+	/* WaDisable_RenderCache_OperationalFlush:gen4 */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
 }
 
 static void broadwater_init_clock_gating(struct drm_device *dev)
@@ -5204,6 +5225,9 @@ static void broadwater_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(RENCLK_GATE_D2, 0);
 	I915_WRITE(MI_ARB_STATE,
 		   _MASKED_BIT_ENABLE(MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE));
+
+	/* WaDisable_RenderCache_OperationalFlush:gen4 */
+	I915_WRITE(CACHE_MODE_0, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
 }
 
 static void gen3_init_clock_gating(struct drm_device *dev)
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-04 11:44         ` [PATCH v4 " sourab.gupta
@ 2014-04-04 15:24           ` Chris Wilson
  2014-04-04 15:35             ` Ville Syrjälä
  2014-04-04 15:59             ` Daniel Vetter
  0 siblings, 2 replies; 37+ messages in thread
From: Chris Wilson @ 2014-04-04 15:24 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Daniel Vetter, intel-gfx, Akash Goel

On Fri, Apr 04, 2014 at 05:14:38PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> On Gen4+ platforms (except BDW), Render Cache Operational flush
> cannot be enabled.
> This WA is apparently required for all Gen4+ platforms,except BDW.
> In BDW, the bit has been repurposed otherwise.
> This has been tested only on vlv.
> 
> v2: Corrected the code regarding the wrong usage of
> MASKED_BIT_DISABLE (Chris)
> 
> v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)
> 
> v4: Adding WA for g4x, crestline, broadwater (Ville)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Note that we now have a redundant CM0_RC_OP_FLUSH_DISABLE (which fails
the name test anyway). I'm also not a fan of
enable(RC_OP_FLUSH_ENABLE)/disable(RC_OP_FLUSH_ENABLE) either, but as
far as the content goes,

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Sadly, it didn't appear to fix any bugs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-04 15:24           ` Chris Wilson
@ 2014-04-04 15:35             ` Ville Syrjälä
  2014-04-04 15:59               ` Chris Wilson
  2014-04-04 15:59             ` Daniel Vetter
  1 sibling, 1 reply; 37+ messages in thread
From: Ville Syrjälä @ 2014-04-04 15:35 UTC (permalink / raw)
  To: Chris Wilson, sourab.gupta, intel-gfx, Daniel Vetter, Akash Goel

On Fri, Apr 04, 2014 at 04:24:05PM +0100, Chris Wilson wrote:
> On Fri, Apr 04, 2014 at 05:14:38PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > On Gen4+ platforms (except BDW), Render Cache Operational flush
> > cannot be enabled.
> > This WA is apparently required for all Gen4+ platforms,except BDW.
> > In BDW, the bit has been repurposed otherwise.
> > This has been tested only on vlv.
> > 
> > v2: Corrected the code regarding the wrong usage of
> > MASKED_BIT_DISABLE (Chris)
> > 
> > v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)
> > 
> > v4: Adding WA for g4x, crestline, broadwater (Ville)
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Note that we now have a redundant CM0_RC_OP_FLUSH_DISABLE (which fails
> the name test anyway).

That's the correct name for the bit on gen3 AFAICS. Might be interesting
to try to flip it on gen3 and see if we get moar fps :P

> I'm also not a fan of
> enable(RC_OP_FLUSH_ENABLE)/disable(RC_OP_FLUSH_ENABLE) either, but as
> far as the content goes,
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Sadly, it didn't appear to fix any bugs.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-04 15:35             ` Ville Syrjälä
@ 2014-04-04 15:59               ` Chris Wilson
  0 siblings, 0 replies; 37+ messages in thread
From: Chris Wilson @ 2014-04-04 15:59 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Akash Goel, Daniel Vetter, sourab.gupta, intel-gfx

On Fri, Apr 04, 2014 at 06:35:21PM +0300, Ville Syrjälä wrote:
> On Fri, Apr 04, 2014 at 04:24:05PM +0100, Chris Wilson wrote:
> > On Fri, Apr 04, 2014 at 05:14:38PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > On Gen4+ platforms (except BDW), Render Cache Operational flush
> > > cannot be enabled.
> > > This WA is apparently required for all Gen4+ platforms,except BDW.
> > > In BDW, the bit has been repurposed otherwise.
> > > This has been tested only on vlv.
> > > 
> > > v2: Corrected the code regarding the wrong usage of
> > > MASKED_BIT_DISABLE (Chris)
> > > 
> > > v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)
> > > 
> > > v4: Adding WA for g4x, crestline, broadwater (Ville)
> > > 
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Note that we now have a redundant CM0_RC_OP_FLUSH_DISABLE (which fails
> > the name test anyway).
> 
> That's the correct name for the bit on gen3 AFAICS. Might be interesting
> to try to flip it on gen3 and see if we get moar fps :P

Hmm, that's true. Ok, keep the unused name ;-)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
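
For context on the MASKED_BIT_DISABLE point in the v2 changelog, here is a
minimal sketch of the masked-register convention, assuming the usual i915
scheme in which the upper 16 bits of a write select which of the lower 16
bits get updated. The bit position chosen for RC_OP_FLUSH_ENABLE and the
helper apply_masked_write() are assumptions made for illustration; neither
is taken from the patch.

```c
#include <stdint.h>

/* Same shape as the i915 helpers: the high half is the write-enable mask. */
#define MASKED_BIT_ENABLE(a)	((uint32_t)(((a) << 16) | (a)))
#define MASKED_BIT_DISABLE(a)	((uint32_t)((a) << 16))

/* Assumed bit position, for illustration only. */
#define RC_OP_FLUSH_ENABLE	(1u << 0)

/*
 * Hypothetical model of how the hardware applies a masked write:
 * only bits whose mask bit (write >> 16) is set are replaced.
 */
uint32_t apply_masked_write(uint32_t old, uint32_t write)
{
	uint32_t mask = write >> 16;

	return ((old & ~mask) | (write & mask)) & 0xffffu;
}
```

Under this model the workaround only needs
MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE); a write that sets the bit without
its mask half leaves the register unchanged.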


* Re: [PATCH v4 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush'
  2014-04-04 15:24           ` Chris Wilson
  2014-04-04 15:35             ` Ville Syrjälä
@ 2014-04-04 15:59             ` Daniel Vetter
  1 sibling, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2014-04-04 15:59 UTC (permalink / raw)
  To: Chris Wilson, sourab.gupta, intel-gfx, Daniel Vetter,
	Ville Syrjala, Akash Goel

On Fri, Apr 04, 2014 at 04:24:05PM +0100, Chris Wilson wrote:
> On Fri, Apr 04, 2014 at 05:14:38PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > On Gen4+ platforms (except BDW), Render Cache Operational flush
> > cannot be enabled.
> > This WA is apparently required for all Gen4+ platforms,except BDW.
> > In BDW, the bit has been repurposed otherwise.
> > This has been tested only on vlv.
> > 
> > v2: Corrected the code regarding the wrong usage of
> > MASKED_BIT_DISABLE (Chris)
> > 
> > v3: Enhancing the scope of WA to Gen4+ platforms except BDW (Ville)
> > 
> > v4: Adding WA for g4x, crestline, broadwater (Ville)
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Note that we now have a redundant CM0_RC_OP_FLUSH_DISABLE (which fails
> the name test anyway). I'm also not a fan of
> enable(RC_OP_FLUSH_ENABLE)/disable(RC_OP_FLUSH_ENABLE) either, but as
> far as the content goes,
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Sadly, it didn't appear to fix any bugs.

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH v4 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-03-24 17:30 ` [PATCH v4 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' sourab.gupta
@ 2014-04-08  4:41   ` Gupta, Sourab
  2014-04-14  9:45     ` [PATCH v5 " sourab.gupta
  0 siblings, 1 reply; 37+ messages in thread
From: Gupta, Sourab @ 2014-04-08  4:41 UTC (permalink / raw)
  To: intel-gfx@lists.freedesktop.org; +Cc: Daniel Vetter, Goel, Akash

On Mon, 2014-03-24 at 17:30 +0000, Gupta, Sourab wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> This workaround is needed on VLV for the HW context feature.
> It is used after adding the mi_set_context command in ring buffer
> for Hw context switch. As per the spec
> "The software must send a pipe_control with a CS stall and a post sync
> operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
> PIPELINE_SELECT that is enabling 3D mode".
> 
> v2: Modified the WA comment. (Ville)
> 
> v3: Added the vlv identifier with the WA name
> 
> v4: Check removed for scratch page initialization. (Chris/Daniel)
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 54 +++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/i915_reg.h         |  1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 63 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 6043062..83bf89e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -584,6 +584,47 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>  	return ctx;
>  }
>  
> +static inline void
> +mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
> +{
> +	u32 scratch_addr;
> +	u32 flags = 0;
> +
> +	/* Actual scratch location is at 128 bytes offset */
> +	scratch_addr = intel_get_pipe_control_scratch_addr(ring) + 128;
> +
> +	/*
> +	 * WaSendDummy3dPrimitveAfterSetContext:vlv
> +	 * Software must send a pipe_control with a CS stall
> +	 * and a post sync operation and then a dummy DRAW after
> +	 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
> +	 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
> +	 * with Indirect Parameter Enable set to 0, UAV Coherency
> +	 * Required set to 0, Predicate Enable set to 0,
> +	 * End Offset Enable set to 0, and Vertex Count Per Instance
> +	 * set to 0, All other parameters are a don't care.
> +	 */
> +
> +	/*
> +	 * Add a pipe control with CS Stall and postsync op
> +	 * before dummy 3D_PRIMITIVE
> +	 */
> +	flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
> +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> +	intel_ring_emit(ring, flags);
> +	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
> +	intel_ring_emit(ring, 0);
> +
> +	/* Add a dummy 3D_PRIMITVE */
> +	intel_ring_emit(ring, GFX_OP_3DPRIMITIVE);
> +	intel_ring_emit(ring, 4); /* PrimTopoType*/
> +	intel_ring_emit(ring, 0); /* VertexCountPerInstance */
> +	intel_ring_emit(ring, 0); /* StartVertexLocation */
> +	intel_ring_emit(ring, 0); /* InstanceCount */
> +	intel_ring_emit(ring, 0); /* StartInstanceLocation */
> +	intel_ring_emit(ring, 0); /* BaseVertexLocation  */
> +}
> +
>  static inline int
>  mi_set_context(struct intel_ring_buffer *ring,
>  	       struct i915_hw_context *new_context,
> @@ -602,7 +643,10 @@ mi_set_context(struct intel_ring_buffer *ring,
>  			return ret;
>  	}
>  
> -	ret = intel_ring_begin(ring, 6);
> +	if (IS_VALLEYVIEW(ring->dev))
> +		ret = intel_ring_begin(ring, 6+4+8);
> +	else
> +		ret = intel_ring_begin(ring, 6);
>  	if (ret)
>  		return ret;
>  
> @@ -626,7 +670,13 @@ mi_set_context(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, MI_NOOP);
>  
>  	if (IS_GEN7(ring->dev))
> -		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +		if (IS_VALLEYVIEW(ring->dev)) {
> +			/* FIXME, should also apply to ivb */
> +			mi_set_context_dummy3d_prim_wa(ring);
> +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +			intel_ring_emit(ring, MI_NOOP);
> +		} else
> +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>  	else
>  		intel_ring_emit(ring, MI_NOOP);
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index adcb9c7..b922e38 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -348,6 +348,7 @@
>  #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
>  #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
>  
> +#define GFX_OP_3DPRIMITIVE ((0x3<<29)|(0x3<<27)|(0x3<<24)|(7-2))
>  
>  /*
>   * Reset registers
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 75cac4e..bace089 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -583,6 +583,15 @@ err:
>  	return ret;
>  }
>  
> +u32
> +intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
> +{
> +	if (ring->scratch.obj == NULL)
> +		return 0;
> +
> +	return ring->scratch.gtt_offset;
> +}
> +
>  static int init_render_ring(struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = ring->dev;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index f11ceb2..e38ca82 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -294,6 +294,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
>  
>  u32 intel_ring_get_active_head(struct intel_ring_buffer *ring);
>  void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
> +u32 intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
>  
>  static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
>  {

Gentle reminder to review the patch.

Thanks,
Sourab


* [PATCH v5 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-04-08  4:41   ` Gupta, Sourab
@ 2014-04-14  9:45     ` sourab.gupta
  2014-05-28  9:57       ` Gupta, Sourab
  0 siblings, 1 reply; 37+ messages in thread
From: sourab.gupta @ 2014-04-14  9:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

This workaround is needed on VLV for the HW context feature.
It is used after adding the mi_set_context command in ring buffer
for Hw context switch. As per the spec
"The software must send a pipe_control with a CS stall and a post sync
operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
PIPELINE_SELECT that is enabling 3D mode".
Tested only for vlv.

v2: Modified the WA comment. (Ville)

v3: Added the vlv identifier with the WA name

v4: Check removed for scratch page initialization. (Chris/Daniel)

v5: Refactored based on latest codebase. Also WA added for full Gen7.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 55 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h         |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f77b4c1..b6d2a67 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -545,6 +545,47 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static inline void
+mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
+{
+	u32 scratch_addr;
+	u32 flags = 0;
+
+	/* Actual scratch location is at 128 bytes offset */
+	scratch_addr = intel_get_pipe_control_scratch_addr(ring) + 128;
+
+	/*
+	 * WaSendDummy3dPrimitveAfterSetContext:ivb,vlv
+	 * Software must send a pipe_control with a CS stall
+	 * and a post sync operation and then a dummy DRAW after
+	 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
+	 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
+	 * with Indirect Parameter Enable set to 0, UAV Coherency
+	 * Required set to 0, Predicate Enable set to 0,
+	 * End Offset Enable set to 0, and Vertex Count Per Instance
+	 * set to 0, All other parameters are a don't care.
+	 */
+
+	/*
+	 * Add a pipe control with CS Stall and postsync op
+	 * before dummy 3D_PRIMITIVE
+	 */
+	flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
+	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
+	intel_ring_emit(ring, flags);
+	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
+	intel_ring_emit(ring, 0);
+
+	/* Add a dummy 3D_PRIMITVE */
+	intel_ring_emit(ring, GFX_OP_3DPRIMITIVE);
+	intel_ring_emit(ring, 4); /* PrimTopoType*/
+	intel_ring_emit(ring, 0); /* VertexCountPerInstance */
+	intel_ring_emit(ring, 0); /* StartVertexLocation */
+	intel_ring_emit(ring, 0); /* InstanceCount */
+	intel_ring_emit(ring, 0); /* StartInstanceLocation */
+	intel_ring_emit(ring, 0); /* BaseVertexLocation  */
+}
+
 static inline int
 mi_set_context(struct intel_ring_buffer *ring,
 	       struct i915_hw_context *new_context,
@@ -563,7 +604,10 @@ mi_set_context(struct intel_ring_buffer *ring,
 			return ret;
 	}
 
-	ret = intel_ring_begin(ring, 6);
+	if (INTEL_INFO(ring->dev)->gen == 7)
+		ret = intel_ring_begin(ring, 6+4+8);
+	else
+		ret = intel_ring_begin(ring, 6);
 	if (ret)
 		return ret;
 
@@ -586,8 +630,15 @@ mi_set_context(struct intel_ring_buffer *ring,
 	 */
 	intel_ring_emit(ring, MI_NOOP);
 
+	/* WaSendDummy3dPrimitveAfterSetContext:ivb,vlv */
 	if (INTEL_INFO(ring->dev)->gen >= 7)
-		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+		if (INTEL_INFO(ring->dev)->gen == 7) {
+			mi_set_context_dummy3d_prim_wa(ring);
+			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+			intel_ring_emit(ring, MI_NOOP);
+		} else
+			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+
 	else
 		intel_ring_emit(ring, MI_NOOP);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8f84555..1128527 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -355,6 +355,7 @@
 #define   PIPE_CONTROL_STALL_AT_SCOREBOARD		(1<<1)
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
+#define GFX_OP_3DPRIMITIVE ((0x3<<29)|(0x3<<27)|(0x3<<24)|(7-2))
 
 /*
  * Commands used only by the command parser
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index eb3dd26..834411b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -585,6 +585,15 @@ err:
 	return ret;
 }
 
+u32
+intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
+{
+	if (ring->scratch.obj == NULL)
+		return 0;
+
+	return ring->scratch.gtt_offset;
+}
+
 static int init_render_ring(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 413cdc7..ffaed8b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -291,6 +291,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
 
 u64 intel_ring_get_active_head(struct intel_ring_buffer *ring);
 void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
+u32 intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
 
 static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
 {
-- 
1.8.5.1
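
The GFX_OP_3DPRIMITIVE define added to i915_reg.h packs several fields into
one header dword. The sketch below rebuilds that value from the constants in
the patch and decodes the length field. The field names in the comments and
the helper names are assumptions for illustration, and the "length = total
dwords - 2" reading is inferred from the (7-2) term in the define.

```c
#include <stdint.h>

/* Rebuild the header dword from the constants used in the patch. */
uint32_t gfx_op_3dprimitive(void)
{
	return (0x3u << 29) |	/* bits 31:29, assumed: command type    */
	       (0x3u << 27) |	/* bits 28:27, assumed: command subtype */
	       (0x3u << 24) |	/* bits 26:24, assumed: opcode          */
	       (7u - 2u);	/* low bits: total dwords minus two     */
}

/* Hypothetical helper: recover the total command length in dwords. */
uint32_t cmd_total_dwords(uint32_t header)
{
	return (header & 0xffu) + 2u;
}
```

The resulting seven dwords match the seven intel_ring_emit() calls that
follow the "Add a dummy 3D_PRIMITVE" comment in the patch.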


* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-04-01  5:22       ` Gupta, Sourab
@ 2014-04-14 10:22         ` Gupta, Sourab
  2014-05-26 10:33           ` Gupta, Sourab
  0 siblings, 1 reply; 37+ messages in thread
From: Gupta, Sourab @ 2014-04-14 10:22 UTC (permalink / raw)
  To: Lespiau, Damien
  Cc: Daniel Vetter, intel-gfx@lists.freedesktop.org, Goel, Akash,
	Gupta, Sourab

On Tue, 2014-04-01 at 10:53 +0530, sourab gupta wrote:
> On Tue, 2014-03-25 at 12:23 +0530, sourab gupta wrote:
> > On Mon, 2014-03-24 at 17:56 +0000, Lespiau, Damien wrote:
> > > On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> > > > From: Akash Goel <akash.goel@intel.com>
> > > > 
> > > > For disabling L3 clock gating we need to set bit 25 of MMIO
> > > > register 940c. Earlier this was being done by just writing 1
> > > > into bit 25 and resetting all other bits.
> > > > This patch modifies the routine to read-modify-write of the
> > > > register, so that the values of other bits are not destroyed.
> > > > 
> > > > v2: Modifying the comments and the patch commit message (Chris)
> > > 
> > > This patch commit message lacks the most important information: which
> > > bit are we setting back to 0 and we shouldn't, and why is that
> > > important? We do direct writes to other registers in that function (for
> > > instance (MI_ARB_VLV just below).
> > > 
> > Hi Damien,
> > The reset value of the register is 0x00F80003. Therefore, if we directly
> > set only bit 25 to 1, without caring about other bits, the following reg
> > bits will be affected (bits 1:0, bits 23:19).
> > This doesn't seem to be the case with other regs where we are writing
> > directly (MI_ARB_VLV ) whose default value is 0.
> > So, by this commit we're just trying to set only the bit which we really
> > want to change.
> > 
> > Regards,
> > Sourab
> > 
> > 
> Hi Damien,
> Please provide your comments on the above explanation. I'll add more
> information to the commit message regarding the same, if it is okay.
> 
> Thanks,
> Sourab
> 
Hi Damien,

Waiting for feedback on the patch and the explanation. Can you please
let us know if the explained reason is good enough for the patch to be
considered. If so, it can be added to the commit message.

Regards,
Sourab
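
The difference described above can be sketched as follows, using the reset
value 0x00F80003 and bit 25 quoted in this thread. The two helper functions
are hypothetical stand-ins that operate on a plain value; the driver itself
goes through I915_READ and I915_WRITE.

```c
#include <stdint.h>

/* Values quoted in the thread for MMIO register 0x940c. */
#define UCGCTL4_RESET_VALUE		0x00F80003u
#define L3BANK2X_CLOCK_GATE_DISABLE	(1u << 25)

/* Old behaviour: the register ends up holding only bit 25. */
uint32_t direct_write(uint32_t old)
{
	(void)old;	/* previous contents are discarded */
	return L3BANK2X_CLOCK_GATE_DISABLE;
}

/* Patched behaviour: bit 25 is set and all other bits are kept. */
uint32_t read_modify_write(uint32_t old)
{
	return old | L3BANK2X_CLOCK_GATE_DISABLE;
}
```

Starting from the reset value, the direct write yields 0x02000000 (bits 1:0
and 23:19 cleared), while the read-modify-write yields 0x02F80003.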


* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-04-14 10:22         ` Gupta, Sourab
@ 2014-05-26 10:33           ` Gupta, Sourab
  0 siblings, 0 replies; 37+ messages in thread
From: Gupta, Sourab @ 2014-05-26 10:33 UTC (permalink / raw)
  To: Lespiau, Damien
  Cc: Daniel Vetter, intel-gfx@lists.freedesktop.org, Goel, Akash

On Mon, 2014-04-14 at 10:22 +0000, Gupta, Sourab wrote:
> On Tue, 2014-04-01 at 10:53 +0530, sourab gupta wrote:
> > On Tue, 2014-03-25 at 12:23 +0530, sourab gupta wrote:
> > > On Mon, 2014-03-24 at 17:56 +0000, Lespiau, Damien wrote:
> > > > On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> > > > > From: Akash Goel <akash.goel@intel.com>
> > > > > 
> > > > > For disabling L3 clock gating we need to set bit 25 of MMIO
> > > > > register 940c. Earlier this was being done by just writing 1
> > > > > into bit 25 and resetting all other bits.
> > > > > This patch modifies the routine to read-modify-write of the
> > > > > register, so that the values of other bits are not destroyed.
> > > > > 
> > > > > v2: Modifying the comments and the patch commit message (Chris)
> > > > 
> > > > This patch commit message lacks the most important information: which
> > > > bit are we setting back to 0 and we shouldn't, and why is that
> > > > important? We do direct writes to other registers in that function (for
> > > > instance (MI_ARB_VLV just below).
> > > > 
> > > Hi Damien,
> > > The reset value of the register is 0x00F80003. Therefore, if we directly
> > > set only bit 25 to 1, without caring about other bits, the following reg
> > > bits will be affected (bits 1:0, bits 23:19).
> > > This doesn't seem to be the case with other regs where we are writing
> > > directly (MI_ARB_VLV ) whose default value is 0.
> > > So, by this commit we're just trying to set only the bit which we really
> > > want to change.
> > > 
> > > Regards,
> > > Sourab
> > > 
> > > 
> > Hi Damien,
> > Please provide your comments on the above explanation. I'll add more
> > information to the commit message regarding the same, if it is okay.
> > 
> > Thanks,
> > Sourab
> > 
> Hi Damien,
> 
> Waiting for feedback on the patch and the explanation. Can you please
> let us know if the explained reason is good enough for the patch to be
> considered. If so, it can be added to the commit message.
> 
> Regards,
> Sourab
> 
Hi,
Could you please review this patch? I am waiting for feedback.
Thanks,
Sourab


* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-03-24 17:30 ` [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv sourab.gupta
  2014-03-24 17:56   ` Damien Lespiau
@ 2014-05-27 14:27   ` Damien Lespiau
  2014-05-27 16:54     ` Daniel Vetter
  1 sibling, 1 reply; 37+ messages in thread
From: Damien Lespiau @ 2014-05-27 14:27 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Daniel Vetter, intel-gfx, Akash Goel

On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> For disabling L3 clock gating we need to set bit 25 of MMIO
> register 940c. Earlier this was being done by just writing 1
> into bit 25 and resetting all other bits.
> This patch modifies the routine to read-modify-write of the
> register, so that the values of other bits are not destroyed.
> 
> v2: Modifying the comments and the patch commit message (Chris)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Apart from the multiline comment format and the second line not aligned
with the '(' as we usually do:

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>

-- 
Damien

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index c3a8554..af4bb8e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5093,8 +5093,11 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  	I915_WRITE(GEN6_UCGCTL2,
>  		   GEN6_RCZUNIT_CLOCK_GATE_DISABLE);
>  
> -	/* WaDisableL3Bank2xClockGate:vlv */
> -	I915_WRITE(GEN7_UCGCTL4, GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
> +	/* WaDisableL3Bank2xClockGate:vlv
> +	 * Disabling L3 clock gating- MMIO 940c[25] = 1
> +	 * Set bit 25, to disable L3_BANK_2x_CLK_GATING */
> +	I915_WRITE(GEN7_UCGCTL4,
> +			I915_READ(GEN7_UCGCTL4) | GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
>  
>  	I915_WRITE(MI_ARB_VLV, MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE);
>  
> -- 
> 1.8.5.1
> 


* Re: [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv
  2014-05-27 14:27   ` Damien Lespiau
@ 2014-05-27 16:54     ` Daniel Vetter
  0 siblings, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2014-05-27 16:54 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: Daniel Vetter, intel-gfx, sourab.gupta, Akash Goel

On Tue, May 27, 2014 at 03:27:23PM +0100, Damien Lespiau wrote:
> On Mon, Mar 24, 2014 at 11:00:07PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > For disabling L3 clock gating we need to set bit 25 of MMIO
> > register 940c. Earlier this was being done by just writing 1
> > into bit 25 and resetting all other bits.
> > This patch modifies the routine to read-modify-write of the
> > register, so that the values of other bits are not destroyed.
> > 
> > v2: Modifying the comments and the patch commit message (Chris)
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> 
> Apart from the multiline comment format and the second line not aligned
> with the '(' as we usually do:

Fixed.

> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH v5 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-04-14  9:45     ` [PATCH v5 " sourab.gupta
@ 2014-05-28  9:57       ` Gupta, Sourab
  2014-06-05  5:44         ` Gupta, Sourab
  0 siblings, 1 reply; 37+ messages in thread
From: Gupta, Sourab @ 2014-05-28  9:57 UTC (permalink / raw)
  To: intel-gfx@lists.freedesktop.org; +Cc: S, Deepak, Goel, Akash

On Mon, 2014-04-14 at 09:45 +0000, Gupta, Sourab wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> This workaround is needed on VLV for the HW context feature.
> It is used after adding the mi_set_context command in ring buffer
> for Hw context switch. As per the spec
> "The software must send a pipe_control with a CS stall and a post sync
> operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
> PIPELINE_SELECT that is enabling 3D mode".
> Tested only for vlv.
> 
> v2: Modified the WA comment. (Ville)
> 
> v3: Added the vlv identifier with the WA name
> 
> v4: Check removed for scratch page initialization. (Chris/Daniel)
> 
> v5: Refactored based on latest codebase. Also WA added for full Gen7.
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 55 +++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/i915_reg.h         |  1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index f77b4c1..b6d2a67 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -545,6 +545,47 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>  	return ctx;
>  }
>  
> +static inline void
> +mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
> +{
> +	u32 scratch_addr;
> +	u32 flags = 0;
> +
> +	/* Actual scratch location is at 128 bytes offset */
> +	scratch_addr = intel_get_pipe_control_scratch_addr(ring) + 128;
> +
> +	/*
> +	 * WaSendDummy3dPrimitveAfterSetContext:ivb,vlv
> +	 * Software must send a pipe_control with a CS stall
> +	 * and a post sync operation and then a dummy DRAW after
> +	 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
> +	 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
> +	 * with Indirect Parameter Enable set to 0, UAV Coherency
> +	 * Required set to 0, Predicate Enable set to 0,
> +	 * End Offset Enable set to 0, and Vertex Count Per Instance
> +	 * set to 0, All other parameters are a don't care.
> +	 */
> +
> +	/*
> +	 * Add a pipe control with CS Stall and postsync op
> +	 * before dummy 3D_PRIMITIVE
> +	 */
> +	flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
> +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> +	intel_ring_emit(ring, flags);
> +	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
> +	intel_ring_emit(ring, 0);
> +
> +	/* Add a dummy 3D_PRIMITVE */
> +	intel_ring_emit(ring, GFX_OP_3DPRIMITIVE);
> +	intel_ring_emit(ring, 4); /* PrimTopoType*/
> +	intel_ring_emit(ring, 0); /* VertexCountPerInstance */
> +	intel_ring_emit(ring, 0); /* StartVertexLocation */
> +	intel_ring_emit(ring, 0); /* InstanceCount */
> +	intel_ring_emit(ring, 0); /* StartInstanceLocation */
> +	intel_ring_emit(ring, 0); /* BaseVertexLocation  */
> +}
> +
>  static inline int
>  mi_set_context(struct intel_ring_buffer *ring,
>  	       struct i915_hw_context *new_context,
> @@ -563,7 +604,10 @@ mi_set_context(struct intel_ring_buffer *ring,
>  			return ret;
>  	}
>  
> -	ret = intel_ring_begin(ring, 6);
> +	if (INTEL_INFO(ring->dev)->gen == 7)
> +		ret = intel_ring_begin(ring, 6+4+8);
> +	else
> +		ret = intel_ring_begin(ring, 6);
>  	if (ret)
>  		return ret;
>  
> @@ -586,8 +630,15 @@ mi_set_context(struct intel_ring_buffer *ring,
>  	 */
>  	intel_ring_emit(ring, MI_NOOP);
>  
> +	/* WaSendDummy3dPrimitveAfterSetContext:ivb,vlv */
>  	if (INTEL_INFO(ring->dev)->gen >= 7)
> -		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +		if (INTEL_INFO(ring->dev)->gen == 7) {
> +			mi_set_context_dummy3d_prim_wa(ring);
> +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +			intel_ring_emit(ring, MI_NOOP);
> +		} else
> +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +
>  	else
>  		intel_ring_emit(ring, MI_NOOP);
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 8f84555..1128527 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -355,6 +355,7 @@
>  #define   PIPE_CONTROL_STALL_AT_SCOREBOARD		(1<<1)
>  #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
>  #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
> +#define GFX_OP_3DPRIMITIVE ((0x3<<29)|(0x3<<27)|(0x3<<24)|(7-2))
>  
>  /*
>   * Commands used only by the command parser
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index eb3dd26..834411b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -585,6 +585,15 @@ err:
>  	return ret;
>  }
>  
> +u32
> +intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
> +{
> +	if (ring->scratch.obj == NULL)
> +		return 0;
> +
> +	return ring->scratch.gtt_offset;
> +}
> +
>  static int init_render_ring(struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = ring->dev;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 413cdc7..ffaed8b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -291,6 +291,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
>  
>  u64 intel_ring_get_active_head(struct intel_ring_buffer *ring);
>  void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
> +u32 intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
>  
>  static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
>  {

Hi,

Could you please review this WA patch?

Thanks,
Sourab
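
To help with review, the ring space reserved by intel_ring_begin(ring, 6+4+8)
can be checked against the emit calls in the patch. The sketch below is only
a bookkeeping aid: the enum and function names are hypothetical, and reading
the "8" as seven 3DPRIMITIVE dwords plus one trailing MI_NOOP (presumably so
that the total stays even) is an interpretation of the diff, not something
stated in the commit message.

```c
/* Dword counts read off the intel_ring_emit() calls in the patch. */
enum {
	BASE_DWORDS		= 6,	/* original MI_SET_CONTEXT sequence  */
	PIPE_CONTROL_DWORDS	= 4,	/* GFX_OP_PIPE_CONTROL(4) + 3 dwords */
	PRIMITIVE_DWORDS	= 7,	/* GFX_OP_3DPRIMITIVE + 6 parameters */
	PAD_DWORDS		= 1,	/* extra MI_NOOP after MI_ARB_ON_OFF */
};

/* Hypothetical helper: dwords to reserve for one context switch. */
int mi_set_context_dwords(int gen)
{
	if (gen == 7)
		return BASE_DWORDS + PIPE_CONTROL_DWORDS +
		       PRIMITIVE_DWORDS + PAD_DWORDS;

	return BASE_DWORDS;
}
```

For Gen7 this gives 18, which matches 6+4+8; every other generation keeps
the original 6.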


* Re: [PATCH v5 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-05-28  9:57       ` Gupta, Sourab
@ 2014-06-05  5:44         ` Gupta, Sourab
  0 siblings, 0 replies; 37+ messages in thread
From: Gupta, Sourab @ 2014-06-05  5:44 UTC (permalink / raw)
  To: intel-gfx@lists.freedesktop.org; +Cc: S, Deepak, Goel, Akash

On Wed, 2014-05-28 at 15:27 +0530, sourab gupta wrote:
> On Mon, 2014-04-14 at 09:45 +0000, Gupta, Sourab wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > This workaround is needed on VLV for the HW context feature.
> > It is used after adding the mi_set_context command in ring buffer
> > for Hw context switch. As per the spec
> > "The software must send a pipe_control with a CS stall and a post sync
> > operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
> > PIPELINE_SELECT that is enabling 3D mode".
> > Tested only for vlv.
> > 
> > v2: Modified the WA comment. (Ville)
> > 
> > v3: Added the vlv identifier with the WA name
> > 
> > v4: Check removed for scratch page initialization. (Chris/Daniel)
> > 
> > v5: Refactored based on latest codebase. Also WA added for full Gen7.
> > 
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_context.c | 55 +++++++++++++++++++++++++++++++--
> >  drivers/gpu/drm/i915/i915_reg.h         |  1 +
> >  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++++
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
> >  4 files changed, 64 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index f77b4c1..b6d2a67 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -545,6 +545,47 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
> >  	return ctx;
> >  }
> >  
> > +static inline void
> > +mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
> > +{
> > +	u32 scratch_addr;
> > +	u32 flags = 0;
> > +
> > +	/* Actual scratch location is at 128 bytes offset */
> > +	scratch_addr = intel_get_pipe_control_scratch_addr(ring) + 128;
> > +
> > +	/*
> > +	 * WaSendDummy3dPrimitveAfterSetContext:ivb,vlv
> > +	 * Software must send a pipe_control with a CS stall
> > +	 * and a post sync operation and then a dummy DRAW after
> > +	 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
> > +	 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
> > +	 * with Indirect Parameter Enable set to 0, UAV Coherency
> > +	 * Required set to 0, Predicate Enable set to 0,
> > +	 * End Offset Enable set to 0, and Vertex Count Per Instance
> > +	 * set to 0. All other parameters are don't care.
> > +	 */
> > +
> > +	/*
> > +	 * Add a pipe control with CS Stall and postsync op
> > +	 * before dummy 3D_PRIMITIVE
> > +	 */
> > +	flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
> > +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> > +	intel_ring_emit(ring, flags);
> > +	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
> > +	intel_ring_emit(ring, 0);
> > +
> > +	/* Add a dummy 3D_PRIMITIVE */
> > +	intel_ring_emit(ring, GFX_OP_3DPRIMITIVE);
> > +	intel_ring_emit(ring, 4); /* PrimTopoType*/
> > +	intel_ring_emit(ring, 0); /* VertexCountPerInstance */
> > +	intel_ring_emit(ring, 0); /* StartVertexLocation */
> > +	intel_ring_emit(ring, 0); /* InstanceCount */
> > +	intel_ring_emit(ring, 0); /* StartInstanceLocation */
> > +	intel_ring_emit(ring, 0); /* BaseVertexLocation  */
> > +}
> > +
> >  static inline int
> >  mi_set_context(struct intel_ring_buffer *ring,
> >  	       struct i915_hw_context *new_context,
> > @@ -563,7 +604,10 @@ mi_set_context(struct intel_ring_buffer *ring,
> >  			return ret;
> >  	}
> >  
> > -	ret = intel_ring_begin(ring, 6);
> > +	if (INTEL_INFO(ring->dev)->gen == 7)
> > +		ret = intel_ring_begin(ring, 6+4+8);
> > +	else
> > +		ret = intel_ring_begin(ring, 6);
> >  	if (ret)
> >  		return ret;
> >  
> > @@ -586,8 +630,15 @@ mi_set_context(struct intel_ring_buffer *ring,
> >  	 */
> >  	intel_ring_emit(ring, MI_NOOP);
> >  
> > +	/* WaSendDummy3dPrimitveAfterSetContext:ivb,vlv */
> >  	if (INTEL_INFO(ring->dev)->gen >= 7)
> > -		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> > +		if (INTEL_INFO(ring->dev)->gen == 7) {
> > +			mi_set_context_dummy3d_prim_wa(ring);
> > +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> > +			intel_ring_emit(ring, MI_NOOP);
> > +		} else
> > +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> > +
> >  	else
> >  		intel_ring_emit(ring, MI_NOOP);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 8f84555..1128527 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -355,6 +355,7 @@
> >  #define   PIPE_CONTROL_STALL_AT_SCOREBOARD		(1<<1)
> >  #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
> >  #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
> > +#define GFX_OP_3DPRIMITIVE ((0x3<<29)|(0x3<<27)|(0x3<<24)|(7-2))
> >  
> >  /*
> >   * Commands used only by the command parser
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index eb3dd26..834411b 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -585,6 +585,15 @@ err:
> >  	return ret;
> >  }
> >  
> > +u32
> > +intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
> > +{
> > +	if (ring->scratch.obj == NULL)
> > +		return 0;
> > +
> > +	return ring->scratch.gtt_offset;
> > +}
> > +
> >  static int init_render_ring(struct intel_ring_buffer *ring)
> >  {
> >  	struct drm_device *dev = ring->dev;
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > index 413cdc7..ffaed8b 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -291,6 +291,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
> >  
> >  u64 intel_ring_get_active_head(struct intel_ring_buffer *ring);
> >  void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
> > +u32 intel_get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
> >  
> >  static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
> >  {
> 
> Hi,
> 
> Could you please review this WA patch?
> 
> Thanks,
> Sourab
> 
Hi,
Could you please provide your comments on the above WA patch?
Thanks,
Sourab
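
To help with review, the ring-space accounting in the quoted patch
(intel_ring_begin(ring, 6+4+8) on gen7) can be checked with a small
standalone sketch. This is only an illustration, not driver code: the enum
names and both helper functions are hypothetical, and the sketch assumes
that the low byte of the command header holds the length as (total dwords - 2),
which is how the (7-2) term in the define reads. Only the GFX_OP_3DPRIMITIVE
value is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Header dword, same bit layout as the GFX_OP_3DPRIMITIVE define in the patch. */
#define GFX_OP_3DPRIMITIVE ((0x3u << 29) | (0x3u << 27) | (0x3u << 24) | (7 - 2))

/* Hypothetical names, for illustration only; they do not exist in i915. */
enum {
	SET_CONTEXT_DWORDS  = 6, /* dwords reserved before the patch */
	PIPE_CONTROL_DWORDS = 4, /* GFX_OP_PIPE_CONTROL(4): header + 3 operands */
	PRIMITIVE_DWORDS    = 7, /* 3DPRIMITIVE header + 6 parameter dwords */
	PAD_DWORDS          = 1, /* the extra MI_NOOP after MI_ARB_ON_OFF */
};

/* Dwords that must be reserved on gen7 once the workaround is emitted. */
static unsigned int gen7_set_context_dwords(void)
{
	return SET_CONTEXT_DWORDS + PIPE_CONTROL_DWORDS +
	       PRIMITIVE_DWORDS + PAD_DWORDS;
}

/* Total command length, assuming the low byte stores (total dwords - 2). */
static unsigned int primitive_total_dwords(uint32_t header)
{
	return (header & 0xff) + 2;
}
```

With these numbers the 8 in 6+4+8 is the seven 3DPRIMITIVE dwords plus the
one MI_NOOP, and the total comes to 18 dwords. The trailing MI_NOOP presumably
keeps the emitted count even.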


end of thread, other threads:[~2014-06-05  5:44 UTC | newest]

Thread overview: 37+ messages
2014-03-24 17:30 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
2014-03-24 17:30 ` [PATCH v4 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' sourab.gupta
2014-03-24 17:30 ` [PATCH v4 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' sourab.gupta
2014-04-08  4:41   ` Gupta, Sourab
2014-04-14  9:45     ` [PATCH v5 " sourab.gupta
2014-05-28  9:57       ` Gupta, Sourab
2014-06-05  5:44         ` Gupta, Sourab
2014-03-24 17:30 ` [PATCH v2 3/6] drm/i915: Enabling the TLB invalidate bit in GFX Mode register sourab.gupta
2014-04-01  5:01   ` Gupta, Sourab
2014-04-02 11:34   ` Ville Syrjälä
2014-04-02 11:55     ` Daniel Vetter
2014-03-24 17:30 ` [PATCH 4/6] drm/i915/vlv: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg sourab.gupta
2014-03-24 17:47   ` Chris Wilson
2014-03-24 17:55     ` Gupta, Sourab
2014-03-24 18:01       ` Chris Wilson
2014-03-24 18:28         ` [PATCH v2 " sourab.gupta
2014-03-25 11:33           ` Ville Syrjälä
2014-03-25 12:31             ` [PATCH v3 4/6] drm/i915: " sourab.gupta
2014-03-25 13:11               ` Ville Syrjälä
2014-03-25 15:41                 ` Daniel Vetter
2014-03-24 17:30 ` [PATCH v2 5/6] drm/i915/vlv:Implement the WA 'WaDisable_RenderCache_OperationalFlush' sourab.gupta
2014-04-01 10:51   ` Ville Syrjälä
2014-04-03  4:42     ` [PATCH v3 " sourab.gupta
2014-04-04 11:17       ` Ville Syrjälä
2014-04-04 11:44         ` [PATCH v4 " sourab.gupta
2014-04-04 15:24           ` Chris Wilson
2014-04-04 15:35             ` Ville Syrjälä
2014-04-04 15:59               ` Chris Wilson
2014-04-04 15:59             ` Daniel Vetter
2014-03-24 17:30 ` [PATCH v2 6/6] drm/i915/vlv: Modifying WA 'WaDisableL3Bank2xClockGate for vlv sourab.gupta
2014-03-24 17:56   ` Damien Lespiau
2014-03-25  6:52     ` Gupta, Sourab
2014-04-01  5:22       ` Gupta, Sourab
2014-04-14 10:22         ` Gupta, Sourab
2014-05-26 10:33           ` Gupta, Sourab
2014-05-27 14:27   ` Damien Lespiau
2014-05-27 16:54     ` Daniel Vetter
