public inbox for intel-gfx@lists.freedesktop.org
* [PATCH 0/6] Rendering specific Hw workarounds for VLV
@ 2014-01-22  3:45 akash.goel
  2014-01-22  3:45 ` [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' akash.goel
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

The following patches lead to stable behavior on VLV, especially
when running 3D apps and benchmarks.

Akash Goel (6):
  drm/i915/vlv: Added a rendering specific Hw WA
    'WaTlbInvalidateStoreDataBefore'
  drm/i915/vlv: Added a rendering specific Hw WA
    'WaReadAfterWriteHazard'
  drm/i915/vlv: Modified the programming of 2 regs in Ring
    initialisation
  drm/i915/vlv: Added 3 rendering specific Hw Workarounds in clock
    gating fn
  drm/i915/vlv: Removed 3 rendering specific Hw WA from clock gating fn
  drm/i915/vlv: Added a rendering specific Hw WA
    'WaSendDummy3dPrimitveAfterSetContext'

 drivers/gpu/drm/i915/i915_gem_context.c | 64 +++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h         |  6 +++
 drivers/gpu/drm/i915/intel_pm.c         | 32 +++++++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c | 75 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 5 files changed, 160 insertions(+), 18 deletions(-)

-- 
1.8.5.2

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-01-22  3:45 [PATCH 0/6] Rendering specific Hw workarounds for VLV akash.goel
@ 2014-01-22  3:45 ` akash.goel
  2014-01-22 10:51   ` Ville Syrjälä
  2014-01-22  3:45 ` [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard' akash.goel
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Added a new rendering-specific workaround, 'WaTlbInvalidateStoreDataBefore'.
This WA requires 2 MI store data commands to be emitted before a
PIPE_CONTROL with TLB invalidate set.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 442c9a6..133d273 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2177,6 +2177,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
 	uint32_t flush_domains;
 	int ret;
 
+	if (IS_VALLEYVIEW(ring->dev)) {
+		/*
+		 * WaTlbInvalidateStoreDataBefore
+		 * Before pipecontrol with TLB invalidate set, need 2 store
+		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
+		 * Without this, hardware cannot guarantee the command after the
+		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
+		 */
+		int i;
+		ret = intel_ring_begin(ring, 4 * 2);
+		if (ret)
+			return ret;
+		for (i = 0; i < 2; i++) {
+			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
+						MI_STORE_DWORD_INDEX_SHIFT);
+			intel_ring_emit(ring, 0);
+			intel_ring_emit(ring, MI_NOOP);
+		}
+		intel_ring_advance(ring);
+	}
+
 	flush_domains = 0;
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
-- 
1.8.5.2
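
For illustration, the emission pattern in this patch can be modelled outside
the driver. The sketch below is a stand-alone simulation, not driver code:
struct fake_ring, fake_ring_begin(), fake_ring_emit() and
emit_scratch_stores() are hypothetical stand-ins, and the opcode and index
values are assumed to match the i915 headers of this period, so check them
against the tree before relying on them. It shows that each store packet
takes 4 dwords, which is why the patch reserves 4 * 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror i915_reg.h / i915_drv.h; verify against the tree. */
#define MI_INSTR(opcode, flags)     (((uint32_t)(opcode) << 23) | (flags))
#define MI_NOOP                     MI_INSTR(0, 0)
#define MI_STORE_DWORD_INDEX        MI_INSTR(0x21, 1)
#define MI_STORE_DWORD_INDEX_SHIFT  2
#define I915_GEM_HWS_SCRATCH_INDEX  0x30

/* Hypothetical stand-in for struct intel_ring_buffer. */
struct fake_ring {
	uint32_t cmds[64];
	size_t tail;     /* index of the next free dword */
	size_t reserved; /* dwords left from the last begin */
};

/* Reserve space, failing if the request does not fit. */
static int fake_ring_begin(struct fake_ring *ring, size_t num_dwords)
{
	size_t capacity = sizeof(ring->cmds) / sizeof(ring->cmds[0]);

	if (ring->tail + num_dwords > capacity)
		return -1;
	ring->reserved = num_dwords;
	return 0;
}

/* Write one dword; emitting more than was reserved is a bug. */
static void fake_ring_emit(struct fake_ring *ring, uint32_t data)
{
	assert(ring->reserved > 0);
	ring->cmds[ring->tail++] = data;
	ring->reserved--;
}

/* Emit 'count' store-dword packets aimed at the HWS scratch slot. */
static int emit_scratch_stores(struct fake_ring *ring, int count)
{
	int i, ret;

	ret = fake_ring_begin(ring, 4 * (size_t)count);
	if (ret)
		return ret;

	for (i = 0; i < count; i++) {
		fake_ring_emit(ring, MI_STORE_DWORD_INDEX);
		fake_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
				     MI_STORE_DWORD_INDEX_SHIFT);
		fake_ring_emit(ring, 0);       /* value to store */
		fake_ring_emit(ring, MI_NOOP); /* pad to 4 dwords */
	}
	return 0;
}
```

Calling emit_scratch_stores(&ring, 2) on a zeroed struct fake_ring models this
patch; the same helper with a count of 12 models the 4 * 12 reservation used
for WaReadAfterWriteHazard later in the series.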

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'
  2014-01-22  3:45 [PATCH 0/6] Rendering specific Hw workarounds for VLV akash.goel
  2014-01-22  3:45 ` [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' akash.goel
@ 2014-01-22  3:45 ` akash.goel
  2014-01-22 10:54   ` Ville Syrjälä
  2014-01-22  3:45 ` [PATCH 3/6] drm/i915/vlv: Modified the programming of 2 regs in Ring initialisation akash.goel
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Added a new rendering-specific workaround, 'WaReadAfterWriteHazard'.
This WA requires 12 MI Store Dword commands to be emitted to ensure that
the h/w pipeline is properly flushed.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 133d273..e8ec536 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
 
 	trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
 
+	if (IS_VALLEYVIEW(ring->dev)) {
+		/*
+		 * WaReadAfterWriteHazard
+		 * Send a number of Store Data commands here to finish
+		 * flushing hardware pipeline. This is needed in the case
+		 * where the next workload tries reading from the same
+		 * surface that this batch writes to. Without these StoreDWs,
+		 * not all of the data will actually be flushed to the surface
+		 * by the time the next batch starts reading it, possibly
+		 * causing a small amount of corruption.
+		 */
+		int i;
+		ret = intel_ring_begin(ring, 4 * 12);
+		if (ret)
+			return ret;
+		for (i = 0; i < 12; i++) {
+			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
+							MI_STORE_DWORD_INDEX_SHIFT);
+			intel_ring_emit(ring, 0);
+			intel_ring_emit(ring, MI_NOOP);
+		}
+		intel_ring_advance(ring);
+	}
+
 	ring->gpu_caches_dirty = false;
 	return 0;
 }
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/6] drm/i915/vlv: Modified the programming of 2 regs in Ring initialisation
  2014-01-22  3:45 [PATCH 0/6] Rendering specific Hw workarounds for VLV akash.goel
  2014-01-22  3:45 ` [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' akash.goel
  2014-01-22  3:45 ` [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard' akash.goel
@ 2014-01-22  3:45 ` akash.goel
  2014-01-22 11:01   ` Ville Syrjälä
  2014-01-22  3:45 ` [PATCH 4/6] drm/i915/vlv: Added 3 rendering specific Hw Workarounds in clock gating fn akash.goel
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Modified the programming of the following 2 registers in the render ring
initialisation function:
1. GFX_MODE_GEN7 (Enabling TLB invalidate)
2. MI_MODE (Enabling MI Flush)

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e8ec536..8b99df2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -563,7 +563,9 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 	int ret = init_ring_common(ring);
 
 	if (INTEL_INFO(dev)->gen > 3)
-		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
+		if (!IS_VALLEYVIEW(dev))
+			I915_WRITE(MI_MODE,
+				_MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
 
 	/* We need to disable the AsyncFlip performance optimisations in order
 	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
@@ -579,10 +581,17 @@ static int init_render_ring(struct intel_ring_buffer *ring)
 		I915_WRITE(GFX_MODE,
 			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_ALWAYS));
 
-	if (IS_GEN7(dev))
-		I915_WRITE(GFX_MODE_GEN7,
-			   _MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_ALWAYS) |
-			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
+	if (IS_GEN7(dev)) {
+		if (IS_VALLEYVIEW(dev)) {
+			I915_WRITE(GFX_MODE_GEN7,
+				_MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
+			I915_WRITE(MI_MODE, I915_READ(MI_MODE) |
+				_MASKED_BIT_ENABLE(MI_FLUSH_ENABLE));
+		} else
+			I915_WRITE(GFX_MODE_GEN7,
+				_MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_ALWAYS) |
+				_MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
+	}
 
 	if (INTEL_INFO(dev)->gen >= 5) {
 		ret = init_pipe_control(ring);
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/6] drm/i915/vlv: Added 3 rendering specific Hw Workarounds in clock gating fn
  2014-01-22  3:45 [PATCH 0/6] Rendering specific Hw workarounds for VLV akash.goel
                   ` (2 preceding siblings ...)
  2014-01-22  3:45 ` [PATCH 3/6] drm/i915/vlv: Modified the programming of 2 regs in Ring initialisation akash.goel
@ 2014-01-22  3:45 ` akash.goel
  2014-01-22 11:10   ` Ville Syrjälä
  2014-01-22  3:45 ` [PATCH 5/6] drm/i915/vlv: Removed 3 rendering specific Hw WA from clock gating fn akash.goel
  2014-01-22  3:45 ` [PATCH 6/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' akash.goel
  5 siblings, 1 reply; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Added 2 new rendering-specific workarounds:
1. WaDisable_RenderCache_OperationalFlush
     Operational flush cannot be enabled on
     BWG A0 [Errata BWT006]
2. WaVSThreadDispatchOverride
     Performance optimization: Hw decides which half slice
     a thread is dispatched to. May not really be needed
     for VLV, as it has a single slice.

Modified the implementation of 1 workaround:
1. WaDisableL3Bank2xClockGate
    Disabling L3 clock gating: MMIO 940c[25] = 1

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h |  3 +++
 drivers/gpu/drm/i915/intel_pm.c | 22 +++++++++++++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a699efd..d829754 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -934,6 +934,9 @@
 #define   ECO_GATING_CX_ONLY	(1<<3)
 #define   ECO_FLIP_DONE		(1<<0)
 
+#define GEN7_CACHE_MODE_0	0x07000 /* IVB+ only */
+#define GEN7_RC_OP_FLUSH_ENABLE (1<<0)
+
 #define CACHE_MODE_1		0x7004 /* IVB+ */
 #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 469170c..4c36ff8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4955,6 +4955,12 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN7_L3CNTLREG1, I915_READ(GEN7_L3CNTLREG1) | GEN7_L3AGDIS);
 	I915_WRITE(GEN7_L3_CHICKEN_MODE_REGISTER, GEN7_WA_L3_CHICKEN_MODE);
 
+	/* WaDisable_RenderCache_OperationalFlush
+	 * Clear bit 0, so we do a AND with the mask
+	 * to keep other bits the same */
+	I915_WRITE(GEN7_CACHE_MODE_0,  (I915_READ(GEN7_CACHE_MODE_0) |
+			  _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE)));
+
 	/* WaForceL3Serialization:vlv */
 	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
 		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
@@ -4991,10 +4997,24 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   GEN6_RCPBUNIT_CLOCK_GATE_DISABLE |
 		   GEN6_RCCUNIT_CLOCK_GATE_DISABLE);
 
-	I915_WRITE(GEN7_UCGCTL4, GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
+	/* WaDisableL3Bank2xClockGate
+	 * Disabling L3 clock gating- MMIO 940c[25] = 1
+	 * Set bit 25, to disable L3_BANK_2x_CLK_GATING */
+	I915_WRITE(GEN7_UCGCTL4,
+		I915_READ(GEN7_UCGCTL4) | GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
 
 	I915_WRITE(MI_ARB_VLV, MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE);
 
+	/* WaVSThreadDispatchOverride
+	 * Hw will decide which half slice the thread will dispatch.
+	 * May not be needed for VLV, as it's a single slice */
+	I915_WRITE(GEN7_CACHE_MODE_0,
+		I915_READ(GEN7_FF_THREAD_MODE) &
+		(~GEN7_FF_VS_SCHED_LOAD_BALANCE));
+
+	/* WaDisable4x2SubspanOptimization,
+	 * Disable combining of two 2x2 subspans into a 4x2 subspan
+	 * Set chicken bit to disable subspan optimization */
 	I915_WRITE(CACHE_MODE_1,
 		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
 
-- 
1.8.5.2
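
As a side note on the GEN7_CACHE_MODE_0 write above, the behaviour of a masked
register can be modelled in a few lines. This is a sketch under stated
assumptions, not driver code: masked_write() is a hypothetical model in which
the upper 16 bits of the written value select which of the lower 16 bits get
updated, and a read is assumed to return the current bits in the lower half.
The two macros follow the i915 definitions, with a uint32_t cast added here to
keep the shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* As in i915_reg.h, plus a cast (added for this sketch). */
#define _MASKED_BIT_ENABLE(a)   (((uint32_t)(a) << 16) | (a))
#define _MASKED_BIT_DISABLE(a)  ((uint32_t)(a) << 16)

#define GEN7_RC_OP_FLUSH_ENABLE (1u << 0)

/*
 * Hypothetical model of a masked register: only the bits whose mask
 * bit (upper half of 'value') is set are taken from 'value'; all other
 * bits keep their current state.
 */
static uint32_t masked_write(uint32_t reg, uint32_t value)
{
	uint32_t mask = value >> 16;
	uint32_t bits = value & 0xffff;

	return (reg & ~mask & 0xffff) | (bits & mask);
}

/* Plain masked disable: the register is not read back first. */
static uint32_t disable_plain(uint32_t reg)
{
	return masked_write(reg, _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE));
}

/* Read-OR form as written in the patch above. */
static uint32_t disable_read_or(uint32_t reg)
{
	return masked_write(reg, reg |
			    _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE));
}
```

Under this model, disable_plain() clears bit 0 and leaves the other bits
alone, whereas disable_read_or() writes the current value of bit 0 straight
back, so a bit that is already set stays set. If the hardware behaves as
modelled, the read is both unnecessary and counterproductive.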

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/6] drm/i915/vlv: Removed 3 rendering specific Hw WA from clock gating fn
  2014-01-22  3:45 [PATCH 0/6] Rendering specific Hw workarounds for VLV akash.goel
                   ` (3 preceding siblings ...)
  2014-01-22  3:45 ` [PATCH 4/6] drm/i915/vlv: Added 3 rendering specific Hw Workarounds in clock gating fn akash.goel
@ 2014-01-22  3:45 ` akash.goel
  2014-01-22 11:11   ` Ville Syrjälä
  2014-01-22  3:45 ` [PATCH 6/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' akash.goel
  5 siblings, 1 reply; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Removed 3 workarounds that are not needed for VLV+ (B0 onwards):
1. WaDisableRHWOOptimizationForRenderHang
2. WaDisableL3CacheAging
3. WaDisableDopClockGating

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 4c36ff8..e4d220c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4947,12 +4947,6 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
 				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
-	/* Apply the WaDisableRHWOOptimizationForRenderHang:vlv workaround. */
-	I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
-		   GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
-
-	/* WaApplyL3ControlAndL3ChickenMode:vlv */
-	I915_WRITE(GEN7_L3CNTLREG1, I915_READ(GEN7_L3CNTLREG1) | GEN7_L3AGDIS);
 	I915_WRITE(GEN7_L3_CHICKEN_MODE_REGISTER, GEN7_WA_L3_CHICKEN_MODE);
 
 	/* WaDisable_RenderCache_OperationalFlush
@@ -4965,10 +4959,6 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
 		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
 
-	/* WaDisableDopClockGating:vlv */
-	I915_WRITE(GEN7_ROW_CHICKEN2,
-		   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
-
 	/* This is required by WaCatErrorRejectionIssue:vlv */
 	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
 		   I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-01-22  3:45 [PATCH 0/6] Rendering specific Hw workarounds for VLV akash.goel
                   ` (4 preceding siblings ...)
  2014-01-22  3:45 ` [PATCH 5/6] drm/i915/vlv: Removed 3 rendering specific Hw WA from clock gating fn akash.goel
@ 2014-01-22  3:45 ` akash.goel
  2014-01-22 11:18   ` Ville Syrjälä
  5 siblings, 1 reply; 25+ messages in thread
From: akash.goel @ 2014-01-22  3:45 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

This workaround is needed on VLV for the HW context feature.
It is emitted after the MI_SET_CONTEXT command has been added to the ring
buffer for a HW context switch. As per the spec:
"The software must send a pipe_control with a CS stall and a post sync
operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
PIPELINE_SELECT that is enabling 3D mode".

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 64 +++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h         |  3 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  9 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ebe0f67..62a5362 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -532,6 +532,58 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 	return (struct i915_hw_context *)idr_find(&file_priv->context_idr, id);
 }
 
+static inline void
+mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
+{
+	u32 scratch_addr;
+	u32 flags = 0;
+
+	/*
+	 * Check whether the scratch page needed for the Pipe Control
+	 * command has been allocated; otherwise don't apply the
+	 * dummy 3d primitive workaround & add NOOPs instead
+	 */
+	if (get_pipe_control_scratch_addr(ring)) {
+		/* Actual scratch location is at 128 bytes offset */
+		scratch_addr = get_pipe_control_scratch_addr(ring) + 128;
+
+		/*
+		 * WaSendDummy3dPrimitveAfterSetContext
+		 * Software must send a pipe_control with a CS stall
+		 * and a post sync operation and then a dummy DRAW after
+		 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
+		 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
+		 * with Indirect Parameter Enable set to 0, UAV Coherency
+		 * Required set to 0, Predicate Enable set to 0,
+		 * End Offset Enable set to 0, and Vertex Count Per Instance
+		 * set to 0, All other parameters are a don't care.
+		 */
+
+		/*
+		 * Add a pipe control with CS Stall and postsync op
+		 * before dummy 3D_PRIMITIVE
+		 */
+		flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
+		intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
+		intel_ring_emit(ring, flags);
+		intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
+		intel_ring_emit(ring, 0);
+
+		/* Add a dummy 3D_PRIMITIVE */
+		intel_ring_emit(ring, GFX_OP_3DPRIMITIVE());
+		intel_ring_emit(ring, 4); /* PrimTopoType */
+		intel_ring_emit(ring, 0); /* VertexCountPerInstance */
+		intel_ring_emit(ring, 0); /* StartVertexLocation */
+		intel_ring_emit(ring, 0); /* InstanceCount */
+		intel_ring_emit(ring, 0); /* StartInstanceLocation */
+		intel_ring_emit(ring, 0); /* BaseVertexLocation  */
+	} else {
+		int i;
+		for (i = 0; i < 11; i++)
+			intel_ring_emit(ring, MI_NOOP);
+	}
+}
+
 static inline int
 mi_set_context(struct intel_ring_buffer *ring,
 	       struct i915_hw_context *new_context,
@@ -550,7 +602,10 @@ mi_set_context(struct intel_ring_buffer *ring,
 			return ret;
 	}
 
-	ret = intel_ring_begin(ring, 6);
+	if (IS_VALLEYVIEW(ring->dev))
+		ret = intel_ring_begin(ring, 6+4+8);
+	else
+		ret = intel_ring_begin(ring, 6);
 	if (ret)
 		return ret;
 
@@ -571,7 +626,12 @@ mi_set_context(struct intel_ring_buffer *ring,
 	intel_ring_emit(ring, MI_NOOP);
 
 	if (IS_GEN7(ring->dev))
-		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+		if (IS_VALLEYVIEW(ring->dev)) {
+			mi_set_context_dummy3d_prim_wa(ring);
+			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
+			intel_ring_emit(ring, MI_NOOP);
+		} else
+			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 	else
 		intel_ring_emit(ring, MI_NOOP);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d829754..649106d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -335,6 +335,9 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
+#define GFX_OP_3DPRIMITIVE()              \
+	((0x3<<29)|(0x3<<27)|(0x3<<24)|       \
+	 (0x0<<16)|(0x0<<10)|(0x0<<8)|(7-2))
 
 /*
  * Reset registers
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8b99df2..a93b631 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -556,6 +556,15 @@ err:
 	return ret;
 }
 
+u32
+get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
+{
+	if (ring->scratch.obj == NULL)
+		return 0;
+
+	return ring->scratch.gtt_offset;
+}
+
 static int init_render_ring(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 71a73f4..2ae6029 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -257,6 +257,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
 
 u32 intel_ring_get_active_head(struct intel_ring_buffer *ring);
 void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
+u32 get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
 
 static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
 {
-- 
1.8.5.2
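
The dword budget in this patch can be checked with a small stand-alone
calculation. This is only a sketch: cmd_total_dwords() and
vlv_set_context_dwords() are hypothetical helpers, the length field is assumed
to occupy the low 8 bits of the header with a bias of 2 (as the "7-2" in the
macro suggests), and the count of 5 dwords emitted before the workaround is
inferred from the original intel_ring_begin(ring, 6) minus the single trailing
dword.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as the macro added to i915_reg.h in this patch. */
#define GFX_OP_3DPRIMITIVE() \
	((0x3u << 29) | (0x3u << 27) | (0x3u << 24) | \
	 (0x0u << 16) | (0x0u << 10) | (0x0u << 8) | (7 - 2))

/* Assumption: low 8 bits hold (total dwords - 2). */
static unsigned int cmd_total_dwords(uint32_t header)
{
	return (header & 0xff) + 2;
}

enum {
	DWORDS_BEFORE_WA    = 5, /* unchanged part of mi_set_context() */
	PIPE_CONTROL_DWORDS = 4, /* GFX_OP_PIPE_CONTROL(4) packet */
	PRIMITIVE_DWORDS    = 7, /* header + 6 parameter dwords */
	TRAILER_DWORDS      = 2, /* MI_ARB_ON_OFF + MI_NOOP */
};

/* Dwords emitted on the VLV path, including the workaround. */
static unsigned int vlv_set_context_dwords(void)
{
	return DWORDS_BEFORE_WA + PIPE_CONTROL_DWORDS +
	       PRIMITIVE_DWORDS + TRAILER_DWORDS;
}
```

With these assumptions the VLV path adds up to 18 dwords, matching the
6+4+8 passed to intel_ring_begin(), and the NOOP fallback emits
4 + 7 = 11 dwords so both branches of the helper consume the same space.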

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-01-22  3:45 ` [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' akash.goel
@ 2014-01-22 10:51   ` Ville Syrjälä
  0 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-01-22 10:51 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Jan 22, 2014 at 09:15:05AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> Store data commands.
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 442c9a6..133d273 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2177,6 +2177,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
>  	uint32_t flush_domains;
>  	int ret;
>  
> +	if (IS_VALLEYVIEW(ring->dev)) {
> +		/*
> +		 * WaTlbInvalidateStoreDataBefore
> +		 * Before pipecontrol with TLB invalidate set, need 2 store
> +		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
> +		 * Without this, hardware cannot guarantee the command after the
> +		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
> +		 */
> +		int i;
> +		ret = intel_ring_begin(ring, 4 * 2);
> +		if (ret)
> +			return ret;
> +		for (i = 0; i < 2; i++) {
> +			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> +			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
> +						MI_STORE_DWORD_INDEX_SHIFT);
> +			intel_ring_emit(ring, 0);
> +			intel_ring_emit(ring, MI_NOOP);
> +		}
> +		intel_ring_advance(ring);
> +	}

This workaround is listed for everything SNB+, so it would seem we
should just check for gen>=6.

Also I think it should be placed inside the ring .flush() functions since
we call those w/ invalidate_domains!=0 from other places as well.

> +
>  	flush_domains = 0;
>  	if (ring->gpu_caches_dirty)
>  		flush_domains = I915_GEM_GPU_DOMAINS;
> -- 
> 1.8.5.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'
  2014-01-22  3:45 ` [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard' akash.goel
@ 2014-01-22 10:54   ` Ville Syrjälä
  2014-01-22 11:11     ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2014-01-22 10:54 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> In this WA, need to add 12 MI Store Dword commands to ensure proper
> flush of h/w pipeline.
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 133d273..e8ec536 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
>  
>  	trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
>  
> +	if (IS_VALLEYVIEW(ring->dev)) {
> +		/*
> +		 * WaReadAfterWriteHazard
> +		 * Send a number of Store Data commands here to finish
> +		 * flushing hardware pipeline.This is needed in the case
> +		 * where the next workload tries reading from the same
> +		 * surface that this batch writes to. Without these StoreDWs,
> +		 * not all of the data will actually be flushd to the surface
> +		 * by the time the next batch starts reading it, possibly
> +		 * causing a small amount of corruption.
> +		 */
> +		int i;
> +		ret = intel_ring_begin(ring, 4 * 12);

BSpec says 8 is enough. Is BSpec incorrect?

This workaround is also listed for everything SNB+.

> +		if (ret)
> +			return ret;
> +		for (i = 0; i < 12; i++) {
> +			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> +			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
> +							MI_STORE_DWORD_INDEX_SHIFT);
> +			intel_ring_emit(ring, 0);
> +			intel_ring_emit(ring, MI_NOOP);
> +		}
> +		intel_ring_advance(ring);
> +	}
> +
>  	ring->gpu_caches_dirty = false;
>  	return 0;
>  }
> -- 
> 1.8.5.2
> 

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/6] drm/i915/vlv: Modified the programming of 2 regs in Ring initialisation
  2014-01-22  3:45 ` [PATCH 3/6] drm/i915/vlv: Modified the programming of 2 regs in Ring initialisation akash.goel
@ 2014-01-22 11:01   ` Ville Syrjälä
  0 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-01-22 11:01 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Jan 22, 2014 at 09:15:07AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Modified programming of following 2 regs in Render ring initialisation fn.
> 1. GFX_MODE_GEN7 (Enabling TLB invalidate)
> 2. MI_MODE (Enabling MI Flush)
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e8ec536..8b99df2 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -563,7 +563,9 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  	int ret = init_ring_common(ring);
>  
>  	if (INTEL_INFO(dev)->gen > 3)
> -		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
> +		if (!IS_VALLEYVIEW(dev))
> +			I915_WRITE(MI_MODE,
> +				_MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));

BSpec says this should be enabled for everything before IVB. So based on
that the condition should just be 'if (gen > 3 && gen < 7)'

>  
>  	/* We need to disable the AsyncFlip performance optimisations in order
>  	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
> @@ -579,10 +581,17 @@ static int init_render_ring(struct intel_ring_buffer *ring)
>  		I915_WRITE(GFX_MODE,
>  			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_ALWAYS));
>  
> -	if (IS_GEN7(dev))
> -		I915_WRITE(GFX_MODE_GEN7,
> -			   _MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_ALWAYS) |
> -			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
> +	if (IS_GEN7(dev)) {
> +		if (IS_VALLEYVIEW(dev)) {
> +			I915_WRITE(GFX_MODE_GEN7,
> +				_MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
> +			I915_WRITE(MI_MODE, I915_READ(MI_MODE) |
> +				_MASKED_BIT_ENABLE(MI_FLUSH_ENABLE));

Why do we need to enable MI_FLUSH, and why only for VLV?

> +		} else
> +			I915_WRITE(GFX_MODE_GEN7,
> +				_MASKED_BIT_DISABLE(GFX_TLB_INVALIDATE_ALWAYS) |
> +				_MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
> +	}

According to BSpec the GFX_TLB_INVALIDATE_ALWAYS bit only exists on SNB.
So it would seem we should just drop it here.

>  
>  	if (INTEL_INFO(dev)->gen >= 5) {
>  		ret = init_pipe_control(ring);
> -- 
> 1.8.5.2
> 

-- 
Ville Syrjälä
Intel OTC
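
The condition suggested above for VS_TIMER_DISPATCH can be written as a small
predicate. This is a minimal sketch, assuming the generation number is
available as a plain integer; wants_vs_timer_dispatch() is a hypothetical
helper name, and the driver would more likely open-code the test in
init_render_ring().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: enable VS_TIMER_DISPATCH only on gen4..gen6,
 * i.e. everything after gen3 and before IVB (gen7).
 */
static bool wants_vs_timer_dispatch(int gen)
{
	return gen > 3 && gen < 7;
}
```

Since the patch tests IS_VALLEYVIEW() inside the IS_GEN7() branch, VLV counts
as gen7 and falls outside this range, so a separate !IS_VALLEYVIEW() check
would not be needed with this form.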

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/6] drm/i915/vlv: Added 3 rendering specific Hw Workarounds in clock gating fn
  2014-01-22  3:45 ` [PATCH 4/6] drm/i915/vlv: Added 3 rendering specific Hw Workarounds in clock gating fn akash.goel
@ 2014-01-22 11:10   ` Ville Syrjälä
  2014-03-21 12:58     ` [PATCH 1/2] drm/i915/vlv:Implement WaDisable_RenderCache_OperationalFlush sourab.gupta
  0 siblings, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2014-01-22 11:10 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Jan 22, 2014 at 09:15:08AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Added 2 new rendering specific Workarounds
> 1. WaDisable_RenderCache_OperationalFlush
>      Operational flush cannot be enabled on
>      BWG A0 [Errata BWT006]
> 2. WaVSThreadDispatchOverride
>      Performance optimization - Hw will decide which
>      half slice the thread will dispatch, May not be
>      really needed for VLV, as its single slice
> 
> Modified the implementation of 1 workaround
> 1. WaDisableL3Bank2xClockGate
>     Disabling L3 clock gating- MMIO 940c[25] = 1

Three things in one patch -> needs to be split up.

> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h |  3 +++
>  drivers/gpu/drm/i915/intel_pm.c | 22 +++++++++++++++++++++-
>  2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a699efd..d829754 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -934,6 +934,9 @@
>  #define   ECO_GATING_CX_ONLY	(1<<3)
>  #define   ECO_FLIP_DONE		(1<<0)
>  
> +#define GEN7_CACHE_MODE_0	0x07000 /* IVB+ only */
> +#define GEN7_RC_OP_FLUSH_ENABLE (1<<0)
> +
>  #define CACHE_MODE_1		0x7004 /* IVB+ */
>  #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 469170c..4c36ff8 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4955,6 +4955,12 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  	I915_WRITE(GEN7_L3CNTLREG1, I915_READ(GEN7_L3CNTLREG1) | GEN7_L3AGDIS);
>  	I915_WRITE(GEN7_L3_CHICKEN_MODE_REGISTER, GEN7_WA_L3_CHICKEN_MODE);
>  
> +	/* WaDisable_RenderCache_OperationalFlush
> +	 * Clear bit 0, so we do a AND with the mask
> +	 * to keep other bits the same */
> +	I915_WRITE(GEN7_CACHE_MODE_0,  (I915_READ(GEN7_CACHE_MODE_0) |
> +			  _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE)));

This should be disabled for everything gen4+. We don't seem to do it for
any other platform though, and I guess the reason is that it should
already default to disabled. So I'm not sure we need this, but if we do
then I'd prefer to do it consistently for all platforms.

> +
>  	/* WaForceL3Serialization:vlv */
>  	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
>  		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
> @@ -4991,10 +4997,24 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  		   GEN6_RCPBUNIT_CLOCK_GATE_DISABLE |
>  		   GEN6_RCCUNIT_CLOCK_GATE_DISABLE);
>  
> -	I915_WRITE(GEN7_UCGCTL4, GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
> +	/* WaDisableL3Bank2xClockGate
> +	 * Disabling L3 clock gating- MMIO 940c[25] = 1
> +	 * Set bit 25, to disable L3_BANK_2x_CLK_GATING */
> +	I915_WRITE(GEN7_UCGCTL4,
> +		I915_READ(GEN7_UCGCTL4) | GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
>  
>  	I915_WRITE(MI_ARB_VLV, MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE);
>  
> +	/* WaVSThreadDispatchOverride
> +	 * Hw will decide which half slice the thread will dispatch.
> +	 * May not be needed for VLV, as its a single slice */
> +	I915_WRITE(GEN7_CACHE_MODE_0,
> +		I915_READ(GEN7_FF_THREAD_MODE) &
> +		(~GEN7_FF_VS_SCHED_LOAD_BALANCE));

I'm pretty sure my workaround series from last summer tried to clean up
this w/a for all platforms. I guess you didn't look at those patches.

> +
> +	/* WaDisable4x2SubspanOptimization,
> +	 * Disable combining of two 2x2 subspans into a 4x2 subspan
> +	 * Set chicken bit to disable subspan optimization */

This should also be another patch.

>  	I915_WRITE(CACHE_MODE_1,
>  		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
>  
> -- 
> 1.8.5.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/6] drm/i915/vlv: Removed 3 rendering specific Hw WA from clock gating fn
  2014-01-22  3:45 ` [PATCH 5/6] drm/i915/vlv: Removed 3 rendering specific Hw WA from clock gating fn akash.goel
@ 2014-01-22 11:11   ` Ville Syrjälä
  0 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-01-22 11:11 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Jan 22, 2014 at 09:15:09AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Removed 3 workarounds as not needed for VLV+(B0 onwards)
> 1. WaDisableRHWOOptimizationForRenderHang
> 2. WaDisableL3CacheAging
> 3. WaDisableDopClockGating

Again multiple patches. And I think my earlier series already touched on
some of these.

> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 10 ----------
>  1 file changed, 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 4c36ff8..e4d220c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4947,12 +4947,6 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
>  				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
>  
> -	/* Apply the WaDisableRHWOOptimizationForRenderHang:vlv workaround. */
> -	I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
> -		   GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
> -
> -	/* WaApplyL3ControlAndL3ChickenMode:vlv */
> -	I915_WRITE(GEN7_L3CNTLREG1, I915_READ(GEN7_L3CNTLREG1) | GEN7_L3AGDIS);
>  	I915_WRITE(GEN7_L3_CHICKEN_MODE_REGISTER, GEN7_WA_L3_CHICKEN_MODE);
>  
>  	/* WaDisable_RenderCache_OperationalFlush
> @@ -4965,10 +4959,6 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
>  	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
>  		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
>  
> -	/* WaDisableDopClockGating:vlv */
> -	I915_WRITE(GEN7_ROW_CHICKEN2,
> -		   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> -
>  	/* This is required by WaCatErrorRejectionIssue:vlv */
>  	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
>  		   I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
> -- 
> 1.8.5.2

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'
  2014-01-22 10:54   ` Ville Syrjälä
@ 2014-01-22 11:11     ` Chris Wilson
  2014-03-21 11:53       ` Gupta, Sourab
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2014-01-22 11:11 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: akash.goel, intel-gfx

On Wed, Jan 22, 2014 at 12:54:51PM +0200, Ville Syrjälä wrote:
> On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> > In this WA, need to add 12 MI Store Dword commands to ensure proper
> > flush of h/w pipeline.
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 133d273..e8ec536 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> >  
> >  	trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> >  
> > +	if (IS_VALLEYVIEW(ring->dev)) {
> > +		/*
> > +		 * WaReadAfterWriteHazard
> > +		 * Send a number of Store Data commands here to finish
> > +		 * flushing hardware pipeline.This is needed in the case
> > +		 * where the next workload tries reading from the same
> > +		 * surface that this batch writes to. Without these StoreDWs,
> > +		 * not all of the data will actually be flushd to the surface
> > +		 * by the time the next batch starts reading it, possibly
> > +		 * causing a small amount of corruption.
> > +		 */
> > +		int i;
> > +		ret = intel_ring_begin(ring, 4 * 12);
> 
> BSpec says 8 is enough. Is BSpec incorrect?

No, these are just figures they plucked out of the air. Last I heard
they were using 32...
 
> Also this workaround is also listed for everything SNB+.

And we already have a more effective workaround.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext'
  2014-01-22  3:45 ` [PATCH 6/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaSendDummy3dPrimitveAfterSetContext' akash.goel
@ 2014-01-22 11:18   ` Ville Syrjälä
  0 siblings, 0 replies; 25+ messages in thread
From: Ville Syrjälä @ 2014-01-22 11:18 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Wed, Jan 22, 2014 at 09:15:10AM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> This workaround is needed on VLV for the HW context feature.
> It is used after adding the mi_set_context command in ring buffer
> for Hw context switch. As per the spec
> "The software must send a pipe_control with a CS stall and a post sync
> operation and then a dummy DRAW after every MI_SET_CONTEXT and after any
> PIPELINE_SELECT that is enabling 3D mode".

This is also listed for IVB.

> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 64 +++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/i915_reg.h         |  3 ++
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 +++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index ebe0f67..62a5362 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -532,6 +532,58 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
>  	return (struct i915_hw_context *)idr_find(&file_priv->context_idr, id);
>  }
>  
> +static inline void
> +mi_set_context_dummy3d_prim_wa(struct intel_ring_buffer *ring)
> +{
> +	u32 scratch_addr;
> +	u32 flags = 0;
> +
> +	/*
> +	 * Check if we have the scratch page allocated needed
> +	 * for the Pipe Control command, otherwise don't apply
> +	 * the dummmy 3d primitive workaround & add NOOPs instead
> +	 */
> +	if (get_pipe_control_scratch_addr(ring)) {
> +		/* Actual scratch location is at 128 bytes offset */
> +		scratch_addr = get_pipe_control_scratch_addr(ring) + 128;
> +
> +		/*
> +		 * WaSendDummy3dPrimitveAfterSetContext
> +		 * Software must send a pipe_control with a CS stall
> +		 * and a post sync operation and then a dummy DRAW after
> +		 * every MI_SET_CONTEXT and after any PIPELINE_SELECT that
> +		 * is enabling 3D mode. A dummy draw is a 3DPRIMITIVE command
> +		 * with Indirect Parameter Enable set to 0, UAV Coherency
> +		 * Required set to 0, Predicate Enable set to 0,
> +		 * End Offset Enable set to 0, and Vertex Count Per Instance
> +		 * set to 0, All other parameters are a don't care.
> +		 */
> +
> +		/*
> +		 * Add a pipe control with CS Stall and postsync op
> +		 * before dummy 3D_PRIMITIVE
> +		 */
> +		flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
> +		intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4));
> +		intel_ring_emit(ring, flags);
> +		intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
> +		intel_ring_emit(ring, 0);
> +
> +		/* Add a dummy 3D_PRIMITVE */
> +		intel_ring_emit(ring, GFX_OP_3DPRIMITIVE());
> +		intel_ring_emit(ring, 4); /* PrimTopoType*/
> +		intel_ring_emit(ring, 0); /* VertexCountPerInstance */
> +		intel_ring_emit(ring, 0); /* StartVertexLocation */
> +		intel_ring_emit(ring, 0); /* InstanceCount */
> +		intel_ring_emit(ring, 0); /* StartInstanceLocation */
> +		intel_ring_emit(ring, 0); /* BaseVertexLocation  */
> +	} else {
> +		int i;
> +		for (i = 0; i < 11; i++)
> +			intel_ring_emit(ring, MI_NOOP);
> +	}
> +}
> +
>  static inline int
>  mi_set_context(struct intel_ring_buffer *ring,
>  	       struct i915_hw_context *new_context,
> @@ -550,7 +602,10 @@ mi_set_context(struct intel_ring_buffer *ring,
>  			return ret;
>  	}
>  
> -	ret = intel_ring_begin(ring, 6);
> +	if (IS_VALLEYVIEW(ring->dev))
> +		ret = intel_ring_begin(ring, 6+4+8);
> +	else
> +		ret = intel_ring_begin(ring, 6);
>  	if (ret)
>  		return ret;
>  
> @@ -571,7 +626,12 @@ mi_set_context(struct intel_ring_buffer *ring,
>  	intel_ring_emit(ring, MI_NOOP);
>  
>  	if (IS_GEN7(ring->dev))
> -		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +		if (IS_VALLEYVIEW(ring->dev)) {
> +			mi_set_context_dummy3d_prim_wa(ring);
> +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
> +			intel_ring_emit(ring, MI_NOOP);
> +		} else
> +			intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>  	else
>  		intel_ring_emit(ring, MI_NOOP);
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index d829754..649106d 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -335,6 +335,9 @@
>  #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
>  #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
>  
> +#define GFX_OP_3DPRIMITIVE()              \
> +	((0x3<<29)|(0x3<<27)|(0x3<<24)|       \
> +	 (0x0<<16)|(0x0<<10)|(0x0<<8)|(7-2))
>  
>  /*
>   * Reset registers
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8b99df2..a93b631 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -556,6 +556,15 @@ err:
>  	return ret;
>  }
>  
> +u32
> +get_pipe_control_scratch_addr(struct intel_ring_buffer *ring)
> +{
> +	if (ring->scratch.obj == NULL)
> +		return 0;
> +
> +	return ring->scratch.gtt_offset;
> +}
> +
>  static int init_render_ring(struct intel_ring_buffer *ring)
>  {
>  	struct drm_device *dev = ring->dev;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 71a73f4..2ae6029 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -257,6 +257,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
>  
>  u32 intel_ring_get_active_head(struct intel_ring_buffer *ring);
>  void intel_ring_setup_status_page(struct intel_ring_buffer *ring);
> +u32 get_pipe_control_scratch_addr(struct intel_ring_buffer *ring);
>  
>  static inline u32 intel_ring_get_tail(struct intel_ring_buffer *ring)
>  {
> -- 
> 1.8.5.2

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'
  2014-01-22 11:11     ` Chris Wilson
@ 2014-03-21 11:53       ` Gupta, Sourab
  2014-03-21 14:58         ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Gupta, Sourab @ 2014-03-21 11:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Goel, Akash, intel-gfx@lists.freedesktop.org

On Wed, 2014-01-22 at 11:11 +0000, Chris Wilson wrote:
> On Wed, Jan 22, 2014 at 12:54:51PM +0200, Ville Syrjälä wrote:
> > On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > >
> > > Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> > > In this WA, need to add 12 MI Store Dword commands to ensure proper
> > > flush of h/w pipeline.
> > >
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
> > >  1 file changed, 25 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 133d273..e8ec536 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> > >
> > >     trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> > >
> > > +   if (IS_VALLEYVIEW(ring->dev)) {
> > > +           /*
> > > +            * WaReadAfterWriteHazard
> > > +            * Send a number of Store Data commands here to finish
> > > +            * flushing hardware pipeline.This is needed in the case
> > > +            * where the next workload tries reading from the same
> > > +            * surface that this batch writes to. Without these StoreDWs,
> > > +            * not all of the data will actually be flushd to the surface
> > > +            * by the time the next batch starts reading it, possibly
> > > +            * causing a small amount of corruption.
> > > +            */
> > > +           int i;
> > > +           ret = intel_ring_begin(ring, 4 * 12);
> >
> > > BSpec says 8 is enough. Is BSpec incorrect?
> 
> No, these are just figures they plucked out of the air. Last I heard
> they were using 32...
> 
> > Also this workaround is also listed for everything SNB+.
> 
> And we already have a more effective workaround.
> -Chris

Can you please let us know whether this workaround patch is required or
not? If not, then how is this currently handled?
> 
> --
> Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] drm/i915/vlv:Implement WaDisable_RenderCache_OperationalFlush
  2014-01-22 11:10   ` Ville Syrjälä
@ 2014-03-21 12:58     ` sourab.gupta
  2014-03-21 12:58       ` [PATCH 2/2] drm/i915/vlv: Modified Implementation of WaDisableL3Bank2xClockGate sourab.gupta
  0 siblings, 1 reply; 25+ messages in thread
From: sourab.gupta @ 2014-03-21 12:58 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

In Valleyview, Operational flush cannot be enabled on
BWG A0 [Errata BWT006]

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h | 3 +++
 drivers/gpu/drm/i915/intel_pm.c | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6174fda..8ddc3d5 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -970,6 +970,9 @@ enum punit_power_well {
 #define   ECO_GATING_CX_ONLY	(1<<3)
 #define   ECO_FLIP_DONE		(1<<0)
 
+#define GEN7_CACHE_MODE_0	0x07000 /* IVB+ only */
+#define GEN7_RC_OP_FLUSH_ENABLE (1<<0)
+
 #define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
 #define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1		0x7004 /* IVB+ */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 39f3238..97ff5e5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5066,6 +5066,12 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 		   _MASKED_BIT_ENABLE(GEN7_MAX_PS_THREAD_DEP |
 				      GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
 
+	/* WaDisable_RenderCache_OperationalFlush:vlv
+	 * Clear bit 0, so we do a AND with the mask
+	 * to keep other bits the same */
+	I915_WRITE(GEN7_CACHE_MODE_0,  (I915_READ(GEN7_CACHE_MODE_0) |
+			  _MASKED_BIT_DISABLE(GEN7_RC_OP_FLUSH_ENABLE)));
+
 	/* WaForceL3Serialization:vlv */
 	I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
 		   ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] drm/i915/vlv: Modified Implementation of WaDisableL3Bank2xClockGate
  2014-03-21 12:58     ` [PATCH 1/2] drm/i915/vlv:Implement WaDisable_RenderCache_OperationalFlush sourab.gupta
@ 2014-03-21 12:58       ` sourab.gupta
  0 siblings, 0 replies; 25+ messages in thread
From: sourab.gupta @ 2014-03-21 12:58 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

For VLV, disabling L3 clock gating- MMIO 940c[25] = 1

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 97ff5e5..f91218b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5094,8 +5094,11 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(GEN6_UCGCTL2,
 		   GEN6_RCZUNIT_CLOCK_GATE_DISABLE);
 
-	/* WaDisableL3Bank2xClockGate:vlv */
-	I915_WRITE(GEN7_UCGCTL4, GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
+	/* WaDisableL3Bank2xClockGate:vlv
+	 * Disabling L3 clock gating- MMIO 940c[25] = 1
+	 * Set bit 25, to disable L3_BANK_2x_CLK_GATING */
+	I915_WRITE(GEN7_UCGCTL4,
+			I915_READ(GEN7_UCGCTL4) | GEN7_L3BANK2X_CLOCK_GATE_DISABLE);
 
 	I915_WRITE(MI_ARB_VLV, MI_ARB_DISPLAY_TRICKLE_FEED_DISABLE);
 
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'
  2014-03-21 11:53       ` Gupta, Sourab
@ 2014-03-21 14:58         ` Daniel Vetter
  2014-03-21 16:50           ` Gupta, Sourab
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2014-03-21 14:58 UTC (permalink / raw)
  To: Gupta, Sourab; +Cc: Goel, Akash, intel-gfx@lists.freedesktop.org

On Fri, Mar 21, 2014 at 11:53:40AM +0000, Gupta, Sourab wrote:
> On Wed, 2014-01-22 at 11:11 +0000, Chris Wilson wrote:
> > On Wed, Jan 22, 2014 at 12:54:51PM +0200, Ville Syrjälä wrote:
> > > On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel@intel.com wrote:
> > > > From: Akash Goel <akash.goel@intel.com>
> > > >
> > > > Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> > > > In this WA, need to add 12 MI Store Dword commands to ensure proper
> > > > flush of h/w pipeline.
> > > >
> > > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
> > > >  1 file changed, 25 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > index 133d273..e8ec536 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> > > >
> > > >     trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> > > >
> > > > +   if (IS_VALLEYVIEW(ring->dev)) {
> > > > +           /*
> > > > +            * WaReadAfterWriteHazard
> > > > +            * Send a number of Store Data commands here to finish
> > > > +            * flushing hardware pipeline.This is needed in the case
> > > > +            * where the next workload tries reading from the same
> > > > +            * surface that this batch writes to. Without these StoreDWs,
> > > > +            * not all of the data will actually be flushd to the surface
> > > > +            * by the time the next batch starts reading it, possibly
> > > > +            * causing a small amount of corruption.
> > > > +            */
> > > > +           int i;
> > > > +           ret = intel_ring_begin(ring, 4 * 12);
> > >
> > > > BSpec says 8 is enough. Is BSpec incorrect?
> > 
> > No, these are just figures they plucked out of the air. Last I heard
> > they were using 32...
> > 
> > > Also this workaround is also listed for everything SNB+.
> > 
> > And we already have a more effective workaround.
> > -Chris
> 
> Can you please let us know whether this workaround patch is required or
> not? If not, then how is this currently handled?

We emit XY_SETUP_BLT before certain blt operations to insert a
sufficiently long stall. The underlying bug this works around is that the
cache controller of the cpu falls over in certain very specific rwm
cycles. The official w/a is a full pipeline flush when switching between
reading and writing to a surface, which has a horrible perf impact on the
blitter.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'
  2014-03-21 14:58         ` Daniel Vetter
@ 2014-03-21 16:50           ` Gupta, Sourab
  0 siblings, 0 replies; 25+ messages in thread
From: Gupta, Sourab @ 2014-03-21 16:50 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Goel, Akash, intel-gfx@lists.freedesktop.org

On Fri, 2014-03-21 at 14:58 +0000, Daniel Vetter wrote:
> On Fri, Mar 21, 2014 at 11:53:40AM +0000, Gupta, Sourab wrote:
> > On Wed, 2014-01-22 at 11:11 +0000, Chris Wilson wrote:
> > > On Wed, Jan 22, 2014 at 12:54:51PM +0200, Ville Syrjälä wrote:
> > > > On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel@intel.com wrote:
> > > > > From: Akash Goel <akash.goel@intel.com>
> > > > >
> > > > > Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> > > > > In this WA, need to add 12 MI Store Dword commands to ensure proper
> > > > > flush of h/w pipeline.
> > > > >
> > > > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
> > > > >  1 file changed, 25 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > index 133d273..e8ec536 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> > > > >
> > > > >     trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> > > > >
> > > > > +   if (IS_VALLEYVIEW(ring->dev)) {
> > > > > +           /*
> > > > > +            * WaReadAfterWriteHazard
> > > > > +            * Send a number of Store Data commands here to finish
> > > > > +            * flushing hardware pipeline.This is needed in the case
> > > > > +            * where the next workload tries reading from the same
> > > > > +            * surface that this batch writes to. Without these StoreDWs,
> > > > > +            * not all of the data will actually be flushd to the surface
> > > > > +            * by the time the next batch starts reading it, possibly
> > > > > +            * causing a small amount of corruption.
> > > > > +            */
> > > > > +           int i;
> > > > > +           ret = intel_ring_begin(ring, 4 * 12);
> > > >
> > > > > BSpec says 8 is enough. Is BSpec incorrect?
> > > 
> > > No, these are just figures they plucked out of the air. Last I heard
> > > they were using 32...
> > > 
> > > > Also this workaround is also listed for everything SNB+.
> > > 
> > > And we already have a more effective workaround.
> > > -Chris
> > 
> > Can you please let us know whether this workaround patch is required or
> > not? If not, then how is this currently handled?
> 
> We emit XY_SETUP_BLT before certain blt operations to insert a
> sufficiently long stall. The underlying bug this works around is that the
> cache controller of the cpu falls over in certain very specific rwm
> cycles. The official w/a is a full pipeline flush when switching between
> reading and writing to a surface, which has a horrible perf impact on the
> blitter.
> -Daniel

Hi Daniel,
In the kernel code, we're not able to see any such operation during the
batchbuffer submission. Is the XY_SETUP_BLT emit done through the
userspace, in these specific usecases?
Does this mean that this patch is not admissible as such for this
underlying bug?

Regards,
Sourab

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24  6:49 [PATCH 0/6] Rendering Specific HW Workarounds for VLV sourab.gupta
@ 2014-03-24  6:49 ` sourab.gupta
  2014-03-24  9:32   ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: sourab.gupta @ 2014-03-24  6:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Akash Goel <akash.goel@intel.com>

Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
Store data commands.

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 87d1a2d..2812384 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
 	uint32_t flush_domains;
 	int ret;
 
+	if (IS_VALLEYVIEW(ring->dev)) {
+		/*
+		 * WaTlbInvalidateStoreDataBefore:vlv
+		 * Before pipecontrol with TLB invalidate set, need 2 store
+		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
+		 * Without this, hardware cannot guarantee the command after the
+		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
+		 */
+		int i;
+		ret = intel_ring_begin(ring, 4 * 2);
+		if (ret)
+			return ret;
+		for (i = 0; i < 2; i++) {
+			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
+						MI_STORE_DWORD_INDEX_SHIFT);
+			intel_ring_emit(ring, 0);
+			intel_ring_emit(ring, MI_NOOP);
+		}
+		intel_ring_advance(ring);
+	}
+
 	flush_domains = 0;
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24  6:49 ` [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore' sourab.gupta
@ 2014-03-24  9:32   ` Chris Wilson
  2014-03-24 11:20     ` Gupta, Sourab
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2014-03-24  9:32 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, intel-gfx

On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> Store data commands.
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 87d1a2d..2812384 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
>  	uint32_t flush_domains;
>  	int ret;
>  
> +	if (IS_VALLEYVIEW(ring->dev)) {
The ring flushes are vfuncs, so why is this here and not in a special
vlv ring flush?

> +		/*
> +		 * WaTlbInvalidateStoreDataBefore:vlv
> +		 * Before pipecontrol with TLB invalidate set, need 2 store
> +		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
> +		 * Without this, hardware cannot guarantee the command after the
> +		 * PIPE_CONTROL with TLB inv will not use the old TLB values.

Crumbs, it sounds like our i-g-t are not sensitive enough. This bug
crops up in many disguises over the years, do you have any suggestion on
how we can improve our tests?

> +		 */
> +		int i;
> +		ret = intel_ring_begin(ring, 4 * 2);

This can be rewritten to use 6 dwords.
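
A rough sketch of the 6-dword form, assuming each MI_STORE_DWORD_INDEX is a
3-dword command (header, offset, value), so two of them already make an even
dword count and need no MI_NOOP padding. struct mock_ring and mock_emit() are
hypothetical stand-ins for the ring buffer helpers, and the constant values
are written out only for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative values; treat them as assumptions, not a copy of the headers. */
#define MI_STORE_DWORD_INDEX        ((0x21u << 23) | 1u)
#define MI_STORE_DWORD_INDEX_SHIFT  2
#define I915_GEM_HWS_SCRATCH_INDEX  0x30u
#define I915_GEM_HWS_SCRATCH_ADDR \
	(I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)

/* Hypothetical stand-in for the ring: it only records emitted dwords. */
struct mock_ring {
	uint32_t cmds[16];
	size_t len;
};

static void mock_emit(struct mock_ring *ring, uint32_t dword)
{
	if (ring->len < sizeof(ring->cmds) / sizeof(ring->cmds[0]))
		ring->cmds[ring->len++] = dword;
}

/* Two back-to-back store-data commands: 2 * 3 = 6 dwords, no padding. */
static void emit_store_data_wa(struct mock_ring *ring)
{
	int i;

	for (i = 0; i < 2; i++) {
		mock_emit(ring, MI_STORE_DWORD_INDEX);
		mock_emit(ring, I915_GEM_HWS_SCRATCH_ADDR);
		mock_emit(ring, 0); /* value written to the scratch slot */
	}
}
```

In the real function the matching intel_ring_begin() would then reserve 6
dwords instead of 4 * 2.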

> +		if (ret)
> +			return ret;
> +		for (i = 0; i < 2; i++) {
> +			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> +			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
> +						MI_STORE_DWORD_INDEX_SHIFT);

This is I915_GEM_HWS_SCRATCH_ADDR

> +			intel_ring_emit(ring, 0);
> +			intel_ring_emit(ring, MI_NOOP);
> +		}
> +		intel_ring_advance(ring);
> +	}
> +
>  	flush_domains = 0;
>  	if (ring->gpu_caches_dirty)
>  		flush_domains = I915_GEM_GPU_DOMAINS;

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24  9:32   ` Chris Wilson
@ 2014-03-24 11:20     ` Gupta, Sourab
  2014-03-24 18:32       ` Ville Syrjälä
  0 siblings, 1 reply; 25+ messages in thread
From: Gupta, Sourab @ 2014-03-24 11:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Goel, Akash, intel-gfx@lists.freedesktop.org

On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > Store data commands.
> > 
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 87d1a2d..2812384 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> >  	uint32_t flush_domains;
> >  	int ret;
> >  
> > +	if (IS_VALLEYVIEW(ring->dev)) {
> The ring flushes are vfuncs, so why is this here and not in a special
> vlv ring flush?

Yes, we can as well put it in the platform specific vlv flush. Since we
apply this WA only for invalidate_all_caches function, we have to
differentiate in the vlv flush function regarding where the flush
originated from. For this we plan to check the 'invalidate_domains'
field of flush function. (This field will be non-zero in case the call
originated from invalidate_all_caches function). So, we'll have a
vlv_render_ring_flush something like this:
	if(invalidate_domains)
		apply_our_wa;
	gen7_render_ring_flush();

Does this look okay?
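
For illustration only, the control flow proposed above can be modelled in
plain C outside the driver. This is a minimal sketch, not driver code: the
mock_* names, the event log and the fixed log size are hypothetical, and the
only behaviour taken from the proposal is "apply the WA when
invalidate_domains is non-zero, then run the normal gen7 flush".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event kinds standing in for emitted ring commands. */
enum mock_event { EV_WA_STORE, EV_PIPE_CONTROL };

/* Hypothetical mock ring that records the order of emitted commands. */
struct mock_ring {
	enum mock_event log[8];
	unsigned int count;
};

static void mock_log(struct mock_ring *ring, enum mock_event ev)
{
	assert(ring->count < 8);
	ring->log[ring->count++] = ev;
}

/* Stand-in for the existing gen7 flush: modelled as one PIPE_CONTROL. */
static int mock_gen7_render_ring_flush(struct mock_ring *ring,
				       uint32_t invalidate_domains,
				       uint32_t flush_domains)
{
	(void)invalidate_domains;
	(void)flush_domains;
	mock_log(ring, EV_PIPE_CONTROL);
	return 0;
}

/* Proposed wrapper: two dummy stores first, only when invalidating. */
static int mock_vlv_render_ring_flush(struct mock_ring *ring,
				      uint32_t invalidate_domains,
				      uint32_t flush_domains)
{
	if (invalidate_domains) {
		mock_log(ring, EV_WA_STORE);
		mock_log(ring, EV_WA_STORE);
	}
	return mock_gen7_render_ring_flush(ring, invalidate_domains,
					   flush_domains);
}
```

With a non-zero invalidate_domains the log holds two stores followed by the
flush; with zero it holds only the flush.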

Regards,
Sourab

> 
> > +		/*
> > +		 * WaTlbInvalidateStoreDataBefore:vlv
> > +		 * Before pipecontrol with TLB invalidate set, need 2 store
> > +		 * data commands (such as MI_STORE_DATA_IMM or MI_STORE_DATA_INDEX)
> > +		 * Without this, hardware cannot guarantee the command after the
> > +		 * PIPE_CONTROL with TLB inv will not use the old TLB values.
> 
> Crumbs, it sounds like our i-g-t are not sensitive enough. This bug
> crops up in many disguises over the years, do you have any suggestion on
> how we can improve our tests?
> 
We'll think of how to capture the scenario in the i-g-t testcases and
come back with suggestions.

> > +		 */
> > +		int i;
> > +		ret = intel_ring_begin(ring, 4 * 2);
> 
> This can be rewritten to use 6 dwords.
> 
Agreed. We'll have this in our next version
> > +		if (ret)
> > +			return ret;
> > +		for (i = 0; i < 2; i++) {
> > +			intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
> > +			intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX <<
> > +						MI_STORE_DWORD_INDEX_SHIFT);
> 
> This is I915_GEM_HWS_SCRATCH_ADDR

Agreed. We'll have this in our next version
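
As a rough sketch of what the two review remarks amount to, the emission
could shrink to two 3-dword store commands with no MI_NOOP padding, using
the combined address macro. Everything below is a standalone mock rather
than the driver: struct mock_ring, mock_ring_begin() and mock_ring_emit()
are hypothetical stand-ins for the intel_ring_* helpers, and the constant
values are assumptions meant to mirror the i915 headers (in particular,
I915_GEM_HWS_SCRATCH_ADDR is assumed to expand to the index shifted by
MI_STORE_DWORD_INDEX_SHIFT). Six is still an even dword count, which is
presumably why no padding is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the authoritative definitions live in the i915 headers. */
#define MI_INSTR(opcode, flags)     (((uint32_t)(opcode) << 23) | (flags))
#define MI_STORE_DWORD_INDEX        MI_INSTR(0x21, 1)
#define MI_STORE_DWORD_INDEX_SHIFT  2
#define I915_GEM_HWS_SCRATCH_INDEX  0x30
#define I915_GEM_HWS_SCRATCH_ADDR \
	(I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)

#define MOCK_RING_DWORDS 64

/* Hypothetical mock ring: a dword array plus a reservation counter. */
struct mock_ring {
	uint32_t buf[MOCK_RING_DWORDS];
	unsigned int tail;     /* next dword to write */
	unsigned int reserved; /* dwords left from the last begin */
};

/* Simplified: fails instead of waiting for space. */
static int mock_ring_begin(struct mock_ring *ring, unsigned int num_dwords)
{
	if (ring->tail + num_dwords > MOCK_RING_DWORDS)
		return -1;
	ring->reserved = num_dwords;
	return 0;
}

static void mock_ring_emit(struct mock_ring *ring, uint32_t data)
{
	assert(ring->reserved > 0);
	ring->buf[ring->tail++] = data;
	ring->reserved--;
}

/* Two store commands of three dwords each: header, offset, data. */
static int mock_emit_two_scratch_stores(struct mock_ring *ring)
{
	int i, ret;

	ret = mock_ring_begin(ring, 3 * 2);
	if (ret)
		return ret;
	for (i = 0; i < 2; i++) {
		mock_ring_emit(ring, MI_STORE_DWORD_INDEX);
		mock_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR);
		mock_ring_emit(ring, 0);
	}
	return 0;
}
```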

> 
> > +			intel_ring_emit(ring, 0);
> > +			intel_ring_emit(ring, MI_NOOP);
> > +		}
> > +		intel_ring_advance(ring);
> > +	}
> > +
> >  	flush_domains = 0;
> >  	if (ring->gpu_caches_dirty)
> >  		flush_domains = I915_GEM_GPU_DOMAINS;
> 

* Re: [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24 11:20     ` Gupta, Sourab
@ 2014-03-24 18:32       ` Ville Syrjälä
  2014-03-24 18:47         ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2014-03-24 18:32 UTC (permalink / raw)
  To: Gupta, Sourab; +Cc: intel-gfx@lists.freedesktop.org, Goel, Akash

On Mon, Mar 24, 2014 at 11:20:40AM +0000, Gupta, Sourab wrote:
> On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> > On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > Store data commands.
> > > 
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> > >  1 file changed, 22 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index 87d1a2d..2812384 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> > >  	uint32_t flush_domains;
> > >  	int ret;
> > >  
> > > +	if (IS_VALLEYVIEW(ring->dev)) {
> > The ring flushes are vfuncs, so why is this here and not in a special
> > vlv ring flush?
> 
> Yes, we can as well put it in the platform specific vlv flush. Since we
> apply this WA only for invalidate_all_caches function, we have to
> differentiate in the vlv flush function regarding where the flush
> originated from. For this we plan to check the 'invalidate_domains'
> field of flush function. (This field will be non-zero in case the call
> originated from invalidate_all_caches function). So, we'll have a
> vlv_render_ring_flush something like this:
> 	if(invalidate_domains)
> 		apply_our_wa;
> 	gen7_render_ring_flush();
> 
> Does this look okay?

Since we supposedly need this for all gen6/gen7, I'd just add a new func
(eg. gen6_tlb_invalidate_wa()) and call that from gen6_render_ring_flush(),
gen7_render_ring_flush(), gen6_bsd_ring_flush() and gen6_ring_flush().
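
A minimal standalone sketch of that shape, assuming one shared helper that
each per-ring flush vfunc calls before emitting its own flush command. The
mock_* names and counters are hypothetical, only two of the four flush
paths are modelled, and calling the helper only when invalidation is
requested is an assumption of this sketch, not something settled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_ring;
typedef int (*mock_flush_fn)(struct mock_ring *ring,
			     uint32_t invalidate, uint32_t flush);

/* Hypothetical mock ring: counters stand in for emitted commands. */
struct mock_ring {
	mock_flush_fn flush;     /* per-ring flush vfunc */
	unsigned int wa_stores;  /* dummy stores emitted by the WA */
	unsigned int flush_cmds; /* flush commands emitted */
};

/* Shared helper: stands in for the two store-data commands. */
static int mock_tlb_invalidate_wa(struct mock_ring *ring)
{
	ring->wa_stores += 2;
	return 0;
}

/* Render path; the real one would emit a PIPE_CONTROL. */
static int mock_render_ring_flush(struct mock_ring *ring,
				  uint32_t invalidate, uint32_t flush)
{
	int ret;

	(void)flush;
	if (invalidate) {
		ret = mock_tlb_invalidate_wa(ring);
		if (ret)
			return ret;
	}
	ring->flush_cmds++;
	return 0;
}

/* BSD path; the real one would emit an MI_FLUSH_DW. */
static int mock_bsd_ring_flush(struct mock_ring *ring,
			       uint32_t invalidate, uint32_t flush)
{
	int ret;

	(void)flush;
	if (invalidate) {
		ret = mock_tlb_invalidate_wa(ring);
		if (ret)
			return ret;
	}
	ring->flush_cmds++;
	return 0;
}

/* Callers go through the vfunc and never see the workaround. */
static int mock_ring_flush(struct mock_ring *ring,
			   uint32_t invalidate, uint32_t flush)
{
	assert(ring->flush != NULL);
	return ring->flush(ring, invalidate, flush);
}
```

The point of the design is that the workaround lives in one place while
every ring's flush path picks it up, with no platform check in the callers.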

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24 18:32       ` Ville Syrjälä
@ 2014-03-24 18:47         ` Chris Wilson
  2014-03-25  5:17           ` Gupta, Sourab
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2014-03-24 18:47 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Goel, Akash, Gupta, Sourab, intel-gfx@lists.freedesktop.org

On Mon, Mar 24, 2014 at 08:32:30PM +0200, Ville Syrjälä wrote:
> On Mon, Mar 24, 2014 at 11:20:40AM +0000, Gupta, Sourab wrote:
> > On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> > > On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > > > From: Akash Goel <akash.goel@intel.com>
> > > > 
> > > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > > Store data commands.
> > > > 
> > > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> > > >  1 file changed, 22 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > index 87d1a2d..2812384 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> > > >  	uint32_t flush_domains;
> > > >  	int ret;
> > > >  
> > > > +	if (IS_VALLEYVIEW(ring->dev)) {
> > > The ring flushes are vfuncs, so why is this here and not in a special
> > > vlv ring flush?
> > 
> > Yes, we can as well put it in the platform specific vlv flush. Since we
> > apply this WA only for invalidate_all_caches function, we have to
> > differentiate in the vlv flush function regarding where the flush
> > originated from. For this we plan to check the 'invalidate_domains'
> > field of flush function. (This field will be non-zero in case the call
> > originated from invalidate_all_caches function). So, we'll have a
> > vlv_render_ring_flush something like this:
> > 	if(invalidate_domains)
> > 		apply_our_wa;
> > 	gen7_render_ring_flush();
> > 
> > Does this look okay?
> 
> Since we supposedly need this for all gen6/gen7, I'd just add a new func
> (eg. gen6_tlb_invalidate_wa()) and call that from gen6_render_ring_flush(),
> gen7_render_ring_flush(), gen6_bsd_ring_flush() and gen6_ring_flush().

Now, I am extremely curious as to what the exact bug symptoms are. We
seem to have an absence of bug reports since SNB regarding random
corruption.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 1/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'
  2014-03-24 18:47         ` Chris Wilson
@ 2014-03-25  5:17           ` Gupta, Sourab
  0 siblings, 0 replies; 25+ messages in thread
From: Gupta, Sourab @ 2014-03-25  5:17 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Goel, Akash, intel-gfx@lists.freedesktop.org

On Mon, 2014-03-24 at 18:47 +0000, Chris Wilson wrote:
> On Mon, Mar 24, 2014 at 08:32:30PM +0200, Ville Syrjälä wrote:
> > On Mon, Mar 24, 2014 at 11:20:40AM +0000, Gupta, Sourab wrote:
> > > On Mon, 2014-03-24 at 09:32 +0000, Chris Wilson wrote:
> > > > On Mon, Mar 24, 2014 at 12:19:19PM +0530, sourab.gupta@intel.com wrote:
> > > > > From: Akash Goel <akash.goel@intel.com>
> > > > > 
> > > > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > > > Store data commands.
> > > > > 
> > > > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++++++
> > > > >  1 file changed, 22 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > index 87d1a2d..2812384 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > @@ -2207,6 +2207,28 @@ intel_ring_invalidate_all_caches(struct intel_ring_buffer *ring)
> > > > >  	uint32_t flush_domains;
> > > > >  	int ret;
> > > > >  
> > > > > +	if (IS_VALLEYVIEW(ring->dev)) {
> > > > The ring flushes are vfuncs, so why is this here and not in a special
> > > > vlv ring flush?
> > > 
> > > Yes, we can as well put it in the platform specific vlv flush. Since we
> > > apply this WA only for invalidate_all_caches function, we have to
> > > differentiate in the vlv flush function regarding where the flush
> > > originated from. For this we plan to check the 'invalidate_domains'
> > > field of flush function. (This field will be non-zero in case the call
> > > originated from invalidate_all_caches function). So, we'll have a
> > > vlv_render_ring_flush something like this:
> > > 	if(invalidate_domains)
> > > 		apply_our_wa;
> > > 	gen7_render_ring_flush();
> > > 
> > > Does this look okay?
> > 
> > Since we supposedly need this for all gen6/gen7, I'd just add a new func
> > (eg. gen6_tlb_invalidate_wa()) and call that from gen6_render_ring_flush(),
> > gen7_render_ring_flush(), gen6_bsd_ring_flush() and gen6_ring_flush().
> 
> Now, I am extremely curious as to what the exact bug symptoms are. We
> seem to have an absence of bug reports since SNB regarding random
> corruption.
> -Chris
> 
Hi Chris,
We applied this WA preemptively, as it was on the list of recommended
WAs applicable to this platform. So we don't have any specific bug
symptoms that this particular WA is known to prevent.
In general, we have been very cautious with WAs, as we have seen hangs
(which can be very difficult to debug in the wild) when we miss some of
the recommended WAs.
Regards,
Sourab
