public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 1/4] drm/i915: Restore default render context after hw state reset
@ 2014-10-07 14:21 Mika Kuoppala
  2014-10-07 14:21 ` [PATCH 2/4] drm/i915: Reinitialize default context after resume Mika Kuoppala
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mika Kuoppala @ 2014-10-07 14:21 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, miku

After reset or suspend, the hardware render context state
has been cleared to default values. If this is a first
switch after such event, we need to restore the state
to get back to pre reset state.

As the render context state contains the wa
registers, on bdw, this also effectively restores the
workarounds past reset/suspend. With older gens,
explicit reinitialization of some registers by driver is
still required.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Brad Volkin <bradley.d.volkin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a5221d8..e7fefd7 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -577,7 +577,8 @@ static int do_switch(struct intel_engine_cs *ring,
 		vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level, GLOBAL_BIND);
 	}
 
-	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
+	if (!to->legacy_hw_ctx.initialized ||
+	    (from != NULL && i915_gem_context_is_default(to)))
 		hw_flags |= MI_RESTORE_INHIBIT;
 
 	ret = mi_set_context(ring, to, hw_flags);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] drm/i915: Reinitialize default context after resume
  2014-10-07 14:21 [PATCH 1/4] drm/i915: Restore default render context after hw state reset Mika Kuoppala
@ 2014-10-07 14:21 ` Mika Kuoppala
  2014-10-07 15:06   ` Chris Wilson
  2014-10-07 14:21 ` [PATCH 3/4] drm/i915: Build workaround list in ring initialization Mika Kuoppala
  2014-10-07 14:21 ` [PATCH 4/4] drm/i915: Check workaround status on dfs read time Mika Kuoppala
  2 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2014-10-07 14:21 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

We have lost render context state on suspend. This is
identical how we lose the state on reset. So mark the
context as reset so that we restore from pre suspend
state.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a05a1d0..99df309 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -608,6 +608,7 @@ static int i915_drm_freeze(struct drm_device *dev)
 	}
 
 	i915_gem_suspend_gtt_mappings(dev);
+	i915_gem_context_reset(dev);
 
 	i915_save_state(dev);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] drm/i915: Build workaround list in ring initialization
  2014-10-07 14:21 [PATCH 1/4] drm/i915: Restore default render context after hw state reset Mika Kuoppala
  2014-10-07 14:21 ` [PATCH 2/4] drm/i915: Reinitialize default context after resume Mika Kuoppala
@ 2014-10-07 14:21 ` Mika Kuoppala
  2014-10-13 15:21   ` Siluvery, Arun
                     ` (2 more replies)
  2014-10-07 14:21 ` [PATCH 4/4] drm/i915: Check workaround status on dfs read time Mika Kuoppala
  2 siblings, 3 replies; 10+ messages in thread
From: Mika Kuoppala @ 2014-10-07 14:21 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

If we build the workaround list in ring initialization
and decouple it from the actual writing of values, we
gain the ability to decide where and how we want to apply
the values.

The advantage of this will become more clear when
we need to initialize workarounds on older gens where
it is not possible to write all the registers through ring
LRIs.

v2: rebase on newest bdw workarounds

Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  20 ++--
 drivers/gpu/drm/i915/i915_drv.h         |  28 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 185 ++++++++++++++++++--------------
 3 files changed, 130 insertions(+), 103 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index da4036d..87482f8 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2655,18 +2655,20 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
-	for (i = 0; i < dev_priv->num_wa_regs; ++i) {
+	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
+
+	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
+	for (i = 0; i < dev_priv->workarounds.count; ++i) {
 		u32 addr, mask;
 
-		addr = dev_priv->intel_wa_regs[i].addr;
-		mask = dev_priv->intel_wa_regs[i].mask;
-		dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
-		if (dev_priv->intel_wa_regs[i].addr)
+		addr = dev_priv->workarounds.reg[i].addr;
+		mask = dev_priv->workarounds.reg[i].mask;
+		dev_priv->workarounds.reg[i].value = I915_READ(addr) | mask;
+		if (dev_priv->workarounds.reg[i].addr)
 			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
-				   dev_priv->intel_wa_regs[i].addr,
-				   dev_priv->intel_wa_regs[i].value,
-				   dev_priv->intel_wa_regs[i].mask);
+				   dev_priv->workarounds.reg[i].addr,
+				   dev_priv->workarounds.reg[i].value,
+				   dev_priv->workarounds.reg[i].mask);
 	}
 
 	intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1e476b5..f7265bf 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1448,6 +1448,20 @@ struct i915_frontbuffer_tracking {
 	unsigned flip_bits;
 };
 
+struct i915_wa_reg {
+	u32 addr;
+	u32 value;
+	/* bitmask representing WA bits */
+	u32 mask;
+};
+
+#define I915_MAX_WA_REGS 16
+
+struct i915_workarounds {
+	struct i915_wa_reg reg[I915_MAX_WA_REGS];
+	u32 count;
+};
+
 struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1590,19 +1604,7 @@ struct drm_i915_private {
 	struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
 	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
 
-	/*
-	 * workarounds are currently applied at different places and
-	 * changes are being done to consolidate them so exact count is
-	 * not clear at this point, use a max value for now.
-	 */
-#define I915_MAX_WA_REGS  16
-	struct {
-		u32 addr;
-		u32 value;
-		/* bitmask representing WA bits */
-		u32 mask;
-	} intel_wa_regs[I915_MAX_WA_REGS];
-	u32 num_wa_regs;
+	struct i915_workarounds workarounds;
 
 	/* Reclocking support */
 	bool render_reclock_avail;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 816a692..12a546f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -665,80 +665,107 @@ err:
 	return ret;
 }
 
-static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
-				       u32 addr, u32 value)
+static int intel_ring_workarounds_emit(struct intel_engine_cs *ring)
 {
+	int ret, i;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_workarounds *w = &dev_priv->workarounds;
 
-	if (WARN_ON(dev_priv->num_wa_regs >= I915_MAX_WA_REGS))
-		return;
+	if (WARN_ON(w->count == 0))
+		return 0;
 
-	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-	intel_ring_emit(ring, addr);
-	intel_ring_emit(ring, value);
+	ring->gpu_caches_dirty = true;
+	ret = intel_ring_flush_all_caches(ring);
+	if (ret)
+		return ret;
 
-	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
-	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = value & 0xFFFF;
-	/* value is updated with the status of remaining bits of this
-	 * register when it is read from debugfs file
-	 */
-	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
-	dev_priv->num_wa_regs++;
+	ret = intel_ring_begin(ring, w->count * 3);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < w->count; i++) {
+		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+		intel_ring_emit(ring, w->reg[i].addr);
+		intel_ring_emit(ring, w->reg[i].value);
+	}
+
+	intel_ring_advance(ring);
+
+	ring->gpu_caches_dirty = true;
+	ret = intel_ring_flush_all_caches(ring);
+	if (ret)
+		return ret;
+
+	DRM_DEBUG_DRIVER("Number of Workarounds emitted: %d\n", w->count);
 
-	return;
+	return 0;
+}
+
+static int wa_add(struct drm_i915_private *dev_priv,
+		  const u32 addr, const u32 val, const u32 mask)
+{
+	const u32 idx = dev_priv->workarounds.count;
+
+	if (WARN_ON(idx >= I915_MAX_WA_REGS))
+		return -ENOSPC;
+
+	dev_priv->workarounds.reg[idx].addr = addr;
+	dev_priv->workarounds.reg[idx].value = val;
+	dev_priv->workarounds.reg[idx].mask = mask;
+
+	dev_priv->workarounds.count++;
+
+	return 0;
 }
 
+#define WA_REG(addr, val, mask) { \
+		const int r = wa_add(dev_priv, (addr), (val), (mask)); \
+		if (r) \
+			return r; \
+	}
+
+#define WA_SET_BIT_MASKED(addr, mask) WA_REG(addr, \
+				    _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
+
+#define WA_CLR_BIT_MASKED(addr, mask) WA_REG(addr, \
+				    _MASKED_BIT_DISABLE(mask), (mask) & 0xffff)
+
+#define WA_SET_BIT(addr, mask) WA_REG(addr, I915_READ(addr) | (mask), mask)
+#define WA_CLR_BIT(addr, mask) WA_REG(addr, I915_READ(addr) & ~(mask), mask)
+
+#define WA_WRITE(addr, val) WA_REG(addr, val, 0xffffffff)
+
 static int bdw_init_workarounds(struct intel_engine_cs *ring)
 {
-	int ret;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	/*
-	 * workarounds applied in this fn are part of register state context,
-	 * they need to be re-initialized followed by gpu reset, suspend/resume,
-	 * module reload.
-	 */
-	dev_priv->num_wa_regs = 0;
-	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
-
-	/*
-	 * update the number of dwords required based on the
-	 * actual number of workarounds applied
-	 */
-	ret = intel_ring_begin(ring, 18);
-	if (ret)
-		return ret;
-
 	/* WaDisablePartialInstShootdown:bdw */
 	/* WaDisableThreadStallDopClockGating:bdw */
-	/* FIXME: Unclear whether we really need this on production bdw. */
-	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
-			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
-					     | STALL_DOP_GATING_DISABLE));
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE |
+		  STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableDopClockGating:bdw May not be needed for production */
-	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
-			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+		  DOP_CLOCK_GATING_DISABLE);
 
-	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
-			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+			  GEN8_SAMPLER_POWER_BYPASS_DIS);
 
 	/* Use Force Non-Coherent whenever executing a 3D context. This is a
 	 * workaround for for a possible hang in the unlikely event a TLB
 	 * invalidation occurs during a PSD flush.
 	 */
 	/* WaDisableFenceDestinationToSLM:bdw (GT3 pre-production) */
-	intel_ring_emit_wa(ring, HDC_CHICKEN0,
-			   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT |
-					      (IS_BDW_GT3(dev) ?
-					       HDC_FENCE_DEST_SLM_DISABLE : 0)
-				   ));
+	WA_SET_BIT_MASKED(HDC_CHICKEN0,
+			  HDC_FORCE_NON_COHERENT |
+			  (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
 
 	/* Wa4x4STCOptimizationDisable:bdw */
-	intel_ring_emit_wa(ring, CACHE_MODE_1,
-			   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
+	WA_SET_BIT_MASKED(CACHE_MODE_1,
+		  GEN8_4x4_STC_OPTIMIZATION_DISABLE);
 
 	/*
 	 * BSpec recommends 8x4 when MSAA is used,
@@ -748,52 +775,50 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring)
 	 * disable bit, which we don't touch here, but it's good
 	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 	 */
-	intel_ring_emit_wa(ring, GEN7_GT_MODE,
-			   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
-
-	intel_ring_advance(ring);
-
-	DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
-			 dev_priv->num_wa_regs);
+	WA_SET_BIT_MASKED(GEN7_GT_MODE,
+		  GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
 
 	return 0;
 }
 
 static int chv_init_workarounds(struct intel_engine_cs *ring)
 {
-	int ret;
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	/*
-	 * workarounds applied in this fn are part of register state context,
-	 * they need to be re-initialized followed by gpu reset, suspend/resume,
-	 * module reload.
-	 */
-	dev_priv->num_wa_regs = 0;
-	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
-
-	ret = intel_ring_begin(ring, 12);
-	if (ret)
-		return ret;
-
 	/* WaDisablePartialInstShootdown:chv */
-	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
-			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
 	/* WaDisableThreadStallDopClockGating:chv */
-	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
-			   _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
+	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+		  STALL_DOP_GATING_DISABLE);
 
 	/* WaDisableDopClockGating:chv (pre-production hw) */
-	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
-			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+		  DOP_CLOCK_GATING_DISABLE);
 
 	/* WaDisableSamplerPowerBypass:chv (pre-production hw) */
-	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
-			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
+	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+		  GEN8_SAMPLER_POWER_BYPASS_DIS);
 
-	intel_ring_advance(ring);
+	return 0;
+}
+
+static int init_workarounds_ring(struct intel_engine_cs *ring)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	WARN_ON(ring->id != RCS);
+
+	dev_priv->workarounds.count = 0;
+
+	if (IS_BROADWELL(dev))
+		return bdw_init_workarounds(ring);
+
+	if (IS_CHERRYVIEW(dev))
+		return chv_init_workarounds(ring);
 
 	return 0;
 }
@@ -853,7 +878,7 @@ static int init_render_ring(struct intel_engine_cs *ring)
 	if (HAS_L3_DPF(dev))
 		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
 
-	return ret;
+	return init_workarounds_ring(ring);
 }
 
 static void render_ring_cleanup(struct intel_engine_cs *ring)
@@ -2299,10 +2324,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 					dev_priv->semaphore_obj = obj;
 			}
 		}
-		if (IS_CHERRYVIEW(dev))
-			ring->init_context = chv_init_workarounds;
-		else
-			ring->init_context = bdw_init_workarounds;
+
+		ring->init_context = intel_ring_workarounds_emit;
 		ring->add_request = gen6_add_request;
 		ring->flush = gen8_render_ring_flush;
 		ring->irq_get = gen8_ring_get_irq;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] drm/i915: Check workaround status on dfs read time
  2014-10-07 14:21 [PATCH 1/4] drm/i915: Restore default render context after hw state reset Mika Kuoppala
  2014-10-07 14:21 ` [PATCH 2/4] drm/i915: Reinitialize default context after resume Mika Kuoppala
  2014-10-07 14:21 ` [PATCH 3/4] drm/i915: Build workaround list in ring initialization Mika Kuoppala
@ 2014-10-07 14:21 ` Mika Kuoppala
  2014-10-13 15:22   ` Siluvery, Arun
  2 siblings, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2014-10-07 14:21 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

As the workaround list has the value as initialization time
constant, we can do the simple checking on the go without
negleting igt.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 87482f8..dbd5dc5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2659,16 +2659,16 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
 	for (i = 0; i < dev_priv->workarounds.count; ++i) {
-		u32 addr, mask;
+		u32 addr, mask, value, read;
+		bool ok;
 
 		addr = dev_priv->workarounds.reg[i].addr;
 		mask = dev_priv->workarounds.reg[i].mask;
-		dev_priv->workarounds.reg[i].value = I915_READ(addr) | mask;
-		if (dev_priv->workarounds.reg[i].addr)
-			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
-				   dev_priv->workarounds.reg[i].addr,
-				   dev_priv->workarounds.reg[i].value,
-				   dev_priv->workarounds.reg[i].mask);
+		value = dev_priv->workarounds.reg[i].value;
+		read = I915_READ(addr);
+		ok = (value & mask) == (read & mask);
+		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
+			   addr, value, mask, read, ok ? "OK" : "FAIL");
 	}
 
 	intel_runtime_pm_put(dev_priv);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/4] drm/i915: Reinitialize default context after resume
  2014-10-07 14:21 ` [PATCH 2/4] drm/i915: Reinitialize default context after resume Mika Kuoppala
@ 2014-10-07 15:06   ` Chris Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2014-10-07 15:06 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx, miku

On Tue, Oct 07, 2014 at 05:21:25PM +0300, Mika Kuoppala wrote:
> We have lost render context state on suspend. This is
> identical how we lose the state on reset. So mark the
> context as reset so that we restore from pre suspend
> state.

You are on the right track. With a little more effort you can find the
solution that eliminates these first two patches by making the
init/enable sequence of contexts consistent with the rest of the driver
(and so suspend/resume etc).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] drm/i915: Build workaround list in ring initialization
  2014-10-07 14:21 ` [PATCH 3/4] drm/i915: Build workaround list in ring initialization Mika Kuoppala
@ 2014-10-13 15:21   ` Siluvery, Arun
  2014-10-20 16:13   ` Siluvery, Arun
  2014-10-21 15:54   ` Mika Kuoppala
  2 siblings, 0 replies; 10+ messages in thread
From: Siluvery, Arun @ 2014-10-13 15:21 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: miku

On 07/10/2014 15:21, Mika Kuoppala wrote:
> If we build the workaround list in ring initialization
> and decouple it from the actual writing of values, we
> gain the ability to decide where and how we want to apply
> the values.
>
> The advantage of this will become more clear when
> we need to initialize workarounds on older gens where
> it is not possible to write all the registers through ring
> LRIs.
>
> v2: rebase on newest bdw workarounds
>
> Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c     |  20 ++--
>   drivers/gpu/drm/i915/i915_drv.h         |  28 ++---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 185 ++++++++++++++++++--------------
>   3 files changed, 130 insertions(+), 103 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index da4036d..87482f8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2655,18 +2655,20 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>
>   	intel_runtime_pm_get(dev_priv);
>
> -	seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
> -	for (i = 0; i < dev_priv->num_wa_regs; ++i) {
> +	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> +
> +	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
> +	for (i = 0; i < dev_priv->workarounds.count; ++i) {
>   		u32 addr, mask;
>
> -		addr = dev_priv->intel_wa_regs[i].addr;
> -		mask = dev_priv->intel_wa_regs[i].mask;
> -		dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
> -		if (dev_priv->intel_wa_regs[i].addr)
> +		addr = dev_priv->workarounds.reg[i].addr;
> +		mask = dev_priv->workarounds.reg[i].mask;
> +		dev_priv->workarounds.reg[i].value = I915_READ(addr) | mask;
> +		if (dev_priv->workarounds.reg[i].addr)
>   			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> -				   dev_priv->intel_wa_regs[i].addr,
> -				   dev_priv->intel_wa_regs[i].value,
> -				   dev_priv->intel_wa_regs[i].mask);
> +				   dev_priv->workarounds.reg[i].addr,
> +				   dev_priv->workarounds.reg[i].value,
> +				   dev_priv->workarounds.reg[i].mask);
>   	}
>
>   	intel_runtime_pm_put(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1e476b5..f7265bf 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1448,6 +1448,20 @@ struct i915_frontbuffer_tracking {
>   	unsigned flip_bits;
>   };
>
> +struct i915_wa_reg {
> +	u32 addr;
> +	u32 value;
> +	/* bitmask representing WA bits */
> +	u32 mask;
> +};
> +
> +#define I915_MAX_WA_REGS 16
> +
> +struct i915_workarounds {
> +	struct i915_wa_reg reg[I915_MAX_WA_REGS];
> +	u32 count;
> +};
> +
>   struct drm_i915_private {
>   	struct drm_device *dev;
>   	struct kmem_cache *slab;
> @@ -1590,19 +1604,7 @@ struct drm_i915_private {
>   	struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
>   	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
>
> -	/*
> -	 * workarounds are currently applied at different places and
> -	 * changes are being done to consolidate them so exact count is
> -	 * not clear at this point, use a max value for now.
> -	 */
> -#define I915_MAX_WA_REGS  16
> -	struct {
> -		u32 addr;
> -		u32 value;
> -		/* bitmask representing WA bits */
> -		u32 mask;
> -	} intel_wa_regs[I915_MAX_WA_REGS];
> -	u32 num_wa_regs;
> +	struct i915_workarounds workarounds;
>
>   	/* Reclocking support */
>   	bool render_reclock_avail;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 816a692..12a546f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -665,80 +665,107 @@ err:
>   	return ret;
>   }
>
> -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
> -				       u32 addr, u32 value)
> +static int intel_ring_workarounds_emit(struct intel_engine_cs *ring)
>   {
> +	int ret, i;
>   	struct drm_device *dev = ring->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_workarounds *w = &dev_priv->workarounds;
>
> -	if (WARN_ON(dev_priv->num_wa_regs >= I915_MAX_WA_REGS))
> -		return;
> +	if (WARN_ON(w->count == 0))
> +		return 0;
>
> -	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> -	intel_ring_emit(ring, addr);
> -	intel_ring_emit(ring, value);
> +	ring->gpu_caches_dirty = true;
> +	ret = intel_ring_flush_all_caches(ring);
> +	if (ret)
> +		return ret;
>
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = value & 0xFFFF;
> -	/* value is updated with the status of remaining bits of this
> -	 * register when it is read from debugfs file
> -	 */
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
> -	dev_priv->num_wa_regs++;
> +	ret = intel_ring_begin(ring, w->count * 3);
> +	if (ret)
> +		return ret;
> +
> +	for (i = 0; i < w->count; i++) {
> +		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +		intel_ring_emit(ring, w->reg[i].addr);
> +		intel_ring_emit(ring, w->reg[i].value);
> +	}
> +
> +	intel_ring_advance(ring);
> +
> +	ring->gpu_caches_dirty = true;
> +	ret = intel_ring_flush_all_caches(ring);
> +	if (ret)
> +		return ret;
> +
> +	DRM_DEBUG_DRIVER("Number of Workarounds emitted: %d\n", w->count);
>
> -	return;
> +	return 0;
> +}
> +
> +static int wa_add(struct drm_i915_private *dev_priv,
> +		  const u32 addr, const u32 val, const u32 mask)
> +{
> +	const u32 idx = dev_priv->workarounds.count;
> +
> +	if (WARN_ON(idx >= I915_MAX_WA_REGS))
> +		return -ENOSPC;
> +
> +	dev_priv->workarounds.reg[idx].addr = addr;
> +	dev_priv->workarounds.reg[idx].value = val;
> +	dev_priv->workarounds.reg[idx].mask = mask;
> +
> +	dev_priv->workarounds.count++;
> +
> +	return 0;
>   }
>
> +#define WA_REG(addr, val, mask) { \
> +		const int r = wa_add(dev_priv, (addr), (val), (mask)); \
> +		if (r) \
> +			return r; \
> +	}
> +
> +#define WA_SET_BIT_MASKED(addr, mask) WA_REG(addr, \
> +				    _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
> +
> +#define WA_CLR_BIT_MASKED(addr, mask) WA_REG(addr, \
> +				    _MASKED_BIT_DISABLE(mask), (mask) & 0xffff)
> +
> +#define WA_SET_BIT(addr, mask) WA_REG(addr, I915_READ(addr) | (mask), mask)
> +#define WA_CLR_BIT(addr, mask) WA_REG(addr, I915_READ(addr) & ~(mask), mask)
> +
> +#define WA_WRITE(addr, val) WA_REG(addr, val, 0xffffffff)
> +
>   static int bdw_init_workarounds(struct intel_engine_cs *ring)
>   {
> -	int ret;
>   	struct drm_device *dev = ring->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
>
> -	/*
> -	 * workarounds applied in this fn are part of register state context,
> -	 * they need to be re-initialized followed by gpu reset, suspend/resume,
> -	 * module reload.
> -	 */
> -	dev_priv->num_wa_regs = 0;
> -	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> -	/*
> -	 * update the number of dwords required based on the
> -	 * actual number of workarounds applied
> -	 */
> -	ret = intel_ring_begin(ring, 18);
> -	if (ret)
> -		return ret;
> -
>   	/* WaDisablePartialInstShootdown:bdw */
>   	/* WaDisableThreadStallDopClockGating:bdw */
> -	/* FIXME: Unclear whether we really need this on production bdw. */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
> -					     | STALL_DOP_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE |
> +		  STALL_DOP_GATING_DISABLE);
>
>   	/* WaDisableDopClockGating:bdw May not be needed for production */
> -	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> -			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
> +		  DOP_CLOCK_GATING_DISABLE);
>
> -	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> -			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> +	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
> +			  GEN8_SAMPLER_POWER_BYPASS_DIS);
>
>   	/* Use Force Non-Coherent whenever executing a 3D context. This is a
>   	 * workaround for for a possible hang in the unlikely event a TLB
>   	 * invalidation occurs during a PSD flush.
>   	 */
>   	/* WaDisableFenceDestinationToSLM:bdw (GT3 pre-production) */
> -	intel_ring_emit_wa(ring, HDC_CHICKEN0,
> -			   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT |
> -					      (IS_BDW_GT3(dev) ?
> -					       HDC_FENCE_DEST_SLM_DISABLE : 0)
> -				   ));
> +	WA_SET_BIT_MASKED(HDC_CHICKEN0,
> +			  HDC_FORCE_NON_COHERENT |
> +			  (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
>
>   	/* Wa4x4STCOptimizationDisable:bdw */
> -	intel_ring_emit_wa(ring, CACHE_MODE_1,
> -			   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
> +	WA_SET_BIT_MASKED(CACHE_MODE_1,
> +		  GEN8_4x4_STC_OPTIMIZATION_DISABLE);
>
>   	/*
>   	 * BSpec recommends 8x4 when MSAA is used,
> @@ -748,52 +775,50 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring)
>   	 * disable bit, which we don't touch here, but it's good
>   	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
>   	 */
> -	intel_ring_emit_wa(ring, GEN7_GT_MODE,
> -			   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
> -
> -	intel_ring_advance(ring);
> -
> -	DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
> -			 dev_priv->num_wa_regs);
> +	WA_SET_BIT_MASKED(GEN7_GT_MODE,
> +		  GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
>
>   	return 0;
>   }
>
>   static int chv_init_workarounds(struct intel_engine_cs *ring)
>   {
> -	int ret;
>   	struct drm_device *dev = ring->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
>
> -	/*
> -	 * workarounds applied in this fn are part of register state context,
> -	 * they need to be re-initialized followed by gpu reset, suspend/resume,
> -	 * module reload.
> -	 */
> -	dev_priv->num_wa_regs = 0;
> -	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> -	ret = intel_ring_begin(ring, 12);
> -	if (ret)
> -		return ret;
> -
>   	/* WaDisablePartialInstShootdown:chv */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
>
>   	/* WaDisableThreadStallDopClockGating:chv */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  STALL_DOP_GATING_DISABLE);
>
>   	/* WaDisableDopClockGating:chv (pre-production hw) */
> -	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> -			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
> +		  DOP_CLOCK_GATING_DISABLE);
>
>   	/* WaDisableSamplerPowerBypass:chv (pre-production hw) */
> -	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> -			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> +	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
> +		  GEN8_SAMPLER_POWER_BYPASS_DIS);
>
> -	intel_ring_advance(ring);
> +	return 0;
> +}
> +
> +static int init_workarounds_ring(struct intel_engine_cs *ring)
> +{
> +	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	WARN_ON(ring->id != RCS);
> +
> +	dev_priv->workarounds.count = 0;
> +
> +	if (IS_BROADWELL(dev))
> +		return bdw_init_workarounds(ring);
> +
> +	if (IS_CHERRYVIEW(dev))
> +		return chv_init_workarounds(ring);
>
>   	return 0;
>   }
> @@ -853,7 +878,7 @@ static int init_render_ring(struct intel_engine_cs *ring)
>   	if (HAS_L3_DPF(dev))
>   		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>
> -	return ret;
> +	return init_workarounds_ring(ring);
>   }
>
>   static void render_ring_cleanup(struct intel_engine_cs *ring)
> @@ -2299,10 +2324,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>   					dev_priv->semaphore_obj = obj;
>   			}
>   		}
> -		if (IS_CHERRYVIEW(dev))
> -			ring->init_context = chv_init_workarounds;
> -		else
> -			ring->init_context = bdw_init_workarounds;
> +
> +		ring->init_context = intel_ring_workarounds_emit;
>   		ring->add_request = gen6_add_request;
>   		ring->flush = gen8_render_ring_flush;
>   		ring->irq_get = gen8_ring_get_irq;
>

Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>

regards
Arun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] drm/i915: Check workaround status on dfs read time
  2014-10-07 14:21 ` [PATCH 4/4] drm/i915: Check workaround status on dfs read time Mika Kuoppala
@ 2014-10-13 15:22   ` Siluvery, Arun
  0 siblings, 0 replies; 10+ messages in thread
From: Siluvery, Arun @ 2014-10-13 15:22 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: miku

On 07/10/2014 15:21, Mika Kuoppala wrote:
> As the workaround list has the value as initialization time
> constant, we can do the simple checking on the go without
> negleting igt.
>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c | 14 +++++++-------
>   1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 87482f8..dbd5dc5 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2659,16 +2659,16 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>
>   	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
>   	for (i = 0; i < dev_priv->workarounds.count; ++i) {
> -		u32 addr, mask;
> +		u32 addr, mask, value, read;
> +		bool ok;
>
>   		addr = dev_priv->workarounds.reg[i].addr;
>   		mask = dev_priv->workarounds.reg[i].mask;
> -		dev_priv->workarounds.reg[i].value = I915_READ(addr) | mask;
> -		if (dev_priv->workarounds.reg[i].addr)
> -			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> -				   dev_priv->workarounds.reg[i].addr,
> -				   dev_priv->workarounds.reg[i].value,
> -				   dev_priv->workarounds.reg[i].mask);
> +		value = dev_priv->workarounds.reg[i].value;
> +		read = I915_READ(addr);
> +		ok = (value & mask) == (read & mask);
> +		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
> +			   addr, value, mask, read, ok ? "OK" : "FAIL");
>   	}
>
>   	intel_runtime_pm_put(dev_priv);
>

Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>

regards
Arun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] drm/i915: Build workaround list in ring initialization
  2014-10-07 14:21 ` [PATCH 3/4] drm/i915: Build workaround list in ring initialization Mika Kuoppala
  2014-10-13 15:21   ` Siluvery, Arun
@ 2014-10-20 16:13   ` Siluvery, Arun
  2014-10-21 12:52     ` Daniel Vetter
  2014-10-21 15:54   ` Mika Kuoppala
  2 siblings, 1 reply; 10+ messages in thread
From: Siluvery, Arun @ 2014-10-20 16:13 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: miku

On 07/10/2014 15:21, Mika Kuoppala wrote:
> If we build the workaround list in ring initialization
> and decouple it from the actual writing of values, we
> gain the ability to decide where and how we want to apply
> the values.
>
> The advantage of this will become more clear when
> we need to initialize workarounds on older gens where
> it is not possible to write all the registers through ring
> LRIs.
>
> v2: rebase on newest bdw workarounds
>
> Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c     |  20 ++--
>   drivers/gpu/drm/i915/i915_drv.h         |  28 ++---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 185 ++++++++++++++++++--------------
>   3 files changed, 130 insertions(+), 103 deletions(-)

Hi Daniel,

Patches 3, 4 in this series are independent of the first two.
Could you please pull-in these patches?

regards
Arun

>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index da4036d..87482f8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2655,18 +2655,20 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>
>   	intel_runtime_pm_get(dev_priv);
>
> -	seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
> -	for (i = 0; i < dev_priv->num_wa_regs; ++i) {
> +	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> +
> +	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
> +	for (i = 0; i < dev_priv->workarounds.count; ++i) {
>   		u32 addr, mask;
>
> -		addr = dev_priv->intel_wa_regs[i].addr;
> -		mask = dev_priv->intel_wa_regs[i].mask;
> -		dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
> -		if (dev_priv->intel_wa_regs[i].addr)
> +		addr = dev_priv->workarounds.reg[i].addr;
> +		mask = dev_priv->workarounds.reg[i].mask;
> +		dev_priv->workarounds.reg[i].value = I915_READ(addr) | mask;
> +		if (dev_priv->workarounds.reg[i].addr)
>   			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> -				   dev_priv->intel_wa_regs[i].addr,
> -				   dev_priv->intel_wa_regs[i].value,
> -				   dev_priv->intel_wa_regs[i].mask);
> +				   dev_priv->workarounds.reg[i].addr,
> +				   dev_priv->workarounds.reg[i].value,
> +				   dev_priv->workarounds.reg[i].mask);
>   	}
>
>   	intel_runtime_pm_put(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1e476b5..f7265bf 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1448,6 +1448,20 @@ struct i915_frontbuffer_tracking {
>   	unsigned flip_bits;
>   };
>
> +struct i915_wa_reg {
> +	u32 addr;
> +	u32 value;
> +	/* bitmask representing WA bits */
> +	u32 mask;
> +};
> +
> +#define I915_MAX_WA_REGS 16
> +
> +struct i915_workarounds {
> +	struct i915_wa_reg reg[I915_MAX_WA_REGS];
> +	u32 count;
> +};
> +
>   struct drm_i915_private {
>   	struct drm_device *dev;
>   	struct kmem_cache *slab;
> @@ -1590,19 +1604,7 @@ struct drm_i915_private {
>   	struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
>   	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
>
> -	/*
> -	 * workarounds are currently applied at different places and
> -	 * changes are being done to consolidate them so exact count is
> -	 * not clear at this point, use a max value for now.
> -	 */
> -#define I915_MAX_WA_REGS  16
> -	struct {
> -		u32 addr;
> -		u32 value;
> -		/* bitmask representing WA bits */
> -		u32 mask;
> -	} intel_wa_regs[I915_MAX_WA_REGS];
> -	u32 num_wa_regs;
> +	struct i915_workarounds workarounds;
>
>   	/* Reclocking support */
>   	bool render_reclock_avail;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 816a692..12a546f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -665,80 +665,107 @@ err:
>   	return ret;
>   }
>
> -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
> -				       u32 addr, u32 value)
> +static int intel_ring_workarounds_emit(struct intel_engine_cs *ring)
>   {
> +	int ret, i;
>   	struct drm_device *dev = ring->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_workarounds *w = &dev_priv->workarounds;
>
> -	if (WARN_ON(dev_priv->num_wa_regs >= I915_MAX_WA_REGS))
> -		return;
> +	if (WARN_ON(w->count == 0))
> +		return 0;
>
> -	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> -	intel_ring_emit(ring, addr);
> -	intel_ring_emit(ring, value);
> +	ring->gpu_caches_dirty = true;
> +	ret = intel_ring_flush_all_caches(ring);
> +	if (ret)
> +		return ret;
>
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = value & 0xFFFF;
> -	/* value is updated with the status of remaining bits of this
> -	 * register when it is read from debugfs file
> -	 */
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
> -	dev_priv->num_wa_regs++;
> +	ret = intel_ring_begin(ring, w->count * 3);
> +	if (ret)
> +		return ret;
> +
> +	for (i = 0; i < w->count; i++) {
> +		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +		intel_ring_emit(ring, w->reg[i].addr);
> +		intel_ring_emit(ring, w->reg[i].value);
> +	}
> +
> +	intel_ring_advance(ring);
> +
> +	ring->gpu_caches_dirty = true;
> +	ret = intel_ring_flush_all_caches(ring);
> +	if (ret)
> +		return ret;
> +
> +	DRM_DEBUG_DRIVER("Number of Workarounds emitted: %d\n", w->count);
>
> -	return;
> +	return 0;
> +}
> +
> +static int wa_add(struct drm_i915_private *dev_priv,
> +		  const u32 addr, const u32 val, const u32 mask)
> +{
> +	const u32 idx = dev_priv->workarounds.count;
> +
> +	if (WARN_ON(idx >= I915_MAX_WA_REGS))
> +		return -ENOSPC;
> +
> +	dev_priv->workarounds.reg[idx].addr = addr;
> +	dev_priv->workarounds.reg[idx].value = val;
> +	dev_priv->workarounds.reg[idx].mask = mask;
> +
> +	dev_priv->workarounds.count++;
> +
> +	return 0;
>   }
>
> +#define WA_REG(addr, val, mask) { \
> +		const int r = wa_add(dev_priv, (addr), (val), (mask)); \
> +		if (r) \
> +			return r; \
> +	}
> +
> +#define WA_SET_BIT_MASKED(addr, mask) WA_REG(addr, \
> +				    _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
> +
> +#define WA_CLR_BIT_MASKED(addr, mask) WA_REG(addr, \
> +				    _MASKED_BIT_DISABLE(mask), (mask) & 0xffff)
> +
> +#define WA_SET_BIT(addr, mask) WA_REG(addr, I915_READ(addr) | (mask), mask)
> +#define WA_CLR_BIT(addr, mask) WA_REG(addr, I915_READ(addr) & ~(mask), mask)
> +
> +#define WA_WRITE(addr, val) WA_REG(addr, val, 0xffffffff)
> +
>   static int bdw_init_workarounds(struct intel_engine_cs *ring)
>   {
> -	int ret;
>   	struct drm_device *dev = ring->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
>
> -	/*
> -	 * workarounds applied in this fn are part of register state context,
> -	 * they need to be re-initialized followed by gpu reset, suspend/resume,
> -	 * module reload.
> -	 */
> -	dev_priv->num_wa_regs = 0;
> -	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> -	/*
> -	 * update the number of dwords required based on the
> -	 * actual number of workarounds applied
> -	 */
> -	ret = intel_ring_begin(ring, 18);
> -	if (ret)
> -		return ret;
> -
>   	/* WaDisablePartialInstShootdown:bdw */
>   	/* WaDisableThreadStallDopClockGating:bdw */
> -	/* FIXME: Unclear whether we really need this on production bdw. */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
> -					     | STALL_DOP_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE |
> +		  STALL_DOP_GATING_DISABLE);
>
>   	/* WaDisableDopClockGating:bdw May not be needed for production */
> -	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> -			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
> +		  DOP_CLOCK_GATING_DISABLE);
>
> -	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> -			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> +	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
> +			  GEN8_SAMPLER_POWER_BYPASS_DIS);
>
>   	/* Use Force Non-Coherent whenever executing a 3D context. This is a
>   	 * workaround for for a possible hang in the unlikely event a TLB
>   	 * invalidation occurs during a PSD flush.
>   	 */
>   	/* WaDisableFenceDestinationToSLM:bdw (GT3 pre-production) */
> -	intel_ring_emit_wa(ring, HDC_CHICKEN0,
> -			   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT |
> -					      (IS_BDW_GT3(dev) ?
> -					       HDC_FENCE_DEST_SLM_DISABLE : 0)
> -				   ));
> +	WA_SET_BIT_MASKED(HDC_CHICKEN0,
> +			  HDC_FORCE_NON_COHERENT |
> +			  (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
>
>   	/* Wa4x4STCOptimizationDisable:bdw */
> -	intel_ring_emit_wa(ring, CACHE_MODE_1,
> -			   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
> +	WA_SET_BIT_MASKED(CACHE_MODE_1,
> +		  GEN8_4x4_STC_OPTIMIZATION_DISABLE);
>
>   	/*
>   	 * BSpec recommends 8x4 when MSAA is used,
> @@ -748,52 +775,50 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring)
>   	 * disable bit, which we don't touch here, but it's good
>   	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
>   	 */
> -	intel_ring_emit_wa(ring, GEN7_GT_MODE,
> -			   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
> -
> -	intel_ring_advance(ring);
> -
> -	DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
> -			 dev_priv->num_wa_regs);
> +	WA_SET_BIT_MASKED(GEN7_GT_MODE,
> +		  GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
>
>   	return 0;
>   }
>
>   static int chv_init_workarounds(struct intel_engine_cs *ring)
>   {
> -	int ret;
>   	struct drm_device *dev = ring->dev;
>   	struct drm_i915_private *dev_priv = dev->dev_private;
>
> -	/*
> -	 * workarounds applied in this fn are part of register state context,
> -	 * they need to be re-initialized followed by gpu reset, suspend/resume,
> -	 * module reload.
> -	 */
> -	dev_priv->num_wa_regs = 0;
> -	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> -	ret = intel_ring_begin(ring, 12);
> -	if (ret)
> -		return ret;
> -
>   	/* WaDisablePartialInstShootdown:chv */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
>
>   	/* WaDisableThreadStallDopClockGating:chv */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  STALL_DOP_GATING_DISABLE);
>
>   	/* WaDisableDopClockGating:chv (pre-production hw) */
> -	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> -			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
> +		  DOP_CLOCK_GATING_DISABLE);
>
>   	/* WaDisableSamplerPowerBypass:chv (pre-production hw) */
> -	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> -			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> +	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
> +		  GEN8_SAMPLER_POWER_BYPASS_DIS);
>
> -	intel_ring_advance(ring);
> +	return 0;
> +}
> +
> +static int init_workarounds_ring(struct intel_engine_cs *ring)
> +{
> +	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	WARN_ON(ring->id != RCS);
> +
> +	dev_priv->workarounds.count = 0;
> +
> +	if (IS_BROADWELL(dev))
> +		return bdw_init_workarounds(ring);
> +
> +	if (IS_CHERRYVIEW(dev))
> +		return chv_init_workarounds(ring);
>
>   	return 0;
>   }
> @@ -853,7 +878,7 @@ static int init_render_ring(struct intel_engine_cs *ring)
>   	if (HAS_L3_DPF(dev))
>   		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>
> -	return ret;
> +	return init_workarounds_ring(ring);
>   }
>
>   static void render_ring_cleanup(struct intel_engine_cs *ring)
> @@ -2299,10 +2324,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>   					dev_priv->semaphore_obj = obj;
>   			}
>   		}
> -		if (IS_CHERRYVIEW(dev))
> -			ring->init_context = chv_init_workarounds;
> -		else
> -			ring->init_context = bdw_init_workarounds;
> +
> +		ring->init_context = intel_ring_workarounds_emit;
>   		ring->add_request = gen6_add_request;
>   		ring->flush = gen8_render_ring_flush;
>   		ring->irq_get = gen8_ring_get_irq;
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] drm/i915: Build workaround list in ring initialization
  2014-10-20 16:13   ` Siluvery, Arun
@ 2014-10-21 12:52     ` Daniel Vetter
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Vetter @ 2014-10-21 12:52 UTC (permalink / raw)
  To: Siluvery, Arun; +Cc: intel-gfx, miku

On Mon, Oct 20, 2014 at 05:13:45PM +0100, Siluvery, Arun wrote:
> On 07/10/2014 15:21, Mika Kuoppala wrote:
> >If we build the workaround list in ring initialization
> >and decouple it from the actual writing of values, we
> >gain the ability to decide where and how we want to apply
> >the values.
> >
> >The advantage of this will become more clear when
> >we need to initialize workarounds on older gens where
> >it is not possible to write all the registers through ring
> >LRIs.
> >
> >v2: rebase on newest bdw workarounds
> >
> >Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
> >Cc: Damien Lespiau <damien.lespiau@intel.com>
> >Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_debugfs.c     |  20 ++--
> >  drivers/gpu/drm/i915/i915_drv.h         |  28 ++---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 185 ++++++++++++++++++--------------
> >  3 files changed, 130 insertions(+), 103 deletions(-)
> 
> Hi Daniel,
> 
> Patches 3, 4 in this series are independent of the first two.
> Could you please pull-in these patches?

Thanks for the pointer, both patches merged. I did frob the whitespace a
bit on patch 3 though to satisfy my ocd ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] drm/i915: Build workaround list in ring initialization
  2014-10-07 14:21 ` [PATCH 3/4] drm/i915: Build workaround list in ring initialization Mika Kuoppala
  2014-10-13 15:21   ` Siluvery, Arun
  2014-10-20 16:13   ` Siluvery, Arun
@ 2014-10-21 15:54   ` Mika Kuoppala
  2 siblings, 0 replies; 10+ messages in thread
From: Mika Kuoppala @ 2014-10-21 15:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:

> If we build the workaround list in ring initialization
> and decouple it from the actual writing of values, we
> gain the ability to decide where and how we want to apply
> the values.
>
> The advantage of this will become more clear when
> we need to initialize workarounds on older gens where
> it is not possible to write all the registers through ring
> LRIs.
>
> v2: rebase on newest bdw workarounds
>
> Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c     |  20 ++--
>  drivers/gpu/drm/i915/i915_drv.h         |  28 ++---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 185 ++++++++++++++++++--------------
>  3 files changed, 130 insertions(+), 103 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index da4036d..87482f8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2655,18 +2655,20 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>  
>  	intel_runtime_pm_get(dev_priv);
>  
> -	seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
> -	for (i = 0; i < dev_priv->num_wa_regs; ++i) {
> +	gen6_gt_force_wake_get(dev_priv, FORCEWAKE_ALL);
> +

Just noticed that this extra wake_get is not correct in here.
-Mika

> +	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
> +	for (i = 0; i < dev_priv->workarounds.count; ++i) {
>  		u32 addr, mask;
>  
> -		addr = dev_priv->intel_wa_regs[i].addr;
> -		mask = dev_priv->intel_wa_regs[i].mask;
> -		dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
> -		if (dev_priv->intel_wa_regs[i].addr)
> +		addr = dev_priv->workarounds.reg[i].addr;
> +		mask = dev_priv->workarounds.reg[i].mask;
> +		dev_priv->workarounds.reg[i].value = I915_READ(addr) | mask;
> +		if (dev_priv->workarounds.reg[i].addr)
>  			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
> -				   dev_priv->intel_wa_regs[i].addr,
> -				   dev_priv->intel_wa_regs[i].value,
> -				   dev_priv->intel_wa_regs[i].mask);
> +				   dev_priv->workarounds.reg[i].addr,
> +				   dev_priv->workarounds.reg[i].value,
> +				   dev_priv->workarounds.reg[i].mask);
>  	}
>  
>  	intel_runtime_pm_put(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1e476b5..f7265bf 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1448,6 +1448,20 @@ struct i915_frontbuffer_tracking {
>  	unsigned flip_bits;
>  };
>  
> +struct i915_wa_reg {
> +	u32 addr;
> +	u32 value;
> +	/* bitmask representing WA bits */
> +	u32 mask;
> +};
> +
> +#define I915_MAX_WA_REGS 16
> +
> +struct i915_workarounds {
> +	struct i915_wa_reg reg[I915_MAX_WA_REGS];
> +	u32 count;
> +};
> +
>  struct drm_i915_private {
>  	struct drm_device *dev;
>  	struct kmem_cache *slab;
> @@ -1590,19 +1604,7 @@ struct drm_i915_private {
>  	struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
>  	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
>  
> -	/*
> -	 * workarounds are currently applied at different places and
> -	 * changes are being done to consolidate them so exact count is
> -	 * not clear at this point, use a max value for now.
> -	 */
> -#define I915_MAX_WA_REGS  16
> -	struct {
> -		u32 addr;
> -		u32 value;
> -		/* bitmask representing WA bits */
> -		u32 mask;
> -	} intel_wa_regs[I915_MAX_WA_REGS];
> -	u32 num_wa_regs;
> +	struct i915_workarounds workarounds;
>  
>  	/* Reclocking support */
>  	bool render_reclock_avail;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 816a692..12a546f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -665,80 +665,107 @@ err:
>  	return ret;
>  }
>  
> -static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
> -				       u32 addr, u32 value)
> +static int intel_ring_workarounds_emit(struct intel_engine_cs *ring)
>  {
> +	int ret, i;
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_workarounds *w = &dev_priv->workarounds;
>  
> -	if (WARN_ON(dev_priv->num_wa_regs >= I915_MAX_WA_REGS))
> -		return;
> +	if (WARN_ON(w->count == 0))
> +		return 0;
>  
> -	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> -	intel_ring_emit(ring, addr);
> -	intel_ring_emit(ring, value);
> +	ring->gpu_caches_dirty = true;
> +	ret = intel_ring_flush_all_caches(ring);
> +	if (ret)
> +		return ret;
>  
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = value & 0xFFFF;
> -	/* value is updated with the status of remaining bits of this
> -	 * register when it is read from debugfs file
> -	 */
> -	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
> -	dev_priv->num_wa_regs++;
> +	ret = intel_ring_begin(ring, w->count * 3);
> +	if (ret)
> +		return ret;
> +
> +	for (i = 0; i < w->count; i++) {
> +		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> +		intel_ring_emit(ring, w->reg[i].addr);
> +		intel_ring_emit(ring, w->reg[i].value);
> +	}
> +
> +	intel_ring_advance(ring);
> +
> +	ring->gpu_caches_dirty = true;
> +	ret = intel_ring_flush_all_caches(ring);
> +	if (ret)
> +		return ret;
> +
> +	DRM_DEBUG_DRIVER("Number of Workarounds emitted: %d\n", w->count);
>  
> -	return;
> +	return 0;
> +}
> +
> +static int wa_add(struct drm_i915_private *dev_priv,
> +		  const u32 addr, const u32 val, const u32 mask)
> +{
> +	const u32 idx = dev_priv->workarounds.count;
> +
> +	if (WARN_ON(idx >= I915_MAX_WA_REGS))
> +		return -ENOSPC;
> +
> +	dev_priv->workarounds.reg[idx].addr = addr;
> +	dev_priv->workarounds.reg[idx].value = val;
> +	dev_priv->workarounds.reg[idx].mask = mask;
> +
> +	dev_priv->workarounds.count++;
> +
> +	return 0;
>  }
>  
> +#define WA_REG(addr, val, mask) { \
> +		const int r = wa_add(dev_priv, (addr), (val), (mask)); \
> +		if (r) \
> +			return r; \
> +	}
> +
> +#define WA_SET_BIT_MASKED(addr, mask) WA_REG(addr, \
> +				    _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
> +
> +#define WA_CLR_BIT_MASKED(addr, mask) WA_REG(addr, \
> +				    _MASKED_BIT_DISABLE(mask), (mask) & 0xffff)
> +
> +#define WA_SET_BIT(addr, mask) WA_REG(addr, I915_READ(addr) | (mask), mask)
> +#define WA_CLR_BIT(addr, mask) WA_REG(addr, I915_READ(addr) & ~(mask), mask)
> +
> +#define WA_WRITE(addr, val) WA_REG(addr, val, 0xffffffff)
> +
>  static int bdw_init_workarounds(struct intel_engine_cs *ring)
>  {
> -	int ret;
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	/*
> -	 * workarounds applied in this fn are part of register state context,
> -	 * they need to be re-initialized followed by gpu reset, suspend/resume,
> -	 * module reload.
> -	 */
> -	dev_priv->num_wa_regs = 0;
> -	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> -	/*
> -	 * update the number of dwords required based on the
> -	 * actual number of workarounds applied
> -	 */
> -	ret = intel_ring_begin(ring, 18);
> -	if (ret)
> -		return ret;
> -
>  	/* WaDisablePartialInstShootdown:bdw */
>  	/* WaDisableThreadStallDopClockGating:bdw */
> -	/* FIXME: Unclear whether we really need this on production bdw. */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
> -					     | STALL_DOP_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE |
> +		  STALL_DOP_GATING_DISABLE);
>  
>  	/* WaDisableDopClockGating:bdw May not be needed for production */
> -	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> -			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
> +		  DOP_CLOCK_GATING_DISABLE);
>  
> -	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> -			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> +	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
> +			  GEN8_SAMPLER_POWER_BYPASS_DIS);
>  
>  	/* Use Force Non-Coherent whenever executing a 3D context. This is a
>  	 * workaround for for a possible hang in the unlikely event a TLB
>  	 * invalidation occurs during a PSD flush.
>  	 */
>  	/* WaDisableFenceDestinationToSLM:bdw (GT3 pre-production) */
> -	intel_ring_emit_wa(ring, HDC_CHICKEN0,
> -			   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT |
> -					      (IS_BDW_GT3(dev) ?
> -					       HDC_FENCE_DEST_SLM_DISABLE : 0)
> -				   ));
> +	WA_SET_BIT_MASKED(HDC_CHICKEN0,
> +			  HDC_FORCE_NON_COHERENT |
> +			  (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
>  
>  	/* Wa4x4STCOptimizationDisable:bdw */
> -	intel_ring_emit_wa(ring, CACHE_MODE_1,
> -			   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
> +	WA_SET_BIT_MASKED(CACHE_MODE_1,
> +		  GEN8_4x4_STC_OPTIMIZATION_DISABLE);
>  
>  	/*
>  	 * BSpec recommends 8x4 when MSAA is used,
> @@ -748,52 +775,50 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring)
>  	 * disable bit, which we don't touch here, but it's good
>  	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
>  	 */
> -	intel_ring_emit_wa(ring, GEN7_GT_MODE,
> -			   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
> -
> -	intel_ring_advance(ring);
> -
> -	DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
> -			 dev_priv->num_wa_regs);
> +	WA_SET_BIT_MASKED(GEN7_GT_MODE,
> +		  GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
>  
>  	return 0;
>  }
>  
>  static int chv_init_workarounds(struct intel_engine_cs *ring)
>  {
> -	int ret;
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	/*
> -	 * workarounds applied in this fn are part of register state context,
> -	 * they need to be re-initialized followed by gpu reset, suspend/resume,
> -	 * module reload.
> -	 */
> -	dev_priv->num_wa_regs = 0;
> -	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
> -
> -	ret = intel_ring_begin(ring, 12);
> -	if (ret)
> -		return ret;
> -
>  	/* WaDisablePartialInstShootdown:chv */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
>  
>  	/* WaDisableThreadStallDopClockGating:chv */
> -	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
> -			   _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
> +		  STALL_DOP_GATING_DISABLE);
>  
>  	/* WaDisableDopClockGating:chv (pre-production hw) */
> -	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
> -			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
> +	WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
> +		  DOP_CLOCK_GATING_DISABLE);
>  
>  	/* WaDisableSamplerPowerBypass:chv (pre-production hw) */
> -	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
> -			   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
> +	WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
> +		  GEN8_SAMPLER_POWER_BYPASS_DIS);
>  
> -	intel_ring_advance(ring);
> +	return 0;
> +}
> +
> +static int init_workarounds_ring(struct intel_engine_cs *ring)
> +{
> +	struct drm_device *dev = ring->dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	WARN_ON(ring->id != RCS);
> +
> +	dev_priv->workarounds.count = 0;
> +
> +	if (IS_BROADWELL(dev))
> +		return bdw_init_workarounds(ring);
> +
> +	if (IS_CHERRYVIEW(dev))
> +		return chv_init_workarounds(ring);
>  
>  	return 0;
>  }
> @@ -853,7 +878,7 @@ static int init_render_ring(struct intel_engine_cs *ring)
>  	if (HAS_L3_DPF(dev))
>  		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
>  
> -	return ret;
> +	return init_workarounds_ring(ring);
>  }
>  
>  static void render_ring_cleanup(struct intel_engine_cs *ring)
> @@ -2299,10 +2324,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>  					dev_priv->semaphore_obj = obj;
>  			}
>  		}
> -		if (IS_CHERRYVIEW(dev))
> -			ring->init_context = chv_init_workarounds;
> -		else
> -			ring->init_context = bdw_init_workarounds;
> +
> +		ring->init_context = intel_ring_workarounds_emit;
>  		ring->add_request = gen6_add_request;
>  		ring->flush = gen8_render_ring_flush;
>  		ring->irq_get = gen8_ring_get_irq;
> -- 
> 1.9.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-10-21 21:54 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-07 14:21 [PATCH 1/4] drm/i915: Restore default render context after hw state reset Mika Kuoppala
2014-10-07 14:21 ` [PATCH 2/4] drm/i915: Reinitialize default context after resume Mika Kuoppala
2014-10-07 15:06   ` Chris Wilson
2014-10-07 14:21 ` [PATCH 3/4] drm/i915: Build workaround list in ring initialization Mika Kuoppala
2014-10-13 15:21   ` Siluvery, Arun
2014-10-20 16:13   ` Siluvery, Arun
2014-10-21 12:52     ` Daniel Vetter
2014-10-21 15:54   ` Mika Kuoppala
2014-10-07 14:21 ` [PATCH 4/4] drm/i915: Check workaround status on dfs read time Mika Kuoppala
2014-10-13 15:22   ` Siluvery, Arun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox