public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn
@ 2014-08-22 19:39 Arun Siluvery
  0 siblings, 0 replies; 9+ messages in thread
From: Arun Siluvery @ 2014-08-22 19:39 UTC (permalink / raw)
  To: intel-gfx

Workarounds for BDW are applied using LRIs during render ring initialization.
Only those WA registers that are part of register state are initialized
in this fn, remaining are still in its current place init_clock_gating() which
are not affected by a gpu reset. I can send another patch where they can be
moved to render ring init function but during testing I found their state
doesn't change after reset.

Arun Siluvery (2):
  drm/i915/bdw: Apply workarounds in render ring init function
  drm/i915/bdw: Export workaround data to debugfs

 drivers/gpu/drm/i915/i915_debugfs.c     | 40 +++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h         | 14 ++++++
 drivers/gpu/drm/i915/i915_gem_context.c |  6 +++
 drivers/gpu/drm/i915/intel_pm.c         | 48 --------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 77 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 17 ++++++++
 6 files changed, 154 insertions(+), 48 deletions(-)

-- 
2.0.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn
@ 2014-08-26  9:33 Arun Siluvery
  2014-08-26  9:33 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function Arun Siluvery
  2014-08-26  9:33 ` [PATCH 2/2] drm/i915/bdw: Export workaround data to debugfs Arun Siluvery
  0 siblings, 2 replies; 9+ messages in thread
From: Arun Siluvery @ 2014-08-26  9:33 UTC (permalink / raw)
  To: intel-gfx

Workarounds for BDW are applied using LRIs during render ring initialization.
Only those WA registers that are part of register state are initialized
in this fn, remaining are still in its current place init_clock_gating() which
are not affected by a gpu reset. I can send another patch where they can be
moved to render ring init function but during testing I found their state
doesn't change after reset.

Arun Siluvery (2):
  drm/i915/bdw: Apply workarounds in render ring init function
  drm/i915/bdw: Export workaround data to debugfs

 drivers/gpu/drm/i915/i915_debugfs.c     |  40 +++++++++++++
 drivers/gpu/drm/i915/i915_drv.h         |  14 +++++
 drivers/gpu/drm/i915/i915_gem_context.c |   6 ++
 drivers/gpu/drm/i915/intel_pm.c         |  48 ---------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 101 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
 6 files changed, 162 insertions(+), 48 deletions(-)

-- 
2.0.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function
  2014-08-26  9:33 [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn Arun Siluvery
@ 2014-08-26  9:33 ` Arun Siluvery
  2014-08-26 10:09   ` Chris Wilson
  2014-08-26  9:33 ` [PATCH 2/2] drm/i915/bdw: Export workaround data to debugfs Arun Siluvery
  1 sibling, 1 reply; 9+ messages in thread
From: Arun Siluvery @ 2014-08-26  9:33 UTC (permalink / raw)
  To: intel-gfx

For BDW workarounds are currently initialized in init_clock_gating() but
they are lost during reset, suspend/resume etc; this patch moves the WAs
that are part of register state context to render ring init fn otherwise
default context ends up with incorrect values as they don't get initialized
until init_clock_gating fn.

v2: Add workarounds to golden render state
This method has its own issues, first of all this is different for
each gen and it is generated using a tool so adding new workaround
and mainitaining them across gens is not a straightforward process.

v3: Use LRIs to emit these workarounds (Ville)
Instead of modifying the golden render state the same LRIs are
emitted from within the driver.

For: VIZ-4092
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |  6 +++
 drivers/gpu/drm/i915/intel_pm.c         | 48 --------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 78 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 85 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9683e62..2debce4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -631,20 +631,26 @@ static int do_switch(struct intel_engine_cs *ring,
 	}
 
 	uninitialized = !to->legacy_hw_ctx.initialized && from == NULL;
 	to->legacy_hw_ctx.initialized = true;
 
 done:
 	i915_gem_context_reference(to);
 	ring->last_context = to;
 
 	if (uninitialized) {
+		if (IS_BROADWELL(ring->dev)) {
+			ret = bdw_init_workarounds(ring);
+			if (ret)
+				DRM_ERROR("init workarounds: %d\n", ret);
+		}
+
 		ret = i915_gem_render_state_init(ring);
 		if (ret)
 			DRM_ERROR("init render state: %d\n", ret);
 	}
 
 	return 0;
 
 unpin_out:
 	if (ring->id == RCS)
 		i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c8f744c..668acd9 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5507,101 +5507,53 @@ static void gen8_init_clock_gating(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	enum pipe pipe;
 
 	I915_WRITE(WM3_LP_ILK, 0);
 	I915_WRITE(WM2_LP_ILK, 0);
 	I915_WRITE(WM1_LP_ILK, 0);
 
 	/* FIXME(BDW): Check all the w/a, some might only apply to
 	 * pre-production hw. */
 
-	/* WaDisablePartialInstShootdown:bdw */
-	I915_WRITE(GEN8_ROW_CHICKEN,
-		   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE));
-
-	/* WaDisableThreadStallDopClockGating:bdw */
-	/* FIXME: Unclear whether we really need this on production bdw. */
-	I915_WRITE(GEN8_ROW_CHICKEN,
-		   _MASKED_BIT_ENABLE(STALL_DOP_GATING_DISABLE));
 
-	/*
-	 * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
-	 * pre-production hardware
-	 */
-	I915_WRITE(HALF_SLICE_CHICKEN3,
-		   _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS));
-	I915_WRITE(HALF_SLICE_CHICKEN3,
-		   _MASKED_BIT_ENABLE(GEN8_SAMPLER_POWER_BYPASS_DIS));
 	I915_WRITE(GAMTARBMODE, _MASKED_BIT_ENABLE(ARB_MODE_BWGTLB_DISABLE));
 
 	I915_WRITE(_3D_CHICKEN3,
 		   _MASKED_BIT_ENABLE(_3D_CHICKEN_SDE_LIMIT_FIFO_POLY_DEPTH(2)));
 
-	I915_WRITE(COMMON_SLICE_CHICKEN2,
-		   _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE));
-
-	I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
-		   _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE));
-
-	/* WaDisableDopClockGating:bdw May not be needed for production */
-	I915_WRITE(GEN7_ROW_CHICKEN2,
-		   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
 
 	/* WaSwitchSolVfFArbitrationPriority:bdw */
 	I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
 
 	/* WaPsrDPAMaskVBlankInSRD:bdw */
 	I915_WRITE(CHICKEN_PAR1_1,
 		   I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
 
 	/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
 	for_each_pipe(pipe) {
 		I915_WRITE(CHICKEN_PIPESL_1(pipe),
 			   I915_READ(CHICKEN_PIPESL_1(pipe)) |
 			   BDW_DPRS_MASK_VBLANK_SRD);
 	}
 
-	/* Use Force Non-Coherent whenever executing a 3D context. This is a
-	 * workaround for for a possible hang in the unlikely event a TLB
-	 * invalidation occurs during a PSD flush.
-	 */
-	I915_WRITE(HDC_CHICKEN0,
-		   I915_READ(HDC_CHICKEN0) |
-		   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT));
-
 	/* WaVSRefCountFullforceMissDisable:bdw */
 	/* WaDSRefCountFullforceMissDisable:bdw */
 	I915_WRITE(GEN7_FF_THREAD_MODE,
 		   I915_READ(GEN7_FF_THREAD_MODE) &
 		   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
 
-	/*
-	 * BSpec recommends 8x4 when MSAA is used,
-	 * however in practice 16x4 seems fastest.
-	 *
-	 * Note that PS/WM thread counts depend on the WIZ hashing
-	 * disable bit, which we don't touch here, but it's good
-	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
-	 */
-	I915_WRITE(GEN7_GT_MODE,
-		   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
-
 	I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
 		   _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
 
 	/* WaDisableSDEUnitClockGating:bdw */
 	I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
 		   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
-	/* Wa4x4STCOptimizationDisable:bdw */
-	I915_WRITE(CACHE_MODE_1,
-		   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
 }
 
 static void haswell_init_clock_gating(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	ilk_init_lp_watermarks(dev);
 
 	/* L3 caching of data atomics doesn't work -- disable it. */
 	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 13543f8..c1e7da6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -643,20 +643,98 @@ intel_init_pipe_control(struct intel_engine_cs *ring)
 	return 0;
 
 err_unpin:
 	i915_gem_object_ggtt_unpin(ring->scratch.obj);
 err_unref:
 	drm_gem_object_unreference(&ring->scratch.obj->base);
 err:
 	return ret;
 }
 
+static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
+				       u32 addr, u32 value)
+{
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+	intel_ring_emit(ring, addr);
+	intel_ring_emit(ring, value);
+}
+
+int bdw_init_workarounds(struct intel_engine_cs *ring)
+{
+	int ret;
+
+	/*
+	 * workarounds applied in this fn are part of register state context,
+	 * they need to be re-initialized followed by gpu reset, suspend/resume,
+	 * module reload.
+	 */
+
+	/*
+	 * update the number of dwords required based on the
+	 * actual number of workarounds applied
+	 */
+	ret = intel_ring_begin(ring, 24);
+	if (ret)
+		return ret;
+
+	/* WaDisablePartialInstShootdown:bdw */
+	/* WaDisableThreadStallDopClockGating:bdw */
+	/* FIXME: Unclear whether we really need this on production bdw. */
+	intel_ring_emit_wa(ring, GEN8_ROW_CHICKEN,
+			   _MASKED_BIT_ENABLE(PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE
+					     | STALL_DOP_GATING_DISABLE));
+
+	/* WaDisableDopClockGating:bdw May not be needed for production */
+	intel_ring_emit_wa(ring, GEN7_ROW_CHICKEN2,
+			   _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+
+	/*
+	 * This GEN8_CENTROID_PIXEL_OPT_DIS W/A is only needed for
+	 * pre-production hardware
+	 */
+	intel_ring_emit_wa(ring, HALF_SLICE_CHICKEN3,
+			   _MASKED_BIT_ENABLE(GEN8_CENTROID_PIXEL_OPT_DIS
+					      | GEN8_SAMPLER_POWER_BYPASS_DIS));
+
+	intel_ring_emit_wa(ring, GEN7_HALF_SLICE_CHICKEN1,
+			   _MASKED_BIT_ENABLE(GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE));
+
+	intel_ring_emit_wa(ring, COMMON_SLICE_CHICKEN2,
+			   _MASKED_BIT_ENABLE(GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE));
+
+	/* Use Force Non-Coherent whenever executing a 3D context. This is a
+	 * workaround for for a possible hang in the unlikely event a TLB
+	 * invalidation occurs during a PSD flush.
+	 */
+	intel_ring_emit_wa(ring, HDC_CHICKEN0,
+			   _MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT));
+
+	/* Wa4x4STCOptimizationDisable:bdw */
+	intel_ring_emit_wa(ring, CACHE_MODE_1,
+			   _MASKED_BIT_ENABLE(GEN8_4x4_STC_OPTIMIZATION_DISABLE));
+
+	/*
+	 * BSpec recommends 8x4 when MSAA is used,
+	 * however in practice 16x4 seems fastest.
+	 *
+	 * Note that PS/WM thread counts depend on the WIZ hashing
+	 * disable bit, which we don't touch here, but it's good
+	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
+	 */
+	intel_ring_emit_wa(ring, GEN7_GT_MODE,
+			   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
+
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int init_render_ring(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret = init_ring_common(ring);
 	if (ret)
 		return ret;
 
 	/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
 	if (INTEL_INFO(dev)->gen >= 4 && INTEL_INFO(dev)->gen < 7)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9cbf7b0..9da4107 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -411,20 +411,21 @@ int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
 int intel_ring_invalidate_all_caches(struct intel_engine_cs *ring);
 
 void intel_fini_pipe_control(struct intel_engine_cs *ring);
 int intel_init_pipe_control(struct intel_engine_cs *ring);
 
 int intel_init_render_ring_buffer(struct drm_device *dev);
 int intel_init_bsd_ring_buffer(struct drm_device *dev);
 int intel_init_bsd2_ring_buffer(struct drm_device *dev);
 int intel_init_blt_ring_buffer(struct drm_device *dev);
 int intel_init_vebox_ring_buffer(struct drm_device *dev);
+int bdw_init_workarounds(struct intel_engine_cs *ring);
 
 u64 intel_ring_get_active_head(struct intel_engine_cs *ring);
 void intel_ring_setup_status_page(struct intel_engine_cs *ring);
 
 static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf)
 {
 	return ringbuf->tail;
 }
 
 static inline u32 intel_ring_get_seqno(struct intel_engine_cs *ring)
-- 
2.0.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] drm/i915/bdw: Export workaround data to debugfs
  2014-08-26  9:33 [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn Arun Siluvery
  2014-08-26  9:33 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function Arun Siluvery
@ 2014-08-26  9:33 ` Arun Siluvery
  1 sibling, 0 replies; 9+ messages in thread
From: Arun Siluvery @ 2014-08-26  9:33 UTC (permalink / raw)
  To: intel-gfx

The workarounds that are applied are exported to a debugfs file;
this is used to verify their state after the test case (reset or
suspend/resume etc). This patch is only required to support i-g-t.

Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 40 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h         | 14 ++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++++++++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d42db6b..f0d63f6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2451,20 +2451,59 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
 		seq_printf(m, " dpll_md: 0x%08x\n", pll->hw_state.dpll_md);
 		seq_printf(m, " fp0:     0x%08x\n", pll->hw_state.fp0);
 		seq_printf(m, " fp1:     0x%08x\n", pll->hw_state.fp1);
 		seq_printf(m, " wrpll:   0x%08x\n", pll->hw_state.wrpll);
 	}
 	drm_modeset_unlock_all(dev);
 
 	return 0;
 }
 
+static int intel_wa_registers(struct seq_file *m, void *unused)
+{
+	int i;
+	int ret;
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_BROADWELL(dev)) {
+		DRM_DEBUG_DRIVER("Workaround table not available !!\n");
+		return -EINVAL;
+	}
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	intel_runtime_pm_get(dev_priv);
+
+	seq_printf(m, "Workarounds applied: %d\n", dev_priv->num_wa_regs);
+	for (i = 0; i < dev_priv->num_wa_regs; ++i) {
+		u32 addr, mask;
+
+		addr = dev_priv->intel_wa_regs[i].addr;
+		mask = dev_priv->intel_wa_regs[i].mask;
+		dev_priv->intel_wa_regs[i].value = I915_READ(addr) | mask;
+		if (dev_priv->intel_wa_regs[i].addr)
+			seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X\n",
+				   dev_priv->intel_wa_regs[i].addr,
+				   dev_priv->intel_wa_regs[i].value,
+				   dev_priv->intel_wa_regs[i].mask);
+	}
+
+	intel_runtime_pm_put(dev_priv);
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
 struct pipe_crc_info {
 	const char *name;
 	struct drm_device *dev;
 	enum pipe pipe;
 };
 
 static int i915_dp_mst_info(struct seq_file *m, void *unused)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
@@ -3980,20 +4019,21 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_llc", i915_llc, 0},
 	{"i915_edp_psr_status", i915_edp_psr_status, 0},
 	{"i915_sink_crc_eDP1", i915_sink_crc, 0},
 	{"i915_energy_uJ", i915_energy_uJ, 0},
 	{"i915_pc8_status", i915_pc8_status, 0},
 	{"i915_power_domain_info", i915_power_domain_info, 0},
 	{"i915_display_info", i915_display_info, 0},
 	{"i915_semaphore_status", i915_semaphore_status, 0},
 	{"i915_shared_dplls_info", i915_shared_dplls_info, 0},
 	{"i915_dp_mst_info", i915_dp_mst_info, 0},
+	{"intel_wa_registers", intel_wa_registers, 0}
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
 static const struct i915_debugfs_files {
 	const char *name;
 	const struct file_operations *fops;
 } i915_debugfs_files[] = {
 	{"i915_wedged", &i915_wedged_fops},
 	{"i915_max_freq", &i915_max_freq_fops},
 	{"i915_min_freq", &i915_min_freq_fops},
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bcf79f0..49b7be7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1546,20 +1546,34 @@ struct drm_i915_private {
 	wait_queue_head_t pending_flip_queue;
 
 #ifdef CONFIG_DEBUG_FS
 	struct intel_pipe_crc pipe_crc[I915_MAX_PIPES];
 #endif
 
 	int num_shared_dpll;
 	struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
 	int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
 
+	/*
+	 * workarounds are currently applied at different places and
+	 * changes are being done to consolidate them so exact count is
+	 * not clear at this point, use a max value for now.
+	 */
+#define I915_MAX_WA_REGS  16
+	struct {
+		u32 addr;
+		u32 value;
+		/* bitmask representing WA bits */
+		u32 mask;
+	} intel_wa_regs[I915_MAX_WA_REGS];
+	u32 num_wa_regs;
+
 	/* Reclocking support */
 	bool render_reclock_avail;
 	bool lvds_downclock_avail;
 	/* indicates the reduced downclock for LVDS*/
 	int lvds_downclock;
 
 	struct i915_frontbuffer_tracking fb_tracking;
 
 	u16 orig_clock;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c1e7da6..0e496d2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -646,34 +646,54 @@ err_unpin:
 	i915_gem_object_ggtt_unpin(ring->scratch.obj);
 err_unref:
 	drm_gem_object_unreference(&ring->scratch.obj->base);
 err:
 	return ret;
 }
 
 static inline void intel_ring_emit_wa(struct intel_engine_cs *ring,
 				       u32 addr, u32 value)
 {
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->num_wa_regs > I915_MAX_WA_REGS)
+		return;
+
 	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
 	intel_ring_emit(ring, addr);
 	intel_ring_emit(ring, value);
+
+	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].addr = addr;
+	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].mask = (value) & 0xFFFF;
+	/* value is updated with the status of remaining bits of this
+	 * register when it is read from debugfs file
+	 */
+	dev_priv->intel_wa_regs[dev_priv->num_wa_regs].value = value;
+	dev_priv->num_wa_regs++;
+
+	return;
 }
 
 int bdw_init_workarounds(struct intel_engine_cs *ring)
 {
 	int ret;
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	/*
 	 * workarounds applied in this fn are part of register state context,
 	 * they need to be re-initialized followed by gpu reset, suspend/resume,
 	 * module reload.
 	 */
+	dev_priv->num_wa_regs = 0;
+	memset(dev_priv->intel_wa_regs, 0, sizeof(dev_priv->intel_wa_regs));
 
 	/*
 	 * update the number of dwords required based on the
 	 * actual number of workarounds applied
 	 */
 	ret = intel_ring_begin(ring, 24);
 	if (ret)
 		return ret;
 
 	/* WaDisablePartialInstShootdown:bdw */
@@ -718,20 +738,23 @@ int bdw_init_workarounds(struct intel_engine_cs *ring)
 	 *
 	 * Note that PS/WM thread counts depend on the WIZ hashing
 	 * disable bit, which we don't touch here, but it's good
 	 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 	 */
 	intel_ring_emit_wa(ring, GEN7_GT_MODE,
 			   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
 
 	intel_ring_advance(ring);
 
+	DRM_DEBUG_DRIVER("Number of Workarounds applied: %d\n",
+			 dev_priv->num_wa_regs);
+
 	return 0;
 }
 
 static int init_render_ring(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret = init_ring_common(ring);
 	if (ret)
 		return ret;
-- 
2.0.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function
  2014-08-26  9:33 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function Arun Siluvery
@ 2014-08-26 10:09   ` Chris Wilson
  2014-08-26 10:16     ` Siluvery, Arun
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2014-08-26 10:09 UTC (permalink / raw)
  To: Arun Siluvery; +Cc: intel-gfx

On Tue, Aug 26, 2014 at 10:33:16AM +0100, Arun Siluvery wrote:
> For BDW workarounds are currently initialized in init_clock_gating() but
> they are lost during reset, suspend/resume etc; this patch moves the WAs
> that are part of register state context to render ring init fn otherwise
> default context ends up with incorrect values as they don't get initialized
> until init_clock_gating fn.
> 
> v2: Add workarounds to golden render state
> This method has its own issues, first of all this is different for
> each gen and it is generated using a tool so adding new workaround
> and mainitaining them across gens is not a straightforward process.
> 
> v3: Use LRIs to emit these workarounds (Ville)
> Instead of modifying the golden render state the same LRIs are
> emitted from within the driver.
> 
> For: VIZ-4092
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c |  6 +++
>  drivers/gpu/drm/i915/intel_pm.c         | 48 --------------------
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 78 +++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 85 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 9683e62..2debce4 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -631,20 +631,26 @@ static int do_switch(struct intel_engine_cs *ring,
>  	}
>  
>  	uninitialized = !to->legacy_hw_ctx.initialized && from == NULL;
>  	to->legacy_hw_ctx.initialized = true;
>  
>  done:
>  	i915_gem_context_reference(to);
>  	ring->last_context = to;
>  
>  	if (uninitialized) {
> +		if (IS_BROADWELL(ring->dev)) {
> +			ret = bdw_init_workarounds(ring);
> +			if (ret)
> +				DRM_ERROR("init workarounds: %d\n", ret);

A good rule of thumb is that if you are exporting gen specific routines,
the layering and abstraction is fishy.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function
  2014-08-26 10:09   ` Chris Wilson
@ 2014-08-26 10:16     ` Siluvery, Arun
  2014-08-26 10:34       ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Siluvery, Arun @ 2014-08-26 10:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 26/08/2014 11:09, Chris Wilson wrote:
> On Tue, Aug 26, 2014 at 10:33:16AM +0100, Arun Siluvery wrote:
>> For BDW workarounds are currently initialized in init_clock_gating() but
>> they are lost during reset, suspend/resume etc; this patch moves the WAs
>> that are part of register state context to render ring init fn otherwise
>> default context ends up with incorrect values as they don't get initialized
>> until init_clock_gating fn.
>>
>> v2: Add workarounds to golden render state
>> This method has its own issues, first of all this is different for
>> each gen and it is generated using a tool so adding new workaround
>> and mainitaining them across gens is not a straightforward process.
>>
>> v3: Use LRIs to emit these workarounds (Ville)
>> Instead of modifying the golden render state the same LRIs are
>> emitted from within the driver.
>>
>> For: VIZ-4092
>> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_context.c |  6 +++
>>   drivers/gpu/drm/i915/intel_pm.c         | 48 --------------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.c | 78 +++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>>   4 files changed, 85 insertions(+), 48 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index 9683e62..2debce4 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -631,20 +631,26 @@ static int do_switch(struct intel_engine_cs *ring,
>>   	}
>>
>>   	uninitialized = !to->legacy_hw_ctx.initialized && from == NULL;
>>   	to->legacy_hw_ctx.initialized = true;
>>
>>   done:
>>   	i915_gem_context_reference(to);
>>   	ring->last_context = to;
>>
>>   	if (uninitialized) {
>> +		if (IS_BROADWELL(ring->dev)) {
>> +			ret = bdw_init_workarounds(ring);
>> +			if (ret)
>> +				DRM_ERROR("init workarounds: %d\n", ret);
>
> A good rule of thumb is that if you are exporting gen specific routines,
> the layering and abstraction is fishy.
> -Chris
>
ok, so something like i915_init_workarounds() is ok? with a check for 
bdw/gen8 done inside that function.

regards
Arun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function
  2014-08-26 10:16     ` Siluvery, Arun
@ 2014-08-26 10:34       ` Chris Wilson
  2014-08-26 11:42         ` Siluvery, Arun
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2014-08-26 10:34 UTC (permalink / raw)
  To: Siluvery, Arun; +Cc: intel-gfx

On Tue, Aug 26, 2014 at 11:16:29AM +0100, Siluvery, Arun wrote:
> On 26/08/2014 11:09, Chris Wilson wrote:
> >On Tue, Aug 26, 2014 at 10:33:16AM +0100, Arun Siluvery wrote:
> >>For BDW workarounds are currently initialized in init_clock_gating() but
> >>they are lost during reset, suspend/resume etc; this patch moves the WAs
> >>that are part of register state context to render ring init fn otherwise
> >>default context ends up with incorrect values as they don't get initialized
> >>until init_clock_gating fn.
> >>
> >>v2: Add workarounds to golden render state
> >>This method has its own issues, first of all this is different for
> >>each gen and it is generated using a tool so adding new workaround
> >>and mainitaining them across gens is not a straightforward process.
> >>
> >>v3: Use LRIs to emit these workarounds (Ville)
> >>Instead of modifying the golden render state the same LRIs are
> >>emitted from within the driver.
> >>
> >>For: VIZ-4092
> >>Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_gem_context.c |  6 +++
> >>  drivers/gpu/drm/i915/intel_pm.c         | 48 --------------------
> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 78 +++++++++++++++++++++++++++++++++
> >>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
> >>  4 files changed, 85 insertions(+), 48 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >>index 9683e62..2debce4 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_context.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >>@@ -631,20 +631,26 @@ static int do_switch(struct intel_engine_cs *ring,
> >>  	}
> >>
> >>  	uninitialized = !to->legacy_hw_ctx.initialized && from == NULL;
> >>  	to->legacy_hw_ctx.initialized = true;
> >>
> >>  done:
> >>  	i915_gem_context_reference(to);
> >>  	ring->last_context = to;
> >>
> >>  	if (uninitialized) {
> >>+		if (IS_BROADWELL(ring->dev)) {
> >>+			ret = bdw_init_workarounds(ring);
> >>+			if (ret)
> >>+				DRM_ERROR("init workarounds: %d\n", ret);
> >
> >A good rule of thumb is that if you are exporting gen specific routines,
> >the layering and abstraction is fishy.
> >-Chris
> >
> ok, so something like i915_init_workarounds() is ok? with a check
> for bdw/gen8 done inside that function.

Except for init_workarounds is quite useless as a function name and we
already have a structure that is already customised per-engine and
per-gen that you could hook into.
engine->ring_init_context() ?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function
  2014-08-26 10:34       ` Chris Wilson
@ 2014-08-26 11:42         ` Siluvery, Arun
  0 siblings, 0 replies; 9+ messages in thread
From: Siluvery, Arun @ 2014-08-26 11:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 26/08/2014 11:34, Chris Wilson wrote:
> On Tue, Aug 26, 2014 at 11:16:29AM +0100, Siluvery, Arun wrote:
>> On 26/08/2014 11:09, Chris Wilson wrote:
>>> On Tue, Aug 26, 2014 at 10:33:16AM +0100, Arun Siluvery wrote:
>>>> For BDW workarounds are currently initialized in init_clock_gating() but
>>>> they are lost during reset, suspend/resume etc; this patch moves the WAs
>>>> that are part of register state context to render ring init fn otherwise
>>>> default context ends up with incorrect values as they don't get initialized
>>>> until init_clock_gating fn.
>>>>
>>>> v2: Add workarounds to golden render state
>>>> This method has its own issues, first of all this is different for
>>>> each gen and it is generated using a tool so adding new workaround
>>>> and mainitaining them across gens is not a straightforward process.
>>>>
>>>> v3: Use LRIs to emit these workarounds (Ville)
>>>> Instead of modifying the golden render state the same LRIs are
>>>> emitted from within the driver.
>>>>
>>>> For: VIZ-4092
>>>> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_gem_context.c |  6 +++
>>>>   drivers/gpu/drm/i915/intel_pm.c         | 48 --------------------
>>>>   drivers/gpu/drm/i915/intel_ringbuffer.c | 78 +++++++++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>>>>   4 files changed, 85 insertions(+), 48 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>>> index 9683e62..2debce4 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>>> @@ -631,20 +631,26 @@ static int do_switch(struct intel_engine_cs *ring,
>>>>   	}
>>>>
>>>>   	uninitialized = !to->legacy_hw_ctx.initialized && from == NULL;
>>>>   	to->legacy_hw_ctx.initialized = true;
>>>>
>>>>   done:
>>>>   	i915_gem_context_reference(to);
>>>>   	ring->last_context = to;
>>>>
>>>>   	if (uninitialized) {
>>>> +		if (IS_BROADWELL(ring->dev)) {
>>>> +			ret = bdw_init_workarounds(ring);
>>>> +			if (ret)
>>>> +				DRM_ERROR("init workarounds: %d\n", ret);
>>>
>>> A good rule of thumb is that if you are exporting gen specific routines,
>>> the layering and abstraction is fishy.
>>> -Chris
>>>
>> ok, so something like i915_init_workarounds() is ok? with a check
>> for bdw/gen8 done inside that function.
>
> Except for init_workarounds is quite useless as a function name and we
> already have a structure that is already customised per-engine and
> per-gen that you could hook into.
> engine->ring_init_context() ?
> -Chris
>

Ok thanks, I can create a new fn ring_init_context().

regards
Arun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn
@ 2014-08-26 13:44 Arun Siluvery
  0 siblings, 0 replies; 9+ messages in thread
From: Arun Siluvery @ 2014-08-26 13:44 UTC (permalink / raw)
  To: intel-gfx

Workarounds for BDW are applied using LRIs during render ring initialization.
Only those WA registers that are part of register state are initialized
in this fn, remaining are still in its current place init_clock_gating() which
are not affected by a gpu reset. I can send another patch where they can be
moved to render ring init function but during testing I found their state
doesn't change after reset.


Arun Siluvery (2):
  drm/i915/bdw: Apply workarounds in render ring init function
  drm/i915/bdw: Export workaround data to debugfs

 drivers/gpu/drm/i915/i915_debugfs.c     |  40 +++++++++++++
 drivers/gpu/drm/i915/i915_drv.h         |  14 +++++
 drivers/gpu/drm/i915/i915_gem_context.c |   6 ++
 drivers/gpu/drm/i915/intel_pm.c         |  48 ---------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 102 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |   2 +
 6 files changed, 164 insertions(+), 48 deletions(-)

-- 
2.0.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-08-26 13:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-08-26  9:33 [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn Arun Siluvery
2014-08-26  9:33 ` [PATCH 1/2] drm/i915/bdw: Apply workarounds in render ring init function Arun Siluvery
2014-08-26 10:09   ` Chris Wilson
2014-08-26 10:16     ` Siluvery, Arun
2014-08-26 10:34       ` Chris Wilson
2014-08-26 11:42         ` Siluvery, Arun
2014-08-26  9:33 ` [PATCH 2/2] drm/i915/bdw: Export workaround data to debugfs Arun Siluvery
  -- strict thread matches above, loose matches on Subject: below --
2014-08-26 13:44 [PATCH 0/2] Apply BDW workarounds using LRIs in render init fn Arun Siluvery
2014-08-22 19:39 Arun Siluvery

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox