public inbox for intel-gfx@lists.freedesktop.org
* [PATCH] drm/i915: Enable the HiZ RAW Stall Optimization on Gen8.
@ 2015-01-11  2:44 Kenneth Graunke
  2015-01-11 21:49 ` [Mesa-dev] " Ben Widawsky
  2015-01-12 12:32 ` [Intel-gfx] " Ville Syrjälä
  0 siblings, 2 replies; 15+ messages in thread
From: Kenneth Graunke @ 2015-01-11  2:44 UTC (permalink / raw)
  To: intel-gfx; +Cc: mesa-dev

This is an important optimization for avoiding read-after-write (RAW)
stalls in the HiZ buffer.  Certain workloads ran very slowly with HiZ
enabled, but much faster with the "hiz=false" driconf option.  With this
patch, they run at full speed even with HiZ enabled.

Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e
(Iris Pro 6200).

Thanks to Jesse Barnes for finding this missing bit!
Thanks to Chris Wilson for helping me find where to set it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Here's an alternate patch which implements the workaround in the kernel
instead of Mesa.  It's probably better to do it in the kernel, since the
kernel already does this on Haswell.
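
For readers unfamiliar with masked registers: the WA_CLR_BIT_MASKED()
calls in the diff rely on CACHE_MODE_0 being a masked register, where the
upper 16 bits of a write select which of the lower 16 bits get updated.
The following is only a rough user-space sketch of that behavior, not
driver code.  masked_write() is a hypothetical software model of the
hardware, and the bit position used for HIZ_RAW_STALL_OPT_DISABLE is an
assumption made for illustration; check i915_reg.h for the real value.

```c
#include <stdint.h>

/* Same shape as the i915 masked-bit helpers: the upper 16 bits are a
 * write-enable mask for the lower 16 bits. */
#define MASKED_BIT_ENABLE(a)  ((((uint32_t)(a)) << 16) | (uint32_t)(a))
#define MASKED_BIT_DISABLE(a) (((uint32_t)(a)) << 16)

/* ASSUMPTION: bit position chosen for illustration only. */
#define HIZ_RAW_STALL_OPT_DISABLE (1u << 2)

/* Hypothetical model of how a masked register applies a write:
 * only bits whose mask bit is set are changed, the rest are kept. */
static uint32_t masked_write(uint32_t reg, uint32_t val)
{
	uint32_t mask = val >> 16;
	uint32_t bits = val & 0xffffu;

	return (reg & ~mask) | (bits & mask);
}

/* Clearing the "disable" bit turns the optimization on without
 * needing a read-modify-write of the other bits. */
static uint32_t enable_hiz_raw_stall_opt(uint32_t reg)
{
	return masked_write(reg,
			    MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
}
```

In this model, enable_hiz_raw_stall_opt() clears only the one disable bit
and leaves every other bit of the register value untouched, which is why
the patch needs no explicit read of the register first.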

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index dabc1d8..23020d6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -796,6 +796,16 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring)
 			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
 			  (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
 
+	/* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
+	 * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
+	 *  polygons in the same 8x4 pixel/sample area to be processed without
+	 *  stalling waiting for the earlier ones to write to Hierarchical Z
+	 *  buffer."
+	 *
+	 * This optimization is off by default for Broadwell; turn it on.
+	 */
+	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+
 	/* Wa4x4STCOptimizationDisable:bdw */
 	WA_SET_BIT_MASKED(CACHE_MODE_1,
 			  GEN8_4x4_STC_OPTIMIZATION_DISABLE);
@@ -836,6 +846,11 @@ static int chv_init_workarounds(struct intel_engine_cs *ring)
 			  HDC_FORCE_NON_COHERENT |
 			  HDC_DONOT_FETCH_MEM_WHEN_MASKED);
 
+	/* According to the CACHE_MODE_0 default value documentation, some
+	 * CHV platforms disable this optimization by default.  Turn it on.
+	 */
+	WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+
 	/* Improve HiZ throughput on CHV. */
 	WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
 
-- 
2.2.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


end of thread, other threads:[~2015-01-13 20:03 UTC | newest]

Thread overview: 15+ messages
2015-01-11  2:44 [PATCH] drm/i915: Enable the HiZ RAW Stall Optimization on Gen8 Kenneth Graunke
2015-01-11 21:49 ` [Mesa-dev] " Ben Widawsky
2015-01-12  0:05   ` Kenneth Graunke
2015-01-12  1:46     ` Ben Widawsky
2015-01-12  2:53       ` Kenneth Graunke
2015-01-12  3:09         ` Ben Widawsky
2015-01-12  3:05       ` Kenneth Graunke
2015-01-12  3:14         ` [Mesa-dev] " Ben Widawsky
2015-01-12 12:02           ` Ville Syrjälä
2015-01-12 18:02             ` Ben Widawsky
2015-01-12 18:09               ` Dave Gordon
2015-01-13  2:07                 ` Ben Widawsky
2015-01-13 20:03                   ` Ville Syrjälä
2015-01-12 12:32 ` [Intel-gfx] " Ville Syrjälä
2015-01-12 21:41   ` Kenneth Graunke
