public inbox for intel-gfx@lists.freedesktop.org
* [PATCH] drm/i915: enable HiZ Raw Stall Optimization
@ 2014-01-27  8:18 Chia-I Wu
  2014-01-27  8:33 ` Chia-I Wu
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Chia-I Wu @ 2014-01-27  8:18 UTC (permalink / raw)
  To: intel-gfx

From: Chia-I Wu <olv@lunarg.com>

The optimization is available on Ivy Bridge and later, and is disabled by
default.  Enabling it helps certain workloads such as GLBenchmark TRex test.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Chad Versace <chad.versace@linux.intel.com>

---
 drivers/gpu/drm/i915/i915_reg.h     | 2 ++
 drivers/gpu/drm/i915/i915_suspend.c | 9 +++++++--
 drivers/gpu/drm/i915/intel_pm.c     | 8 ++++++++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ee27421..bd90ef3 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -930,6 +930,8 @@
 #define   ECO_GATING_CX_ONLY	(1<<3)
 #define   ECO_FLIP_DONE		(1<<0)
 
+#define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
+#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1		0x7004 /* IVB+ */
 #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
 
diff --git a/drivers/gpu/drm/i915/i915_suspend.c b/drivers/gpu/drm/i915/i915_suspend.c
index 98790c7..13fefbd 100644
--- a/drivers/gpu/drm/i915/i915_suspend.c
+++ b/drivers/gpu/drm/i915/i915_suspend.c
@@ -398,7 +398,9 @@ int i915_save_state(struct drm_device *dev)
 	intel_disable_gt_powersave(dev);
 
 	/* Cache mode state */
-	if (INTEL_INFO(dev)->gen < 7)
+	if (INTEL_INFO(dev)->gen >= 7)
+		dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0_GEN7);
+	else
 		dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0);
 
 	/* Memory Arbitration state */
@@ -448,7 +450,10 @@ int i915_restore_state(struct drm_device *dev)
 	}
 
 	/* Cache mode state */
-	if (INTEL_INFO(dev)->gen < 7)
+	if (INTEL_INFO(dev)->gen >= 7)
+		I915_WRITE(CACHE_MODE_0_GEN7, dev_priv->regfile.saveCACHE_MODE_0 |
+			   0xffff0000);
+	else
 		I915_WRITE(CACHE_MODE_0, dev_priv->regfile.saveCACHE_MODE_0 |
 			   0xffff0000);
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 26c29c1..d6ddc39 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5355,6 +5355,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
 	/* WaVSRefCountFullforceMissDisable:hsw */
 	gen7_setup_fixed_func_scheduler(dev_priv);
 
+	/* enable HiZ Raw Stall Optimization */
+	I915_WRITE(CACHE_MODE_0_GEN7,
+		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+
 	/* WaDisable4x2SubspanOptimization:hsw */
 	I915_WRITE(CACHE_MODE_1,
 		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
@@ -5445,6 +5449,10 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
 	/* WaVSRefCountFullforceMissDisable:ivb */
 	gen7_setup_fixed_func_scheduler(dev_priv);
 
+	/* enable HiZ Raw Stall Optimization */
+	I915_WRITE(CACHE_MODE_0_GEN7,
+		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+
 	/* WaDisable4x2SubspanOptimization:ivb */
 	I915_WRITE(CACHE_MODE_1,
 		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
-- 
1.8.5.3


* Re: [PATCH] drm/i915: enable HiZ Raw Stall Optimization
  2014-01-27  8:18 [PATCH] drm/i915: enable HiZ Raw Stall Optimization Chia-I Wu
@ 2014-01-27  8:33 ` Chia-I Wu
  2014-01-27  9:04   ` Daniel Vetter
  2014-01-27 13:07 ` Ville Syrjälä
  2014-01-28  5:29 ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Chia-I Wu
  2 siblings, 1 reply; 12+ messages in thread
From: Chia-I Wu @ 2014-01-27  8:33 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ian Romanick

[Additional comments, and copy Ian and Chad for real]

On Mon, Jan 27, 2014 at 4:18 PM, Chia-I Wu <olvaffe@gmail.com> wrote:
> From: Chia-I Wu <olv@lunarg.com>
>
> The optimization is available on Ivy Bridge and later, and is disabled by
> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
With the patch applied, GLB27_TRex_C24Z16_FixedTimeStep goes from
99fps to 109fps on my Haswell, and from 60fps to 65fps on my Ivy
Bridge.  No piglit regressions on either GEN.
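
As a rough check on the numbers above, the relative gains work out to
about 10% on Haswell and about 8% on Ivy Bridge.  A minimal sketch of
the arithmetic follows; percent_gain is a hypothetical helper used only
for illustration and is not part of the patch:

```c
#include <assert.h>

/*
 * Hypothetical helper: relative frame-rate improvement in percent.
 * `before` and `after` are frames per second; `before` must be > 0.
 */
static double percent_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

Calling percent_gain(99.0, 109.0) gives roughly 10.1, and
percent_gain(60.0, 65.0) gives roughly 8.3.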

I had a non-recoverable system hang once with the patch applied.  I
am not sure whether it was caused by the patch or by drm-intel-nightly
(which I checked out some weeks ago).  I ran my tests today against the
latest drm-intel-next and did not see any hang.  Since the optimization
is disabled by default, I am curious whether there are any caveats to
enabling it.

>
> Signed-off-by: Chia-I Wu <olv@lunarg.com>
> Cc: Ian Romanick <ian.d.romanick@intel.com>
> Cc: Chad Versace <chad.versace@linux.intel.com>
>
> ---
>  drivers/gpu/drm/i915/i915_reg.h     | 2 ++
>  drivers/gpu/drm/i915/i915_suspend.c | 9 +++++++--
>  drivers/gpu/drm/i915/intel_pm.c     | 8 ++++++++
>  3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index ee27421..bd90ef3 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -930,6 +930,8 @@
>  #define   ECO_GATING_CX_ONLY   (1<<3)
>  #define   ECO_FLIP_DONE                (1<<0)
>
> +#define CACHE_MODE_0_GEN7      0x7000 /* IVB+ */
> +#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
>  #define CACHE_MODE_1           0x7004 /* IVB+ */
>  #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
>
> diff --git a/drivers/gpu/drm/i915/i915_suspend.c b/drivers/gpu/drm/i915/i915_suspend.c
> index 98790c7..13fefbd 100644
> --- a/drivers/gpu/drm/i915/i915_suspend.c
> +++ b/drivers/gpu/drm/i915/i915_suspend.c
> @@ -398,7 +398,9 @@ int i915_save_state(struct drm_device *dev)
>         intel_disable_gt_powersave(dev);
>
>         /* Cache mode state */
> -       if (INTEL_INFO(dev)->gen < 7)
> +       if (INTEL_INFO(dev)->gen >= 7)
> +               dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0_GEN7);
> +       else
>                 dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0);
>
>         /* Memory Arbitration state */
> @@ -448,7 +450,10 @@ int i915_restore_state(struct drm_device *dev)
>         }
>
>         /* Cache mode state */
> -       if (INTEL_INFO(dev)->gen < 7)
> +       if (INTEL_INFO(dev)->gen >= 7)
> +               I915_WRITE(CACHE_MODE_0_GEN7, dev_priv->regfile.saveCACHE_MODE_0 |
> +                          0xffff0000);
> +       else
>                 I915_WRITE(CACHE_MODE_0, dev_priv->regfile.saveCACHE_MODE_0 |
>                            0xffff0000);
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 26c29c1..d6ddc39 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5355,6 +5355,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>         /* WaVSRefCountFullforceMissDisable:hsw */
>         gen7_setup_fixed_func_scheduler(dev_priv);
>
> +       /* enable HiZ Raw Stall Optimization */
> +       I915_WRITE(CACHE_MODE_0_GEN7,
> +                  _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> +
>         /* WaDisable4x2SubspanOptimization:hsw */
>         I915_WRITE(CACHE_MODE_1,
>                    _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> @@ -5445,6 +5449,10 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
>         /* WaVSRefCountFullforceMissDisable:ivb */
>         gen7_setup_fixed_func_scheduler(dev_priv);
>
> +       /* enable HiZ Raw Stall Optimization */
> +       I915_WRITE(CACHE_MODE_0_GEN7,
> +                  _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> +
>         /* WaDisable4x2SubspanOptimization:ivb */
>         I915_WRITE(CACHE_MODE_1,
>                    _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> --
> 1.8.5.3
>



-- 
olv@LunarG.com


* Re: [PATCH] drm/i915: enable HiZ Raw Stall Optimization
  2014-01-27  8:33 ` Chia-I Wu
@ 2014-01-27  9:04   ` Daniel Vetter
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2014-01-27  9:04 UTC (permalink / raw)
  To: Chia-I Wu; +Cc: intel-gfx, Ian Romanick

On Mon, Jan 27, 2014 at 9:33 AM, Chia-I Wu <olvaffe@gmail.com> wrote:
> [Additional comments, and copy Ian and Chad for real]
>
> On Mon, Jan 27, 2014 at 4:18 PM, Chia-I Wu <olvaffe@gmail.com> wrote:
>> From: Chia-I Wu <olv@lunarg.com>
>>
>> The optimization is available on Ivy Bridge and later, and is disabled by
>> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
> With the patch applied, GLB27_TRex_C24Z16_FixedTimeStep goes from
> 99fps to 109fps on my Haswell, and from 60fps to 65fps on my Ivy
> Bridge.  No piglit regressions on either GEN.
>
> I had a non-recoverable system hang once with the patch applied.  I
> am not sure whether it was caused by the patch or by drm-intel-nightly
> (which I checked out some weeks ago).  I ran my tests today against the
> latest drm-intel-next and did not see any hang.  Since the optimization
> is disabled by default, I am curious whether there are any caveats to
> enabling it.

Chad's still missing ;-) Also, the hang might just be ppgtt fallout;
there are still a few regressions with that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH] drm/i915: enable HiZ Raw Stall Optimization
  2014-01-27  8:18 [PATCH] drm/i915: enable HiZ Raw Stall Optimization Chia-I Wu
  2014-01-27  8:33 ` Chia-I Wu
@ 2014-01-27 13:07 ` Ville Syrjälä
  2014-01-27 16:22   ` Daniel Vetter
  2014-01-28  4:40   ` Chia-I Wu
  2014-01-28  5:29 ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Chia-I Wu
  2 siblings, 2 replies; 12+ messages in thread
From: Ville Syrjälä @ 2014-01-27 13:07 UTC (permalink / raw)
  To: Chia-I Wu; +Cc: intel-gfx

On Mon, Jan 27, 2014 at 04:18:36PM +0800, Chia-I Wu wrote:
> From: Chia-I Wu <olv@lunarg.com>
> 
> The optimization is available on Ivy Bridge and later, and is disabled by
> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.

Actually BSpec even goes as far as saying that this optimization must
be enabled on HSW+.

So it seems you should enable it for BDW as well. I'm not sure about VLV.
The description of the bit says nothing about VLV, even though the
documented default value is specified to have it set for VLV as well. I
guess someone should just try it and see what happens.

It might make sense to split the patch into per-platform patches. That
way we could more easily revert, e.g., just the IVB part if it causes
problems.

> 
> Signed-off-by: Chia-I Wu <olv@lunarg.com>
> Cc: Ian Romanick <ian.d.romanick@intel.com>
> Cc: Chad Versace <chad.versace@linux.intel.com>
> 
> ---
>  drivers/gpu/drm/i915/i915_reg.h     | 2 ++
>  drivers/gpu/drm/i915/i915_suspend.c | 9 +++++++--
>  drivers/gpu/drm/i915/intel_pm.c     | 8 ++++++++
>  3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index ee27421..bd90ef3 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -930,6 +930,8 @@
>  #define   ECO_GATING_CX_ONLY	(1<<3)
>  #define   ECO_FLIP_DONE		(1<<0)
>  
> +#define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
> +#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
>  #define CACHE_MODE_1		0x7004 /* IVB+ */
>  #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
>  
> diff --git a/drivers/gpu/drm/i915/i915_suspend.c b/drivers/gpu/drm/i915/i915_suspend.c
> index 98790c7..13fefbd 100644
> --- a/drivers/gpu/drm/i915/i915_suspend.c
> +++ b/drivers/gpu/drm/i915/i915_suspend.c
> @@ -398,7 +398,9 @@ int i915_save_state(struct drm_device *dev)
>  	intel_disable_gt_powersave(dev);
>  
>  	/* Cache mode state */
> -	if (INTEL_INFO(dev)->gen < 7)
> +	if (INTEL_INFO(dev)->gen >= 7)
> +		dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0_GEN7);
> +	else
>  		dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0);
>  
>  	/* Memory Arbitration state */
> @@ -448,7 +450,10 @@ int i915_restore_state(struct drm_device *dev)
>  	}
>  
>  	/* Cache mode state */
> -	if (INTEL_INFO(dev)->gen < 7)
> +	if (INTEL_INFO(dev)->gen >= 7)
> +		I915_WRITE(CACHE_MODE_0_GEN7, dev_priv->regfile.saveCACHE_MODE_0 |
> +			   0xffff0000);
> +	else
>  		I915_WRITE(CACHE_MODE_0, dev_priv->regfile.saveCACHE_MODE_0 |
>  			   0xffff0000);

These hunks are material for a separate patch.

>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 26c29c1..d6ddc39 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5355,6 +5355,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>  	/* WaVSRefCountFullforceMissDisable:hsw */
>  	gen7_setup_fixed_func_scheduler(dev_priv);
>  
> +	/* enable HiZ Raw Stall Optimization */
> +	I915_WRITE(CACHE_MODE_0_GEN7,
> +		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> +
>  	/* WaDisable4x2SubspanOptimization:hsw */
>  	I915_WRITE(CACHE_MODE_1,
>  		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> @@ -5445,6 +5449,10 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
>  	/* WaVSRefCountFullforceMissDisable:ivb */
>  	gen7_setup_fixed_func_scheduler(dev_priv);
>  
> +	/* enable HiZ Raw Stall Optimization */
> +	I915_WRITE(CACHE_MODE_0_GEN7,
> +		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> +
>  	/* WaDisable4x2SubspanOptimization:ivb */
>  	I915_WRITE(CACHE_MODE_1,
>  		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> -- 
> 1.8.5.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC


* Re: [PATCH] drm/i915: enable HiZ Raw Stall Optimization
  2014-01-27 13:07 ` Ville Syrjälä
@ 2014-01-27 16:22   ` Daniel Vetter
  2014-01-28  4:40   ` Chia-I Wu
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2014-01-27 16:22 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Jan 27, 2014 at 03:07:45PM +0200, Ville Syrjälä wrote:
> On Mon, Jan 27, 2014 at 04:18:36PM +0800, Chia-I Wu wrote:
> > From: Chia-I Wu <olv@lunarg.com>
> > 
> > The optimization is available on Ivy Bridge and later, and is disabled by
> > default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
> 
> Actually BSpec even goes as far as saying that this optimization must
> be enabled on HSW+.
> 
> So it seems you should enable it for BDW as well. I'm not sure about VLV.
> The description of the bit says nothing about VLV, even though the
> documented default value is specified to have it set for VLV as well. I
> guess someone should just try it and see what happens.
> 
> It might make sense to split the patch into per-platform patches. That
> way we could more easily revert, e.g., just the IVB part if it causes
> problems.
> 
> > 
> > Signed-off-by: Chia-I Wu <olv@lunarg.com>
> > Cc: Ian Romanick <ian.d.romanick@intel.com>
> > Cc: Chad Versace <chad.versace@linux.intel.com>
> > 
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h     | 2 ++
> >  drivers/gpu/drm/i915/i915_suspend.c | 9 +++++++--
> >  drivers/gpu/drm/i915/intel_pm.c     | 8 ++++++++
> >  3 files changed, 17 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index ee27421..bd90ef3 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -930,6 +930,8 @@
> >  #define   ECO_GATING_CX_ONLY	(1<<3)
> >  #define   ECO_FLIP_DONE		(1<<0)
> >  
> > +#define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
> > +#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
> >  #define CACHE_MODE_1		0x7004 /* IVB+ */
> >  #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_suspend.c b/drivers/gpu/drm/i915/i915_suspend.c
> > index 98790c7..13fefbd 100644
> > --- a/drivers/gpu/drm/i915/i915_suspend.c
> > +++ b/drivers/gpu/drm/i915/i915_suspend.c
> > @@ -398,7 +398,9 @@ int i915_save_state(struct drm_device *dev)
> >  	intel_disable_gt_powersave(dev);
> >  
> >  	/* Cache mode state */
> > -	if (INTEL_INFO(dev)->gen < 7)
> > +	if (INTEL_INFO(dev)->gen >= 7)
> > +		dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0_GEN7);
> > +	else
> >  		dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0);
> >  
> >  	/* Memory Arbitration state */
> > @@ -448,7 +450,10 @@ int i915_restore_state(struct drm_device *dev)
> >  	}
> >  
> >  	/* Cache mode state */
> > -	if (INTEL_INFO(dev)->gen < 7)
> > +	if (INTEL_INFO(dev)->gen >= 7)
> > +		I915_WRITE(CACHE_MODE_0_GEN7, dev_priv->regfile.saveCACHE_MODE_0 |
> > +			   0xffff0000);
> > +	else
> >  		I915_WRITE(CACHE_MODE_0, dev_priv->regfile.saveCACHE_MODE_0 |
> >  			   0xffff0000);
> 
> These hunks are material for a separate patch.

Also, they shouldn't be required. On all modern platforms, our setup code
(the init_clock_gating callback), which runs at both driver load time and
resume time, should take care of all these bits and registers. If we miss
some of them, we need to add them. So please drop these hunks for v2.
-Daniel
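
For reference, a minimal sketch of how a masked register write of the
kind used in this patch behaves.  It assumes the usual convention that
the upper 16 bits of the written value select which of the lower 16
bits are updated, which is also what the "| 0xffff0000" in the restore
hunk relies on.  The helper names below are hypothetical and only model
the behaviour; they are not the driver's own macros or the hardware
itself:

```c
#include <assert.h>
#include <stdint.h>

/* Bit position taken from the patch: HIZ_RAW_STALL_OPT_DISABLE is (1<<2). */
#define HIZ_RAW_STALL_OPT_DISABLE (1u << 2)

/*
 * Assumed to mirror the driver's masked-bit helpers: set the write-enable
 * mask in the upper half and, for "enable", the bit itself in the lower half.
 */
static uint32_t masked_bit_enable(uint32_t bits)
{
	assert((bits & 0xffff0000u) == 0);
	return (bits << 16) | bits;
}

static uint32_t masked_bit_disable(uint32_t bits)
{
	assert((bits & 0xffff0000u) == 0);
	return bits << 16;
}

/*
 * Hypothetical model of a masked register: only the lower 16 bits hold
 * state, and a bit changes only if its mask bit is set in `value`.
 */
static uint32_t masked_reg_write(uint32_t current, uint32_t value)
{
	uint32_t mask = value >> 16;
	uint32_t data = value & 0xffffu;

	return ((current & ~mask) | (data & mask)) & 0xffffu;
}
```

Under this model, writing masked_bit_disable(HIZ_RAW_STALL_OPT_DISABLE)
clears only bit 2 and leaves every other bit as it was, so the write in
init_clock_gating does not need a saved copy of the register to produce
the intended state after resume.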

> 
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 26c29c1..d6ddc39 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -5355,6 +5355,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
> >  	/* WaVSRefCountFullforceMissDisable:hsw */
> >  	gen7_setup_fixed_func_scheduler(dev_priv);
> >  
> > +	/* enable HiZ Raw Stall Optimization */
> > +	I915_WRITE(CACHE_MODE_0_GEN7,
> > +		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> > +
> >  	/* WaDisable4x2SubspanOptimization:hsw */
> >  	I915_WRITE(CACHE_MODE_1,
> >  		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> > @@ -5445,6 +5449,10 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
> >  	/* WaVSRefCountFullforceMissDisable:ivb */
> >  	gen7_setup_fixed_func_scheduler(dev_priv);
> >  
> > +	/* enable HiZ Raw Stall Optimization */
> > +	I915_WRITE(CACHE_MODE_0_GEN7,
> > +		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
> > +
> >  	/* WaDisable4x2SubspanOptimization:ivb */
> >  	I915_WRITE(CACHE_MODE_1,
> >  		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
> > -- 
> > 1.8.5.3
> > 
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH] drm/i915: enable HiZ Raw Stall Optimization
  2014-01-27 13:07 ` Ville Syrjälä
  2014-01-27 16:22   ` Daniel Vetter
@ 2014-01-28  4:40   ` Chia-I Wu
  1 sibling, 0 replies; 12+ messages in thread
From: Chia-I Wu @ 2014-01-28  4:40 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Jan 27, 2014 at 9:07 PM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Mon, Jan 27, 2014 at 04:18:36PM +0800, Chia-I Wu wrote:
>> From: Chia-I Wu <olv@lunarg.com>
>>
>> The optimization is available on Ivy Bridge and later, and is disabled by
>> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
>
> Actually BSpec even goes as far as saying that this optimization must
> be enabled on HSW+.
The public documentation actually says that if you want the
optimization, you must enable it.  Kind of stating the obvious. :)

> So it seems you should enable it for BDW as well. I'm not sure about VLV.
> The description of the bit says nothing about VLV, even though the
> documented default value is specified to have it set for VLV as well. I
> guess someone should just try it and see what happens.
>
> It might make sense to split the patch into per-platform patches. That
> way we could more easily revert, e.g., just the IVB part if it causes
> problems.
Will do.  Though I will leave BDW/VLV out as I do not have the hardware.

>>
>> Signed-off-by: Chia-I Wu <olv@lunarg.com>
>> Cc: Ian Romanick <ian.d.romanick@intel.com>
>> Cc: Chad Versace <chad.versace@linux.intel.com>
>>
>> ---
>>  drivers/gpu/drm/i915/i915_reg.h     | 2 ++
>>  drivers/gpu/drm/i915/i915_suspend.c | 9 +++++++--
>>  drivers/gpu/drm/i915/intel_pm.c     | 8 ++++++++
>>  3 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index ee27421..bd90ef3 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -930,6 +930,8 @@
>>  #define   ECO_GATING_CX_ONLY (1<<3)
>>  #define   ECO_FLIP_DONE              (1<<0)
>>
>> +#define CACHE_MODE_0_GEN7    0x7000 /* IVB+ */
>> +#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
>>  #define CACHE_MODE_1         0x7004 /* IVB+ */
>>  #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_suspend.c b/drivers/gpu/drm/i915/i915_suspend.c
>> index 98790c7..13fefbd 100644
>> --- a/drivers/gpu/drm/i915/i915_suspend.c
>> +++ b/drivers/gpu/drm/i915/i915_suspend.c
>> @@ -398,7 +398,9 @@ int i915_save_state(struct drm_device *dev)
>>       intel_disable_gt_powersave(dev);
>>
>>       /* Cache mode state */
>> -     if (INTEL_INFO(dev)->gen < 7)
>> +     if (INTEL_INFO(dev)->gen >= 7)
>> +             dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0_GEN7);
>> +     else
>>               dev_priv->regfile.saveCACHE_MODE_0 = I915_READ(CACHE_MODE_0);
>>
>>       /* Memory Arbitration state */
>> @@ -448,7 +450,10 @@ int i915_restore_state(struct drm_device *dev)
>>       }
>>
>>       /* Cache mode state */
>> -     if (INTEL_INFO(dev)->gen < 7)
>> +     if (INTEL_INFO(dev)->gen >= 7)
>> +             I915_WRITE(CACHE_MODE_0_GEN7, dev_priv->regfile.saveCACHE_MODE_0 |
>> +                        0xffff0000);
>> +     else
>>               I915_WRITE(CACHE_MODE_0, dev_priv->regfile.saveCACHE_MODE_0 |
>>                          0xffff0000);
>
> These hunks are material for a separate patch.
>
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 26c29c1..d6ddc39 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -5355,6 +5355,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>>       /* WaVSRefCountFullforceMissDisable:hsw */
>>       gen7_setup_fixed_func_scheduler(dev_priv);
>>
>> +     /* enable HiZ Raw Stall Optimization */
>> +     I915_WRITE(CACHE_MODE_0_GEN7,
>> +                _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
>> +
>>       /* WaDisable4x2SubspanOptimization:hsw */
>>       I915_WRITE(CACHE_MODE_1,
>>                  _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
>> @@ -5445,6 +5449,10 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
>>       /* WaVSRefCountFullforceMissDisable:ivb */
>>       gen7_setup_fixed_func_scheduler(dev_priv);
>>
>> +     /* enable HiZ Raw Stall Optimization */
>> +     I915_WRITE(CACHE_MODE_0_GEN7,
>> +                _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
>> +
>>       /* WaDisable4x2SubspanOptimization:ivb */
>>       I915_WRITE(CACHE_MODE_1,
>>                  _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
>> --
>> 1.8.5.3
>>
>
> --
> Ville Syrjälä
> Intel OTC



-- 
olv@LunarG.com

* [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW
  2014-01-27  8:18 [PATCH] drm/i915: enable HiZ Raw Stall Optimization Chia-I Wu
  2014-01-27  8:33 ` Chia-I Wu
  2014-01-27 13:07 ` Ville Syrjälä
@ 2014-01-28  5:29 ` Chia-I Wu
  2014-01-28  5:29   ` [PATCH 2/2] drm/i915: enable HiZ Raw Stall Optimization on IVB Chia-I Wu
  2014-01-29 17:56   ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Daniel Vetter
  2 siblings, 2 replies; 12+ messages in thread
From: Chia-I Wu @ 2014-01-28  5:29 UTC (permalink / raw)
  To: intel-gfx

From: Chia-I Wu <olv@lunarg.com>

The optimization is available on Ivy Bridge and later, and is disabled by
default.  Enabling it helps certain workloads such as GLBenchmark TRex test.

No piglit regression.

v2
 - no need to save the register before suspend as init_clock_gating can
   correctly program it after resume
 - split IVB change to another commit

Signed-off-by: Chia-I Wu <olv@lunarg.com>
---
 drivers/gpu/drm/i915/i915_reg.h | 2 ++
 drivers/gpu/drm/i915/intel_pm.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 76126e0..c74bc28 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -934,6 +934,8 @@
 #define   ECO_GATING_CX_ONLY	(1<<3)
 #define   ECO_FLIP_DONE		(1<<0)
 
+#define CACHE_MODE_0_GEN7	0x7000 /* IVB+ */
+#define   HIZ_RAW_STALL_OPT_DISABLE (1<<2)
 #define CACHE_MODE_1		0x7004 /* IVB+ */
 #define   PIXEL_SUBSPAN_COLLECT_OPT_DISABLE (1<<6)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d77cc81..c535e5c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4789,6 +4789,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
 	/* WaVSRefCountFullforceMissDisable:hsw */
 	gen7_setup_fixed_func_scheduler(dev_priv);
 
+	/* enable HiZ Raw Stall Optimization */
+	I915_WRITE(CACHE_MODE_0_GEN7,
+		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+
 	/* WaDisable4x2SubspanOptimization:hsw */
 	I915_WRITE(CACHE_MODE_1,
 		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
-- 
1.8.3.1


* [PATCH 2/2] drm/i915: enable HiZ Raw Stall Optimization on IVB
  2014-01-28  5:29 ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Chia-I Wu
@ 2014-01-28  5:29   ` Chia-I Wu
  2014-01-29 17:56   ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Daniel Vetter
  1 sibling, 0 replies; 12+ messages in thread
From: Chia-I Wu @ 2014-01-28  5:29 UTC (permalink / raw)
  To: intel-gfx

From: Chia-I Wu <olv@lunarg.com>

The optimization helps IVB too.  No piglit regression.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c535e5c..58aba3e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4881,6 +4881,10 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
 	/* WaVSRefCountFullforceMissDisable:ivb */
 	gen7_setup_fixed_func_scheduler(dev_priv);
 
+	/* enable HiZ Raw Stall Optimization */
+	I915_WRITE(CACHE_MODE_0_GEN7,
+		   _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+
 	/* WaDisable4x2SubspanOptimization:ivb */
 	I915_WRITE(CACHE_MODE_1,
 		   _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
-- 
1.8.3.1


* Re: [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW
  2014-01-28  5:29 ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Chia-I Wu
  2014-01-28  5:29   ` [PATCH 2/2] drm/i915: enable HiZ Raw Stall Optimization on IVB Chia-I Wu
@ 2014-01-29 17:56   ` Daniel Vetter
  2014-01-30  2:23     ` Matt Turner
  1 sibling, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2014-01-29 17:56 UTC (permalink / raw)
  To: Chia-I Wu; +Cc: intel-gfx

On Tue, Jan 28, 2014 at 6:29 AM, Chia-I Wu <olvaffe@gmail.com> wrote:
> From: Chia-I Wu <olv@lunarg.com>
>
> The optimization is available on Ivy Bridge and later, and is disabled by
> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
>
> No piglit regression.
>
> v2
>  - no need to save the register before suspend as init_clock_gating can
>    correctly program it after resume
>  - split IVB change to another commit
>
> Signed-off-by: Chia-I Wu <olv@lunarg.com>

What about byt?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW
  2014-01-29 17:56   ` [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW Daniel Vetter
@ 2014-01-30  2:23     ` Matt Turner
  2014-01-30 12:10       ` Daniel Vetter
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Turner @ 2014-01-30  2:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Jan 29, 2014 at 9:56 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Jan 28, 2014 at 6:29 AM, Chia-I Wu <olvaffe@gmail.com> wrote:
>> From: Chia-I Wu <olv@lunarg.com>
>>
>> The optimization is available on Ivy Bridge and later, and is disabled by
>> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
>>
>> No piglit regression.
>>
>> v2
>>  - no need to save the register before suspend as init_clock_gating can
>>    correctly program it after resume
>>  - split IVB change to another commit
>>
>> Signed-off-by: Chia-I Wu <olv@lunarg.com>
>
> What about byt?
> -Daniel

In the previous thread, Ville pointed out that the documentation
doesn't say anything about BYT for this bit, so it's unknown whether
it's supported, and Chia-I said

> Though I will leave BDW/VLV out as I do not have the hardware.

I do have BYT, so I'll check it out, but this patch can go in without
that information, since Chia-I took Ville's advice and split the
patches into per-generation hunks.


* Re: [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW
  2014-01-30  2:23     ` Matt Turner
@ 2014-01-30 12:10       ` Daniel Vetter
  2014-01-30 12:40         ` Ville Syrjälä
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2014-01-30 12:10 UTC (permalink / raw)
  To: Matt Turner; +Cc: intel-gfx

On Wed, Jan 29, 2014 at 06:23:40PM -0800, Matt Turner wrote:
> On Wed, Jan 29, 2014 at 9:56 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Tue, Jan 28, 2014 at 6:29 AM, Chia-I Wu <olvaffe@gmail.com> wrote:
> >> From: Chia-I Wu <olv@lunarg.com>
> >>
> >> The optimization is available on Ivy Bridge and later, and is disabled by
> >> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
> >>
> >> No piglit regression.
> >>
> >> v2
> >>  - no need to save the register before suspend as init_clock_gating can
> >>    correctly program it after resume
> >>  - split IVB change to another commit
> >>
> >> Signed-off-by: Chia-I Wu <olv@lunarg.com>
> >
> > What about byt?
> > -Daniel
> 
> In the previous thread, Ville pointed out that the documentation
> doesn't say anything about BYT for this bit, so it's unknown whether
> it's supported, and Chia-I said
> 
> > Though I will leave BDW/VLV out as I do not have the hardware.
> 
> I do have BYT, so I'll check it out, but this patch can go in without
> that information, since Chia-I took Ville's advice and split the
> patches into per-generation hunks.

Makes sense. Patches merged, and I'll happily pull in the byt update if
that one checks out ok.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: [PATCH 1/2] drm/i915: enable HiZ Raw Stall Optimization on HSW
  2014-01-30 12:10       ` Daniel Vetter
@ 2014-01-30 12:40         ` Ville Syrjälä
  0 siblings, 0 replies; 12+ messages in thread
From: Ville Syrjälä @ 2014-01-30 12:40 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Thu, Jan 30, 2014 at 01:10:07PM +0100, Daniel Vetter wrote:
> On Wed, Jan 29, 2014 at 06:23:40PM -0800, Matt Turner wrote:
> > On Wed, Jan 29, 2014 at 9:56 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > On Tue, Jan 28, 2014 at 6:29 AM, Chia-I Wu <olvaffe@gmail.com> wrote:
> > >> From: Chia-I Wu <olv@lunarg.com>
> > >>
> > >> The optimization is available on Ivy Bridge and later, and is disabled by
> > >> default.  Enabling it helps certain workloads such as GLBenchmark TRex test.
> > >>
> > >> No piglit regression.
> > >>
> > >> v2
> > >>  - no need to save the register before suspend as init_clock_gating can
> > >>    correctly program it after resume
> > >>  - split IVB change to another commit
> > >>
> > >> Signed-off-by: Chia-I Wu <olv@lunarg.com>
> > >
> > > What about byt?
> > > -Daniel
> > 
> > In the previous thread, Ville pointed out that the documentation
> > doesn't say anything about BYT for this bit, so it's unknown whether
> > it's supported, and Chia-I said
> > 
> > > Though I will leave BDW/VLV out as I do not have the hardware.
> > 
> > I do have BYT, so I'll check it out, but this patch can go in without
> > that information, since Chia-I took Ville's advice and split the
> > patches into per-generation hunks.
> 
> Makes sense. Patches merged, and I'll happily pull in the byt update if
> that one checks out ok.

and bdw also needs this (bspec says hsw+)

-- 
Ville Syrjälä
Intel OTC
