* [PATCH 2/8] drm/i915: implement WaDisableDopClockGatingisable on VLV and IVB
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-10-25 19:15 ` [PATCH 3/8] drm/i915: implement WaForceL3Serialization " Jesse Barnes
` (5 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
v2: use correct register
v3: remove extra hunks, pull in register definitions & offset check directly
v4: add GT1 vs GT2 distinction for IVB portion (Ben)
References: https://bugs.freedesktop.org/show_bug.cgi?id=50233
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/i915_drv.c | 7 +++++++
drivers/gpu/drm/i915/i915_drv.h | 3 +++
drivers/gpu/drm/i915/i915_reg.h | 4 ++++
drivers/gpu/drm/i915/intel_pm.c | 13 ++++++++++++-
4 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 59dc481..8101cf6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1128,6 +1128,13 @@ static bool IS_DISPLAYREG(u32 reg)
if (reg == GEN6_GDRST)
return false;
+ switch (reg) {
+ case GEN7_ROW_CHICKEN2:
+ return false;
+ default:
+ break;
+ }
+
return true;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2fcf284..1f55d9c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1124,6 +1124,9 @@ struct drm_i915_file_private {
#define IS_IRONLAKE_D(dev) ((dev)->pci_device == 0x0042)
#define IS_IRONLAKE_M(dev) ((dev)->pci_device == 0x0046)
#define IS_IVYBRIDGE(dev) (INTEL_INFO(dev)->is_ivybridge)
+#define IS_IVB_GT1(dev) ((dev)->pci_device == 0x0156 || \
+ (dev)->pci_device == 0x0152 || \
+ (dev)->pci_device == 0x015a)
#define IS_VALLEYVIEW(dev) (INTEL_INFO(dev)->is_valleyview)
#define IS_HASWELL(dev) (INTEL_INFO(dev)->is_haswell)
#define IS_MOBILE(dev) (INTEL_INFO(dev)->is_mobile)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5da227b..6464eaa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4271,6 +4271,10 @@
#define GEN7_L3LOG_BASE 0xB070
#define GEN7_L3LOG_SIZE 0x80
+#define GEN7_ROW_CHICKEN2 0xe4f4
+#define GEN7_ROW_CHICKEN2_GT2 0xf4f4
+#define DOP_CLOCK_GATING_DISABLE (1<<0)
+
#define G4X_AUD_VID_DID 0x62020
#define INTEL_AUDIO_DEVCL 0x808629FB
#define INTEL_AUDIO_DEVBLC 0x80862801
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d4ddcf2..a5ae0e7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3560,7 +3560,14 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
I915_WRITE(GEN7_L3CNTLREG1,
GEN7_WA_FOR_GEN7_L3_CONTROL);
I915_WRITE(GEN7_L3_CHICKEN_MODE_REGISTER,
- GEN7_WA_L3_CHICKEN_MODE);
+ GEN7_WA_L3_CHICKEN_MODE);
+ if (IS_IVB_GT1(dev))
+ I915_WRITE(GEN7_ROW_CHICKEN2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+ else
+ I915_WRITE(GEN7_ROW_CHICKEN2_GT2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+
/* WaForceL3Serialization */
I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
@@ -3642,6 +3649,10 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
~L3SQ_URB_READ_CAM_MATCH_DISABLE);
+ /* WaDisableDopClockGating */
+ I915_WRITE(GEN7_ROW_CHICKEN2,
+ _MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+
/* This is required by WaCatErrorRejectionIssue */
I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 3/8] drm/i915: implement WaForceL3Serialization on VLV and IVB
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
2012-10-25 19:15 ` [PATCH 2/8] drm/i915: implement WaDisableDopClockGatingisable on VLV and IVB Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-10-25 19:15 ` [PATCH 4/8] drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV Jesse Barnes
` (4 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
References: https://bugs.freedesktop.org/show_bug.cgi?id=50250
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/intel_pm.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index a5ae0e7..d04e87f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3653,6 +3653,10 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
I915_WRITE(GEN7_ROW_CHICKEN2,
_MASKED_BIT_ENABLE(DOP_CLOCK_GATING_DISABLE));
+ /* WaForceL3Serialization */
+ I915_WRITE(GEN7_L3SQCREG4, I915_READ(GEN7_L3SQCREG4) &
+ ~L3SQ_URB_READ_CAM_MATCH_DISABLE);
+
/* This is required by WaCatErrorRejectionIssue */
I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 4/8] drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
2012-10-25 19:15 ` [PATCH 2/8] drm/i915: implement WaDisableDopClockGatingisable on VLV and IVB Jesse Barnes
2012-10-25 19:15 ` [PATCH 3/8] drm/i915: implement WaForceL3Serialization " Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-11-01 14:48 ` Antti Koskipää
2012-10-25 19:15 ` [PATCH 5/8] drm/i915: implement WaDisablePSDDualDispatchEnable on IVB & VLV Jesse Barnes
` (3 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
This allows us to get the right vblank interrupt frequency.
v2: pull in register definition
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/i915_reg.h | 2 ++
drivers/gpu/drm/i915/intel_pm.c | 7 +++++++
2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6464eaa..4aec0a3 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -556,6 +556,8 @@
#define IIR 0x020a4
#define IMR 0x020a8
#define ISR 0x020ac
+#define VLV_GUNIT_CLOCK_GATE 0x182060
+#define GCFG_DIS (1<<8)
#define VLV_IIR_RW 0x182084
#define VLV_IER 0x1820a0
#define VLV_IIR 0x1820a4
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d04e87f..88c154c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3713,6 +3713,13 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
PIPEA_HLINE_INT_EN | PIPEA_VBLANK_INT_EN |
SPRITEB_FLIPDONE_INT_EN | SPRITEA_FLIPDONE_INT_EN |
PLANEA_FLIPDONE_INT_EN);
+
+ /*
+ * WaDisableVLVClockGating_VBIIssue
+ * Disable clock gating on th GCFG unit to prevent a delay
+ * in the reporting of vblank events.
+ */
+ I915_WRITE(VLV_GUNIT_CLOCK_GATE, GCFG_DIS);
}
static void g4x_init_clock_gating(struct drm_device *dev)
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 4/8] drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV
2012-10-25 19:15 ` [PATCH 4/8] drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV Jesse Barnes
@ 2012-11-01 14:48 ` Antti Koskipää
2012-11-01 14:50 ` Jesse Barnes
0 siblings, 1 reply; 15+ messages in thread
From: Antti Koskipää @ 2012-11-01 14:48 UTC (permalink / raw)
To: Jesse Barnes; +Cc: intel-gfx
Hi,
On 10/25/12 22:15, Jesse Barnes wrote:
> This allows us to get the right vblank interrupt frequency.
>
> v2: pull in register definition
>
> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 2 ++
> drivers/gpu/drm/i915/intel_pm.c | 7 +++++++
> 2 files changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 6464eaa..4aec0a3 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -556,6 +556,8 @@
> #define IIR 0x020a4
> #define IMR 0x020a8
> #define ISR 0x020ac
> +#define VLV_GUNIT_CLOCK_GATE 0x182060
Where did you pull this offset from? It's not in the Bspec.
> +#define GCFG_DIS (1<<8)
> #define VLV_IIR_RW 0x182084
> #define VLV_IER 0x1820a0
> #define VLV_IIR 0x1820a4
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index d04e87f..88c154c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3713,6 +3713,13 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
> PIPEA_HLINE_INT_EN | PIPEA_VBLANK_INT_EN |
> SPRITEB_FLIPDONE_INT_EN | SPRITEA_FLIPDONE_INT_EN |
> PLANEA_FLIPDONE_INT_EN);
> +
> + /*
> + * WaDisableVLVClockGating_VBIIssue
> + * Disable clock gating on th GCFG unit to prevent a delay
> + * in the reporting of vblank events.
> + */
> + I915_WRITE(VLV_GUNIT_CLOCK_GATE, GCFG_DIS);
> }
>
> static void g4x_init_clock_gating(struct drm_device *dev)
--
- Antti
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/8] drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV
2012-11-01 14:48 ` Antti Koskipää
@ 2012-11-01 14:50 ` Jesse Barnes
2012-11-01 14:52 ` Antti Koskipää
0 siblings, 1 reply; 15+ messages in thread
From: Jesse Barnes @ 2012-11-01 14:50 UTC (permalink / raw)
To: Antti Koskipää; +Cc: intel-gfx
On Thu, 01 Nov 2012 16:48:17 +0200
Antti Koskipää <antti.koskipaa@linux.intel.com> wrote:
> Hi,
>
> On 10/25/12 22:15, Jesse Barnes wrote:
> > This allows us to get the right vblank interrupt frequency.
> >
> > v2: pull in register definition
> >
> > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> > ---
> > drivers/gpu/drm/i915/i915_reg.h | 2 ++
> > drivers/gpu/drm/i915/intel_pm.c | 7 +++++++
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 6464eaa..4aec0a3 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -556,6 +556,8 @@
> > #define IIR 0x020a4
> > #define IMR 0x020a8
> > #define ISR 0x020ac
> > +#define VLV_GUNIT_CLOCK_GATE 0x182060
>
> Where did you pull this offset from? It's not in the Bspec.
No, it's in the gunit spec. I'm still working on getting that one
opened up.
--
Jesse Barnes, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 5/8] drm/i915: implement WaDisablePSDDualDispatchEnable on IVB & VLV
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
` (2 preceding siblings ...)
2012-10-25 19:15 ` [PATCH 4/8] drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-10-25 19:15 ` [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op Jesse Barnes
` (2 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
Workaround for dual port PS dispatch on GT1.
v2: pull in register definition & offset handling
v3: use IVB GT1 macro to get the right regs (Ben)
v4: add for VLV too (Ben)
v5: don't read the reg, it's masked so we'll only enable the one extra bit (Chris)
v6: use a _GT2 suffix for the second reg (Chris)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/i915_drv.c | 1 +
drivers/gpu/drm/i915/i915_reg.h | 5 +++++
drivers/gpu/drm/i915/intel_pm.c | 11 +++++++++++
3 files changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8101cf6..d4b3507 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1130,6 +1130,7 @@ static bool IS_DISPLAYREG(u32 reg)
switch (reg) {
case GEN7_ROW_CHICKEN2:
+ case GEN7_HALF_SLICE_CHICKEN1:
return false;
default:
break;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 4aec0a3..f29401b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4273,6 +4273,11 @@
#define GEN7_L3LOG_BASE 0xB070
#define GEN7_L3LOG_SIZE 0x80
+#define GEN7_HALF_SLICE_CHICKEN1 0xe100 /* IVB GT1 + VLV */
+#define GEN7_HALF_SLICE_CHICKEN1_GT2 0xf100
+#define GEN7_MAX_PS_THREAD_DEP (8<<12)
+#define GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE (1<<3)
+
#define GEN7_ROW_CHICKEN2 0xe4f4
#define GEN7_ROW_CHICKEN2_GT2 0xf4f4
#define DOP_CLOCK_GATING_DISABLE (1<<0)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 88c154c..91709a3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3552,6 +3552,14 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
CHICKEN3_DGMG_REQ_OUT_FIX_DISABLE |
CHICKEN3_DGMG_DONE_FIX_DISABLE);
+ /* WaDisablePSDDualDispatchEnable */
+ if (IS_IVB_GT1(dev))
+ I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
+ else
+ I915_WRITE(GEN7_HALF_SLICE_CHICKEN1_GT2,
+ _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
+
/* Apply the WaDisableRHWOOptimizationForRenderHang workaround. */
I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
@@ -3637,6 +3645,9 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
CHICKEN3_DGMG_REQ_OUT_FIX_DISABLE |
CHICKEN3_DGMG_DONE_FIX_DISABLE);
+ I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
+
/* Apply the WaDisableRHWOOptimizationForRenderHang workaround. */
I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
` (3 preceding siblings ...)
2012-10-25 19:15 ` [PATCH 5/8] drm/i915: implement WaDisablePSDDualDispatchEnable on IVB & VLV Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-10-26 11:13 ` Chris Wilson
2012-10-25 19:15 ` [PATCH 7/8] drm/i915: PIPE_CONTROL TLB invalidate requires CS stall Jesse Barnes
2012-10-25 19:15 ` [PATCH 8/8] drm/i915: add clock gating regs to VLV offset check function Jesse Barnes
6 siblings, 1 reply; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
So store into the scratch space of the HWS to make sure the invalidate
occurs.
v2: use GTT address space for store, clean up #defines (Chris)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/i915_reg.h | 8 ++++++--
drivers/gpu/drm/i915/intel_ringbuffer.c | 22 ++++++++++++++++++----
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 ++
3 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f29401b..ea97430 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -242,8 +242,12 @@
*/
#define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*x-1)
#define MI_FLUSH_DW MI_INSTR(0x26, 1) /* for GEN6 */
-#define MI_INVALIDATE_TLB (1<<18)
-#define MI_INVALIDATE_BSD (1<<7)
+#define MI_FLUSH_DW_STORE_INDEX (1<<21)
+#define MI_INVALIDATE_TLB (1<<18)
+#define MI_FLUSH_DW_OP_STOREDW (1<<14)
+#define MI_INVALIDATE_BSD (1<<7)
+#define MI_FLUSH_DW_USE_GTT (1<<2)
+#define MI_FLUSH_DW_USE_PPGTT (0<<2)
#define MI_BATCH_BUFFER MI_INSTR(0x30, 1)
#define MI_BATCH_NON_SECURE (1)
/* for snb/ivb/vlv this also means "batch in ppgtt" when ppgtt is enabled. */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 785df4f..dc30d71 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1395,10 +1395,17 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
return ret;
cmd = MI_FLUSH_DW;
+ /*
+ * Bspec vol 1c.5 - video engine command streamer:
+ * "If ENABLED, all TLBs will be invalidated once the flush
+ * operation is complete. This bit is only valid when the
+ * Post-Sync Operation field is a value of 1h or 3h."
+ */
if (invalidate & I915_GEM_GPU_DOMAINS)
- cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD;
+ cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD |
+ MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
intel_ring_emit(ring, cmd);
- intel_ring_emit(ring, 0);
+ intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
intel_ring_emit(ring, 0);
intel_ring_emit(ring, MI_NOOP);
intel_ring_advance(ring);
@@ -1460,10 +1467,17 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
return ret;
cmd = MI_FLUSH_DW;
+ /*
+ * Bspec vol 1c.3 - blitter engine command streamer:
+ * "If ENABLED, all TLBs will be invalidated once the flush
+ * operation is complete. This bit is only valid when the
+ * Post-Sync Operation field is a value of 1h or 3h."
+ */
if (invalidate & I915_GEM_DOMAIN_RENDER)
- cmd |= MI_INVALIDATE_TLB;
+ cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
+ MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_OP_STOREDW;
intel_ring_emit(ring, cmd);
- intel_ring_emit(ring, 0);
+ intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX | MI_FLUSH_DW_USE_GTT);
intel_ring_emit(ring, 0);
intel_ring_emit(ring, MI_NOOP);
intel_ring_advance(ring);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 3745d1d..5af65b8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -183,6 +183,8 @@ intel_read_status_page(struct intel_ring_buffer *ring,
* The area from dword 0x20 to 0x3ff is available for driver usage.
*/
#define I915_GEM_HWS_INDEX 0x20
+#define I915_GEM_HWS_SCRATCH_INDEX 0x30
+#define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op
2012-10-25 19:15 ` [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op Jesse Barnes
@ 2012-10-26 11:13 ` Chris Wilson
2012-10-26 16:42 ` Jesse Barnes
0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2012-10-26 11:13 UTC (permalink / raw)
To: Jesse Barnes, intel-gfx
On Thu, 25 Oct 2012 12:15:46 -0700, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> So store into the scratch space of the HWS to make sure the invalidate
> occurs.
>
> v2: use GTT address space for store, clean up #defines (Chris)
>
> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> ---
> @@ -1460,10 +1467,17 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
> return ret;
>
> cmd = MI_FLUSH_DW;
> + /*
> + * Bspec vol 1c.3 - blitter engine command streamer:
> + * "If ENABLED, all TLBs will be invalidated once the flush
> + * operation is complete. This bit is only valid when the
> + * Post-Sync Operation field is a value of 1h or 3h."
> + */
> if (invalidate & I915_GEM_DOMAIN_RENDER)
> - cmd |= MI_INVALIDATE_TLB;
> + cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
> + MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_OP_STOREDW;
> intel_ring_emit(ring, cmd);
> - intel_ring_emit(ring, 0);
> + intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX | MI_FLUSH_DW_USE_GTT);
s/SCRATCH_INDEX/SCRATCH_ADDR/
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op
2012-10-26 11:13 ` Chris Wilson
@ 2012-10-26 16:42 ` Jesse Barnes
2012-11-02 12:41 ` Chris Wilson
0 siblings, 1 reply; 15+ messages in thread
From: Jesse Barnes @ 2012-10-26 16:42 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Fri, 26 Oct 2012 12:13:39 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Thu, 25 Oct 2012 12:15:46 -0700, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> > So store into the scratch space of the HWS to make sure the invalidate
> > occurs.
> >
> > v2: use GTT address space for store, clean up #defines (Chris)
> >
> > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> > ---
> > @@ -1460,10 +1467,17 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
> > return ret;
> >
> > cmd = MI_FLUSH_DW;
> > + /*
> > + * Bspec vol 1c.3 - blitter engine command streamer:
> > + * "If ENABLED, all TLBs will be invalidated once the flush
> > + * operation is complete. This bit is only valid when the
> > + * Post-Sync Operation field is a value of 1h or 3h."
> > + */
> > if (invalidate & I915_GEM_DOMAIN_RENDER)
> > - cmd |= MI_INVALIDATE_TLB;
> > + cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
> > + MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_OP_STOREDW;
> > intel_ring_emit(ring, cmd);
> > - intel_ring_emit(ring, 0);
> > + intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_INDEX | MI_FLUSH_DW_USE_GTT);
>
> s/SCRATCH_INDEX/SCRATCH_ADDR/
> -Chris
>
--
Jesse Barnes, Intel Open Source Technology Center
commit b99c792eddf804150b3341a85c256df50d7ab5c2
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Wed Sep 19 13:02:39 2012 -0700
drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op v3
So store into the scratch space of the HWS to make sure the invalidate
occurs.
v2: use GTT address space for store, clean up #defines (Chris)
v3: use correct #define in blt ring flush (Chris)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f29401b..ea97430 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -242,8 +242,12 @@
*/
#define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*x-1)
#define MI_FLUSH_DW MI_INSTR(0x26, 1) /* for GEN6 */
-#define MI_INVALIDATE_TLB (1<<18)
-#define MI_INVALIDATE_BSD (1<<7)
+#define MI_FLUSH_DW_STORE_INDEX (1<<21)
+#define MI_INVALIDATE_TLB (1<<18)
+#define MI_FLUSH_DW_OP_STOREDW (1<<14)
+#define MI_INVALIDATE_BSD (1<<7)
+#define MI_FLUSH_DW_USE_GTT (1<<2)
+#define MI_FLUSH_DW_USE_PPGTT (0<<2)
#define MI_BATCH_BUFFER MI_INSTR(0x30, 1)
#define MI_BATCH_NON_SECURE (1)
/* for snb/ivb/vlv this also means "batch in ppgtt" when ppgtt is enabled. */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 785df4f..55abda5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1395,10 +1395,17 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
return ret;
cmd = MI_FLUSH_DW;
+ /*
+ * Bspec vol 1c.5 - video engine command streamer:
+ * "If ENABLED, all TLBs will be invalidated once the flush
+ * operation is complete. This bit is only valid when the
+ * Post-Sync Operation field is a value of 1h or 3h."
+ */
if (invalidate & I915_GEM_GPU_DOMAINS)
- cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD;
+ cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD |
+ MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
intel_ring_emit(ring, cmd);
- intel_ring_emit(ring, 0);
+ intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
intel_ring_emit(ring, 0);
intel_ring_emit(ring, MI_NOOP);
intel_ring_advance(ring);
@@ -1460,10 +1467,17 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
return ret;
cmd = MI_FLUSH_DW;
+ /*
+ * Bspec vol 1c.3 - blitter engine command streamer:
+ * "If ENABLED, all TLBs will be invalidated once the flush
+ * operation is complete. This bit is only valid when the
+ * Post-Sync Operation field is a value of 1h or 3h."
+ */
if (invalidate & I915_GEM_DOMAIN_RENDER)
- cmd |= MI_INVALIDATE_TLB;
+ cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX |
+ MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_OP_STOREDW;
intel_ring_emit(ring, cmd);
- intel_ring_emit(ring, 0);
+ intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
intel_ring_emit(ring, 0);
intel_ring_emit(ring, MI_NOOP);
intel_ring_advance(ring);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 3745d1d..5af65b8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -183,6 +183,8 @@ intel_read_status_page(struct intel_ring_buffer *ring,
* The area from dword 0x20 to 0x3ff is available for driver usage.
*/
#define I915_GEM_HWS_INDEX 0x20
+#define I915_GEM_HWS_SCRATCH_INDEX 0x30
+#define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring);
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op
2012-10-26 16:42 ` Jesse Barnes
@ 2012-11-02 12:41 ` Chris Wilson
0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2012-11-02 12:41 UTC (permalink / raw)
To: Jesse Barnes; +Cc: intel-gfx
On Fri, 26 Oct 2012 09:42:42 -0700, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> commit b99c792eddf804150b3341a85c256df50d7ab5c2
> Author: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date: Wed Sep 19 13:02:39 2012 -0700
>
> drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op v3
>
> So store into the scratch space of the HWS to make sure the invalidate
> occurs.
>
> v2: use GTT address space for store, clean up #defines (Chris)
> v3: use correct #define in blt ring flush (Chris)
>
> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
That looks to be the code I executed, but I can't confirm it fixes any
problems.
References: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1063252
which looks to be a likely victim of a missing TLB flush on the blitter
ring.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 7/8] drm/i915: PIPE_CONTROL TLB invalidate requires CS stall
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
` (4 preceding siblings ...)
2012-10-25 19:15 ` [PATCH 6/8] drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-10-25 19:15 ` [PATCH 8/8] drm/i915: add clock gating regs to VLV offset check function Jesse Barnes
6 siblings, 0 replies; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
"If ENABLED, PIPE_CONTROL command will flush the in flight data written
out by render engine to Global Observation point on flush done. Also
Requires stall bit ([20] of DW1) set."
So set the stall bit to ensure proper invalidation.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index dc30d71..82f8767 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -245,7 +245,7 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring,
/*
* TLB invalidate requires a post-sync write.
*/
- flags |= PIPE_CONTROL_QW_WRITE;
+ flags |= PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
}
ret = intel_ring_begin(ring, 4);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH 8/8] drm/i915: add clock gating regs to VLV offset check function
2012-10-25 19:15 [PATCH 1/8] drm/i915: implement WaDisableL3CacheAging on VLV Jesse Barnes
` (5 preceding siblings ...)
2012-10-25 19:15 ` [PATCH 7/8] drm/i915: PIPE_CONTROL TLB invalidate requires CS stall Jesse Barnes
@ 2012-10-25 19:15 ` Jesse Barnes
2012-11-02 15:34 ` Daniel Vetter
6 siblings, 1 reply; 15+ messages in thread
From: Jesse Barnes @ 2012-10-25 19:15 UTC (permalink / raw)
To: intel-gfx
So we can write them properly.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
---
drivers/gpu/drm/i915/i915_drv.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index d4b3507..fb4b816 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1129,8 +1129,17 @@ static bool IS_DISPLAYREG(u32 reg)
return false;
switch (reg) {
+ case _3D_CHICKEN3:
+ case IVB_CHICKEN3:
+ case GEN7_COMMON_SLICE_CHICKEN1:
+ case GEN7_L3CNTLREG1:
+ case GEN7_L3_CHICKEN_MODE_REGISTER:
case GEN7_ROW_CHICKEN2:
+ case GEN7_L3SQCREG4:
+ case GEN7_SQ_CHICKEN_MBCUNIT_CONFIG:
case GEN7_HALF_SLICE_CHICKEN1:
+ case GEN6_MBCTL:
+ case GEN6_UCGCTL2:
return false;
default:
break;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH 8/8] drm/i915: add clock gating regs to VLV offset check function
2012-10-25 19:15 ` [PATCH 8/8] drm/i915: add clock gating regs to VLV offset check function Jesse Barnes
@ 2012-11-02 15:34 ` Daniel Vetter
0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2012-11-02 15:34 UTC (permalink / raw)
To: Jesse Barnes; +Cc: intel-gfx
On Thu, Oct 25, 2012 at 12:15:48PM -0700, Jesse Barnes wrote:
> So we can write them properly.
>
> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Slurped in the entire series, thanks for the patches.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index d4b3507..fb4b816 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1129,8 +1129,17 @@ static bool IS_DISPLAYREG(u32 reg)
> return false;
>
> switch (reg) {
> + case _3D_CHICKEN3:
> + case IVB_CHICKEN3:
> + case GEN7_COMMON_SLICE_CHICKEN1:
> + case GEN7_L3CNTLREG1:
> + case GEN7_L3_CHICKEN_MODE_REGISTER:
> case GEN7_ROW_CHICKEN2:
> + case GEN7_L3SQCREG4:
> + case GEN7_SQ_CHICKEN_MBCUNIT_CONFIG:
> case GEN7_HALF_SLICE_CHICKEN1:
> + case GEN6_MBCTL:
> + case GEN6_UCGCTL2:
> return false;
/me screams
> default:
> break;
> --
> 1.7.9.5
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 15+ messages in thread