* [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 21:53 ` Michel Thierry
2017-10-10 10:10 ` Mika Kuoppala
2017-10-09 20:58 ` [RFC PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
` (11 subsequent siblings)
12 siblings, 2 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware,
there is no need to save space for them in the list of context workarounds.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 799a90a..47a357c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1954,13 +1954,7 @@ struct i915_wa_reg {
u32 mask;
};
-/*
- * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
- * allowing it for RCS as we don't foresee any requirement of having
- * a whitelist for other engines. When it is really required for
- * other engines then the limit need to be increased.
- */
-#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
+#define I915_MAX_WA_REGS 16
struct i915_workarounds {
struct i915_wa_reg reg[I915_MAX_WA_REGS];
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space
2017-10-09 20:58 ` [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
@ 2017-10-09 21:53 ` Michel Thierry
2017-10-10 10:10 ` Mika Kuoppala
1 sibling, 0 replies; 22+ messages in thread
From: Michel Thierry @ 2017-10-09 21:53 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx@lists.freedesktop.org
On 10/9/2017 1:58 PM, Oscar Mateo wrote:
> Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware,
> there is no need to save space for them in the list of context workarounds.
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Is it worth mention the commit that changed this? E.g.:
Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware
[commit 32ced39c1b12 ("drm/i915: Transform whitelisting WAs into a
simple reg write")]...
Anyway,
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 799a90a..47a357c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1954,13 +1954,7 @@ struct i915_wa_reg {
> u32 mask;
> };
>
> -/*
> - * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
> - * allowing it for RCS as we don't foresee any requirement of having
> - * a whitelist for other engines. When it is really required for
> - * other engines then the limit need to be increased.
> - */
> -#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
> +#define I915_MAX_WA_REGS 16
>
> struct i915_workarounds {
> struct i915_wa_reg reg[I915_MAX_WA_REGS];
> --
> 1.9.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space
2017-10-09 20:58 ` [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
2017-10-09 21:53 ` Michel Thierry
@ 2017-10-10 10:10 ` Mika Kuoppala
1 sibling, 0 replies; 22+ messages in thread
From: Mika Kuoppala @ 2017-10-10 10:10 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
Oscar Mateo <oscar.mateo@intel.com> writes:
> Now that we write RING_FORCE_TO_NONPRIV registers directly to hardware,
> there is no need to save space for them in the list of context workarounds.
>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 799a90a..47a357c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1954,13 +1954,7 @@ struct i915_wa_reg {
> u32 mask;
> };
>
> -/*
> - * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
> - * allowing it for RCS as we don't foresee any requirement of having
> - * a whitelist for other engines. When it is really required for
> - * other engines then the limit need to be increased.
> - */
> -#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
> +#define I915_MAX_WA_REGS 16
>
> struct i915_workarounds {
> struct i915_wa_reg reg[I915_MAX_WA_REGS];
> --
> 1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 21:06 ` Chris Wilson
2017-10-09 20:58 ` [RFC PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
` (10 subsequent siblings)
12 siblings, 1 reply; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
This has grown to be a sizable amount of code, so move it to
its own file before we try to refactor anything. For the moment,
we are leaving behind the WA BB code and the WAs that get applied
(incorrectly) in init_clock_gating, but we will deal with it later.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/Makefile | 1 +
drivers/gpu/drm/i915/i915_workarounds.c | 705 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_workarounds.h | 31 ++
drivers/gpu/drm/i915/intel_engine_cs.c | 679 ------------------------------
drivers/gpu/drm/i915/intel_lrc.c | 1 +
drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
drivers/gpu/drm/i915/intel_ringbuffer.h | 3 -
7 files changed, 739 insertions(+), 682 deletions(-)
create mode 100644 drivers/gpu/drm/i915/i915_workarounds.c
create mode 100644 drivers/gpu/drm/i915/i915_workarounds.h
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 66d23b6..2f0afca 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -19,6 +19,7 @@ i915-y := i915_drv.o \
i915_syncmap.o \
i915_sw_fence.o \
i915_sysfs.o \
+ i915_workarounds.o \
intel_csr.o \
intel_device_info.o \
intel_pm.o \
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
new file mode 100644
index 0000000..57da318
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -0,0 +1,705 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include "i915_workarounds.h"
+
+static int wa_add(struct drm_i915_private *dev_priv,
+ i915_reg_t addr,
+ const u32 mask, const u32 val)
+{
+ const u32 idx = dev_priv->workarounds.count;
+
+ if (WARN_ON(idx >= I915_MAX_WA_REGS))
+ return -ENOSPC;
+
+ dev_priv->workarounds.reg[idx].addr = addr;
+ dev_priv->workarounds.reg[idx].value = val;
+ dev_priv->workarounds.reg[idx].mask = mask;
+
+ dev_priv->workarounds.count++;
+
+ return 0;
+}
+
+#define WA_REG(addr, mask, val) do { \
+ const int r = wa_add(dev_priv, (addr), (mask), (val)); \
+ if (r) \
+ return r; \
+ } while (0)
+
+#define WA_SET_BIT_MASKED(addr, mask) \
+ WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+
+#define WA_CLR_BIT_MASKED(addr, mask) \
+ WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+
+#define WA_SET_FIELD_MASKED(addr, mask, value) \
+ WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+
+static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
+ i915_reg_t reg)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ struct i915_workarounds *wa = &dev_priv->workarounds;
+ const uint32_t index = wa->hw_whitelist_count[engine->id];
+
+ if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+ return -EINVAL;
+
+ I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
+ i915_mmio_reg_offset(reg));
+ wa->hw_whitelist_count[engine->id]++;
+
+ return 0;
+}
+
+static int gen8_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+
+ WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+
+ /* WaDisableAsyncFlipPerfMode:bdw,chv */
+ WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+
+ /* WaDisablePartialInstShootdown:bdw,chv */
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+ PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+
+ /* Use Force Non-Coherent whenever executing a 3D context. This is a
+ * workaround for for a possible hang in the unlikely event a TLB
+ * invalidation occurs during a PSD flush.
+ */
+ /* WaForceEnableNonCoherent:bdw,chv */
+ /* WaHdcDisableFetchWhenMasked:bdw,chv */
+ WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ HDC_DONOT_FETCH_MEM_WHEN_MASKED |
+ HDC_FORCE_NON_COHERENT);
+
+ /* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
+ * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
+ * polygons in the same 8x4 pixel/sample area to be processed without
+ * stalling waiting for the earlier ones to write to Hierarchical Z
+ * buffer."
+ *
+ * This optimization is off by default for BDW and CHV; turn it on.
+ */
+ WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+
+ /* Wa4x4STCOptimizationDisable:bdw,chv */
+ WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+
+ /*
+ * BSpec recommends 8x4 when MSAA is used,
+ * however in practice 16x4 seems fastest.
+ *
+ * Note that PS/WM thread counts depend on the WIZ hashing
+ * disable bit, which we don't touch here, but it's good
+ * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
+ */
+ WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+ GEN6_WIZ_HASHING_MASK,
+ GEN6_WIZ_HASHING_16x4);
+
+ return 0;
+}
+
+static int bdw_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen8_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaDisableThreadStallDopClockGating:bdw (pre-production) */
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+
+ /* WaDisableDopClockGating:bdw
+ *
+ * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
+ * to disable EUTC clock gating.
+ */
+ WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+ DOP_CLOCK_GATING_DISABLE);
+
+ WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+ GEN8_SAMPLER_POWER_BYPASS_DIS);
+
+ WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ /* WaForceContextSaveRestoreNonCoherent:bdw */
+ HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+ /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
+ (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
+
+ return 0;
+}
+
+static int chv_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen8_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaDisableThreadStallDopClockGating:chv */
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+
+ /* Improve HiZ throughput on CHV. */
+ WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
+
+ return 0;
+}
+
+static int gen9_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ /* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
+ I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+
+ /* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
+ I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+ GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+
+ /* WaDisableKillLogic:bxt,skl,kbl */
+ if (!IS_COFFEELAKE(dev_priv))
+ I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+ ECOCHK_DIS_TLB);
+
+ if (HAS_LLC(dev_priv)) {
+ /* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
+ *
+ * Must match Display Engine. See
+ * WaCompressedResourceDisplayNewHashMode.
+ */
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN9_PBE_COMPRESSED_HASH_SELECTION);
+ WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+ GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
+
+ I915_WRITE(MMCD_MISC_CTRL,
+ I915_READ(MMCD_MISC_CTRL) |
+ MMCD_PCLA |
+ MMCD_HOTSPOT_EN);
+ }
+
+ /* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
+ /* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+ FLOW_CONTROL_ENABLE |
+ PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
+
+ /* Syncing dependencies between camera and graphics:skl,bxt,kbl */
+ if (!IS_COFFEELAKE(dev_priv))
+ WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+ GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
+
+ /* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+ WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+ GEN9_DG_MIRROR_FIX_ENABLE);
+
+ /* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+ WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
+ GEN9_RHWO_OPTIMIZATION_DISABLE);
+ /*
+ * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
+ * but we do that in per ctx batchbuffer as there is an issue
+ * with this register not getting restored on ctx restore
+ */
+ }
+
+ /* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
+ /* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
+ WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+ GEN9_ENABLE_YV12_BUGFIX |
+ GEN9_ENABLE_GPGPU_PREEMPTION);
+
+ /* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
+ /* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
+ WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
+ GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
+
+ /* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
+ WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+ GEN9_CCS_TLB_PREFETCH_ENABLE);
+
+ /* WaDisableMaskBasedCammingInRCC:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+ WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
+ PIXEL_MASK_CAMMING_DISABLE);
+
+ /* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
+ WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
+ HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
+
+ /* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
+ * both tied to WaForceContextSaveRestoreNonCoherent
+ * in some hsds for skl. We keep the tie for all gen9. The
+ * documentation is a bit hazy and so we want to get common behaviour,
+ * even though there is no clear evidence we would need both on kbl/bxt.
+ * This area has been source of system hangs so we play it safe
+ * and mimic the skl regardless of what bspec says.
+ *
+ * Use Force Non-Coherent whenever executing a 3D context. This
+ * is a workaround for a possible hang in the unlikely event
+ * a TLB invalidation occurs during a PSD flush.
+ */
+
+ /* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
+ WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ HDC_FORCE_NON_COHERENT);
+
+ /* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
+ I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+ BDW_DISABLE_HDC_INVALIDATION);
+
+ /* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
+ if (IS_SKYLAKE(dev_priv) ||
+ IS_KABYLAKE(dev_priv) ||
+ IS_COFFEELAKE(dev_priv) ||
+ IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
+ WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+ GEN8_SAMPLER_POWER_BYPASS_DIS);
+
+ /* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
+ WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+
+ /* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
+ I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+ GEN8_LQSC_FLUSH_COHERENT_LINES));
+
+ /*
+ * Supporting preemption with fine-granularity requires changes in the
+ * batch buffer programming. Since we can't break old userspace, we
+ * need to set our default preemption level to safe value. Userspace is
+ * still able to use more fine-grained preemption levels, since in
+ * WaEnablePreemptionGranularityControlByUMD we're whitelisting the
+ * per-ctx register. As such, WaDisable{3D,GPGPU}MidCmdPreemption are
+ * not real HW workarounds, but merely a way to start using preemption
+ * while maintaining old contract with userspace.
+ */
+
+ /* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
+ WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+ /* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
+ WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+ GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+ /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
+ ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
+ if (ret)
+ return ret;
+
+ /* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+ I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+ ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+ if (ret)
+ return ret;
+
+ /* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ u8 vals[3] = { 0, 0, 0 };
+ unsigned int i;
+
+ for (i = 0; i < 3; i++) {
+ u8 ss;
+
+ /*
+ * Only consider slices where one, and only one, subslice has 7
+ * EUs
+ */
+ if (!is_power_of_2(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]))
+ continue;
+
+ /*
+ * subslice_7eu[i] != 0 (because of the check above) and
+ * ss_max == 4 (maximum number of subslices possible per slice)
+ *
+ * -> 0 <= ss <= 3;
+ */
+ ss = ffs(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]) - 1;
+ vals[i] = 3 - ss;
+ }
+
+ if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
+ return 0;
+
+ /* Tune IZ hashing. See intel_device_info_runtime_init() */
+ WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+ GEN9_IZ_HASHING_MASK(2) |
+ GEN9_IZ_HASHING_MASK(1) |
+ GEN9_IZ_HASHING_MASK(0),
+ GEN9_IZ_HASHING(2, vals[2]) |
+ GEN9_IZ_HASHING(1, vals[1]) |
+ GEN9_IZ_HASHING(0, vals[0]));
+
+ return 0;
+}
+
+static int skl_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen9_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaEnableGapsTsvCreditFix:skl */
+ I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+ GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+ /* WaDisableGafsUnitClkGating:skl */
+ I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+ GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+ /* WaInPlaceDecompressionHang:skl */
+ if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+ /* WaDisableLSQCROPERFforOCL:skl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ if (ret)
+ return ret;
+
+ return skl_tune_iz_hashing(engine);
+}
+
+static int bxt_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen9_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaStoreMultiplePTEenable:bxt */
+ /* This is a requirement according to Hardware specification */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+ I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+
+ /* WaSetClckGatingDisableMedia:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+ I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+ ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+ }
+
+ /* WaDisableThreadStallDopClockGating:bxt */
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+ STALL_DOP_GATING_DISABLE);
+
+ /* WaDisablePooledEuLoadBalancingFix:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+ I915_WRITE(FF_SLICE_CS_CHICKEN2,
+ _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+ }
+
+ /* WaDisableSbeCacheDispatchPortSharing:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
+ WA_SET_BIT_MASKED(
+ GEN7_HALF_SLICE_CHICKEN1,
+ GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+ }
+
+ /* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
+ /* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
+ /* WaDisableObjectLevelPreemtionForInstanceId:bxt */
+ /* WaDisableLSQCROPERFforOCL:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+ ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
+ if (ret)
+ return ret;
+
+ ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ if (ret)
+ return ret;
+ }
+
+ /* WaProgramL3SqcReg1DefaultForPerf:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
+ I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
+ L3_HIGH_PRIO_CREDITS(2));
+
+ /* WaToEnableHwFixForPushConstHWBug:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+ /* WaInPlaceDecompressionHang:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+ return 0;
+}
+
+static int cnl_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ /* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
+ if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+ I915_WRITE(GAMT_CHKN_BIT_REG,
+ (I915_READ(GAMT_CHKN_BIT_REG) |
+ GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+
+ /* WaForceContextSaveRestoreNonCoherent:cnl */
+ WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+ HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
+
+ /* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
+ if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
+
+ /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+ /* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
+ if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
+
+ /* WaInPlaceDecompressionHang:cnl */
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+ /* WaPushConstantDereferenceHoldDisable:cnl */
+ WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+
+ /* FtrEnableFastAnisoL1BankingFix: cnl */
+ WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+
+ /* WaDisable3DMidCmdPreemption:cnl */
+ WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+ /* WaDisableGPGPUMidCmdPreemption:cnl */
+ WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+ GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
+
+ /* WaEnablePreemptionGranularityControlByUMD:cnl */
+ I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+ ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int kbl_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen9_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaEnableGapsTsvCreditFix:kbl */
+ I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+ GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+ /* WaDisableDynamicCreditSharing:kbl */
+ if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+ I915_WRITE(GAMT_CHKN_BIT_REG,
+ (I915_READ(GAMT_CHKN_BIT_REG) |
+ GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+
+ /* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
+ if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
+ WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ HDC_FENCE_DEST_SLM_DISABLE);
+
+ /* WaToEnableHwFixForPushConstHWBug:kbl */
+ if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+ /* WaDisableGafsUnitClkGating:kbl */
+ I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+ GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+ /* WaDisableSbeCacheDispatchPortSharing:kbl */
+ WA_SET_BIT_MASKED(
+ GEN7_HALF_SLICE_CHICKEN1,
+ GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+
+ /* WaInPlaceDecompressionHang:kbl */
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+ /* WaDisableLSQCROPERFforOCL:kbl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int glk_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen9_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaToEnableHwFixForPushConstHWBug:glk */
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+ return 0;
+}
+
+static int cfl_init_workarounds(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
+
+ ret = gen9_init_workarounds(engine);
+ if (ret)
+ return ret;
+
+ /* WaEnableGapsTsvCreditFix:cfl */
+ I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+ GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+ /* WaToEnableHwFixForPushConstHWBug:cfl */
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+ /* WaDisableGafsUnitClkGating:cfl */
+ I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+ GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+ /* WaDisableSbeCacheDispatchPortSharing:cfl */
+ WA_SET_BIT_MASKED(
+ GEN7_HALF_SLICE_CHICKEN1,
+ GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
+
+ /* WaInPlaceDecompressionHang:cfl */
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+ return 0;
+}
+
+int init_workarounds_ring(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int err;
+
+ WARN_ON(engine->id != RCS);
+
+ dev_priv->workarounds.count = 0;
+ dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+
+ if (IS_BROADWELL(dev_priv))
+ err = bdw_init_workarounds(engine);
+ else if (IS_CHERRYVIEW(dev_priv))
+ err = chv_init_workarounds(engine);
+ else if (IS_SKYLAKE(dev_priv))
+ err = skl_init_workarounds(engine);
+ else if (IS_BROXTON(dev_priv))
+ err = bxt_init_workarounds(engine);
+ else if (IS_KABYLAKE(dev_priv))
+ err = kbl_init_workarounds(engine);
+ else if (IS_GEMINILAKE(dev_priv))
+ err = glk_init_workarounds(engine);
+ else if (IS_COFFEELAKE(dev_priv))
+ err = cfl_init_workarounds(engine);
+ else if (IS_CANNONLAKE(dev_priv))
+ err = cnl_init_workarounds(engine);
+ else
+ err = 0;
+ if (err)
+ return err;
+
+ DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
+ engine->name, dev_priv->workarounds.count);
+ return 0;
+}
+
+int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
+{
+ struct i915_workarounds *w = &req->i915->workarounds;
+ u32 *cs;
+ int ret, i;
+
+ if (w->count == 0)
+ return 0;
+
+ ret = req->engine->emit_flush(req, EMIT_BARRIER);
+ if (ret)
+ return ret;
+
+ cs = intel_ring_begin(req, (w->count * 2 + 2));
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ *cs++ = MI_LOAD_REGISTER_IMM(w->count);
+ for (i = 0; i < w->count; i++) {
+ *cs++ = i915_mmio_reg_offset(w->reg[i].addr);
+ *cs++ = w->reg[i].value;
+ }
+ *cs++ = MI_NOOP;
+
+ intel_ring_advance(req, cs);
+
+ ret = req->engine->emit_flush(req, EMIT_BARRIER);
+ if (ret)
+ return ret;
+
+ return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_workarounds.h b/drivers/gpu/drm/i915/i915_workarounds.h
new file mode 100644
index 0000000..27099fc
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_workarounds.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _I915_WORKAROUNDS_H_
+#define _I915_WORKAROUNDS_H_
+
+int init_workarounds_ring(struct intel_engine_cs *engine);
+int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
+
+#endif
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 807a7aa..59666f5 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -815,685 +815,6 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
}
}
-static int wa_add(struct drm_i915_private *dev_priv,
- i915_reg_t addr,
- const u32 mask, const u32 val)
-{
- const u32 idx = dev_priv->workarounds.count;
-
- if (WARN_ON(idx >= I915_MAX_WA_REGS))
- return -ENOSPC;
-
- dev_priv->workarounds.reg[idx].addr = addr;
- dev_priv->workarounds.reg[idx].value = val;
- dev_priv->workarounds.reg[idx].mask = mask;
-
- dev_priv->workarounds.count++;
-
- return 0;
-}
-
-#define WA_REG(addr, mask, val) do { \
- const int r = wa_add(dev_priv, (addr), (mask), (val)); \
- if (r) \
- return r; \
- } while (0)
-
-#define WA_SET_BIT_MASKED(addr, mask) \
- WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
-
-#define WA_CLR_BIT_MASKED(addr, mask) \
- WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
-
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
- WA_REG(addr, mask, _MASKED_FIELD(mask, value))
-
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
- i915_reg_t reg)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- struct i915_workarounds *wa = &dev_priv->workarounds;
- const uint32_t index = wa->hw_whitelist_count[engine->id];
-
- if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
- return -EINVAL;
-
- I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
- i915_mmio_reg_offset(reg));
- wa->hw_whitelist_count[engine->id]++;
-
- return 0;
-}
-
-static int gen8_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
-
- WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
-
- /* WaDisableAsyncFlipPerfMode:bdw,chv */
- WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
-
- /* WaDisablePartialInstShootdown:bdw,chv */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
- PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
-
- /* Use Force Non-Coherent whenever executing a 3D context. This is a
- * workaround for for a possible hang in the unlikely event a TLB
- * invalidation occurs during a PSD flush.
- */
- /* WaForceEnableNonCoherent:bdw,chv */
- /* WaHdcDisableFetchWhenMasked:bdw,chv */
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
- HDC_DONOT_FETCH_MEM_WHEN_MASKED |
- HDC_FORCE_NON_COHERENT);
-
- /* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
- * "The Hierarchical Z RAW Stall Optimization allows non-overlapping
- * polygons in the same 8x4 pixel/sample area to be processed without
- * stalling waiting for the earlier ones to write to Hierarchical Z
- * buffer."
- *
- * This optimization is off by default for BDW and CHV; turn it on.
- */
- WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
-
- /* Wa4x4STCOptimizationDisable:bdw,chv */
- WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
-
- /*
- * BSpec recommends 8x4 when MSAA is used,
- * however in practice 16x4 seems fastest.
- *
- * Note that PS/WM thread counts depend on the WIZ hashing
- * disable bit, which we don't touch here, but it's good
- * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
- */
- WA_SET_FIELD_MASKED(GEN7_GT_MODE,
- GEN6_WIZ_HASHING_MASK,
- GEN6_WIZ_HASHING_16x4);
-
- return 0;
-}
-
-static int bdw_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen8_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaDisableThreadStallDopClockGating:bdw (pre-production) */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
-
- /* WaDisableDopClockGating:bdw
- *
- * Also see the related UCGTCL1 write in broadwell_init_clock_gating()
- * to disable EUTC clock gating.
- */
- WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
- DOP_CLOCK_GATING_DISABLE);
-
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
- GEN8_SAMPLER_POWER_BYPASS_DIS);
-
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
- /* WaForceContextSaveRestoreNonCoherent:bdw */
- HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
- /* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
- (IS_BDW_GT3(dev_priv) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
-
- return 0;
-}
-
-static int chv_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen8_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaDisableThreadStallDopClockGating:chv */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
-
- /* Improve HiZ throughput on CHV. */
- WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
-
- return 0;
-}
-
-static int gen9_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- /* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
- I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
-
- /* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
- I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
- GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
-
- /* WaDisableKillLogic:bxt,skl,kbl */
- if (!IS_COFFEELAKE(dev_priv))
- I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
- ECOCHK_DIS_TLB);
-
- if (HAS_LLC(dev_priv)) {
- /* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
- *
- * Must match Display Engine. See
- * WaCompressedResourceDisplayNewHashMode.
- */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN9_PBE_COMPRESSED_HASH_SELECTION);
- WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
- GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-
- I915_WRITE(MMCD_MISC_CTRL,
- I915_READ(MMCD_MISC_CTRL) |
- MMCD_PCLA |
- MMCD_HOTSPOT_EN);
- }
-
- /* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
- /* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
- FLOW_CONTROL_ENABLE |
- PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
-
- /* Syncing dependencies between camera and graphics:skl,bxt,kbl */
- if (!IS_COFFEELAKE(dev_priv))
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
- GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
-
- /* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
- GEN9_DG_MIRROR_FIX_ENABLE);
-
- /* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
- GEN9_RHWO_OPTIMIZATION_DISABLE);
- /*
- * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
- * but we do that in per ctx batchbuffer as there is an issue
- * with this register not getting restored on ctx restore
- */
- }
-
- /* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
- /* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
- GEN9_ENABLE_YV12_BUGFIX |
- GEN9_ENABLE_GPGPU_PREEMPTION);
-
- /* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
- /* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
- GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
-
- /* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
- WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
- GEN9_CCS_TLB_PREFETCH_ENABLE);
-
- /* WaDisableMaskBasedCammingInRCC:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
- PIXEL_MASK_CAMMING_DISABLE);
-
- /* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
- HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
- HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
-
- /* WaForceEnableNonCoherent and WaDisableHDCInvalidation are
- * both tied to WaForceContextSaveRestoreNonCoherent
- * in some hsds for skl. We keep the tie for all gen9. The
- * documentation is a bit hazy and so we want to get common behaviour,
- * even though there is no clear evidence we would need both on kbl/bxt.
- * This area has been source of system hangs so we play it safe
- * and mimic the skl regardless of what bspec says.
- *
- * Use Force Non-Coherent whenever executing a 3D context. This
- * is a workaround for a possible hang in the unlikely event
- * a TLB invalidation occurs during a PSD flush.
- */
-
- /* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
- HDC_FORCE_NON_COHERENT);
-
- /* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
- I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
- BDW_DISABLE_HDC_INVALIDATION);
-
- /* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
- if (IS_SKYLAKE(dev_priv) ||
- IS_KABYLAKE(dev_priv) ||
- IS_COFFEELAKE(dev_priv) ||
- IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
- GEN8_SAMPLER_POWER_BYPASS_DIS);
-
- /* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
-
- /* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
- I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
- GEN8_LQSC_FLUSH_COHERENT_LINES));
-
- /*
- * Supporting preemption with fine-granularity requires changes in the
- * batch buffer programming. Since we can't break old userspace, we
- * need to set our default preemption level to safe value. Userspace is
- * still able to use more fine-grained preemption levels, since in
- * WaEnablePreemptionGranularityControlByUMD we're whitelisting the
- * per-ctx register. As such, WaDisable{3D,GPGPU}MidCmdPreemption are
- * not real HW workarounds, but merely a way to start using preemption
- * while maintaining old contract with userspace.
- */
-
- /* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
- WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
- /* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
- WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
- GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
- /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
- ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
- if (ret)
- return ret;
-
- /* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
- I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
- ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
- if (ret)
- return ret;
-
- /* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
- ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- u8 vals[3] = { 0, 0, 0 };
- unsigned int i;
-
- for (i = 0; i < 3; i++) {
- u8 ss;
-
- /*
- * Only consider slices where one, and only one, subslice has 7
- * EUs
- */
- if (!is_power_of_2(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]))
- continue;
-
- /*
- * subslice_7eu[i] != 0 (because of the check above) and
- * ss_max == 4 (maximum number of subslices possible per slice)
- *
- * -> 0 <= ss <= 3;
- */
- ss = ffs(INTEL_INFO(dev_priv)->sseu.subslice_7eu[i]) - 1;
- vals[i] = 3 - ss;
- }
-
- if (vals[0] == 0 && vals[1] == 0 && vals[2] == 0)
- return 0;
-
- /* Tune IZ hashing. See intel_device_info_runtime_init() */
- WA_SET_FIELD_MASKED(GEN7_GT_MODE,
- GEN9_IZ_HASHING_MASK(2) |
- GEN9_IZ_HASHING_MASK(1) |
- GEN9_IZ_HASHING_MASK(0),
- GEN9_IZ_HASHING(2, vals[2]) |
- GEN9_IZ_HASHING(1, vals[1]) |
- GEN9_IZ_HASHING(0, vals[0]));
-
- return 0;
-}
-
-static int skl_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen9_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaEnableGapsTsvCreditFix:skl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
-
- /* WaDisableGafsUnitClkGating:skl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
- /* WaInPlaceDecompressionHang:skl */
- if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- /* WaDisableLSQCROPERFforOCL:skl */
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
-
- return skl_tune_iz_hashing(engine);
-}
-
-static int bxt_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen9_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaStoreMultiplePTEenable:bxt */
- /* This is a requirement according to Hardware specification */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
-
- /* WaSetClckGatingDisableMedia:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
- ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
- }
-
- /* WaDisableThreadStallDopClockGating:bxt */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
- STALL_DOP_GATING_DISABLE);
-
- /* WaDisablePooledEuLoadBalancingFix:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
- I915_WRITE(FF_SLICE_CS_CHICKEN2,
- _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
- }
-
- /* WaDisableSbeCacheDispatchPortSharing:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
- WA_SET_BIT_MASKED(
- GEN7_HALF_SLICE_CHICKEN1,
- GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
- }
-
- /* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
- /* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
- /* WaDisableObjectLevelPreemtionForInstanceId:bxt */
- /* WaDisableLSQCROPERFforOCL:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
- if (ret)
- return ret;
-
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
- }
-
- /* WaProgramL3SqcReg1DefaultForPerf:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
- I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
- L3_HIGH_PRIO_CREDITS(2));
-
- /* WaToEnableHwFixForPushConstHWBug:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- /* WaInPlaceDecompressionHang:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- return 0;
-}
-
-static int cnl_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- /* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
- if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
- I915_WRITE(GAMT_CHKN_BIT_REG,
- (I915_READ(GAMT_CHKN_BIT_REG) |
- GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
-
- /* WaForceContextSaveRestoreNonCoherent:cnl */
- WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
- HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
- /* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
- if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
- /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- /* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
- if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
-
- /* WaInPlaceDecompressionHang:cnl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- /* WaPushConstantDereferenceHoldDisable:cnl */
- WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
- /* FtrEnableFastAnisoL1BankingFix: cnl */
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
- /* WaDisable3DMidCmdPreemption:cnl */
- WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
- /* WaDisableGPGPUMidCmdPreemption:cnl */
- WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
- GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
- /* WaEnablePreemptionGranularityControlByUMD:cnl */
- I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
- ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static int kbl_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen9_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaEnableGapsTsvCreditFix:kbl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
-
- /* WaDisableDynamicCreditSharing:kbl */
- if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GAMT_CHKN_BIT_REG,
- (I915_READ(GAMT_CHKN_BIT_REG) |
- GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
- /* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
- if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
- HDC_FENCE_DEST_SLM_DISABLE);
-
- /* WaToEnableHwFixForPushConstHWBug:kbl */
- if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- /* WaDisableGafsUnitClkGating:kbl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
- /* WaDisableSbeCacheDispatchPortSharing:kbl */
- WA_SET_BIT_MASKED(
- GEN7_HALF_SLICE_CHICKEN1,
- GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
- /* WaInPlaceDecompressionHang:kbl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- /* WaDisableLSQCROPERFforOCL:kbl */
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static int glk_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen9_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaToEnableHwFixForPushConstHWBug:glk */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- return 0;
-}
-
-static int cfl_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- ret = gen9_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaEnableGapsTsvCreditFix:cfl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
-
- /* WaToEnableHwFixForPushConstHWBug:cfl */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- /* WaDisableGafsUnitClkGating:cfl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
- /* WaDisableSbeCacheDispatchPortSharing:cfl */
- WA_SET_BIT_MASKED(
- GEN7_HALF_SLICE_CHICKEN1,
- GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
-
- /* WaInPlaceDecompressionHang:cfl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- return 0;
-}
-
-int init_workarounds_ring(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- int err;
-
- WARN_ON(engine->id != RCS);
-
- dev_priv->workarounds.count = 0;
- dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
-
- if (IS_BROADWELL(dev_priv))
- err = bdw_init_workarounds(engine);
- else if (IS_CHERRYVIEW(dev_priv))
- err = chv_init_workarounds(engine);
- else if (IS_SKYLAKE(dev_priv))
- err = skl_init_workarounds(engine);
- else if (IS_BROXTON(dev_priv))
- err = bxt_init_workarounds(engine);
- else if (IS_KABYLAKE(dev_priv))
- err = kbl_init_workarounds(engine);
- else if (IS_GEMINILAKE(dev_priv))
- err = glk_init_workarounds(engine);
- else if (IS_COFFEELAKE(dev_priv))
- err = cfl_init_workarounds(engine);
- else if (IS_CANNONLAKE(dev_priv))
- err = cnl_init_workarounds(engine);
- else
- err = 0;
- if (err)
- return err;
-
- DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
- engine->name, dev_priv->workarounds.count);
- return 0;
-}
-
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
-{
- struct i915_workarounds *w = &req->i915->workarounds;
- u32 *cs;
- int ret, i;
-
- if (w->count == 0)
- return 0;
-
- ret = req->engine->emit_flush(req, EMIT_BARRIER);
- if (ret)
- return ret;
-
- cs = intel_ring_begin(req, (w->count * 2 + 2));
- if (IS_ERR(cs))
- return PTR_ERR(cs);
-
- *cs++ = MI_LOAD_REGISTER_IMM(w->count);
- for (i = 0; i < w->count; i++) {
- *cs++ = i915_mmio_reg_offset(w->reg[i].addr);
- *cs++ = w->reg[i].value;
- }
- *cs++ = MI_NOOP;
-
- intel_ring_advance(req, cs);
-
- ret = req->engine->emit_flush(req, EMIT_BARRIER);
- if (ret)
- return ret;
-
- return 0;
-}
-
static bool ring_is_idle(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 721432d..07978fd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -137,6 +137,7 @@
#include <drm/i915_drm.h>
#include "i915_drv.h"
#include "intel_mocs.h"
+#include "i915_workarounds.h"
#define RING_EXECLIST_QFULL (1 << 0x2)
#define RING_EXECLIST1_VALID (1 << 0x3)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 05c08b0..bb11aac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -33,6 +33,7 @@
#include <drm/i915_drm.h>
#include "i915_trace.h"
#include "intel_drv.h"
+#include "i915_workarounds.h"
/* Rough estimate of the typical request size, performing a flush,
* set-context and then emitting the batch.
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 0fedda1..63ff7e0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -723,9 +723,6 @@ static inline u32 intel_engine_last_submit(struct intel_engine_cs *engine)
return READ_ONCE(engine->timeline->seqno);
}
-int init_workarounds_ring(struct intel_engine_cs *engine);
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
-
void intel_engine_get_instdone(struct intel_engine_cs *engine,
struct intel_instdone *instdone);
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [RFC PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file
2017-10-09 20:58 ` [RFC PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
@ 2017-10-09 21:06 ` Chris Wilson
2017-10-10 17:26 ` Oscar Mateo
0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-10-09 21:06 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
Quoting Oscar Mateo (2017-10-09 21:58:17)
> This has grown to be a sizable amount of code, so move it to
> its own file before we try to refactor anything. For the moment,
> we are leaving behind the WA BB code and the WAs that get applied
> (incorrectly) in init_clock_gating, but we will deal with it later.
>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/Makefile | 1 +
> drivers/gpu/drm/i915/i915_workarounds.c | 705 ++++++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/i915_workarounds.h | 31 ++
If we look at the filenames with rose-tinted glasses:
intel_* -> interaction with hw
i915_* -> interaction with user
(Let's not start on the mixup entailed when we use the i915 prefix to
mean to talking to gen2/gen3. Hopefully most gen2 mixups are history.)
i915_workarounds would imply w/a for the drm/i915 uABI, but we want
intel_workarounds to imply w/a for hw.
Now some are indeed workarounds to keep uABI constant (or otherwise
correct)... The line can be murky ;)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file
2017-10-09 21:06 ` Chris Wilson
@ 2017-10-10 17:26 ` Oscar Mateo
0 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-10 17:26 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 10/09/2017 02:06 PM, Chris Wilson wrote:
> Quoting Oscar Mateo (2017-10-09 21:58:17)
>> This has grown to be a sizable amount of code, so move it to
>> its own file before we try to refactor anything. For the moment,
>> we are leaving behind the WA BB code and the WAs that get applied
>> (incorrectly) in init_clock_gating, but we will deal with it later.
>>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> ---
>> drivers/gpu/drm/i915/Makefile | 1 +
>> drivers/gpu/drm/i915/i915_workarounds.c | 705 ++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/i915/i915_workarounds.h | 31 ++
> If we look at the filenames with rose-tinted glasses:
>
> intel_* -> interaction with hw
> i915_* -> interaction with user
>
> (Let's not start on the mixup entailed when we use the i915 prefix to
> mean to talking to gen2/gen3. Hopefully most gen2 mixups are history.)
>
> i915_workarounds would imply w/a for the drm/i915 uABI, but we want
> intel_workarounds to imply w/a for hw.
>
> Now some are indeed workarounds to keep uABI constant (or otherwise
> correct)... The line can be murky ;)
> -Chris
I remembered file prefixes had a meaning, but I tried to recall it from
memory and came up with the exact opposite (and I looked at
i915_guc_submission.c as proof that I was recalling it correctly :P).
I'll rename the files, no problem.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFC PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 01/11] drm/i915: No need for RING_MAX_NONPRIV_SLOTS space Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 02/11] drm/i915: Move a bunch of workaround-related code to its own file Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
` (9 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
There are different kind of workarounds (those that modify registers that
live in the context image, those that modify global registers, those that
whitelist registers, etc...) and they have different requirements in terms
of where they are applied and how. Also, by splitting them apart, it should
be easier to decide where a new workaround should go.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 3 +
drivers/gpu/drm/i915/i915_gem_context.c | 5 +
drivers/gpu/drm/i915/i915_workarounds.c | 649 +++++++++++++++++++-------------
drivers/gpu/drm/i915/i915_workarounds.h | 8 +-
drivers/gpu/drm/i915/intel_lrc.c | 10 +-
drivers/gpu/drm/i915/intel_ringbuffer.c | 4 +-
6 files changed, 413 insertions(+), 266 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 82a1003..c26d8f2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -32,6 +32,7 @@
#include "i915_gem_clflush.h"
#include "i915_vgpu.h"
#include "i915_trace.h"
+#include "i915_workarounds.h"
#include "intel_drv.h"
#include "intel_frontbuffer.h"
#include "intel_mocs.h"
@@ -4762,6 +4763,8 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
}
}
+ i915_mmio_workarounds_apply(dev_priv);
+
i915_gem_init_swizzling(dev_priv);
/*
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5bf96a2..b544847 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,7 @@
#include <drm/i915_drm.h>
#include "i915_drv.h"
#include "i915_trace.h"
+#include "i915_workarounds.h"
#define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
@@ -454,6 +455,10 @@ int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
GEM_BUG_ON(dev_priv->kernel_context);
+ err = i915_ctx_workarounds_init(dev_priv);
+ if (err)
+ goto err;
+
INIT_LIST_HEAD(&dev_priv->contexts.list);
INIT_WORK(&dev_priv->contexts.free_work, contexts_free_worker);
init_llist_head(&dev_priv->contexts.free_list);
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index 57da318..f272b49 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -58,27 +58,8 @@ static int wa_add(struct drm_i915_private *dev_priv,
#define WA_SET_FIELD_MASKED(addr, mask, value) \
WA_REG(addr, mask, _MASKED_FIELD(mask, value))
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
- i915_reg_t reg)
-{
- struct drm_i915_private *dev_priv = engine->i915;
- struct i915_workarounds *wa = &dev_priv->workarounds;
- const uint32_t index = wa->hw_whitelist_count[engine->id];
-
- if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
- return -EINVAL;
-
- I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
- i915_mmio_reg_offset(reg));
- wa->hw_whitelist_count[engine->id]++;
-
- return 0;
-}
-
-static int gen8_init_workarounds(struct intel_engine_cs *engine)
+static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
-
WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
/* WaDisableAsyncFlipPerfMode:bdw,chv */
@@ -126,12 +107,11 @@ static int gen8_init_workarounds(struct intel_engine_cs *engine)
return 0;
}
-static int bdw_init_workarounds(struct intel_engine_cs *engine)
+static int bdw_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen8_init_workarounds(engine);
+ ret = gen8_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
@@ -158,12 +138,11 @@ static int bdw_init_workarounds(struct intel_engine_cs *engine)
return 0;
}
-static int chv_init_workarounds(struct intel_engine_cs *engine)
+static int chv_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen8_init_workarounds(engine);
+ ret = gen8_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
@@ -176,23 +155,8 @@ static int chv_init_workarounds(struct intel_engine_cs *engine)
return 0;
}
-static int gen9_init_workarounds(struct intel_engine_cs *engine)
+static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- /* WaConextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
- I915_WRITE(GEN9_CSFE_CHICKEN1_RCS, _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
-
- /* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
- I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
- GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
-
- /* WaDisableKillLogic:bxt,skl,kbl */
- if (!IS_COFFEELAKE(dev_priv))
- I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
- ECOCHK_DIS_TLB);
-
if (HAS_LLC(dev_priv)) {
/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
*
@@ -203,11 +167,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
GEN9_PBE_COMPRESSED_HASH_SELECTION);
WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
-
- I915_WRITE(MMCD_MISC_CTRL,
- I915_READ(MMCD_MISC_CTRL) |
- MMCD_PCLA |
- MMCD_HOTSPOT_EN);
}
/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
@@ -279,10 +238,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
WA_SET_BIT_MASKED(HDC_CHICKEN0,
HDC_FORCE_NON_COHERENT);
- /* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
- I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
- BDW_DISABLE_HDC_INVALIDATION);
-
/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
if (IS_SKYLAKE(dev_priv) ||
IS_KABYLAKE(dev_priv) ||
@@ -294,10 +249,6 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
- /* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
- I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
- GEN8_LQSC_FLUSH_COHERENT_LINES));
-
/*
* Supporting preemption with fine-granularity requires changes in the
* batch buffer programming. Since we can't break old userspace, we
@@ -316,29 +267,11 @@ static int gen9_init_workarounds(struct intel_engine_cs *engine)
WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
- /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
- ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
- if (ret)
- return ret;
-
- /* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
- I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
- ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
- if (ret)
- return ret;
-
- /* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
- ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
- if (ret)
- return ret;
-
return 0;
}
-static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
+static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
u8 vals[3] = { 0, 0, 0 };
unsigned int i;
@@ -377,67 +310,29 @@ static int skl_tune_iz_hashing(struct intel_engine_cs *engine)
return 0;
}
-static int skl_init_workarounds(struct intel_engine_cs *engine)
+static int skl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen9_init_workarounds(engine);
- if (ret)
- return ret;
-
- /* WaEnableGapsTsvCreditFix:skl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
-
- /* WaDisableGafsUnitClkGating:skl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
- /* WaInPlaceDecompressionHang:skl */
- if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- /* WaDisableLSQCROPERFforOCL:skl */
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ ret = gen9_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
- return skl_tune_iz_hashing(engine);
+ return skl_tune_iz_hashing(dev_priv);
}
-static int bxt_init_workarounds(struct intel_engine_cs *engine)
+static int bxt_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen9_init_workarounds(engine);
+ ret = gen9_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
- /* WaStoreMultiplePTEenable:bxt */
- /* This is a requirement according to Hardware specification */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
-
- /* WaSetClckGatingDisableMedia:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
- ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
- }
-
/* WaDisableThreadStallDopClockGating:bxt */
WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
STALL_DOP_GATING_DISABLE);
- /* WaDisablePooledEuLoadBalancingFix:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
- I915_WRITE(FF_SLICE_CS_CHICKEN2,
- _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
- }
-
/* WaDisableSbeCacheDispatchPortSharing:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
WA_SET_BIT_MASKED(
@@ -445,114 +340,22 @@ static int bxt_init_workarounds(struct intel_engine_cs *engine)
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
}
- /* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
- /* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
- /* WaDisableObjectLevelPreemtionForInstanceId:bxt */
- /* WaDisableLSQCROPERFforOCL:bxt */
- if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
- if (ret)
- return ret;
-
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
- }
-
- /* WaProgramL3SqcReg1DefaultForPerf:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
- I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
- L3_HIGH_PRIO_CREDITS(2));
-
/* WaToEnableHwFixForPushConstHWBug:bxt */
if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
- /* WaInPlaceDecompressionHang:bxt */
- if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
return 0;
}
-static int cnl_init_workarounds(struct intel_engine_cs *engine)
+static int kbl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
- int ret;
-
- /* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
- if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
- I915_WRITE(GAMT_CHKN_BIT_REG,
- (I915_READ(GAMT_CHKN_BIT_REG) |
- GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
-
- /* WaForceContextSaveRestoreNonCoherent:cnl */
- WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
- HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
- /* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
- if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
-
- /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- /* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
- if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
- GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
-
- /* WaInPlaceDecompressionHang:cnl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- /* WaPushConstantDereferenceHoldDisable:cnl */
- WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
- /* FtrEnableFastAnisoL1BankingFix: cnl */
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
- /* WaDisable3DMidCmdPreemption:cnl */
- WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
- /* WaDisableGPGPUMidCmdPreemption:cnl */
- WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
- GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
- /* WaEnablePreemptionGranularityControlByUMD:cnl */
- I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
- ret= wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static int kbl_init_workarounds(struct intel_engine_cs *engine)
-{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen9_init_workarounds(engine);
+ ret = gen9_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
- /* WaEnableGapsTsvCreditFix:kbl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
-
- /* WaDisableDynamicCreditSharing:kbl */
- if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GAMT_CHKN_BIT_REG,
- (I915_READ(GAMT_CHKN_BIT_REG) |
- GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
-
/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
WA_SET_BIT_MASKED(HDC_CHICKEN0,
@@ -563,34 +366,19 @@ static int kbl_init_workarounds(struct intel_engine_cs *engine)
WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
- /* WaDisableGafsUnitClkGating:kbl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
/* WaDisableSbeCacheDispatchPortSharing:kbl */
WA_SET_BIT_MASKED(
GEN7_HALF_SLICE_CHICKEN1,
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
- /* WaInPlaceDecompressionHang:kbl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
-
- /* WaDisableLSQCROPERFforOCL:kbl */
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
-
return 0;
}
-static int glk_init_workarounds(struct intel_engine_cs *engine)
+static int glk_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen9_init_workarounds(engine);
+ ret = gen9_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
@@ -601,77 +389,94 @@ static int glk_init_workarounds(struct intel_engine_cs *engine)
return 0;
}
-static int cfl_init_workarounds(struct intel_engine_cs *engine)
+static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int ret;
- ret = gen9_init_workarounds(engine);
+ ret = gen9_ctx_workarounds_init(dev_priv);
if (ret)
return ret;
- /* WaEnableGapsTsvCreditFix:cfl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
-
/* WaToEnableHwFixForPushConstHWBug:cfl */
WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
- /* WaDisableGafsUnitClkGating:cfl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
-
/* WaDisableSbeCacheDispatchPortSharing:cfl */
WA_SET_BIT_MASKED(
GEN7_HALF_SLICE_CHICKEN1,
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
- /* WaInPlaceDecompressionHang:cfl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+ return 0;
+}
+
+static int cnl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
+{
+ /* WaForceContextSaveRestoreNonCoherent:cnl */
+ WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+ HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
+
+ /* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
+ if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+ WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
+
+ /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
+
+ /* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
+ if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
+ WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
+
+ /* WaPushConstantDereferenceHoldDisable:cnl */
+ WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+
+ /* FtrEnableFastAnisoL1BankingFix:cnl */
+ WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+
+ /* WaDisable3DMidCmdPreemption:cnl */
+ WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+
+ /* WaDisableGPGPUMidCmdPreemption:cnl */
+ WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+ GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
return 0;
}
-int init_workarounds_ring(struct intel_engine_cs *engine)
+int i915_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- struct drm_i915_private *dev_priv = engine->i915;
int err;
- WARN_ON(engine->id != RCS);
-
dev_priv->workarounds.count = 0;
- dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
if (IS_BROADWELL(dev_priv))
- err = bdw_init_workarounds(engine);
+ err = bdw_ctx_workarounds_init(dev_priv);
else if (IS_CHERRYVIEW(dev_priv))
- err = chv_init_workarounds(engine);
+ err = chv_ctx_workarounds_init(dev_priv);
else if (IS_SKYLAKE(dev_priv))
- err = skl_init_workarounds(engine);
+ err = skl_ctx_workarounds_init(dev_priv);
else if (IS_BROXTON(dev_priv))
- err = bxt_init_workarounds(engine);
+ err = bxt_ctx_workarounds_init(dev_priv);
else if (IS_KABYLAKE(dev_priv))
- err = kbl_init_workarounds(engine);
+ err = kbl_ctx_workarounds_init(dev_priv);
else if (IS_GEMINILAKE(dev_priv))
- err = glk_init_workarounds(engine);
+ err = glk_ctx_workarounds_init(dev_priv);
else if (IS_COFFEELAKE(dev_priv))
- err = cfl_init_workarounds(engine);
+ err = cfl_ctx_workarounds_init(dev_priv);
else if (IS_CANNONLAKE(dev_priv))
- err = cnl_init_workarounds(engine);
+ err = cnl_ctx_workarounds_init(dev_priv);
else
err = 0;
if (err)
return err;
- DRM_DEBUG_DRIVER("%s: Number of context specific w/a: %d\n",
- engine->name, dev_priv->workarounds.count);
+ DRM_DEBUG_DRIVER("Number of context specific w/a: %d\n",
+ dev_priv->workarounds.count);
return 0;
}
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
+int i915_ctx_workarounds_emit(struct drm_i915_gem_request *req)
{
struct i915_workarounds *w = &req->i915->workarounds;
u32 *cs;
@@ -703,3 +508,329 @@ int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
return 0;
}
+
+static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ if (HAS_LLC(dev_priv)) {
+ /* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
+ *
+ * Must match Display Engine. See
+ * WaCompressedResourceDisplayNewHashMode.
+ */
+ I915_WRITE(MMCD_MISC_CTRL,
+ I915_READ(MMCD_MISC_CTRL) |
+ MMCD_PCLA |
+ MMCD_HOTSPOT_EN);
+ }
+
+ /* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
+ I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
+ _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+
+ /* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
+ I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+ GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+
+ /* WaDisableKillLogic:bxt,skl,kbl */
+ if (!IS_COFFEELAKE(dev_priv))
+ I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+ ECOCHK_DIS_TLB);
+
+ /* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
+ I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
+ BDW_DISABLE_HDC_INVALIDATION);
+
+ /* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
+ I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
+ GEN8_LQSC_FLUSH_COHERENT_LINES));
+
+ /* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+ I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+}
+
+static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ gen9_mmio_workarounds_apply(dev_priv);
+
+ /* WaEnableGapsTsvCreditFix:skl */
+ I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+ GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+ /* WaDisableGafsUnitClkGating:skl */
+ I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+ GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+ /* WaInPlaceDecompressionHang:skl */
+ if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ gen9_mmio_workarounds_apply(dev_priv);
+
+ /* WaStoreMultiplePTEenable:bxt */
+ /* This is a requirement according to Hardware specification */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
+ I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+
+ /* WaSetClckGatingDisableMedia:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+ I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
+ ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+ }
+
+ /* WaDisablePooledEuLoadBalancingFix:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
+ I915_WRITE(FF_SLICE_CS_CHICKEN2,
+ _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+ }
+
+ /* WaProgramL3SqcReg1DefaultForPerf:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
+ I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
+ L3_HIGH_PRIO_CREDITS(2));
+
+ /* WaInPlaceDecompressionHang:bxt */
+ if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ gen9_mmio_workarounds_apply(dev_priv);
+
+ /* WaEnableGapsTsvCreditFix:kbl */
+ I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+ GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+ /* WaDisableDynamicCreditSharing:kbl */
+ if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+ I915_WRITE(GAMT_CHKN_BIT_REG,
+ (I915_READ(GAMT_CHKN_BIT_REG) |
+ GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+
+ /* WaDisableGafsUnitClkGating:kbl */
+ I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+ GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+ /* WaInPlaceDecompressionHang:kbl */
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ gen9_mmio_workarounds_apply(dev_priv);
+}
+
+static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ gen9_mmio_workarounds_apply(dev_priv);
+
+ /* WaEnableGapsTsvCreditFix:cfl */
+ I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
+ GEN9_GAPS_TSV_CREDIT_DISABLE));
+
+ /* WaDisableGafsUnitClkGating:cfl */
+ I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
+ GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+
+ /* WaInPlaceDecompressionHang:cfl */
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+}
+
+static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ /* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
+ if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
+ I915_WRITE(GAMT_CHKN_BIT_REG,
+ (I915_READ(GAMT_CHKN_BIT_REG) |
+ GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+
+ /* WaInPlaceDecompressionHang:cnl */
+ I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
+ (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+
+ /* WaEnablePreemptionGranularityControlByUMD:cnl */
+ I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
+ _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+}
+
+void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ if (IS_SKYLAKE(dev_priv))
+ skl_mmio_workarounds_apply(dev_priv);
+ else if (IS_BROXTON(dev_priv))
+ bxt_mmio_workarounds_apply(dev_priv);
+ else if (IS_KABYLAKE(dev_priv))
+ kbl_mmio_workarounds_apply(dev_priv);
+ else if (IS_GEMINILAKE(dev_priv))
+ glk_mmio_workarounds_apply(dev_priv);
+ else if (IS_COFFEELAKE(dev_priv))
+ cfl_mmio_workarounds_apply(dev_priv);
+ else if (IS_CANNONLAKE(dev_priv))
+ cnl_mmio_workarounds_apply(dev_priv);
+}
+
+static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
+ i915_reg_t reg)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ struct i915_workarounds *wa = &dev_priv->workarounds;
+ const uint32_t index = wa->hw_whitelist_count[engine->id];
+
+ if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+ return -EINVAL;
+
+ I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
+ i915_mmio_reg_offset(reg));
+ wa->hw_whitelist_count[engine->id]++;
+
+ return 0;
+}
+
+static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ int ret;
+
+ /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
+ ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
+ if (ret)
+ return ret;
+
+ /* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
+ ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+ if (ret)
+ return ret;
+
+ /* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ int ret = gen9_whitelist_workarounds_apply(engine);
+ if (ret)
+ return ret;
+
+ /* WaDisableLSQCROPERFforOCL:skl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+
+ int ret = gen9_whitelist_workarounds_apply(engine);
+ if (ret)
+ return ret;
+
+ /* WaDisableObjectLevelPreemptionForTrifanOrPolygon:bxt */
+ /* WaDisableObjectLevelPreemptionForInstancedDraw:bxt */
+ /* WaDisableObjectLevelPreemtionForInstanceId:bxt */
+ /* WaDisableLSQCROPERFforOCL:bxt */
+ if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
+ ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
+ if (ret)
+ return ret;
+
+ ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ int ret = gen9_whitelist_workarounds_apply(engine);
+ if (ret)
+ return ret;
+
+ /* WaDisableLSQCROPERFforOCL:kbl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ int ret = gen9_whitelist_workarounds_apply(engine);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ int ret = gen9_whitelist_workarounds_apply(engine);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ int ret;
+
+ /* WaEnablePreemptionGranularityControlByUMD:cnl */
+ ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+int i915_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ int err;
+
+ WARN_ON(engine->id != RCS);
+
+ dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+
+ if (IS_SKYLAKE(dev_priv))
+ err = skl_whitelist_workarounds_apply(engine);
+ else if (IS_BROXTON(dev_priv))
+ err = bxt_whitelist_workarounds_apply(engine);
+ else if (IS_KABYLAKE(dev_priv))
+ err = kbl_whitelist_workarounds_apply(engine);
+ else if (IS_GEMINILAKE(dev_priv))
+ err = glk_whitelist_workarounds_apply(engine);
+ else if (IS_COFFEELAKE(dev_priv))
+ err = cfl_whitelist_workarounds_apply(engine);
+ else if (IS_CANNONLAKE(dev_priv))
+ err = cnl_whitelist_workarounds_apply(engine);
+ else
+ err = 0;
+ if (err)
+ return err;
+
+ DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
+ dev_priv->workarounds.hw_whitelist_count[engine->id]);
+ return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_workarounds.h b/drivers/gpu/drm/i915/i915_workarounds.h
index 27099fc..3944432 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.h
+++ b/drivers/gpu/drm/i915/i915_workarounds.h
@@ -25,7 +25,11 @@
#ifndef _I915_WORKAROUNDS_H_
#define _I915_WORKAROUNDS_H_
-int init_workarounds_ring(struct intel_engine_cs *engine);
-int intel_ring_workarounds_emit(struct drm_i915_gem_request *req);
+int i915_ctx_workarounds_init(struct drm_i915_private *dev_priv);
+int i915_ctx_workarounds_emit(struct drm_i915_gem_request *req);
+
+void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv);
+
+int i915_whitelist_workarounds_apply(struct intel_engine_cs *engine);
#endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 07978fd..4816dd6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1488,7 +1488,7 @@ static int gen8_init_render_ring(struct intel_engine_cs *engine)
I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
- return init_workarounds_ring(engine);
+ return 0;
}
static int gen9_init_render_ring(struct intel_engine_cs *engine)
@@ -1499,7 +1499,11 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
if (ret)
return ret;
- return init_workarounds_ring(engine);
+ ret = i915_whitelist_workarounds_apply(engine);
+ if (ret)
+ return ret;
+
+ return 0;
}
static void reset_common_ring(struct intel_engine_cs *engine,
@@ -1827,7 +1831,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
{
int ret;
- ret = intel_ring_workarounds_emit(req);
+ ret = i915_ctx_workarounds_emit(req);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bb11aac..7413443 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -638,7 +638,7 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
{
int ret;
- ret = intel_ring_workarounds_emit(req);
+ ret = i915_ctx_workarounds_emit(req);
if (ret != 0)
return ret;
@@ -697,7 +697,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
if (INTEL_INFO(dev_priv)->gen >= 6)
I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
- return init_workarounds_ring(engine);
+ return 0;
}
static void render_ring_cleanup(struct intel_engine_cs *engine)
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 04/11] drm/i915: Move workarounds from init_clock_gating
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (2 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 03/11] drm/i915: Split out functions for different kinds of workarounds Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs Oscar Mateo
` (8 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx; +Cc: Rodrigo Vivi
I'm not sure why some WAs have historically been applied in init_clock_gating
and some others in the engine setup (GT vs. display? context vs. global
registers?) but it does not look like the best place to apply workarounds:
the name is confusing, it's a display function (even though some GT WAs
also go here) and it isn't necessarily called on a GPU reset. This patch
moves these WAs to their rightful place inside i915_workarounds.c.
TODO1: It's straightforward to move things from Gen9 onwards, but in
order to move Gen8 and older, I'd need some guidance on what constitutes
clock gating and what is simply a WA.
TODO2: Do we want to keep display WAs separated from GT ones? In that case,
I would propose another category inside i915_workarounds.c (but I would need
help deciding what goes where).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_workarounds.c | 146 ++++++++++++++++++++++++-
drivers/gpu/drm/i915/intel_pm.c | 188 +-------------------------------
2 files changed, 151 insertions(+), 183 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index f272b49..9ae89e4 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -509,6 +509,14 @@ int i915_ctx_workarounds_emit(struct drm_i915_gem_request *req)
return 0;
}
+static void bdw_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+}
+
+static void chv_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+}
+
static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
if (HAS_LLC(dev_priv)) {
@@ -521,8 +529,40 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
I915_READ(MMCD_MISC_CTRL) |
MMCD_PCLA |
MMCD_HOTSPOT_EN);
+
+ /*
+ * WaCompressedResourceDisplayNewHashMode:skl,kbl
+ * Display WA#0390: skl,kbl
+ *
+ * Must match Sampler, Pixel Back End, and Media. See
+ * WaCompressedResourceSamplerPbeMediaNewHashMode.
+ */
+ I915_WRITE(CHICKEN_PAR1_1,
+ I915_READ(CHICKEN_PAR1_1) |
+ SKL_DE_COMPRESSED_HASH_MODE);
}
+ /* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
+ I915_WRITE(CHICKEN_PAR1_1,
+ I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
+
+ I915_WRITE(GEN8_CONFIG0,
+ I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
+
+ /* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
+ I915_WRITE(GEN8_CHICKEN_DCPR_1,
+ I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+
+ /* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
+ /* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
+ I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
+ DISP_FBC_WM_DIS |
+ DISP_FBC_MEMORY_WAKE);
+
+ /* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
+ I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
+ ILK_DPFC_DISABLE_DUMMY0);
+
/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
_MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
@@ -553,6 +593,18 @@ static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
gen9_mmio_workarounds_apply(dev_priv);
+ /* WaDisableDopClockGating */
+ I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
+ & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+
+ /* WAC6entrylatency:skl */
+ I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
+ FBC_LLC_FULLY_OPEN);
+
+ /* WaFbcNukeOnHostModify:skl */
+ I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
+ ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+
/* WaEnableGapsTsvCreditFix:skl */
I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
GEN9_GAPS_TSV_CREDIT_DISABLE));
@@ -572,6 +624,24 @@ static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
gen9_mmio_workarounds_apply(dev_priv);
+ /* WaDisableSDEUnitClockGating:bxt */
+ I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+ GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+
+ /*
+ * FIXME:
+ * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
+ */
+ I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+ GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
+
+ /*
+ * Wa: Backlight PWM may stop in the asserted state, causing backlight
+ * to stay fully on.
+ */
+ I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
+ PWM1_GATING_DIS | PWM2_GATING_DIS);
+
/* WaStoreMultiplePTEenable:bxt */
/* This is a requirement according to Hardware specification */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
@@ -605,6 +675,20 @@ static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
gen9_mmio_workarounds_apply(dev_priv);
+ /* WaDisableSDEUnitClockGating:kbl */
+ if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+ I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+ GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+
+ /* WaDisableGamClockGating:kbl */
+ if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
+ I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
+ GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
+
+ /* WaFbcNukeOnHostModify:kbl */
+ I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
+ ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+
/* WaEnableGapsTsvCreditFix:kbl */
I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
GEN9_GAPS_TSV_CREDIT_DISABLE));
@@ -627,13 +711,43 @@ static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
+ u32 val;
+
gen9_mmio_workarounds_apply(dev_priv);
+
+ /*
+ * WaDisablePWMClockGating:glk
+ * Backlight PWM may stop in the asserted state, causing backlight
+ * to stay fully on.
+ */
+ I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
+ PWM1_GATING_DIS | PWM2_GATING_DIS);
+
+ /* WaDDIIOTimeout:glk */
+ if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
+ u32 val = I915_READ(CHICKEN_MISC_2);
+ val &= ~(GLK_CL0_PWR_DOWN |
+ GLK_CL1_PWR_DOWN |
+ GLK_CL2_PWR_DOWN);
+ I915_WRITE(CHICKEN_MISC_2, val);
+ }
+
+ /* Display WA #1133: WaFbcSkipSegments:glk */
+ val = I915_READ(ILK_DPFC_CHICKEN);
+ val &= ~GLK_SKIP_SEG_COUNT_MASK;
+ val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
+ I915_WRITE(ILK_DPFC_CHICKEN, val);
}
static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
gen9_mmio_workarounds_apply(dev_priv);
+ /* WaFbcNukeOnHostModify:cfl */
+ I915_WRITE(ILK_DPFC_CHICKEN,
+ I915_READ(ILK_DPFC_CHICKEN) |
+ ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+
/* WaEnableGapsTsvCreditFix:cfl */
I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
GEN9_GAPS_TSV_CREDIT_DISABLE));
@@ -650,6 +764,32 @@ static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
+ u32 val;
+
+ /* This is not an Wa. Enable for better image quality */
+ I915_WRITE(_3D_CHICKEN3,
+ _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
+
+ /* WaEnableChickenDCPR:cnl */
+ I915_WRITE(GEN8_CHICKEN_DCPR_1,
+ I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+
+ /* WaFbcWakeMemOn:cnl */
+ I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
+ DISP_FBC_MEMORY_WAKE);
+
+ /* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
+ if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
+ I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
+ I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
+ SARBUNIT_CLKGATE_DIS);
+
+ /* Display WA #1133: WaFbcSkipSegments:cnl */
+ val = I915_READ(ILK_DPFC_CHICKEN);
+ val &= ~GLK_SKIP_SEG_COUNT_MASK;
+ val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
+ I915_WRITE(ILK_DPFC_CHICKEN, val);
+
/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
I915_WRITE(GAMT_CHKN_BIT_REG,
@@ -668,7 +808,11 @@ static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
{
- if (IS_SKYLAKE(dev_priv))
+ if (IS_BROADWELL(dev_priv))
+ bdw_mmio_workarounds_apply(dev_priv);
+ else if (IS_CHERRYVIEW(dev_priv))
+ chv_mmio_workarounds_apply(dev_priv);
+ else if (IS_SKYLAKE(dev_priv))
skl_mmio_workarounds_apply(dev_priv);
else if (IS_BROXTON(dev_priv))
bxt_mmio_workarounds_apply(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9d0ca26..243e4b8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -56,101 +56,6 @@
#define INTEL_RC6p_ENABLE (1<<1)
#define INTEL_RC6pp_ENABLE (1<<2)
-static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- if (HAS_LLC(dev_priv)) {
- /*
- * WaCompressedResourceDisplayNewHashMode:skl,kbl
- * Display WA#0390: skl,kbl
- *
- * Must match Sampler, Pixel Back End, and Media. See
- * WaCompressedResourceSamplerPbeMediaNewHashMode.
- */
- I915_WRITE(CHICKEN_PAR1_1,
- I915_READ(CHICKEN_PAR1_1) |
- SKL_DE_COMPRESSED_HASH_MODE);
- }
-
- /* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
- I915_WRITE(CHICKEN_PAR1_1,
- I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
-
- I915_WRITE(GEN8_CONFIG0,
- I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
-
- /* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
- I915_WRITE(GEN8_CHICKEN_DCPR_1,
- I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-
- /* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
- /* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
- I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
- DISP_FBC_WM_DIS |
- DISP_FBC_MEMORY_WAKE);
-
- /* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_DISABLE_DUMMY0);
-
- if (IS_SKYLAKE(dev_priv)) {
- /* WaDisableDopClockGating */
- I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
- & ~GEN7_DOP_CLOCK_GATE_ENABLE);
- }
-}
-
-static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- gen9_init_clock_gating(dev_priv);
-
- /* WaDisableSDEUnitClockGating:bxt */
- I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
- GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
- /*
- * FIXME:
- * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
- */
- I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
- GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
-
- /*
- * Wa: Backlight PWM may stop in the asserted state, causing backlight
- * to stay fully on.
- */
- I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
- PWM1_GATING_DIS | PWM2_GATING_DIS);
-}
-
-static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- u32 val;
- gen9_init_clock_gating(dev_priv);
-
- /*
- * WaDisablePWMClockGating:glk
- * Backlight PWM may stop in the asserted state, causing backlight
- * to stay fully on.
- */
- I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
- PWM1_GATING_DIS | PWM2_GATING_DIS);
-
- /* WaDDIIOTimeout:glk */
- if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
- u32 val = I915_READ(CHICKEN_MISC_2);
- val &= ~(GLK_CL0_PWR_DOWN |
- GLK_CL1_PWR_DOWN |
- GLK_CL2_PWR_DOWN);
- I915_WRITE(CHICKEN_MISC_2, val);
- }
-
- /* Display WA #1133: WaFbcSkipSegments:glk */
- val = I915_READ(ILK_DPFC_CHICKEN);
- val &= ~GLK_SKIP_SEG_COUNT_MASK;
- val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
- I915_WRITE(ILK_DPFC_CHICKEN, val);
-}
-
static void i915_pineview_get_mem_freq(struct drm_i915_private *dev_priv)
{
u32 tmp;
@@ -8379,78 +8284,6 @@ static void cnp_init_clock_gating(struct drm_i915_private *dev_priv)
CNP_PWM_CGE_GATING_DISABLE);
}
-static void cnl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- u32 val;
- cnp_init_clock_gating(dev_priv);
-
- /* This is not an Wa. Enable for better image quality */
- I915_WRITE(_3D_CHICKEN3,
- _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
-
- /* WaEnableChickenDCPR:cnl */
- I915_WRITE(GEN8_CHICKEN_DCPR_1,
- I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
-
- /* WaFbcWakeMemOn:cnl */
- I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
- DISP_FBC_MEMORY_WAKE);
-
- /* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
- if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
- I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
- I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
- SARBUNIT_CLKGATE_DIS);
-
- /* Display WA #1133: WaFbcSkipSegments:cnl */
- val = I915_READ(ILK_DPFC_CHICKEN);
- val &= ~GLK_SKIP_SEG_COUNT_MASK;
- val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
- I915_WRITE(ILK_DPFC_CHICKEN, val);
-}
-
-static void cfl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- cnp_init_clock_gating(dev_priv);
- gen9_init_clock_gating(dev_priv);
-
- /* WaFbcNukeOnHostModify:cfl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
-static void kbl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- gen9_init_clock_gating(dev_priv);
-
- /* WaDisableSDEUnitClockGating:kbl */
- if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
- GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
-
- /* WaDisableGamClockGating:kbl */
- if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
- GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
-
- /* WaFbcNukeOnHostModify:kbl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
-static void skl_init_clock_gating(struct drm_i915_private *dev_priv)
-{
- gen9_init_clock_gating(dev_priv);
-
- /* WAC6entrylatency:skl */
- I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
- FBC_LLC_FULLY_OPEN);
-
- /* WaFbcNukeOnHostModify:skl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
-}
-
static void bdw_init_clock_gating(struct drm_i915_private *dev_priv)
{
/* The GTT cache must be disabled if the system is using 2M pages. */
@@ -8892,24 +8725,15 @@ static void nop_init_clock_gating(struct drm_i915_private *dev_priv)
* @dev_priv: device private
*
* Setup the hooks that configure which clocks of a given platform can be
- * gated and also apply various GT and display specific workarounds for these
- * platforms. Note that some GT specific workarounds are applied separately
- * when GPU contexts or batchbuffers start their execution.
+ * gated.
*/
void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
{
- if (IS_CANNONLAKE(dev_priv))
- dev_priv->display.init_clock_gating = cnl_init_clock_gating;
- else if (IS_COFFEELAKE(dev_priv))
- dev_priv->display.init_clock_gating = cfl_init_clock_gating;
- else if (IS_SKYLAKE(dev_priv))
- dev_priv->display.init_clock_gating = skl_init_clock_gating;
- else if (IS_KABYLAKE(dev_priv))
- dev_priv->display.init_clock_gating = kbl_init_clock_gating;
- else if (IS_BROXTON(dev_priv))
- dev_priv->display.init_clock_gating = bxt_init_clock_gating;
- else if (IS_GEMINILAKE(dev_priv))
- dev_priv->display.init_clock_gating = glk_init_clock_gating;
+ if (IS_CANNONLAKE(dev_priv) || IS_COFFEELAKE(dev_priv))
+ dev_priv->display.init_clock_gating = cnp_init_clock_gating;
+ else if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
+ IS_BROXTON(dev_priv) || IS_GEMINILAKE(dev_priv))
+ dev_priv->display.init_clock_gating = nop_init_clock_gating;
else if (IS_BROADWELL(dev_priv))
dev_priv->display.init_clock_gating = bdw_init_clock_gating;
else if (IS_CHERRYVIEW(dev_priv))
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (3 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 04/11] drm/i915: Move workarounds from init_clock_gating Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time Oscar Mateo
` (7 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
Some WAs touch registers that get saved/restored together with the logical context.
Make this very explicit by renaming a few things in the code.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 10 +--
drivers/gpu/drm/i915/i915_drv.h | 5 +-
drivers/gpu/drm/i915/i915_workarounds.c | 142 ++++++++++++++++----------------
3 files changed, 79 insertions(+), 78 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f7817c6..a204896 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3557,18 +3557,18 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
intel_runtime_pm_get(dev_priv);
- seq_printf(m, "Workarounds applied: %d\n", workarounds->count);
+ seq_printf(m, "Context workarounds applied: %d\n", workarounds->ctx_wa_count);
for_each_engine(engine, dev_priv, id)
seq_printf(m, "HW whitelist count for %s: %d\n",
engine->name, workarounds->hw_whitelist_count[id]);
- for (i = 0; i < workarounds->count; ++i) {
+ for (i = 0; i < workarounds->ctx_wa_count; ++i) {
i915_reg_t addr;
u32 mask, value, read;
bool ok;
- addr = workarounds->reg[i].addr;
- mask = workarounds->reg[i].mask;
- value = workarounds->reg[i].value;
+ addr = workarounds->ctx_wa_reg[i].addr;
+ mask = workarounds->ctx_wa_reg[i].mask;
+ value = workarounds->ctx_wa_reg[i].value;
read = I915_READ(addr);
ok = (value & mask) == (read & mask);
seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 47a357c..0bbe128 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1957,8 +1957,9 @@ struct i915_wa_reg {
#define I915_MAX_WA_REGS 16
struct i915_workarounds {
- struct i915_wa_reg reg[I915_MAX_WA_REGS];
- u32 count;
+ struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
+ u32 ctx_wa_count;
+
u32 hw_whitelist_count[I915_NUM_ENGINES];
};
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index 9ae89e4..21996e1 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -25,48 +25,48 @@
#include "i915_drv.h"
#include "i915_workarounds.h"
-static int wa_add(struct drm_i915_private *dev_priv,
- i915_reg_t addr,
- const u32 mask, const u32 val)
+static int ctx_wa_add(struct drm_i915_private *dev_priv,
+ i915_reg_t addr,
+ const u32 mask, const u32 val)
{
- const u32 idx = dev_priv->workarounds.count;
+ const u32 idx = dev_priv->workarounds.ctx_wa_count;
if (WARN_ON(idx >= I915_MAX_WA_REGS))
return -ENOSPC;
- dev_priv->workarounds.reg[idx].addr = addr;
- dev_priv->workarounds.reg[idx].value = val;
- dev_priv->workarounds.reg[idx].mask = mask;
+ dev_priv->workarounds.ctx_wa_reg[idx].addr = addr;
+ dev_priv->workarounds.ctx_wa_reg[idx].value = val;
+ dev_priv->workarounds.ctx_wa_reg[idx].mask = mask;
- dev_priv->workarounds.count++;
+ dev_priv->workarounds.ctx_wa_count++;
return 0;
}
-#define WA_REG(addr, mask, val) do { \
- const int r = wa_add(dev_priv, (addr), (mask), (val)); \
+#define CTXWA_REG(addr, mask, val) do { \
+ const int r = ctx_wa_add(dev_priv, (addr), (mask), (val)); \
if (r) \
return r; \
} while (0)
-#define WA_SET_BIT_MASKED(addr, mask) \
- WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+#define CTXWA_SET_BIT_MSK(addr, mask) \
+ CTXWA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
-#define WA_CLR_BIT_MASKED(addr, mask) \
- WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+#define CTXWA_CLR_BIT_MSK(addr, mask) \
+ CTXWA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
- WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+#define CTXWA_SET_FIELD_MSK(addr, mask, value) \
+ CTXWA_REG(addr, mask, _MASKED_FIELD(mask, value))
static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
- WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+ CTXWA_SET_BIT_MSK(INSTPM, INSTPM_FORCE_ORDERING);
/* WaDisableAsyncFlipPerfMode:bdw,chv */
- WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+ CTXWA_SET_BIT_MSK(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
/* WaDisablePartialInstShootdown:bdw,chv */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+ CTXWA_SET_BIT_MSK(GEN8_ROW_CHICKEN,
PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
/* Use Force Non-Coherent whenever executing a 3D context. This is a
@@ -75,7 +75,7 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
*/
/* WaForceEnableNonCoherent:bdw,chv */
/* WaHdcDisableFetchWhenMasked:bdw,chv */
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ CTXWA_SET_BIT_MSK(HDC_CHICKEN0,
HDC_DONOT_FETCH_MEM_WHEN_MASKED |
HDC_FORCE_NON_COHERENT);
@@ -87,10 +87,10 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
*
* This optimization is off by default for BDW and CHV; turn it on.
*/
- WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+ CTXWA_CLR_BIT_MSK(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
/* Wa4x4STCOptimizationDisable:bdw,chv */
- WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+ CTXWA_SET_BIT_MSK(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
/*
* BSpec recommends 8x4 when MSAA is used,
@@ -100,7 +100,7 @@ static int gen8_ctx_workarounds_init(struct drm_i915_private *dev_priv)
* disable bit, which we don't touch here, but it's good
* to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
*/
- WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+ CTXWA_SET_FIELD_MSK(GEN7_GT_MODE,
GEN6_WIZ_HASHING_MASK,
GEN6_WIZ_HASHING_16x4);
@@ -116,20 +116,20 @@ static int bdw_ctx_workarounds_init(struct drm_i915_private *dev_priv)
return ret;
/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+ CTXWA_SET_BIT_MSK(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
/* WaDisableDopClockGating:bdw
*
* Also see the related UCGTCL1 write in broadwell_init_clock_gating()
* to disable EUTC clock gating.
*/
- WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+ CTXWA_SET_BIT_MSK(GEN7_ROW_CHICKEN2,
DOP_CLOCK_GATING_DISABLE);
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+ CTXWA_SET_BIT_MSK(HALF_SLICE_CHICKEN3,
GEN8_SAMPLER_POWER_BYPASS_DIS);
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ CTXWA_SET_BIT_MSK(HDC_CHICKEN0,
/* WaForceContextSaveRestoreNonCoherent:bdw */
HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
/* WaDisableFenceDestinationToSLM:bdw (pre-prod) */
@@ -147,10 +147,10 @@ static int chv_ctx_workarounds_init(struct drm_i915_private *dev_priv)
return ret;
/* WaDisableThreadStallDopClockGating:chv */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+ CTXWA_SET_BIT_MSK(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
/* Improve HiZ throughput on CHV. */
- WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
+ CTXWA_SET_BIT_MSK(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
return 0;
}
@@ -163,31 +163,31 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
* Must match Display Engine. See
* WaCompressedResourceDisplayNewHashMode.
*/
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN9_PBE_COMPRESSED_HASH_SELECTION);
- WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+ CTXWA_SET_BIT_MSK(GEN9_HALF_SLICE_CHICKEN7,
GEN9_SAMPLER_HASH_COMPRESSED_READ_ADDR);
}
/* WaClearFlowControlGpgpuContextSave:skl,bxt,kbl,glk,cfl */
/* WaDisablePartialInstShootdown:skl,bxt,kbl,glk,cfl */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+ CTXWA_SET_BIT_MSK(GEN8_ROW_CHICKEN,
FLOW_CONTROL_ENABLE |
PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
/* Syncing dependencies between camera and graphics:skl,bxt,kbl */
if (!IS_COFFEELAKE(dev_priv))
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+ CTXWA_SET_BIT_MSK(HALF_SLICE_CHICKEN3,
GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC);
/* WaDisableDgMirrorFixInHalfSliceChicken5:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+ CTXWA_CLR_BIT_MSK(GEN9_HALF_SLICE_CHICKEN5,
GEN9_DG_MIRROR_FIX_ENABLE);
/* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- WA_SET_BIT_MASKED(GEN7_COMMON_SLICE_CHICKEN1,
+ CTXWA_SET_BIT_MSK(GEN7_COMMON_SLICE_CHICKEN1,
GEN9_RHWO_OPTIMIZATION_DISABLE);
/*
* WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
@@ -198,26 +198,26 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
/* WaEnableYV12BugFixInHalfSliceChicken7:skl,bxt,kbl,glk,cfl */
/* WaEnableSamplerGPGPUPreemptionSupport:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN7,
+ CTXWA_SET_BIT_MSK(GEN9_HALF_SLICE_CHICKEN7,
GEN9_ENABLE_YV12_BUGFIX |
GEN9_ENABLE_GPGPU_PREEMPTION);
/* Wa4x4STCOptimizationDisable:skl,bxt,kbl,glk,cfl */
/* WaDisablePartialResolveInVc:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
+ CTXWA_SET_BIT_MSK(CACHE_MODE_1, (GEN8_4x4_STC_OPTIMIZATION_DISABLE |
GEN9_PARTIAL_RESOLVE_IN_VC_DISABLE));
/* WaCcsTlbPrefetchDisable:skl,bxt,kbl,glk,cfl */
- WA_CLR_BIT_MASKED(GEN9_HALF_SLICE_CHICKEN5,
+ CTXWA_CLR_BIT_MSK(GEN9_HALF_SLICE_CHICKEN5,
GEN9_CCS_TLB_PREFETCH_ENABLE);
/* WaDisableMaskBasedCammingInRCC:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- WA_SET_BIT_MASKED(SLICE_ECO_CHICKEN0,
+ CTXWA_SET_BIT_MSK(SLICE_ECO_CHICKEN0,
PIXEL_MASK_CAMMING_DISABLE);
/* WaForceContextSaveRestoreNonCoherent:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ CTXWA_SET_BIT_MSK(HDC_CHICKEN0,
HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT |
HDC_FORCE_CSR_NON_COHERENT_OVR_DISABLE);
@@ -235,7 +235,7 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
*/
/* WaForceEnableNonCoherent:skl,bxt,kbl,cfl */
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ CTXWA_SET_BIT_MSK(HDC_CHICKEN0,
HDC_FORCE_NON_COHERENT);
/* WaDisableSamplerPowerBypassForSOPingPong:skl,bxt,kbl,cfl */
@@ -243,11 +243,11 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
IS_KABYLAKE(dev_priv) ||
IS_COFFEELAKE(dev_priv) ||
IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0))
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3,
+ CTXWA_SET_BIT_MSK(HALF_SLICE_CHICKEN3,
GEN8_SAMPLER_POWER_BYPASS_DIS);
/* WaDisableSTUnitPowerOptimization:skl,bxt,kbl,glk,cfl */
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
+ CTXWA_SET_BIT_MSK(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
/*
* Supporting preemption with fine-granularity requires changes in the
@@ -261,10 +261,10 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
*/
/* WaDisable3DMidCmdPreemption:skl,bxt,glk,cfl,[cnl] */
- WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+ CTXWA_CLR_BIT_MSK(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
/* WaDisableGPGPUMidCmdPreemption:skl,bxt,blk,cfl,[cnl] */
- WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+ CTXWA_SET_FIELD_MSK(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
return 0;
@@ -299,7 +299,7 @@ static int skl_tune_iz_hashing(struct drm_i915_private *dev_priv)
return 0;
/* Tune IZ hashing. See intel_device_info_runtime_init() */
- WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+ CTXWA_SET_FIELD_MSK(GEN7_GT_MODE,
GEN9_IZ_HASHING_MASK(2) |
GEN9_IZ_HASHING_MASK(1) |
GEN9_IZ_HASHING_MASK(0),
@@ -330,19 +330,19 @@ static int bxt_ctx_workarounds_init(struct drm_i915_private *dev_priv)
return ret;
/* WaDisableThreadStallDopClockGating:bxt */
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+ CTXWA_SET_BIT_MSK(GEN8_ROW_CHICKEN,
STALL_DOP_GATING_DISABLE);
/* WaDisableSbeCacheDispatchPortSharing:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_B0)) {
- WA_SET_BIT_MASKED(
+ CTXWA_SET_BIT_MSK(
GEN7_HALF_SLICE_CHICKEN1,
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
}
/* WaToEnableHwFixForPushConstHWBug:bxt */
if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
return 0;
@@ -358,16 +358,16 @@ static int kbl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
/* WaDisableFenceDestinationToSLM:kbl (pre-prod) */
if (IS_KBL_REVID(dev_priv, KBL_REVID_A0, KBL_REVID_A0))
- WA_SET_BIT_MASKED(HDC_CHICKEN0,
+ CTXWA_SET_BIT_MSK(HDC_CHICKEN0,
HDC_FENCE_DEST_SLM_DISABLE);
/* WaToEnableHwFixForPushConstHWBug:kbl */
if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
/* WaDisableSbeCacheDispatchPortSharing:kbl */
- WA_SET_BIT_MASKED(
+ CTXWA_SET_BIT_MSK(
GEN7_HALF_SLICE_CHICKEN1,
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
@@ -383,7 +383,7 @@ static int glk_ctx_workarounds_init(struct drm_i915_private *dev_priv)
return ret;
/* WaToEnableHwFixForPushConstHWBug:glk */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
return 0;
@@ -398,11 +398,11 @@ static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
return ret;
/* WaToEnableHwFixForPushConstHWBug:cfl */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
/* WaDisableSbeCacheDispatchPortSharing:cfl */
- WA_SET_BIT_MASKED(
+ CTXWA_SET_BIT_MSK(
GEN7_HALF_SLICE_CHICKEN1,
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
@@ -412,33 +412,33 @@ static int cfl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
static int cnl_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
/* WaForceContextSaveRestoreNonCoherent:cnl */
- WA_SET_BIT_MASKED(CNL_HDC_CHICKEN0,
+ CTXWA_SET_BIT_MSK(CNL_HDC_CHICKEN0,
HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
/* WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod) */
if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
- WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, THROTTLE_12_5);
+ CTXWA_SET_BIT_MSK(GEN8_ROW_CHICKEN, THROTTLE_12_5);
/* WaDisableReplayBufferBankArbitrationOptimization:cnl */
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
/* WaDisableEnhancedSBEVertexCaching:cnl (pre-prod) */
if (IS_CNL_REVID(dev_priv, 0, CNL_REVID_B0))
- WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
+ CTXWA_SET_BIT_MSK(COMMON_SLICE_CHICKEN2,
GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE);
/* WaPushConstantDereferenceHoldDisable:cnl */
- WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
+ CTXWA_SET_BIT_MSK(GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
/* FtrEnableFastAnisoL1BankingFix:cnl */
- WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
+ CTXWA_SET_BIT_MSK(HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
/* WaDisable3DMidCmdPreemption:cnl */
- WA_CLR_BIT_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
+ CTXWA_CLR_BIT_MSK(GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
/* WaDisableGPGPUMidCmdPreemption:cnl */
- WA_SET_FIELD_MASKED(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
+ CTXWA_SET_FIELD_MSK(GEN8_CS_CHICKEN1, GEN9_PREEMPT_GPGPU_LEVEL_MASK,
GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
return 0;
@@ -448,7 +448,7 @@ int i915_ctx_workarounds_init(struct drm_i915_private *dev_priv)
{
int err;
- dev_priv->workarounds.count = 0;
+ dev_priv->workarounds.ctx_wa_count = 0;
if (IS_BROADWELL(dev_priv))
err = bdw_ctx_workarounds_init(dev_priv);
@@ -472,7 +472,7 @@ int i915_ctx_workarounds_init(struct drm_i915_private *dev_priv)
return err;
DRM_DEBUG_DRIVER("Number of context specific w/a: %d\n",
- dev_priv->workarounds.count);
+ dev_priv->workarounds.ctx_wa_count);
return 0;
}
@@ -482,21 +482,21 @@ int i915_ctx_workarounds_emit(struct drm_i915_gem_request *req)
u32 *cs;
int ret, i;
- if (w->count == 0)
+ if (w->ctx_wa_count == 0)
return 0;
ret = req->engine->emit_flush(req, EMIT_BARRIER);
if (ret)
return ret;
- cs = intel_ring_begin(req, (w->count * 2 + 2));
+ cs = intel_ring_begin(req, (w->ctx_wa_count * 2 + 2));
if (IS_ERR(cs))
return PTR_ERR(cs);
- *cs++ = MI_LOAD_REGISTER_IMM(w->count);
- for (i = 0; i < w->count; i++) {
- *cs++ = i915_mmio_reg_offset(w->reg[i].addr);
- *cs++ = w->reg[i].value;
+ *cs++ = MI_LOAD_REGISTER_IMM(w->ctx_wa_count);
+ for (i = 0; i < w->ctx_wa_count; i++) {
+ *cs++ = i915_mmio_reg_offset(w->ctx_wa_reg[i].addr);
+ *cs++ = w->ctx_wa_reg[i].value;
}
*cs++ = MI_NOOP;
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (4 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 05/11] drm/i915: Rename saved workarounds to make it explicit that they are context WAs Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 07/11] drm/i915: Save all Whitelist " Oscar Mateo
` (6 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
By doing this, we can dump these workarounds in debugfs for validation (which,
at the moment, we are only able to do for the contexts WAs).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 5 +
drivers/gpu/drm/i915/i915_drv.h | 8 +-
drivers/gpu/drm/i915/i915_reg.h | 1 +
drivers/gpu/drm/i915/i915_workarounds.c | 335 ++++++++++++++++++--------------
drivers/gpu/drm/i915/i915_workarounds.h | 1 +
5 files changed, 202 insertions(+), 148 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 66fc156..7830073 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -49,6 +49,7 @@
#include "i915_drv.h"
#include "i915_trace.h"
#include "i915_vgpu.h"
+#include "i915_workarounds.h"
#include "intel_drv.h"
#include "intel_uc.h"
@@ -886,6 +887,10 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
BUG_ON(device_info->gen > sizeof(device_info->gen_mask) * BITS_PER_BYTE);
device_info->gen_mask = BIT(device_info->gen - 1);
+ ret = i915_mmio_workarounds_init(dev_priv);
+ if (ret < 0)
+ return ret;
+
spin_lock_init(&dev_priv->irq_lock);
spin_lock_init(&dev_priv->gpu_error.lock);
mutex_init(&dev_priv->backlight_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0bbe128..b042078 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1954,12 +1954,16 @@ struct i915_wa_reg {
u32 mask;
};
-#define I915_MAX_WA_REGS 16
+#define I915_MAX_CTX_WA_REGS 16
+#define I915_MAX_MMIO_WA_REGS 32
struct i915_workarounds {
- struct i915_wa_reg ctx_wa_reg[I915_MAX_WA_REGS];
+ struct i915_wa_reg ctx_wa_reg[I915_MAX_CTX_WA_REGS];
u32 ctx_wa_count;
+ struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
+ u32 mmio_wa_count;
+
u32 hw_whitelist_count[I915_NUM_ENGINES];
};
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 50e65c9..6b46393 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7042,6 +7042,7 @@ enum {
*/
#define L3_GENERAL_PRIO_CREDITS(x) (((x) >> 1) << 19)
#define L3_HIGH_PRIO_CREDITS(x) (((x) >> 1) << 14)
+#define L3_PRIO_CREDITS_MASK (0x1f << 19) | (0x1f << 14)
#define GEN7_L3CNTLREG1 _MMIO(0xB01C)
#define GEN7_WA_FOR_GEN7_L3_CONTROL 0x3C47FF8C
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index 21996e1..7f9e66a 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -31,7 +31,7 @@ static int ctx_wa_add(struct drm_i915_private *dev_priv,
{
const u32 idx = dev_priv->workarounds.ctx_wa_count;
- if (WARN_ON(idx >= I915_MAX_WA_REGS))
+ if (WARN_ON(idx >= I915_MAX_CTX_WA_REGS))
return -ENOSPC;
dev_priv->workarounds.ctx_wa_reg[idx].addr = addr;
@@ -509,15 +509,53 @@ int i915_ctx_workarounds_emit(struct drm_i915_gem_request *req)
return 0;
}
-static void bdw_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int mmio_wa_add(struct drm_i915_private *dev_priv,
+ i915_reg_t addr,
+ const u32 mask, const u32 val)
{
+ const u32 idx = dev_priv->workarounds.mmio_wa_count;
+
+ if (WARN_ON(idx >= I915_MAX_MMIO_WA_REGS))
+ return -ENOSPC;
+
+ dev_priv->workarounds.mmio_wa_reg[idx].addr = addr;
+ dev_priv->workarounds.mmio_wa_reg[idx].value = val;
+ dev_priv->workarounds.mmio_wa_reg[idx].mask = mask;
+
+ dev_priv->workarounds.mmio_wa_count++;
+
+ return 0;
+}
+
+#define MMIOWA_REG(addr, mask, val) do { \
+ const int r = mmio_wa_add(dev_priv, (addr), (mask), (val)); \
+ if (r) \
+ return r; \
+ } while (0)
+
+#define MMIOWA_SET_BIT(addr, mask) \
+ MMIOWA_REG(addr, (mask), (mask))
+
+#define MMIOWA_SET_BIT_MSK(addr, mask) \
+ CTXWA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+
+#define MMIOWA_CLR_BIT(addr, mask) \
+ MMIOWA_REG(addr, (mask), 0)
+
+#define MMIOWA_SET_FIELD(addr, mask, value) \
+ MMIOWA_REG(addr, (mask), (value))
+
+static int bdw_mmio_workarounds_init(struct drm_i915_private *dev_priv)
+{
+ return 0;
}
-static void chv_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int chv_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
+ return 0;
}
-static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int gen9_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
if (HAS_LLC(dev_priv)) {
/* WaCompressedResourceSamplerPbeMediaNewHashMode:skl,kbl
@@ -525,10 +563,7 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
* Must match Display Engine. See
* WaCompressedResourceDisplayNewHashMode.
*/
- I915_WRITE(MMCD_MISC_CTRL,
- I915_READ(MMCD_MISC_CTRL) |
- MMCD_PCLA |
- MMCD_HOTSPOT_EN);
+ MMIOWA_SET_BIT(MMCD_MISC_CTRL, MMCD_PCLA | MMCD_HOTSPOT_EN);
/*
* WaCompressedResourceDisplayNewHashMode:skl,kbl
@@ -537,293 +572,301 @@ static void gen9_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
* Must match Sampler, Pixel Back End, and Media. See
* WaCompressedResourceSamplerPbeMediaNewHashMode.
*/
- I915_WRITE(CHICKEN_PAR1_1,
- I915_READ(CHICKEN_PAR1_1) |
- SKL_DE_COMPRESSED_HASH_MODE);
+ MMIOWA_SET_BIT(CHICKEN_PAR1_1, SKL_DE_COMPRESSED_HASH_MODE);
}
/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
- I915_WRITE(CHICKEN_PAR1_1,
- I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
+ MMIOWA_SET_BIT(CHICKEN_PAR1_1, SKL_EDP_PSR_FIX_RDWRAP);
- I915_WRITE(GEN8_CONFIG0,
- I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
+ MMIOWA_SET_BIT(GEN8_CONFIG0, GEN9_DEFAULT_FIXES);
/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
- I915_WRITE(GEN8_CHICKEN_DCPR_1,
- I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+ MMIOWA_SET_BIT(GEN8_CHICKEN_DCPR_1, MASK_WAKEMEM);
/* WaFbcTurnOffFbcWatermark:skl,bxt,kbl,cfl */
/* WaFbcWakeMemOn:skl,bxt,kbl,glk,cfl */
- I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
- DISP_FBC_WM_DIS |
- DISP_FBC_MEMORY_WAKE);
+ MMIOWA_SET_BIT(DISP_ARB_CTL, DISP_FBC_WM_DIS | DISP_FBC_MEMORY_WAKE);
/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_DISABLE_DUMMY0);
+ MMIOWA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_DISABLE_DUMMY0);
/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
- I915_WRITE(GEN9_CSFE_CHICKEN1_RCS,
- _MASKED_BIT_ENABLE(GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE));
+ MMIOWA_SET_BIT_MSK(GEN9_CSFE_CHICKEN1_RCS,
+ GEN9_PREEMPT_GPGPU_SYNC_SWITCH_DISABLE);
/* WaEnableLbsSlaRetryTimerDecrement:skl,bxt,kbl,glk,cfl */
- I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
- GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
+ MMIOWA_SET_BIT(BDW_SCRATCH1, GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
/* WaDisableKillLogic:bxt,skl,kbl */
if (!IS_COFFEELAKE(dev_priv))
- I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
- ECOCHK_DIS_TLB);
+ MMIOWA_SET_BIT(GAM_ECOCHK, ECOCHK_DIS_TLB);
/* WaDisableHDCInvalidation:skl,bxt,kbl,cfl */
- I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
- BDW_DISABLE_HDC_INVALIDATION);
+ MMIOWA_SET_BIT(GAM_ECOCHK, BDW_DISABLE_HDC_INVALIDATION);
/* WaOCLCoherentLineFlush:skl,bxt,kbl,cfl */
- I915_WRITE(GEN8_L3SQCREG4, (I915_READ(GEN8_L3SQCREG4) |
- GEN8_LQSC_FLUSH_COHERENT_LINES));
+ MMIOWA_SET_BIT(GEN8_L3SQCREG4, GEN8_LQSC_FLUSH_COHERENT_LINES);
/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
- I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+ MMIOWA_SET_BIT_MSK(GEN7_FF_SLICE_CS_CHICKEN1,
+ GEN9_FFSC_PERCTX_PREEMPT_CTRL);
+
+ return 0;
}
-static void skl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int skl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
- gen9_mmio_workarounds_apply(dev_priv);
+ int ret;
+
+ ret = gen9_mmio_workarounds_init(dev_priv);
+ if (ret)
+ return ret;
/* WaDisableDopClockGating */
- I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
- & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+ MMIOWA_CLR_BIT(GEN7_MISCCPCTL, GEN7_DOP_CLOCK_GATE_ENABLE);
/* WAC6entrylatency:skl */
- I915_WRITE(FBC_LLC_READ_CTRL, I915_READ(FBC_LLC_READ_CTRL) |
- FBC_LLC_FULLY_OPEN);
+ MMIOWA_SET_BIT(FBC_LLC_READ_CTRL, FBC_LLC_FULLY_OPEN);
/* WaFbcNukeOnHostModify:skl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+ MMIOWA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
/* WaEnableGapsTsvCreditFix:skl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
+ MMIOWA_SET_BIT(GEN8_GARBCNTL, GEN9_GAPS_TSV_CREDIT_DISABLE);
/* WaDisableGafsUnitClkGating:skl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+ MMIOWA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
/* WaInPlaceDecompressionHang:skl */
if (IS_SKL_REVID(dev_priv, SKL_REVID_H0, REVID_FOREVER))
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+ MMIOWA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+ return 0;
}
-static void bxt_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int bxt_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
- gen9_mmio_workarounds_apply(dev_priv);
+ int ret;
+
+ ret = gen9_mmio_workarounds_init(dev_priv);
+ if (ret)
+ return ret;
/* WaDisableSDEUnitClockGating:bxt */
- I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
- GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+ MMIOWA_SET_BIT(GEN8_UCGCTL6, GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
/*
* FIXME:
* GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
*/
- I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
- GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
+ MMIOWA_SET_BIT(GEN8_UCGCTL6, GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
/*
* Wa: Backlight PWM may stop in the asserted state, causing backlight
* to stay fully on.
*/
- I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
- PWM1_GATING_DIS | PWM2_GATING_DIS);
+ MMIOWA_SET_BIT(GEN9_CLKGATE_DIS_0, PWM1_GATING_DIS | PWM2_GATING_DIS);
/* WaStoreMultiplePTEenable:bxt */
/* This is a requirement according to Hardware specification */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1))
- I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+ MMIOWA_SET_BIT(TILECTL, TILECTL_TLBPF);
/* WaSetClckGatingDisableMedia:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
- ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+ MMIOWA_CLR_BIT(GEN7_MISCCPCTL, GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE);
}
/* WaDisablePooledEuLoadBalancingFix:bxt */
if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER)) {
- I915_WRITE(FF_SLICE_CS_CHICKEN2,
- _MASKED_BIT_ENABLE(GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE));
+ MMIOWA_SET_BIT_MSK(FF_SLICE_CS_CHICKEN2,
+ GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE);
}
/* WaProgramL3SqcReg1DefaultForPerf:bxt */
if (IS_BXT_REVID(dev_priv, BXT_REVID_B0, REVID_FOREVER))
- I915_WRITE(GEN8_L3SQCREG1, L3_GENERAL_PRIO_CREDITS(62) |
- L3_HIGH_PRIO_CREDITS(2));
+ MMIOWA_SET_FIELD(GEN8_L3SQCREG1, L3_PRIO_CREDITS_MASK,
+ L3_GENERAL_PRIO_CREDITS(62) |
+ L3_HIGH_PRIO_CREDITS(2));
/* WaInPlaceDecompressionHang:bxt */
if (IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+ MMIOWA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+ return 0;
}
-static void kbl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int kbl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
- gen9_mmio_workarounds_apply(dev_priv);
+ int ret;
+
+ ret = gen9_mmio_workarounds_init(dev_priv);
+ if (ret)
+ return ret;
/* WaDisableSDEUnitClockGating:kbl */
if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
- GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+ MMIOWA_SET_BIT(GEN8_UCGCTL6, GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
/* WaDisableGamClockGating:kbl */
if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) |
- GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
+ MMIOWA_SET_BIT(GEN6_UCGCTL1, GEN6_GAMUNIT_CLOCK_GATE_DISABLE);
/* WaFbcNukeOnHostModify:kbl */
- I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+ MMIOWA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
/* WaEnableGapsTsvCreditFix:kbl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
+ MMIOWA_SET_BIT(GEN8_GARBCNTL, GEN9_GAPS_TSV_CREDIT_DISABLE);
/* WaDisableDynamicCreditSharing:kbl */
if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_B0))
- I915_WRITE(GAMT_CHKN_BIT_REG,
- (I915_READ(GAMT_CHKN_BIT_REG) |
- GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING));
+ MMIOWA_SET_BIT(GAMT_CHKN_BIT_REG,
+ GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING);
/* WaDisableGafsUnitClkGating:kbl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+ MMIOWA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
/* WaInPlaceDecompressionHang:kbl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+ MMIOWA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+ return 0;
}
-static void glk_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int glk_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
- u32 val;
+ int ret;
- gen9_mmio_workarounds_apply(dev_priv);
+ ret = gen9_mmio_workarounds_init(dev_priv);
+ if (ret)
+ return ret;
/*
* WaDisablePWMClockGating:glk
* Backlight PWM may stop in the asserted state, causing backlight
* to stay fully on.
*/
- I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
- PWM1_GATING_DIS | PWM2_GATING_DIS);
+ MMIOWA_SET_BIT(GEN9_CLKGATE_DIS_0, PWM1_GATING_DIS | PWM2_GATING_DIS);
/* WaDDIIOTimeout:glk */
- if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1)) {
- u32 val = I915_READ(CHICKEN_MISC_2);
- val &= ~(GLK_CL0_PWR_DOWN |
- GLK_CL1_PWR_DOWN |
- GLK_CL2_PWR_DOWN);
- I915_WRITE(CHICKEN_MISC_2, val);
- }
+ if (IS_GLK_REVID(dev_priv, 0, GLK_REVID_A1))
+ MMIOWA_CLR_BIT(CHICKEN_MISC_2,
+ GLK_CL0_PWR_DOWN |
+ GLK_CL1_PWR_DOWN |
+ GLK_CL2_PWR_DOWN);
/* Display WA #1133: WaFbcSkipSegments:glk */
- val = I915_READ(ILK_DPFC_CHICKEN);
- val &= ~GLK_SKIP_SEG_COUNT_MASK;
- val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
- I915_WRITE(ILK_DPFC_CHICKEN, val);
+ MMIOWA_SET_FIELD(ILK_DPFC_CHICKEN, GLK_SKIP_SEG_COUNT_MASK,
+ GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1));
+
+ return 0;
}
-static void cfl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int cfl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
- gen9_mmio_workarounds_apply(dev_priv);
+ int ret;
+
+ ret = gen9_mmio_workarounds_init(dev_priv);
+ if (ret)
+ return ret;
/* WaFbcNukeOnHostModify:cfl */
- I915_WRITE(ILK_DPFC_CHICKEN,
- I915_READ(ILK_DPFC_CHICKEN) |
- ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
+ MMIOWA_SET_BIT(ILK_DPFC_CHICKEN, ILK_DPFC_NUKE_ON_ANY_MODIFICATION);
/* WaEnableGapsTsvCreditFix:cfl */
- I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
- GEN9_GAPS_TSV_CREDIT_DISABLE));
+ MMIOWA_SET_BIT(GEN8_GARBCNTL, GEN9_GAPS_TSV_CREDIT_DISABLE);
/* WaDisableGafsUnitClkGating:cfl */
- I915_WRITE(GEN7_UCGCTL4, (I915_READ(GEN7_UCGCTL4) |
- GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE));
+ MMIOWA_SET_BIT(GEN7_UCGCTL4, GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
/* WaInPlaceDecompressionHang:cfl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+ MMIOWA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+
+ return 0;
}
-static void cnl_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+static int cnl_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
- u32 val;
-
/* This is not an Wa. Enable for better image quality */
- I915_WRITE(_3D_CHICKEN3,
- _MASKED_BIT_ENABLE(_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE));
+ MMIOWA_SET_BIT_MSK(_3D_CHICKEN3, _3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE);
/* WaEnableChickenDCPR:cnl */
- I915_WRITE(GEN8_CHICKEN_DCPR_1,
- I915_READ(GEN8_CHICKEN_DCPR_1) | MASK_WAKEMEM);
+ MMIOWA_SET_BIT(GEN8_CHICKEN_DCPR_1, MASK_WAKEMEM);
/* WaFbcWakeMemOn:cnl */
- I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
- DISP_FBC_MEMORY_WAKE);
+ MMIOWA_SET_BIT(DISP_ARB_CTL, DISP_FBC_MEMORY_WAKE);
/* WaSarbUnitClockGatingDisable:cnl (pre-prod) */
if (IS_CNL_REVID(dev_priv, CNL_REVID_A0, CNL_REVID_B0))
- I915_WRITE(SLICE_UNIT_LEVEL_CLKGATE,
- I915_READ(SLICE_UNIT_LEVEL_CLKGATE) |
- SARBUNIT_CLKGATE_DIS);
+ MMIOWA_SET_BIT(SLICE_UNIT_LEVEL_CLKGATE, SARBUNIT_CLKGATE_DIS);
/* Display WA #1133: WaFbcSkipSegments:cnl */
- val = I915_READ(ILK_DPFC_CHICKEN);
- val &= ~GLK_SKIP_SEG_COUNT_MASK;
- val |= GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1);
- I915_WRITE(ILK_DPFC_CHICKEN, val);
+ MMIOWA_SET_FIELD(ILK_DPFC_CHICKEN, GLK_SKIP_SEG_COUNT_MASK,
+ GLK_SKIP_SEG_EN | GLK_SKIP_SEG_COUNT(1));
/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0))
- I915_WRITE(GAMT_CHKN_BIT_REG,
- (I915_READ(GAMT_CHKN_BIT_REG) |
- GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT));
+ MMIOWA_SET_BIT(GAMT_CHKN_BIT_REG,
+ GAMT_CHKN_DISABLE_I2M_CYCLE_ON_WR_PORT);
/* WaInPlaceDecompressionHang:cnl */
- I915_WRITE(GEN9_GAMT_ECO_REG_RW_IA,
- (I915_READ(GEN9_GAMT_ECO_REG_RW_IA) |
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS));
+ MMIOWA_SET_BIT(GEN9_GAMT_ECO_REG_RW_IA,
+ GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
/* WaEnablePreemptionGranularityControlByUMD:cnl */
- I915_WRITE(GEN7_FF_SLICE_CS_CHICKEN1,
- _MASKED_BIT_ENABLE(GEN9_FFSC_PERCTX_PREEMPT_CTRL));
+ MMIOWA_SET_BIT_MSK(GEN7_FF_SLICE_CS_CHICKEN1,
+ GEN9_FFSC_PERCTX_PREEMPT_CTRL);
+
+ return 0;
}
-void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+int i915_mmio_workarounds_init(struct drm_i915_private *dev_priv)
{
+ int err;
+
+ dev_priv->workarounds.mmio_wa_count = 0;
+
if (IS_BROADWELL(dev_priv))
- bdw_mmio_workarounds_apply(dev_priv);
+ err = bdw_mmio_workarounds_init(dev_priv);
else if (IS_CHERRYVIEW(dev_priv))
- chv_mmio_workarounds_apply(dev_priv);
+ err = chv_mmio_workarounds_init(dev_priv);
else if (IS_SKYLAKE(dev_priv))
- skl_mmio_workarounds_apply(dev_priv);
+ err = skl_mmio_workarounds_init(dev_priv);
else if (IS_BROXTON(dev_priv))
- bxt_mmio_workarounds_apply(dev_priv);
+ err = bxt_mmio_workarounds_init(dev_priv);
else if (IS_KABYLAKE(dev_priv))
- kbl_mmio_workarounds_apply(dev_priv);
+ err = kbl_mmio_workarounds_init(dev_priv);
else if (IS_GEMINILAKE(dev_priv))
- glk_mmio_workarounds_apply(dev_priv);
+ err = glk_mmio_workarounds_init(dev_priv);
else if (IS_COFFEELAKE(dev_priv))
- cfl_mmio_workarounds_apply(dev_priv);
+ err = cfl_mmio_workarounds_init(dev_priv);
else if (IS_CANNONLAKE(dev_priv))
- cnl_mmio_workarounds_apply(dev_priv);
+ err = cnl_mmio_workarounds_init(dev_priv);
+ else
+ err = 0;
+ if (err)
+ return err;
+
+ DRM_DEBUG_DRIVER("Number of MMIO w/a: %d\n",
+ dev_priv->workarounds.mmio_wa_count);
+ return 0;
+}
+
+void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
+{
+ struct i915_workarounds *w = &dev_priv->workarounds;
+ int i;
+
+ for (i = 0; i < w->mmio_wa_count; i++) {
+ i915_reg_t addr = w->mmio_wa_reg[i].addr;
+ u32 value = w->mmio_wa_reg[i].value;
+ u32 mask = w->mmio_wa_reg[i].mask;
+
+ I915_WRITE(addr, (I915_READ(addr) & ~mask) | value);
+ }
}
static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/i915_workarounds.h b/drivers/gpu/drm/i915/i915_workarounds.h
index 3944432..1ffe13a 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.h
+++ b/drivers/gpu/drm/i915/i915_workarounds.h
@@ -28,6 +28,7 @@
int i915_ctx_workarounds_init(struct drm_i915_private *dev_priv);
int i915_ctx_workarounds_emit(struct drm_i915_gem_request *req);
+int i915_mmio_workarounds_init(struct drm_i915_private *dev_priv);
void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv);
int i915_whitelist_workarounds_apply(struct intel_engine_cs *engine);
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 07/11] drm/i915: Save all Whitelist WAs and apply them at a later time
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (5 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 06/11] drm/i915: Save all MMIO WAs and apply them at a later time Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
` (5 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
Same as we have been doing for other types, this allow us to dump
the whole list of workarounds to debugs, for validation purposes.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_drv.h | 3 +-
drivers/gpu/drm/i915/i915_workarounds.c | 113 ++++++++++++++++----------------
drivers/gpu/drm/i915/i915_workarounds.h | 3 +-
drivers/gpu/drm/i915/intel_lrc.c | 8 ++-
5 files changed, 67 insertions(+), 62 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a204896..bfb0648 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3560,7 +3560,7 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
seq_printf(m, "Context workarounds applied: %d\n", workarounds->ctx_wa_count);
for_each_engine(engine, dev_priv, id)
seq_printf(m, "HW whitelist count for %s: %d\n",
- engine->name, workarounds->hw_whitelist_count[id]);
+ engine->name, workarounds->whitelist_wa_count[id]);
for (i = 0; i < workarounds->ctx_wa_count; ++i) {
i915_reg_t addr;
u32 mask, value, read;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b042078..9359231 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1964,7 +1964,8 @@ struct i915_workarounds {
struct i915_wa_reg mmio_wa_reg[I915_MAX_MMIO_WA_REGS];
u32 mmio_wa_count;
- u32 hw_whitelist_count[I915_NUM_ENGINES];
+ struct i915_wa_reg whitelist_wa_reg[I915_NUM_ENGINES][RING_MAX_NONPRIV_SLOTS];
+ u32 whitelist_wa_count[I915_NUM_ENGINES];
};
struct i915_virtual_gpu {
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index 7f9e66a..5f7eb7a 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -869,64 +869,64 @@ void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv)
}
}
-static int wa_ring_whitelist_reg(struct intel_engine_cs *engine,
- i915_reg_t reg)
+static int whitelist_wa_add(struct intel_engine_cs *engine,
+ i915_reg_t reg)
{
struct drm_i915_private *dev_priv = engine->i915;
struct i915_workarounds *wa = &dev_priv->workarounds;
- const uint32_t index = wa->hw_whitelist_count[engine->id];
+ const uint32_t index = wa->whitelist_wa_count[engine->id];
if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
return -EINVAL;
- I915_WRITE(RING_FORCE_TO_NONPRIV(engine->mmio_base, index),
- i915_mmio_reg_offset(reg));
- wa->hw_whitelist_count[engine->id]++;
+ wa->whitelist_wa_reg[engine->id][index].addr =
+ RING_FORCE_TO_NONPRIV(engine->mmio_base, index);
+ wa->whitelist_wa_reg[engine->id][index].value = i915_mmio_reg_offset(reg);
+ wa->whitelist_wa_reg[engine->id][index].mask = 0xffffffff;
+
+ wa->whitelist_wa_count[engine->id]++;
return 0;
}
-static int gen9_whitelist_workarounds_apply(struct intel_engine_cs *engine)
-{
- int ret;
+#define WHITELISTWA_REG(engine, reg) do { \
+ const int r = whitelist_wa_add(engine, reg); \
+ if (r) \
+ return r; \
+} while (0)
+static int gen9_whitelist_workarounds_init(struct intel_engine_cs *engine)
+{
/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
- ret = wa_ring_whitelist_reg(engine, GEN9_CTX_PREEMPT_REG);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN9_CTX_PREEMPT_REG);
/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,[cnl] */
- ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN8_CS_CHICKEN1);
/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
- ret = wa_ring_whitelist_reg(engine, GEN8_HDC_CHICKEN1);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN8_HDC_CHICKEN1);
return 0;
}
-static int skl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int skl_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
- int ret = gen9_whitelist_workarounds_apply(engine);
+ int ret = gen9_whitelist_workarounds_init(engine);
if (ret)
return ret;
/* WaDisableLSQCROPERFforOCL:skl */
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN8_L3SQCREG4);
return 0;
}
-static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int bxt_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
+ int ret;
- int ret = gen9_whitelist_workarounds_apply(engine);
+ ret = gen9_whitelist_workarounds_init(engine);
if (ret)
return ret;
@@ -935,89 +935,90 @@ static int bxt_whitelist_workarounds_apply(struct intel_engine_cs *engine)
/* WaDisableObjectLevelPreemtionForInstanceId:bxt */
/* WaDisableLSQCROPERFforOCL:bxt */
if (IS_BXT_REVID(dev_priv, 0, BXT_REVID_A1)) {
- ret = wa_ring_whitelist_reg(engine, GEN9_CS_DEBUG_MODE1);
- if (ret)
- return ret;
-
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN9_CS_DEBUG_MODE1);
+ WHITELISTWA_REG(engine, GEN8_L3SQCREG4);
}
return 0;
}
-static int kbl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int kbl_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
- int ret = gen9_whitelist_workarounds_apply(engine);
+ int ret = gen9_whitelist_workarounds_init(engine);
if (ret)
return ret;
/* WaDisableLSQCROPERFforOCL:kbl */
- ret = wa_ring_whitelist_reg(engine, GEN8_L3SQCREG4);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN8_L3SQCREG4);
return 0;
}
-static int glk_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int glk_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
- int ret = gen9_whitelist_workarounds_apply(engine);
+ int ret = gen9_whitelist_workarounds_init(engine);
if (ret)
return ret;
return 0;
}
-static int cfl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int cfl_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
- int ret = gen9_whitelist_workarounds_apply(engine);
+ int ret = gen9_whitelist_workarounds_init(engine);
if (ret)
return ret;
return 0;
}
-static int cnl_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+static int cnl_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
- int ret;
-
/* WaEnablePreemptionGranularityControlByUMD:cnl */
- ret = wa_ring_whitelist_reg(engine, GEN8_CS_CHICKEN1);
- if (ret)
- return ret;
+ WHITELISTWA_REG(engine, GEN8_CS_CHICKEN1);
return 0;
}
-int i915_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+int i915_whitelist_workarounds_init(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
int err;
WARN_ON(engine->id != RCS);
- dev_priv->workarounds.hw_whitelist_count[engine->id] = 0;
+ dev_priv->workarounds.whitelist_wa_count[engine->id] = 0;
if (IS_SKYLAKE(dev_priv))
- err = skl_whitelist_workarounds_apply(engine);
+ err = skl_whitelist_workarounds_init(engine);
else if (IS_BROXTON(dev_priv))
- err = bxt_whitelist_workarounds_apply(engine);
+ err = bxt_whitelist_workarounds_init(engine);
else if (IS_KABYLAKE(dev_priv))
- err = kbl_whitelist_workarounds_apply(engine);
+ err = kbl_whitelist_workarounds_init(engine);
else if (IS_GEMINILAKE(dev_priv))
- err = glk_whitelist_workarounds_apply(engine);
+ err = glk_whitelist_workarounds_init(engine);
else if (IS_COFFEELAKE(dev_priv))
- err = cfl_whitelist_workarounds_apply(engine);
+ err = cfl_whitelist_workarounds_init(engine);
else if (IS_CANNONLAKE(dev_priv))
- err = cnl_whitelist_workarounds_apply(engine);
+ err = cnl_whitelist_workarounds_init(engine);
else
err = 0;
if (err)
return err;
DRM_DEBUG_DRIVER("%s: Number of whitelist w/a: %d\n", engine->name,
- dev_priv->workarounds.hw_whitelist_count[engine->id]);
+ dev_priv->workarounds.whitelist_wa_count[engine->id]);
return 0;
}
+
+void i915_whitelist_workarounds_apply(struct intel_engine_cs *engine)
+{
+ struct drm_i915_private *dev_priv = engine->i915;
+ struct i915_workarounds *w = &dev_priv->workarounds;
+ int i;
+
+ for (i = 0; i < w->whitelist_wa_count[engine->id]; i++) {
+ I915_WRITE(w->whitelist_wa_reg[engine->id][i].addr,
+ w->whitelist_wa_reg[engine->id][i].value);
+ }
+}
diff --git a/drivers/gpu/drm/i915/i915_workarounds.h b/drivers/gpu/drm/i915/i915_workarounds.h
index 1ffe13a..805d246 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.h
+++ b/drivers/gpu/drm/i915/i915_workarounds.h
@@ -31,6 +31,7 @@
int i915_mmio_workarounds_init(struct drm_i915_private *dev_priv);
void i915_mmio_workarounds_apply(struct drm_i915_private *dev_priv);
-int i915_whitelist_workarounds_apply(struct intel_engine_cs *engine);
+int i915_whitelist_workarounds_init(struct intel_engine_cs *engine);
+void i915_whitelist_workarounds_apply(struct intel_engine_cs *engine);
#endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4816dd6..0fe0f3e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1499,9 +1499,7 @@ static int gen9_init_render_ring(struct intel_engine_cs *engine)
if (ret)
return ret;
- ret = i915_whitelist_workarounds_apply(engine);
- if (ret)
- return ret;
+ i915_whitelist_workarounds_apply(engine);
return 0;
}
@@ -1988,6 +1986,10 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
if (ret)
return ret;
+ ret = i915_whitelist_workarounds_init(engine);
+ if (ret)
+ return ret;
+
ret = intel_init_workaround_bb(engine);
if (ret) {
/*
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (6 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 07/11] drm/i915: Save all Whitelist " Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
` (4 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
Let's try to make sure that all WAs are applied correctly and survive
resumes, resets, etc... (with some help from a companion i-g-t patch).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 48 ++++++++++++++++++++++++++-----------
1 file changed, 34 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index bfb0648..453ca57 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3541,6 +3541,20 @@ static int i915_shared_dplls_info(struct seq_file *m, void *unused)
return 0;
}
+static void check_wa_register(struct seq_file *m, struct i915_wa_reg *wa_reg)
+{
+ struct drm_i915_private *dev_priv = node_to_i915(m->private);
+ u32 read;
+ bool ok;
+
+ read = I915_READ(wa_reg->addr);
+ ok = (wa_reg->value & wa_reg->mask) == (read & wa_reg->mask);
+ seq_printf(m, "0x%X: 0x%08x, mask: 0x%08x, read: 0x%08x, status: %s\n",
+ i915_mmio_reg_offset(wa_reg->addr),
+ wa_reg->value, wa_reg->mask, read,
+ ok ? "OK" : "FAIL");
+}
+
static int i915_wa_registers(struct seq_file *m, void *unused)
{
int i;
@@ -3550,6 +3564,7 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
struct drm_device *dev = &dev_priv->drm;
struct i915_workarounds *workarounds = &dev_priv->workarounds;
enum intel_engine_id id;
+ u32 whitelist_wa_count = 0;
ret = mutex_lock_interruptible(&dev->struct_mutex);
if (ret)
@@ -3558,22 +3573,27 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
intel_runtime_pm_get(dev_priv);
seq_printf(m, "Context workarounds applied: %d\n", workarounds->ctx_wa_count);
- for_each_engine(engine, dev_priv, id)
- seq_printf(m, "HW whitelist count for %s: %d\n",
- engine->name, workarounds->whitelist_wa_count[id]);
for (i = 0; i < workarounds->ctx_wa_count; ++i) {
- i915_reg_t addr;
- u32 mask, value, read;
- bool ok;
-
- addr = workarounds->ctx_wa_reg[i].addr;
- mask = workarounds->ctx_wa_reg[i].mask;
- value = workarounds->ctx_wa_reg[i].value;
- read = I915_READ(addr);
- ok = (value & mask) == (read & mask);
- seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
- i915_mmio_reg_offset(addr), value, mask, read, ok ? "OK" : "FAIL");
+ struct i915_wa_reg *wa_reg = &workarounds->ctx_wa_reg[i];
+
+ seq_printf(m, "0x%X: 0x%08x, mask: 0x%08x\n",
+ i915_mmio_reg_offset(wa_reg->addr),
+ wa_reg->value, wa_reg->mask);
}
+ seq_putc(m, '\n');
+
+ seq_printf(m, "MMIO workarounds applied: %d\n", workarounds->mmio_wa_count);
+ for (i = 0; i < workarounds->mmio_wa_count; ++i)
+ check_wa_register(m, &workarounds->mmio_wa_reg[i]);
+ seq_putc(m, '\n');
+
+ for_each_engine(engine, dev_priv, id)
+ whitelist_wa_count += workarounds->whitelist_wa_count[id];
+ seq_printf(m, "Whitelist workarounds applied: %d\n", whitelist_wa_count);
+ for_each_engine(engine, dev_priv, id)
+ for (i = 0; i < workarounds->whitelist_wa_count[id]; ++i)
+ check_wa_register(m, &workarounds->whitelist_wa_reg[id][i]);
+ seq_putc(m, '\n');
intel_runtime_pm_put(dev_priv);
mutex_unlock(&dev->struct_mutex);
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (7 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 08/11] drm/i915: Print all workaround types correctly in debugfs Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 10/11] drm/i915: Document the i915_workarounds file Oscar Mateo
` (3 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
Since we are trying to put all WA stuff together, do not forget about the BB WAs.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_workarounds.c | 254 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_workarounds.h | 3 +
drivers/gpu/drm/i915/intel_lrc.c | 253 +------------------------------
3 files changed, 259 insertions(+), 251 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index 5f7eb7a..ff30525 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -1022,3 +1022,257 @@ void i915_whitelist_workarounds_apply(struct intel_engine_cs *engine)
w->whitelist_wa_reg[engine->id][i].value);
}
}
+
+/*
+ * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
+ * PIPE_CONTROL instruction. This is required for the flush to happen correctly
+ * but there is a slight complication as this is applied in WA batch where the
+ * values are only initialized once so we cannot take register value at the
+ * beginning and reuse it further; hence we save its value to memory, upload a
+ * constant value with bit21 set and then we restore it back with the saved value.
+ * To simplify the WA, a constant value is formed by using the default value
+ * of this register. This shouldn't be a problem because we are only modifying
+ * it for a short period and this batch in non-premptible. We can ofcourse
+ * use additional instructions that read the actual value of the register
+ * at that time and set our bit of interest but it makes the WA complicated.
+ *
+ * This WA is also required for Gen9 so extracting as a function avoids
+ * code duplication.
+ */
+static u32 *
+gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
+{
+ *batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+ *batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+ *batch++ = i915_ggtt_offset(engine->scratch) + 256;
+ *batch++ = 0;
+
+ *batch++ = MI_LOAD_REGISTER_IMM(1);
+ *batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+ *batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
+
+ batch = gen8_emit_pipe_control(batch,
+ PIPE_CONTROL_CS_STALL |
+ PIPE_CONTROL_DC_FLUSH_ENABLE,
+ 0);
+
+ *batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+ *batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+ *batch++ = i915_ggtt_offset(engine->scratch) + 256;
+ *batch++ = 0;
+
+ return batch;
+}
+
+/*
+ * Typically we only have one indirect_ctx and per_ctx batch buffer which are
+ * initialized at the beginning and shared across all contexts but this field
+ * helps us to have multiple batches at different offsets and select them based
+ * on a criteria. At the moment this batch always start at the beginning of the page
+ * and at this point we don't have multiple wa_ctx batch buffers.
+ *
+ * The number of WA applied are not known at the beginning; we use this field
+ * to return the no of DWORDS written.
+ *
+ * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
+ * so it adds NOOPs as padding to make it cacheline aligned.
+ * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
+ * makes a complete batch buffer.
+ */
+static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+ /* WaDisableCtxRestoreArbitration:bdw,chv */
+ *batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+ /* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
+ if (IS_BROADWELL(engine->i915))
+ batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+ /* WaClearSlmSpaceAtContextSwitch:bdw,chv */
+ /* Actual scratch location is at 128 bytes offset */
+ batch = gen8_emit_pipe_control(batch,
+ PIPE_CONTROL_FLUSH_L3 |
+ PIPE_CONTROL_GLOBAL_GTT_IVB |
+ PIPE_CONTROL_CS_STALL |
+ PIPE_CONTROL_QW_WRITE,
+ i915_ggtt_offset(engine->scratch) +
+ 2 * CACHELINE_BYTES);
+
+ *batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+ /* Pad to end of cacheline */
+ while ((unsigned long)batch % CACHELINE_BYTES)
+ *batch++ = MI_NOOP;
+
+ /*
+ * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
+ * execution depends on the length specified in terms of cache lines
+ * in the register CTX_RCS_INDIRECT_CTX
+ */
+
+ return batch;
+}
+
+static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+ *batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+ /* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
+ batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+ /* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
+ *batch++ = MI_LOAD_REGISTER_IMM(1);
+ *batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
+ *batch++ = _MASKED_BIT_DISABLE(
+ GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
+ *batch++ = MI_NOOP;
+
+ /* WaClearSlmSpaceAtContextSwitch:kbl */
+ /* Actual scratch location is at 128 bytes offset */
+ if (IS_KBL_REVID(engine->i915, 0, KBL_REVID_A0)) {
+ batch = gen8_emit_pipe_control(batch,
+ PIPE_CONTROL_FLUSH_L3 |
+ PIPE_CONTROL_GLOBAL_GTT_IVB |
+ PIPE_CONTROL_CS_STALL |
+ PIPE_CONTROL_QW_WRITE,
+ i915_ggtt_offset(engine->scratch)
+ + 2 * CACHELINE_BYTES);
+ }
+
+ /* WaMediaPoolStateCmdInWABB:bxt,glk */
+ if (HAS_POOLED_EU(engine->i915)) {
+ /*
+ * EU pool configuration is setup along with golden context
+ * during context initialization. This value depends on
+ * device type (2x6 or 3x6) and needs to be updated based
+ * on which subslice is disabled especially for 2x6
+ * devices, however it is safe to load default
+ * configuration of 3x6 device instead of masking off
+ * corresponding bits because HW ignores bits of a disabled
+ * subslice and drops down to appropriate config. Please
+ * see render_state_setup() in i915_gem_render_state.c for
+ * possible configurations, to avoid duplication they are
+ * not shown here again.
+ */
+ *batch++ = GEN9_MEDIA_POOL_STATE;
+ *batch++ = GEN9_MEDIA_POOL_ENABLE;
+ *batch++ = 0x00777000;
+ *batch++ = 0;
+ *batch++ = 0;
+ *batch++ = 0;
+ }
+
+ *batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+ /* Pad to end of cacheline */
+ while ((unsigned long)batch % CACHELINE_BYTES)
+ *batch++ = MI_NOOP;
+
+ return batch;
+}
+
+#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
+
+static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
+{
+ struct drm_i915_gem_object *obj;
+ struct i915_vma *vma;
+ int err;
+
+ obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE);
+ if (IS_ERR(obj))
+ return PTR_ERR(obj);
+
+ vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL);
+ if (IS_ERR(vma)) {
+ err = PTR_ERR(vma);
+ goto err;
+ }
+
+ err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
+ if (err)
+ goto err;
+
+ engine->wa_ctx.vma = vma;
+ return 0;
+
+err:
+ i915_gem_object_put(obj);
+ return err;
+}
+
+static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
+{
+ i915_vma_unpin_and_release(&engine->wa_ctx.vma);
+}
+
+typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
+
+int i915_bb_workarounds_init(struct intel_engine_cs *engine)
+{
+ struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
+ struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
+ &wa_ctx->per_ctx };
+ wa_bb_func_t wa_bb_fn[2];
+ struct page *page;
+ void *batch, *batch_ptr;
+ unsigned int i;
+ int ret;
+
+ if (WARN_ON(engine->id != RCS || !engine->scratch))
+ return -EINVAL;
+
+ switch (INTEL_GEN(engine->i915)) {
+ case 10:
+ return 0;
+ case 9:
+ wa_bb_fn[0] = gen9_init_indirectctx_bb;
+ wa_bb_fn[1] = NULL;
+ break;
+ case 8:
+ wa_bb_fn[0] = gen8_init_indirectctx_bb;
+ wa_bb_fn[1] = NULL;
+ break;
+ default:
+ MISSING_CASE(INTEL_GEN(engine->i915));
+ return 0;
+ }
+
+ ret = lrc_setup_wa_ctx(engine);
+ if (ret) {
+ DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret);
+ return ret;
+ }
+
+ page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
+ batch = batch_ptr = kmap_atomic(page);
+
+ /*
+ * Emit the two workaround batch buffers, recording the offset from the
+ * start of the workaround batch buffer object for each and their
+ * respective sizes.
+ */
+ for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
+ wa_bb[i]->offset = batch_ptr - batch;
+ if (WARN_ON(!IS_ALIGNED(wa_bb[i]->offset, CACHELINE_BYTES))) {
+ ret = -EINVAL;
+ break;
+ }
+ if (wa_bb_fn[i])
+ batch_ptr = wa_bb_fn[i](engine, batch_ptr);
+ wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
+ }
+
+ BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
+
+ kunmap_atomic(batch);
+ if (ret)
+ lrc_destroy_wa_ctx(engine);
+
+ return ret;
+}
+
+void i915_bb_workarounds_fini(struct intel_engine_cs *engine)
+{
+ lrc_destroy_wa_ctx(engine);
+}
diff --git a/drivers/gpu/drm/i915/i915_workarounds.h b/drivers/gpu/drm/i915/i915_workarounds.h
index 805d246..72cd4b7 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.h
+++ b/drivers/gpu/drm/i915/i915_workarounds.h
@@ -34,4 +34,7 @@
int i915_whitelist_workarounds_init(struct intel_engine_cs *engine);
void i915_whitelist_workarounds_apply(struct intel_engine_cs *engine);
+int i915_bb_workarounds_init(struct intel_engine_cs *engine);
+void i915_bb_workarounds_fini(struct intel_engine_cs *engine);
+
#endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0fe0f3e..8bc7ddf 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1166,255 +1166,6 @@ static int execlists_request_alloc(struct drm_i915_gem_request *request)
return 0;
}
-/*
- * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
- * PIPE_CONTROL instruction. This is required for the flush to happen correctly
- * but there is a slight complication as this is applied in WA batch where the
- * values are only initialized once so we cannot take register value at the
- * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
- * To simplify the WA, a constant value is formed by using the default value
- * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
- * use additional instructions that read the actual value of the register
- * at that time and set our bit of interest but it makes the WA complicated.
- *
- * This WA is also required for Gen9 so extracting as a function avoids
- * code duplication.
- */
-static u32 *
-gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
-{
- *batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
- *batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
- *batch++ = i915_ggtt_offset(engine->scratch) + 256;
- *batch++ = 0;
-
- *batch++ = MI_LOAD_REGISTER_IMM(1);
- *batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
- *batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
-
- batch = gen8_emit_pipe_control(batch,
- PIPE_CONTROL_CS_STALL |
- PIPE_CONTROL_DC_FLUSH_ENABLE,
- 0);
-
- *batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
- *batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
- *batch++ = i915_ggtt_offset(engine->scratch) + 256;
- *batch++ = 0;
-
- return batch;
-}
-
-/*
- * Typically we only have one indirect_ctx and per_ctx batch buffer which are
- * initialized at the beginning and shared across all contexts but this field
- * helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
- *
- * The number of WA applied are not known at the beginning; we use this field
- * to return the no of DWORDS written.
- *
- * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
- * so it adds NOOPs as padding to make it cacheline aligned.
- * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
- * makes a complete batch buffer.
- */
-static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
- /* WaDisableCtxRestoreArbitration:bdw,chv */
- *batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
- /* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
- if (IS_BROADWELL(engine->i915))
- batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
- /* WaClearSlmSpaceAtContextSwitch:bdw,chv */
- /* Actual scratch location is at 128 bytes offset */
- batch = gen8_emit_pipe_control(batch,
- PIPE_CONTROL_FLUSH_L3 |
- PIPE_CONTROL_GLOBAL_GTT_IVB |
- PIPE_CONTROL_CS_STALL |
- PIPE_CONTROL_QW_WRITE,
- i915_ggtt_offset(engine->scratch) +
- 2 * CACHELINE_BYTES);
-
- *batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
- /* Pad to end of cacheline */
- while ((unsigned long)batch % CACHELINE_BYTES)
- *batch++ = MI_NOOP;
-
- /*
- * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
- * execution depends on the length specified in terms of cache lines
- * in the register CTX_RCS_INDIRECT_CTX
- */
-
- return batch;
-}
-
-static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
- *batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
- /* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
- batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
- /* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
- *batch++ = MI_LOAD_REGISTER_IMM(1);
- *batch++ = i915_mmio_reg_offset(COMMON_SLICE_CHICKEN2);
- *batch++ = _MASKED_BIT_DISABLE(
- GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE);
- *batch++ = MI_NOOP;
-
- /* WaClearSlmSpaceAtContextSwitch:kbl */
- /* Actual scratch location is at 128 bytes offset */
- if (IS_KBL_REVID(engine->i915, 0, KBL_REVID_A0)) {
- batch = gen8_emit_pipe_control(batch,
- PIPE_CONTROL_FLUSH_L3 |
- PIPE_CONTROL_GLOBAL_GTT_IVB |
- PIPE_CONTROL_CS_STALL |
- PIPE_CONTROL_QW_WRITE,
- i915_ggtt_offset(engine->scratch)
- + 2 * CACHELINE_BYTES);
- }
-
- /* WaMediaPoolStateCmdInWABB:bxt,glk */
- if (HAS_POOLED_EU(engine->i915)) {
- /*
- * EU pool configuration is setup along with golden context
- * during context initialization. This value depends on
- * device type (2x6 or 3x6) and needs to be updated based
- * on which subslice is disabled especially for 2x6
- * devices, however it is safe to load default
- * configuration of 3x6 device instead of masking off
- * corresponding bits because HW ignores bits of a disabled
- * subslice and drops down to appropriate config. Please
- * see render_state_setup() in i915_gem_render_state.c for
- * possible configurations, to avoid duplication they are
- * not shown here again.
- */
- *batch++ = GEN9_MEDIA_POOL_STATE;
- *batch++ = GEN9_MEDIA_POOL_ENABLE;
- *batch++ = 0x00777000;
- *batch++ = 0;
- *batch++ = 0;
- *batch++ = 0;
- }
-
- *batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
- /* Pad to end of cacheline */
- while ((unsigned long)batch % CACHELINE_BYTES)
- *batch++ = MI_NOOP;
-
- return batch;
-}
-
-#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
-
-static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
-{
- struct drm_i915_gem_object *obj;
- struct i915_vma *vma;
- int err;
-
- obj = i915_gem_object_create(engine->i915, CTX_WA_BB_OBJ_SIZE);
- if (IS_ERR(obj))
- return PTR_ERR(obj);
-
- vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL);
- if (IS_ERR(vma)) {
- err = PTR_ERR(vma);
- goto err;
- }
-
- err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
- if (err)
- goto err;
-
- engine->wa_ctx.vma = vma;
- return 0;
-
-err:
- i915_gem_object_put(obj);
- return err;
-}
-
-static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
-{
- i915_vma_unpin_and_release(&engine->wa_ctx.vma);
-}
-
-typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
-
-static int intel_init_workaround_bb(struct intel_engine_cs *engine)
-{
- struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
- struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
- &wa_ctx->per_ctx };
- wa_bb_func_t wa_bb_fn[2];
- struct page *page;
- void *batch, *batch_ptr;
- unsigned int i;
- int ret;
-
- if (WARN_ON(engine->id != RCS || !engine->scratch))
- return -EINVAL;
-
- switch (INTEL_GEN(engine->i915)) {
- case 10:
- return 0;
- case 9:
- wa_bb_fn[0] = gen9_init_indirectctx_bb;
- wa_bb_fn[1] = NULL;
- break;
- case 8:
- wa_bb_fn[0] = gen8_init_indirectctx_bb;
- wa_bb_fn[1] = NULL;
- break;
- default:
- MISSING_CASE(INTEL_GEN(engine->i915));
- return 0;
- }
-
- ret = lrc_setup_wa_ctx(engine);
- if (ret) {
- DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret);
- return ret;
- }
-
- page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
- batch = batch_ptr = kmap_atomic(page);
-
- /*
- * Emit the two workaround batch buffers, recording the offset from the
- * start of the workaround batch buffer object for each and their
- * respective sizes.
- */
- for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
- wa_bb[i]->offset = batch_ptr - batch;
- if (WARN_ON(!IS_ALIGNED(wa_bb[i]->offset, CACHELINE_BYTES))) {
- ret = -EINVAL;
- break;
- }
- if (wa_bb_fn[i])
- batch_ptr = wa_bb_fn[i](engine, batch_ptr);
- wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
- }
-
- BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
-
- kunmap_atomic(batch);
- if (ret)
- lrc_destroy_wa_ctx(engine);
-
- return ret;
-}
-
static u8 gtiir[] = {
[RCS] = 0,
[BCS] = 0,
@@ -1870,7 +1621,7 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
intel_engine_cleanup_common(engine);
- lrc_destroy_wa_ctx(engine);
+ i915_bb_workarounds_fini(engine);
engine->i915 = NULL;
dev_priv->engine[engine->id] = NULL;
kfree(engine);
@@ -1990,7 +1741,7 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
if (ret)
return ret;
- ret = intel_init_workaround_bb(engine);
+ ret = i915_bb_workarounds_init(engine);
if (ret) {
/*
* We continue even if we fail to initialize WA batch
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 10/11] drm/i915: Document the i915_workarounds file
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (8 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 09/11] drm/i915: Move WA BB stuff to the workarounds file as well Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 20:58 ` [RFC PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
` (2 subsequent siblings)
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
Does what it says on the tin (plus a few fixes in some old comments).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_workarounds.c | 45 ++++++++++++++++++++++++++++-----
1 file changed, 38 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index ff30525..9a32601 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -25,6 +25,36 @@
#include "i915_drv.h"
#include "i915_workarounds.h"
+/**
+ * DOC: Hardware workarounds
+ *
+ * This file is a central place to implement most* of the required workarounds
+ * required for HW to work as originally intended. They fall in four categories
+ * depending on how/when they are applied:
+ *
+ * - Workarounds that touch registers that are saved/restored to/from the HW
+ * context image. The list is generated once and then emitted (via Load
+ * Register Immediate commands) everytime a new context is created.
+ * - Workarounds that touch global MMIO registers. The list of these WAs is
+ * generated once and then applied whenever these registers revert to default
+ * values (on GPU reset, suspend/resume**, etc..).
+ * - Workarounds that whitelist a privileged register, so that UMDs can manage
+ * them directly. This is just a special case of a MMMIO workaround (as we
+ * write the list of these to/be-whitelisted registers to some special HW
+ * registers).
+ * - Workaround batchbuffers, that get executed automatically by the hardware
+ * on every HW context restore.
+ *
+ * * Please notice that there are other WAs that, due to their nature, cannot be
+ * applied from a central place. Those are peppered around the rest of the
+ * code, as needed).
+ *
+ * ** Technically, some registers are powercontext saved & restored, so they
+ * survive a suspend/resume. In practice, writing them again is not too
+ * costly and simplifies things. We can revisit this in the future.
+ *
+ */
+
static int ctx_wa_add(struct drm_i915_private *dev_priv,
i915_reg_t addr,
const u32 mask, const u32 val)
@@ -190,9 +220,9 @@ static int gen9_ctx_workarounds_init(struct drm_i915_private *dev_priv)
CTXWA_SET_BIT_MSK(GEN7_COMMON_SLICE_CHICKEN1,
GEN9_RHWO_OPTIMIZATION_DISABLE);
/*
- * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be set
- * but we do that in per ctx batchbuffer as there is an issue
- * with this register not getting restored on ctx restore
+ * WA also requires GEN9_SLICE_COMMON_ECO_CHICKEN0[14:14] to be
+ * set but we do that in per ctx batchbuffer as there is an
+ * issue with this register not getting restored on ctx restore.
*/
}
@@ -1029,10 +1059,11 @@ void i915_whitelist_workarounds_apply(struct intel_engine_cs *engine)
* but there is a slight complication as this is applied in WA batch where the
* values are only initialized once so we cannot take register value at the
* beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
+ * constant value with bit21 set and then we restore it back with the saved
+ * value.
* To simplify the WA, a constant value is formed by using the default value
* of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
+ * it for a short period and this batch in non-premptible. We can of course
* use additional instructions that read the actual value of the register
* at that time and set our bit of interest but it makes the WA complicated.
*
@@ -1068,8 +1099,8 @@ void i915_whitelist_workarounds_apply(struct intel_engine_cs *engine)
* Typically we only have one indirect_ctx and per_ctx batch buffer which are
* initialized at the beginning and shared across all contexts but this field
* helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
+ * on a criteria. At the moment this batch always start at the beginning of the
+ * page and at this point we don't have multiple wa_ctx batch buffers.
*
* The number of WA applied are not known at the beginning; we use this field
* to return the no of DWORDS written.
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* [RFC PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (9 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 10/11] drm/i915: Document the i915_workarounds file Oscar Mateo
@ 2017-10-09 20:58 ` Oscar Mateo
2017-10-09 21:08 ` [RFC PATCH 00/11] Refactor HW workaround code Chris Wilson
2017-10-09 21:36 ` ✗ Fi.CI.BAT: warning for " Patchwork
12 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-09 20:58 UTC (permalink / raw)
To: intel-gfx
GEN8_CONFIG0 (0xD00) is a protected by a lock (bit 31) which is set by
the BIOS, so there is no way we can enable the three chicken bits
mandated by the WA (the BIOS should be doing it instead).
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 3 ---
drivers/gpu/drm/i915/i915_workarounds.c | 2 --
2 files changed, 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6b46393..a78dea6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -355,9 +355,6 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define ECOCHK_PPGTT_WT_HSW (0x2<<3)
#define ECOCHK_PPGTT_WB_HSW (0x3<<3)
-#define GEN8_CONFIG0 _MMIO(0xD00)
-#define GEN9_DEFAULT_FIXES (1 << 3 | 1 << 2 | 1 << 1)
-
#define GAC_ECO_BITS _MMIO(0x14090)
#define ECOBITS_SNB_BIT (1<<13)
#define ECOBITS_PPGTT_CACHE64B (3<<8)
diff --git a/drivers/gpu/drm/i915/i915_workarounds.c b/drivers/gpu/drm/i915/i915_workarounds.c
index 9a32601..6f10e04 100644
--- a/drivers/gpu/drm/i915/i915_workarounds.c
+++ b/drivers/gpu/drm/i915/i915_workarounds.c
@@ -608,8 +608,6 @@ static int gen9_mmio_workarounds_init(struct drm_i915_private *dev_priv)
/* See Bspec note for PSR2_CTL bit 31, Wa#828:skl,bxt,kbl,cfl */
MMIOWA_SET_BIT(CHICKEN_PAR1_1, SKL_EDP_PSR_FIX_RDWRAP);
- MMIOWA_SET_BIT(GEN8_CONFIG0, GEN9_DEFAULT_FIXES);
-
/* WaEnableChickenDCPR:skl,bxt,kbl,glk,cfl */
MMIOWA_SET_BIT(GEN8_CHICKEN_DCPR_1, MASK_WAKEMEM);
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [RFC PATCH 00/11] Refactor HW workaround code
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (10 preceding siblings ...)
2017-10-09 20:58 ` [RFC PATCH 11/11] drm/i915: Remove Gen9 WAs with no effect Oscar Mateo
@ 2017-10-09 21:08 ` Chris Wilson
2017-10-10 17:37 ` Oscar Mateo
2017-10-09 21:36 ` ✗ Fi.CI.BAT: warning for " Patchwork
12 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-10-09 21:08 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
Quoting Oscar Mateo (2017-10-09 21:58:15)
> Currently, deciding how/where to apply new workarounds is challenging. Often,
> workarounds end up applied incorrectly and get lost under certain circumstances
> (e.g. a context switch or a GPU reset). This is a proposal to attempt to
> eliminate some of this pain, by clarifying the current classification of
> workarounds (context saved/restored, global registers, whitelisting, BB),
> putting them together on the same file, and improving the existing validation
> infrastructure (debugfs/i-g-t).
One thing I've been dreaming of is if we can have an external file for
importing the w/a (reg offset + corrected value) that we could source
directly from spec. (Hoping for some xml translation to C or DT.)
We need something like this so that we can set all the nonpriv registers
to the default value in the proto-context. Or at least lots of patience
and careful proofreading.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [RFC PATCH 00/11] Refactor HW workaround code
2017-10-09 21:08 ` [RFC PATCH 00/11] Refactor HW workaround code Chris Wilson
@ 2017-10-10 17:37 ` Oscar Mateo
2017-10-10 18:24 ` Oscar Mateo
0 siblings, 1 reply; 22+ messages in thread
From: Oscar Mateo @ 2017-10-10 17:37 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 10/09/2017 02:08 PM, Chris Wilson wrote:
> Quoting Oscar Mateo (2017-10-09 21:58:15)
>> Currently, deciding how/where to apply new workarounds is challenging. Often,
>> workarounds end up applied incorrectly and get lost under certain circumstances
>> (e.g. a context switch or a GPU reset). This is a proposal to attempt to
>> eliminate some of this pain, by clarifying the current classification of
>> workarounds (context saved/restored, global registers, whitelisting, BB),
>> putting them together on the same file, and improving the existing validation
>> infrastructure (debugfs/i-g-t).
> One thing I've been dreaming of is if we can have an external file for
> importing the w/a (reg offset + corrected value) that we could source
> directly from spec. (Hoping for some xml translation to C or DT.)
Hmmm... I'm afraid this is impossible at the moment, since many WAs in
the BSpec are simply a link to the ticket where the workaround was
devised (and there you have to parse the conversation to figure out what
the WA should do).
> We need something like this so that we can set all the nonpriv registers
> to the default value in the proto-context. Or at least lots of patience
> and careful proofreading.
> -Chris
You mean applying the workarounds directly to the context image instead
of the LRI commands we use now? That can be done (as you said, with *a
lot* of patience and careful proofreading) but we would need to force
the proto-context to be restored first (with
CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT bit set), because the workarounds do
not always give us the full default value for a register, only the
fields that are wrong in the hardware.
-- Oscar
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 00/11] Refactor HW workaround code
2017-10-10 17:37 ` Oscar Mateo
@ 2017-10-10 18:24 ` Oscar Mateo
2017-10-10 20:10 ` Chris Wilson
0 siblings, 1 reply; 22+ messages in thread
From: Oscar Mateo @ 2017-10-10 18:24 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 10/10/2017 10:37 AM, Oscar Mateo wrote:
>
>
> On 10/09/2017 02:08 PM, Chris Wilson wrote:
>> Quoting Oscar Mateo (2017-10-09 21:58:15)
>>> Currently, deciding how/where to apply new workarounds is
>>> challenging. Often,
>>> workarounds end up applied incorrectly and get lost under certain
>>> circumstances
>>> (e.g. a context switch or a GPU reset). This is a proposal to
>>> attempt to
>>> eliminate some of this pain, by clarifying the current
>>> classification of
>>> workarounds (context saved/restored, global registers, whitelisting,
>>> BB),
>>> putting them together on the same file, and improving the existing
>>> validation
>>> infrastructure (debugfs/i-g-t).
>> One thing I've been dreaming of is if we can have an external file for
>> importing the w/a (reg offset + corrected value) that we could source
>> directly from spec. (Hoping for some xml translation to C or DT.)
>
> Hmmm... I'm afraid this is impossible at the moment, since many WAs in
> the BSpec are simply a link to the ticket where the workaround was
> devised (and there you have to parse the conversation to figure out
> what the WA should do).
>
>> We need something like this so that we can set all the nonpriv registers
>> to the default value in the proto-context. Or at least lots of patience
>> and careful proofreading.
>> -Chris
>
> You mean applying the workarounds directly to the context image
> instead of the LRI commands we use now? That can be done (as you said,
> with *a lot* of patience and careful proofreading) but we would need
> to force the proto-context to be restored first (with
> CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT bit set), because the workarounds
> do not always give us the full default value for a register, only the
> fields that are wrong in the hardware.
Hmmmm... I'm giving out conflicting information. Registers in the engine
(those that live in the engine part of the context image) are not
guaranteed to be initialized, so inhibit-restoring the context and then
saving it is not enough. Yes, we would have to get the default values
for them from the BSpec, either automatically or by hand, then apply the
workarounds (either by modifying the default values or with the emission
of LRIs we have now).
This is an insane amount of work (and sometimes the BSpec does not
include all the information you need), that's why the recommendation in
the BSpec is to force the engine to populate these registers for us (the
infamous null state batchbuffer):
"[...] Engine context starts immediately following the logical ring
context in memory. This state is very specific to an engine and differs
from engine to engine. This part of the context consists of the state
from all the units in the engine that needs to be save/restored across
context switches. Engine restores the engine context following the
logical ring context restore. It is tedious for software to populate the
engine context as per the requirements, it is recommended to implicitly
use engine to populate this portion of the context. Below method can be
followed to achieve the same:
* When a context is submitted for the first time for execution, SW can
inhibit engine from restoring engine context by setting the “Engine
Context Restore Inhibit” bit in CTXT_SR_CTL register of the logical ring
context. This will avoid software from populating the Engine Context.
Software must program all the state required to initialize the engine in
the ring buffer which would initialize the hardware state. On a
subsequent context save engine will populate the engine context with
appropriate values.
* Above method can be used to create a complete logical context with
engine context populated by the hardware. This Logical context can be
used as an Golden Context Image or template for subsequently created
contexts.
Engine saves the engine context following the logical ring context on
switching out a context."
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 00/11] Refactor HW workaround code
2017-10-10 18:24 ` Oscar Mateo
@ 2017-10-10 20:10 ` Chris Wilson
2017-10-10 20:35 ` Oscar Mateo
0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2017-10-10 20:10 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
Quoting Oscar Mateo (2017-10-10 19:24:18)
> * Above method can be used to create a complete logical context with
> engine context populated by the hardware. This Logical context can be
> used as an Golden Context Image or template for subsequently created
> contexts.
It's the template part we want (sending down a pair of contexts to be
executed in sequence is problematic).
The current fire are the nonpriv registers that are leaking across the
proto-context, i.e. mesa does LRI that get inherited into a libva
context. Adding those to the context image requires approx a hundred
LRI, so a not impossible task to proofread (esp. as many are just
setting register groups, e.g. the 32 CS_GPR registers, to 0).
But it is still a task that we need to maintain that list of nonpriv
registers. (State outside of the nonpriv that doesn't have a defined
default, cannot be either modified by userspace or assume to have any
particular value.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC PATCH 00/11] Refactor HW workaround code
2017-10-10 20:10 ` Chris Wilson
@ 2017-10-10 20:35 ` Oscar Mateo
0 siblings, 0 replies; 22+ messages in thread
From: Oscar Mateo @ 2017-10-10 20:35 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 10/10/2017 01:10 PM, Chris Wilson wrote:
> Quoting Oscar Mateo (2017-10-10 19:24:18)
>> * Above method can be used to create a complete logical context with
>> engine context populated by the hardware. This Logical context can be
>> used as an Golden Context Image or template for subsequently created
>> contexts.
> It's the template part we want (sending down a pair of contexts to be
> executed in sequence is problematic).
Ahh, I see.
Notice that what the BSpec suggests is to send the null state BB only
once (for the kernel default context image) and then copy (memcpy or
DMA) that context image over all new contexts that get created. That
takes care of the problem with nonpriv registers, because absolutely
everything inside the context image get blasted with a clean copy everytime.
> The current fire are the nonpriv registers that are leaking across the
> proto-context, i.e. mesa does LRI that get inherited into a libva
> context. Adding those to the context image requires approx a hundred
> LRI, so a not impossible task to proofread (esp. as many are just
> setting register groups, e.g. the 32 CS_GPR registers, to 0).
>
> But it is still a task that we need to maintain that list of nonpriv
> registers.
or, alternatively, follow the BSpec's "template" approach, with or
without null state BB
> (State outside of the nonpriv that doesn't have a defined
> default, cannot be either modified by userspace or assume to have any
> particular value.)
> -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread
* ✗ Fi.CI.BAT: warning for Refactor HW workaround code
2017-10-09 20:58 [RFC PATCH 00/11] Refactor HW workaround code Oscar Mateo
` (11 preceding siblings ...)
2017-10-09 21:08 ` [RFC PATCH 00/11] Refactor HW workaround code Chris Wilson
@ 2017-10-09 21:36 ` Patchwork
12 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2017-10-09 21:36 UTC (permalink / raw)
To: Oscar Mateo; +Cc: intel-gfx
== Series Details ==
Series: Refactor HW workaround code
URL : https://patchwork.freedesktop.org/series/31611/
State : warning
== Summary ==
Series 31611v1 Refactor HW workaround code
https://patchwork.freedesktop.org/api/1.0/series/31611/revisions/1/mbox/
Test chamelium:
Subgroup dp-crc-fast:
fail -> PASS (fi-kbl-7500u) fdo#102514
Test gem_exec_suspend:
Subgroup basic-s3:
pass -> DMESG-WARN (fi-cfl-s) fdo#103026
Test gem_workarounds:
Subgroup basic-read:
pass -> SKIP (fi-bdw-5557u)
pass -> SKIP (fi-bdw-gvtdvm)
pass -> SKIP (fi-bsw-n3050)
pass -> SKIP (fi-skl-6260u)
pass -> SKIP (fi-skl-6700hq)
pass -> SKIP (fi-skl-6700k)
pass -> SKIP (fi-skl-6770hq)
pass -> SKIP (fi-skl-gvtdvm)
pass -> SKIP (fi-bxt-dsi)
pass -> SKIP (fi-bxt-j4205)
pass -> SKIP (fi-kbl-7500u)
pass -> SKIP (fi-kbl-7560u)
pass -> SKIP (fi-kbl-7567u)
pass -> SKIP (fi-kbl-r)
pass -> SKIP (fi-glk-1)
pass -> SKIP (fi-cfl-s)
pass -> SKIP (fi-cnl-y)
Test kms_pipe_crc_basic:
Subgroup nonblocking-crc-pipe-c:
pass -> DMESG-WARN (fi-skl-6700k)
Subgroup read-crc-pipe-b:
pass -> DMESG-WARN (fi-skl-6700k)
fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#103026 https://bugs.freedesktop.org/show_bug.cgi?id=103026
fi-bdw-5557u total:289 pass:267 dwarn:0 dfail:0 fail:0 skip:22 time:457s
fi-bdw-gvtdvm total:289 pass:264 dwarn:0 dfail:0 fail:0 skip:25 time:466s
fi-blb-e6850 total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:392s
fi-bsw-n3050 total:289 pass:242 dwarn:0 dfail:0 fail:0 skip:47 time:577s
fi-bwr-2160 total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:286s
fi-bxt-dsi total:289 pass:258 dwarn:0 dfail:0 fail:0 skip:31 time:520s
fi-bxt-j4205 total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:526s
fi-byt-j1900 total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:533s
fi-byt-n2820 total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:519s
fi-cfl-s total:289 pass:255 dwarn:1 dfail:0 fail:0 skip:33 time:562s
fi-cnl-y total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:623s
fi-elk-e7500 total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:431s
fi-glk-1 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:598s
fi-hsw-4770 total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:436s
fi-hsw-4770r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:415s
fi-ilk-650 total:289 pass:228 dwarn:0 dfail:0 fail:0 skip:61 time:460s
fi-ivb-3520m total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:498s
fi-ivb-3770 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:478s
fi-kbl-7500u total:289 pass:263 dwarn:1 dfail:0 fail:0 skip:25 time:504s
fi-kbl-7560u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:581s
fi-kbl-7567u total:289 pass:264 dwarn:4 dfail:0 fail:0 skip:21 time:489s
fi-kbl-r total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:588s
fi-pnv-d510 total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:654s
fi-skl-6260u total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:467s
fi-skl-6700hq total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:656s
fi-skl-6700k total:289 pass:262 dwarn:2 dfail:0 fail:0 skip:25 time:525s
fi-skl-6770hq total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:511s
fi-skl-gvtdvm total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:475s
fi-snb-2520m total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:576s
fi-snb-2600 total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:424s
85a1f1bd95a55d745bb4fff6ce5f1923502bd5c9 drm-tip: 2017y-10m-09d-20h-33m-26s UTC integration manifest
2e475c799415 drm/i915: Remove Gen9 WAs with no effect
2cca5d39b848 drm/i915: Document the i915_workarounds file
9c235c7f0ebf drm/i915: Move WA BB stuff to the workarounds file as well
40e1bbf47606 drm/i915: Print all workaround types correctly in debugfs
41afd19fc153 drm/i915: Save all Whitelist WAs and apply them at a later time
8aa20bc42c6f drm/i915: Save all MMIO WAs and apply them at a later time
249a71966e30 drm/i915: Rename saved workarounds to make it explicit that they are context WAs
ce3a45c3bceb drm/i915: Move workarounds from init_clock_gating
fa6247968024 drm/i915: Split out functions for different kinds of workarounds
9aeb5d10480f drm/i915: Move a bunch of workaround-related code to its own file
adce2ce40c80 drm/i915: No need for RING_MAX_NONPRIV_SLOTS space
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5961/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 22+ messages in thread