* [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty.
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
2015-10-07 14:37 ` Daniel Vetter
2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
` (4 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
To: intel-gfx
It's not an error for the workaround list to be empty if no
workarounds are needed. This will avoid spamming the logs
unnecessarily on Gen6 after the workaround list is hooked up on
pre-Gen8 hardware by the following commits.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 54ca344..3373516 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -717,7 +717,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
struct drm_i915_private *dev_priv = dev->dev_private;
struct i915_workarounds *w = &dev_priv->workarounds;
- if (WARN_ON_ONCE(w->count == 0))
+ if (w->count == 0)
return 0;
ring->gpu_caches_dirty = true;
--
2.5.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty.
2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
@ 2015-10-07 14:37 ` Daniel Vetter
0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2015-10-07 14:37 UTC (permalink / raw)
To: Francisco Jerez; +Cc: intel-gfx
On Wed, Oct 07, 2015 at 02:44:01PM +0300, Francisco Jerez wrote:
> It's not an error for the workaround list to be empty if no
> workarounds are needed. This will avoid spamming the logs
> unnecessarily on Gen6 after the workaround list is hooked up on
> pre-Gen8 hardware by the following commits.
>
> Signed-off-by: Francisco Jerez <currojerez@riseup.net>
I think the idea of this was to not forget this when bringing up new
platforms. I think we wrestle workarounds often enough to have a low risk
for that ;-)
Queued for -next, thanks for the patch.
-Daniel
> ---
> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 54ca344..3373516 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -717,7 +717,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
> struct drm_i915_private *dev_priv = dev->dev_private;
> struct i915_workarounds *w = &dev_priv->workarounds;
>
> - if (WARN_ON_ONCE(w->count == 0))
> + if (w->count == 0)
> return 0;
>
> ring->gpu_caches_dirty = true;
> --
> 2.5.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7.
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
2015-10-07 14:38 ` Daniel Vetter
2015-10-07 11:44 ` [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init Francisco Jerez
` (3 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
To: intel-gfx
intel_rcs_ctx_init() emits all workaround register writes on the list
to the ring, in addition to calling i915_gem_render_state_init(). The
workaround list is currently empty on Gen6-7 so this shouldn't cause
any functional changes.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
This is based on top of Daniel's "drm/i915: Resurrect golden context
on gen6/7". [1]
[1] https://patchwork.freedesktop.org/patch/60870/
drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3373516..e2e40d0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2703,7 +2703,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
GEN8_RING_SEMAPHORE_INIT;
}
} else if (INTEL_INFO(dev)->gen >= 6) {
- ring->init_context = i915_gem_render_state_init;
+ ring->init_context = intel_rcs_ctx_init;
ring->add_request = gen6_add_request;
ring->flush = gen7_render_ring_flush;
if (INTEL_INFO(dev)->gen == 6)
--
2.5.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7.
2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
@ 2015-10-07 14:38 ` Daniel Vetter
0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2015-10-07 14:38 UTC (permalink / raw)
To: Francisco Jerez; +Cc: intel-gfx
On Wed, Oct 07, 2015 at 02:44:02PM +0300, Francisco Jerez wrote:
> intel_rcs_ctx_init() emits all workaround register writes on the list
> to the ring, in addition to calling i915_gem_render_state_init(). The
> workaround list is currently empty on Gen6-7 so this shouldn't cause
> any functional changes.
>
> Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Queued for -next, thanks for the patch.
-Daniel
> ---
> This is based on top of Daniel's "drm/i915: Resurrect golden context
> on gen6/7". [1]
>
> [1] https://patchwork.freedesktop.org/patch/60870/
>
> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3373516..e2e40d0 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2703,7 +2703,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> GEN8_RING_SEMAPHORE_INIT;
> }
> } else if (INTEL_INFO(dev)->gen >= 6) {
> - ring->init_context = i915_gem_render_state_init;
> + ring->init_context = intel_rcs_ctx_init;
> ring->add_request = gen6_add_request;
> ring->flush = gen7_render_ring_flush;
> if (INTEL_INFO(dev)->gen == 6)
> --
> 2.5.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init.
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
2015-10-07 11:44 ` [PATCH 2/6] drm/i915: Don't warn if the workaround list is empty Francisco Jerez
2015-10-07 11:44 ` [PATCH 3/6] drm/i915: Hook up ring workaround writes at context creation time on Gen6-7 Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
2015-10-07 11:44 ` [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list Francisco Jerez
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
To: intel-gfx
Use init_l3_partitioning_workarounds() to set up the L3 partitioning
on context creation according to the hardware boot-up defaults for
each device.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 53 +++++++++++++++++++++++++++++++--
1 file changed, 51 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e2e40d0..c4c39c4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -880,6 +880,36 @@ static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
return 0;
}
+static int ivb_init_workarounds(struct intel_engine_cs *ring)
+{
+ /*
+ * Set up the default L3 partitioning of the hardware of 32
+ * ways (i.e. 256 KB on GT2 parts) for the URB and another 32
+ * ways for the RO partition.
+ */
+ return init_l3_partitioning_workarounds(ring, 32, 32, 0, 0);
+}
+
+static int vlv_init_workarounds(struct intel_engine_cs *ring)
+{
+ /*
+ * Set up the default L3 partitioning of the hardware of 64
+ * ways (i.e. 128 KB) for the URB, 16 ways (i.e. 32 KB) for
+ * the RO partition and another 16 ways for the DC.
+ */
+ return init_l3_partitioning_workarounds(ring, 64, 16, 16, 0);
+}
+
+static int hsw_init_workarounds(struct intel_engine_cs *ring)
+{
+ /*
+ * Set up the default L3 partitioning of the hardware of 32
+ * ways (i.e. 256 KB on GT2 parts) for the URB and another 32
+ * ways for the RO partition.
+ */
+ return init_l3_partitioning_workarounds(ring, 32, 32, 0, 0);
+}
+
static int gen8_init_workarounds(struct intel_engine_cs *ring)
{
struct drm_device *dev = ring->dev;
@@ -929,7 +959,12 @@ static int gen8_init_workarounds(struct intel_engine_cs *ring)
GEN6_WIZ_HASHING_MASK,
GEN6_WIZ_HASHING_16x4);
- return 0;
+ /*
+ * Set up the default L3 partitioning of the hardware of 48
+ * ways (i.e. 384 KB on GT2 parts, 192 KB on GT1 parts and
+ * CHV) for the URB and another 48 ways for the RO partition.
+ */
+ return init_l3_partitioning_workarounds(ring, 48, 0, 0, 48);
}
static int bdw_init_workarounds(struct intel_engine_cs *ring)
@@ -1053,7 +1088,12 @@ static int gen9_init_workarounds(struct intel_engine_cs *ring)
/* WaDisableSTUnitPowerOptimization:skl,bxt */
WA_SET_BIT_MASKED(HALF_SLICE_CHICKEN2, GEN8_ST_PO_DISABLE);
- return 0;
+ /*
+ * Set up the default L3 partitioning of the hardware of 48
+ * ways (i.e. 384 KB on GT2 parts, 192 KB on GT1 parts and
+ * BXT) for the URB and another 48 ways for the RO partition.
+ */
+ return init_l3_partitioning_workarounds(ring, 48, 0, 0, 48);
}
static int skl_tune_iz_hashing(struct intel_engine_cs *ring)
@@ -1175,6 +1215,15 @@ int init_workarounds_ring(struct intel_engine_cs *ring)
dev_priv->workarounds.count = 0;
+ if (IS_IVYBRIDGE(dev))
+ return ivb_init_workarounds(ring);
+
+ if (IS_VALLEYVIEW(dev))
+ return vlv_init_workarounds(ring);
+
+ if (IS_HASWELL(dev))
+ return hsw_init_workarounds(ring);
+
if (IS_BROADWELL(dev))
return bdw_init_workarounds(ring);
--
2.5.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list.
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
` (2 preceding siblings ...)
2015-10-07 11:44 ` [PATCH 4/6] drm/i915: Program the L3 configuration to hardware defaults on context init Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
2015-10-07 11:44 ` [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating Francisco Jerez
2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
5 siblings, 0 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
To: intel-gfx
This makes sure that the workaround is not accidentally undone by some
process (which is possible because the HSW_SCRATCH1 and
HSW_ROW_CHICKEN3 registers are partially whitelisted), what could
cause the next context to be created to hang if it wasn't expecting L3
atomics to be enabled.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
drivers/gpu/drm/i915/intel_pm.c | 5 -----
drivers/gpu/drm/i915/intel_ringbuffer.c | 12 ++++++++++++
2 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 60d120c..d495043 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6585,11 +6585,6 @@ static void haswell_init_clock_gating(struct drm_device *dev)
ilk_init_lp_watermarks(dev);
- /* L3 caching of data atomics doesn't work -- disable it. */
- I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
- I915_WRITE(HSW_ROW_CHICKEN3,
- _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
-
/* This is required by WaCatErrorRejectionIssue:hsw */
I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c4c39c4..4cb9f5c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -902,6 +902,18 @@ static int vlv_init_workarounds(struct intel_engine_cs *ring)
static int hsw_init_workarounds(struct intel_engine_cs *ring)
{
+ struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+ /*
+ * L3 caching of data atomics can easily cause a full-system
+ * hang if not used with extreme care -- Disable it on all
+ * contexts by default, userspace is free to re-enable them if
+ * it knows what it's doing.
+ */
+ WA_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
+ WA_WRITE(HSW_ROW_CHICKEN3,
+ _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
+
/*
* Set up the default L3 partitioning of the hardware of 32
* ways (i.e. 256 KB on GT2 parts) for the URB and another 32
--
2.5.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating.
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
` (3 preceding siblings ...)
2015-10-07 11:44 ` [PATCH 5/6] drm/i915/hsw: Move L3 atomics workaround to the workaround list Francisco Jerez
@ 2015-10-07 11:44 ` Francisco Jerez
2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
5 siblings, 0 replies; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 11:44 UTC (permalink / raw)
To: intel-gfx
The same work-arond is applied per-context in
init_l3_partitioning_workarounds(), so the direct MMIO write of
GEN7_L3SQCREG1 should be redundant now. Applying the work-around at
context creation time also makes sure that the MMIO writes are not
accidentally undone by userspace.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
---
drivers/gpu/drm/i915/i915_reg.h | 1 -
drivers/gpu/drm/i915/intel_pm.c | 6 ------
2 files changed, 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 663bc8f..b445c93 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5930,7 +5930,6 @@ enum skl_disp_power_wells {
#define DISABLE_PIXEL_MASK_CAMMING (1<<14)
#define GEN7_L3SQCREG1 0xB010
-#define VLV_B0_WA_L3SQCREG1_VALUE 0x00D30000
#define IVB_L3SQCREG1_SQGHPCI_DEFAULT 0x00730000
#define VLV_L3SQCREG1_SQGHPCI_DEFAULT 0x00D30000
#define HSW_L3SQCREG1_SQGHPCI_DEFAULT 0x00610000
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d495043..ccb91d3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6810,12 +6810,6 @@ static void valleyview_init_clock_gating(struct drm_device *dev)
_MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
/*
- * WaIncreaseL3CreditsForVLVB0:vlv
- * This is the hardware default actually.
- */
- I915_WRITE(GEN7_L3SQCREG1, VLV_B0_WA_L3SQCREG1_VALUE);
-
- /*
* WaDisableVLVClockGating_VBIIssue:vlv
* Disable clock gating on th GCFG unit to prevent a delay
* in the reporting of vblank events.
--
2.5.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
2015-10-07 11:44 [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Francisco Jerez
` (4 preceding siblings ...)
2015-10-07 11:44 ` [PATCH 6/6] drm/i915/vlv: Remove WaIncreaseL3CreditsForVLVB0 from init_clock_gating Francisco Jerez
@ 2015-10-07 12:42 ` Chris Wilson
2015-10-07 13:30 ` Francisco Jerez
5 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2015-10-07 12:42 UTC (permalink / raw)
To: Francisco Jerez; +Cc: intel-gfx
On Wed, Oct 07, 2015 at 02:44:00PM +0300, Francisco Jerez wrote:
> This programs the L3 configuration based on the sizes given for each
> partition as arguments. The relevant register writes are added to the
> workaround list so that they are re-applied to each context while it's
> initialized, preventing state leaks from other userspace processes
> which may have modified the L3 partitioning from its boot-up state,
> since all relevant registers are part of the software and hardware
> command checker whitelists.
>
> Some userspace clients (DDX and current versions of Mesa not patched
> with my L3 partitioning series [1]) assume that the L3 configuration,
> in particular the URB size, comes up in certain state when a context
> is created, but nothing in the kernel guarantees this assumption, the
> registers that control the partitioning of the L3 cache were being
> left untouched.
>
> Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
> same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
> the latter will be removed in a future commit.
>
> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html
Merge this with 1 + 5 (or was it 4?)- not only does it introduce a
temporary unused function, but we have to jump between patches to see
whether the function is safe (especially given those BUGs), and you add
all w/a in the same patch so bisection is not improved.
Looking at the function itself, I am not convinced that it actually adds
anything over calling setting up the WA from the vfuncs - at least the
bdw/gen7 split is redundant (we split at the vfunc then call one
function where we replit again, but with nasty constraints on the
interface of the function for different gen, it's not a function for the
faint of heart).
> +static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
> + unsigned int n_urb,
> + unsigned int n_ro,
> + unsigned int n_dc,
> + unsigned int n_all)
> +{
> + struct drm_device *dev = ring->dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> +
> + if (INTEL_INFO(dev)->gen >= 8) {
Why use dev here? You already have dev_priv, so why chase the pointer
again?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
2015-10-07 12:42 ` [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list Chris Wilson
@ 2015-10-07 13:30 ` Francisco Jerez
2015-10-08 9:17 ` Chris Wilson
0 siblings, 1 reply; 11+ messages in thread
From: Francisco Jerez @ 2015-10-07 13:30 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
[-- Attachment #1.1.1: Type: text/plain, Size: 3919 bytes --]
Chris Wilson <chris@chris-wilson.co.uk> writes:
> On Wed, Oct 07, 2015 at 02:44:00PM +0300, Francisco Jerez wrote:
>> This programs the L3 configuration based on the sizes given for each
>> partition as arguments. The relevant register writes are added to the
>> workaround list so that they are re-applied to each context while it's
>> initialized, preventing state leaks from other userspace processes
>> which may have modified the L3 partitioning from its boot-up state,
>> since all relevant registers are part of the software and hardware
>> command checker whitelists.
>>
>> Some userspace clients (DDX and current versions of Mesa not patched
>> with my L3 partitioning series [1]) assume that the L3 configuration,
>> in particular the URB size, comes up in certain state when a context
>> is created, but nothing in the kernel guarantees this assumption, the
>> registers that control the partitioning of the L3 cache were being
>> left untouched.
>>
>> Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
>> same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
>> the latter will be removed in a future commit.
>>
>> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html
>
> Merge this with 1 + 5 (or was it 4?)- not only does it introduce a
> temporary unused function, but we have to jump between patches to see
> whether the function is safe (especially given those BUGs), and you add
> all w/a in the same patch so bisection is not improved.
>
What do you want me to merge it with? This *is* PATCH 1, and PATCH 5 is
largely unrelated. I wouldn't have any objection to squashing PATCH 4
into this commit though.
> Looking at the function itself, I am not convinced that it actually adds
> anything over calling setting up the WA from the vfuncs - at least the
> bdw/gen7 split is redundant (we split at the vfunc then call one
> function where we replit again, but with nasty constraints on the
> interface of the function for different gen, it's not a function for the
> faint of heart).
>
The constraints are just the hardware's constraints on the L3
partitioning. The function is meant to implement the details of
programming the L3 configuration, which is different for different gens
even though the overall structure of the L3 partitioning is the same.
Of course the constraints set by specific hardware on the partition
sizes cannot be abstracted out.
I guess that splitting this out into two functions (one for gen7 and
another for gen8) wouldn't hurt much, but open-coding the function in
all its uses (5) would involve duplicating quite some code.
Assuming I split the function into gen7 and gen8 variants, would you
like them to implement a common consistent interface (i.e. the same
prototype that my init_l3_partitioning_workarounds() implements), or
would you like the variant for each gen to implement an ad-hoc interface
according to the partition configs actually needed on each gen? I
suspect the former would be cleaner.
>> +static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
>> + unsigned int n_urb,
>> + unsigned int n_ro,
>> + unsigned int n_dc,
>> + unsigned int n_all)
>> +{
>> + struct drm_device *dev = ring->dev;
>> + struct drm_i915_private *dev_priv = dev->dev_private;
>> +
>> + if (INTEL_INFO(dev)->gen >= 8) {
>
> Why use dev here? You already have dev_priv, so why chase the pointer
> again?
Sorry? Does INTEL_INFO() take a drm_device or a drm_i915_private
pointer as argument? The two types don't seem to be related by an
inheritance relationship or anything similar AFAICT, and most other uses
in this file seem to pass a drm_device as argument even where a
drm_i915_private is available.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 212 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH 1/6] drm/i915: Implement L3 partitioning set-up from the workaround list.
2015-10-07 13:30 ` Francisco Jerez
@ 2015-10-08 9:17 ` Chris Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2015-10-08 9:17 UTC (permalink / raw)
To: Francisco Jerez; +Cc: intel-gfx
On Wed, Oct 07, 2015 at 04:30:35PM +0300, Francisco Jerez wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > On Wed, Oct 07, 2015 at 02:44:00PM +0300, Francisco Jerez wrote:
> >> This programs the L3 configuration based on the sizes given for each
> >> partition as arguments. The relevant register writes are added to the
> >> workaround list so that they are re-applied to each context while it's
> >> initialized, preventing state leaks from other userspace processes
> >> which may have modified the L3 partitioning from its boot-up state,
> >> since all relevant registers are part of the software and hardware
> >> command checker whitelists.
> >>
> >> Some userspace clients (DDX and current versions of Mesa not patched
> >> with my L3 partitioning series [1]) assume that the L3 configuration,
> >> in particular the URB size, comes up in certain state when a context
> >> is created, but nothing in the kernel guarantees this assumption, the
> >> registers that control the partitioning of the L3 cache were being
> >> left untouched.
> >>
> >> Note that the VLV_L3SQCREG1_SQGHPCI_DEFAULT macro defined here has the
> >> same value as the previously defined VLV_B0_WA_L3SQCREG1_VALUE, but
> >> the latter will be removed in a future commit.
> >>
> >> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/093550.html
> >
> > Merge this with 1 + 5 (or was it 4?)- not only does it introduce a
> > temporary unused function, but we have to jump between patches to see
> > whether the function is safe (especially given those BUGs), and you add
> > all w/a in the same patch so bisection is not improved.
> >
> What do you want me to merge it with? This *is* PATCH 1, and PATCH 5 is
> largely unrelated. I wouldn't have any objection to squashing PATCH 4
> into this commit though.
1+4. That just highlights the issue of having two very tightly coupled
patches arbitrary split - the split here doesn't even improve bisection.
> > Looking at the function itself, I am not convinced that it actually adds
> > anything over calling setting up the WA from the vfuncs - at least the
> > bdw/gen7 split is redundant (we split at the vfunc then call one
> > function where we replit again, but with nasty constraints on the
> > interface of the function for different gen, it's not a function for the
> > faint of heart).
> >
> The constraints are just the hardware's constraints on the L3
> partitioning. The function is meant to implement the details of
> programming the L3 configuration, which is different for different gens
> even though the overall structure of the L3 partitioning is the same.
> Of course the constraints set by specific hardware on the partition
> sizes cannot be abstracted out.
>
> I guess that splitting this out into two functions (one for gen7 and
> another for gen8) wouldn't hurt much, but open-coding the function in
> all its uses (5) would involve duplicating quite some code.
Having slept on it, and I think a gen7/gen8 split is best.
My issue with the current combined function is that the interface is
different based on generation, which says to me that it is multiple
functions. It scores very low on the hard-to-misuse ranking
http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
And instead of the multiline multiple ternary operators, just have
an if-else-chain for hsw/vlv/ivb (for setting the cmd value).
> Assuming I split the function into gen7 and gen8 variants, would you
> like them to implement a common consistent interface (i.e. the same
> prototype that my init_l3_partitioning_workarounds() implements), or
> would you like the variant for each gen to implement an ad-hoc interface
> according to the partition configs actually needed on each gen? I
> suspect the former would be cleaner.
No, the BUGs tell me that the registers do not have a consistent
interface.
> >> +static int init_l3_partitioning_workarounds(struct intel_engine_cs *ring,
> >> + unsigned int n_urb,
> >> + unsigned int n_ro,
> >> + unsigned int n_dc,
> >> + unsigned int n_all)
> >> +{
> >> + struct drm_device *dev = ring->dev;
> >> + struct drm_i915_private *dev_priv = dev->dev_private;
> >> +
> >> + if (INTEL_INFO(dev)->gen >= 8) {
> >
> > Why use dev here? You already have dev_priv, so why chase the pointer
> > again?
>
> Sorry? Does INTEL_INFO() take a drm_device or a drm_i915_private
> pointer as argument? The two types don't seem to be related by an
> inheritance relationship or anything similar AFAICT, and most other uses
> in this file seem to pass a drm_device as argument even where a
> drm_i915_private is available.
drm_i915_private is our primary internal pointer. We can shave over 8KiB
of object code by removing the pointer dance we have from still passing
around drm_device and jumping to drm_i915_private. New code is supposed
to be using drm_i915_private consistently for internal interfaces (we
can then clearly see the external entry points).
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread