All of lore.kernel.org
 help / color / mirror / Atom feed
* Logical (HW) contexts for gen4/5
@ 2019-01-10 10:38 Chris Wilson
  2019-01-10 10:38 ` [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Chris Wilson @ 2019-01-10 10:38 UTC (permalink / raw)
  To: intel-gfx

I've completed a run through with piglit on Crestline, Cantiga and
Ironlake (missing desktops Broadwater, Eaglelake) and fixed up the fails
found. Currently piglit seems happy with both contexts disabled (so just
using the per-fd default context isolation) and contexts enabled in mesa.
-Chris


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
  2019-01-10 10:38 Logical (HW) contexts for gen4/5 Chris Wilson
@ 2019-01-10 10:38 ` Chris Wilson
  2019-01-10 10:38 ` [PATCH 2/3] drm/i915: Enable render context support for Ironlake (gen5) Chris Wilson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2019-01-10 10:38 UTC (permalink / raw)
  To: intel-gfx

Despite what I think the prm recommends, commit f2253bd9859b
("drm/i915/ringbuffer: EMIT_INVALIDATE after switch context") turned out
to be a huge mistake when enabling Ironlake contexts as the GPU would
hang on either a MI_FLUSH or PIPE_CONTROL immediately following the
MI_SET_CONTEXT of an active mesa context (more vanilla contexts, e.g.
simple rendercopies with igt, do not suffer).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 26b7274a2d43..035b88b32ca7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1866,12 +1866,12 @@ static int ring_request_alloc(struct i915_request *request)
 	 */
 	request->reserved_space += LEGACY_REQUEST_SIZE;
 
-	ret = switch_context(request);
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
 	if (ret)
 		return ret;
 
-	/* Unconditionally invalidate GPU caches and TLBs. */
-	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+	ret = switch_context(request);
 	if (ret)
 		return ret;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] drm/i915: Enable render context support for Ironlake (gen5)
  2019-01-10 10:38 Logical (HW) contexts for gen4/5 Chris Wilson
  2019-01-10 10:38 ` [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
@ 2019-01-10 10:38 ` Chris Wilson
  2019-01-10 10:38 ` [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2019-01-10 10:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke

Ironlake does support being able to saving and reloading context specific
registers between contexts, providing isolation of the basic GPU state
(as programmable by userspace). This allows userspace to assume that the
GPU retains their state from one batch to the next, minimising the
amount of state it needs to reload, or manually save and restore.

v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
CXT_SIZE and context-layout do not match in bspec, but the difference is
irrelevant as we overallocate the full page anyway (Ville).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 16 ++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 236cd040f271..f89b8f199e3f 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -219,6 +219,22 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 			return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
 					PAGE_SIZE);
 		case 5:
+			/*
+			 * There is a discrepancy here between the size reported
+			 * by the register and the size of the context layout
+			 * in the docs. Both are described as authorative!
+			 *
+			 * The discrepancy is on the order of a few cachelines,
+			 * but the total is under one page (4k), which is our
+			 * minimum allocation anyway so it should all come
+			 * out in the wash.
+			 */
+			cxt_size = I915_READ(CXT_SIZE) + 1;
+			DRM_DEBUG_DRIVER("gen%d CXT_SIZE = %d bytes [0x%08x]\n",
+					 INTEL_GEN(dev_priv),
+					 cxt_size * 64,
+					 cxt_size - 1);
+			return round_up(cxt_size * 64, PAGE_SIZE);
 		case 4:
 		case 3:
 		case 2:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 035b88b32ca7..889f3de79dd0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1624,11 +1624,14 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		/* These flags are for resource streamer on HSW+ */
 		flags |= HSW_MI_RS_SAVE_STATE_EN | HSW_MI_RS_RESTORE_STATE_EN;
 	else
+		/* We need to save the extended state for powersaving modes */
 		flags |= MI_SAVE_EXT_STATE_EN | MI_RESTORE_EXT_STATE_EN;
 
 	len = 4;
 	if (IS_GEN(i915, 7))
 		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
+	else if (IS_GEN(i915, 5))
+		len += 2;
 	if (flags & MI_FORCE_RESTORE) {
 		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
 		flags &= ~MI_FORCE_RESTORE;
@@ -1657,6 +1660,14 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 						GEN6_PSMI_SLEEP_MSG_DISABLE);
 			}
 		}
+	} else if (IS_GEN(i915, 5)) {
+		/*
+		 * This w/a is only listed for pre-production ilk a/b steppings,
+		 * but is also mentioned for programming the powerctx. To be
+		 * safe, just apply the workaround; we do not use SyncFlush so
+		 * this should never take effect and so be a no-op!
+		 */
+		*cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
 	}
 
 	if (force_restore) {
@@ -1711,6 +1722,8 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 			*cs++ = MI_NOOP;
 		}
 		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+	} else if (IS_GEN(i915, 5)) {
+		*cs++ = MI_SUSPEND_FLUSH;
 	}
 
 	intel_ring_advance(rq, cs);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
  2019-01-10 10:38 Logical (HW) contexts for gen4/5 Chris Wilson
  2019-01-10 10:38 ` [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
  2019-01-10 10:38 ` [PATCH 2/3] drm/i915: Enable render context support for Ironlake (gen5) Chris Wilson
@ 2019-01-10 10:38 ` Chris Wilson
  2019-01-10 16:03   ` Ville Syrjälä
  2019-01-10 12:34 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Patchwork
  2019-01-10 15:56 ` ✓ Fi.CI.IGT: " Patchwork
  4 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2019-01-10 10:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke

Broadwater and the rest of gen4  do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1
---
 drivers/gpu/drm/i915/intel_engine_cs.c    |  2 +-
 drivers/gpu/drm/i915/intel_gpu_commands.h |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c   | 17 +++++++++++++++++
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index f89b8f199e3f..88109e0de051 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -219,6 +219,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 			return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
 					PAGE_SIZE);
 		case 5:
+		case 4:
 			/*
 			 * There is a discrepancy here between the size reported
 			 * by the register and the size of the context layout
@@ -235,7 +236,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 					 cxt_size * 64,
 					 cxt_size - 1);
 			return round_up(cxt_size * 64, PAGE_SIZE);
-		case 4:
 		case 3:
 		case 2:
 		/* For the special day when i810 gets merged. */
diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/intel_gpu_commands.h
index 105e2a9e874a..00c0175c37ed 100644
--- a/drivers/gpu/drm/i915/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/intel_gpu_commands.h
@@ -266,6 +266,9 @@
 #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
 	((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
 
+#define GFX_OP_CONSTANT_BUFFER \
+	(0x3 << 29 | 0x0 << 27 | 0x0 << 24 | 0x2 << 16)
+
 #define MFX_WAIT  ((0x3<<29)|(0x1<<27)|(0x0<<16))
 
 #define COLOR_BLT     ((0x2<<29)|(0x40<<22))
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 889f3de79dd0..21bd71cf2e94 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1632,6 +1632,8 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
 	else if (IS_GEN(i915, 5))
 		len += 2;
+	else if (IS_GEN(i915, 4))
+		len += 4;
 	if (flags & MI_FORCE_RESTORE) {
 		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
 		flags &= ~MI_FORCE_RESTORE;
@@ -1668,6 +1670,21 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		 * this should never take effect and so be a no-op!
 		 */
 		*cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
+	} else if (IS_GEN(i915, 4)) {
+		/*
+		 * Disable CONSTANT_BUFFER before it is loaded from the context
+		 * image. For as it is loaded, it is executed and the stored
+		 * address may no longer be valid, leading to a GPU hang.
+		 *
+		 * This imposes the requirement that userspace reload their
+		 * CONSTANT_BUFFER on every batch, fortunately a requirement
+		 * they are already accustomed to from before contexts were
+		 * enabled.
+		 */
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = 0;
+		*cs++ = i915_ggtt_offset(rq->hw_context->state) + 0x1d4;
+		*cs++ = GFX_OP_CONSTANT_BUFFER; /* inactive */
 	}
 
 	if (force_restore) {
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
  2019-01-10 10:38 Logical (HW) contexts for gen4/5 Chris Wilson
                   ` (2 preceding siblings ...)
  2019-01-10 10:38 ` [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
@ 2019-01-10 12:34 ` Patchwork
  2019-01-10 15:56 ` ✓ Fi.CI.IGT: " Patchwork
  4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2019-01-10 12:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
URL   : https://patchwork.freedesktop.org/series/54991/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5391 -> Patchwork_11273
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_11273 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_11273, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/54991/revisions/1/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_11273:

### IGT changes ###

#### Warnings ####

  * igt@gem_ctx_create@basic:
    - fi-elk-e7500:       {SKIP} -> PASS +6

  * igt@gem_ctx_exec@basic:
    - fi-bwr-2160:        {SKIP} -> PASS +6

  * igt@gem_ctx_param@basic-default:
    - fi-ilk-650:         {SKIP} -> PASS +6

  * igt@pm_rpm@basic-pci-d3-state:
    - fi-byt-n2820:       {SKIP} -> PASS

  
Known issues
------------

  Here are the changes found in Patchwork_11273 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-compute:
    - fi-kbl-8809g:       NOTRUN -> FAIL [fdo#108094]

  * igt@amdgpu/amd_prime@amd-to-i915:
    - fi-kbl-8809g:       NOTRUN -> FAIL [fdo#107341]

  * igt@gem_mmap_gtt@basic:
    - fi-glk-dsi:         PASS -> INCOMPLETE [fdo#103359] / [k.org#198133]

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       PASS -> DMESG-WARN [fdo#102614]

  * igt@kms_pipe_crc_basic@hang-read-crc-pipe-a:
    - fi-byt-clapper:     PASS -> FAIL [fdo#103191] / [fdo#107362]

  
#### Possible fixes ####

  * igt@amdgpu/amd_basic@userptr:
    - fi-kbl-8809g:       DMESG-WARN [fdo#108965] -> PASS

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
    - fi-blb-e6850:       INCOMPLETE [fdo#107718] -> PASS

  * igt@pm_rpm@basic-rte:
    - fi-byt-n2820:       FAIL [fdo#108800] -> PASS

  
#### Warnings ####

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-kbl-7567u:       DMESG-WARN [fdo#103558] / [fdo#105079] / [fdo#105602] -> DMESG-FAIL [fdo#105079]

  
  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#103558]: https://bugs.freedesktop.org/show_bug.cgi?id=103558
  [fdo#105079]: https://bugs.freedesktop.org/show_bug.cgi?id=105079
  [fdo#105602]: https://bugs.freedesktop.org/show_bug.cgi?id=105602
  [fdo#107341]: https://bugs.freedesktop.org/show_bug.cgi?id=107341
  [fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#108094]: https://bugs.freedesktop.org/show_bug.cgi?id=108094
  [fdo#108800]: https://bugs.freedesktop.org/show_bug.cgi?id=108800
  [fdo#108965]: https://bugs.freedesktop.org/show_bug.cgi?id=108965
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (50 -> 43)
------------------------------

  Missing    (7): fi-kbl-soraka fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-pnv-d510 fi-icl-y fi-bdw-samus 


Build changes
-------------

    * Linux: CI_DRM_5391 -> Patchwork_11273

  CI_DRM_5391: 301489f99b4cbab3fe70de185a74747e7fca46c3 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4758: f01796214bbde31e37b0593e547ad9436fdd02ba @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_11273: 0d45b385f0d7ddc64afc5006d9de71c4f6fef779 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

0d45b385f0d7 drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
e2befd63e9fc drm/i915: Enable render context support for Ironlake (gen5)
b5788e6024ef drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11273/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* ✓ Fi.CI.IGT: success for series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
  2019-01-10 10:38 Logical (HW) contexts for gen4/5 Chris Wilson
                   ` (3 preceding siblings ...)
  2019-01-10 12:34 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Patchwork
@ 2019-01-10 15:56 ` Patchwork
  4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2019-01-10 15:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
URL   : https://patchwork.freedesktop.org/series/54991/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5391_full -> Patchwork_11273_full
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_11273_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_11273_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_11273_full:

### IGT changes ###

#### Warnings ####

  * igt@pm_rc6_residency@rc6-accuracy:
    - shard-snb:          {SKIP} -> PASS

  * igt@tools_test@tools_test:
    - shard-glk:          PASS -> {SKIP}

  
Known issues
------------

  Here are the changes found in Patchwork_11273_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_schedule@pi-ringfull-bsd:
    - shard-skl:          NOTRUN -> FAIL [fdo#103158]

  * igt@gem_exec_whisper@normal:
    - shard-skl:          PASS -> TIMEOUT [fdo#108592]

  * igt@gem_softpin@noreloc-s3:
    - shard-snb:          PASS -> DMESG-WARN [fdo#102365]

  * igt@kms_busy@extended-modeset-hang-newfb-render-b:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#107956]

  * igt@kms_cursor_crc@cursor-64x21-random:
    - shard-glk:          PASS -> FAIL [fdo#103232]

  * igt@kms_cursor_crc@cursor-64x64-dpms:
    - shard-skl:          NOTRUN -> FAIL [fdo#103232] +1

  * igt@kms_draw_crc@fill-fb:
    - shard-skl:          NOTRUN -> FAIL [fdo#103184] +1

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-blt:
    - shard-apl:          PASS -> FAIL [fdo#103167]

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-wc:
    - shard-skl:          NOTRUN -> FAIL [fdo#105682]

  * igt@kms_frontbuffer_tracking@fbc-stridechange:
    - shard-skl:          NOTRUN -> FAIL [fdo#105683]

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-draw-pwrite:
    - shard-skl:          NOTRUN -> FAIL [fdo#103167] +2

  * igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
    - shard-skl:          NOTRUN -> FAIL [fdo#103191] / [fdo#107362]

  * igt@kms_plane@pixel-format-pipe-a-planes-source-clamping:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#106885]

  * igt@kms_plane@plane-position-covered-pipe-c-planes:
    - shard-apl:          PASS -> FAIL [fdo#103166] +1
    - shard-glk:          PASS -> FAIL [fdo#103166]

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-max:
    - shard-skl:          NOTRUN -> FAIL [fdo#108145] +1

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          NOTRUN -> FAIL [fdo#107815] / [fdo#108145] +1

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          PASS -> FAIL [fdo#107815]

  * igt@kms_rotation_crc@multiplane-rotation-cropping-top:
    - shard-glk:          PASS -> DMESG-WARN [fdo#105763] / [fdo#106538]

  * igt@kms_setmode@basic:
    - shard-apl:          PASS -> FAIL [fdo#99912]
    - shard-hsw:          PASS -> FAIL [fdo#99912]

  * igt@pm_rpm@gem-execbuf:
    - shard-skl:          PASS -> INCOMPLETE [fdo#107803] / [fdo#107807]

  * igt@pm_rpm@system-suspend-devices:
    - shard-skl:          PASS -> INCOMPLETE [fdo#107807] +1

  
#### Possible fixes ####

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic:
    - shard-apl:          FAIL [fdo#106510] / [fdo#108145] -> PASS

  * igt@kms_chv_cursor_fail@pipe-b-128x128-top-edge:
    - shard-skl:          FAIL [fdo#104671] -> PASS

  * igt@kms_cursor_crc@cursor-64x21-onscreen:
    - shard-glk:          FAIL [fdo#103232] -> PASS

  * igt@kms_cursor_crc@cursor-64x21-random:
    - shard-apl:          FAIL [fdo#103232] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-mmap-wc:
    - shard-apl:          FAIL [fdo#103167] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-onoff:
    - shard-glk:          FAIL [fdo#103167] -> PASS +4

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-render:
    - shard-apl:          INCOMPLETE [fdo#103927] -> {SKIP}

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max:
    - shard-glk:          FAIL [fdo#108145] -> PASS

  * igt@kms_plane_multiple@atomic-pipe-a-tiling-x:
    - shard-glk:          FAIL [fdo#103166] -> PASS

  * igt@kms_setmode@basic:
    - shard-kbl:          FAIL [fdo#99912] -> PASS

  
  [fdo#102365]: https://bugs.freedesktop.org/show_bug.cgi?id=102365
  [fdo#103158]: https://bugs.freedesktop.org/show_bug.cgi?id=103158
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103184]: https://bugs.freedesktop.org/show_bug.cgi?id=103184
  [fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104671]: https://bugs.freedesktop.org/show_bug.cgi?id=104671
  [fdo#105682]: https://bugs.freedesktop.org/show_bug.cgi?id=105682
  [fdo#105683]: https://bugs.freedesktop.org/show_bug.cgi?id=105683
  [fdo#105763]: https://bugs.freedesktop.org/show_bug.cgi?id=105763
  [fdo#106510]: https://bugs.freedesktop.org/show_bug.cgi?id=106510
  [fdo#106538]: https://bugs.freedesktop.org/show_bug.cgi?id=106538
  [fdo#106885]: https://bugs.freedesktop.org/show_bug.cgi?id=106885
  [fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362
  [fdo#107803]: https://bugs.freedesktop.org/show_bug.cgi?id=107803
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#107815]: https://bugs.freedesktop.org/show_bug.cgi?id=107815
  [fdo#107956]: https://bugs.freedesktop.org/show_bug.cgi?id=107956
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108592]: https://bugs.freedesktop.org/show_bug.cgi?id=108592
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (6 -> 6)
------------------------------

  No changes in participating hosts


Build changes
-------------

    * Linux: CI_DRM_5391 -> Patchwork_11273

  CI_DRM_5391: 301489f99b4cbab3fe70de185a74747e7fca46c3 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4758: f01796214bbde31e37b0593e547ad9436fdd02ba @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_11273: 0d45b385f0d7ddc64afc5006d9de71c4f6fef779 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11273/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
  2019-01-10 10:38 ` [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
@ 2019-01-10 16:03   ` Ville Syrjälä
  2019-01-10 16:32     ` Chris Wilson
  2019-01-12 21:28     ` Chris Wilson
  0 siblings, 2 replies; 11+ messages in thread
From: Ville Syrjälä @ 2019-01-10 16:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Kenneth Graunke

On Thu, Jan 10, 2019 at 10:38:07AM +0000, Chris Wilson wrote:
> Broadwater and the rest of gen4  do support being able to saving and
> reloading context specific registers between contexts, providing isolation
> of the basic GPU state (as programmable by userspace). This allows
> userspace to assume that the GPU retains their state from one batch to the
> next, minimising the amount of state it needs to reload and manually save
> across batches.
> 
> v2: CONSTANT_BUFFER woes
> 
> Running through piglit turned up an interesting issue, a GPU hang inside
> the context load. The context image includes the CONSTANT_BUFFER command
> that loads an address into a on-gpu buffer, and the context load was
> executing that immediately. However, since it was reading from the GTT
> there is no guarantee that the GTT retains the same configuration as
> when the context was saved, resulting in stray reads and a GPU hang.
> 
> Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
> ring before saving the context to no avail, we resort to patching out
> the instruction inside the context image before loading.
> 
> This does impose that gen4 always reissues CONSTANT_BUFFER commands on
> each batch, but due to the use of a shared GTT that was and will remain
> a requirement.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c    |  2 +-
>  drivers/gpu/drm/i915/intel_gpu_commands.h |  3 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.c   | 17 +++++++++++++++++
>  3 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index f89b8f199e3f..88109e0de051 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -219,6 +219,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
>  			return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
>  					PAGE_SIZE);
>  		case 5:
> +		case 4:
>  			/*
>  			 * There is a discrepancy here between the size reported
>  			 * by the register and the size of the context layout
> @@ -235,7 +236,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
>  					 cxt_size * 64,
>  					 cxt_size - 1);
>  			return round_up(cxt_size * 64, PAGE_SIZE);
> -		case 4:
>  		case 3:
>  		case 2:
>  		/* For the special day when i810 gets merged. */
> diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/intel_gpu_commands.h
> index 105e2a9e874a..00c0175c37ed 100644
> --- a/drivers/gpu/drm/i915/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/intel_gpu_commands.h
> @@ -266,6 +266,9 @@
>  #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
>  	((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
>  
> +#define GFX_OP_CONSTANT_BUFFER \
> +	(0x3 << 29 | 0x0 << 27 | 0x0 << 24 | 0x2 << 16)
> +
>  #define MFX_WAIT  ((0x3<<29)|(0x1<<27)|(0x0<<16))
>  
>  #define COLOR_BLT     ((0x2<<29)|(0x40<<22))
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 889f3de79dd0..21bd71cf2e94 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1632,6 +1632,8 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>  		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>  	else if (IS_GEN(i915, 5))
>  		len += 2;
> +	else if (IS_GEN(i915, 4))
> +		len += 4;
>  	if (flags & MI_FORCE_RESTORE) {
>  		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
>  		flags &= ~MI_FORCE_RESTORE;
> @@ -1668,6 +1670,21 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>  		 * this should never take effect and so be a no-op!
>  		 */
>  		*cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
> +	} else if (IS_GEN(i915, 4)) {
> +		/*
> +		 * Disable CONSTANT_BUFFER before it is loaded from the context
> +		 * image. For as it is loaded, it is executed and the stored
> +		 * address may no longer be valid, leading to a GPU hang.
> +		 *
> +		 * This imposes the requirement that userspace reload their
> +		 * CONSTANT_BUFFER on every batch, fortunately a requirement
> +		 * they are already accustomed to from before contexts were
> +		 * enabled.
> +		 */
> +		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
> +		*cs++ = 0;
> +		*cs++ = i915_ggtt_offset(rq->hw_context->state) + 0x1d4;

Is that offset correct for ctg/elk? The docs suggest that it is not.
Though it looks like it'd land inside 3DSTATE_SAMPLER_PALETTE_LOAD_1
so probably not something anyone would notice.

> +		*cs++ = GFX_OP_CONSTANT_BUFFER; /* inactive */

I'm thinking there are a few ways in which we could even end up
with an inconsistent curbe setup in the context.

Eg. something like this:
CS_URB_STATE(num>0)
URB_FENCE(cs>0)
CONSTANT_BUFFER(len>0)
CS_URB_STATE(num=0)
URB_FENCE(cs=0)
ctx save
ctx restore

The restore would end up trying to issue CONSTANT_BUFFER with
len > 0 even though nothing is allocated for the CS unit in
the URB.

That didn't seem to be the case in your earlier hang though
so not quite sure why it was hanging. And I'm not sure this
would even cause a hang. The spec just says "undefined".

Bspec has this to say about CONSTANT_BUFFER:
"Programming Notes
 Constant Buffers can only be allocated in linear (not tiled) graphics memory
 Constant Buffers can only be mapped to Main Memory (UC)"

Maybe we were violating one of those? Though I don't think we could
even violate the tiled vs. linear restriction unless the memory access
goes through the gtt fence for some reason. I didn't think gen4+ do
that anymore. So that maybe leaves the possibility that a snooped
bo got mapped to the same address. But again I'm not sure if that
would cause a hang or just potentiall stale data being loaded into
the CURBE.

Another mystery is why g4x/ilk wouldn't need this. There's nothing in
the docs to suggest that it should behave any differently. Hmm. Maybe
MI_INVALIDATE_ISP could have something to do with this? Maybe that
forces CONSTANT_BUFFER to be saved with valid==0 always? I guess it
might be semi-interesting to peek at the resulting context image
for.

After a bit of digging I found a few more potentially related
tidbits:

ECOSPKD[4] Constant Buffer Save/Restore Disable [DevBW-C1+]

ILK_Chicken_1[7] ILK-C0: Curbe 8HW drop ECO
 0: Fix is enabled to complete the 8HW CURBE restore FIFO clear
 1: Fix is disable. Behavior is same as ILK B0

ILK_Chicken_1[0] [DevILK-B0]
 “1” CS will force URB modify_enables set after context restore
 “0” CS will not force URB modify_enables set after context restore

All of those default to 0 AFAICS, so only ILK_Chicken_1[7] seems like it
could have any real effect. So wouldn't explain g4x working.

>  	}
>  
>  	if (force_restore) {
> -- 
> 2.20.1

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
  2019-01-10 16:03   ` Ville Syrjälä
@ 2019-01-10 16:32     ` Chris Wilson
  2019-01-12 21:28     ` Chris Wilson
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2019-01-10 16:32 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Kenneth Graunke

Quoting Ville Syrjälä (2019-01-10 16:03:21)
> On Thu, Jan 10, 2019 at 10:38:07AM +0000, Chris Wilson wrote:
> > Broadwater and the rest of gen4  do support being able to saving and
> > reloading context specific registers between contexts, providing isolation
> > of the basic GPU state (as programmable by userspace). This allows
> > userspace to assume that the GPU retains their state from one batch to the
> > next, minimising the amount of state it needs to reload and manually save
> > across batches.
> > 
> > v2: CONSTANT_BUFFER woes
> > 
> > Running through piglit turned up an interesting issue, a GPU hang inside
> > the context load. The context image includes the CONSTANT_BUFFER command
> > that loads an address into a on-gpu buffer, and the context load was
> > executing that immediately. However, since it was reading from the GTT
> > there is no guarantee that the GTT retains the same configuration as
> > when the context was saved, resulting in stray reads and a GPU hang.
> > 
> > Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
> > ring before saving the context to no avail, we resort to patching out
> > the instruction inside the context image before loading.
> > 
> > This does impose that gen4 always reissues CONSTANT_BUFFER commands on
> > each batch, but due to the use of a shared GTT that was and will remain
> > a requirement.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1
> > ---
> >  drivers/gpu/drm/i915/intel_engine_cs.c    |  2 +-
> >  drivers/gpu/drm/i915/intel_gpu_commands.h |  3 +++
> >  drivers/gpu/drm/i915/intel_ringbuffer.c   | 17 +++++++++++++++++
> >  3 files changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index f89b8f199e3f..88109e0de051 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -219,6 +219,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
> >                       return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
> >                                       PAGE_SIZE);
> >               case 5:
> > +             case 4:
> >                       /*
> >                        * There is a discrepancy here between the size reported
> >                        * by the register and the size of the context layout
> > @@ -235,7 +236,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
> >                                        cxt_size * 64,
> >                                        cxt_size - 1);
> >                       return round_up(cxt_size * 64, PAGE_SIZE);
> > -             case 4:
> >               case 3:
> >               case 2:
> >               /* For the special day when i810 gets merged. */
> > diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/intel_gpu_commands.h
> > index 105e2a9e874a..00c0175c37ed 100644
> > --- a/drivers/gpu/drm/i915/intel_gpu_commands.h
> > +++ b/drivers/gpu/drm/i915/intel_gpu_commands.h
> > @@ -266,6 +266,9 @@
> >  #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
> >       ((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
> >  
> > +#define GFX_OP_CONSTANT_BUFFER \
> > +     (0x3 << 29 | 0x0 << 27 | 0x0 << 24 | 0x2 << 16)
> > +
> >  #define MFX_WAIT  ((0x3<<29)|(0x1<<27)|(0x0<<16))
> >  
> >  #define COLOR_BLT     ((0x2<<29)|(0x40<<22))
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 889f3de79dd0..21bd71cf2e94 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1632,6 +1632,8 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
> >               len += 2 + (num_rings ? 4*num_rings + 6 : 0);
> >       else if (IS_GEN(i915, 5))
> >               len += 2;
> > +     else if (IS_GEN(i915, 4))
> > +             len += 4;
> >       if (flags & MI_FORCE_RESTORE) {
> >               GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
> >               flags &= ~MI_FORCE_RESTORE;
> > @@ -1668,6 +1670,21 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
> >                * this should never take effect and so be a no-op!
> >                */
> >               *cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
> > +     } else if (IS_GEN(i915, 4)) {
> > +             /*
> > +              * Disable CONSTANT_BUFFER before it is loaded from the context
> > +              * image. For as it is loaded, it is executed and the stored
> > +              * address may no longer be valid, leading to a GPU hang.
> > +              *
> > +              * This imposes the requirement that userspace reload their
> > +              * CONSTANT_BUFFER on every batch, fortunately a requirement
> > +              * they are already accustomed to from before contexts were
> > +              * enabled.
> > +              */
> > +             *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
> > +             *cs++ = 0;
> > +             *cs++ = i915_ggtt_offset(rq->hw_context->state) + 0x1d4;
> 
> Is that offset correct for ctg/elk? The docs suggest that it is not.
> Though it looks like it'd land inside 3DSTATE_SAMPLER_PALETTE_LOAD_1
> so probably not something anyone would notice.

I just checked piglit didn't break. Admittedly a bit blaise :)

> > +             *cs++ = GFX_OP_CONSTANT_BUFFER; /* inactive */
> 
> I'm thinking there are a few ways in which we could even end up
> with an inconsistent curbe setup in the context.
> 
> Eg. something like this:
> CS_URB_STATE(num>0)
> URB_FENCE(cs>0)
> CONSTANT_BUFFER(len>0)
> CS_URB_STATE(num=0)
> URB_FENCE(cs=0)
> ctx save
> ctx restore
> 
> The restore would end up trying to issue CONSTANT_BUFFER with
> len > 0 even though nothing is allocated for the CS unit in
> the URB.
> 
> That didn't seem to be the case in your earlier hang though
> so not quite sure why it was hanging. And I'm not sure this
> would even cause a hang. The spec just says "undefined".
> 
> Bspec has this to say about CONSTANT_BUFFER:
> "Programming Notes
>  Constant Buffers can only be allocated in linear (not tiled) graphics memory
>  Constant Buffers can only be mapped to Main Memory (UC)"
> 
> Maybe we were violating one of those? Though I don't think we could
> even violate the tiled vs. linear restriction unless the memory access
> goes through the gtt fence for some reason. I didn't think gen4+ do
> that anymore. So that maybe leaves the possibility that a snooped
> bo got mapped to the same address. But again I'm not sure if that
> would cause a hang or just potentiall stale data being loaded into
> the CURBE.

Agreed that tiling shouldn't be an issue with the gen4 architecture (and
that's probably just cut'n'paste from older). snooped is also unlikely
since we/mesa are definitely not allocating snooped buffers on the fly,
so the only snoop buffers should be the static HWSP -- and that's not
even in the GGTT for Crestline.

So perhaps it is an overlapping fence?

> Another mystery is why g4x/ilk wouldn't need this. There's nothing in
> the docs to suggest that it should behave any differently. Hmm. Maybe
> MI_INVALIDATE_ISP could have something to do with this? Maybe that
> forces CONSTANT_BUFFER to be saved with valid==0 always? I guess it
> might be semi-interesting to peek at the resulting context image
> for.

The ctg/ilk context image is ugly and looks much more like line noise
(like there's a magically 255 dword 3DSTATE command full of junk.)

It could just be with the larger GGTT for g4x/ilk, piglit run gpu wasn't
able to trigger the same issue.

> After a bit of digging I found a few more potentially related
> tidbits:
> 
> ECOSPKD[4] Constant Buffer Save/Restore Disable [DevBW-C1+]
> 
> ILK_Chicken_1[7] ILK-C0: Curbe 8HW drop ECO
>  0: Fix is enabled to complete the 8HW CURBE restore FIFO clear
>  1: Fix is disable. Behavior is same as ILK B0
> 
> ILK_Chicken_1[0] [DevILK-B0]
>  “1” CS will force URB modify_enables set after context restore
>  “0” CS will not force URB modify_enables set after context restore
> 
> All of those default to 0 AFAICS, so only ILK_Chicken_1[7] seems like it
> could have any real effect. So wouldn't explain g4x working.

That does look interesting. Thanks,
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
  2019-01-10 16:03   ` Ville Syrjälä
  2019-01-10 16:32     ` Chris Wilson
@ 2019-01-12 21:28     ` Chris Wilson
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2019-01-12 21:28 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Kenneth Graunke

Quoting Ville Syrjälä (2019-01-10 16:03:21)
> After a bit of digging I found a few more potentially related
> tidbits:
> 
> ECOSPKD[4] Constant Buffer Save/Restore Disable [DevBW-C1+]

Does the trick for DevCL. I see it is still listed for g4x, so seems safe
enough to keep for IS_GEN(4) and that it would seem mere good fortune
that it did fail for DevCL highlighting the CONSTANT_BUFFER issue.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
  2019-01-28 21:52 [PATCH 1/3] " Chris Wilson
@ 2019-01-28 21:52 ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2019-01-28 21:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke

Broadwater and the rest of gen4  do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1
---
 drivers/gpu/drm/i915/intel_engine_cs.c    |  2 +-
 drivers/gpu/drm/i915/intel_gpu_commands.h |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c   | 17 +++++++++++++++++
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 148c3e06a2eb..32bd850eec30 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -220,6 +220,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 			return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
 					PAGE_SIZE);
 		case 5:
+		case 4:
 			/*
 			 * There is a discrepancy here between the size reported
 			 * by the register and the size of the context layout
@@ -236,7 +237,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 					 cxt_size * 64,
 					 cxt_size - 1);
 			return round_up(cxt_size * 64, PAGE_SIZE);
-		case 4:
 		case 3:
 		case 2:
 		/* For the special day when i810 gets merged. */
diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/intel_gpu_commands.h
index b96a31bc1080..a95bfd922c41 100644
--- a/drivers/gpu/drm/i915/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/intel_gpu_commands.h
@@ -265,6 +265,9 @@
 #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
 	((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
 
+#define GFX_OP_CONSTANT_BUFFER \
+	(0x3 << 29 | 0x0 << 27 | 0x0 << 24 | 0x2 << 16)
+
 #define MFX_WAIT  ((0x3<<29)|(0x1<<27)|(0x0<<16))
 
 #define COLOR_BLT     ((0x2<<29)|(0x40<<22))
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 19def67bf1c5..c03d156d59d9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1734,6 +1734,8 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
 	else if (IS_GEN(i915, 5))
 		len += 2;
+	else if (IS_GEN(i915, 4))
+		len += 4;
 	if (flags & MI_FORCE_RESTORE) {
 		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
 		flags &= ~MI_FORCE_RESTORE;
@@ -1770,6 +1772,21 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		 * this should never take effect and so be a no-op!
 		 */
 		*cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
+	} else if (IS_GEN(i915, 4)) {
+		/*
+		 * Disable CONSTANT_BUFFER before it is loaded from the context
+		 * image. For as it is loaded, it is executed and the stored
+		 * address may no longer be valid, leading to a GPU hang.
+		 *
+		 * This imposes the requirement that userspace reload their
+		 * CONSTANT_BUFFER on every batch, fortunately a requirement
+		 * they are already accustomed to from before contexts were
+		 * enabled.
+		 */
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = 0;
+		*cs++ = i915_ggtt_offset(rq->hw_context->state) + 0x1d4;
+		*cs++ = GFX_OP_CONSTANT_BUFFER; /* inactive */
 	}
 
 	if (force_restore) {
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
  2019-04-19 11:17 [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
@ 2019-04-19 11:17 ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2019-04-19 11:17 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke

Broadwater and the rest of gen4  do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

v3: ECOSPKD to the rescue

Ville found the magic bit in the ECOSPKD to disable saving and restoring
the CONSTANT_BUFFER from the context image, thereby completely avoiding
the GPU hangs from chasing invalid pointers. This appears to be the
default behaviour for gen5, and so we just need to tweak gen4 to match.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
---
 drivers/gpu/drm/i915/i915_reg.h         |  3 +++
 drivers/gpu/drm/i915/intel_engine_cs.c  |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++++++++++++++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b74824f0b5b1..5815703ac35f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2665,6 +2665,9 @@ enum i915_power_well_id {
 # define MODE_IDLE					(1 << 9)
 # define STOP_RING					(1 << 8)
 
+#define ECOSPKD		_MMIO(0x21d0)
+# define CONSTANT_BUFFER_SR_DISABLE BIT(4)
+
 #define GEN6_GT_MODE	_MMIO(0x20d0)
 #define GEN7_GT_MODE	_MMIO(0x7008)
 #define   GEN6_WIZ_HASHING(hi, lo)			(((hi) << 9) | ((lo) << 7))
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index fc8be2fcb4e6..f9db2e0bca12 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -211,6 +211,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 			return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
 					PAGE_SIZE);
 		case 5:
+		case 4:
 			/*
 			 * There is a discrepancy here between the size reported
 			 * by the register and the size of the context layout
@@ -227,7 +228,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 					 cxt_size * 64,
 					 cxt_size - 1);
 			return round_up(cxt_size * 64, PAGE_SIZE);
-		case 4:
 		case 3:
 		case 2:
 		/* For the special day when i810 gets merged. */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2d2e33cd3fae..26b276ed00b3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -832,6 +832,20 @@ static int init_render_ring(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 
+	/*
+	 * Disable CONSTANT_BUFFER before it is loaded from the context
+	 * image. For as it is loaded, it is executed and the stored
+	 * address may no longer be valid, leading to a GPU hang.
+	 *
+	 * This imposes the requirement that userspace reload their
+	 * CONSTANT_BUFFER on every batch, fortunately a requirement
+	 * they are already accustomed to from before contexts were
+	 * enabled.
+	 */
+	if (IS_GEN(dev_priv, 4))
+		I915_WRITE(ECOSPKD,
+			   _MASKED_BIT_ENABLE(CONSTANT_BUFFER_SR_DISABLE));
+
 	/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
 	if (IS_GEN_RANGE(dev_priv, 4, 6))
 		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-04-19 11:17 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-01-10 10:38 Logical (HW) contexts for gen4/5 Chris Wilson
2019-01-10 10:38 ` [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
2019-01-10 10:38 ` [PATCH 2/3] drm/i915: Enable render context support for Ironlake (gen5) Chris Wilson
2019-01-10 10:38 ` [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
2019-01-10 16:03   ` Ville Syrjälä
2019-01-10 16:32     ` Chris Wilson
2019-01-12 21:28     ` Chris Wilson
2019-01-10 12:34 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Patchwork
2019-01-10 15:56 ` ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2019-01-28 21:52 [PATCH 1/3] " Chris Wilson
2019-01-28 21:52 ` [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
2019-04-19 11:17 [PATCH 1/3] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
2019-04-19 11:17 ` [PATCH 3/3] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.