stable.vger.kernel.org archive mirror
* [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
       [not found] <1460491389-8602-1-git-send-email-chris@chris-wilson.co.uk>
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:33   ` Daniel Vetter
  2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
  2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
  2 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Daniel Vetter, Ville Syrjälä, stable

Two concurrent writes into the same register cacheline have the chance of
killing the machine on Ivybridge and other gen7 parts; this includes LRI
emitted from the command parser.  The MI_SET_CONTEXT itself serves as a
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault.  However, if a second context switch
occurs immediately afterwards, we may have two adjacent blocks of LRI to
the same registers, which may then trigger the hang. To counteract this we
insert a delay after the second register write using an SRM.

This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. There have been no reports from the wild, though, so
the hang must occur at a low enough frequency that no one has correlated
the random machine freezes with i915.ko.

The issue was introduced with
commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 16 10:02:27 2014 +0000

    drm/i915: Disable PSMI sleep messages on all rings around context switches

Testcase: igt/gem_ctx_switch/render-interruptible #ivb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index fe580cb9501a..e5ad7b21e356 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 
 	len = 4;
 	if (INTEL_INFO(engine->dev)->gen >= 7)
-		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
+		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
 
 	ret = intel_ring_begin(req, len);
 	if (ret)
@@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	if (INTEL_INFO(engine->dev)->gen >= 7) {
 		if (num_rings) {
 			struct intel_engine_cs *signaller;
+			i915_reg_t last_reg = {}; /* keep gcc quiet */
 
 			intel_ring_emit(engine,
 					MI_LOAD_REGISTER_IMM(num_rings));
@@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 				if (signaller == engine)
 					continue;
 
-				intel_ring_emit_reg(engine,
-						    RING_PSMI_CTL(signaller->mmio_base));
+				last_reg = RING_PSMI_CTL(signaller->mmio_base);
+				intel_ring_emit_reg(engine, last_reg);
 				intel_ring_emit(engine,
 						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
 			}
+
+			/* Insert a delay before the next switch! */
+			intel_ring_emit(engine,
+					MI_STORE_REGISTER_MEM |
+					MI_SRM_LRM_GLOBAL_GTT);
+			intel_ring_emit_reg(engine, last_reg);
+			intel_ring_emit(engine, engine->scratch.gtt_offset);
+			intel_ring_emit(engine, MI_NOOP);
 		}
 		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 	}
-- 
2.8.0.rc3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0
       [not found] <1460491389-8602-1-git-send-email-chris@chris-wilson.co.uk>
  2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:34   ` [Intel-gfx] " Daniel Vetter
  2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
  2 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, stable

For reasons unknown, Sandybridge GT1 (at least) will eventually hang when
it encounters a ring wraparound at offset 0. The test case that
reproduces the bug reliably forces a large number of interrupted context
switches, thereby causing very frequent ring wraparounds, but there are
similar bug reports in the wild with the same symptoms: seqno writes
stopping just before the wrap while the ringbuffer is at address 0. The
bug is also timing-sensitive, but adding various delays hasn't helped
pinpoint where the window lies.

Whether the fault is restricted to the ringbuffer itself or the GTT
addressing is unclear, but moving the ringbuffer fixes all the hangs I
have been able to reproduce.

References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8391382431b2..904a8a276f6a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2096,10 +2096,12 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	struct drm_i915_gem_object *obj = ringbuf->obj;
+	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
+	unsigned flags = PIN_OFFSET_BIAS | 4096;
 	int ret;
 
 	if (HAS_LLC(dev_priv) && !obj->stolen) {
-		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, 0);
+		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags);
 		if (ret)
 			return ret;
 
@@ -2113,7 +2115,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			goto err_unpin;
 		}
 	} else {
-		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
+		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
+					    flags | PIN_MAPPABLE);
 		if (ret)
 			return ret;
 
-- 
2.8.0.rc3



* [CI-ping 15/15] drm/i915: Late request cancellations are harmful
       [not found] <1460491389-8602-1-git-send-email-chris@chris-wilson.co.uk>
  2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
  2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:57   ` Daniel Vetter
  2 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Daniel Vetter, Tvrtko Ursulin, stable

Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. When we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware has finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed, not cancelled, if the external
state is unknown.

All that remains of the i915_gem_request_cancel() users is just a couple
of extremely unlikely allocation failures, so remove the API entirely.

A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 --
 drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
 drivers/gpu/drm/i915/intel_display.c       |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
 drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
 6 files changed, 30 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 061ecc43d935..de84dd7be971 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
 struct drm_i915_gem_request * __must_check
 i915_gem_request_alloc(struct intel_engine_cs *engine,
 		       struct intel_context *ctx);
-void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 				   struct drm_file *file);
@@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file_priv);
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 					struct drm_i915_gem_request *req);
-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 				   struct drm_i915_gem_execbuffer2 *args,
 				   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b6879d43dd74..c6f09e7839ea 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 		 * fully prepared. Thus it can be cleaned up using the proper
 		 * free code.
 		 */
-		i915_gem_request_cancel(req);
+		intel_ring_reserved_space_cancel(req->ringbuf);
+		i915_gem_request_unreference(req);
 		return ret;
 	}
 
@@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 	return err ? ERR_PTR(err) : req;
 }
 
-void i915_gem_request_cancel(struct drm_i915_gem_request *req)
-{
-	intel_ring_reserved_space_cancel(req->ringbuf);
-
-	i915_gem_request_unreference(req);
-}
-
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *engine)
 {
@@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
 				return PTR_ERR(req);
 
 			ret = i915_switch_context(req);
-			if (ret) {
-				i915_gem_request_cancel(req);
-				return ret;
-			}
-
 			i915_add_request_no_flush(req);
+			if (ret)
+				return ret;
 		}
 
 		ret = intel_engine_idle(engine);
@@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
 		req = i915_gem_request_alloc(engine, NULL);
 		if (IS_ERR(req)) {
 			ret = PTR_ERR(req);
-			i915_gem_cleanup_engines(dev);
-			goto out;
+			break;
 		}
 
 		if (engine->id == RCS) {
-			for (j = 0; j < NUM_L3_SLICES(dev); j++)
-				i915_gem_l3_remap(req, j);
+			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
+				ret = i915_gem_l3_remap(req, j);
+				if (ret)
+					goto err_request;
+			}
 		}
 
 		ret = i915_ppgtt_init_ring(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("PPGTT enable %s failed %d\n",
-				  engine->name, ret);
-			i915_gem_request_cancel(req);
-			i915_gem_cleanup_engines(dev);
-			goto out;
-		}
+		if (ret)
+			goto err_request;
 
 		ret = i915_gem_context_enable(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("Context enable %s failed %d\n",
+		if (ret)
+			goto err_request;
+
+err_request:
+		i915_add_request_no_flush(req);
+		if (ret) {
+			DRM_ERROR("Failed to enable %s, error=%d\n",
 				  engine->name, ret);
-			i915_gem_request_cancel(req);
 			i915_gem_cleanup_engines(dev);
-			goto out;
+			break;
 		}
-
-		i915_add_request_no_flush(req);
 	}
 
 out:
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6ee4f00f620c..6f4f2a6cdf93 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 	}
 }
 
-void
+static void
 i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 {
 	/* Unconditionally force add_request to emit a full flush. */
@@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, params->request);
-	i915_gem_execbuffer_retire_commands(params);
 
 	return 0;
 }
@@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	ret = i915_gem_request_add_to_client(req, file);
 	if (ret)
-		goto err_batch_unpin;
+		goto err_request;
 
 	/*
 	 * Save assorted stuff away to pass through to *_submission().
@@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	params->request                 = req;
 
 	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
+err_request:
+	i915_gem_execbuffer_retire_commands(params);
 
 err_batch_unpin:
 	/*
@@ -1657,14 +1658,6 @@ err:
 	i915_gem_context_unreference(ctx);
 	eb_destroy(eb);
 
-	/*
-	 * If the request was created but not successfully submitted then it
-	 * must be freed again. If it was submitted then it is being tracked
-	 * on the active request list and no clean up is required here.
-	 */
-	if (ret && !IS_ERR_OR_NULL(req))
-		i915_gem_request_cancel(req);
-
 	mutex_unlock(&dev->struct_mutex);
 
 pre_mutex_err:
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b1b457864e17..3cae596d10a3 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11664,7 +11664,7 @@ cleanup_unpin:
 	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
 cleanup_pending:
 	if (!IS_ERR_OR_NULL(request))
-		i915_gem_request_cancel(request);
+		i915_add_request_no_flush(request);
 	atomic_dec(&intel_crtc->unpin_work_count);
 	mutex_unlock(&dev->struct_mutex);
 cleanup:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b8f6b96472a6..6fc24deaa16a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, params->request);
-	i915_gem_execbuffer_retire_commands(params);
 
 	return 0;
 }
@@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
 		}
 
 		ret = engine->init_context(req);
+		i915_add_request_no_flush(req);
 		if (ret) {
 			DRM_ERROR("ring init context: %d\n",
 				ret);
-			i915_gem_request_cancel(req);
 			goto error_ringbuf;
 		}
-		i915_add_request_no_flush(req);
 	}
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 6694e9230cd5..bcc3b6a016d8 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 
 	ret = intel_ring_begin(req, 4);
 	if (ret) {
-		i915_gem_request_cancel(req);
+		i915_add_request_no_flush(req);
 		return ret;
 	}
 
@@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 
 	ret = intel_ring_begin(req, 2);
 	if (ret) {
-		i915_gem_request_cancel(req);
+		i915_add_request_no_flush(req);
 		return ret;
 	}
 
@@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 
 	ret = intel_ring_begin(req, 6);
 	if (ret) {
-		i915_gem_request_cancel(req);
+		i915_add_request_no_flush(req);
 		return ret;
 	}
 
@@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
 
 		ret = intel_ring_begin(req, 2);
 		if (ret) {
-			i915_gem_request_cancel(req);
+			i915_add_request_no_flush(req);
 			return ret;
 		}
 
-- 
2.8.0.rc3



* Re: [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
@ 2016-04-13  9:33   ` Daniel Vetter
  2016-04-18  9:50     ` [Intel-gfx] " Jani Nikula
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Daniel Vetter, Ville Syrjälä, stable

On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
> Two concurrent writes into the same register cacheline have the chance of
> killing the machine on Ivybridge and other gen7 parts; this includes LRI
> emitted from the command parser.  The MI_SET_CONTEXT itself serves as a
> serialising barrier and prevents the pair of register writes in the first
> packet from triggering the fault.  However, if a second context switch
> occurs immediately afterwards, we may have two adjacent blocks of LRI to
> the same registers, which may then trigger the hang. To counteract this we
> insert a delay after the second register write using an SRM.
> 
> This is easiest to reproduce with something like
> igt/gem_ctx_switch/interruptible that triggers back-to-back context
> switches (with no operations in between them in the command stream,
> which requires the execbuf operation to be interrupted after the
> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
> interruptible igt. There have been no reports from the wild, though, so
> the hang must occur at a low enough frequency that no one has correlated
> the random machine freezes with i915.ko.
> 
> The issue was introduced with
> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Dec 16 10:02:27 2014 +0000
> 
>     drm/i915: Disable PSMI sleep messages on all rings around context switches
> 
> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index fe580cb9501a..e5ad7b21e356 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  
>  	len = 4;
>  	if (INTEL_INFO(engine->dev)->gen >= 7)
> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>  
>  	ret = intel_ring_begin(req, len);
>  	if (ret)
> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>  		if (num_rings) {
>  			struct intel_engine_cs *signaller;
> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>  
>  			intel_ring_emit(engine,
>  					MI_LOAD_REGISTER_IMM(num_rings));
> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  				if (signaller == engine)
>  					continue;
>  
> -				intel_ring_emit_reg(engine,
> -						    RING_PSMI_CTL(signaller->mmio_base));
> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
> +				intel_ring_emit_reg(engine, last_reg);
>  				intel_ring_emit(engine,
>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>  			}
> +
> +			/* Insert a delay before the next switch! */
> +			intel_ring_emit(engine,
> +					MI_STORE_REGISTER_MEM |
> +					MI_SRM_LRM_GLOBAL_GTT);
> +			intel_ring_emit_reg(engine, last_reg);
> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
> +			intel_ring_emit(engine, MI_NOOP);
>  		}
>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>  	}
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Intel-gfx] [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0
  2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
@ 2016-04-13  9:34   ` Daniel Vetter
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On Tue, Apr 12, 2016 at 09:03:06PM +0100, Chris Wilson wrote:
> For reasons unknown, Sandybridge GT1 (at least) will eventually hang when
> it encounters a ring wraparound at offset 0. The test case that
> reproduces the bug reliably forces a large number of interrupted context
> switches, thereby causing very frequent ring wraparounds, but there are
> similar bug reports in the wild with the same symptoms: seqno writes
> stopping just before the wrap while the ringbuffer is at address 0. The
> bug is also timing-sensitive, but adding various delays hasn't helped
> pinpoint where the window lies.
> 
> Whether the fault is restricted to the ringbuffer itself or the GTT
> addressing is unclear, but moving the ringbuffer fixes all the hangs I
> have been able to reproduce.
> 
> References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
> Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@vger.kernel.org

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8391382431b2..904a8a276f6a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2096,10 +2096,12 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct i915_ggtt *ggtt = &dev_priv->ggtt;
>  	struct drm_i915_gem_object *obj = ringbuf->obj;
> +	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
> +	unsigned flags = PIN_OFFSET_BIAS | 4096;
>  	int ret;
>  
>  	if (HAS_LLC(dev_priv) && !obj->stolen) {
> -		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, 0);
> +		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags);
>  		if (ret)
>  			return ret;
>  
> @@ -2113,7 +2115,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  			goto err_unpin;
>  		}
>  	} else {
> -		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
> +		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
> +					    flags | PIN_MAPPABLE);
>  		if (ret)
>  			return ret;
>  
> -- 
> 2.8.0.rc3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
@ 2016-04-13  9:57   ` Daniel Vetter
  2016-04-13 14:21     ` John Harrison
  2016-04-18  9:46     ` [Intel-gfx] " Jani Nikula
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:57 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Daniel Vetter, Tvrtko Ursulin, stable, John Harrison

On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
> Conceptually, each request is a record of a hardware transaction - we
> build up a list of pending commands and then either commit them to
> hardware, or cancel them. However, whilst building up the list of
> pending commands, we may modify state outside of the request and make
> references to the pending request. If we do so and then cancel that
> request, external objects then point to the deleted request leading to
> both graphical and memory corruption.
> 
> The easiest example is to consider object/VMA tracking. When we mark an
> object as active in a request, we store a pointer to this, the most
> recent request, in the object. When we want to free that object, we wait
> for the most recent request to be idle before proceeding (otherwise the
> hardware will write to pages now owned by the system, or we will attempt
> to read from those pages before the hardware has finished writing). If
> the request was cancelled instead, that wait completes immediately. As a
> result, all requests must be committed, not cancelled, if the external
> state is unknown.
> 
> All that remains of the i915_gem_request_cancel() users is just a couple
> of extremely unlikely allocation failures, so remove the API entirely.
> 
> A consequence of committing all incomplete requests is that we generate
> excess breadcrumbs and fill the ring much more often with dummy work. We
> have completely undone the outstanding_last_seqno optimisation.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: stable@vger.kernel.org

Cc: John Harrison <John.C.Harrison@Intel.com>

I'd like John's ack on this on too, but patch itself looks sound. Fast r-b
since we've discussed this a while ago already ...

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  2 --
>  drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
>  drivers/gpu/drm/i915/intel_display.c       |  2 +-
>  drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
>  drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
>  6 files changed, 30 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 061ecc43d935..de84dd7be971 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
>  struct drm_i915_gem_request * __must_check
>  i915_gem_request_alloc(struct intel_engine_cs *engine,
>  		       struct intel_context *ctx);
> -void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  void i915_gem_request_free(struct kref *req_ref);
>  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>  				   struct drm_file *file);
> @@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
>  			     struct drm_file *file_priv);
>  void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  					struct drm_i915_gem_request *req);
> -void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
>  int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>  				   struct drm_i915_gem_execbuffer2 *args,
>  				   struct list_head *vmas);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b6879d43dd74..c6f09e7839ea 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>  		 * fully prepared. Thus it can be cleaned up using the proper
>  		 * free code.
>  		 */
> -		i915_gem_request_cancel(req);
> +		intel_ring_reserved_space_cancel(req->ringbuf);
> +		i915_gem_request_unreference(req);
>  		return ret;
>  	}
>  
> @@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>  	return err ? ERR_PTR(err) : req;
>  }
>  
> -void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> -{
> -	intel_ring_reserved_space_cancel(req->ringbuf);
> -
> -	i915_gem_request_unreference(req);
> -}
> -
>  struct drm_i915_gem_request *
>  i915_gem_find_active_request(struct intel_engine_cs *engine)
>  {
> @@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
>  				return PTR_ERR(req);
>  
>  			ret = i915_switch_context(req);
> -			if (ret) {
> -				i915_gem_request_cancel(req);
> -				return ret;
> -			}
> -
>  			i915_add_request_no_flush(req);
> +			if (ret)
> +				return ret;
>  		}
>  
>  		ret = intel_engine_idle(engine);
> @@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
>  		req = i915_gem_request_alloc(engine, NULL);
>  		if (IS_ERR(req)) {
>  			ret = PTR_ERR(req);
> -			i915_gem_cleanup_engines(dev);
> -			goto out;
> +			break;
>  		}
>  
>  		if (engine->id == RCS) {
> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
> -				i915_gem_l3_remap(req, j);
> +			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
> +				ret = i915_gem_l3_remap(req, j);
> +				if (ret)
> +					goto err_request;
> +			}
>  		}
>  
>  		ret = i915_ppgtt_init_ring(req);
> -		if (ret && ret != -EIO) {
> -			DRM_ERROR("PPGTT enable %s failed %d\n",
> -				  engine->name, ret);
> -			i915_gem_request_cancel(req);
> -			i915_gem_cleanup_engines(dev);
> -			goto out;
> -		}
> +		if (ret)
> +			goto err_request;
>  
>  		ret = i915_gem_context_enable(req);
> -		if (ret && ret != -EIO) {
> -			DRM_ERROR("Context enable %s failed %d\n",
> +		if (ret)
> +			goto err_request;
> +
> +err_request:
> +		i915_add_request_no_flush(req);
> +		if (ret) {
> +			DRM_ERROR("Failed to enable %s, error=%d\n",
>  				  engine->name, ret);
> -			i915_gem_request_cancel(req);
>  			i915_gem_cleanup_engines(dev);
> -			goto out;
> +			break;
>  		}
> -
> -		i915_add_request_no_flush(req);
>  	}
>  
>  out:
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 6ee4f00f620c..6f4f2a6cdf93 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  	}
>  }
>  
> -void
> +static void
>  i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>  {
>  	/* Unconditionally force add_request to emit a full flush. */
> @@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>  
>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
> -	i915_gem_execbuffer_retire_commands(params);
>  
>  	return 0;
>  }
> @@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	ret = i915_gem_request_add_to_client(req, file);
>  	if (ret)
> -		goto err_batch_unpin;
> +		goto err_request;
>  
>  	/*
>  	 * Save assorted stuff away to pass through to *_submission().
> @@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	params->request                 = req;
>  
>  	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
> +err_request:
> +	i915_gem_execbuffer_retire_commands(params);
>  
>  err_batch_unpin:
>  	/*
> @@ -1657,14 +1658,6 @@ err:
>  	i915_gem_context_unreference(ctx);
>  	eb_destroy(eb);
>  
> -	/*
> -	 * If the request was created but not successfully submitted then it
> -	 * must be freed again. If it was submitted then it is being tracked
> -	 * on the active request list and no clean up is required here.
> -	 */
> -	if (ret && !IS_ERR_OR_NULL(req))
> -		i915_gem_request_cancel(req);
> -
>  	mutex_unlock(&dev->struct_mutex);
>  
>  pre_mutex_err:
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index b1b457864e17..3cae596d10a3 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11664,7 +11664,7 @@ cleanup_unpin:
>  	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
>  cleanup_pending:
>  	if (!IS_ERR_OR_NULL(request))
> -		i915_gem_request_cancel(request);
> +		i915_add_request_no_flush(request);
>  	atomic_dec(&intel_crtc->unpin_work_count);
>  	mutex_unlock(&dev->struct_mutex);
>  cleanup:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b8f6b96472a6..6fc24deaa16a 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>  
>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
> -	i915_gem_execbuffer_retire_commands(params);
>  
>  	return 0;
>  }
> @@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>  		}
>  
>  		ret = engine->init_context(req);
> +		i915_add_request_no_flush(req);
>  		if (ret) {
>  			DRM_ERROR("ring init context: %d\n",
>  				ret);
> -			i915_gem_request_cancel(req);
>  			goto error_ringbuf;
>  		}
> -		i915_add_request_no_flush(req);
>  	}
>  	return 0;
>  
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 6694e9230cd5..bcc3b6a016d8 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>  
>  	ret = intel_ring_begin(req, 4);
>  	if (ret) {
> -		i915_gem_request_cancel(req);
> +		i915_add_request_no_flush(req);
>  		return ret;
>  	}
>  
> @@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>  
>  	ret = intel_ring_begin(req, 2);
>  	if (ret) {
> -		i915_gem_request_cancel(req);
> +		i915_add_request_no_flush(req);
>  		return ret;
>  	}
>  
> @@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>  
>  	ret = intel_ring_begin(req, 6);
>  	if (ret) {
> -		i915_gem_request_cancel(req);
> +		i915_add_request_no_flush(req);
>  		return ret;
>  	}
>  
> @@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>  
>  		ret = intel_ring_begin(req, 2);
>  		if (ret) {
> -			i915_gem_request_cancel(req);
> +			i915_add_request_no_flush(req);
>  			return ret;
>  		}
>  
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-13  9:57   ` Daniel Vetter
@ 2016-04-13 14:21     ` John Harrison
  2016-04-18  9:46     ` [Intel-gfx] " Jani Nikula
  1 sibling, 0 replies; 10+ messages in thread
From: John Harrison @ 2016-04-13 14:21 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson
  Cc: intel-gfx, Daniel Vetter, Tvrtko Ursulin, stable

On 13/04/2016 10:57, Daniel Vetter wrote:
> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>> Conceptually, each request is a record of a hardware transaction - we
>> build up a list of pending commands and then either commit them to
>> hardware, or cancel them. However, whilst building up the list of
>> pending commands, we may modify state outside of the request and make
>> references to the pending request. If we do so and then cancel that
>> request, external objects then point to the deleted request leading to
>> both graphical and memory corruption.
>>
>> The easiest example is to consider object/VMA tracking. When we mark an
>> object as active in a request, we store a pointer to this, the most
>> recent request, in the object. Then we want to free that object, we wait
>> for the most recent request to be idle before proceeding (otherwise the
>> hardware will write to pages now owned by the system, or we will attempt
>> to read from those pages before the hardware is finished writing). If
>> the request was cancelled instead, that wait completes immediately. As a
>> result, all requests must be committed and not cancelled if the external
>> state is unknown.
>>
>> All that remains of i915_gem_request_cancel() users are just a couple of
>> extremely unlikely allocation failures, so remove the API entirely.
>>
>> A consequence of committing all incomplete requests is that we generate
>> excess breadcrumbs and fill the ring much more often with dummy work. We
>> have completely undone the outstanding_last_seqno optimisation.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: stable@vger.kernel.org
> Cc: John Harrison <John.C.Harrison@Intel.com>
>
> I'd like John's ack on this one too, but the patch itself looks sound. Fast r-b
> since we've discussed this a while ago already ...

I think this is going to cause a problem with the scheduler. You are 
effectively saying that the execbuf call cannot fail beyond the point of 
allocating a request. If it gets that far then it must go all the way 
and submit the request to the hardware. With a scheduler, that means 
adding it to the scheduler's queues and tracking it all the way through 
the system to completion. If nothing else, that sounds like a lot of 
extra overhead for no actual work. Or worse, if the failure occurs at a 
point where the request cannot be sent any further through the system 
because something critical failed, then you are really stuffed.

I'm not sure what the other option would be though, short of being able 
to undo the last read/write object tracking updates.
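[Editorial note: the invariant under discussion — once a request is allocated it must be committed exactly once, never cancelled — can be sketched as a toy model in plain C. All names here are invented for illustration; this is not the driver code.]

```c
#include <stdbool.h>

/* Toy model of the new rule: every allocated request is committed
 * exactly once, even when an intermediate setup step fails.  On
 * success, control falls through the err_request label, mirroring
 * the structure of the i915_gem_init_hw() hunk in the patch. */
struct toy_request { bool added; };

static int step_that_fails(void) { return -5; /* stand-in errno */ }

static int toy_init_hw(struct toy_request *req)
{
	int ret;

	ret = step_that_fails();
	if (ret)
		goto err_request;

	/* further setup steps would go here */

err_request:
	/* Commit unconditionally: external state may already hold
	 * pointers to req, so cancelling it would leave those
	 * pointers dangling (the corruption the patch fixes). */
	req->added = true;
	return ret;
}
```

Note that the request is marked added on both the success and failure paths; only the return code distinguishes them.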


>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |  2 --
>>   drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
>>   drivers/gpu/drm/i915/intel_display.c       |  2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
>>   drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
>>   6 files changed, 30 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 061ecc43d935..de84dd7be971 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
>>   struct drm_i915_gem_request * __must_check
>>   i915_gem_request_alloc(struct intel_engine_cs *engine,
>>   		       struct intel_context *ctx);
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>   void i915_gem_request_free(struct kref *req_ref);
>>   int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>>   				   struct drm_file *file);
>> @@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
>>   			     struct drm_file *file_priv);
>>   void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>   					struct drm_i915_gem_request *req);
>> -void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
>>   int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>   				   struct drm_i915_gem_execbuffer2 *args,
>>   				   struct list_head *vmas);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index b6879d43dd74..c6f09e7839ea 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>   		 * fully prepared. Thus it can be cleaned up using the proper
>>   		 * free code.
>>   		 */
>> -		i915_gem_request_cancel(req);
>> +		intel_ring_reserved_space_cancel(req->ringbuf);
>> +		i915_gem_request_unreference(req);
>>   		return ret;
>>   	}
>>   
>> @@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>>   	return err ? ERR_PTR(err) : req;
>>   }
>>   
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> -{
>> -	intel_ring_reserved_space_cancel(req->ringbuf);
>> -
>> -	i915_gem_request_unreference(req);
>> -}
>> -
>>   struct drm_i915_gem_request *
>>   i915_gem_find_active_request(struct intel_engine_cs *engine)
>>   {
>> @@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
>>   				return PTR_ERR(req);
>>   
>>   			ret = i915_switch_context(req);
>> -			if (ret) {
>> -				i915_gem_request_cancel(req);
>> -				return ret;
>> -			}
>> -
>>   			i915_add_request_no_flush(req);
>> +			if (ret)
>> +				return ret;
>>   		}
>>   
>>   		ret = intel_engine_idle(engine);
>> @@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
>>   		req = i915_gem_request_alloc(engine, NULL);
>>   		if (IS_ERR(req)) {
>>   			ret = PTR_ERR(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>   		}
>>   
>>   		if (engine->id == RCS) {
>> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
>> -				i915_gem_l3_remap(req, j);
>> +			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
>> +				ret = i915_gem_l3_remap(req, j);
>> +				if (ret)
>> +					goto err_request;
>> +			}
>>   		}
>>   
>>   		ret = i915_ppgtt_init_ring(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("PPGTT enable %s failed %d\n",
>> -				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> -		}
>> +		if (ret)
>> +			goto err_request;
>>   
>>   		ret = i915_gem_context_enable(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("Context enable %s failed %d\n",
>> +		if (ret)
>> +			goto err_request;
>> +
>> +err_request:
>> +		i915_add_request_no_flush(req);
>> +		if (ret) {
>> +			DRM_ERROR("Failed to enable %s, error=%d\n",
>>   				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>>   			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>   		}
>> -
>> -		i915_add_request_no_flush(req);
>>   	}
>>   
>>   out:
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 6ee4f00f620c..6f4f2a6cdf93 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>   	}
>>   }
>>   
>> -void
>> +static void
>>   i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>>   {
>>   	/* Unconditionally force add_request to emit a full flush. */
>> @@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>   	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>   
>>   	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>   
>>   	return 0;
>>   }
>> @@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>   
>>   	ret = i915_gem_request_add_to_client(req, file);
>>   	if (ret)
>> -		goto err_batch_unpin;
>> +		goto err_request;
>>   
>>   	/*
>>   	 * Save assorted stuff away to pass through to *_submission().
>> @@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>   	params->request                 = req;
>>   
>>   	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
>> +err_request:
>> +	i915_gem_execbuffer_retire_commands(params);
>>   
>>   err_batch_unpin:
>>   	/*
>> @@ -1657,14 +1658,6 @@ err:
>>   	i915_gem_context_unreference(ctx);
>>   	eb_destroy(eb);
>>   
>> -	/*
>> -	 * If the request was created but not successfully submitted then it
>> -	 * must be freed again. If it was submitted then it is being tracked
>> -	 * on the active request list and no clean up is required here.
>> -	 */
>> -	if (ret && !IS_ERR_OR_NULL(req))
>> -		i915_gem_request_cancel(req);
>> -
>>   	mutex_unlock(&dev->struct_mutex);
>>   
>>   pre_mutex_err:
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index b1b457864e17..3cae596d10a3 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -11664,7 +11664,7 @@ cleanup_unpin:
>>   	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
>>   cleanup_pending:
>>   	if (!IS_ERR_OR_NULL(request))
>> -		i915_gem_request_cancel(request);
>> +		i915_add_request_no_flush(request);
>>   	atomic_dec(&intel_crtc->unpin_work_count);
>>   	mutex_unlock(&dev->struct_mutex);
>>   cleanup:
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index b8f6b96472a6..6fc24deaa16a 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>   	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>   
>>   	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>   
>>   	return 0;
>>   }
>> @@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>>   		}
>>   
>>   		ret = engine->init_context(req);
>> +		i915_add_request_no_flush(req);
>>   		if (ret) {
>>   			DRM_ERROR("ring init context: %d\n",
>>   				ret);
>> -			i915_gem_request_cancel(req);
>>   			goto error_ringbuf;
>>   		}
>> -		i915_add_request_no_flush(req);
>>   	}
>>   	return 0;
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
>> index 6694e9230cd5..bcc3b6a016d8 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>>   
>>   	ret = intel_ring_begin(req, 4);
>>   	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>   		return ret;
>>   	}
>>   
>> @@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>>   
>>   	ret = intel_ring_begin(req, 2);
>>   	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>   		return ret;
>>   	}
>>   
>> @@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>>   
>>   	ret = intel_ring_begin(req, 6);
>>   	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>   		return ret;
>>   	}
>>   
>> @@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>>   
>>   		ret = intel_ring_begin(req, 2);
>>   		if (ret) {
>> -			i915_gem_request_cancel(req);
>> +			i915_add_request_no_flush(req);
>>   			return ret;
>>   		}
>>   
>> -- 
>> 2.8.0.rc3
>>


* Re: [Intel-gfx] [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-13  9:57   ` Daniel Vetter
  2016-04-13 14:21     ` John Harrison
@ 2016-04-18  9:46     ` Jani Nikula
  1 sibling, 0 replies; 10+ messages in thread
From: Jani Nikula @ 2016-04-18  9:46 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson; +Cc: Daniel Vetter, intel-gfx, stable

On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>> Conceptually, each request is a record of a hardware transaction - we
>> build up a list of pending commands and then either commit them to
>> hardware, or cancel them. However, whilst building up the list of
>> pending commands, we may modify state outside of the request and make
>> references to the pending request. If we do so and then cancel that
>> request, external objects then point to the deleted request leading to
>> both graphical and memory corruption.
>> 
>> The easiest example is to consider object/VMA tracking. When we mark an
>> object as active in a request, we store a pointer to this, the most
>> recent request, in the object. Then we want to free that object, we wait
>> for the most recent request to be idle before proceeding (otherwise the
>> hardware will write to pages now owned by the system, or we will attempt
>> to read from those pages before the hardware is finished writing). If
>> the request was cancelled instead, that wait completes immediately. As a
>> result, all requests must be committed and not cancelled if the external
>> state is unknown.
>> 
>> All that remains of i915_gem_request_cancel() users are just a couple of
>> extremely unlikely allocation failures, so remove the API entirely.
>> 
>> A consequence of committing all incomplete requests is that we generate
>> excess breadcrumbs and fill the ring much more often with dummy work. We
>> have completely undone the outstanding_last_seqno optimisation.
>> 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: stable@vger.kernel.org
>
> Cc: John Harrison <John.C.Harrison@Intel.com>
>
> I'd like John's ack on this one too, but the patch itself looks sound. Fast r-b
> since we've discussed this a while ago already ...
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

BR,
Jani.


(*) Well, not exactly *this* but rather
https://patchwork.freedesktop.org/patch/80961/ which was not posted on
the list so I can't reply to it.


>> ---
>>  drivers/gpu/drm/i915/i915_drv.h            |  2 --
>>  drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
>>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
>>  drivers/gpu/drm/i915/intel_display.c       |  2 +-
>>  drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
>>  drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
>>  6 files changed, 30 insertions(+), 51 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 061ecc43d935..de84dd7be971 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
>>  struct drm_i915_gem_request * __must_check
>>  i915_gem_request_alloc(struct intel_engine_cs *engine,
>>  		       struct intel_context *ctx);
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>  void i915_gem_request_free(struct kref *req_ref);
>>  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>>  				   struct drm_file *file);
>> @@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
>>  			     struct drm_file *file_priv);
>>  void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>  					struct drm_i915_gem_request *req);
>> -void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
>>  int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>  				   struct drm_i915_gem_execbuffer2 *args,
>>  				   struct list_head *vmas);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index b6879d43dd74..c6f09e7839ea 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>  		 * fully prepared. Thus it can be cleaned up using the proper
>>  		 * free code.
>>  		 */
>> -		i915_gem_request_cancel(req);
>> +		intel_ring_reserved_space_cancel(req->ringbuf);
>> +		i915_gem_request_unreference(req);
>>  		return ret;
>>  	}
>>  
>> @@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>>  	return err ? ERR_PTR(err) : req;
>>  }
>>  
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> -{
>> -	intel_ring_reserved_space_cancel(req->ringbuf);
>> -
>> -	i915_gem_request_unreference(req);
>> -}
>> -
>>  struct drm_i915_gem_request *
>>  i915_gem_find_active_request(struct intel_engine_cs *engine)
>>  {
>> @@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
>>  				return PTR_ERR(req);
>>  
>>  			ret = i915_switch_context(req);
>> -			if (ret) {
>> -				i915_gem_request_cancel(req);
>> -				return ret;
>> -			}
>> -
>>  			i915_add_request_no_flush(req);
>> +			if (ret)
>> +				return ret;
>>  		}
>>  
>>  		ret = intel_engine_idle(engine);
>> @@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
>>  		req = i915_gem_request_alloc(engine, NULL);
>>  		if (IS_ERR(req)) {
>>  			ret = PTR_ERR(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>  		}
>>  
>>  		if (engine->id == RCS) {
>> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
>> -				i915_gem_l3_remap(req, j);
>> +			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
>> +				ret = i915_gem_l3_remap(req, j);
>> +				if (ret)
>> +					goto err_request;
>> +			}
>>  		}
>>  
>>  		ret = i915_ppgtt_init_ring(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("PPGTT enable %s failed %d\n",
>> -				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> -		}
>> +		if (ret)
>> +			goto err_request;
>>  
>>  		ret = i915_gem_context_enable(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("Context enable %s failed %d\n",
>> +		if (ret)
>> +			goto err_request;
>> +
>> +err_request:
>> +		i915_add_request_no_flush(req);
>> +		if (ret) {
>> +			DRM_ERROR("Failed to enable %s, error=%d\n",
>>  				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>>  			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>  		}
>> -
>> -		i915_add_request_no_flush(req);
>>  	}
>>  
>>  out:
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 6ee4f00f620c..6f4f2a6cdf93 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>  	}
>>  }
>>  
>> -void
>> +static void
>>  i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>>  {
>>  	/* Unconditionally force add_request to emit a full flush. */
>> @@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>  
>>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>  
>>  	return 0;
>>  }
>> @@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>  
>>  	ret = i915_gem_request_add_to_client(req, file);
>>  	if (ret)
>> -		goto err_batch_unpin;
>> +		goto err_request;
>>  
>>  	/*
>>  	 * Save assorted stuff away to pass through to *_submission().
>> @@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>  	params->request                 = req;
>>  
>>  	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
>> +err_request:
>> +	i915_gem_execbuffer_retire_commands(params);
>>  
>>  err_batch_unpin:
>>  	/*
>> @@ -1657,14 +1658,6 @@ err:
>>  	i915_gem_context_unreference(ctx);
>>  	eb_destroy(eb);
>>  
>> -	/*
>> -	 * If the request was created but not successfully submitted then it
>> -	 * must be freed again. If it was submitted then it is being tracked
>> -	 * on the active request list and no clean up is required here.
>> -	 */
>> -	if (ret && !IS_ERR_OR_NULL(req))
>> -		i915_gem_request_cancel(req);
>> -
>>  	mutex_unlock(&dev->struct_mutex);
>>  
>>  pre_mutex_err:
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index b1b457864e17..3cae596d10a3 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -11664,7 +11664,7 @@ cleanup_unpin:
>>  	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
>>  cleanup_pending:
>>  	if (!IS_ERR_OR_NULL(request))
>> -		i915_gem_request_cancel(request);
>> +		i915_add_request_no_flush(request);
>>  	atomic_dec(&intel_crtc->unpin_work_count);
>>  	mutex_unlock(&dev->struct_mutex);
>>  cleanup:
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index b8f6b96472a6..6fc24deaa16a 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>  
>>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>  
>>  	return 0;
>>  }
>> @@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>>  		}
>>  
>>  		ret = engine->init_context(req);
>> +		i915_add_request_no_flush(req);
>>  		if (ret) {
>>  			DRM_ERROR("ring init context: %d\n",
>>  				ret);
>> -			i915_gem_request_cancel(req);
>>  			goto error_ringbuf;
>>  		}
>> -		i915_add_request_no_flush(req);
>>  	}
>>  	return 0;
>>  
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
>> index 6694e9230cd5..bcc3b6a016d8 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>>  
>>  	ret = intel_ring_begin(req, 4);
>>  	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>  		return ret;
>>  	}
>>  
>> @@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>>  
>>  	ret = intel_ring_begin(req, 2);
>>  	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>  		return ret;
>>  	}
>>  
>> @@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>>  
>>  	ret = intel_ring_begin(req, 6);
>>  	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>  		return ret;
>>  	}
>>  
>> @@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>>  
>>  		ret = intel_ring_begin(req, 2);
>>  		if (ret) {
>> -			i915_gem_request_cancel(req);
>> +			i915_add_request_no_flush(req);
>>  			return ret;
>>  		}
>>  
>> -- 
>> 2.8.0.rc3
>> 

-- 
Jani Nikula, Intel Open Source Technology Center


* Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-13  9:33   ` Daniel Vetter
@ 2016-04-18  9:50     ` Jani Nikula
  2016-04-20 13:26       ` Jani Nikula
  0 siblings, 1 reply; 10+ messages in thread
From: Jani Nikula @ 2016-04-18  9:50 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson; +Cc: stable, intel-gfx

On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>> Two concurrent writes into the same register cacheline has the chance of
>> killing the machine on Ivybridge and other gen7. This includes LRI
>> emitted from the command parser.  The MI_SET_CONTEXT itself serves as
>> serialising barrier and prevents the pair of register writes in the first
>> packet from triggering the fault.  However, if a second switch-context
>> immediately occurs then we may have two adjacent blocks of LRI to the
>> same registers which may then trigger the hang. To counteract this we
>> need to insert a delay after the second register write using SRM.
>> 
>> This is easiest to reproduce with something like
>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>> switches (with no operations in between them in the command stream,
>> which requires the execbuf operation to be interrupted after the
>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>> interruptible igt. No reports from the wild though, so it must be of low
>> enough frequency that no one has correlated the random machine freezes
>> with i915.ko
>> 
>> The issue was introduced with
>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>> Date:   Tue Dec 16 10:02:27 2014 +0000
>> 
>>     drm/i915: Disable PSMI sleep messages on all rings around context switches
>> 
>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> Cc: stable@vger.kernel.org
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

BR,
Jani.

(*) Well, not exactly *this* but rather
https://patchwork.freedesktop.org/patch/80952/ which was not posted on
the list so I can't reply to it.


>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index fe580cb9501a..e5ad7b21e356 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  
>>  	len = 4;
>>  	if (INTEL_INFO(engine->dev)->gen >= 7)
>> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>  
>>  	ret = intel_ring_begin(req, len);
>>  	if (ret)
>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>>  		if (num_rings) {
>>  			struct intel_engine_cs *signaller;
>> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>>  
>>  			intel_ring_emit(engine,
>>  					MI_LOAD_REGISTER_IMM(num_rings));
>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  				if (signaller == engine)
>>  					continue;
>>  
>> -				intel_ring_emit_reg(engine,
>> -						    RING_PSMI_CTL(signaller->mmio_base));
>> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
>> +				intel_ring_emit_reg(engine, last_reg);
>>  				intel_ring_emit(engine,
>>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>  			}
>> +
>> +			/* Insert a delay before the next switch! */
>> +			intel_ring_emit(engine,
>> +					MI_STORE_REGISTER_MEM |
>> +					MI_SRM_LRM_GLOBAL_GTT);
>> +			intel_ring_emit_reg(engine, last_reg);
>> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
>> +			intel_ring_emit(engine, MI_NOOP);
>>  		}
>>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>  	}
>> -- 
>> 2.8.0.rc3
>> 
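[Editorial note: the `len` bookkeeping in the hunk above must match the number of dwords actually emitted between `intel_ring_begin()` and the end of the function. A standalone sketch of that accounting follows; the helper names are invented, and the driver's opcode constants are reduced to dword counts.]

```c
/* Dwords mi_set_context() reserves on gen7+, mirroring the patch:
 * the SRM delay adds 4 dwords (SRM opcode, register, GTT address,
 * MI_NOOP), growing the per-ring term from 4*num_rings + 2 to
 * 4*num_rings + 6. */
static int reserved_dwords(int gen, int num_rings)
{
	int len = 4; /* MI_NOOP, MI_SET_CONTEXT, ctx descriptor, MI_NOOP */

	if (gen >= 7)
		len += 2 + (num_rings ? 4 * num_rings + 6 : 0);
	return len;
}

/* Dwords the function actually emits under the same conditions. */
static int emitted_dwords(int gen, int num_rings)
{
	int n = 4; /* base packet, as above */

	if (gen >= 7) {
		n += 2; /* MI_ARB_ON_OFF: disable before, enable after */
		if (num_rings) {
			/* two LRI blocks (PSMI sleep disable before the
			 * switch, re-enable after), each one header plus
			 * a reg/value pair per other ring */
			n += 2 * (1 + 2 * num_rings);
			/* the new SRM delay after the second block */
			n += 4;
		}
	}
	return n;
}
```

With `num_rings = 3` on gen7 both sides come to 24 dwords; with `num_rings = 0` both come to 6, so the reservation stays exact in every case.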

-- 
Jani Nikula, Intel Open Source Technology Center


* Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-18  9:50     ` [Intel-gfx] " Jani Nikula
@ 2016-04-20 13:26       ` Jani Nikula
  0 siblings, 0 replies; 10+ messages in thread
From: Jani Nikula @ 2016-04-20 13:26 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson; +Cc: stable, intel-gfx

On Mon, 18 Apr 2016, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>>> Two concurrent writes into the same register cacheline have the chance of
>>> killing the machine on Ivybridge and other gen7. This includes LRI
>>> emitted from the command parser.  The MI_SET_CONTEXT itself serves as a
>>> serialising barrier and prevents the pair of register writes in the first
>>> packet from triggering the fault.  However, if a second context switch
>>> immediately occurs then we may have two adjacent blocks of LRI to the
>>> same registers which may then trigger the hang. To counteract this we
>>> need to insert a delay after the second register write using SRM.
>>> 
>>> This is easiest to reproduce with something like
>>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>>> switches (with no operations in between them in the command stream,
>>> which requires the execbuf operation to be interrupted after the
>>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>>> interruptible igt. No reports from the wild though, so it must be of low
>>> enough frequency that no one has correlated the random machine freezes
>>> with i915.ko.
>>> 
>>> The issue was introduced with
>>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>> Date:   Tue Dec 16 10:02:27 2014 +0000
>>> 
>>>     drm/i915: Disable PSMI sleep messages on all rings around context switches
>>> 
>>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> Cc: stable@vger.kernel.org
>>
>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

By which I mean, regardless of cc: stable I'm not doing anything to
backport the commit to fixes. Same with the other patch in this
thread. If someone wants the fixes to 4.6 or stable, they need to do the
backporting.

BR,
Jani.


>
> BR,
> Jani.
>
> (*) Well, not exactly *this* but rather
> https://patchwork.freedesktop.org/patch/80952/ which was not posted on
> the list so I can't reply to it.
>
>
>>
>>> ---
>>>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index fe580cb9501a..e5ad7b21e356 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>  
>>>  	len = 4;
>>>  	if (INTEL_INFO(engine->dev)->gen >= 7)
>>> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>>> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>>  
>>>  	ret = intel_ring_begin(req, len);
>>>  	if (ret)
>>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>>>  		if (num_rings) {
>>>  			struct intel_engine_cs *signaller;
>>> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>>>  
>>>  			intel_ring_emit(engine,
>>>  					MI_LOAD_REGISTER_IMM(num_rings));
>>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>  				if (signaller == engine)
>>>  					continue;
>>>  
>>> -				intel_ring_emit_reg(engine,
>>> -						    RING_PSMI_CTL(signaller->mmio_base));
>>> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
>>> +				intel_ring_emit_reg(engine, last_reg);
>>>  				intel_ring_emit(engine,
>>>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>>  			}
>>> +
>>> +			/* Insert a delay before the next switch! */
>>> +			intel_ring_emit(engine,
>>> +					MI_STORE_REGISTER_MEM |
>>> +					MI_SRM_LRM_GLOBAL_GTT);
>>> +			intel_ring_emit_reg(engine, last_reg);
>>> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
>>> +			intel_ring_emit(engine, MI_NOOP);
>>>  		}
>>>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>>  	}
>>> -- 
>>> 2.8.0.rc3
>>> 

-- 
Jani Nikula, Intel Open Source Technology Center

