All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: John.C.Harrison@Intel.com, Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH v7 6/8] drm/i915: Interrupt driven fences
Date: Thu, 21 Apr 2016 09:38:06 +0200	[thread overview]
Message-ID: <5718835E.7000802@linux.intel.com> (raw)
In-Reply-To: <1461172195-3959-7-git-send-email-John.C.Harrison@Intel.com>

Op 20-04-16 om 19:09 schreef John.C.Harrison@Intel.com:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The intended usage model for struct fence is that the signalled status
> should be set on demand rather than polled. That is, there should not
> be a need for a 'signaled' function to be called everytime the status
> is queried. Instead, 'something' should be done to enable a signal
> callback from the hardware which will update the state directly. In
> the case of requests, this is the seqno update interrupt. The idea is
> that this callback will only be enabled on demand when something
> actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback
> scheme. Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke
> me' list when a new seqno pops out and signals any matching
> fence/request. The fence is then removed from the list so the entire
> request stack does not need to be scanned every time. Note that the
> fence is added to the list before the commands to generate the seqno
> interrupt are added to the ring. Thus the sequence is guaranteed to be
> race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when
> __wait_request() is called). Thus there is still a potential race when
> enabling the interrupt as the request may already have completed.
> However, this is simply solved by calling the interrupt processing
> code immediately after enabling the interrupt and thereby checking for
> already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel
> outstanding requests (e.g. because TDR has reset the ring). These
> requests will never get signalled and so must be removed from the
> signal list manually. This is done by setting a 'cancelled' flag and
> then calling the regular notify/retire code path rather than
> attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the
> cancellation request might occur after/during the completion interrupt
> actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer
> requiring the mutex lock.
>
> v3: Move the signal list processing around to prevent unsubmitted
> requests being added to the list. This was occurring on Android
> because the native sync implementation calls the
> fence->enable_signalling API immediately on fence creation.
>
> Updated after review comments by Tvrtko Ursulin. Renamed list nodes to
> 'link' instead of 'list'. Added support for returning an error code on
> a cancelled fence. Update list processing to be more efficient/safer
> with respect to spinlocks.
>
> v5: Made i915_gem_request_submit a static as it is only ever called
> from one place.
>
> Fixed up the low latency wait optimisation. The time delay between the
> seqno value being to memory and the drive's ISR running can be
> significant, at least for the wait request micro-benchmark. This can
> be greatly improved by explicitly checking for seqno updates in the
> pre-wait busy poll loop. Also added some documentation comments to the
> busy poll code.
>
> Fixed up support for the faking of lost interrupts
> (test_irq_rings/missed_irq_rings). That is, there is an IGT test that
> tells the driver to loose interrupts deliberately and then check that
> everything still works as expected (albeit much slower).
>
> Updates from review comments: use non IRQ-save spinlocking, early exit
> on WARN and improved comments (Tvrtko Ursulin).
>
> v6: Updated to newer nigthly and resolved conflicts around the
> wait_request busy spin optimisation. Also fixed a race condition
> between this early exit path and the regular completion path.
>
> v7: Updated to newer nightly - lots of ring -> engine renaming plus an
> interface change on get_seqno(). Also added a list_empty() check
> before acquring spinlocks and doing list processing.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |   8 +
>  drivers/gpu/drm/i915/i915_gem.c         | 249 +++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_irq.c         |   2 +
>  drivers/gpu/drm/i915/intel_lrc.c        |   2 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   2 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   2 +
>  6 files changed, 241 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 81324ba..9519b11 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2255,7 +2255,12 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>  struct drm_i915_gem_request {
>  	/** Underlying object for implementing the signal/wait stuff. */
>  	struct fence fence;
> +	struct list_head signal_link;
> +	struct list_head unsignal_link;
>  	struct list_head delayed_free_link;
> +	bool cancelled;
> +	bool irq_enabled;
> +	bool signal_requested;
>  
>  	/** On Which ring this request was generated */
>  	struct drm_i915_private *i915;
> @@ -2341,6 +2346,9 @@ struct drm_i915_gem_request * __must_check
>  i915_gem_request_alloc(struct intel_engine_cs *engine,
>  		       struct intel_context *ctx);
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
> +				       bool fence_locked);
> +void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked);
>  
>  int i915_create_fence_timeline(struct drm_device *dev,
>  			       struct intel_context *ctx,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9071058..9b1de5f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -40,6 +40,8 @@
>  
>  #define RQ_BUG_ON(expr)
>  
> +static void i915_gem_request_submit(struct drm_i915_gem_request *req);
> +
>  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
>  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
>  static void
> @@ -1253,9 +1255,8 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	struct intel_engine_cs *engine = i915_gem_request_get_engine(req);
>  	struct drm_device *dev = engine->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const bool irq_test_in_progress =
> -		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_engine_flag(engine);
>  	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
> +	uint32_t seqno;
>  	DEFINE_WAIT(wait);
>  	unsigned long timeout_expire;
>  	s64 before = 0; /* Only to silence a compiler warning. */
> @@ -1263,9 +1264,6 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  
>  	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
>  
> -	if (list_empty(&req->list))
> -		return 0;
> -
>  	if (i915_gem_request_completed(req))
>  		return 0;
>  
> @@ -1291,15 +1289,17 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	trace_i915_gem_request_wait_begin(req);
>  
>  	/* Optimistic spin for the next jiffie before touching IRQs */
> -	ret = __i915_spin_request(req, state);
> -	if (ret == 0)
> -		goto out;
> -
> -	if (!irq_test_in_progress && WARN_ON(!engine->irq_get(engine))) {
> -		ret = -ENODEV;
> -		goto out;
> +	if (req->seqno) {
> +		ret = __i915_spin_request(req, state);
> +		if (ret == 0)
> +			goto out;
>  	}
>  
> +	/*
> +	 * Enable interrupt completion of the request.
> +	 */
> +	fence_enable_sw_signaling(&req->fence);
> +
>  	for (;;) {
>  		struct timer_list timer;
>  
> @@ -1321,6 +1321,21 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			break;
>  		}
>  
> +		if (req->seqno) {
> +			/*
> +			 * There is quite a lot of latency in the user interrupt
> +			 * path. So do an explicit seqno check and potentially
> +			 * remove all that delay.
> +			 */
> +			if (req->engine->irq_seqno_barrier)
> +				req->engine->irq_seqno_barrier(req->engine);
> +			seqno = engine->get_seqno(engine);
You're using this barrier + get_seqno pattern multiple times, maybe write a helper function for this?
> +			if (i915_seqno_passed(seqno, req->seqno)) {
> +				ret = 0;
> +				break;
> +			}
> +		}
> +
>  		if (signal_pending_state(state, current)) {
>  			ret = -ERESTARTSYS;
>  			break;
> @@ -1347,14 +1362,32 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			destroy_timer_on_stack(&timer);
>  		}
>  	}
> -	if (!irq_test_in_progress)
> -		engine->irq_put(engine);
>  
>  	finish_wait(&engine->irq_queue, &wait);
>  
>  out:
>  	trace_i915_gem_request_wait_end(req);
>  
> +	if ((ret == 0) && (req->seqno)) {
> +		if (req->engine->irq_seqno_barrier)
> +			req->engine->irq_seqno_barrier(req->engine);
> +		seqno = engine->get_seqno(engine);
> +		if (i915_seqno_passed(seqno, req->seqno) &&
> +		    !i915_gem_request_completed(req)) {
> +			/*
> +			 * Make sure the request is marked as completed before
> +			 * returning. NB: Need to acquire the spinlock around
> +			 * the whole call to avoid a race condition with the
> +			 * interrupt handler is running concurrently and could
> +			 * cause this invocation to early exit even though the
> +			 * request has not actually been fully processed yet.
> +			 */
> +			spin_lock_irq(&req->engine->fence_lock);
> +			i915_gem_request_notify(req->engine, true);
> +			spin_unlock_irq(&req->engine->fence_lock);
> +		}
> +	}
> +
>  	if (timeout) {
>  		s64 tres = *timeout - (ktime_get_raw_ns() - before);
>  
> @@ -1432,6 +1465,22 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>  	list_del_init(&request->list);
>  	i915_gem_request_remove_from_client(request);
>  
> +	/*
> +	 * In case the request is still in the signal pending list,
> +	 * e.g. due to being cancelled by TDR, preemption, etc.
> +	 */
> +	if (!list_empty(&request->signal_link)) {
> +		/*
> +		 * The request must be marked as cancelled and the underlying
> +		 * fence as failed. NB: There is no explicit fence fail API,
> +		 * there is only a manual poke and signal.
> +		 */
> +		request->cancelled = true;
> +		/* How to propagate to any associated sync_fence??? */
> +		request->fence.status = -EIO;
> +		fence_signal_locked(&request->fence);
> +	}
>
Can this still happen with

commit aa9b78104fe3210758fa9e6c644e9a108d371e8b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Apr 13 17:35:15 2016 +0100

    drm/i915: Late request cancellations are harmful

?
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-04-21  7:38 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-20 17:09 [PATCH v7 0/8] Convert requests to use struct fence John.C.Harrison
2016-04-20 17:09 ` [PATCH v7 1/8] drm/i915: " John.C.Harrison
2016-04-21  7:06   ` Maarten Lankhorst
2016-04-21 10:26     ` John Harrison
2016-04-21 11:12       ` Maarten Lankhorst
2016-04-21  7:09   ` Maarten Lankhorst
2016-04-20 17:09 ` [PATCH v7 2/8] drm/i915: Removed now redudant parameter to i915_gem_request_completed() John.C.Harrison
2016-04-21  7:19   ` Maarten Lankhorst
2016-04-21 10:18     ` John Harrison
2016-04-20 17:09 ` [PATCH v7 3/8] drm/i915: Add per context timelines to fence object John.C.Harrison
2016-04-20 17:44   ` Chris Wilson
2016-04-21 11:37     ` John Harrison
2016-04-20 17:09 ` [PATCH v7 4/8] drm/i915: Fix clean up of file client list on execbuff failure John.C.Harrison
2016-04-21  7:20   ` Maarten Lankhorst
2016-04-21 10:15     ` John Harrison
2016-04-20 17:09 ` [PATCH v7 5/8] drm/i915: Delay the freeing of requests until retire time John.C.Harrison
2016-04-21  7:41   ` Maarten Lankhorst
2016-04-20 17:09 ` [PATCH v7 6/8] drm/i915: Interrupt driven fences John.C.Harrison
2016-04-21  7:38   ` Maarten Lankhorst [this message]
2016-04-20 17:09 ` [PATCH v7 7/8] drm/i915: Updated request structure tracing John.C.Harrison
2016-04-20 17:09 ` [PATCH v7 8/8] drm/i915: Cache last IRQ seqno to reduce IRQ overhead John.C.Harrison
2016-04-22 15:37 ` [PATCH v7 0/8] Convert requests to use struct fence John Harrison

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5718835E.7000802@linux.intel.com \
    --to=maarten.lankhorst@linux.intel.com \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=John.C.Harrison@Intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.