From: Daniel Vetter <daniel@ffwll.ch>
To: John.C.Harrison@Intel.com
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [RFC 19/39] drm/i915: Added scheduler support to __wait_request() calls
Date: Tue, 21 Jul 2015 11:27:17 +0200	[thread overview]
Message-ID: <20150721092717.GF16722@phenom.ffwll.local> (raw)
In-Reply-To: <1437143628-6329-20-git-send-email-John.C.Harrison@Intel.com>

On Fri, Jul 17, 2015 at 03:33:28PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The scheduler can cause batch buffers, and hence requests, to be submitted to
> the ring out of order and asynchronously to their submission to the driver. Thus
> at the point of waiting for the completion of a given request, it is not even
> guaranteed that the request has actually been sent to the hardware yet. Even if it
> has been sent, it is possible that it could be pre-empted and thus 'unsent'.
> 
> This means that it is necessary to be able to submit requests to the hardware
> during the wait call itself. Unfortunately, while some callers of
> __wait_request() release the mutex lock first, others do not (and apparently can
> not). Hence there is the ability to deadlock as the wait stalls for submission
> but the asynchronous submission is stalled for the mutex lock.
> 
> This change hooks the scheduler in to the __wait_request() code to ensure
> correct behaviour. That is, flush the target batch buffer through to the
> hardware and do not deadlock waiting for something that cannot currently be
> submitted. Instead, the wait call must return -EAGAIN at least as far back as
> necessary to release the mutex lock and allow the scheduler's asynchronous
> processing to get in and handle the pre-emption operation and eventually
> (re-)submit the work.

I expect this to break the shrinker logic, which means fixing up all the
callers that bind objects into the ggtt, like you already do (well, comment
about) for the page fault handler. And just spinning on -EAGAIN isn't
pretty either.
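
To make the concern concrete, the -EAGAIN approach forces every locked
caller into a retry loop shaped like the sketch below. This is a
hypothetical userspace analogue using pthreads, not the i915 code;
`big_lock` stands in for struct_mutex and all names are invented for
illustration:

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace analogue of the -EAGAIN retry pattern: the wait refuses to
 * sleep while the big lock is held if the scheduler still owns the
 * request, so the caller must drop the lock, let the scheduler in, and
 * retry from scratch. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static bool scheduler_busy = true;	/* request not yet submitted */
static bool request_done;

static int wait_request_locked(void)
{
	if (scheduler_busy)
		return -EAGAIN;		/* cannot sleep under big_lock */
	return request_done ? 0 : -EBUSY;
}

static void scheduler_submit(void)
{
	/* The asynchronous submission path also needs big_lock, which
	 * is why the waiter deadlocks if it sleeps while holding it. */
	pthread_mutex_lock(&big_lock);
	scheduler_busy = false;
	request_done = true;
	pthread_mutex_unlock(&big_lock);
}

static int wait_with_retry(void)
{
	int ret;

	pthread_mutex_lock(&big_lock);
	while ((ret = wait_request_locked()) == -EAGAIN) {
		/* The "spin": unlock, give the scheduler a chance to
		 * run (called inline here; asynchronous in the real
		 * driver), relock, retry. */
		pthread_mutex_unlock(&big_lock);
		scheduler_submit();
		pthread_mutex_lock(&big_lock);
	}
	pthread_mutex_unlock(&big_lock);
	return ret;
}
```

Every locked call site has to grow this unwind-and-retry dance, which is
the ugliness being objected to.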

Instead we need a proper locking scheme here so that the scheduler can run
while the struct mutex is held. execlist submission had the same
requirement and hence has grown its own locks.

Note that there's currently one type of memory that we can evict which the
shrinker doesn't take care of, and that's to-be-unpinned buffers after
pageflips. We're working to fix that.
-Daniel

> 
> Change-Id: I31fe6bc7e38f6ffdd843fcae16e7cc8b1e52a931
> For: VIZ-1587
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  3 +-
>  drivers/gpu/drm/i915/i915_gem.c         | 37 +++++++++++---
>  drivers/gpu/drm/i915/i915_scheduler.c   | 91 +++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_scheduler.h   |  2 +
>  drivers/gpu/drm/i915/intel_display.c    |  3 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  2 +-
>  6 files changed, 128 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2b3fab6..e9e0736 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2972,7 +2972,8 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			unsigned reset_counter,
>  			bool interruptible,
>  			s64 *timeout,
> -			struct intel_rps_client *rps);
> +			struct intel_rps_client *rps,
> +			bool is_locked);
>  int __must_check i915_wait_request(struct drm_i915_gem_request *req);
>  int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
>  int __must_check
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index cb5af5d..f713cda 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1219,7 +1219,8 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			unsigned reset_counter,
>  			bool interruptible,
>  			s64 *timeout,
> -			struct intel_rps_client *rps)
> +			struct intel_rps_client *rps,
> +			bool is_locked)
>  {
>  	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
>  	struct drm_device *dev = ring->dev;
> @@ -1229,8 +1230,10 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	DEFINE_WAIT(wait);
>  	unsigned long timeout_expire;
>  	s64 before, now;
> -	int ret;
> +	int ret = 0;
> +	bool    busy;
>  
> +	might_sleep();
>  	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
>  
>  	if (list_empty(&req->list))
> @@ -1281,6 +1284,22 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			break;
>  		}
>  
> +		if (is_locked) {
> +			/* If this request is being processed by the scheduler
> +			 * then it is unsafe to sleep with the mutex lock held
> +			 * as the scheduler may require the lock in order to
> +			 * progress the request. */
> +			if (i915_scheduler_is_request_tracked(req, NULL, &busy)) {
> +				if (busy) {
> +					ret = -EAGAIN;
> +					break;
> +				}
> +			}
> +
> +			/* If the request is not tracked by the scheduler then the
> +			 * regular test can be done. */
> +		}
> +
>  		if (i915_gem_request_completed(req)) {
>  			ret = 0;
>  			break;
> @@ -1452,13 +1471,17 @@ i915_wait_request(struct drm_i915_gem_request *req)
>  
>  	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>  
> +	ret = i915_scheduler_flush_request(req, true);
> +	if (ret < 0)
> +		return ret;
> +
>  	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
>  	if (ret)
>  		return ret;
>  
>  	ret = __i915_wait_request(req,
>  				  atomic_read(&dev_priv->gpu_error.reset_counter),
> -				  interruptible, NULL, NULL);
> +				  interruptible, NULL, NULL, true);
>  	if (ret)
>  		return ret;
>  
> @@ -1571,7 +1594,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	mutex_unlock(&dev->struct_mutex);
>  	for (i = 0; ret == 0 && i < n; i++)
>  		ret = __i915_wait_request(requests[i], reset_counter, true,
> -					  NULL, rps);
> +					  NULL, rps, false);
>  	mutex_lock(&dev->struct_mutex);
>  
>  	for (i = 0; i < n; i++) {
> @@ -3443,7 +3466,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  		if (ret == 0)
>  			ret = __i915_wait_request(req[i], reset_counter, true,
>  						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
> -						  file->driver_priv);
> +						  file->driver_priv, false);
>  		i915_gem_request_unreference(req[i]);
>  	}
>  	return ret;
> @@ -3476,7 +3499,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  					  atomic_read(&i915->gpu_error.reset_counter),
>  					  i915->mm.interruptible,
>  					  NULL,
> -					  &i915->rps.semaphores);
> +					  &i915->rps.semaphores, true);
>  		if (ret)
>  			return ret;
>  
> @@ -4683,7 +4706,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	if (target == NULL)
>  		return 0;
>  
> -	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
> +	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL, false);
>  	if (ret == 0)
>  		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
>  
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index df2e27f..811cbe4 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -31,6 +31,8 @@ static int         i915_scheduler_remove_dependent(struct i915_scheduler *schedu
>  						   struct i915_scheduler_queue_entry *remove);
>  static int         i915_scheduler_submit(struct intel_engine_cs *ring,
>  					 bool is_locked);
> +static int         i915_scheduler_submit_max_priority(struct intel_engine_cs *ring,
> +					       bool is_locked);
>  static uint32_t    i915_scheduler_count_flying(struct i915_scheduler *scheduler,
>  					       struct intel_engine_cs *ring);
>  static void        i915_scheduler_priority_bump_clear(struct i915_scheduler *scheduler);
> @@ -600,6 +602,57 @@ void i915_gem_scheduler_work_handler(struct work_struct *work)
>  	mutex_unlock(&dev->struct_mutex);
>  }
>  
> +int i915_scheduler_flush_request(struct drm_i915_gem_request *req,
> +				 bool is_locked)
> +{
> +	struct drm_i915_private            *dev_priv;
> +	struct i915_scheduler              *scheduler;
> +	unsigned long       flags;
> +	int                 flush_count;
> +	uint32_t            ring_id;
> +
> +	if (!req)
> +		return -EINVAL;
> +
> +	dev_priv  = req->ring->dev->dev_private;
> +	scheduler = dev_priv->scheduler;
> +
> +	if (!scheduler)
> +		return 0;
> +
> +	if (!req->scheduler_qe)
> +		return 0;
> +
> +	if (!I915_SQS_IS_QUEUED(req->scheduler_qe))
> +		return 0;
> +
> +	ring_id = req->ring->id;
> +	if (is_locked && (scheduler->flags[ring_id] & i915_sf_submitting)) {
> +		/* Scheduler is busy already submitting another batch,
> +		 * come back later rather than going recursive... */
> +		return -EAGAIN;
> +	}
> +
> +	if (list_empty(&scheduler->node_queue[ring_id]))
> +		return 0;
> +
> +	spin_lock_irqsave(&scheduler->lock, flags);
> +
> +	i915_scheduler_priority_bump_clear(scheduler);
> +
> +	flush_count = i915_scheduler_priority_bump(scheduler,
> +			    req->scheduler_qe, scheduler->priority_level_max);
> +
> +	spin_unlock_irqrestore(&scheduler->lock, flags);
> +
> +	if (flush_count) {
> +		DRM_DEBUG_DRIVER("<%s> Bumped %d entries\n", req->ring->name, flush_count);
> +		flush_count = i915_scheduler_submit_max_priority(req->ring, is_locked);
> +	}
> +
> +	return flush_count;
> +}
> +
>  static void i915_scheduler_priority_bump_clear(struct i915_scheduler *scheduler)
>  {
>  	struct i915_scheduler_queue_entry *node;
> @@ -653,6 +706,44 @@ static int i915_scheduler_priority_bump(struct i915_scheduler *scheduler,
>  	return count;
>  }
>  
> +static int i915_scheduler_submit_max_priority(struct intel_engine_cs *ring,
> +					      bool is_locked)
> +{
> +	struct i915_scheduler_queue_entry  *node;
> +	struct drm_i915_private            *dev_priv = ring->dev->dev_private;
> +	struct i915_scheduler              *scheduler = dev_priv->scheduler;
> +	unsigned long	flags;
> +	int             ret, count = 0;
> +	bool            found;
> +
> +	do {
> +		found = false;
> +		spin_lock_irqsave(&scheduler->lock, flags);
> +		list_for_each_entry(node, &scheduler->node_queue[ring->id], link) {
> +			if (!I915_SQS_IS_QUEUED(node))
> +				continue;
> +
> +			if (node->priority < scheduler->priority_level_max)
> +				continue;
> +
> +			found = true;
> +			break;
> +		}
> +		spin_unlock_irqrestore(&scheduler->lock, flags);
> +
> +		if (!found)
> +			break;
> +
> +		ret = i915_scheduler_submit(ring, is_locked);
> +		if (ret < 0)
> +			return ret;
> +
> +		count += ret;
> +	} while (found);
> +
> +	return count;
> +}
> +
>  static int i915_scheduler_pop_from_queue_locked(struct intel_engine_cs *ring,
>  				    struct i915_scheduler_queue_entry **pop_node,
>  				    unsigned long *flags)
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 73c5e7d..fcf2640 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -92,6 +92,8 @@ void        i915_gem_scheduler_clean_node(struct i915_scheduler_queue_entry *nod
>  int         i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe);
>  int         i915_scheduler_handle_irq(struct intel_engine_cs *ring);
>  void        i915_gem_scheduler_work_handler(struct work_struct *work);
> +int         i915_scheduler_flush_request(struct drm_i915_gem_request *req,
> +					 bool is_locked);
>  bool        i915_scheduler_is_request_tracked(struct drm_i915_gem_request *req,
>  					      bool *completed, bool *busy);
>  
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 9629dab..129c829 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11291,7 +11291,8 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
>  		WARN_ON(__i915_wait_request(mmio_flip->req,
>  					    mmio_flip->crtc->reset_counter,
>  					    false, NULL,
> -					    &mmio_flip->i915->rps.mmioflips));
> +					    &mmio_flip->i915->rps.mmioflips,
> +					    false));
>  
>  	intel_do_mmio_flip(mmio_flip->crtc);
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index df0cd48..e0992b7 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2193,7 +2193,7 @@ int intel_ring_idle(struct intel_engine_cs *ring)
>  	return __i915_wait_request(req,
>  				   atomic_read(&to_i915(ring->dev)->gpu_error.reset_counter),
>  				   to_i915(ring->dev)->mm.interruptible,
> -				   NULL, NULL);
> +				   NULL, NULL, true);
>  }
>  
>  int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
