From: John Harrison <John.C.Harrison@Intel.com>
To: Jesse Barnes <jbarnes@virtuousgeek.org>, Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH v5 25/35] drm/i915: Added scheduler queue throttling by DRM file handle
Date: Tue, 01 Mar 2016 15:52:06 +0000
Message-ID: <56D5BAA6.1070005@Intel.com>
In-Reply-To: <56CCC8FA.1070107@virtuousgeek.org>

On 23/02/2016 21:02, Jesse Barnes wrote:
> On 02/18/2016 06:27 AM, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The scheduler decouples the submission of batch buffers to the driver
>> from their subsequent submission to the hardware. This means that an
>> application which is continuously submitting buffers as fast as it can
>> could potentially flood the driver. To prevent this, the driver now
>> tracks how many buffers are in progress (queued in software or
>> executing in hardware) and limits this to a given (tunable) number. If
>> this number is exceeded then the submission to the driver will return
>> -EAGAIN and thus prevent the scheduler's queue from becoming
>> arbitrarily large.
>>
>> v3: Added a missing decrement of the file queue counter.
>>
>> v4: Updated a comment.
>>
>> v5: Updated due to changes to earlier patches in series - removing
>> forward declarations and white space. Also added some documentation.
>> [Joonas Lahtinen]
>>
>> For: VIZ-1587
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |  2 ++
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 +++++
>>   drivers/gpu/drm/i915/i915_scheduler.c      | 48 ++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_scheduler.h      |  2 ++
>>   4 files changed, 60 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 071a27b..3f4c4f0 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -336,6 +336,8 @@ struct drm_i915_file_private {
>>   	} rps;
>>   
>>   	struct intel_engine_cs *bsd_ring;
>> +
>> +	u32 scheduler_queue_length;
>>   };
>>   
>>   enum intel_dpll_id {
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index d4de8c7..dff120c 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1803,6 +1803,10 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
>>   		return -EINVAL;
>>   	}
>>   
>> +	/* Throttle batch requests per device file */
>> +	if (i915_scheduler_file_queue_is_full(file))
>> +		return -EAGAIN;
>> +
>>   	/* Copy in the exec list from userland */
>>   	exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
>>   	exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
>> @@ -1893,6 +1897,10 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
>>   		return -EINVAL;
>>   	}
>>   
>> +	/* Throttle batch requests per device file */
>> +	if (i915_scheduler_file_queue_is_full(file))
>> +		return -EAGAIN;
>> +
>>   	exec2_list = kmalloc(sizeof(*exec2_list)*args->buffer_count,
>>   			     GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
>>   	if (exec2_list == NULL)
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>> index e56ce08..f7f29d5 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>> @@ -69,6 +69,7 @@ int i915_scheduler_init(struct drm_device *dev)
>>   	scheduler->priority_level_bump    = 50;
>>   	scheduler->priority_level_preempt = 900;
>>   	scheduler->min_flying             = 2;
>> +	scheduler->file_queue_max         = 64;
>>   
>>   	dev_priv->scheduler = scheduler;
>>   
>> @@ -464,6 +465,44 @@ static int i915_scheduler_submit_unlocked(struct intel_engine_cs *ring)
>>   	return ret;
>>   }
>>   
>> +/**
>> + * i915_scheduler_file_queue_is_full - Returns true if the queue is full.
>> + * @file: File object to query.
>> + * This allows throttling of applications by limiting the total number of
>> + * outstanding requests to a specified level. Once that limit is reached,
>> + * this call will return true and no more requests should be accepted.
>> + */
>> +bool i915_scheduler_file_queue_is_full(struct drm_file *file)
>> +{
>> +	struct drm_i915_file_private *file_priv = file->driver_priv;
>> +	struct drm_i915_private *dev_priv  = file_priv->dev_priv;
>> +	struct i915_scheduler *scheduler = dev_priv->scheduler;
>> +
>> +	return file_priv->scheduler_queue_length >= scheduler->file_queue_max;
>> +}
>> +
>> +/**
>> + * i915_scheduler_file_queue_inc - Increment the file's request queue count.
>> + * @file: File object to process.
>> + */
>> +static void i915_scheduler_file_queue_inc(struct drm_file *file)
>> +{
>> +	struct drm_i915_file_private *file_priv = file->driver_priv;
>> +
>> +	file_priv->scheduler_queue_length++;
>> +}
>> +
>> +/**
>> + * i915_scheduler_file_queue_dec - Decrement the file's request queue count.
>> + * @file: File object to process.
>> + */
>> +static void i915_scheduler_file_queue_dec(struct drm_file *file)
>> +{
>> +	struct drm_i915_file_private *file_priv = file->driver_priv;
>> +
>> +	file_priv->scheduler_queue_length--;
>> +}
>> +
>>   static void i915_generate_dependencies(struct i915_scheduler *scheduler,
>>   				       struct i915_scheduler_queue_entry *node,
>>   				       uint32_t ring)
>> @@ -640,6 +679,8 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
>>   
>>   	list_add_tail(&node->link, &scheduler->node_queue[ring->id]);
>>   
>> +	i915_scheduler_file_queue_inc(node->params.file);
>> +
>>   	not_flying = i915_scheduler_count_flying(scheduler, ring) <
>>   						 scheduler->min_flying;
>>   
>> @@ -883,6 +924,12 @@ static bool i915_scheduler_remove(struct i915_scheduler *scheduler,
>>   		/* Strip the dependency info while the mutex is still locked */
>>   		i915_scheduler_remove_dependent(scheduler, node);
>>   
>> +		/* Likewise clean up the file pointer. */
>> +		if (node->params.file) {
>> +			i915_scheduler_file_queue_dec(node->params.file);
>> +			node->params.file = NULL;
>> +		}
>> +
>>   		continue;
>>   	}
>>   
>> @@ -1205,6 +1252,7 @@ int i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
>>   						 node->status,
>>   						 ring->name);
>>   
>> +			i915_scheduler_file_queue_dec(node->params.file);
>>   			node->params.file = NULL;
>>   		}
>>   	}
>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
>> index 075befb..b78de12 100644
>> --- a/drivers/gpu/drm/i915/i915_scheduler.h
>> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
>> @@ -77,6 +77,7 @@ struct i915_scheduler {
>>   	int32_t             priority_level_bump;
>>   	int32_t             priority_level_preempt;
>>   	uint32_t            min_flying;
>> +	uint32_t            file_queue_max;
>>   };
>>   
>>   /* Flag bits for i915_scheduler::flags */
>> @@ -100,5 +101,6 @@ int i915_scheduler_flush_stamp(struct intel_engine_cs *ring,
>>   			       unsigned long stamp, bool is_locked);
>>   bool i915_scheduler_is_request_tracked(struct drm_i915_gem_request *req,
>>   				       bool *completed, bool *busy);
>> +bool i915_scheduler_file_queue_is_full(struct drm_file *file);
>>   
>>   #endif  /* _I915_SCHEDULER_H_ */
>>
> Just to clarify and make sure I understood the previous stuff: a queued execbuf that has not yet been dispatched does not reserve and pin pages, right?  That occurs at actual dispatch time?  If so, I guess clients will hit this 64 queued item limit pretty regularly...  How much metadata overhead does that involve?  Has it been derived from some performance work with a bunch of workloads?  It's fine if not, I can imagine that different mixes of workloads would be affected by lower or higher queue depths (e.g. small batch tests).

Whether a client will hit the limit is entirely down to how that client 
is written. A GL program that renders double buffered with one batch per 
frame will only ever have two batch buffer requests outstanding. 
Whereas, the gem_exec_nop IGT test submits a thousand batches before 
waiting for anything to complete.
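
For illustration, this is the contract as seen from userspace: once the
per-file limit is hit, the execbuffer ioctl fails with errno set to
EAGAIN, and the client should let some of its outstanding batches retire
before resubmitting. A minimal sketch of a retry wrapper (illustrative
only; submit_batch() is a made-up helper, the ioctl usage itself is
standard i915_drm.h fare):

    #include <errno.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <i915_drm.h>   /* from libdrm: execbuffer2 definitions */

    static int submit_batch(int fd, struct drm_i915_gem_execbuffer2 *eb)
    {
        int ret;

        do {
            ret = ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, eb);
            if (ret == -1 && errno == EAGAIN)
                usleep(1000);   /* throttled: let queued batches retire */
        } while (ret == -1 && (errno == EAGAIN || errno == EINTR));

        return ret;
    }

A well-behaved client would normally wait on an earlier fence rather
than sleep, but the shape is the same: EAGAIN is a back-pressure signal,
not a hard failure.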

> If this is tunable, I guess it should be clamped like a nice or rlimit value, with values outside that range requiring CAP_SYS_ADMIN.
What is considered a valid range? Previously the limit would have been 
the size of the ring buffer. In times past, they were large and you 
could have thousands of batches in flight. With the GuC we are down to 
4K words. At, for example, 32 words per batch buffer request, that makes 
for 128 outstanding batches. So set the maximum permitted to 256? 
Larger? Smaller?
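
Whatever numbers we settle on, the clamp Jesse suggests would be cheap
to implement. A minimal sketch, assuming a hypothetical setter where a
soft cap of 256 may only be exceeded with CAP_SYS_ADMIN (the bounds and
the function name are placeholders, not part of this series):

    #include <linux/capability.h>
    #include <linux/errno.h>
    #include <linux/types.h>

    /* Placeholder bounds: 2 matches min_flying, 256 is the soft cap
     * discussed above. Only CAP_SYS_ADMIN may exceed the soft cap.
     */
    #define I915_FILE_QUEUE_MIN       2
    #define I915_FILE_QUEUE_SOFT_MAX  256

    static int i915_scheduler_set_file_queue_max(struct i915_scheduler *scheduler,
                                                 u32 val)
    {
        if (val < I915_FILE_QUEUE_MIN)
            return -EINVAL;

        if (val > I915_FILE_QUEUE_SOFT_MAX && !capable(CAP_SYS_ADMIN))
            return -EPERM;

        scheduler->file_queue_max = val;
        return 0;
    }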


>
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
>
> Jesse
