From: John Harrison <John.C.Harrison@Intel.com>
To: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH v6 00/34] GPU scheduler for i915 driver
Date: Fri, 22 Apr 2016 16:37:40 +0100
Message-ID: <571A4544.5000407@Intel.com>
In-Reply-To: <1461172435-4256-1-git-send-email-John.C.Harrison@Intel.com>

This patch series can be pulled from 
'ssh://people.freedesktop.org/~johnharr/scheduler' as the 'scheduler' 
branch.


On 20/04/2016 18:13, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> Implemented a batch buffer submission scheduler for the i915 DRM driver.
>
> The general theory of operation is that when batch buffers are
> submitted to the driver, the execbuffer() code assigns a unique seqno
> value and then packages up all the information required to execute the
> batch buffer at a later time. This package is given over to the
> scheduler which adds it to an internal node list. The scheduler also
> scans the list of objects associated with the batch buffer and
> compares them against the objects already in use by other buffers in
> the node list. If matches are found then the new batch buffer node is
> marked as being dependent upon the matching node. The same is done for
> the context object. The scheduler also bumps up the priority of such
> matching nodes on the grounds that the more dependencies a given batch
> buffer has the more important it is likely to be.
>
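A minimal userspace sketch of the dependency scan described above, not the
driver code. The struct layout, the fixed-size arrays, PRIORITY_BUMP and all
sketch_* names are assumptions made for illustration. It also ignores the
read-read optimisation mentioned in the v3 notes, so any shared object or
shared context counts as a dependency.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJS 8
#define MAX_DEPS 8
#define PRIORITY_BUMP 1 /* hypothetical bump amount */

/* Hypothetical stand-in for a scheduler queue entry. */
struct sketch_node {
	int ctx_id;                          /* context used by the batch */
	int objs[MAX_OBJS];                  /* object handles it touches */
	size_t num_objs;
	int priority;
	struct sketch_node *deps[MAX_DEPS];  /* nodes this one must wait for */
	size_t num_deps;
};

/* True if the two nodes use the same context or share any object. */
static bool shares_resource(const struct sketch_node *a,
			    const struct sketch_node *b)
{
	size_t i, j;

	if (a->ctx_id == b->ctx_id)
		return true;

	for (i = 0; i < a->num_objs; i++)
		for (j = 0; j < b->num_objs; j++)
			if (a->objs[i] == b->objs[j])
				return true;

	return false;
}

/*
 * Record a dependency on every already-queued node that shares a resource
 * with the new node, and bump the priority of each such node.
 * Returns the number of dependencies recorded.
 */
static size_t sketch_add_dependencies(struct sketch_node *node,
				      struct sketch_node **queue,
				      size_t queue_len)
{
	size_t i;

	for (i = 0; i < queue_len; i++) {
		struct sketch_node *other = queue[i];

		if (other == node || !shares_resource(node, other))
			continue;
		if (node->num_deps == MAX_DEPS)
			break; /* sketch only: a real list would grow */

		node->deps[node->num_deps++] = other;
		other->priority += PRIORITY_BUMP;
	}

	return node->num_deps;
}
```

Note that the bump is applied to the node being depended upon, so a node that
many later batches wait for drifts towards the front of the queue.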
> The scheduler aims to have a given (tuneable) number of batch buffers
> in flight on the hardware at any given time. If fewer than this are
> currently executing when a new node is queued, then the node is passed
> straight through to the submit function. Otherwise it is simply added
> to the queue and the driver returns back to user land.
>
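The in-flight limit above can be sketched roughly as follows. This is a
simplified model using bare counters only; struct sketch_sched, max_flying and
the function names are hypothetical and not taken from the patches.

```c
#include <stdbool.h>

/* Hypothetical scheduler state: just the two running totals. */
struct sketch_sched {
	unsigned int max_flying; /* tuneable in-flight limit */
	unsigned int flying;     /* nodes currently on the hardware */
	unsigned int queued;     /* nodes waiting in the scheduler */
};

/*
 * Queue a new node. Returns true if it was passed straight through to
 * submission, false if it was only added to the queue.
 */
static bool sketch_queue_node(struct sketch_sched *s)
{
	if (s->flying < s->max_flying) {
		s->flying++;
		return true;
	}

	s->queued++;
	return false;
}

/*
 * A node has completed: free its slot and, if anything is waiting,
 * submit one queued node. Returns true if a queued node was submitted.
 */
static bool sketch_node_completed(struct sketch_sched *s)
{
	if (s->flying > 0)
		s->flying--;

	if (s->queued > 0 && s->flying < s->max_flying) {
		s->queued--;
		s->flying++;
		return true;
	}

	return false;
}
```

Either way the caller returns immediately; nothing in this path waits for the
hardware.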
> As each batch buffer completes, it raises an interrupt which wakes up
> the scheduler. Note that it is possible for multiple buffers to
> complete before the IRQ handler gets to run. Further, the seqno values
> of the individual buffers are not necessarily incrementing as the
> scheduler may have re-ordered their submission. However, the scheduler
> keeps the list of executing buffers in order of hardware submission.
> Thus it can scan through the list until a matching seqno is found and
> then mark all in flight nodes from that point on as completed.
>
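A rough sketch of that completion scan. It assumes the in-flight list is an
array ordered oldest submission first, so "from that point on" is taken to
mean the matching node plus everything submitted before it. The list
direction, the struct and the function name are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-flight entry. */
struct sketch_flying {
	uint32_t seqno;
	bool completed;
};

/*
 * flying[] is ordered by hardware submission, oldest first. Find the node
 * whose seqno the hardware just reported, then mark it and every node
 * submitted before it as completed. Seqno values are only compared for
 * equality because they need not be increasing along the list.
 * Returns the number of nodes newly marked.
 */
static size_t sketch_mark_completed(struct sketch_flying *flying,
				    size_t count, uint32_t hw_seqno)
{
	size_t i, match = count, marked = 0;

	for (i = 0; i < count; i++) {
		if (flying[i].seqno == hw_seqno) {
			match = i;
			break;
		}
	}

	if (match == count)
		return 0; /* seqno does not belong to an in-flight node */

	for (i = 0; i <= match; i++) {
		if (!flying[i].completed) {
			flying[i].completed = true;
			marked++;
		}
	}

	return marked;
}
```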
> A deferred work queue is also poked by the interrupt handler. When
> this wakes up it can do more involved processing such as actually
> removing completed nodes from the queue and freeing up the resources
> associated with them (internal memory allocations, DRM object
> references, context reference, etc.). The work handler also checks the
> in flight count and calls the submission code if a new slot has
> appeared.
>
> When the scheduler's submit code is called, it scans the queued node
> list for the highest priority node that has no unmet dependencies.
> Note that the dependency calculation is complex as it must take
> inter-ring dependencies and potential preemptions into account. Note
> also that in the future this will be extended to include external
> dependencies such as the Android Native Sync file descriptors and/or
> the Linux dma-buf synchronisation scheme.
>
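The selection step can be sketched as below. This is a deliberately naive
version: it treats a dependency as met once the node depended upon has
completed, and it ignores the inter-ring and pre-emption cases noted above.
struct sketch_qnode and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPS 8

/* Hypothetical queue entry for the selection sketch. */
struct sketch_qnode {
	int priority;
	bool queued;     /* still waiting to be submitted */
	bool completed;  /* finished on the hardware */
	const struct sketch_qnode *deps[MAX_DEPS];
	size_t num_deps;
};

/* A node is ready only when every node it depends on has completed. */
static bool sketch_deps_met(const struct sketch_qnode *node)
{
	size_t i;

	for (i = 0; i < node->num_deps; i++)
		if (!node->deps[i]->completed)
			return false;

	return true;
}

/*
 * Return the highest priority queued node with no unmet dependencies,
 * or NULL if nothing is ready. Ties go to the earliest entry.
 */
static struct sketch_qnode *sketch_pick_next(struct sketch_qnode *nodes,
					     size_t count)
{
	struct sketch_qnode *best = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		struct sketch_qnode *node = &nodes[i];

		if (!node->queued || !sketch_deps_met(node))
			continue;
		if (!best || node->priority > best->priority)
			best = node;
	}

	return best;
}
```

A high priority node that is blocked is skipped in favour of a lower priority
node that is ready, which is why the priority bump on dependencies matters.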
> If a suitable node is found then it is sent to execbuff_final() for
> submission to the hardware. The in flight count is then re-checked and
> a new node popped from the list if appropriate.
>
> The scheduler also allows high priority batch buffers (e.g. from a
> desktop compositor) to jump ahead of whatever is already running if
> the underlying hardware supports pre-emption. In this situation, any
> work that was pre-empted is returned to the queued list ready to be
> resubmitted when no more high priority work is outstanding.
>
> Various IGT tests are in progress to test the scheduler's operation
> and will follow.
>
> v2: Updated for changes in struct fence patch series and other changes
> to underlying tree (e.g. removal of cliprects). Also changed priority
> levels to be signed +/-1023 range and reduced mutex lock usage.
>
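For illustration, keeping a requested priority inside the signed +/-1023
range could look like the helper below. The macro and function names are
hypothetical; in the kernel the same effect comes from the 'clamp' macro
mentioned in the v5 notes.

```c
/* Hypothetical names for the limits of the signed priority range. */
#define SKETCH_PRIORITY_MIN (-1023)
#define SKETCH_PRIORITY_MAX 1023

/* Pull an out-of-range priority back to the nearest limit. */
static int sketch_clamp_priority(int priority)
{
	if (priority < SKETCH_PRIORITY_MIN)
		return SKETCH_PRIORITY_MIN;
	if (priority > SKETCH_PRIORITY_MAX)
		return SKETCH_PRIORITY_MAX;
	return priority;
}
```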
> v3: More reuse of cached pointers rather than repeated dereferencing
> (David Gordon).
>
> Moved the dependency generation code out to a separate function for
> easier readability. Also added in support for the read-read
> optimisation.
>
> Major simplification of the DRM file close handler.
>
> Fixed up an overzealous WARN.
>
> Removed unnecessary flushing of the scheduler queue when waiting for a
> request.
>
> v4: Removed user land fence/sync integration as this is dependent upon
> the de-staging of the Android sync code. That de-staging is now being
> done by someone else. The sync support will be added back in to the
> scheduler in a separate patch series which must wait for the
> de-staging to be landed first.
>
> Added support for killing batches from contexts that were banned after
> the batches were submitted to the scheduler.
>
> Changed various comments to fix typos, update to reflect changes to
> the code, correct formatting and line wrapping, etc. Also wrapped
> various long lines and twiddled white space to keep the style checker
> happy.
>
> Changed a bunch of BUG_ONs to WARN_ONs as apparently the latter are
> preferred.
>
> Used the correct array memory allocator function (kmalloc_array
> instead of kmalloc).
>
> Fixed a variable type (review comment by Joonas).
>
> Fixed a WARN_ON firing incorrectly when removing killed nodes from the
> scheduler's queue.
>
> Dropped register definition update patch from this series. The changes
> are all for pre-emption so it makes more sense for it to be part of
> that series instead.
>
> v5: Reverted power management changes as they apparently conflict with
> mutex acquisition. Converted the override mask module parameter to a
> set of boolean enable flags (just one in this patch set, but others
> are added later for controlling pre-emption). [Chris Wilson]
>
> Removed lots of whitespace from i915_scheduler.c and re-ordered it to
> remove all forward declarations. Squashed down the i915_scheduler.c
> sections of various patches into the initial 'start of scheduler'
> patch. Thus the later patches simply hook existing code into
> various parts of the driver rather than adding the code as well. Added
> documentation to various functions. Re-worked the submit function in
> terms of mutex locking, error handling and exit paths. Split the
> delayed work handler function in half. Made use of the kernel 'clamp'
> macro. [Joonas Lahtinen]
>
> Dropped the 'immediate submission' override option. This was a half
> way house between full scheduler and direct submission and was only
> really useful during early debug.
>
> Added a re-install of the scheduler's interrupt hook around GPU reset.
> [Zhipeng Gong]
>
> Used lighter weight spinlocks.
>
> v6: Updated to newer nightly (lots of ring -> engine renaming).
>
> Added 'for_each_scheduler_node()' and 'assert_scheduler_lock_held()'
> helper macros. Renamed 'i915_gem_execbuff_release_batch_obj' to
> 'i915_gem_execbuf_release_batch_obj'. Updated to use 'to_i915()'
> instead of dev_private. Converted all enum labels to uppercase.
> Removed various unnecessary WARNs. Renamed 'saved_objects' to just
> 'objs'. More code refactoring. Removed even more white space.  Added
> an i915_scheduler_destroy() function instead of doing explicit clean
> up of scheduler internals from i915_driver_unload(). Changed extra
> boolean i915_wait_request() parameter to a flags word and consumed the
> original boolean parameter too. Also, replaced the
> i915_scheduler_is_request_tracked() function with
> i915_scheduler_is_mutex_required() and
> i915_scheduler_is_request_batch_buffer() as the need for the former
> has gone away and it was really being used to ask the latter two
> questions in a convoluted manner. Wrapped boolean 'flush' parameter to
> intel_engine_idle() with an _flush() macro.
> [review feedback from Joonas Lahtinen]
>
> Moved scheduler module parameter declaration to correct place in
> i915_params struct. [review feedback from Matt Roper]
>
> Added an admin only check when setting the tuning parameters via
> debugfs to prevent rogue user code trying to break the system with
> strange settings. [review feedback from Jesse Barnes]
>
> Added kerneldoc for intel_engine_idle().
>
> Added running totals of 'flying' and 'queued' nodes rather than
> re-calculating each time as a minor CPU performance optimisation.
>
> Removed support for out of order seqno completion. All the prep work
> patch series (seqno to request conversion, late seqno assignment,
> etc.) that has now been done means that the scheduler no longer
> generates out of order seqno completions. Thus all the complex code
> for coping with such is no longer required and can be removed.
>
> Fixed a bug in scheduler bypass mode introduced in the clean up code
> refactoring of v5. The clean up function was seeing the node in the
> wrong state and thus refusing to process it.
>
> Improved the throttle by file handle feature by changing from a simple
> 'return to userland when full' scheme to a 'sleep on request'
> scheme. The former could lead to busy polling and wasting lots of
> CPU time as user land continuously retried the execbuf IOCTL in a
> tight loop. Now the driver will sleep (without holding the mutex lock)
> on the oldest request outstanding for that file and then automatically
> retry. This is closer to the pre-scheduler behaviour of stalling on a
> full ring buffer.
>
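The shape of the 'sleep on request' scheme, as a single-threaded model. The
wait is simulated by retiring one request, whereas the driver would actually
block on the oldest outstanding request for that file with the mutex dropped.
struct sketch_file, its fields and both function names are assumptions, and
limit is assumed to be at least 1.

```c
/* Hypothetical per-file throttle state. */
struct sketch_file {
	unsigned int outstanding; /* requests queued for this file */
	unsigned int limit;       /* per-file queue limit, assumed >= 1 */
	unsigned int waits;       /* how often the caller had to sleep */
};

/*
 * Stand-in for sleeping on the oldest outstanding request: here the
 * request simply retires at once.
 */
static void sketch_wait_oldest(struct sketch_file *file)
{
	file->outstanding--;
	file->waits++;
}

/*
 * Submit one batch for this file. Rather than returning to user land
 * when the per-file limit is hit, wait for the oldest request and then
 * retry automatically.
 */
static void sketch_execbuf(struct sketch_file *file)
{
	while (file->outstanding >= file->limit)
		sketch_wait_oldest(file);

	file->outstanding++;
}
```

The retry loop lives in the driver, so user land sees one blocking IOCTL
instead of spinning on repeated failures.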
> [Patches against drm-intel-nightly tree fetched 13/04/2016 with struct
> fence conversion patches applied]
>
> Dave Gordon (2):
>    drm/i915: Cache request pointer in *_submission_final()
>    drm/i915: Add scheduling priority to per-context parameters
>
> John Harrison (32):
>    drm/i915: Add total count to context status debugfs output
>    drm/i915: Prelude to splitting i915_gem_do_execbuffer in two
>    drm/i915: Split i915_dem_do_execbuffer() in half
>    drm/i915: Re-instate request->uniq because it is extremely useful
>    drm/i915: Start of GPU scheduler
>    drm/i915: Disable hardware semaphores when GPU scheduler is enabled
>    drm/i915: Force MMIO flips when scheduler enabled
>    drm/i915: Added scheduler hook when closing DRM file handles
>    drm/i915: Added scheduler hook into i915_gem_request_notify()
>    drm/i915: Added deferred work handler for scheduler
>    drm/i915: Redirect execbuffer_final() via scheduler
>    drm/i915: Keep the reserved space mechanism happy
>    drm/i915: Added tracking/locking of batch buffer objects
>    drm/i915: Hook scheduler node clean up into retire requests
>    drm/i915: Added scheduler support to __wait_request() calls
>    drm/i915: Added scheduler support to page fault handler
>    drm/i915: Added scheduler flush calls to ring throttle and idle functions
>    drm/i915: Add scheduler hook to GPU reset
>    drm/i915: Added a module parameter to allow the scheduler to be disabled
>    drm/i915: Support for 'unflushed' ring idle
>    drm/i915: Defer seqno allocation until actual hardware submission time
>    drm/i915: Added trace points to scheduler
>    drm/i915: Added scheduler queue throttling by DRM file handle
>    drm/i915: Added debugfs interface to scheduler tuning parameters
>    drm/i915: Add early exit to execbuff_final() if insufficient ring space
>    drm/i915: Added scheduler statistic reporting to debugfs
>    drm/i915: Add scheduler support functions for TDR
>    drm/i915: Enable GPU scheduler by default
>    drm/i915: Add support for retro-actively banning batch buffers
>    drm/i915: Allow scheduler to manage inter-ring object synchronisation
>    drm/i915: Added debug state dump facilities to scheduler
>    drm/i915: Scheduler state dump via debugfs
>
>   drivers/gpu/drm/i915/Makefile              |    1 +
>   drivers/gpu/drm/i915/i915_debugfs.c        |  336 +++++-
>   drivers/gpu/drm/i915/i915_dma.c            |    5 +
>   drivers/gpu/drm/i915/i915_drv.c            |    9 +
>   drivers/gpu/drm/i915/i915_drv.h            |   58 +-
>   drivers/gpu/drm/i915/i915_gem.c            |  156 ++-
>   drivers/gpu/drm/i915/i915_gem_context.c    |   24 +
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  297 +++--
>   drivers/gpu/drm/i915/i915_params.c         |    4 +
>   drivers/gpu/drm/i915/i915_params.h         |    1 +
>   drivers/gpu/drm/i915/i915_scheduler.c      | 1709 ++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_scheduler.h      |  180 +++
>   drivers/gpu/drm/i915/i915_trace.h          |  225 +++-
>   drivers/gpu/drm/i915/intel_display.c       |   10 +-
>   drivers/gpu/drm/i915/intel_lrc.c           |  161 ++-
>   drivers/gpu/drm/i915/intel_lrc.h           |    1 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c    |   69 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.h    |    5 +-
>   include/uapi/drm/i915_drm.h                |    1 +
>   19 files changed, 3118 insertions(+), 134 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/i915_scheduler.c
>   create mode 100644 drivers/gpu/drm/i915/i915_scheduler.h
>
