* [Intel-gfx] [PATCH 00/25] Parallel submission aka multi-bb execbuf
From: Matthew Brost @ 2021-10-14 17:19 UTC
  To: intel-gfx, dri-devel; +Cc: john.c.harrison

As discussed in [1], we are introducing a new parallel submission uAPI
for i915 which allows more than one batch buffer (BB) to be submitted in
a single execbuf IOCTL. This is the implementation for both GuC and
execlists.
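
As a rough illustration of what "more than one BB per execbuf" means for
the object list, here is a minimal sketch. It assumes (this is not stated
in this cover letter) that a parallel context of width N consumes N batch
buffers taken from either the start or the end of the exec object list.
All names below (sketch_execbuf, SKETCH_BATCH_FIRST, sketch_is_batch, ...)
are hypothetical and are not the driver's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; it stands in for a "batches come
 * first" execbuf flag and does not use any real uAPI value. */
#define SKETCH_BATCH_FIRST (1u << 0)

/* Hypothetical, simplified view of one execbuf call. */
struct sketch_execbuf {
	size_t buffer_count;	/* objects in the exec list */
	size_t num_batches;	/* width of the parallel context */
	unsigned int flags;
};

/* Reject calls where the list cannot hold one BB per context. */
static bool sketch_execbuf_valid(const struct sketch_execbuf *eb)
{
	return eb->num_batches >= 1 && eb->num_batches <= eb->buffer_count;
}

/* True if object i in the exec list is treated as a batch buffer. */
static bool sketch_is_batch(const struct sketch_execbuf *eb, size_t i)
{
	assert(sketch_execbuf_valid(eb));
	assert(i < eb->buffer_count);

	if (eb->flags & SKETCH_BATCH_FIRST)
		return i < eb->num_batches;
	return i >= eb->buffer_count - eb->num_batches;
}

/* Walk the list and count batch buffers: one per parallel context. */
static size_t sketch_count_batches(const struct sketch_execbuf *eb)
{
	size_t i, n = 0;

	for (i = 0; i < eb->buffer_count; i++)
		if (sketch_is_batch(eb, i))
			n++;
	return n;
}
```

With buffer_count = 5 and num_batches = 2, objects 3 and 4 would be the
batches by default, and objects 0 and 1 when SKETCH_BATCH_FIRST is set.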

In addition to the selftests in this series, an IGT is available,
implemented in the first 4 patches of [2].

The execbuf IOCTL changes have been done in a single large patch (#20)
as all the changes flow together, and I believe a single patch will be
better if someone has to look up this change in the future. I can split
it into a series of smaller patches if desired.

This code is available in a public repo [3] for UMD teams to test their
code on.

v2: Drop complicated state machine to block in kernel if no guc_ids
available, perma-pin parallel contexts, rework execbuf IOCTL to be a
series of loops inside the IOCTL rather than 1 large one on the outside,
address Daniel Vetter's comments
v3: Address John Harrison's comments, add a couple of patches which fix
bugs found internally
v4: Address John Harrison's latest round of comments
v5: Address John Harrison's latest round of comments, resend for CI

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

[1] https://patchwork.freedesktop.org/series/92028/
[2] https://patchwork.freedesktop.org/series/93071/
[3] https://gitlab.freedesktop.org/mbrost/mbrost-drm-intel/-/tree/drm-intel-parallel

Matthew Brost (25):
  drm/i915/guc: Move GuC guc_id allocation under submission state
    sub-struct
  drm/i915/guc: Take GT PM ref when deregistering context
  drm/i915/guc: Take engine PM when a context is pinned with GuC
    submission
  drm/i915/guc: Don't call switch_to_kernel_context with GuC submission
  drm/i915: Add logical engine mapping
  drm/i915: Expose logical engine instance to user
  drm/i915/guc: Introduce context parent-child relationship
  drm/i915/guc: Add multi-lrc context registration
  drm/i915/guc: Ensure GuC schedule operations do not operate on child
    contexts
  drm/i915/guc: Assign contexts in parent-child relationship consecutive
    guc_ids
  drm/i915/guc: Implement parallel context pin / unpin functions
  drm/i915/guc: Implement multi-lrc submission
  drm/i915/guc: Insert submit fences between requests in parent-child
    relationship
  drm/i915/guc: Implement multi-lrc reset
  drm/i915/guc: Update debugfs for GuC multi-lrc
  drm/i915/guc: Connect UAPI to GuC multi-lrc interface
  drm/i915/doc: Update parallel submit doc to point to i915_drm.h
  drm/i915/guc: Add basic GuC multi-lrc selftest
  drm/i915/guc: Implement no mid batch preemption for multi-lrc
  drm/i915: Multi-BB execbuf
  drm/i915/guc: Handle errors in multi-lrc requests
  drm/i915: Make request conflict tracking understand parallel submits
  drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
  drm/i915: Enable multi-bb execbuf
  drm/i915/execlists: Weak parallel submission support for execlists

 Documentation/gpu/rfc/i915_parallel_execbuf.h |  122 --
 Documentation/gpu/rfc/i915_scheduler.rst      |    4 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |   57 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  229 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   16 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  786 ++++++---
 drivers/gpu/drm/i915/gt/intel_context.c       |   50 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   56 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   73 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   12 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   66 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   13 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.h     |   37 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |    7 +
 .../drm/i915/gt/intel_execlists_submission.c  |   63 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h         |   14 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |    7 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |   12 +-
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |    1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   29 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   54 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |    2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |   24 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   34 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1444 ++++++++++++++---
 .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  179 ++
 drivers/gpu/drm/i915/i915_query.c             |    2 +
 drivers/gpu/drm/i915/i915_request.c           |  143 +-
 drivers/gpu/drm/i915/i915_request.h           |   23 +
 drivers/gpu/drm/i915/i915_vma.c               |   21 +-
 drivers/gpu/drm/i915/i915_vma.h               |   13 +-
 drivers/gpu/drm/i915/intel_wakeref.h          |   12 +
 .../drm/i915/selftests/i915_live_selftests.h  |    1 +
 include/uapi/drm/i915_drm.h                   |  139 +-
 34 files changed, 3053 insertions(+), 692 deletions(-)
 delete mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c

-- 
2.32.0


* [Intel-gfx] [PATCH 00/25] Parallel submission aka multi-bb execbuf
From: Matthew Brost @ 2021-10-13 20:42 UTC
  To: intel-gfx, dri-devel; +Cc: john.c.harrison

As discussed in [1], we are introducing a new parallel submission uAPI
for i915 which allows more than one batch buffer (BB) to be submitted in
a single execbuf IOCTL. This is the implementation for both GuC and
execlists.

In addition to the selftests in this series, an IGT is available,
implemented in the first 4 patches of [2].

The execbuf IOCTL changes have been done in a single large patch (#20)
as all the changes flow together, and I believe a single patch will be
better if someone has to look up this change in the future. I can split
it into a series of smaller patches if desired.

This code is available in a public repo [3] for UMD teams to test their
code on.

v2: Drop complicated state machine to block in kernel if no guc_ids
available, perma-pin parallel contexts, rework execbuf IOCTL to be a
series of loops inside the IOCTL rather than 1 large one on the outside,
address Daniel Vetter's comments
v3: Address John Harrison's comments, add a couple of patches which fix
bugs found internally
v4: Address John Harrison's latest round of comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

[1] https://patchwork.freedesktop.org/series/92028/
[2] https://patchwork.freedesktop.org/series/93071/
[3] https://gitlab.freedesktop.org/mbrost/mbrost-drm-intel/-/tree/drm-intel-parallel

Matthew Brost (25):
  drm/i915/guc: Move GuC guc_id allocation under submission state
    sub-struct
  drm/i915/guc: Take GT PM ref when deregistering context
  drm/i915/guc: Take engine PM when a context is pinned with GuC
    submission
  drm/i915/guc: Don't call switch_to_kernel_context with GuC submission
  drm/i915: Add logical engine mapping
  drm/i915: Expose logical engine instance to user
  drm/i915/guc: Introduce context parent-child relationship
  drm/i915/guc: Add multi-lrc context registration
  drm/i915/guc: Ensure GuC schedule operations do not operate on child
    contexts
  drm/i915/guc: Assign contexts in parent-child relationship consecutive
    guc_ids
  drm/i915/guc: Implement parallel context pin / unpin functions
  drm/i915/guc: Implement multi-lrc submission
  drm/i915/guc: Insert submit fences between requests in parent-child
    relationship
  drm/i915/guc: Implement multi-lrc reset
  drm/i915/guc: Update debugfs for GuC multi-lrc
  drm/i915/guc: Connect UAPI to GuC multi-lrc interface
  drm/i915/doc: Update parallel submit doc to point to i915_drm.h
  drm/i915/guc: Add basic GuC multi-lrc selftest
  drm/i915/guc: Implement no mid batch preemption for multi-lrc
  drm/i915: Multi-BB execbuf
  drm/i915/guc: Handle errors in multi-lrc requests
  drm/i915: Make request conflict tracking understand parallel submits
  drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
  drm/i915: Enable multi-bb execbuf
  drm/i915/execlists: Weak parallel submission support for execlists

 Documentation/gpu/rfc/i915_parallel_execbuf.h |  122 --
 Documentation/gpu/rfc/i915_scheduler.rst      |    4 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |   57 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  227 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   16 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  786 ++++++---
 drivers/gpu/drm/i915/gt/intel_context.c       |   50 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   54 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   73 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   12 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   66 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   13 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.h     |   37 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |    7 +
 .../drm/i915/gt/intel_execlists_submission.c  |   63 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h         |   14 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |    7 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |   12 +-
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |    1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   29 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   54 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |    2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |   24 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   34 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1452 ++++++++++++++---
 .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  179 ++
 drivers/gpu/drm/i915/i915_query.c             |    2 +
 drivers/gpu/drm/i915/i915_request.c           |  143 +-
 drivers/gpu/drm/i915/i915_request.h           |   23 +
 drivers/gpu/drm/i915/i915_vma.c               |   21 +-
 drivers/gpu/drm/i915/i915_vma.h               |   13 +-
 drivers/gpu/drm/i915/intel_wakeref.h          |   12 +
 .../drm/i915/selftests/i915_live_selftests.h  |    1 +
 include/uapi/drm/i915_drm.h                   |  139 +-
 34 files changed, 3056 insertions(+), 693 deletions(-)
 delete mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c

-- 
2.32.0



Thread overview: 69+ messages
2021-10-14 17:19 [Intel-gfx] [PATCH 00/25] Parallel submission aka multi-bb execbuf Matthew Brost
2021-10-14 17:19 ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 01/25] drm/i915/guc: Move GuC guc_id allocation under submission state sub-struct Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 02/25] drm/i915/guc: Take GT PM ref when deregistering context Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 03/25] drm/i915/guc: Take engine PM when a context is pinned with GuC submission Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 04/25] drm/i915/guc: Don't call switch_to_kernel_context " Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 05/25] drm/i915: Add logical engine mapping Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 06/25] drm/i915: Expose logical engine instance to user Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 07/25] drm/i915/guc: Introduce context parent-child relationship Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 08/25] drm/i915/guc: Add multi-lrc context registration Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 18:18   ` [Intel-gfx] " John Harrison
2021-10-14 18:18     ` John Harrison
2021-10-14 17:19 ` [Intel-gfx] [PATCH 09/25] drm/i915/guc: Ensure GuC schedule operations do not operate on child contexts Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 10/25] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 11/25] drm/i915/guc: Implement parallel context pin / unpin functions Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 12/25] drm/i915/guc: Implement multi-lrc submission Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 13/25] drm/i915/guc: Insert submit fences between requests in parent-child relationship Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 14/25] drm/i915/guc: Implement multi-lrc reset Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 15/25] drm/i915/guc: Update debugfs for GuC multi-lrc Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 18:24   ` [Intel-gfx] " John Harrison
2021-10-14 18:24     ` John Harrison
2021-10-14 17:19 ` [Intel-gfx] [PATCH 17/25] drm/i915/doc: Update parallel submit doc to point to i915_drm.h Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 18/25] drm/i915/guc: Add basic GuC multi-lrc selftest Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] [PATCH 19/25] drm/i915/guc: Implement no mid batch preemption for multi-lrc Matthew Brost
2021-10-14 17:19   ` Matthew Brost
2021-10-14 17:20 ` [Intel-gfx] [PATCH 20/25] drm/i915: Multi-BB execbuf Matthew Brost
2021-10-14 17:20   ` Matthew Brost
2021-10-14 18:27   ` [Intel-gfx] " John Harrison
2021-10-14 18:27     ` John Harrison
2021-10-14 17:20 ` [Intel-gfx] [PATCH 21/25] drm/i915/guc: Handle errors in multi-lrc requests Matthew Brost
2021-10-14 17:20   ` Matthew Brost
2021-10-14 17:20 ` [Intel-gfx] [PATCH 22/25] drm/i915: Make request conflict tracking understand parallel submits Matthew Brost
2021-10-14 17:20   ` Matthew Brost
2021-10-14 17:20 ` [Intel-gfx] [PATCH 23/25] drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences Matthew Brost
2021-10-14 17:20   ` Matthew Brost
2021-10-14 17:20 ` [Intel-gfx] [PATCH 24/25] drm/i915: Enable multi-bb execbuf Matthew Brost
2021-10-14 17:20   ` Matthew Brost
2021-10-14 18:29   ` [Intel-gfx] " John Harrison
2021-10-14 18:29     ` John Harrison
2021-10-14 17:20 ` [Intel-gfx] [PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists Matthew Brost
2021-10-14 17:20   ` Matthew Brost
2021-10-14 18:42   ` [Intel-gfx] " John Harrison
2021-10-14 18:42     ` John Harrison
2021-10-14 18:55     ` [Intel-gfx] " Matthew Brost
2021-10-14 18:55       ` Matthew Brost
2021-10-14 23:50 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Parallel submission aka multi-bb execbuf (rev7) Patchwork
2021-10-14 23:51 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-10-15  0:25 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-10-15  6:12 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2021-10-13 20:42 [Intel-gfx] [PATCH 00/25] Parallel submission aka multi-bb execbuf Matthew Brost
2021-10-13 20:42 ` [Intel-gfx] [PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists Matthew Brost
