From: Matthew Brost <matthew.brost@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
john.c.harrison@intel.com, daniele.ceraolospurio@intel.com
Subject: Re: [Intel-gfx] [PATCH 24/26] drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
Date: Tue, 12 Oct 2021 11:31:44 -0700 [thread overview]
Message-ID: <20211012183144.GA9834@jons-linux-dev-box> (raw)
In-Reply-To: <033fd934-26b8-2888-8605-45f80a38dffa@linux.intel.com>
On Tue, Oct 12, 2021 at 08:53:25AM +0100, Tvrtko Ursulin wrote:
>
> On 04/10/2021 23:06, Matthew Brost wrote:
> > Parallel submission create composite fences (dma_fence_array) for excl /
> > shared slots in objects. The I915_GEM_BUSY IOCTL checks these slots to
> > determine the busyness of the object. Prior to patch it only check if
> > the fence in the slot was a i915_request. Update the check to understand
> > composite fences and correctly report the busyness.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/i915/gem/i915_gem_busy.c | 60 +++++++++++++++----
> > .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 5 +-
> > drivers/gpu/drm/i915/i915_request.h | 6 ++
> > 3 files changed, 58 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > index 6234e17259c1..b89d173c62eb 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > @@ -4,6 +4,8 @@
> > * Copyright © 2014-2016 Intel Corporation
> > */
> > +#include <linux/dma-fence-array.h>
> > +
> > #include "gt/intel_engine.h"
> > #include "i915_gem_ioctls.h"
> > @@ -36,7 +38,7 @@ static __always_inline u32 __busy_write_id(u16 id)
> > }
> > static __always_inline unsigned int
> > -__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u16 id))
> > +__busy_set_if_active(struct dma_fence *fence, u32 (*flag)(u16 id))
> > {
> > const struct i915_request *rq;
> > @@ -46,29 +48,63 @@ __busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u16 id))
> > * to eventually flush us, but to minimise latency just ask the
> > * hardware.
> > *
> > - * Note we only report on the status of native fences.
> > + * Note we only report on the status of native fences and we currently
> > + * have two native fences:
> > + *
> > + * 1. A composite fence (dma_fence_array) constructed of i915 requests
> > + * created during a parallel submission. In this case we deconstruct the
> > + * composite fence into individual i915 requests and check the status of
> > + * each request.
> > + *
> > + * 2. A single i915 request.
> > */
> > - if (!dma_fence_is_i915(fence))
> > + if (dma_fence_is_array(fence)) {
> > + struct dma_fence_array *array = to_dma_fence_array(fence);
> > + struct dma_fence **child = array->fences;
> > + unsigned int nchild = array->num_fences;
> > +
> > + do {
> > + struct dma_fence *current_fence = *child++;
> > +
> > + /* Not an i915 fence, can't be busy per above */
> > + if (!dma_fence_is_i915(current_fence) ||
> > + !test_bit(I915_FENCE_FLAG_COMPOSITE,
> > + ¤t_fence->flags)) {
> > + return 0;
> > + }
> > +
> > + rq = to_request(current_fence);
> > + if (!i915_request_completed(rq)) {
> > + BUILD_BUG_ON(!typecheck(u16,
> > + rq->engine->uabi_class));
> > + return flag(rq->engine->uabi_class);
> > + }
> > + } while (--nchild);
>
> Do you even need to introduce I915_FENCE_FLAG_COMPOSITE? If parallel submit
> is the only possible creator of array fences then possibly not. Probably
> even would result in less code which even keeps working in a hypothetical
> future. Otherwise you could add a debug bug on if array fence contains a
> fence without I915_FENCE_FLAG_COMPOSITE set.
>
Certainly other drivers can create a dma fence array and in theory could
include a i915_request in that array. Adding this flag makes it clear
that this fence was created by i915 for parallel submission and future
proofs this code.
> Secondly, I'd also run the whole loop and not return on first busy or
> incompatible for simplicity.
>
I disagree. Short circuiting when a condition is found is pretty
standard and not hard to understand.
> And finally, with all above in place, I think you could have common function
> for the below (checking one fence) and call that both for a single fence and
> from an array loop above for less duplication. (Even duplicated BUILD_BUG_ON
> which makes no sense!)
>
Yea duplicating the BUILD_BUG_ON doesn't make a ton of sense. Will
remove.
Disagree on the helper, the code paths are different enough to just open
code this.
Matt
> End result would be a simpler patch like:
>
> __busy_set_if_active_one(...)
> {
> .. existing __busy_set_if_active ..
> }
>
> __busy_set_if_active(..)
> {
> ...
> if (dma_fence_is_array(fence)) {
> ...
> for (i = 0; i < array->num_fences; i++)
> flags |= __busy_set_if_active_one(...);
> } else {
> flags = __busy_set_if_active_one(...);
> }
>
> Regards,
>
> Tvrtko
>
> > +
> > + /* All requests in array complete, not busy */
> > return 0;
> > + } else {
> > + if (!dma_fence_is_i915(fence))
> > + return 0;
> > - /* opencode to_request() in order to avoid const warnings */
> > - rq = container_of(fence, const struct i915_request, fence);
> > - if (i915_request_completed(rq))
> > - return 0;
> > + rq = to_request(fence);
> > + if (i915_request_completed(rq))
> > + return 0;
> > - /* Beware type-expansion follies! */
> > - BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class));
> > - return flag(rq->engine->uabi_class);
> > + /* Beware type-expansion follies! */
> > + BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class));
> > + return flag(rq->engine->uabi_class);
> > + }
> > }
> > static __always_inline unsigned int
> > -busy_check_reader(const struct dma_fence *fence)
> > +busy_check_reader(struct dma_fence *fence)
> > {
> > return __busy_set_if_active(fence, __busy_read_flag);
> > }
> > static __always_inline unsigned int
> > -busy_check_writer(const struct dma_fence *fence)
> > +busy_check_writer(struct dma_fence *fence)
> > {
> > if (!fence)
> > return 0;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index 5c7fb6f68bbb..16276f406fd6 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -2988,8 +2988,11 @@ eb_composite_fence_create(struct i915_execbuffer *eb, int out_fence_fd)
> > if (!fences)
> > return ERR_PTR(-ENOMEM);
> > - for_each_batch_create_order(eb, i)
> > + for_each_batch_create_order(eb, i) {
> > fences[i] = &eb->requests[i]->fence;
> > + __set_bit(I915_FENCE_FLAG_COMPOSITE,
> > + &eb->requests[i]->fence.flags);
> > + }
> > fence_array = dma_fence_array_create(eb->num_batches,
> > fences,
> > diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> > index 24db8459376b..dc359242d1ae 100644
> > --- a/drivers/gpu/drm/i915/i915_request.h
> > +++ b/drivers/gpu/drm/i915/i915_request.h
> > @@ -156,6 +156,12 @@ enum {
> > * submission / relationship encoutered an error.
> > */
> > I915_FENCE_FLAG_SKIP_PARALLEL,
> > +
> > + /*
> > + * I915_FENCE_FLAG_COMPOSITE - Indicates fence is part of a composite
> > + * fence (dma_fence_array) and i915 generated for parallel submission.
> > + */
> > + I915_FENCE_FLAG_COMPOSITE,
> > };
> > /**
> >
next prev parent reply other threads:[~2021-10-12 18:36 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-04 22:06 [Intel-gfx] [PATCH 00/26] Parallel submission aka multi-bb execbuf Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 01/26] drm/i915/guc: Move GuC guc_id allocation under submission state sub-struct Matthew Brost
2021-10-07 3:06 ` John Harrison
2021-10-07 15:05 ` Matthew Brost
2021-10-07 18:13 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 02/26] drm/i915/guc: Take GT PM ref when deregistering context Matthew Brost
2021-10-07 3:37 ` John Harrison
2021-10-08 1:28 ` Matthew Brost
2021-10-08 18:23 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 03/26] drm/i915/guc: Take engine PM when a context is pinned with GuC submission Matthew Brost
2021-10-07 3:45 ` John Harrison
2021-10-07 15:19 ` Matthew Brost
2021-10-07 18:15 ` John Harrison
2021-10-08 1:23 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 04/26] drm/i915/guc: Don't call switch_to_kernel_context " Matthew Brost
2021-10-07 3:49 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 05/26] drm/i915: Add logical engine mapping Matthew Brost
2021-10-07 19:03 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 06/26] drm/i915: Expose logical engine instance to user Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 07/26] drm/i915/guc: Introduce context parent-child relationship Matthew Brost
2021-10-07 19:35 ` John Harrison
2021-10-08 18:33 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 08/26] drm/i915/guc: Add multi-lrc context registration Matthew Brost
2021-10-07 19:50 ` John Harrison
2021-10-08 1:31 ` Matthew Brost
2021-10-08 17:20 ` John Harrison
2021-10-08 17:29 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 09/26] drm/i915/guc: Ensure GuC schedule operations do not operate on child contexts Matthew Brost
2021-10-07 20:23 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 10/26] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids Matthew Brost
2021-10-07 22:03 ` John Harrison
2021-10-08 1:21 ` Matthew Brost
2021-10-08 16:40 ` John Harrison
2021-10-13 18:03 ` Matthew Brost
2021-10-13 19:11 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 11/26] drm/i915/guc: Implement parallel context pin / unpin functions Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 12/26] drm/i915/guc: Implement multi-lrc submission Matthew Brost
2021-10-05 7:55 ` kernel test robot
2021-10-05 10:37 ` kernel test robot
2021-10-08 17:20 ` John Harrison
2021-10-13 18:24 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 13/26] drm/i915/guc: Insert submit fences between requests in parent-child relationship Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 14/26] drm/i915/guc: Implement multi-lrc reset Matthew Brost
2021-10-08 17:39 ` John Harrison
2021-10-08 17:56 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 15/26] drm/i915/guc: Update debugfs for GuC multi-lrc Matthew Brost
2021-10-08 17:46 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 16/26] drm/i915: Fix bug in user proto-context creation that leaked contexts Matthew Brost
2021-10-08 17:49 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 17/26] drm/i915/guc: Connect UAPI to GuC multi-lrc interface Matthew Brost
2021-10-11 22:09 ` John Harrison
2021-10-11 22:59 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 18/26] drm/i915/doc: Update parallel submit doc to point to i915_drm.h Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 19/26] drm/i915/guc: Add basic GuC multi-lrc selftest Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 20/26] drm/i915/guc: Implement no mid batch preemption for multi-lrc Matthew Brost
2021-10-11 23:32 ` John Harrison
2021-10-13 1:52 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 21/26] drm/i915: Multi-BB execbuf Matthew Brost
2021-10-05 8:31 ` kernel test robot
2021-10-05 17:02 ` Matthew Brost
2021-10-06 20:46 ` Matthew Brost
2021-10-12 21:22 ` John Harrison
2021-10-13 0:37 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 22/26] drm/i915/guc: Handle errors in multi-lrc requests Matthew Brost
2021-10-12 21:56 ` John Harrison
2021-10-13 0:18 ` Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 23/26] drm/i915: Make request conflict tracking understand parallel submits Matthew Brost
2021-10-12 22:08 ` John Harrison
2021-10-13 0:32 ` Matthew Brost
2021-10-13 19:35 ` John Harrison
2021-10-13 17:51 ` Matthew Brost
2021-10-13 19:25 ` John Harrison
2021-10-04 22:06 ` [Intel-gfx] [PATCH 24/26] drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences Matthew Brost
2021-10-11 22:15 ` Daniele Ceraolo Spurio
2021-10-12 7:53 ` Tvrtko Ursulin
2021-10-12 18:31 ` Matthew Brost [this message]
2021-10-04 22:06 ` [Intel-gfx] [PATCH 25/26] drm/i915: Enable multi-bb execbuf Matthew Brost
2021-10-04 22:06 ` [Intel-gfx] [PATCH 26/26] drm/i915/execlists: Weak parallel submission support for execlists Matthew Brost
2021-10-04 22:21 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Parallel submission aka multi-bb execbuf (rev4) Patchwork
2021-10-12 22:15 ` John Harrison
2021-10-13 0:15 ` Matthew Brost
2021-10-13 19:24 ` John Harrison
2021-10-04 22:23 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-10-04 22:26 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
2021-10-12 22:15 ` John Harrison
2021-10-13 0:12 ` Matthew Brost
2021-10-04 22:54 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-10-05 1:49 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Parallel submission aka multi-bb execbuf (rev5) Patchwork
2021-10-05 1:51 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-10-05 1:54 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
2021-10-05 2:21 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-10-12 18:11 ` [Intel-gfx] [PATCH 02/26] drm/i915/guc: Take GT PM ref when deregistering context Matthew Brost
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211012183144.GA9834@jons-linux-dev-box \
--to=matthew.brost@intel.com \
--cc=daniele.ceraolospurio@intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=john.c.harrison@intel.com \
--cc=tvrtko.ursulin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox