From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Matthew Brost <matthew.brost@intel.com>,
	intel-gfx@lists.freedesktop.org,
	 dri-devel@lists.freedesktop.org
Cc: john.c.harrison@intel.com, daniele.ceraolospurio@intel.com
Subject: Re: [Intel-gfx] [PATCH 24/26] drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
Date: Tue, 12 Oct 2021 08:53:25 +0100
Message-ID: <033fd934-26b8-2888-8605-45f80a38dffa@linux.intel.com>
In-Reply-To: <20211004220637.14746-25-matthew.brost@intel.com>


On 04/10/2021 23:06, Matthew Brost wrote:
> Parallel submission create composite fences (dma_fence_array) for excl /
> shared slots in objects. The I915_GEM_BUSY IOCTL checks these slots to
> determine the busyness of the object. Prior to patch it only check if
> the fence in the slot was a i915_request. Update the check to understand
> composite fences and correctly report the busyness.
> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_busy.c      | 60 +++++++++++++++----
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  5 +-
>   drivers/gpu/drm/i915/i915_request.h           |  6 ++
>   3 files changed, 58 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> index 6234e17259c1..b89d173c62eb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> @@ -4,6 +4,8 @@
>    * Copyright © 2014-2016 Intel Corporation
>    */
>   
> +#include <linux/dma-fence-array.h>
> +
>   #include "gt/intel_engine.h"
>   
>   #include "i915_gem_ioctls.h"
> @@ -36,7 +38,7 @@ static __always_inline u32 __busy_write_id(u16 id)
>   }
>   
>   static __always_inline unsigned int
> -__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u16 id))
> +__busy_set_if_active(struct dma_fence *fence, u32 (*flag)(u16 id))
>   {
>   	const struct i915_request *rq;
>   
> @@ -46,29 +48,63 @@ __busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u16 id))
>   	 * to eventually flush us, but to minimise latency just ask the
>   	 * hardware.
>   	 *
> -	 * Note we only report on the status of native fences.
> +	 * Note we only report on the status of native fences and we currently
> +	 * have two native fences:
> +	 *
> +	 * 1. A composite fence (dma_fence_array) constructed of i915 requests
> +	 * created during a parallel submission. In this case we deconstruct the
> +	 * composite fence into individual i915 requests and check the status of
> +	 * each request.
> +	 *
> +	 * 2. A single i915 request.
>   	 */
> -	if (!dma_fence_is_i915(fence))
> +	if (dma_fence_is_array(fence)) {
> +		struct dma_fence_array *array = to_dma_fence_array(fence);
> +		struct dma_fence **child = array->fences;
> +		unsigned int nchild = array->num_fences;
> +
> +		do {
> +			struct dma_fence *current_fence = *child++;
> +
> +			/* Not an i915 fence, can't be busy per above */
> +			if (!dma_fence_is_i915(current_fence) ||
> +			    !test_bit(I915_FENCE_FLAG_COMPOSITE,
> +				      &current_fence->flags)) {
> +				return 0;
> +			}
> +
> +			rq = to_request(current_fence);
> +			if (!i915_request_completed(rq)) {
> +				BUILD_BUG_ON(!typecheck(u16,
> +							rq->engine->uabi_class));
> +				return flag(rq->engine->uabi_class);
> +			}
> +		} while (--nchild);

Do you even need to introduce I915_FENCE_FLAG_COMPOSITE? If parallel 
submit is the only possible creator of array fences then possibly not. 
Dropping it would probably result in less code, and code which keeps 
working in a hypothetical future. Otherwise you could add a debug-only 
BUG_ON (e.g. GEM_BUG_ON) for the case where an array fence contains a 
fence without I915_FENCE_FLAG_COMPOSITE set.

Secondly, for simplicity I'd also run the whole loop rather than return 
on the first busy or incompatible fence.

And finally, with all of the above in place, I think you could have a 
common function for the below (checking one fence) and call it both for 
a single fence and from the array loop above, for less duplication. 
(Even the BUILD_BUG_ON is duplicated, which makes no sense!)

End result would be a simpler patch, something like:

__busy_set_if_active_one(...)
{
	... existing __busy_set_if_active ...
}

__busy_set_if_active(...)
{
	...
	if (dma_fence_is_array(fence)) {
		...
		for (i = 0; i < array->num_fences; i++)
			flags |= __busy_set_if_active_one(...);
	} else {
		flags = __busy_set_if_active_one(...);
	}

	return flags;
}
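
To make the shape above concrete, here is a minimal userspace sketch of 
the same split. It is only a model under stated assumptions: struct 
mock_fence and its fields stand in for dma_fence / i915_request state 
(is_native for dma_fence_is_i915(), completed for 
i915_request_completed(), children for a dma_fence_array), and 
mock_read_flag is a simplified stand-in for the real flag helper. None 
of these names exist in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a fence; not a kernel type. */
struct mock_fence {
	int is_native;                 /* models dma_fence_is_i915() */
	int completed;                 /* models i915_request_completed() */
	uint16_t uabi_class;           /* models rq->engine->uabi_class */
	struct mock_fence **children;  /* non-NULL models a dma_fence_array */
	unsigned int num_children;
};

/* Simplified stand-in for a read flag: one bit per class, upper 16 bits. */
static uint32_t mock_read_flag(uint16_t id)
{
	return UINT32_C(0x10000) << id;
}

/* Check a single fence: foreign or completed fences contribute nothing. */
static uint32_t busy_set_if_active_one(const struct mock_fence *fence,
				       uint32_t (*flag)(uint16_t id))
{
	if (!fence->is_native || fence->completed)
		return 0;

	return flag(fence->uabi_class);
}

/* Walk the whole array and OR the per-fence results, no early return. */
static uint32_t busy_set_if_active(const struct mock_fence *fence,
				   uint32_t (*flag)(uint16_t id))
{
	uint32_t flags = 0;
	unsigned int i;

	assert(fence != NULL);

	if (fence->children) {
		for (i = 0; i < fence->num_children; i++)
			flags |= busy_set_if_active_one(fence->children[i],
							flag);
	} else {
		flags = busy_set_if_active_one(fence, flag);
	}

	return flags;
}
```

Note that with OR-ing, a foreign or completed child simply contributes 
nothing, whereas the loop in the patch returns 0 for the whole array as 
soon as it meets an incompatible child, and reports only the first busy 
engine class.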

Regards,

Tvrtko

> +
> +		/* All requests in array complete, not busy */
>   		return 0;
> +	} else {
> +		if (!dma_fence_is_i915(fence))
> +			return 0;
>   
> -	/* opencode to_request() in order to avoid const warnings */
> -	rq = container_of(fence, const struct i915_request, fence);
> -	if (i915_request_completed(rq))
> -		return 0;
> +		rq = to_request(fence);
> +		if (i915_request_completed(rq))
> +			return 0;
>   
> -	/* Beware type-expansion follies! */
> -	BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class));
> -	return flag(rq->engine->uabi_class);
> +		/* Beware type-expansion follies! */
> +		BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class));
> +		return flag(rq->engine->uabi_class);
> +	}
>   }
>   
>   static __always_inline unsigned int
> -busy_check_reader(const struct dma_fence *fence)
> +busy_check_reader(struct dma_fence *fence)
>   {
>   	return __busy_set_if_active(fence, __busy_read_flag);
>   }
>   
>   static __always_inline unsigned int
> -busy_check_writer(const struct dma_fence *fence)
> +busy_check_writer(struct dma_fence *fence)
>   {
>   	if (!fence)
>   		return 0;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 5c7fb6f68bbb..16276f406fd6 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2988,8 +2988,11 @@ eb_composite_fence_create(struct i915_execbuffer *eb, int out_fence_fd)
>   	if (!fences)
>   		return ERR_PTR(-ENOMEM);
>   
> -	for_each_batch_create_order(eb, i)
> +	for_each_batch_create_order(eb, i) {
>   		fences[i] = &eb->requests[i]->fence;
> +		__set_bit(I915_FENCE_FLAG_COMPOSITE,
> +			  &eb->requests[i]->fence.flags);
> +	}
>   
>   	fence_array = dma_fence_array_create(eb->num_batches,
>   					     fences,
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 24db8459376b..dc359242d1ae 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -156,6 +156,12 @@ enum {
>   	 * submission / relationship encoutered an error.
>   	 */
>   	I915_FENCE_FLAG_SKIP_PARALLEL,
> +
> +	/*
> +	 * I915_FENCE_FLAG_COMPOSITE - Indicates fence is part of a composite
> +	 * fence (dma_fence_array) and i915 generated for parallel submission.
> +	 */
> +	I915_FENCE_FLAG_COMPOSITE,
>   };
>   
>   /**
> 
