From: Philipp Stanner <phasta@mailbox.org>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	tursulin@ursulin.net, dakr@kernel.org,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency
Date: Mon, 26 May 2025 09:28:02 +0200	[thread overview]
Message-ID: <a3ef761d7ba3544798e04547ca882cc1ef4c5899.camel@mailbox.org> (raw)
In-Reply-To: <20250523125643.7540-2-christian.koenig@amd.com>

On Fri, 2025-05-23 at 14:56 +0200, Christian König wrote:
> It turned out that we can actually massively optimize here.
> 
> The previous code was horribly inefficient since it constantly
> released and re-acquired the lock of the xarray and started each
> iteration from the base of the array to avoid concurrent
> modification, which in our case doesn't exist.
> 
> In addition to that, the xas_find() and xas_store() functions are
> explicitly designed so that you can efficiently check entries and,
> if you don't find a match, store a new one at the end or replace
> existing ones.
> 
> So use xas_for_each()/xas_store() instead of xa_for_each()/xa_alloc().
> It's a bit more code, but should be much faster in the end.
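
For anyone less familiar with the advanced XArray API, the pattern being
switched to looks roughly like this. This is only a simplified sketch of the
two building blocks (cursor-based iterate-and-replace, and the
store-with-allocation-retry loop); the helper names are made up and the
reference handling of the real code is left out:

#include <linux/dma-fence.h>
#include <linux/xarray.h>

/* Illustrative only: replace the entry with the same fence context, if any. */
static int replace_same_context(struct xarray *deps, struct dma_fence *fence)
{
	XA_STATE(xas, deps, 0);		/* cursor, keeps the walk state */
	struct dma_fence *entry;

	xa_lock(deps);
	xas_for_each(&xas, entry, ULONG_MAX) {
		if (entry->context != fence->context)
			continue;
		dma_fence_put(entry);
		/* Store through the cursor, no re-walk from the root. */
		xas_store(&xas, fence);
		break;
	}
	xa_unlock(deps);

	return xas_error(&xas);
}

/* The usual XArray store-with-retry loop for when allocation may be needed. */
static int store_with_retry(struct xarray *xa, unsigned long index, void *item)
{
	XA_STATE(xas, xa, index);

	do {
		xas_lock(&xas);
		xas_store(&xas, item);
		xas_unlock(&xas);
		/* On -ENOMEM, xas_nomem() allocates outside the lock and asks us to retry. */
	} while (xas_nomem(&xas, GFP_KERNEL));

	return xas_error(&xas);
}

By contrast, xa_for_each() re-walks the tree from the top on every step and
xa_alloc() always searches for a free index, which is roughly the overhead
the patch gets rid of.
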
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 29 ++++++++++++++++++++---------
>  1 file changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index f7118497e47a..cf200b1b643e 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -871,10 +871,8 @@ EXPORT_SYMBOL(drm_sched_job_arm);
>  int drm_sched_job_add_dependency(struct drm_sched_job *job,
>  				 struct dma_fence *fence)
>  {
> +	XA_STATE(xas, &job->dependencies, 0);
>  	struct dma_fence *entry;
> -	unsigned long index;
> -	u32 id = 0;
> -	int ret;
>  
>  	if (!fence)
>  		return 0;
> @@ -883,24 +881,37 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
>  	 * This lets the size of the array of deps scale with the number of
>  	 * engines involved, rather than the number of BOs.
>  	 */
> -	xa_for_each(&job->dependencies, index, entry) {
> +	xa_lock(&job->dependencies);
> +	xas_for_each(&xas, entry, ULONG_MAX) {
>  		if (entry->context != fence->context)
>  			continue;
>  
>  		if (dma_fence_is_later(fence, entry)) {
>  			dma_fence_put(entry);
> -			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
> +			xas_store(&xas, fence);
>  		} else {
>  			dma_fence_put(fence);
>  		}
> -		return 0;
> +		xa_unlock(&job->dependencies);
> +		return xas_error(&xas);
>  	}
>  
> -	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
> -	if (ret != 0)
> +retry:
> +	entry = xas_store(&xas, fence);
> +	xa_unlock(&job->dependencies);
> +
> +	/* There shouldn't be any concurrent add, so no need to loop again */

Should we maybe add to the function documentation that this must not
be called concurrently?

It looks to me as if the current version would already be broken if
someone did that, so maybe it is also OK to just leave it as is.
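
Something like the below would probably be enough. Just a rough suggestion
for the wording; the surrounding kerneldoc text is paraphrased from memory,
only the sentence about concurrency would be new:

/**
 * drm_sched_job_add_dependency - adds the fence as a job dependency
 * @job: scheduler job to add the dependencies to
 * @fence: the dma_fence to add to the list of dependencies
 *
 * Note that @fence is consumed in both the success and the error case.
 *
 * Must not be called concurrently with itself for the same @job.
 *
 * Returns:
 * 0 on success, or an error on failing to expand the array.
 */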


P.


> +	if (xas_nomem(&xas, GFP_KERNEL)) {
> +		xa_lock(&job->dependencies);
> +		goto retry;
> +	}
> +
> +	if (xas_error(&xas))
>  		dma_fence_put(fence);
> +	else
> +		WARN_ON(entry);
>  
> -	return ret;
> +	return xas_error(&xas);
>  }
>  EXPORT_SYMBOL(drm_sched_job_add_dependency);
>  



Thread overview: 23+ messages
2025-05-23 12:56 Fixing AMDGPUs gang submit error handling Christian König
2025-05-23 12:56 ` [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency Christian König
2025-05-23 13:49   ` Tvrtko Ursulin
2025-05-23 14:11   ` Danilo Krummrich
2025-05-23 14:16     ` Danilo Krummrich
2025-05-26  9:25       ` Christian König
2025-05-26  9:34         ` Philipp Stanner
2025-05-26 11:16           ` Christian König
2025-05-26 11:27             ` Philipp Stanner
2025-05-28 12:30               ` Simona Vetter
2025-05-28 13:24                 ` Christian König
2025-05-28 13:29                 ` Alex Deucher
2025-05-28 14:38                   ` Danilo Krummrich
2025-05-28 14:47                     ` Danilo Krummrich
2025-06-03 11:34                       ` Simona Vetter
2025-05-24 11:17     ` Danilo Krummrich
2025-05-26 10:59       ` Christian König
2025-05-26 11:14         ` Danilo Krummrich
2025-05-26 11:36           ` Christian König
2025-05-26  7:28   ` Philipp Stanner [this message]
2025-05-23 12:56 ` [PATCH 2/4] drm/sched: add drm_sched_prealloc_dependency_slots Christian König
2025-05-23 12:56 ` [PATCH 3/4] drm/sched: Add a test for prealloced fence slots Christian König
2025-05-23 12:56 ` [PATCH 4/4] drm/amdgpu: fix gang submission error handling Christian König
