AMD-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 2/3] drm/amdgpu: increment sched score on entity selection
Date: Fri, 7 Nov 2025 11:26:29 +0100	[thread overview]
Message-ID: <5717c024-0200-4b23-a25b-681ef0937d6f@amd.com> (raw)
In-Reply-To: <20251107090425.23199-2-pierre-eric.pelloux-prayer@amd.com>



On 11/7/25 10:04, Pierre-Eric Pelloux-Prayer wrote:
> For hw engines that can't load balance jobs, entities are
> "statically" load balanced: on their first submit, they select
> the best scheduler based on its score.
> The score is made up of 2 parts:
> * the job queue depth (how many jobs are executing or waiting)
> * the number of entities assigned
> 
> The second part is only relevant for the static load balance:
> it's a way to consider how many entities are attached to this
> scheduler, knowing that if they ever submit jobs they will go
> to this one.
> 
> For rings that can load balance jobs freely, idle entities
> aren't a concern and shouldn't impact the scheduler's decisions.
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 22 +++++++++++++++++-----
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h |  1 +
>  2 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index afedea02188d..4d91cbcbcf25 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -209,6 +209,7 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>  	struct amdgpu_ctx_entity *entity;
>  	enum drm_sched_priority drm_prio;
>  	unsigned int hw_prio, num_scheds;
> +	struct amdgpu_ring *aring;
>  	int32_t ctx_prio;
>  	int r;
>  
> @@ -239,11 +240,13 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>  			goto error_free_entity;
>  	}
>  
> -	/* disable load balance if the hw engine retains context among dependent jobs */
> -	if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
> -	    hw_ip == AMDGPU_HW_IP_VCN_DEC ||
> -	    hw_ip == AMDGPU_HW_IP_UVD_ENC ||
> -	    hw_ip == AMDGPU_HW_IP_UVD) {
> +	sched = scheds[0];
> +	aring = container_of(sched, struct amdgpu_ring, sched);
> +
> +	if (aring->funcs->engine_retains_context) {
> +		/* Disable load balancing between multiple schedulers if the hw
> +		 * engine retains context among dependent jobs.
> +		 */
>  		sched = drm_sched_pick_best(scheds, num_scheds);
>  		scheds = &sched;
>  		num_scheds = 1;
> @@ -258,6 +261,12 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>  	if (cmpxchg(&ctx->entities[hw_ip][ring], NULL, entity))
>  		goto cleanup_entity;
>  
> +	if (aring->funcs->engine_retains_context) {
> +		aring = container_of(sched, struct amdgpu_ring, sched);
> +		entity->sched_score = aring->sched_score;
> +		atomic_inc(entity->sched_score);
> +	}
> +
>  	return 0;
>  
>  cleanup_entity:
> @@ -514,6 +523,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>  			if (!ctx->entities[i][j])
>  				continue;
>  
> +			if (ctx->entities[i][j]->sched_score)
> +				atomic_dec(ctx->entities[i][j]->sched_score);
> +
>  			drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>  		}
>  	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> index 090dfe86f75b..f7b44f96f374 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>  	uint32_t		hw_ip;
>  	uint64_t		sequence;
>  	struct drm_sched_entity	entity;
> +	atomic_t		*sched_score;

I would rather not have that additional member here.

In addition to that, we are messing with the internals of the scheduler here; we should probably have two clean functions to increase/decrease the score.

Regards,
Christian.

>  	struct dma_fence	*fences[];
>  };
>  



Thread overview: 7+ messages
2025-11-07  9:04 [PATCH v4 1/3] drm/amdgpu: add engine_retains_context to amdgpu_ring_funcs Pierre-Eric Pelloux-Prayer
2025-11-07  9:04 ` [PATCH v4 2/3] drm/amdgpu: increment sched score on entity selection Pierre-Eric Pelloux-Prayer
2025-11-07 10:26   ` Christian König [this message]
2025-11-07 10:39     ` Tvrtko Ursulin
2025-11-13 16:43       ` Pierre-Eric Pelloux-Prayer
2025-11-07 10:14 ` [PATCH v4 1/3] drm/amdgpu: add engine_retains_context to amdgpu_ring_funcs Christian König
2025-11-07 10:14 ` Tvrtko Ursulin
