dri-devel.lists.freedesktop.org archive mirror
* [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection
@ 2025-08-22 13:43 Pierre-Eric Pelloux-Prayer
  2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
  2025-09-01  9:02 ` [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Tvrtko Ursulin
  0 siblings, 2 replies; 7+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-08-22 13:43 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie, Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, amd-gfx, dri-devel, linux-kernel

For hw engines that can't load balance jobs, entities are
"statically" load balanced: on their first submit, they select
the best scheduler based on its score.
The score is made up of 2 parts:
* the job queue depth (how many jobs are executing/waiting)
* the number of entities assigned

The second part is only relevant for the static load balance:
it's a way to consider how many entities are attached to this
scheduler, knowing that if they ever submit jobs they will go
to this one.

For rings that can load balance jobs freely, idle entities
aren't a concern and shouldn't impact the scheduler's decisions.
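
For context, the selection step amounts to picking the scheduler whose
score is currently lowest. A simplified sketch of that logic follows
(illustrative only; the real drm_sched_pick_best() additionally skips
schedulers that aren't ready):

  /* Simplified sketch of score-based selection; "score" is the
   * atomic_t each drm_gpu_scheduler points at. */
  static struct drm_gpu_scheduler *
  pick_lowest_score(struct drm_gpu_scheduler **scheds, unsigned int num)
  {
          struct drm_gpu_scheduler *best = NULL;
          unsigned int i, best_score = UINT_MAX;

          for (i = 0; i < num; i++) {
                  unsigned int score = atomic_read(scheds[i]->score);

                  if (score < best_score) {
                          best_score = score;
                          best = scheds[i];
                  }
          }

          return best;
  }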

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 23 ++++++++++++++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h |  1 +
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index f5d5c45ddc0d..4a078d2d98c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -206,9 +206,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
 {
 	struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
 	struct amdgpu_device *adev = ctx->mgr->adev;
+	bool static_load_balancing = false;
 	struct amdgpu_ctx_entity *entity;
 	enum drm_sched_priority drm_prio;
 	unsigned int hw_prio, num_scheds;
+	struct amdgpu_ring *aring;
 	int32_t ctx_prio;
 	int r;
 
@@ -236,17 +238,22 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
 		r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
 						&num_scheds, &scheds);
 		if (r)
-			goto cleanup_entity;
+			goto error_free_entity;
 	}
 
 	/* disable load balance if the hw engine retains context among dependent jobs */
-	if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
-	    hw_ip == AMDGPU_HW_IP_VCN_DEC ||
-	    hw_ip == AMDGPU_HW_IP_UVD_ENC ||
-	    hw_ip == AMDGPU_HW_IP_UVD) {
+	static_load_balancing = hw_ip == AMDGPU_HW_IP_VCN_ENC ||
+				hw_ip == AMDGPU_HW_IP_VCN_DEC ||
+				hw_ip == AMDGPU_HW_IP_UVD_ENC ||
+				hw_ip == AMDGPU_HW_IP_UVD;
+
+	if (static_load_balancing) {
 		sched = drm_sched_pick_best(scheds, num_scheds);
 		scheds = &sched;
 		num_scheds = 1;
+		aring = container_of(sched, struct amdgpu_ring, sched);
+		entity->sched_ring_score = aring->sched_score;
+		atomic_inc(entity->sched_ring_score);
 	}
 
 	r = drm_sched_entity_init(&entity->entity, drm_prio, scheds, num_scheds,
@@ -264,6 +271,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
 	drm_sched_entity_fini(&entity->entity);
 
 error_free_entity:
+	if (static_load_balancing)
+		atomic_dec(entity->sched_ring_score);
+
 	kfree(entity);
 
 	return r;
@@ -514,6 +524,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
 			if (!ctx->entities[i][j])
 				continue;
 
+			if (ctx->entities[i][j]->sched_ring_score)
+				atomic_dec(ctx->entities[i][j]->sched_ring_score);
+
 			drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
 		}
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index 090dfe86f75b..076a0e165ce0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
 	uint32_t		hw_ip;
 	uint64_t		sequence;
 	struct drm_sched_entity	entity;
+	atomic_t		*sched_ring_score;
 	struct dma_fence	*fences[];
 };
 
-- 
2.43.0



* [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
  2025-08-22 13:43 [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Pierre-Eric Pelloux-Prayer
@ 2025-08-22 13:43 ` Pierre-Eric Pelloux-Prayer
  2025-08-25 13:13   ` Philipp Stanner
  2025-09-01  9:20   ` Tvrtko Ursulin
  2025-09-01  9:02 ` [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Tvrtko Ursulin
  1 sibling, 2 replies; 7+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-08-22 13:43 UTC (permalink / raw)
  To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: Pierre-Eric Pelloux-Prayer, dri-devel, linux-kernel

Currently, the scheduler score is incremented when a job is pushed to an
entity and when an entity is attached to the scheduler.

This leads to some bad scheduling decision where the score value is
largely made of idle entities.

For instance, a scenario with 2 schedulers and where 10 entities submit
a single job, then do nothing, each scheduler will probably end up with
a score of 5.
Now, 5 userspace apps exit, so their entities will be dropped. In
the worst case, these apps' entities where all attached to the same
scheduler and we end up with score=5 (the 5 remaining entities) and
score=0, despite the 2 schedulers being idle.
When new entities show up, they will all select the second scheduler
based on its low score value, instead of alternating between the 2.
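
To make the worst case concrete, here is a tiny user-space illustration
(plain counters stand in for the per-scheduler scores; job increments
are ignored since those jobs have already completed):

  #include <stdio.h>

  int main(void)
  {
          int score[2] = { 0, 0 };
          int i;

          /* 10 idle entities attach, spread evenly: each attach is +1. */
          for (i = 0; i < 10; i++)
                  score[i % 2]++;         /* -> { 5, 5 } */

          /* 5 apps exit; worst case all their entities sat on sched 1. */
          for (i = 0; i < 5; i++)
                  score[1]--;             /* -> { 5, 0 } */

          /* New entities now all pick sched 1, although both are idle. */
          printf("score[0]=%d score[1]=%d\n", score[0], score[1]);
          return 0;
  }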

Some amdgpu rings depended on this feature, but the previous commit
implemented the same thing in amdgpu directly so it can be safely
removed from drm/sched.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 5a550fd76bf0..e6d232a8ec58 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
 	if (!list_empty(&entity->list))
 		return;
 
-	atomic_inc(rq->sched->score);
 	list_add_tail(&entity->list, &rq->entities);
 }
 
@@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
 
 	spin_lock(&rq->lock);
 
-	atomic_dec(rq->sched->score);
 	list_del_init(&entity->list);
 
 	if (rq->current_entity == entity)
-- 
2.43.0



* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
  2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
@ 2025-08-25 13:13   ` Philipp Stanner
  2025-09-01 13:14     ` Pierre-Eric Pelloux-Prayer
  2025-09-01  9:20   ` Tvrtko Ursulin
  1 sibling, 1 reply; 7+ messages in thread
From: Philipp Stanner @ 2025-08-25 13:13 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Matthew Brost, Danilo Krummrich,
	Philipp Stanner, Christian König, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: dri-devel, linux-kernel

On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
> Currently, the scheduler score is incremented when a job is pushed to an
> entity and when an entity is attached to the scheduler.

It's indeed awkward that attaching is treated equivalently to job
submission.

Can you expand the documentation for drm_sched_init_args a bit so that
it gets clearer what the score is supposed to do?

> 
> This leads to some bad scheduling decision where the score value is
> largely made of idle entities.
> 
> For instance, a scenario with 2 schedulers and where 10 entities submit
> a single job, then do nothing, each scheduler will probably end up with
> a score of 5.
> Now, 5 userspace apps exit, so their entities will be dropped. 
> 

"entities will be dropped" == "drm_sched_entity_kill() gets called",
right?

> In
> the worst case, these apps' entities where all attached to the same

s/where/were

or better yet: "could be"

> scheduler and we end up with score=5 (the 5 remaining entities) and
> score=0, despite the 2 schedulers being idle.

Sounds indeed like a (small) problem to me.


> When new entities show up, they will all select the second scheduler
> based on its low score value, instead of alternating between the 2.
> 
> Some amdgpu rings depended on this feature, but the previous commit
> implemented the same thing in amdgpu directly so it can be safely
> removed from drm/sched.

Can we be that sure that other drivers don't depend on it, though? I
suspect it's likely that it's just amdgpu, but…



BTW, since you're cleaning up related stuff currently: I saw that it
seems that the only driver that sets &struct drm_sched_init_args.score
is amdgpu. Would be cool if you can take a look whether that's still
needed.


P.

> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 5a550fd76bf0..e6d232a8ec58 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>  	if (!list_empty(&entity->list))
>  		return;
>  
> -	atomic_inc(rq->sched->score);
>  	list_add_tail(&entity->list, &rq->entities);
>  }
>  
> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>  
>  	spin_lock(&rq->lock);
>  
> -	atomic_dec(rq->sched->score);
>  	list_del_init(&entity->list);
>  
>  	if (rq->current_entity == entity)



* Re: [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection
  2025-08-22 13:43 [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Pierre-Eric Pelloux-Prayer
  2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
@ 2025-09-01  9:02 ` Tvrtko Ursulin
  1 sibling, 0 replies; 7+ messages in thread
From: Tvrtko Ursulin @ 2025-09-01  9:02 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Alex Deucher, Christian König,
	David Airlie, Simona Vetter
  Cc: amd-gfx, dri-devel, linux-kernel


On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
> For hw engines that can't load balance jobs, entities are
> "statically" load balanced: on their first submit, they select
> the best scheduler based on its score.
> The score is made up of 2 parts:
> * the job queue depth (how many jobs are executing/waiting)
> * the number of entities assigned
> 
> The second part is only relevant for the static load balance:
> it's a way to consider how many entities are attached to this
> scheduler, knowing that if they ever submit jobs they will go
> to this one.
> 
> For rings that can load balance jobs freely, idle entities
> aren't a concern and shouldn't impact the scheduler's decisions.
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 23 ++++++++++++++++++-----
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h |  1 +
>   2 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index f5d5c45ddc0d..4a078d2d98c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -206,9 +206,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>   {
>   	struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
>   	struct amdgpu_device *adev = ctx->mgr->adev;
> +	bool static_load_balancing = false;
>   	struct amdgpu_ctx_entity *entity;
>   	enum drm_sched_priority drm_prio;
>   	unsigned int hw_prio, num_scheds;
> +	struct amdgpu_ring *aring;
>   	int32_t ctx_prio;
>   	int r;
>   
> @@ -236,17 +238,22 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>   		r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
>   						&num_scheds, &scheds);
>   		if (r)
> -			goto cleanup_entity;
> +			goto error_free_entity;

Indeed, previously this was calling drm_sched_entity_fini() before 
drm_sched_entity_init() and it only worked because of kzalloc.

>   	}
>   
>   	/* disable load balance if the hw engine retains context among dependent jobs */
> -	if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
> -	    hw_ip == AMDGPU_HW_IP_VCN_DEC ||
> -	    hw_ip == AMDGPU_HW_IP_UVD_ENC ||
> -	    hw_ip == AMDGPU_HW_IP_UVD) {
> +	static_load_balancing = hw_ip == AMDGPU_HW_IP_VCN_ENC ||
> +				hw_ip == AMDGPU_HW_IP_VCN_DEC ||
> +				hw_ip == AMDGPU_HW_IP_UVD_ENC ||
> +				hw_ip == AMDGPU_HW_IP_UVD;
> +
> +	if (static_load_balancing) {
>   		sched = drm_sched_pick_best(scheds, num_scheds);
>   		scheds = &sched;
>   		num_scheds = 1;
> +		aring = container_of(sched, struct amdgpu_ring, sched);
> +		entity->sched_ring_score = aring->sched_score;
> +		atomic_inc(entity->sched_ring_score);

If we were to bike-shed, we could find a way to avoid the new local 
variables: keep the if as is, assign to entity->sched_ring_score 
directly, and then check for that on the cleanup path. That still works 
thanks to kzalloc. Or, if relying on kzalloc is not desired, at least the 
bool static_load_balancing could be replaced by renaming the aring local 
to static_aring and using it like the name suggests.

Could also move the atomic_inc to the success path to avoid having to 
add code to the error unwind.

Both cases are I think equally racy in the sense that parallel 
amdgpu_ctx_init_entity invocations can all pick the same sched. But that 
is true today AFAICT because score is not incremented until later in the 
job submit process.

I suppose one way to make the assignment more robust would be to 
"rotate" (or randomize) the sched list atomically before calling 
drm_sched_pick_best. Thoughts?
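
Something along these lines, purely as a sketch (the helper name and the
rr counter are made up, and AMDGPU_MAX_RINGS is only used as a convenient
upper bound; no claim this is the right place for it):

  /* Hypothetical: rotate the candidate list by a per-device counter so
   * concurrent callers don't all break score ties in the same way. */
  static struct drm_gpu_scheduler *
  amdgpu_pick_best_rotated(atomic_t *rr, struct drm_gpu_scheduler **scheds,
                           unsigned int num_scheds)
  {
          struct drm_gpu_scheduler *rotated[AMDGPU_MAX_RINGS];
          unsigned int start, i;

          start = (unsigned int)atomic_inc_return(rr) % num_scheds;
          for (i = 0; i < num_scheds; i++)
                  rotated[i] = scheds[(start + i) % num_scheds];

          return drm_sched_pick_best(rotated, num_scheds);
  }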

Regards,

Tvrtko

>   	}
>   
>   	r = drm_sched_entity_init(&entity->entity, drm_prio, scheds, num_scheds,
> @@ -264,6 +271,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
>   	drm_sched_entity_fini(&entity->entity);
>   
>   error_free_entity:
> +	if (static_load_balancing)
> +		atomic_dec(entity->sched_ring_score);
> +
>   	kfree(entity);
>   
>   	return r;
> @@ -514,6 +524,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
>   			if (!ctx->entities[i][j])
>   				continue;
>   
> +			if (ctx->entities[i][j]->sched_ring_score)
> +				atomic_dec(ctx->entities[i][j]->sched_ring_score);
> +
>   			drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
>   		}
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> index 090dfe86f75b..076a0e165ce0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
>   	uint32_t		hw_ip;
>   	uint64_t		sequence;
>   	struct drm_sched_entity	entity;
> +	atomic_t		*sched_ring_score;
>   	struct dma_fence	*fences[];
>   };
>   



* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
  2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
  2025-08-25 13:13   ` Philipp Stanner
@ 2025-09-01  9:20   ` Tvrtko Ursulin
  1 sibling, 0 replies; 7+ messages in thread
From: Tvrtko Ursulin @ 2025-09-01  9:20 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Matthew Brost, Danilo Krummrich,
	Philipp Stanner, Christian König, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Tomeu Vizoso, Oded Gabbay
  Cc: dri-devel, linux-kernel



+ Tomeu and Oded

On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
> Currently, the scheduler score is incremented when a job is pushed to an
> entity and when an entity is attached to the scheduler.
> 
> This leads to some bad scheduling decision where the score value is
> largely made of idle entities.
> 
> For instance, a scenario with 2 schedulers and where 10 entities submit
> a single job, then do nothing, each scheduler will probably end up with
> a score of 5.
> Now, 5 userspace apps exit, so their entities will be dropped. In
> the worst case, these apps' entities where all attached to the same
> scheduler and we end up with score=5 (the 5 remaining entities) and
> score=0, despite the 2 schedulers being idle.
> When new entities show up, they will all select the second scheduler
> based on its low score value, instead of alternating between the 2.
> 
> Some amdgpu rings depended on this feature, but the previous commit
> implemented the same thing in amdgpu directly so it can be safely
> removed from drm/sched.
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 5a550fd76bf0..e6d232a8ec58 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>   	if (!list_empty(&entity->list))
>   		return;
>   
> -	atomic_inc(rq->sched->score);
>   	list_add_tail(&entity->list, &rq->entities);
>   }
>   
> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>   
>   	spin_lock(&rq->lock);
>   
> -	atomic_dec(rq->sched->score);
>   	list_del_init(&entity->list);
>   
>   	if (rq->current_entity == entity)

LGTM.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

The only detail is that, revisiting the scheduler users, it looks like 
the new rocket driver is the only one other than amdgpu which passes a 
list of more than one scheduler to drm_sched_entity_init(). I don't 
*think* it would be affected though: it would still pick the least 
loaded (based on active jobs) scheduler at job submit time, unless there 
is some hidden behaviour in that driver where it is important to 
consider the number of entities too. Anyway, it would be good for the 
rocket driver maintainers to double-check and ack.

Regards,

Tvrtko



* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
  2025-08-25 13:13   ` Philipp Stanner
@ 2025-09-01 13:14     ` Pierre-Eric Pelloux-Prayer
  2025-09-02  6:21       ` Philipp Stanner
  0 siblings, 1 reply; 7+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-09-01 13:14 UTC (permalink / raw)
  To: phasta, Pierre-Eric Pelloux-Prayer, Matthew Brost,
	Danilo Krummrich, Christian König, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
  Cc: dri-devel, linux-kernel



On 25/08/2025 at 15:13, Philipp Stanner wrote:
> On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
>> Currently, the scheduler score is incremented when a job is pushed to an
>> entity and when an entity is attached to the scheduler.
> 
> It's indeed awkward that attaching is treated equivalently to job
> submission.
> 
> Can you expand the documentation for drm_sched_init_args a bit so that
> it gets clearer what the score is supposed to do?


drm_sched_init_args.score is the feature allowing multiple schedulers to share a 
score so I suppose you meant drm_gpu_scheduler.score?

The doc currently says "score to help loadbalancer pick a idle sched" which is a 
bit vague. It could be modified to become:

   @score: holds the number of yet-to-be-completed jobs pushed to each scheduler.
           It's used when load balancing between different schedulers.

What do you think?

> 
>>
>> This leads to some bad scheduling decision where the score value is
>> largely made of idle entities.
>>
>> For instance, a scenario with 2 schedulers and where 10 entities submit
>> a single job, then do nothing, each scheduler will probably end up with
>> a score of 5.
>> Now, 5 userspace apps exit, so their entities will be dropped.
>>
> 
> "entities will be dropped" == "drm_sched_entity_kill() gets called",
> right?

Yes.

>> In
>> the worst case, these apps' entities where all attached to the same
> 
> s/where/were
> 
> or better yet: "could be"

Will fix, thanks.

> 
>> scheduler and we end up with score=5 (the 5 remaining entities) and
>> score=0, despite the 2 schedulers being idle.
> 
> Sounds indeed like a (small) problem to me.
> 
> 
>> When new entities show up, they will all select the second scheduler
>> based on its low score value, instead of alternating between the 2.
>>
>> Some amdgpu rings depended on this feature, but the previous commit
>> implemented the same thing in amdgpu directly so it can be safely
>> removed from drm/sched.
> 
> Can we be that sure that other drivers don't depend on it, though? I
> suspect it's likely that it's just amdgpu, but…
> 

Aside from the new "rocket" driver pointed out by Tvrtko, amdgpu is the 
only driver passing more than one scheduler to entities, so they're the 
only ones that could be affected.

I verified amdgpu and Tvrtko pinged the rocket maintainers in the other thread.

> 
> 
> BTW, since you're cleaning up related stuff currently: I saw that it
> seems that the only driver that sets &struct drm_sched_init_args.score
> is amdgpu. Would be cool if you can take a look whether that's still
> needed.

It cannot really be removed yet as it's useful when a single hardware block is 
exposed through different schedulers (so pushing jobs to one of the schedulers 
should increase the load of the underlying hw).
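
For reference, a minimal sketch of that usage (my_ring, my_sched_ops and
the values are placeholders; most init fields and error handling are
omitted): both schedulers are handed the same atomic_t, so pushing a job
to either of them raises the score drm_sched_pick_best() sees for both.

  /* Illustration only: two rings feeding the same hw block share one
   * score through drm_sched_init_args.score. */
  static atomic_t shared_hw_score = ATOMIC_INIT(0);

  static int init_paired_scheds(struct my_ring *ring0, struct my_ring *ring1,
                                struct device *dev)
  {
          struct drm_sched_init_args args = {
                  .ops = &my_sched_ops,           /* driver-specific */
                  .num_rqs = DRM_SCHED_PRIORITY_COUNT,
                  .credit_limit = 64,
                  .timeout = MAX_SCHEDULE_TIMEOUT,
                  .score = &shared_hw_score,      /* the shared part */
                  .name = "ring0",
                  .dev = dev,
          };
          int r;

          r = drm_sched_init(&ring0->sched, &args);
          if (r)
                  return r;

          args.name = "ring1";
          return drm_sched_init(&ring1->sched, &args);
  }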

Thanks,
Pierre-Eric

> 
> 
> P.
> 
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 5a550fd76bf0..e6d232a8ec58 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>>   	if (!list_empty(&entity->list))
>>   		return;
>>   
>> -	atomic_inc(rq->sched->score);
>>   	list_add_tail(&entity->list, &rq->entities);
>>   }
>>   
>> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>   
>>   	spin_lock(&rq->lock);
>>   
>> -	atomic_dec(rq->sched->score);
>>   	list_del_init(&entity->list);
>>   
>>   	if (rq->current_entity == entity)


* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
  2025-09-01 13:14     ` Pierre-Eric Pelloux-Prayer
@ 2025-09-02  6:21       ` Philipp Stanner
  0 siblings, 0 replies; 7+ messages in thread
From: Philipp Stanner @ 2025-09-02  6:21 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, phasta, Pierre-Eric Pelloux-Prayer,
	Matthew Brost, Danilo Krummrich, Christian König,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter
  Cc: dri-devel, linux-kernel

On Mon, 2025-09-01 at 15:14 +0200, Pierre-Eric Pelloux-Prayer wrote:
> 
> 
> On 25/08/2025 at 15:13, Philipp Stanner wrote:
> > On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
> > > Currently, the scheduler score is incremented when a job is pushed to an
> > > entity and when an entity is attached to the scheduler.
> > 
> > It's indeed awkward that attaching is treated equivalently to job
> > submission.
> > 
> > Can you expand the documentation for drm_sched_init_args a bit so that
> > it gets clearer what the score is supposed to do?
> 
> 
> drm_sched_init_args.score is the feature allowing multiple schedulers to share a 
> score so I suppose you meant drm_gpu_scheduler.score?

I mean both.

struct drm_sched_init_args has a @score which is passed by the driver
during init and will then be stored in drm_gpu_scheduler.score

The docu should be improved for both.

> 
> The doc currently says "score to help loadbalancer pick a idle sched" which is a 
> bit vague. It could be modified to become:
> 
>    @score: holds the number of yet-to-be-completed jobs pushed to each scheduler.
>            It's used when load balancing between different schedulers.

"to each scheduler" reads a bit as if each scheduler has an individual
score. Is it used that way, though? I think it's a pointer because all
schedulers share this atomic. Thus, it "holds the total number of yet-
to-be-completed jobs for all schedulers sharing this atomic", right?
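
Putting the two together, the kerneldoc for both could read something
like this (wording is only a suggestion):

  /**
   * @score: holds the total number of yet-to-be-completed jobs pushed to
   *         all schedulers sharing this atomic; used when load balancing
   *         between them.
   */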

> 
> What do you think?
> 
> > 
> > > 
> > > This leads to some bad scheduling decision where the score value is
> > > largely made of idle entities.
> > > 
> > > For instance, a scenario with 2 schedulers and where 10 entities submit
> > > a single job, then do nothing, each scheduler will probably end up with
> > > a score of 5.
> > > Now, 5 userspace apps exit, so their entities will be dropped.
> > > 
> > 
> > "entities will be dropped" == "drm_sched_entity_kill() gets called",
> > right?
> 
> Yes.

then nit: better say "their entities will be killed" instead of
dropped.

> 
> > > In
> > > the worst case, these apps' entities where all attached to the same
> > 
> > s/where/were
> > 
> > or better yet: "could be"
> 
> Will fix, thanks.
> 
> > 
> > > scheduler and we end up with score=5 (the 5 remaining entities) and
> > > score=0, despite the 2 schedulers being idle.
> > 
> > Sounds indeed like a (small) problem to me.
> > 
> > 
> > > When new entities show up, they will all select the second scheduler
> > > based on its low score value, instead of alternating between the 2.
> > > 
> > > Some amdgpu rings depended on this feature, but the previous commit
> > > implemented the same thing in amdgpu directly so it can be safely
> > > removed from drm/sched.
> > 
> > Can we be that sure that other drivers don't depend on it, though? I
> > suspect it's likely that it's just amdgpu, but…
> > 
> 
> Aside from the new "rocket" driver pointed out by Tvrtko, amdgpu is the 
> only driver passing more than one scheduler to entities, so they're the 
> only ones that could be affected.
> 
> I verified amdgpu and Tvrtko pinged the rocket maintainers in the other thread.

Very good! Then let's give those guys a few days to jump into the
discussion.

> 
> > 
> > 
> > BTW, since you're cleaning up related stuff currently: I saw that it
> > seems that the only driver that sets &struct drm_sched_init_args.score
> > is amdgpu. Would be cool if you can take a look whether that's still
> > needed.
> 
> It cannot really be removed yet as it's useful when a single hardware block is 
> exposed through different schedulers (so pushing jobs to one of the schedulers 
> should increase the load of the underlying hw).

OK.


Thx
P.

> 
> Thanks,
> Pierre-Eric
> 
> > 
> > 
> > P.
> > 
> > > 
> > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> > > ---
> > >   drivers/gpu/drm/scheduler/sched_main.c | 2 --
> > >   1 file changed, 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 5a550fd76bf0..e6d232a8ec58 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> > >   	if (!list_empty(&entity->list))
> > >   		return;
> > >   
> > > -	atomic_inc(rq->sched->score);
> > >   	list_add_tail(&entity->list, &rq->entities);
> > >   }
> > >   
> > > @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> > >   
> > >   	spin_lock(&rq->lock);
> > >   
> > > -	atomic_dec(rq->sched->score);
> > >   	list_del_init(&entity->list);
> > >   
> > >   	if (rq->current_entity == entity)



Thread overview: 7+ messages
2025-08-22 13:43 [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Pierre-Eric Pelloux-Prayer
2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
2025-08-25 13:13   ` Philipp Stanner
2025-09-01 13:14     ` Pierre-Eric Pelloux-Prayer
2025-09-02  6:21       ` Philipp Stanner
2025-09-01  9:20   ` Tvrtko Ursulin
2025-09-01  9:02 ` [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Tvrtko Ursulin
