* [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection
@ 2025-08-22 13:43 Pierre-Eric Pelloux-Prayer
2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
2025-09-01 9:02 ` [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Tvrtko Ursulin
0 siblings, 2 replies; 7+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-08-22 13:43 UTC (permalink / raw)
To: Alex Deucher, Christian König, David Airlie, Simona Vetter
Cc: Pierre-Eric Pelloux-Prayer, amd-gfx, dri-devel, linux-kernel
For hw engines that can't load balance jobs, entities are
"statically" load balanced: on their first submit, they select
the best scheduler based on its score.
The score is made up of 2 parts:
* the job queue depth (how many jobs are executing/waiting)
* the number of entities assigned
The second part is only relevant for the static load balance:
it's a way to consider how many entities are attached to this
scheduler, knowing that if they ever submit jobs they will go
to this one.
For rings that can load balance jobs freely, idle entities
aren't a concern and shouldn't impact the scheduler's decisions.
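For reference, this score-based pick boils down to choosing the candidate
with the lowest shared counter. A minimal sketch of that selection (not the
actual drm_sched_pick_best(), which additionally skips schedulers that
aren't ready):

static struct drm_gpu_scheduler *
pick_least_loaded(struct drm_gpu_scheduler **sched_list,
                  unsigned int num_sched_list)
{
        struct drm_gpu_scheduler *picked = NULL;
        unsigned int i, min_score = UINT_MAX;

        for (i = 0; i < num_sched_list; i++) {
                /* sched->score is a pointer so it can be shared. */
                unsigned int score = atomic_read(sched_list[i]->score);

                /* Strictly-less keeps the earliest candidate on ties. */
                if (score < min_score) {
                        min_score = score;
                        picked = sched_list[i];
                }
        }

        return picked;
}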
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 23 ++++++++++++++++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index f5d5c45ddc0d..4a078d2d98c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -206,9 +206,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
{
struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
struct amdgpu_device *adev = ctx->mgr->adev;
+ bool static_load_balancing = false;
struct amdgpu_ctx_entity *entity;
enum drm_sched_priority drm_prio;
unsigned int hw_prio, num_scheds;
+ struct amdgpu_ring *aring;
int32_t ctx_prio;
int r;
@@ -236,17 +238,22 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
&num_scheds, &scheds);
if (r)
- goto cleanup_entity;
+ goto error_free_entity;
}
/* disable load balance if the hw engine retains context among dependent jobs */
- if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
- hw_ip == AMDGPU_HW_IP_VCN_DEC ||
- hw_ip == AMDGPU_HW_IP_UVD_ENC ||
- hw_ip == AMDGPU_HW_IP_UVD) {
+ static_load_balancing = hw_ip == AMDGPU_HW_IP_VCN_ENC ||
+ hw_ip == AMDGPU_HW_IP_VCN_DEC ||
+ hw_ip == AMDGPU_HW_IP_UVD_ENC ||
+ hw_ip == AMDGPU_HW_IP_UVD;
+
+ if (static_load_balancing) {
sched = drm_sched_pick_best(scheds, num_scheds);
scheds = &sched;
num_scheds = 1;
+ aring = container_of(sched, struct amdgpu_ring, sched);
+ entity->sched_ring_score = aring->sched_score;
+ atomic_inc(entity->sched_ring_score);
}
r = drm_sched_entity_init(&entity->entity, drm_prio, scheds, num_scheds,
@@ -264,6 +271,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
drm_sched_entity_fini(&entity->entity);
error_free_entity:
+ if (static_load_balancing)
+ atomic_dec(entity->sched_ring_score);
+
kfree(entity);
return r;
@@ -514,6 +524,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
if (!ctx->entities[i][j])
continue;
+ if (ctx->entities[i][j]->sched_ring_score)
+ atomic_dec(ctx->entities[i][j]->sched_ring_score);
+
drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
}
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index 090dfe86f75b..076a0e165ce0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
uint32_t hw_ip;
uint64_t sequence;
struct drm_sched_entity entity;
+ atomic_t *sched_ring_score;
struct dma_fence *fences[];
};
--
2.43.0
* [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
2025-08-22 13:43 [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Pierre-Eric Pelloux-Prayer
@ 2025-08-22 13:43 ` Pierre-Eric Pelloux-Prayer
2025-08-25 13:13 ` Philipp Stanner
2025-09-01 9:20 ` Tvrtko Ursulin
2025-09-01 9:02 ` [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Tvrtko Ursulin
1 sibling, 2 replies; 7+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-08-22 13:43 UTC (permalink / raw)
To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
Christian König, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter
Cc: Pierre-Eric Pelloux-Prayer, dri-devel, linux-kernel
Currently, the scheduler score is incremented when a job is pushed to an
entity and when an entity is attached to the scheduler.
This leads to some bad scheduling decision where the score value is
largely made of idle entities.
For instance, a scenario with 2 schedulers and where 10 entities submit
a single job, then do nothing, each scheduler will probably end up with
a score of 5.
Now, 5 userspace apps exit, so their entities will be dropped. In
the worst case, these apps' entities where all attached to the same
scheduler and we end up with score=5 (the 5 remaining entities) and
score=0, despite the 2 schedulers being idle.
When new entities show up, they will all select the second scheduler
based on its low score value, instead of alternating between the 2.
Some amdgpu rings depended on this feature, but the previous commit
implemented the same thing in amdgpu directly so it can be safely
removed from drm/sched.
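The imbalance can be reproduced with a trivial user-space model of the
current behaviour (a toy, not kernel code: every job completes immediately,
so its +1/-1 cancels out and only the per-entity increment remains):

#include <stdio.h>

int main(void)
{
        int score[2] = { 0, 0 };
        int i;

        /* 10 entities attach, spread evenly over 2 schedulers. */
        for (i = 0; i < 10; i++)
                score[i % 2] += 1;

        /* 5 apps exit; worst case, all of their entities were on
         * scheduler 1, so only its score drops back. */
        score[1] -= 5;

        /* Prints score[0]=5 score[1]=0 although both are idle, so
         * every new entity will pick scheduler 1. */
        printf("score[0]=%d score[1]=%d\n", score[0], score[1]);
        return 0;
}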
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 5a550fd76bf0..e6d232a8ec58 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
if (!list_empty(&entity->list))
return;
- atomic_inc(rq->sched->score);
list_add_tail(&entity->list, &rq->entities);
}
@@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
spin_lock(&rq->lock);
- atomic_dec(rq->sched->score);
list_del_init(&entity->list);
if (rq->current_entity == entity)
--
2.43.0
* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
@ 2025-08-25 13:13 ` Philipp Stanner
2025-09-01 13:14 ` Pierre-Eric Pelloux-Prayer
2025-09-01 9:20 ` Tvrtko Ursulin
1 sibling, 1 reply; 7+ messages in thread
From: Philipp Stanner @ 2025-08-25 13:13 UTC (permalink / raw)
To: Pierre-Eric Pelloux-Prayer, Matthew Brost, Danilo Krummrich,
Philipp Stanner, Christian König, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: dri-devel, linux-kernel
On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
> Currently, the scheduler score is incremented when a job is pushed to an
> entity and when an entity is attached to the scheduler.
It's indeed awkward that attaching is treated equivalently to job
submission.
Can you expand the documentation for drm_sched_init_args a bit so that
it gets clearer what the score is supposed to do?
>
> This leads to some bad scheduling decision where the score value is
> largely made of idle entities.
>
> For instance, a scenario with 2 schedulers and where 10 entities submit
> a single job, then do nothing, each scheduler will probably end up with
> a score of 5.
> Now, 5 userspace apps exit, so their entities will be dropped.
>
"entities will be dropped" == "drm_sched_entity_kill() gets called",
right?
> In
> the worst case, these apps' entities where all attached to the same
s/where/were
or better yet: "could be"
> scheduler and we end up with score=5 (the 5 remaining entities) and
> score=0, despite the 2 schedulers being idle.
Sounds indeed like a (small) problem to me.
> When new entities show up, they will all select the second scheduler
> based on its low score value, instead of alternating between the 2.
>
> Some amdgpu rings depended on this feature, but the previous commit
> implemented the same thing in amdgpu directly so it can be safely
> removed from drm/sched.
Can we be that sure that other drivers don't depend on it, though? I
suspect it's likely that it's just amdgpu, but…
BTW, since you're cleaning up related stuff currently: it seems that
the only driver that sets &struct drm_sched_init_args.score is amdgpu.
Would be cool if you could take a look at whether that's still needed.
P.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 5a550fd76bf0..e6d232a8ec58 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> if (!list_empty(&entity->list))
> return;
>
> - atomic_inc(rq->sched->score);
> list_add_tail(&entity->list, &rq->entities);
> }
>
> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>
> spin_lock(&rq->lock);
>
> - atomic_dec(rq->sched->score);
> list_del_init(&entity->list);
>
> if (rq->current_entity == entity)
* Re: [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection
2025-08-22 13:43 [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Pierre-Eric Pelloux-Prayer
2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
@ 2025-09-01 9:02 ` Tvrtko Ursulin
1 sibling, 0 replies; 7+ messages in thread
From: Tvrtko Ursulin @ 2025-09-01 9:02 UTC (permalink / raw)
To: Pierre-Eric Pelloux-Prayer, Alex Deucher, Christian König,
David Airlie, Simona Vetter
Cc: amd-gfx, dri-devel, linux-kernel
On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
> For hw engines that can't load balance jobs, entities are
> "statically" load balanced: on their first submit, they select
> the best scheduler based on its score.
> The score is made up of 2 parts:
> * the job queue depth (how many jobs are executing/waiting)
> * the number of entities assigned
>
> The second part is only relevant for the static load balance:
> it's a way to consider how many entities are attached to this
> scheduler, knowing that if they ever submit jobs they will go
> to this one.
>
> For rings that can load balance jobs freely, idle entities
> aren't a concern and shouldn't impact the scheduler's decisions.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 23 ++++++++++++++++++-----
> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 +
> 2 files changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index f5d5c45ddc0d..4a078d2d98c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -206,9 +206,11 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
> {
> struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
> struct amdgpu_device *adev = ctx->mgr->adev;
> + bool static_load_balancing = false;
> struct amdgpu_ctx_entity *entity;
> enum drm_sched_priority drm_prio;
> unsigned int hw_prio, num_scheds;
> + struct amdgpu_ring *aring;
> int32_t ctx_prio;
> int r;
>
> @@ -236,17 +238,22 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
> r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
> &num_scheds, &scheds);
> if (r)
> - goto cleanup_entity;
> + goto error_free_entity;
Indeed, previously this was calling drm_sched_entity_fini() before
drm_sched_entity_init() and it only worked because of kzalloc.
> }
>
> /* disable load balance if the hw engine retains context among dependent jobs */
> - if (hw_ip == AMDGPU_HW_IP_VCN_ENC ||
> - hw_ip == AMDGPU_HW_IP_VCN_DEC ||
> - hw_ip == AMDGPU_HW_IP_UVD_ENC ||
> - hw_ip == AMDGPU_HW_IP_UVD) {
> + static_load_balancing = hw_ip == AMDGPU_HW_IP_VCN_ENC ||
> + hw_ip == AMDGPU_HW_IP_VCN_DEC ||
> + hw_ip == AMDGPU_HW_IP_UVD_ENC ||
> + hw_ip == AMDGPU_HW_IP_UVD;
> +
> + if (static_load_balancing) {
> sched = drm_sched_pick_best(scheds, num_scheds);
> scheds = &sched;
> num_scheds = 1;
> + aring = container_of(sched, struct amdgpu_ring, sched);
> + entity->sched_ring_score = aring->sched_score;
> + atomic_inc(entity->sched_ring_score);
If we were to bike-shed, we could find a way to avoid the new local
variables: keep the if as is, assign to entity->sched_ring_score
directly, and then check for that on the cleanup path. That still works
thanks to kzalloc. Or, if relying on kzalloc is not desired, at least the
bool static_load_balancing could be replaced by renaming the aring local
to static_aring and using it like the name suggests.
Could also move the atomic_inc to the success path to avoid having to
add code to the error unwind.
Both cases are I think equally racy in the sense that parallel
amdgpu_ctx_init_entity invocations can all pick the same sched. But that
is true today AFAICT because score is not incremented until later in the
job submit process.
I suppose one way to make the assignment more robust would be to
"rotate" (or randomize) the sched list atomically before calling
drm_sched_pick_best. Thoughts?
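A rough sketch of what that rotation could look like, assuming ties in
drm_sched_pick_best() favour earlier list entries (the helper, the rotation
counter and the fixed array bound are made up for illustration):

static struct drm_gpu_scheduler *
pick_best_rotated(struct drm_gpu_scheduler **scheds,
                  unsigned int num_scheds, atomic_t *rotation)
{
        struct drm_gpu_scheduler *rotated[16]; /* illustrative bound */
        unsigned int start, i;

        if (WARN_ON(num_scheds > ARRAY_SIZE(rotated)))
                return drm_sched_pick_best(scheds, num_scheds);

        /* Each caller starts the walk at a different offset, so
         * concurrent picks don't all favour the same scheduler when
         * scores are equal. */
        start = atomic_inc_return(rotation) % num_scheds;
        for (i = 0; i < num_scheds; i++)
                rotated[i] = scheds[(start + i) % num_scheds];

        return drm_sched_pick_best(rotated, num_scheds);
}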
Regards,
Tvrtko
> }
>
> r = drm_sched_entity_init(&entity->entity, drm_prio, scheds, num_scheds,
> @@ -264,6 +271,9 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
> drm_sched_entity_fini(&entity->entity);
>
> error_free_entity:
> + if (static_load_balancing)
> + atomic_dec(entity->sched_ring_score);
> +
> kfree(entity);
>
> return r;
> @@ -514,6 +524,9 @@ static void amdgpu_ctx_do_release(struct kref *ref)
> if (!ctx->entities[i][j])
> continue;
>
> + if (ctx->entities[i][j]->sched_ring_score)
> + atomic_dec(ctx->entities[i][j]->sched_ring_score);
> +
> drm_sched_entity_destroy(&ctx->entities[i][j]->entity);
> }
> }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> index 090dfe86f75b..076a0e165ce0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
> @@ -39,6 +39,7 @@ struct amdgpu_ctx_entity {
> uint32_t hw_ip;
> uint64_t sequence;
> struct drm_sched_entity entity;
> + atomic_t *sched_ring_score;
> struct dma_fence *fences[];
> };
>
* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
2025-08-25 13:13 ` Philipp Stanner
@ 2025-09-01 9:20 ` Tvrtko Ursulin
1 sibling, 0 replies; 7+ messages in thread
From: Tvrtko Ursulin @ 2025-09-01 9:20 UTC (permalink / raw)
To: Pierre-Eric Pelloux-Prayer, Matthew Brost, Danilo Krummrich,
Philipp Stanner, Christian König, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Tomeu Vizoso, Oded Gabbay
Cc: dri-devel, linux-kernel
+ Tomeu and Oded
On 22/08/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
> Currently, the scheduler score is incremented when a job is pushed to an
> entity and when an entity is attached to the scheduler.
>
> This leads to some bad scheduling decision where the score value is
> largely made of idle entities.
>
> For instance, a scenario with 2 schedulers and where 10 entities submit
> a single job, then do nothing, each scheduler will probably end up with
> a score of 5.
> Now, 5 userspace apps exit, so their entities will be dropped. In
> the worst case, these apps' entities where all attached to the same
> scheduler and we end up with score=5 (the 5 remaining entities) and
> score=0, despite the 2 schedulers being idle.
> When new entities show up, they will all select the second scheduler
> based on its low score value, instead of alternating between the 2.
>
> Some amdgpu rings depended on this feature, but the previous commit
> implemented the same thing in amdgpu directly so it can be safely
> removed from drm/sched.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 5a550fd76bf0..e6d232a8ec58 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> if (!list_empty(&entity->list))
> return;
>
> - atomic_inc(rq->sched->score);
> list_add_tail(&entity->list, &rq->entities);
> }
>
> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>
> spin_lock(&rq->lock);
>
> - atomic_dec(rq->sched->score);
> list_del_init(&entity->list);
>
> if (rq->current_entity == entity)
LGTM.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Only detail is, I revisited the scheduler users and it looks like
the new rocket driver is the only one other than amdgpu which passes a
list of more than one scheduler to drm_sched_entity_init(). I don't
*think* it would be affected, though: it would still pick the least
loaded (based on active jobs) scheduler at job submit time, unless there
is some hidden behaviour in that driver where the number of entities is
also important to consider. Anyway, it would be good for the rocket
maintainers to double-check and ack.
Regards,
Tvrtko
* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
2025-08-25 13:13 ` Philipp Stanner
@ 2025-09-01 13:14 ` Pierre-Eric Pelloux-Prayer
2025-09-02 6:21 ` Philipp Stanner
0 siblings, 1 reply; 7+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-09-01 13:14 UTC (permalink / raw)
To: phasta, Pierre-Eric Pelloux-Prayer, Matthew Brost,
Danilo Krummrich, Christian König, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
Cc: dri-devel, linux-kernel
Le 25/08/2025 à 15:13, Philipp Stanner a écrit :
> On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
>> Currently, the scheduler score is incremented when a job is pushed to an
>> entity and when an entity is attached to the scheduler.
>
> It's indeed awkward why attaching is treated equivalently to job
> submission.
>
> Can you expand the documentation for drm_sched_init_args a bit so that
> it gets clearer what the score is supposed to do?
drm_sched_init_args.score is the feature allowing multiple schedulers to share a
score so I suppose you meant drm_gpu_scheduler.score?
The doc currently says "score to help loadbalancer pick a idle sched" which is a
bit vague. It could be modified to become:
@score: holds the number of yet-to-be-completed jobs pushed to each scheduler.
It's used when load balancing between different schedulers.
What do you think?
>
>>
>> This leads to some bad scheduling decision where the score value is
>> largely made of idle entities.
>>
>> For instance, a scenario with 2 schedulers and where 10 entities submit
>> a single job, then do nothing, each scheduler will probably end up with
>> a score of 5.
>> Now, 5 userspace apps exit, so their entities will be dropped.
>>
>
> "entities will be dropped" == "drm_sched_entity_kill() gets called",
> right?
Yes.
>> In
>> the worst case, these apps' entities where all attached to the same
>
> s/where/were
>
> or better yet: "could be"
Will fix, thanks.
>
>> scheduler and we end up with score=5 (the 5 remaining entities) and
>> score=0, despite the 2 schedulers being idle.
>
> Sounds indeed like a (small) problem to me.
>
>
>> When new entities show up, they will all select the second scheduler
>> based on its low score value, instead of alternating between the 2.
>>
>> Some amdgpu rings depended on this feature, but the previous commit
>> implemented the same thing in amdgpu directly so it can be safely
>> removed from drm/sched.
>
> Can we be that sure that other drivers don't depend on it, though? I
> suspect it's likely that it's just amdgpu, but…
>
Aside from the new "rocket" driver, as pointed out by Tvrtko, amdgpu is the only
driver passing more than one scheduler to entities, so they're the only ones that
could be affected.
I verified amdgpu, and Tvrtko pinged the rocket maintainers in the other thread.
>
>
> BTW, since you're cleaning up related stuff currently: I saw that it
> seems that the only driver that sets &struct drm_sched_init_args.score
> is amdgpu. Would be cool if you can take a look whether that's still
> needed.
It cannot really be removed yet as it's useful when a single hardware block is
exposed through different schedulers (so pushing jobs to one of the schedulers
should increase the load of the underlying hw).
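As a concrete illustration of that shared-score setup, a minimal sketch
(only .score, which this thread discusses, is the point; the other
drm_sched_init_args members shown are placeholders from memory and the
remaining ones are omitted):

static atomic_t hw_block_score = ATOMIC_INIT(0);

static int init_paired_scheds(struct drm_gpu_scheduler *a,
                              struct drm_gpu_scheduler *b,
                              const struct drm_sched_backend_ops *ops,
                              struct device *dev)
{
        struct drm_sched_init_args args = {
                .ops = ops,
                .num_rqs = DRM_SCHED_PRIORITY_COUNT,
                .credit_limit = 64,
                .timeout = MAX_SCHEDULE_TIMEOUT,
                .score = &hw_block_score,  /* shared by both scheds */
                .name = "hwblock-a",
                .dev = dev,
        };
        int r;

        r = drm_sched_init(a, &args);
        if (r)
                return r;

        /* Second scheduler for the same hw block shares the atomic. */
        args.name = "hwblock-b";
        return drm_sched_init(b, &args);
}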
Thanks,
Pierre-Eric
>
>
> P.
>
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
>> ---
>> drivers/gpu/drm/scheduler/sched_main.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 5a550fd76bf0..e6d232a8ec58 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>> if (!list_empty(&entity->list))
>> return;
>>
>> - atomic_inc(rq->sched->score);
>> list_add_tail(&entity->list, &rq->entities);
>> }
>>
>> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>
>> spin_lock(&rq->lock);
>>
>> - atomic_dec(rq->sched->score);
>> list_del_init(&entity->list);
>>
>> if (rq->current_entity == entity)
* Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
2025-09-01 13:14 ` Pierre-Eric Pelloux-Prayer
@ 2025-09-02 6:21 ` Philipp Stanner
0 siblings, 0 replies; 7+ messages in thread
From: Philipp Stanner @ 2025-09-02 6:21 UTC (permalink / raw)
To: Pierre-Eric Pelloux-Prayer, phasta, Pierre-Eric Pelloux-Prayer,
Matthew Brost, Danilo Krummrich, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter
Cc: dri-devel, linux-kernel
On Mon, 2025-09-01 at 15:14 +0200, Pierre-Eric Pelloux-Prayer wrote:
>
>
> Le 25/08/2025 à 15:13, Philipp Stanner a écrit :
> > On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
> > > Currently, the scheduler score is incremented when a job is pushed to an
> > > entity and when an entity is attached to the scheduler.
> >
> > It's indeed awkward why attaching is treated equivalently to job
> > submission.
> >
> > Can you expand the documentation for drm_sched_init_args a bit so that
> > it gets clearer what the score is supposed to do?
>
>
> drm_sched_init_args.score is the feature allowing multiple schedulers to share a
> score so I suppose you meant drm_gpu_scheduler.score?
I mean both.
struct drm_sched_init_args has a @score which is passed by the driver
during init and will then be stored in drm_gpu_scheduler.score
The docu should be improved for both.
>
> The doc currently says "score to help loadbalancer pick a idle sched" which is a
> bit vague. It could be modified to become:
>
> @score: holds the number of yet-to-be-completed jobs pushed to each scheduler.
> It's used when load balancing between different schedulers.
"to each scheduler" reads a bit as if each scheduler has an individual
score. Is it used that way, though? I think it's a pointer because all
schedulers share this atomic. Thus, it "holds the total number of yet-
to-be-completed jobs for all schedulers sharing this atomic", right?
>
> What do you think?
>
> >
> > >
> > > This leads to some bad scheduling decision where the score value is
> > > largely made of idle entities.
> > >
> > > For instance, a scenario with 2 schedulers and where 10 entities submit
> > > a single job, then do nothing, each scheduler will probably end up with
> > > a score of 5.
> > > Now, 5 userspace apps exit, so their entities will be dropped.
> > >
> >
> > "entities will be dropped" == "drm_sched_entity_kill() gets called",
> > right?
>
> Yes.
then nit: better say "their entities will be killed" instead of
dropped.
>
> > > In
> > > the worst case, these apps' entities where all attached to the same
> >
> > s/where/were
> >
> > or better yet: "could be"
>
> Will fix, thanks.
>
> >
> > > scheduler and we end up with score=5 (the 5 remaining entities) and
> > > score=0, despite the 2 schedulers being idle.
> >
> > Sounds indeed like a (small) problem to me.
> >
> >
> > > When new entities show up, they will all select the second scheduler
> > > based on its low score value, instead of alternating between the 2.
> > >
> > > Some amdgpu rings depended on this feature, but the previous commit
> > > implemented the same thing in amdgpu directly so it can be safely
> > > removed from drm/sched.
> >
> > Can we be that sure that other drivers don't depend on it, though? I
> > suspect it's likely that it's just amdgpu, but…
> >
>
> Aside from the new "rocket" as pointed out by Tvrtko, amdgpu is the only driver
> passing more than one schedulers to entities so they're the only ones that could
> be affected.
>
> I verified amdgpu and Tvrtko pinged the rocket maintainers in the other thread.
Very good! Then let's give those guys a few days to jump into the
discussion.
>
> >
> >
> > BTW, since you're cleaning up related stuff currently: I saw that it
> > seems that the only driver that sets &struct drm_sched_init_args.score
> > is amdgpu. Would be cool if you can take a look whether that's still
> > needed.
>
> It cannot really be removed yet as it's useful when a single hardware block is
> exposed through different schedulers (so pushing jobs to one of the schedulers
> should increase the load of the underlying hw).
OK.
Thx
P.
>
> Thanks,
> Pierre-Eric
>
> >
> >
> > P.
> >
> > >
> > > Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> > > ---
> > > drivers/gpu/drm/scheduler/sched_main.c | 2 --
> > > 1 file changed, 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 5a550fd76bf0..e6d232a8ec58 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> > > if (!list_empty(&entity->list))
> > > return;
> > >
> > > - atomic_inc(rq->sched->score);
> > > list_add_tail(&entity->list, &rq->entities);
> > > }
> > >
> > > @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> > >
> > > spin_lock(&rq->lock);
> > >
> > > - atomic_dec(rq->sched->score);
> > > list_del_init(&entity->list);
> > >
> > > if (rq->current_entity == entity)