dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Pierre-Eric Pelloux-Prayer <pierre-eric@damsy.net>
To: phasta@kernel.org,
	"Pierre-Eric Pelloux-Prayer" <pierre-eric.pelloux-prayer@amd.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] drm/sched: limit sched score update to jobs change
Date: Mon, 1 Sep 2025 15:14:05 +0200	[thread overview]
Message-ID: <b1a8f459-93dd-4b6c-b29e-c68fba19f6fc@damsy.net> (raw)
In-Reply-To: <9b68898ca34483b52d7f4510747a20bce52751c7.camel@mailbox.org>



Le 25/08/2025 à 15:13, Philipp Stanner a écrit :
> On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
>> Currently, the scheduler score is incremented when a job is pushed to an
>> entity and when an entity is attached to the scheduler.
> 
> It's indeed awkward why attaching is treated equivalently to job
> submission.
> 
> Can you expand the documentation for drm_sched_init_args a bit so that
> it gets clearer what the score is supposed to do?


drm_sched_init_args.score is the feature allowing multiple schedulers to share a 
score so I suppose you meant drm_gpu_scheduler.score?

The doc currently says "score to help loadbalancer pick a idle sched" which is a 
bit vague. It could be modified to become:

   @score: holds the number of yet-to-be-completed jobs pushed to each scheduler.
           It's used when load balancing between different schedulers.

What do you think?

> 
>>
>> This leads to some bad scheduling decision where the score value is
>> largely made of idle entities.
>>
>> For instance, a scenario with 2 schedulers and where 10 entities submit
>> a single job, then do nothing, each scheduler will probably end up with
>> a score of 5.
>> Now, 5 userspace apps exit, so their entities will be dropped.
>>
> 
> "entities will be dropped" == "drm_sched_entity_kill() gets called",
> right?

Yes.

>> In
>> the worst case, these apps' entities where all attached to the same
> 
> s/where/were
> 
> or better yet: "could be"

Will fix, thanks.

> 
>> scheduler and we end up with score=5 (the 5 remaining entities) and
>> score=0, despite the 2 schedulers being idle.
> 
> Sounds indeed like a (small) problem to me.
> 
> 
>> When new entities show up, they will all select the second scheduler
>> based on its low score value, instead of alternating between the 2.
>>
>> Some amdgpu rings depended on this feature, but the previous commit
>> implemented the same thing in amdgpu directly so it can be safely
>> removed from drm/sched.
> 
> Can we be that sure that other drivers don't depend on it, though? I
> suspect it's likely that it's just amdgpu, but…
> 

Aside from the new "rocket" as pointed out by Tvrtko, amdgpu is the only driver 
passing more than one schedulers to entities so they're the only ones that could 
be affected.

I verified amdgpu and Tvrtko pinged the rocket maintainers in the other thread.

> 
> 
> BTW, since you're cleaning up related stuff currently: I saw that it
> seems that the only driver that sets &struct drm_sched_init_args.score
> is amdgpu. Would be cool if you can take a look whether that's still
> needed.

It cannot really be removed yet as it's useful when a single hardware block is 
exposed through different schedulers (so pushing jobs to one of the schedulers 
should increase the load of the underlying hw).

Thanks,
Pierre-Eric

> 
> 
> P.
> 
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 5a550fd76bf0..e6d232a8ec58 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -206,7 +206,6 @@ void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>>   	if (!list_empty(&entity->list))
>>   		return;
>>   
>> -	atomic_inc(rq->sched->score);
>>   	list_add_tail(&entity->list, &rq->entities);
>>   }
>>   
>> @@ -228,7 +227,6 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>   
>>   	spin_lock(&rq->lock);
>>   
>> -	atomic_dec(rq->sched->score);
>>   	list_del_init(&entity->list);
>>   
>>   	if (rq->current_entity == entity)

  reply	other threads:[~2025-09-01 13:14 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-22 13:43 [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Pierre-Eric Pelloux-Prayer
2025-08-22 13:43 ` [PATCH v1 2/2] drm/sched: limit sched score update to jobs change Pierre-Eric Pelloux-Prayer
2025-08-25 13:13   ` Philipp Stanner
2025-09-01 13:14     ` Pierre-Eric Pelloux-Prayer [this message]
2025-09-02  6:21       ` Philipp Stanner
2025-09-01  9:20   ` Tvrtko Ursulin
2025-09-01  9:02 ` [PATCH v1 1/2] drm/amdgpu: increment share sched score on entity selection Tvrtko Ursulin
2025-09-03  8:48   ` Pierre-Eric Pelloux-Prayer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b1a8f459-93dd-4b6c-b29e-c68fba19f6fc@damsy.net \
    --to=pierre-eric@damsy.net \
    --cc=airlied@gmail.com \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=matthew.brost@intel.com \
    --cc=mripard@kernel.org \
    --cc=phasta@kernel.org \
    --cc=pierre-eric.pelloux-prayer@amd.com \
    --cc=simona@ffwll.ch \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).