Re: [PATCH] drm/sched: Simplify idle entity check

Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed

From: "Christian König" <christian.koenig@amd.com>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
	phasta@kernel.org, dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com, intel-xe@lists.freedesktop.org,
	Danilo Krummrich <dakr@kernel.org>,
	Matthew Brost <matthew.brost@intel.com>
Subject: Re: [PATCH] drm/sched: Simplify idle entity check
Date: Fri, 9 Jan 2026 15:31:46 +0100	[thread overview]
Message-ID: <4d6ed6a7-89d6-4fa5-b754-2ecdbd42344c@amd.com> (raw)
In-Reply-To: <607847da-4f8a-4c19-9ebe-c07f79ce1362@igalia.com>

On 1/9/26 15:06, Tvrtko Ursulin wrote:
> \
> On 07/01/2026 14:11, Philipp Stanner wrote:
>> Happy 2026, Tvrtko!
>>
>>
>> On Wed, 2026-01-07 at 13:44 +0000, Tvrtko Ursulin wrote:
>>> drm_sched_entity_is_idle() is called when flushing the entity before
>>> destroying it and currently decides on the idle status based either on
>>> number of jobs in its queue being zero, or whether the entity is not part
>>> of any run-queue.
>>>
>>> If entity is not part of a run-queue it is implied it can have no jobs in
>>> its queue, from which it follows it is redundant to look at the both and
>>> we can simplify to only look at the queue.
>>>
>>> The list_empty() check was added in
>>> a875f58e237a ("drm/scheduler: stop setting rq to NULL")
>>> where it replaced the entity->rq check which was added in
>>> 741f01e636b7 ("drm/scheduler: Avoid using wait_event_killable for dying process (V4)").
>>>
>>> Since for the submit race involving exiting entities, explicit handling
>>> via entity->stopped was added soon after in
>>> 62347a33001c ("drm/scheduler: Add stopped flag to drm_sched_entity")
>>> it indeed looks we are safe to remove the list_empty() check.
>>>
>>> This mean we can remove the memory barrier as well and fewer memory
>>> barriers the better.
>>
>> I am not convinced that this change is worth it.
>>
>> Not to long ago we discussed that the spsc_queue should be removed and
>> replaced by some sort of list, with proper locks. Christian has agreed
>> that this should fly.
>>
>> The spsc queue has only 1 user in the kernel and it's super hard to
>> understand how it's supposed to work and when any why barriers and
>> READ_ONCE's are necessary (that's, of course, also not documented in
>> the queue).
>>
>> Until that is done I don't really want to touch any of those memory
>> barriers..
> 
> (I had a branch with spsc gone more than a year ago but I abandoned it for now since it contained some other too ambitious changes. So no complaints from me. Who is doing it BTW?)
> 
> Back to the point - this patch can wait, no problem. To explain the context though.
> 
> I wanted to get rid of looking at the list_empty here because I have a branch which improves the flow for the 1:1 sched:entity drivers.
> 
> Why are the two related? If you remember in the fair scheduler series all the run-queue stuff is nicely grouped in sched_rq.c and encapsulated in the rq API, which made it possible to follow up with virtualizing the rq operations.
> 
> The yet another relevant thing is the patch I sent this week which removes the special case where entity can be initialized with no schedulers.

If my memory not completely fails me the patch to reject initializing entities with no scheduler is actually a pre requisite of this patch here.

The list_empty() is most likely only there because we had entities initialized without a scheduler.

> 
> If we combined all these three pre-requisites, my branch allows the fully invariant sched:entity and 1:1:1 sched:rq:entity. Run-queue vfuncs for the 1:1 drivers become mostly no-ops (remove and pop entity are not needed at all, while add just checks for entity->stopped). So all the list and tree management needed by M:N drivers simply does not happen.

That sounds sane to me, but I would do one step at a time.

Regards,
Christian.

> 
> In that branch 1:1 entities do not take part in the rq->entities_list so, going back to this patch, the list_empty check will be in the way.
> 
> Anyway, just for context, I will park this cleanup for now but you can mull it over whether the bigger picture sounds interesting to you.
> 
> Regards,
> 
> Tvrtko
> 
>>>
>>> While at it, we add READ_ONCE annotation on the entity->stopped check to
>>> mark the unlocked read.
>>
>> This would effectively legalize the lockless access. But it is illegal
>> and undefined behavior. See drm_sched_fini().
>>
>> P.
>>
>>
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Danilo Krummrich <dakr@kernel.org>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Cc: Philipp Stanner <phasta@kernel.org>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c | 10 ++--------
>>>   1 file changed, 2 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index bb7e5fc47f99..2ed84504cfd6 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -137,14 +137,8 @@ EXPORT_SYMBOL(drm_sched_entity_modify_sched);
>>>     static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>>>   {
>>> -    rmb(); /* for list_empty to work without lock */
>>> -
>>> -    if (list_empty(&entity->list) ||
>>> -        spsc_queue_count(&entity->job_queue) == 0 ||
>>> -        entity->stopped)
>>> -        return true;
>>> -
>>> -    return false;
>>> +    return spsc_queue_count(&entity->job_queue) == 0 ||
>>> +           READ_ONCE(entity->stopped);
>>>   }
>>>     /**
>>
>

next prev parent reply	other threads:[~2026-01-09 14:31 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-07 13:44 [PATCH] drm/sched: Simplify idle entity check Tvrtko Ursulin
2026-01-07 13:50 ` ✗ CI.checkpatch: warning for " Patchwork
2026-01-07 13:51 ` ✓ CI.KUnit: success " Patchwork
2026-01-07 14:08 ` [PATCH] " Danilo Krummrich
2026-01-09 14:08   ` Danilo Krummrich
2026-01-07 14:11 ` Philipp Stanner
2026-01-09 14:06   ` Tvrtko Ursulin
2026-01-09 14:31     ` Christian König [this message]
2026-01-15 13:30       ` Tvrtko Ursulin
2026-01-09 14:48     ` Philipp Stanner
2026-01-09 15:11       ` Christian König
2026-01-12 10:35       ` Tvrtko Ursulin
2026-01-07 14:29 ` ✓ Xe.CI.BAT: success for " Patchwork
2026-01-07 16:32 ` ✗ Xe.CI.Full: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4d6ed6a7-89d6-4fa5-b754-2ecdbd42344c@amd.com \
    --to=christian.koenig@amd.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=kernel-dev@igalia.com \
    --cc=matthew.brost@intel.com \
    --cc=phasta@kernel.org \
    --cc=tvrtko.ursulin@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox