From: Philipp Stanner <phasta@mailbox.org>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com,
"Christian König" <christian.koenig@amd.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"Matthew Brost" <matthew.brost@intel.com>,
"Philipp Stanner" <phasta@kernel.org>
Subject: Re: [PATCH 13/28] drm/sched: Remove FIFO and RR and simplify to a single run queue
Date: Tue, 14 Oct 2025 13:16:17 +0200
Message-ID: <44dfae80b8e504d6908cae79fab707f02b974834.camel@mailbox.org>
In-Reply-To: <20251008085359.52404-14-tvrtko.ursulin@igalia.com>
On Wed, 2025-10-08 at 09:53 +0100, Tvrtko Ursulin wrote:
> Since the new fair policy is at least as good as FIFO and we can afford to
s/fair/FAIR
> remove round-robin,
>
Better to use as the justification that RR has not been the default
for a very long time.
> we can simplify the scheduler code by making the
> scheduler to run queue relationship always 1:1 and remove some code.
>
> Also, now that the FIFO policy is gone the tree of entities is not a FIFO
> tree any more so rename it to just the tree.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <phasta@kernel.org>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 23 ++-
> drivers/gpu/drm/scheduler/sched_entity.c | 29 +---
> drivers/gpu/drm/scheduler/sched_internal.h | 12 +-
> drivers/gpu/drm/scheduler/sched_main.c | 161 ++++++---------------
> drivers/gpu/drm/scheduler/sched_rq.c | 67 +++------
> include/drm/gpu_scheduler.h | 36 +----
> 6 files changed, 82 insertions(+), 246 deletions(-)
Now that's nice!
Just a few more comments below; I have a bit of a tight schedule this
week.
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index d020a890a0ea..bc07fd57310c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -434,25 +434,22 @@ drm_sched_entity_queue_pop(struct drm_sched_entity *entity)
>
> void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
> {
> + struct drm_sched_rq *rq = sched->rq;
> + struct drm_sched_entity *s_entity;
> struct drm_sched_job *s_job;
> - struct drm_sched_entity *s_entity = NULL;
> - int i;
>
> /* Signal all jobs not yet scheduled */
> - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
> - struct drm_sched_rq *rq = sched->sched_rq[i];
> - spin_lock(&rq->lock);
> - list_for_each_entry(s_entity, &rq->entities, list) {
> - while ((s_job = drm_sched_entity_queue_pop(s_entity))) {
> - struct drm_sched_fence *s_fence = s_job->s_fence;
> + spin_lock(&rq->lock);
> + list_for_each_entry(s_entity, &rq->entities, list) {
> + while ((s_job = drm_sched_entity_queue_pop(s_entity))) {
> + struct drm_sched_fence *s_fence = s_job->s_fence;
>
> - dma_fence_signal(&s_fence->scheduled);
> - dma_fence_set_error(&s_fence->finished, -EHWPOISON);
> - dma_fence_signal(&s_fence->finished);
> - }
> + dma_fence_signal(&s_fence->scheduled);
> + dma_fence_set_error(&s_fence->finished, -EHWPOISON);
By the way, do we know why the error is set to -EHWPOISON here at all?
> + dma_fence_signal(&s_fence->finished);
> }
> - spin_unlock(&rq->lock);
> }
> + spin_unlock(&rq->lock);
>
> /* Signal all jobs already scheduled to HW */
> list_for_each_entry(s_job, &sched->pending_list, list) {
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 1715e1caec40..2b03ca7c835a 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -109,8 +109,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> entity->guilty = guilty;
> entity->num_sched_list = num_sched_list;
> entity->priority = priority;
> - entity->rq_priority = drm_sched_policy == DRM_SCHED_POLICY_FAIR ?
> - DRM_SCHED_PRIORITY_KERNEL : priority;
> /*
> * It's perfectly valid to initialize an entity without having a valid
> * scheduler attached. It's just not valid to use the scheduler before it
> @@ -120,30 +118,14 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> RCU_INIT_POINTER(entity->last_scheduled, NULL);
> RB_CLEAR_NODE(&entity->rb_tree_node);
>
> - if (num_sched_list && !sched_list[0]->sched_rq) {
> + if (num_sched_list && !sched_list[0]->rq) {
> /* Since every entry covered by num_sched_list
> * should be non-NULL and therefore we warn drivers
> * not to do this and to fix their DRM calling order.
> */
> pr_warn("%s: called with uninitialized scheduler\n", __func__);
> } else if (num_sched_list) {
> - enum drm_sched_priority p = entity->priority;
> -
> - /*
> - * The "priority" of an entity cannot exceed the number of
> - * run-queues of a scheduler. Protect against num_rqs being 0,
> - * by converting to signed. Choose the lowest priority
> - * available.
> - */
> - if (p >= sched_list[0]->num_user_rqs) {
> - dev_err(sched_list[0]->dev, "entity with out-of-bounds priority:%u num_user_rqs:%u\n",
> - p, sched_list[0]->num_user_rqs);
> - p = max_t(s32,
> - (s32)sched_list[0]->num_user_rqs - 1,
> - (s32)DRM_SCHED_PRIORITY_KERNEL);
> - entity->priority = p;
> - }
> - entity->rq = sched_list[0]->sched_rq[entity->rq_priority];
> + entity->rq = sched_list[0]->rq;
> }
>
> init_completion(&entity->entity_idle);
> @@ -576,7 +558,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>
> spin_lock(&entity->lock);
> sched = drm_sched_pick_best(entity->sched_list, entity->num_sched_list);
> - rq = sched ? sched->sched_rq[entity->rq_priority] : NULL;
> + rq = sched ? sched->rq : NULL;
> if (rq != entity->rq) {
> drm_sched_rq_remove_entity(entity->rq, entity);
> entity->rq = rq;
> @@ -600,7 +582,6 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> {
> struct drm_sched_entity *entity = sched_job->entity;
> bool first;
> - ktime_t submit_ts;
>
> trace_drm_sched_job_queue(sched_job, entity);
>
> @@ -617,16 +598,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> /*
> * After the sched_job is pushed into the entity queue, it may be
> * completed and freed up at any time. We can no longer access it.
> - * Make sure to set the submit_ts first, to avoid a race.
> */
> - sched_job->submit_ts = submit_ts = ktime_get();
> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>
> /* first job wakes up scheduler */
> if (first) {
> struct drm_gpu_scheduler *sched;
>
> - sched = drm_sched_rq_add_entity(entity, submit_ts);
> + sched = drm_sched_rq_add_entity(entity);
> if (sched)
> drm_sched_wakeup(sched);
> }
> diff --git a/drivers/gpu/drm/scheduler/sched_internal.h b/drivers/gpu/drm/scheduler/sched_internal.h
> index a120efc5d763..0a5b7bf2cb93 100644
> --- a/drivers/gpu/drm/scheduler/sched_internal.h
> +++ b/drivers/gpu/drm/scheduler/sched_internal.h
> @@ -32,13 +32,6 @@ struct drm_sched_entity_stats {
> struct ewma_drm_sched_avgtime avg_job_us;
> };
>
> -/* Used to choose between FIFO and RR job-scheduling */
> -extern int drm_sched_policy;
> -
> -#define DRM_SCHED_POLICY_RR 0
> -#define DRM_SCHED_POLICY_FIFO 1
> -#define DRM_SCHED_POLICY_FAIR 2
> -
> bool drm_sched_can_queue(struct drm_gpu_scheduler *sched,
> struct drm_sched_entity *entity);
> void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
> @@ -46,10 +39,9 @@ void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
> void drm_sched_rq_init(struct drm_sched_rq *rq,
> struct drm_gpu_scheduler *sched);
> struct drm_sched_entity *
> -drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched,
> - struct drm_sched_rq *rq);
> +drm_sched_select_entity(struct drm_gpu_scheduler *sched);
> struct drm_gpu_scheduler *
> -drm_sched_rq_add_entity(struct drm_sched_entity *entity, ktime_t ts);
> +drm_sched_rq_add_entity(struct drm_sched_entity *entity);
> void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> struct drm_sched_entity *entity);
> void drm_sched_rq_pop_entity(struct drm_sched_entity *entity);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 858fc28e91e4..518ce87f844a 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -84,15 +84,6 @@
> #define CREATE_TRACE_POINTS
> #include "gpu_scheduler_trace.h"
>
> -int drm_sched_policy = DRM_SCHED_POLICY_FAIR;
> -
> -/**
> - * DOC: sched_policy (int)
> - * Used to override default entities scheduling policy in a run queue.
> - */
> -MODULE_PARM_DESC(sched_policy, "Specify the scheduling policy for entities on a run-queue, " __stringify(DRM_SCHED_POLICY_RR) " = Round Robin, " __stringify(DRM_SCHED_POLICY_FIFO) " = FIFO, " __stringify(DRM_SCHED_POLICY_FAIR) " = Fair (default).");
> -module_param_named(sched_policy, drm_sched_policy, int, 0444);
> -
> static u32 drm_sched_available_credits(struct drm_gpu_scheduler *sched)
> {
> u32 credits;
> @@ -876,34 +867,6 @@ void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
> drm_sched_run_job_queue(sched);
> }
>
> -/**
> - * drm_sched_select_entity - Select next entity to process
> - *
> - * @sched: scheduler instance
> - *
> - * Return an entity to process or NULL if none are found.
> - *
> - * Note, that we break out of the for-loop when "entity" is non-null, which can
> - * also be an error-pointer--this assures we don't process lower priority
> - * run-queues. See comments in the respectively called functions.
> - */
> -static struct drm_sched_entity *
> -drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> -{
> - struct drm_sched_entity *entity = NULL;
> - int i;
> -
> - /* Start with the highest priority.
> - */
> - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
> - entity = drm_sched_rq_select_entity(sched, sched->sched_rq[i]);
> - if (entity)
> - break;
> - }
> -
> - return IS_ERR(entity) ? NULL : entity;
> -}
> -
> /**
> * drm_sched_get_finished_job - fetch the next finished job to be destroyed
> *
> @@ -1029,7 +992,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
>
> /* Find entity with a ready job */
> entity = drm_sched_select_entity(sched);
> - if (!entity)
> + if (IS_ERR_OR_NULL(entity))
What's that about? Why does the NULL check become IS_ERR_OR_NULL() here?
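One possible reading, offered as an assumption rather than something confirmed in this thread: the removed wrapper ended with `return IS_ERR(entity) ? NULL : entity;`, so this caller only ever saw NULL. Its removed kerneldoc says the per-run-queue selector can also hand back an error pointer, and with that selector now called directly the caller would have to filter it out itself. The userspace sketch below mimics the kernel's ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL() helpers; select_entity(), enum outcome, has_work() and the -ENOSPC value are hypothetical stand-ins for illustration, not the scheduler's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the helpers in the kernel's <linux/err.h>. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical entity and selector, for illustration only. */
struct entity {
	int id;
};

enum outcome {
	NONE_READY,        /* nothing queued: selector returns NULL */
	READY_BUT_BLOCKED, /* assumed case: selector returns an error pointer */
	READY,             /* selector returns a valid entity */
};

static struct entity demo_entity = { .id = 1 };

static struct entity *select_entity(enum outcome o)
{
	switch (o) {
	case READY:
		return &demo_entity;
	case READY_BUT_BLOCKED:
		return ERR_PTR(-ENOSPC); /* assumed error value */
	default:
		return NULL;
	}
}

/* Mirrors the caller's check: only a valid pointer means there is work. */
static bool has_work(enum outcome o)
{
	return !IS_ERR_OR_NULL(select_entity(o));
}

/* A plain "!entity" test would wrongly treat the error pointer as work. */
static void run_demo(void)
{
	assert(has_work(READY));
	assert(!has_work(NONE_READY));
	assert(!has_work(READY_BUT_BLOCKED));
	assert(select_entity(READY_BUT_BLOCKED) != NULL);
}
```

If that reading is right, a sentence in the commit message saying so would make the hunk self-explanatory.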
> return; /* No more work */
>
> sched_job = drm_sched_entity_pop_job(entity);
> @@ -1100,8 +1063,6 @@ static struct workqueue_struct *drm_sched_alloc_wq(const char *name)
> */
> int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_args *args)
> {
> - int i;
> -
> sched->ops = args->ops;
> sched->credit_limit = args->credit_limit;
> sched->name = args->name;
> @@ -1111,13 +1072,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> sched->score = args->score ? args->score : &sched->_score;
> sched->dev = args->dev;
>
> - if (args->num_rqs > DRM_SCHED_PRIORITY_COUNT) {
> - /* This is a gross violation--tell drivers what the problem is.
> - */
> - dev_err(sched->dev, "%s: num_rqs cannot be greater than DRM_SCHED_PRIORITY_COUNT\n",
> - __func__);
> - return -EINVAL;
> - } else if (sched->sched_rq) {
> + if (sched->rq) {
> /* Not an error, but warn anyway so drivers can
> * fine-tune their DRM calling order, and return all
> * is good.
> @@ -1137,21 +1092,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> sched->own_submit_wq = true;
> }
>
> - sched->num_user_rqs = args->num_rqs;
> - sched->num_rqs = drm_sched_policy != DRM_SCHED_POLICY_FAIR ?
> - args->num_rqs : 1;
> - sched->sched_rq = kmalloc_array(sched->num_rqs,
> - sizeof(*sched->sched_rq),
> - GFP_KERNEL | __GFP_ZERO);
> - if (!sched->sched_rq)
> + sched->rq = kmalloc(sizeof(*sched->rq), GFP_KERNEL | __GFP_ZERO);
> + if (!sched->rq)
> goto Out_check_own;
>
> - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
> - sched->sched_rq[i] = kzalloc(sizeof(*sched->sched_rq[i]), GFP_KERNEL);
> - if (!sched->sched_rq[i])
> - goto Out_unroll;
> - drm_sched_rq_init(sched->sched_rq[i], sched);
> - }
> + drm_sched_rq_init(sched->rq, sched);
>
> init_waitqueue_head(&sched->job_scheduled);
> INIT_LIST_HEAD(&sched->pending_list);
> @@ -1167,12 +1112,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
>
> sched->ready = true;
> return 0;
> -Out_unroll:
> - for (--i ; i >= DRM_SCHED_PRIORITY_KERNEL; i--)
> - kfree(sched->sched_rq[i]);
>
> - kfree(sched->sched_rq);
> - sched->sched_rq = NULL;
> Out_check_own:
> if (sched->own_submit_wq)
> destroy_workqueue(sched->submit_wq);
> @@ -1208,41 +1148,35 @@ static void drm_sched_cancel_remaining_jobs(struct drm_gpu_scheduler *sched)
> */
> void drm_sched_fini(struct drm_gpu_scheduler *sched)
> {
> +
Surplus empty line.
P.
> + struct drm_sched_rq *rq = sched->rq;
> struct drm_sched_entity *s_entity;
> - int i;
>
> drm_sched_wqueue_stop(sched);
>
> - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
> - struct drm_sched_rq *rq = sched->sched_rq[i];
> -
> - spin_lock(&rq->lock);
> - list_for_each_entry(s_entity, &rq->entities, list)
> - /*
> - * Prevents reinsertion and marks job_queue as idle,
> - * it will be removed from the rq in drm_sched_entity_fini()
> - * eventually
> - *
> - * FIXME:
> - * This lacks the proper spin_lock(&s_entity->lock) and
> - * is, therefore, a race condition. Most notably, it
> - * can race with drm_sched_entity_push_job(). The lock
> - * cannot be taken here, however, because this would
> - * lead to lock inversion -> deadlock.
> - *
> - * The best solution probably is to enforce the life
> - * time rule of all entities having to be torn down
> - * before their scheduler. Then, however, locking could
> - * be dropped alltogether from this function.
> - *
> - * For now, this remains a potential race in all
> - * drivers that keep entities alive for longer than
> - * the scheduler.
> - */
> - s_entity->stopped = true;
> - spin_unlock(&rq->lock);
> - kfree(sched->sched_rq[i]);
> - }
> + spin_lock(&rq->lock);
> + list_for_each_entry(s_entity, &rq->entities, list)
> + /*
> + * Prevents re-insertion and marks job_queue as idle,
> + * it will be removed from the rq in drm_sched_entity_fini()
> + * eventually.
> + *
> + * FIXME:
> + * This lacks the proper spin_lock(&s_entity->lock) and is,
> + * therefore, a race condition. Most notably, it can race with
> + * drm_sched_entity_push_job(). The lock cannot be taken here,
> + * however, because this would lead to lock inversion.
> + *
> + * The best solution probably is to enforce the life time rule
> + * of all entities having to be torn down before their
> + * scheduler. Then locking could be dropped altogether from this
> + * function.
> + *
> + * For now, this remains a potential race in all drivers that
> + * keep entities alive for longer than the scheduler.
> + */
> + s_entity->stopped = true;
> + spin_unlock(&rq->lock);
>
> /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */
> wake_up_all(&sched->job_scheduled);
> @@ -1257,8 +1191,8 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> if (sched->own_submit_wq)
> destroy_workqueue(sched->submit_wq);
> sched->ready = false;
> - kfree(sched->sched_rq);
> - sched->sched_rq = NULL;
> + kfree(sched->rq);
> + sched->rq = NULL;
>
> if (!list_empty(&sched->pending_list))
> dev_warn(sched->dev, "Tearing down scheduler while jobs are pending!\n");
> @@ -1276,35 +1210,28 @@ EXPORT_SYMBOL(drm_sched_fini);
> */
> void drm_sched_increase_karma(struct drm_sched_job *bad)
> {
> - int i;
> - struct drm_sched_entity *tmp;
> - struct drm_sched_entity *entity;
> struct drm_gpu_scheduler *sched = bad->sched;
> + struct drm_sched_entity *entity, *tmp;
> + struct drm_sched_rq *rq = sched->rq;
>
> /* don't change @bad's karma if it's from KERNEL RQ,
> * because sometimes GPU hang would cause kernel jobs (like VM updating jobs)
> * corrupt but keep in mind that kernel jobs always considered good.
> */
> - if (bad->s_priority != DRM_SCHED_PRIORITY_KERNEL) {
> - atomic_inc(&bad->karma);
> + if (bad->s_priority == DRM_SCHED_PRIORITY_KERNEL)
> + return;
>
> - for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
> - struct drm_sched_rq *rq = sched->sched_rq[i];
> + atomic_inc(&bad->karma);
>
> - spin_lock(&rq->lock);
> - list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
> - if (bad->s_fence->scheduled.context ==
> - entity->fence_context) {
> - if (entity->guilty)
> - atomic_set(entity->guilty, 1);
> - break;
> - }
> - }
> - spin_unlock(&rq->lock);
> - if (&entity->list != &rq->entities)
> - break;
> + spin_lock(&rq->lock);
> + list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
> + if (bad->s_fence->scheduled.context == entity->fence_context) {
> + if (entity->guilty)
> + atomic_set(entity->guilty, 1);
> + break;
> }
> }
> + spin_unlock(&rq->lock);
> }
> EXPORT_SYMBOL(drm_sched_increase_karma);
>
> diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c
> index 02742869e75b..f9c899a9629c 100644
> --- a/drivers/gpu/drm/scheduler/sched_rq.c
> +++ b/drivers/gpu/drm/scheduler/sched_rq.c
> @@ -34,7 +34,7 @@ static void drm_sched_rq_update_prio(struct drm_sched_rq *rq)
> rq->head_prio = prio;
> }
>
> -static void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity,
> +static void drm_sched_rq_remove_tree_locked(struct drm_sched_entity *entity,
> struct drm_sched_rq *rq)
> {
> lockdep_assert_held(&entity->lock);
> @@ -47,7 +47,7 @@ static void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity,
> }
> }
>
> -static void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity,
> +static void drm_sched_rq_update_tree_locked(struct drm_sched_entity *entity,
> struct drm_sched_rq *rq,
> ktime_t ts)
> {
> @@ -59,7 +59,7 @@ static void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity,
> lockdep_assert_held(&entity->lock);
> lockdep_assert_held(&rq->lock);
>
> - drm_sched_rq_remove_fifo_locked(entity, rq);
> + drm_sched_rq_remove_tree_locked(entity, rq);
>
> entity->oldest_job_waiting = ts;
>
> @@ -211,17 +211,17 @@ static ktime_t drm_sched_entity_get_job_ts(struct drm_sched_entity *entity)
> * drm_sched_rq_add_entity - add an entity
> *
> * @entity: scheduler entity
> - * @ts: submission timestamp
> *
> * Adds a scheduler entity to the run queue.
> *
> * Returns a DRM scheduler pre-selected to handle this entity.
> */
> struct drm_gpu_scheduler *
> -drm_sched_rq_add_entity(struct drm_sched_entity *entity, ktime_t ts)
> +drm_sched_rq_add_entity(struct drm_sched_entity *entity)
> {
> struct drm_gpu_scheduler *sched;
> struct drm_sched_rq *rq;
> + ktime_t ts;
>
> /* Add the entity to the run queue */
> spin_lock(&entity->lock);
> @@ -241,15 +241,9 @@ drm_sched_rq_add_entity(struct drm_sched_entity *entity, ktime_t ts)
> list_add_tail(&entity->list, &rq->entities);
> }
>
> - if (drm_sched_policy == DRM_SCHED_POLICY_FAIR) {
> - ts = drm_sched_rq_get_min_vruntime(rq);
> - ts = drm_sched_entity_restore_vruntime(entity, ts,
> - rq->head_prio);
> - } else if (drm_sched_policy == DRM_SCHED_POLICY_RR) {
> - ts = entity->rr_ts;
> - }
> -
> - drm_sched_rq_update_fifo_locked(entity, rq, ts);
> + ts = drm_sched_rq_get_min_vruntime(rq);
> + ts = drm_sched_entity_restore_vruntime(entity, ts, rq->head_prio);
> + drm_sched_rq_update_tree_locked(entity, rq, ts);
>
> spin_unlock(&rq->lock);
> spin_unlock(&entity->lock);
> @@ -278,26 +272,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> atomic_dec(rq->sched->score);
> list_del_init(&entity->list);
>
> - drm_sched_rq_remove_fifo_locked(entity, rq);
> + drm_sched_rq_remove_tree_locked(entity, rq);
>
> spin_unlock(&rq->lock);
> }
>
> -static ktime_t
> -drm_sched_rq_get_rr_ts(struct drm_sched_rq *rq, struct drm_sched_entity *entity)
> -{
> - ktime_t ts;
> -
> - lockdep_assert_held(&entity->lock);
> - lockdep_assert_held(&rq->lock);
> -
> - ts = ktime_add_ns(rq->rr_ts, 1);
> - entity->rr_ts = ts;
> - rq->rr_ts = ts;
> -
> - return ts;
> -}
> -
> /**
> * drm_sched_rq_pop_entity - pops an entity
> *
> @@ -321,33 +300,23 @@ void drm_sched_rq_pop_entity(struct drm_sched_entity *entity)
> if (next_job) {
> ktime_t ts;
>
> - if (drm_sched_policy == DRM_SCHED_POLICY_FAIR)
> - ts = drm_sched_entity_get_job_ts(entity);
> - else if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> - ts = next_job->submit_ts;
> - else
> - ts = drm_sched_rq_get_rr_ts(rq, entity);
> -
> - drm_sched_rq_update_fifo_locked(entity, rq, ts);
> + ts = drm_sched_entity_get_job_ts(entity);
> + drm_sched_rq_update_tree_locked(entity, rq, ts);
> } else {
> - drm_sched_rq_remove_fifo_locked(entity, rq);
> + ktime_t min_vruntime;
>
> - if (drm_sched_policy == DRM_SCHED_POLICY_FAIR) {
> - ktime_t min_vruntime;
> -
> - min_vruntime = drm_sched_rq_get_min_vruntime(rq);
> - drm_sched_entity_save_vruntime(entity, min_vruntime);
> - }
> + drm_sched_rq_remove_tree_locked(entity, rq);
> + min_vruntime = drm_sched_rq_get_min_vruntime(rq);
> + drm_sched_entity_save_vruntime(entity, min_vruntime);
> }
> spin_unlock(&rq->lock);
> spin_unlock(&entity->lock);
> }
>
> /**
> - * drm_sched_rq_select_entity - Select an entity which provides a job to run
> + * drm_sched_select_entity - Select an entity which provides a job to run
> *
> * @sched: the gpu scheduler
> - * @rq: scheduler run queue to check.
> *
> * Find oldest waiting ready entity.
> *
> @@ -356,9 +325,9 @@ void drm_sched_rq_pop_entity(struct drm_sched_entity *entity)
> * its job; return NULL, if no ready entity was found.
> */
> struct drm_sched_entity *
> -drm_sched_rq_select_entity(struct drm_gpu_scheduler *sched,
> - struct drm_sched_rq *rq)
> +drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> {
> + struct drm_sched_rq *rq = sched->rq;
> struct rb_node *rb;
>
> spin_lock(&rq->lock);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index a7e407e04ce0..d4dc4b8b770a 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -99,8 +99,7 @@ struct drm_sched_entity {
> * @lock:
> *
> * Lock protecting the run-queue (@rq) to which this entity belongs,
> - * @priority, the list of schedulers (@sched_list, @num_sched_list) and
> - * the @rr_ts field.
> + * @priority and the list of schedulers (@sched_list, @num_sched_list).
> */
> spinlock_t lock;
>
> @@ -153,18 +152,6 @@ struct drm_sched_entity {
> */
> enum drm_sched_priority priority;
>
> - /**
> - * @rq_priority: Run-queue priority
> - */
> - enum drm_sched_priority rq_priority;
> -
> - /**
> - * @rr_ts:
> - *
> - * Fake timestamp of the last popped job from the entity.
> - */
> - ktime_t rr_ts;
> -
> /**
> * @job_queue: the list of jobs of this entity.
> */
> @@ -262,8 +249,7 @@ struct drm_sched_entity {
> * struct drm_sched_rq - queue of entities to be scheduled.
> *
> * @sched: the scheduler to which this rq belongs to.
> - * @lock: protects @entities, @rb_tree_root, @rr_ts and @head_prio.
> - * @rr_ts: monotonically incrementing fake timestamp for RR mode
> + * @lock: protects @entities, @rb_tree_root and @head_prio.
> * @entities: list of the entities to be scheduled.
> * @rb_tree_root: root of time based priority queue of entities for FIFO scheduling
> * @head_prio: priority of the top tree element
> @@ -277,7 +263,6 @@ struct drm_sched_rq {
>
> spinlock_t lock;
> /* Following members are protected by the @lock: */
> - ktime_t rr_ts;
> struct list_head entities;
> struct rb_root_cached rb_tree_root;
> enum drm_sched_priority head_prio;
> @@ -363,13 +348,6 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
> * to schedule the job.
> */
> struct drm_sched_job {
> - /**
> - * @submit_ts:
> - *
> - * When the job was pushed into the entity queue.
> - */
> - ktime_t submit_ts;
> -
> /**
> * @sched:
> *
> @@ -573,11 +551,7 @@ struct drm_sched_backend_ops {
> * @credit_count: the current credit count of this scheduler
> * @timeout: the time after which a job is removed from the scheduler.
> * @name: name of the ring for which this scheduler is being used.
> - * @num_user_rqs: Number of run-queues. This is at most
> - * DRM_SCHED_PRIORITY_COUNT, as there's usually one run-queue per
> - * priority, but could be less.
> - * @num_rqs: Equal to @num_user_rqs for FIFO and RR and 1 for the FAIR policy.
> - * @sched_rq: An allocated array of run-queues of size @num_rqs;
> + * @rq: Scheduler run queue
> * @job_scheduled: once drm_sched_entity_flush() is called the scheduler
> * waits on this wait queue until all the scheduled jobs are
> * finished.
> @@ -609,9 +583,7 @@ struct drm_gpu_scheduler {
> atomic_t credit_count;
> long timeout;
> const char *name;
> - u32 num_rqs;
> - u32 num_user_rqs;
> - struct drm_sched_rq **sched_rq;
> + struct drm_sched_rq *rq;
> wait_queue_head_t job_scheduled;
> atomic64_t job_id_count;
> struct workqueue_struct *submit_wq;