AMD-GFX Archive on lore.kernel.org
From: Philipp Stanner <phasta@mailbox.org>
To: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>,
	amd-gfx@lists.freedesktop.org,  dri-devel@lists.freedesktop.org
Cc: kernel-dev@igalia.com,
	"Christian König" <christian.koenig@amd.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Philipp Stanner" <phasta@kernel.org>,
	"Pierre-Eric Pelloux-Prayer" <pierre-eric.pelloux-prayer@amd.com>
Subject: Re: [PATCH 11/28] drm/sched: Favour interactive clients slightly
Date: Tue, 14 Oct 2025 12:53:23 +0200	[thread overview]
Message-ID: <618a50aabddace2531375bc18fb1ca9b00297490.camel@mailbox.org> (raw)
In-Reply-To: <20251008085359.52404-12-tvrtko.ursulin@igalia.com>

On Wed, 2025-10-08 at 09:53 +0100, Tvrtko Ursulin wrote:
> GPUs do not always implement preemption and the DRM scheduler
> definitely does not support it at the front-end scheduling level. This
> means execution quanta can be quite long and are controlled by
> userspace, a consequence of which is that picking the "wrong" entity to
> run can have a larger negative effect than it would with a virtual
> runtime based CPU scheduler.
> 
> Another important consideration is that rendering clients often have
> shallow submission queues, meaning they will be entering and exiting the
> scheduler's runnable queue often.
> 
> The relevant scenario here is what happens when an entity re-joins the
> runnable queue with other entities already present. One cornerstone of the
> virtual runtime algorithm is to let it re-join at the head and rely on the
> virtual runtime accounting and timeslicing to sort it out.
> 
> However, as explained above, this may not work perfectly in the GPU world.
> An entity could always get to overtake the existing entities, or not,
> depending on the submission order and rbtree equal key insertion
> behaviour.
> 
> Allow interactive jobs to overtake entities already queued up, for the
> limited case when an interactive entity is re-joining the queue after
> being idle.
> 
> This gives compositors more opportunity to have their rendering
> executed before the GPU hogs, even if both have been configured with
> the same scheduling priority.
> 
> To classify a client as interactive, we look at its average job duration
> versus the average for the whole scheduler. We can track this easily by
> plugging into the existing job runtime tracking and applying the
> exponential moving average window on the past submissions. Then, all other
> things being equal, we let the more interactive jobs go first.

OK so this patch is new. Why was it added? The cover letter says:

"Improved handling of interactive clients by replacing the random noise
on tie approach with the average job duration statistics."

So this is based on additional research you have done in the meantime?
Does it change behavior significantly when compared to the RFC?

The firmware scheduler bros are not affected in any case. Still, I
think the RFC we discussed in the past and at XDC now differs quite a
bit from the actual proposal in this v1.

I suppose it's generally good for graphics applications... what about
compute, doesn't that have longer jobs? Probably still good for people
who do compute on their production system...

@AMD:
can you review / ack this?

P.

> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <phasta@kernel.org>
> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c   |  1 +
>  drivers/gpu/drm/scheduler/sched_internal.h | 15 ++++++++++++---
>  drivers/gpu/drm/scheduler/sched_main.c     |  8 +++++++-
>  drivers/gpu/drm/scheduler/sched_rq.c       | 14 ++++++++++++++
>  include/drm/gpu_scheduler.h                |  5 +++++
>  5 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 58f51875547a..1715e1caec40 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -61,6 +61,7 @@ static struct drm_sched_entity_stats *drm_sched_entity_stats_alloc(void)
>  
>  	kref_init(&stats->kref);
>  	spin_lock_init(&stats->lock);
> +	ewma_drm_sched_avgtime_init(&stats->avg_job_us);
>  
>  	return stats;
>  }
> diff --git a/drivers/gpu/drm/scheduler/sched_internal.h b/drivers/gpu/drm/scheduler/sched_internal.h
> index c94e38acc6f2..a120efc5d763 100644
> --- a/drivers/gpu/drm/scheduler/sched_internal.h
> +++ b/drivers/gpu/drm/scheduler/sched_internal.h
> @@ -20,6 +20,7 @@
>   * @runtime: time entity spent on the GPU.
>   * @prev_runtime: previous @runtime used to get the runtime delta
>   * @vruntime: virtual runtime as accumulated by the fair algorithm
> + * @avg_job_us: average job duration
>   */
>  struct drm_sched_entity_stats {
>  	struct kref	kref;
> @@ -27,6 +28,8 @@ struct drm_sched_entity_stats {
>  	ktime_t		runtime;
>  	ktime_t		prev_runtime;
>  	u64		vruntime;
> +
> +	struct ewma_drm_sched_avgtime   avg_job_us;
>  };
>  
>  /* Used to choose between FIFO and RR job-scheduling */
> @@ -153,20 +156,26 @@ drm_sched_entity_stats_put(struct drm_sched_entity_stats *stats)
>   * @job: Scheduler job to account.
>   *
>   * Accounts the execution time of @job to its respective entity stats object.
> + *
> + * Returns job's real duration in micro seconds.
>   */
> -static inline void
> +static inline ktime_t
>  drm_sched_entity_stats_job_add_gpu_time(struct drm_sched_job *job)
>  {
>  	struct drm_sched_entity_stats *stats = job->entity_stats;
>  	struct drm_sched_fence *s_fence = job->s_fence;
> -	ktime_t start, end;
> +	ktime_t start, end, duration;
>  
>  	start = dma_fence_timestamp(&s_fence->scheduled);
>  	end = dma_fence_timestamp(&s_fence->finished);
> +	duration = ktime_sub(end, start);
>  
>  	spin_lock(&stats->lock);
> -	stats->runtime = ktime_add(stats->runtime, ktime_sub(end, start));
> +	stats->runtime = ktime_add(stats->runtime, duration);
> +	ewma_drm_sched_avgtime_add(&stats->avg_job_us, ktime_to_us(duration));
>  	spin_unlock(&stats->lock);
> +
> +	return duration;
>  }
>  
>  #endif
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 8d8f9c8411f5..204d99c6699f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1000,7 +1000,12 @@ static void drm_sched_free_job_work(struct work_struct *w)
>  	struct drm_sched_job *job;
>  
>  	while ((job = drm_sched_get_finished_job(sched))) {
> -		drm_sched_entity_stats_job_add_gpu_time(job);
> +		ktime_t duration = drm_sched_entity_stats_job_add_gpu_time(job);
> +
> +		/* Serialized by the worker. */
> +		ewma_drm_sched_avgtime_add(&sched->avg_job_us,
> +					   ktime_to_us(duration));
> +
>  		sched->ops->free_job(job);
>  	}
>  
> @@ -1158,6 +1163,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
>  	atomic_set(&sched->_score, 0);
>  	atomic64_set(&sched->job_id_count, 0);
>  	sched->pause_submit = false;
> +	ewma_drm_sched_avgtime_init(&sched->avg_job_us);
>  
>  	sched->ready = true;
>  	return 0;
> diff --git a/drivers/gpu/drm/scheduler/sched_rq.c b/drivers/gpu/drm/scheduler/sched_rq.c
> index b868c794cc9d..02742869e75b 100644
> --- a/drivers/gpu/drm/scheduler/sched_rq.c
> +++ b/drivers/gpu/drm/scheduler/sched_rq.c
> @@ -150,6 +150,20 @@ drm_sched_entity_restore_vruntime(struct drm_sched_entity *entity,
>  			 * Higher priority can go first.
>  			 */
>  			vruntime = -us_to_ktime(rq_prio - prio);
> +		} else {
> +			struct drm_gpu_scheduler *sched = entity->rq->sched;
> +
> +			/*
> +			 * Favour entity with shorter jobs (interactivity).
> +			 *
> +			 * (Unlocked read is fine since it is just heuristics.)
> +			 *
> +			 */
> +			if (ewma_drm_sched_avgtime_read(&stats->avg_job_us) <=
> +			    ewma_drm_sched_avgtime_read(&sched->avg_job_us))
> +				vruntime = -1;
> +			else
> +				vruntime = 1;
>  		}
>  	}
>  
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index bc25508a6ff6..a7e407e04ce0 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -25,11 +25,14 @@
>  #define _DRM_GPU_SCHEDULER_H_
>  
>  #include <drm/spsc_queue.h>
> +#include <linux/average.h>
>  #include <linux/dma-fence.h>
>  #include <linux/completion.h>
>  #include <linux/xarray.h>
>  #include <linux/workqueue.h>
>  
> +DECLARE_EWMA(drm_sched_avgtime, 6, 4);
> +
>  #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>  
>  /**
> @@ -581,6 +584,7 @@ struct drm_sched_backend_ops {
>   * @job_id_count: used to assign unique id to the each job.
>   * @submit_wq: workqueue used to queue @work_run_job and @work_free_job
>   * @timeout_wq: workqueue used to queue @work_tdr
> + * @avg_job_us: Average job duration
>   * @work_run_job: work which calls run_job op of each scheduler.
>   * @work_free_job: work which calls free_job op of each scheduler.
>   * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
> @@ -612,6 +616,7 @@ struct drm_gpu_scheduler {
>  	atomic64_t			job_id_count;
>  	struct workqueue_struct		*submit_wq;
>  	struct workqueue_struct		*timeout_wq;
> +	struct ewma_drm_sched_avgtime   avg_job_us;
>  	struct work_struct		work_run_job;
>  	struct work_struct		work_free_job;
>  	struct delayed_work		work_tdr;


Thread overview: 76+ messages
2025-10-08  8:53 [PATCH 00/28] Fair DRM scheduler Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 01/28] drm/sched: Reverse drm_sched_rq_init arguments Tvrtko Ursulin
2025-10-10  8:55   ` Philipp Stanner
2025-10-10  9:46     ` Tvrtko Ursulin
2025-10-10 10:36       ` Philipp Stanner
2025-10-11 13:21         ` Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 02/28] drm/sched: Add some scheduling quality unit tests Tvrtko Ursulin
2025-10-10  9:38   ` Philipp Stanner
2025-10-11 13:09     ` Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 03/28] drm/sched: Add some more " Tvrtko Ursulin
2025-10-10  9:48   ` Philipp Stanner
2025-10-11 13:21     ` Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 04/28] drm/sched: Implement RR via FIFO Tvrtko Ursulin
2025-10-10 10:18   ` Philipp Stanner
2025-10-11 13:30     ` Tvrtko Ursulin
2025-10-14  6:40       ` Philipp Stanner
2025-10-08  8:53 ` [PATCH 05/28] drm/sched: Consolidate entity run queue management Tvrtko Ursulin
2025-10-10 10:49   ` Philipp Stanner
2025-10-11 14:19     ` Tvrtko Ursulin
2025-10-14  6:53       ` Philipp Stanner
2025-10-14  7:26         ` Tvrtko Ursulin
2025-10-14  8:52           ` Philipp Stanner
2025-10-14 10:04             ` Tvrtko Ursulin
2025-10-14 11:23               ` Philipp Stanner
2025-10-08  8:53 ` [PATCH 06/28] drm/sched: Move run queue related code into a separate file Tvrtko Ursulin
2025-10-08 22:49   ` Matthew Brost
2025-10-08  8:53 ` [PATCH 07/28] drm/sched: Free all finished jobs at once Tvrtko Ursulin
2025-10-08 22:48   ` Matthew Brost
2025-10-08  8:53 ` [PATCH 08/28] drm/sched: Account entity GPU time Tvrtko Ursulin
2025-10-10 12:22   ` Philipp Stanner
2025-10-11 14:56     ` Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 09/28] drm/sched: Remove idle entity from tree Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 10/28] drm/sched: Add fair scheduling policy Tvrtko Ursulin
2025-10-14 10:27   ` Philipp Stanner
2025-10-14 12:56     ` Tvrtko Ursulin
2025-10-14 14:02       ` Philipp Stanner
2025-10-14 14:32         ` Simona Vetter
2025-10-14 14:58           ` Tvrtko Ursulin
2025-10-16  7:06             ` Philipp Stanner
2025-10-16  8:42               ` Tvrtko Ursulin
2025-10-16  9:50                 ` Danilo Krummrich
2025-10-16 10:54                   ` Tvrtko Ursulin
2025-10-16 11:14                     ` Danilo Krummrich
2025-10-08  8:53 ` [PATCH 11/28] drm/sched: Favour interactive clients slightly Tvrtko Ursulin
2025-10-14 10:53   ` Philipp Stanner [this message]
2025-10-14 12:20     ` Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 12/28] drm/sched: Switch default policy to fair Tvrtko Ursulin
2025-10-10 12:56   ` Philipp Stanner
2025-10-08  8:53 ` [PATCH 13/28] drm/sched: Remove FIFO and RR and simplify to a single run queue Tvrtko Ursulin
2025-10-14 11:16   ` Philipp Stanner
2025-10-14 13:16     ` Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 14/28] drm/sched: Embed run queue singleton into the scheduler Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 15/28] accel/amdxdna: Remove drm_sched_init_args->num_rqs usage Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 16/28] accel/rocket: " Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 17/28] drm/amdgpu: " Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 18/28] drm/etnaviv: " Tvrtko Ursulin
2025-10-08 10:31   ` Christian Gmeiner
2025-10-08  8:53 ` [PATCH 19/28] drm/imagination: " Tvrtko Ursulin
2025-10-10 14:29   ` Matt Coster
2025-10-08  8:53 ` [PATCH 20/28] drm/lima: " Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 21/28] drm/msm: " Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 22/28] drm/nouveau: " Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 23/28] drm/panfrost: " Tvrtko Ursulin
2025-10-08 14:55   ` Steven Price
2025-10-08  8:53 ` [PATCH 24/28] drm/panthor: " Tvrtko Ursulin
2025-10-08 14:55   ` Steven Price
2025-10-10 10:02   ` Liviu Dudau
2025-10-08  8:53 ` [PATCH 25/28] drm/sched: " Tvrtko Ursulin
2025-10-08 22:44   ` Matthew Brost
2025-10-08  8:53 ` [PATCH 26/28] drm/v3d: " Tvrtko Ursulin
2025-10-10 14:15   ` Melissa Wen
2025-10-08  8:53 ` [PATCH 27/28] drm/xe: " Tvrtko Ursulin
2025-10-08  8:53 ` [PATCH 28/28] drm/sched: Remove drm_sched_init_args->num_rqs Tvrtko Ursulin
2025-10-10 13:00   ` Philipp Stanner
2025-10-11 14:58     ` Tvrtko Ursulin
2025-10-10  8:59 ` [PATCH 00/28] Fair DRM scheduler Philipp Stanner
