Re: [Intel-xe] [PATCH] drm/sched: Eliminate drm_sched_run_job_queue_if_ready()

From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Luben Tuikov <ltuikov89@gmail.com>
Cc: robdclark@chromium.org, sarah.walker@imgtec.com,
	ltuikov@yahoo.com, ketil.johnsen@arm.com, lina@asahilina.net,
	mcanal@igalia.com, Liviu.Dudau@arm.com,
	dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	boris.brezillon@collabora.com, dakr@redhat.com,
	donald.robson@imgtec.com, christian.koenig@amd.com,
	faith.ekstrand@collabora.com
Subject: Re: [Intel-xe] [PATCH] drm/sched: Eliminate drm_sched_run_job_queue_if_ready()
Date: Fri, 3 Nov 2023 10:39:15 +0000
Message-ID: <d2bf144f-e388-4cb1-bc18-12efad4f677b@linux.intel.com>
In-Reply-To: <20231102224653.5785-2-ltuikov89@gmail.com>


On 02/11/2023 22:46, Luben Tuikov wrote:
> Eliminate drm_sched_run_job_queue_if_ready() and instead just call
> drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
> the former function uses drm_sched_select_entity() to determine if the
> scheduler had an entity ready in one of its run-queues, and in the case of the
> Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr() does
> just that, selects the _next_ entity which is ready, sets up the run-queue and
> completion and returns that entity. The FIFO scheduling algorithm is unaffected.
> 
> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(), then
> in the case of RR scheduling, that would result in calling select_entity()
> twice, which may result in skipping a ready entity if more than one entity is
> ready. This commit fixes this by eliminating the if_ready() variant.
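
To illustrate the skip described above, here is a minimal user-space
sketch. It is not the kernel code: struct toy_entity, struct toy_rq,
toy_rq_select_rr() and toy_double_select() are hypothetical stand-ins
that model only a round-robin cursor which advances on every select.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel's data structures. */
struct toy_entity {
	int id;
	int ready;	/* non-zero if the entity has a job queued */
};

struct toy_rq {
	struct toy_entity *entities;
	size_t count;
	size_t current;	/* index of the entity selected last */
};

/*
 * Round-robin select: return the next ready entity after the cursor
 * AND move the cursor to it. The side effect is the point here.
 */
struct toy_entity *toy_rq_select_rr(struct toy_rq *rq)
{
	assert(rq != NULL);

	for (size_t i = 1; i <= rq->count; i++) {
		size_t idx = (rq->current + i) % rq->count;

		if (rq->entities[idx].ready) {
			rq->current = idx;
			return &rq->entities[idx];
		}
	}
	return NULL;
}

/*
 * Models a readiness check done via select, followed by the worker
 * selecting again. Returns the id the worker ends up with, or -1.
 */
int toy_double_select(struct toy_rq *rq)
{
	struct toy_entity *checked = toy_rq_select_rr(rq); /* "if ready" */
	struct toy_entity *run;

	if (!checked)
		return -1;

	run = toy_rq_select_rr(rq);	/* worker selects a second time */
	return run ? run->id : -1;
}
```

With three ready entities and the cursor on the last one, the check
selects entity 0 and the second select returns entity 1, so entity 0 is
passed over in that round.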

A Fixes: tag is missing; it is needed since the regression has already landed.

> 
> Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 14 ++------------
>   1 file changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 98b2ad54fc7071..05816e7cae8c8b 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>   }
>   EXPORT_SYMBOL(drm_sched_pick_best);
>   
> -/**
> - * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
> - * @sched: scheduler instance
> - */
> -static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> -{
> -	if (drm_sched_select_entity(sched))
> -		drm_sched_run_job_queue(sched);
> -}
> -
>   /**
>    * drm_sched_free_job_work - worker to call free_job
>    *
> @@ -1069,7 +1059,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
>   		sched->ops->free_job(cleanup_job);
>   
>   		drm_sched_free_job_queue_if_done(sched);
> -		drm_sched_run_job_queue_if_ready(sched);
> +		drm_sched_run_job_queue(sched);

It works, but it is a bit wasteful: it causes needless CPU wake-ups with a 
potentially empty queue, both here and in drm_sched_run_job_work() below.

What would be the problem with having a "peek"-type helper? It would be 
easy to do that in a single spin-lock section instead of dropping and 
re-acquiring the lock.
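
As a rough illustration of what I mean by "peek", here is a user-space
sketch under simplifying assumptions. All names (struct sketch_entity,
struct sketch_rq, sketch_rq_peek_ready(), sketch_rq_select_rr()) are
hypothetical and locking is left out; in the scheduler the scan would
sit inside the run-queue's spin-lock section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel's data structures. */
struct sketch_entity {
	int id;
	bool ready;	/* true if the entity has a job queued */
};

struct sketch_rq {
	struct sketch_entity *entities;
	size_t count;
	size_t current;	/* round-robin cursor: entity selected last */
};

/*
 * Peek: report whether any entity is ready. Takes a const run-queue,
 * so the round-robin cursor cannot be disturbed by the check.
 */
bool sketch_rq_peek_ready(const struct sketch_rq *rq)
{
	assert(rq != NULL);

	for (size_t i = 0; i < rq->count; i++)
		if (rq->entities[i].ready)
			return true;
	return false;
}

/* Select: pick the next ready entity after the cursor and advance it. */
struct sketch_entity *sketch_rq_select_rr(struct sketch_rq *rq)
{
	assert(rq != NULL);

	for (size_t i = 1; i <= rq->count; i++) {
		size_t idx = (rq->current + i) % rq->count;

		if (rq->entities[idx].ready) {
			rq->current = idx;
			return &rq->entities[idx];
		}
	}
	return NULL;
}
```

A queue-if-ready helper built on the peek variant would avoid both the
needless wake-up on an empty queue and the double select, since only the
worker itself would ever advance the cursor.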

What is even the point of having the re-queue here _inside_ the if 
(cleanup_job) block? See 
https://lists.freedesktop.org/archives/dri-devel/2023-November/429037.html. 
Because of the lock drop and re-acquire, I don't see that it makes sense 
to make a potential re-queue depend on the existence of a currently 
finished job.

Also, what is the point of re-queuing the run-job work from the free 
worker?

(I suppose re-queuing the _free_ worker itself is needed in the current 
design, albeit inefficient.)

Regards,

Tvrtko

>   	}
>   }
>   
> @@ -1127,7 +1117,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
>   	}
>   
>   	wake_up(&sched->job_scheduled);
> -	drm_sched_run_job_queue_if_ready(sched);
> +	drm_sched_run_job_queue(sched);
>   }
>   
>   /**
> 
> base-commit: 6fd9487147c4f18ad77eea00bd8c9189eec74a3e
