* [PATCH v3] drm/sched: Add warning for removing hack in drm_sched_fini()
@ 2025-10-23 12:34 Philipp Stanner
2025-10-30 15:23 ` Philipp Stanner
2025-10-31 14:10 ` Pierre-Eric Pelloux-Prayer
0 siblings, 2 replies; 4+ messages in thread
From: Philipp Stanner @ 2025-10-23 12:34 UTC
To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
Christian König, David Airlie, Simona Vetter, Tvrtko Ursulin
Cc: dri-devel, linux-kernel, linux-media

The assembled developers agreed at the X.Org Developers Conference 2025
that the hack added for amdgpu in drm_sched_fini() shall be removed. It
shouldn't be needed by amdgpu anymore.

As it's unclear whether all drivers really follow the lifetime rule of
entities having to be torn down before their scheduler, it is reasonable
to warn for a while before removing the hack.

Add a warning in drm_sched_fini() that fires if an entity is still
active.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
Changes in v3:
  - Add a READ_ONCE() + comment to make the warning slightly less
    horrible.

Changes in v2:
  - Fix broken brackets.
---
 drivers/gpu/drm/scheduler/sched_main.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 46119aacb809..31039b08c7b9 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1419,7 +1419,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
 		struct drm_sched_rq *rq = sched->sched_rq[i];
 
 		spin_lock(&rq->lock);
-		list_for_each_entry(s_entity, &rq->entities, list)
+		list_for_each_entry(s_entity, &rq->entities, list) {
 			/*
 			 * Prevents reinsertion and marks job_queue as idle,
 			 * it will be removed from the rq in drm_sched_entity_fini()
@@ -1440,8 +1440,15 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
 			 * For now, this remains a potential race in all
 			 * drivers that keep entities alive for longer than
 			 * the scheduler.
+			 *
+			 * The READ_ONCE() is there to make the lockless read
+			 * (warning about the lockless write below) slightly
+			 * less broken...
 			 */
+			if (!READ_ONCE(s_entity->stopped))
+				dev_warn(sched->dev, "Tearing down scheduler with active entities!\n");
 			s_entity->stopped = true;
+		}
 		spin_unlock(&rq->lock);
 		kfree(sched->sched_rq[i]);
 	}
--
2.49.0
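
Editorial sketch (not part of the patch): the loop body added above can be modeled in plain userspace C to make its behavior easier to see. Everything here is an assumption for illustration: struct model_entity and model_sched_fini() are hypothetical names, fprintf() stands in for dev_warn(), and the run-queue locking and READ_ONCE() from the real code are left out. The point is only that the warning is emitted once per entity that is still active, and that every entity ends up marked as stopped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct drm_sched_entity; only the flag matters. */
struct model_entity {
	bool stopped;
};

/*
 * Models the loop body from the patch: report every entity that is still
 * active, then mark it as stopped. Returns the number of warnings emitted.
 * Locking and READ_ONCE() are deliberately omitted in this single-threaded
 * model.
 */
static size_t model_sched_fini(struct model_entity *entities, size_t n)
{
	size_t warnings = 0;

	for (size_t i = 0; i < n; i++) {
		if (!entities[i].stopped) {
			fprintf(stderr, "Tearing down scheduler with active entities!\n");
			warnings++;
		}
		entities[i].stopped = true;
	}

	return warnings;
}
```

In this model, a driver that follows the lifetime rule (all entities already stopped before the scheduler is torn down) gets a return value of 0 and no output, while each entity left active produces one warning line.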
* Re: [PATCH v3] drm/sched: Add warning for removing hack in drm_sched_fini()
2025-10-23 12:34 [PATCH v3] drm/sched: Add warning for removing hack in drm_sched_fini() Philipp Stanner
@ 2025-10-30 15:23 ` Philipp Stanner
2025-10-31 14:10 ` Pierre-Eric Pelloux-Prayer
1 sibling, 0 replies; 4+ messages in thread
From: Philipp Stanner @ 2025-10-30 15:23 UTC
To: Philipp Stanner, Matthew Brost, Danilo Krummrich,
Christian König, David Airlie, Simona Vetter, Tvrtko Ursulin
Cc: dri-devel, linux-kernel, linux-media
On Thu, 2025-10-23 at 14:34 +0200, Philipp Stanner wrote:
> The assembled developers agreed at the X.Org Developers Conference 2025
> that the hack added for amdgpu in drm_sched_fini() shall be removed. It
> shouldn't be needed by amdgpu anymore.
>
> As it's unclear whether all drivers really follow the life time rule of
> entities having to be torn down before their scheduler, it is reasonable
> to warn for a while before removing the hack.
>
> Add a warning in drm_sched_fini() that fires if an entity is still
> active.
>
> Signed-off-by: Philipp Stanner <phasta@kernel.org>

Can someone review this?

At XDC we agreed on removing the hack, but wanted to add a warning
print first for a few releases, to make sure there really are no users
left.

Thx
P.

> ---
> Changes in v3:
> - Add a READ_ONCE() + comment to make the warning slightly less
> horrible.
>
> Changes in v2:
> - Fix broken brackets.
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 46119aacb809..31039b08c7b9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1419,7 +1419,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>  		struct drm_sched_rq *rq = sched->sched_rq[i];
>  
>  		spin_lock(&rq->lock);
> -		list_for_each_entry(s_entity, &rq->entities, list)
> +		list_for_each_entry(s_entity, &rq->entities, list) {
>  			/*
>  			 * Prevents reinsertion and marks job_queue as idle,
>  			 * it will be removed from the rq in drm_sched_entity_fini()
> @@ -1440,8 +1440,15 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>  			 * For now, this remains a potential race in all
>  			 * drivers that keep entities alive for longer than
>  			 * the scheduler.
> +			 *
> +			 * The READ_ONCE() is there to make the lockless read
> +			 * (warning about the lockless write below) slightly
> +			 * less broken...
>  			 */
> +			if (!READ_ONCE(s_entity->stopped))
> +				dev_warn(sched->dev, "Tearing down scheduler with active entities!\n");
>  			s_entity->stopped = true;
> +		}
>  		spin_unlock(&rq->lock);
>  		kfree(sched->sched_rq[i]);
>  	}
* Re: [PATCH v3] drm/sched: Add warning for removing hack in drm_sched_fini()
2025-10-23 12:34 [PATCH v3] drm/sched: Add warning for removing hack in drm_sched_fini() Philipp Stanner
2025-10-30 15:23 ` Philipp Stanner
@ 2025-10-31 14:10 ` Pierre-Eric Pelloux-Prayer
2025-10-31 14:41 ` Philipp Stanner
1 sibling, 1 reply; 4+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2025-10-31 14:10 UTC
To: Philipp Stanner, Matthew Brost, Danilo Krummrich,
Christian König, David Airlie, Simona Vetter, Tvrtko Ursulin
Cc: dri-devel, linux-kernel, linux-media
Hi Philipp,
Le 23/10/2025 à 14:34, Philipp Stanner a écrit :
> The assembled developers agreed at the X.Org Developers Conference 2025
> that the hack added for amdgpu in drm_sched_fini() shall be removed. It
> shouldn't be needed by amdgpu anymore.
>
> As it's unclear whether all drivers really follow the life time rule of
> entities having to be torn down before their scheduler, it is reasonable
> to warn for a while before removing the hack.
>
> Add a warning in drm_sched_fini() that fires if an entity is still
> active.
>
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
> Changes in v3:
> - Add a READ_ONCE() + comment to make the warning slightly less
> horrible.
>
> Changes in v2:
> - Fix broken brackets.
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 46119aacb809..31039b08c7b9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1419,7 +1419,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>  		struct drm_sched_rq *rq = sched->sched_rq[i];
>  
>  		spin_lock(&rq->lock);
> -		list_for_each_entry(s_entity, &rq->entities, list)
> +		list_for_each_entry(s_entity, &rq->entities, list) {
>  			/*
>  			 * Prevents reinsertion and marks job_queue as idle,
>  			 * it will be removed from the rq in drm_sched_entity_fini()
> @@ -1440,8 +1440,15 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>  			 * For now, this remains a potential race in all
>  			 * drivers that keep entities alive for longer than
>  			 * the scheduler.
> +			 *
> +			 * The READ_ONCE() is there to make the lockless read
> +			 * (warning about the lockless write below) slightly
> +			 * less broken...
>  			 */
> +			if (!READ_ONCE(s_entity->stopped))
> +				dev_warn(sched->dev, "Tearing down scheduler with active entities!\n");
>  			s_entity->stopped = true;
> +		}

The patch is Acked-by: Pierre-Eric Pelloux-Prayer
<pierre-eric.pelloux-prayer@amd.com>

Thanks.

>  		spin_unlock(&rq->lock);
>  		kfree(sched->sched_rq[i]);
>  	}
* Re: [PATCH v3] drm/sched: Add warning for removing hack in drm_sched_fini()
2025-10-31 14:10 ` Pierre-Eric Pelloux-Prayer
@ 2025-10-31 14:41 ` Philipp Stanner
0 siblings, 0 replies; 4+ messages in thread
From: Philipp Stanner @ 2025-10-31 14:41 UTC
To: Pierre-Eric Pelloux-Prayer, Philipp Stanner, Matthew Brost,
Danilo Krummrich, Christian König, David Airlie,
Simona Vetter, Tvrtko Ursulin
Cc: dri-devel, linux-kernel, linux-media
On Fri, 2025-10-31 at 15:10 +0100, Pierre-Eric Pelloux-Prayer wrote:
> Hi Philipp,
>
> Le 23/10/2025 à 14:34, Philipp Stanner a écrit :
> > The assembled developers agreed at the X.Org Developers Conference 2025
> > that the hack added for amdgpu in drm_sched_fini() shall be removed. It
> > shouldn't be needed by amdgpu anymore.
> >
> > As it's unclear whether all drivers really follow the life time rule of
> > entities having to be torn down before their scheduler, it is reasonable
> > to warn for a while before removing the hack.
> >
> > Add a warning in drm_sched_fini() that fires if an entity is still
> > active.
> >
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
[…]
>
> The patch is Acked-by: Pierre-Eric Pelloux-Prayer
> <pierre-eric.pelloux-prayer@amd.com>

Pushed to drm-misc-next, thanks.

For the future: b4 / maintainer-tools weren't able to automatically
harvest your Acked-by. It would be helpful to have the A-b on a single
line, without a line break and without other content.

Have a nice weekend,
P.

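
Editorial sketch (not part of the thread): the request above amounts to putting the tag on its own line, in the form "Acked-by: Name <address>". The helper below is a hypothetical illustration of that shape check in plain C. It is not how b4 or maintainer-tools actually parse trailers, and is_standalone_acked_by() is an invented name; it only shows why the wrapped "The patch is Acked-by: ..." form from the reply would not match a simple one-line pattern.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, NOT b4's actual parser: returns true if `line`
 * consists solely of "Acked-by: Name <address>", with nothing before the
 * tag and nothing after the closing '>'.
 */
static bool is_standalone_acked_by(const char *line)
{
	static const char prefix[] = "Acked-by: ";
	const size_t prefix_len = sizeof(prefix) - 1;
	size_t len = strlen(line);
	const char *lt, *gt;

	/* The tag must start the line. */
	if (strncmp(line, prefix, prefix_len) != 0)
		return false;

	lt = strchr(line, '<');
	gt = strchr(line, '>');

	/*
	 * Require a non-empty name before '<', a non-empty address between
	 * the brackets, and '>' as the very last character of the line.
	 */
	return lt && gt && lt > line + prefix_len && gt > lt + 1 &&
	       gt == line + len - 1;
}
```

With this check, "Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>" on one line is accepted, whereas either half of the wrapped two-line form is rejected.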
>
> Thanks.
>
>
> >  		spin_unlock(&rq->lock);
> >  		kfree(sched->sched_rq[i]);
> >  	}