* [PATCH v2 1/2] drm/todo: Add section with task for GPU scheduler
2025-11-07 13:56 [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler Philipp Stanner
@ 2025-11-07 13:57 ` Philipp Stanner
2025-11-07 13:57 ` [PATCH v2 2/2] drm/todo: Add entry for unlocked drm/sched rq readers Philipp Stanner
2025-11-27 12:49 ` [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler Philipp Stanner
2 siblings, 0 replies; 6+ messages in thread
From: Philipp Stanner @ 2025-11-07 13:57 UTC (permalink / raw)
To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet
Cc: dri-devel, linux-doc, linux-kernel, Philipp Stanner
The GPU scheduler has a great many problems and deserves its own TODO
section.
Add a section and a first task describing the problem of
drm_sched_resubmit_jobs() being deprecated without a successor.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
Documentation/gpu/todo.rst | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 9013ced318cb..084e148e78c1 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -878,6 +878,37 @@ Contact: Christian König
Level: Starter
+DRM GPU Scheduler
+=================
+
+Provide a universal successor for drm_sched_resubmit_jobs()
+------------------------------------------------------------
+
+drm_sched_resubmit_jobs() is deprecated, mainly because it leads to
+reinitializing dma_fences. See that function's documentation for details. The
+better approach for valid resubmissions by amdgpu and Xe is (apparently) to
+figure out which job (and, through association, which entity) caused the hang.
+Then the job's buffer data, together with all other jobs' buffer data currently
+in the same hardware ring, must be invalidated, for example by overwriting it.
+amdgpu currently determines which jobs are in the ring and need to be
+overwritten by keeping copies of the job. Xe obtains that information by
+directly accessing drm_sched's pending_list.
+
+Tasks:
+
+1. Implement scheduler functionality through which the driver can obtain
+   information about which *broken* jobs are currently in the hardware ring.
+2. Such infrastructure would then typically be used in
+   drm_sched_backend_ops.timedout_job(). Document that.
+3. Port a driver as first user.
+4. Document the new alternative in the documentation of the deprecated
+   drm_sched_resubmit_jobs().
+
+Contact: Christian König <ckoenig.leichtzumerken@gmail.com>
+ Philipp Stanner <phasta@kernel.org>
+
+Level: Advanced
+
Outside DRM
===========
--
2.49.0
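
Task 1 above asks for a way to learn which broken jobs sit in the hardware ring without the driver walking pending_list itself. What follows is only a minimal userspace sketch of one possible shape for such a helper. Every name in it (sketch_sched, sketch_job, sketch_entity, sketch_for_each_broken_job) is hypothetical and none of it is existing drm_sched API. It also assumes that "broken" means "submitted by the guilty entity and currently in the ring", which the TODO text does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins. These are NOT the real drm_sched types. */
struct sketch_entity {
	int id;
};

struct sketch_job {
	struct sketch_entity *entity; /* entity that submitted the job */
	int in_hw_ring;               /* nonzero while the job is in the hardware ring */
	int invalidated;              /* set once the driver has overwritten the job */
	struct sketch_job *next;      /* next entry of the (modelled) pending list */
};

struct sketch_sched {
	struct sketch_job *pending_head;
};

typedef void (*sketch_job_fn)(struct sketch_job *job, void *data);

/*
 * Call fn for every pending job that is in the hardware ring and belongs to
 * the guilty entity. Returns the number of jobs visited.
 * Assumption: "broken" == "in the ring and owned by the guilty entity".
 */
static size_t sketch_for_each_broken_job(struct sketch_sched *sched,
					 const struct sketch_entity *guilty,
					 sketch_job_fn fn, void *data)
{
	size_t visited = 0;

	assert(sched && guilty && fn);

	for (struct sketch_job *job = sched->pending_head; job; job = job->next) {
		if (!job->in_hw_ring || job->entity != guilty)
			continue;
		fn(job, data);
		visited++;
	}

	return visited;
}

/* Example callback: a driver would overwrite the job's ring contents here. */
static void sketch_invalidate(struct sketch_job *job, void *data)
{
	(void)data;
	job->invalidated = 1;
}
```

In a real driver a callback like sketch_invalidate() would presumably run from drm_sched_backend_ops.timedout_job(), as task 2 suggests. A callback-based iterator keeps the list traversal, and whatever locking it ends up needing, inside the scheduler rather than in each driver.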
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/2] drm/todo: Add entry for unlocked drm/sched rq readers
2025-11-07 13:56 [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler Philipp Stanner
2025-11-07 13:57 ` [PATCH v2 1/2] drm/todo: Add section with task for GPU scheduler Philipp Stanner
@ 2025-11-07 13:57 ` Philipp Stanner
2025-11-27 12:49 ` [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler Philipp Stanner
2 siblings, 0 replies; 6+ messages in thread
From: Philipp Stanner @ 2025-11-07 13:57 UTC (permalink / raw)
To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet
Cc: dri-devel, linux-doc, linux-kernel, Philipp Stanner
Runqueues are currently read without locks almost everywhere in
drm/sched. At XDC 2025, the assembled developers were unsure whether
that's legal and whether it can be fixed. Someone should find out.
Add a todo entry for the unlocked runqueue reader problem.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
Documentation/gpu/todo.rst | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 084e148e78c1..fc8bafd593d8 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -909,6 +909,20 @@ Contact: Christian König <ckoenig.leichtzumerken@gmail.com>
Level: Advanced
+Add locking for runqueues
+-------------------------
+
+There is an old FIXME by Sima in include/drm/gpu_scheduler.h. It details that
+struct drm_sched_rq is read in many places without any locks, not even with a
+READ_ONCE. At XDC 2025 no one could really tell why that is the case, whether
+locks are needed and whether they could be added. (But for real, that should
+probably be locked!) Check whether it's possible to add locks everywhere, and
+do so if yes.
+
+Contact: Philipp Stanner <phasta@kernel.org>
+
+Level: Intermediate
+
Outside DRM
===========
--
2.49.0
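
To illustrate what a "locked reader" means here, the following is a small userspace model, not kernel code. The struct sketch_rq type and both helpers are hypothetical, and a pthread mutex stands in for whatever lock the real struct drm_sched_rq would use. Whether every reader in drm/sched can actually take such a lock is exactly the open question of this TODO entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model. This is NOT the real struct drm_sched_rq. */
struct sketch_rq {
	pthread_mutex_t lock;      /* stand-in for a kernel lock */
	unsigned int num_entities; /* state that readers want to inspect */
};

/* Writer: modifies the runqueue state only while holding the lock. */
static void sketch_rq_add_entity(struct sketch_rq *rq)
{
	assert(rq);

	pthread_mutex_lock(&rq->lock);
	rq->num_entities++;
	pthread_mutex_unlock(&rq->lock);
}

/*
 * Locked reader: takes a snapshot under the same lock the writer uses, so it
 * cannot observe a value in the middle of an update.
 */
static unsigned int sketch_rq_num_entities(struct sketch_rq *rq)
{
	unsigned int n;

	assert(rq);

	pthread_mutex_lock(&rq->lock);
	n = rq->num_entities;
	pthread_mutex_unlock(&rq->lock);

	return n;
}
```

The returned value is only a snapshot and may be stale as soon as the lock is dropped, so callers that act on it would still need to tolerate concurrent changes.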
* Re: [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler
2025-11-07 13:56 [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler Philipp Stanner
2025-11-07 13:57 ` [PATCH v2 1/2] drm/todo: Add section with task for GPU scheduler Philipp Stanner
2025-11-07 13:57 ` [PATCH v2 2/2] drm/todo: Add entry for unlocked drm/sched rq readers Philipp Stanner
@ 2025-11-27 12:49 ` Philipp Stanner
2025-12-02 7:37 ` Dave Airlie
2 siblings, 1 reply; 6+ messages in thread
From: Philipp Stanner @ 2025-11-27 12:49 UTC (permalink / raw)
To: Philipp Stanner, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Christian König, Tvrtko Ursulin, Matthew Brost
Cc: dri-devel, linux-doc, linux-kernel
+Cc Matthew, Tvrtko, Christian
On Fri, 2025-11-07 at 14:56 +0100, Philipp Stanner wrote:
> Changes in v2:
> - Fix wrong list item index in patch 1.
>
> The GPU Scheduler has enough problems to be covered by the drm todo
> list. Let's add an entry.
>
> This series is the successor of [1].
>
> [1] https://lore.kernel.org/dri-devel/20251023143031.149496-2-phasta@kernel.org/
>
> Philipp Stanner (2):
> drm/todo: Add section with task for GPU scheduler
> drm/todo: Add entry for unlocked drm/sched rq readers
Can someone give this series a review please?
Thx,
P.
* Re: [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler
2025-11-27 12:49 ` [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler Philipp Stanner
@ 2025-12-02 7:37 ` Dave Airlie
2025-12-02 9:42 ` Philipp Stanner
0 siblings, 1 reply; 6+ messages in thread
From: Dave Airlie @ 2025-12-02 7:37 UTC (permalink / raw)
To: phasta
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Simona Vetter, Jonathan Corbet, Christian König,
Tvrtko Ursulin, Matthew Brost, dri-devel, linux-doc, linux-kernel
Acked-by: Dave Airlie <airlied@redhat.com>
On Thu, 27 Nov 2025 at 22:50, Philipp Stanner <phasta@mailbox.org> wrote:
>
> +Cc Matthew, Tvrtko, Christian
>
> On Fri, 2025-11-07 at 14:56 +0100, Philipp Stanner wrote:
> > Changes in v2:
> > - Fix wrong list item index in patch 1.
> >
> > The GPU Scheduler has enough problems to be covered by the drm todo
> > list. Let's add an entry.
> >
> > This series is the successor of [1].
> >
> > [1] https://lore.kernel.org/dri-devel/20251023143031.149496-2-phasta@kernel.org/
> >
> > Philipp Stanner (2):
> > drm/todo: Add section with task for GPU scheduler
> > drm/todo: Add entry for unlocked drm/sched rq readers
>
> Can someone give this series a review please?
>
> Thx,
> P.
* Re: [PATCH v2 0/2] drm/todo: Add section for GPU Scheduler
2025-12-02 7:37 ` Dave Airlie
@ 2025-12-02 9:42 ` Philipp Stanner
0 siblings, 0 replies; 6+ messages in thread
From: Philipp Stanner @ 2025-12-02 9:42 UTC (permalink / raw)
To: Dave Airlie, phasta
Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Simona Vetter, Jonathan Corbet, Christian König,
Tvrtko Ursulin, Matthew Brost, dri-devel, linux-doc, linux-kernel
On Tue, 2025-12-02 at 17:37 +1000, Dave Airlie wrote:
> Acked-by: Dave Airlie <airlied@redhat.com>
>
> On Thu, 27 Nov 2025 at 22:50, Philipp Stanner <phasta@mailbox.org> wrote:
> >
> > +Cc Matthew, Tvrtko, Christian
> >
> > On Fri, 2025-11-07 at 14:56 +0100, Philipp Stanner wrote:
> > > Changes in v2:
> > > - Fix wrong list item index in patch 1.
> > >
> > > The GPU Scheduler has enough problems to be covered by the drm todo
> > > list. Let's add an entry.
> > >
> > > This series is the successor of [1].
> > >
> > > [1] https://lore.kernel.org/dri-devel/20251023143031.149496-2-phasta@kernel.org/
> > >
> > > Philipp Stanner (2):
> > > drm/todo: Add section with task for GPU scheduler
> > > drm/todo: Add entry for unlocked drm/sched rq readers
Pushed to drm-misc-next
P.