* [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
@ 2025-06-03 9:31 Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback Philipp Stanner
` (6 more replies)
0 siblings, 7 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
An alternative version to [1], based on Tvrtko's suggestion from [2].
I tested this for Nouveau. Works.
I'm having, however, bigger problems properly porting the unit tests and
have seen various explosions. In the process I noticed that some things
in the unit tests aren't right and a bit of a larger rework will be
necessary (for example, the timedout job callback must signal the
timedout fence, remove it from the list and so on).
Anyways. Please comment on the general idea.
@Tvrtko: As briefly brainstormed about on IRC, if you'd be willing to
take care of the unit tests patch, I could remove that one (and,
maaaaybe, the warning print patch) from the series and we could merge
this RFC's successor version %N once it's ready. What do you think?
P.
[1] https://lore.kernel.org/dri-devel/20250522082742.148191-2-phasta@kernel.org/
[2] https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
Philipp Stanner (6):
drm/sched: Avoid memory leaks with cancel_job() callback
drm/sched/tests: Implement cancel_job()
drm/sched: Warn if pending list is not empty
drm/nouveau: Make fence container helper usable driver-wide
drm/nouveau: Add new callback for scheduler teardown
drm/nouveau: Remove waitque for sched teardown
drivers/gpu/drm/nouveau/nouveau_fence.c | 35 +++++----
drivers/gpu/drm/nouveau/nouveau_fence.h | 7 ++
drivers/gpu/drm/nouveau/nouveau_sched.c | 35 +++++----
drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +--
drivers/gpu/drm/nouveau/nouveau_uvmm.c | 8 +--
drivers/gpu/drm/scheduler/sched_main.c | 37 ++++++----
.../gpu/drm/scheduler/tests/mock_scheduler.c | 71 +++++++------------
drivers/gpu/drm/scheduler/tests/sched_tests.h | 4 +-
include/drm/gpu_scheduler.h | 9 +++
9 files changed, 115 insertions(+), 100 deletions(-)
--
2.49.0
* [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
@ 2025-06-03 9:31 ` Philipp Stanner
2025-06-03 13:22 ` Tvrtko Ursulin
2025-06-12 14:17 ` Tvrtko Ursulin
2025-06-03 9:31 ` [RFC PATCH 2/6] drm/sched/tests: Implement cancel_job() Philipp Stanner
` (5 subsequent siblings)
6 siblings, 2 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
Since its inception, the GPU scheduler can leak memory if the driver
calls drm_sched_fini() while there are still jobs in flight.
The simplest way to solve this in a backwards compatible manner is by
adding a new callback, drm_sched_backend_ops.cancel_job(), which
instructs the driver to signal the hardware fence associated with the
job. Afterwards, the scheduler can safely use the established free_job()
callback for freeing the job.
Implement the new backend_ops callback cancel_job().
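For illustration, a driver-side implementation could look roughly like
the following sketch (struct my_job and its hw_fence pointer are
hypothetical; real drivers locate the hardware fence through their own
job structures, as the Nouveau patches later in this series do):

	static void my_sched_cancel_job(struct drm_sched_job *sched_job)
	{
		struct my_job *job = container_of(sched_job, struct my_job, base);

		/* Signal the hardware fence with -ECANCELED; never free the job here. */
		if (!dma_fence_is_signaled(job->hw_fence)) {
			dma_fence_set_error(job->hw_fence, -ECANCELED);
			dma_fence_signal(job->hw_fence);
		}
	}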
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
include/drm/gpu_scheduler.h | 9 +++++++
2 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index d20726d7adf0..3f14f1e151fa 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
}
EXPORT_SYMBOL(drm_sched_init);
+static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
+{
+ struct drm_sched_job *job, *tmp;
+
+ /* All other accessors are stopped. No locking necessary. */
+ list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
+ sched->ops->cancel_job(job);
+ list_del(&job->list);
+ sched->ops->free_job(job);
+ }
+}
+
/**
* drm_sched_fini - Destroy a gpu scheduler
*
@@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
*
* Tears down and cleans up the scheduler.
*
- * This stops submission of new jobs to the hardware through
- * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
- * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
- * There is no solution for this currently. Thus, it is up to the driver to make
- * sure that:
- *
- * a) drm_sched_fini() is only called after for all submitted jobs
- * drm_sched_backend_ops.free_job() has been called or that
- * b) the jobs for which drm_sched_backend_ops.free_job() has not been called
- * after drm_sched_fini() ran are freed manually.
- *
- * FIXME: Take care of the above problem and prevent this function from leaking
- * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
+ * This stops submission of new jobs to the hardware through &struct
+ * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
+ * is implemented, all jobs will be canceled through it and afterwards cleaned
+ * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
+ * implemented, memory could leak.
*/
void drm_sched_fini(struct drm_gpu_scheduler *sched)
{
@@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
/* Confirm no work left behind accessing device structures */
cancel_delayed_work_sync(&sched->work_tdr);
+ /* Avoid memory leaks if supported by the driver. */
+ if (sched->ops->cancel_job)
+ drm_sched_kill_remaining_jobs(sched);
+
if (sched->own_submit_wq)
destroy_workqueue(sched->submit_wq);
sched->ready = false;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e62a7214e052..81dcbfc8c223 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
* and it's time to clean it up.
*/
void (*free_job)(struct drm_sched_job *sched_job);
+
+ /**
+ * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
+ * get signaled in drm_sched_fini().
+ *
+ * Drivers need to signal the passed job's hardware fence with
+ * -ECANCELED in this callback. They must not free the job.
+ */
+ void (*cancel_job)(struct drm_sched_job *sched_job);
};
/**
--
2.49.0
* [RFC PATCH 2/6] drm/sched/tests: Implement cancel_job()
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback Philipp Stanner
@ 2025-06-03 9:31 ` Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 3/6] drm/sched: Warn if pending list is not empty Philipp Stanner
` (4 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
The GPU scheduler now provides a new callback to prevent memory leaks on
scheduler teardown. The callback is optional, but should be implemented
since it simplifies the cleanup code path.
Moreover, the unit tests serve as a resource for understanding the
canonical usage of the scheduler API and should therefore support the
callback.
Provide the backend_ops callback cancel_job() in the unit tests.
This code is WIP and still buggy; take it more as an RFC. It seems to
interfere negatively with timeout handling, which is broken in the
sense that the timeout handler does not signal the hardware fence.
That should be repaired and cleaned up, but it's probably better to do
that in a separate series.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
.../gpu/drm/scheduler/tests/mock_scheduler.c | 71 +++++++------------
drivers/gpu/drm/scheduler/tests/sched_tests.h | 4 +-
2 files changed, 25 insertions(+), 50 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index 7f947ab9d322..33864b179704 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -55,7 +55,7 @@ void drm_mock_sched_entity_free(struct drm_mock_sched_entity *entity)
drm_sched_entity_destroy(&entity->base);
}
-static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
+static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job, int err)
{
struct drm_mock_scheduler *sched =
drm_sched_to_mock_sched(job->base.sched);
@@ -63,8 +63,11 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
lockdep_assert_held(&sched->lock);
job->flags |= DRM_MOCK_SCHED_JOB_DONE;
- list_move_tail(&job->link, &sched->done_list);
- dma_fence_signal_locked(&job->hw_fence);
+ list_del(&job->link);
+ if (!dma_fence_is_signaled(&job->hw_fence)) {
+ dma_fence_set_error(&job->hw_fence, err);
+ dma_fence_signal(&job->hw_fence);
+ }
complete(&job->done);
}
@@ -89,7 +92,7 @@ drm_mock_sched_job_signal_timer(struct hrtimer *hrtimer)
break;
sched->hw_timeline.cur_seqno = job->hw_fence.seqno;
- drm_mock_sched_job_complete(job);
+ drm_mock_sched_job_complete(job, 0);
}
spin_unlock_irqrestore(&sched->lock, flags);
@@ -212,26 +215,33 @@ mock_sched_timedout_job(struct drm_sched_job *sched_job)
static void mock_sched_free_job(struct drm_sched_job *sched_job)
{
- struct drm_mock_scheduler *sched =
- drm_sched_to_mock_sched(sched_job->sched);
struct drm_mock_sched_job *job = drm_sched_job_to_mock_job(sched_job);
- unsigned long flags;
- /* Remove from the scheduler done list. */
- spin_lock_irqsave(&sched->lock, flags);
- list_del(&job->link);
- spin_unlock_irqrestore(&sched->lock, flags);
dma_fence_put(&job->hw_fence);
-
drm_sched_job_cleanup(sched_job);
/* Mock job itself is freed by the kunit framework. */
}
+static void mock_sched_cancel_job(struct drm_sched_job *sched_job)
+{
+ struct drm_mock_scheduler *sched =
+ drm_sched_to_mock_sched(sched_job->sched);
+ struct drm_mock_sched_job *job = drm_sched_job_to_mock_job(sched_job);
+
+ hrtimer_cancel(&job->timer);
+
+ spin_lock_irq(&sched->lock);
+ if (!dma_fence_is_signaled(&job->hw_fence))
+ drm_mock_sched_job_complete(job, -ECANCELED);
+ spin_unlock_irq(&sched->lock);
+}
+
static const struct drm_sched_backend_ops drm_mock_scheduler_ops = {
.run_job = mock_sched_run_job,
.timedout_job = mock_sched_timedout_job,
- .free_job = mock_sched_free_job
+ .free_job = mock_sched_free_job,
+ .cancel_job = mock_sched_cancel_job,
};
/**
@@ -265,7 +275,6 @@ struct drm_mock_scheduler *drm_mock_sched_new(struct kunit *test, long timeout)
sched->hw_timeline.context = dma_fence_context_alloc(1);
atomic_set(&sched->hw_timeline.next_seqno, 0);
INIT_LIST_HEAD(&sched->job_list);
- INIT_LIST_HEAD(&sched->done_list);
spin_lock_init(&sched->lock);
return sched;
@@ -280,38 +289,6 @@ struct drm_mock_scheduler *drm_mock_sched_new(struct kunit *test, long timeout)
*/
void drm_mock_sched_fini(struct drm_mock_scheduler *sched)
{
- struct drm_mock_sched_job *job, *next;
- unsigned long flags;
- LIST_HEAD(list);
-
- drm_sched_wqueue_stop(&sched->base);
-
- /* Force complete all unfinished jobs. */
- spin_lock_irqsave(&sched->lock, flags);
- list_for_each_entry_safe(job, next, &sched->job_list, link)
- list_move_tail(&job->link, &list);
- spin_unlock_irqrestore(&sched->lock, flags);
-
- list_for_each_entry(job, &list, link)
- hrtimer_cancel(&job->timer);
-
- spin_lock_irqsave(&sched->lock, flags);
- list_for_each_entry_safe(job, next, &list, link)
- drm_mock_sched_job_complete(job);
- spin_unlock_irqrestore(&sched->lock, flags);
-
- /*
- * Free completed jobs and jobs not yet processed by the DRM scheduler
- * free worker.
- */
- spin_lock_irqsave(&sched->lock, flags);
- list_for_each_entry_safe(job, next, &sched->done_list, link)
- list_move_tail(&job->link, &list);
- spin_unlock_irqrestore(&sched->lock, flags);
-
- list_for_each_entry_safe(job, next, &list, link)
- mock_sched_free_job(&job->base);
-
drm_sched_fini(&sched->base);
}
@@ -346,7 +323,7 @@ unsigned int drm_mock_sched_advance(struct drm_mock_scheduler *sched,
if (sched->hw_timeline.cur_seqno < job->hw_fence.seqno)
break;
- drm_mock_sched_job_complete(job);
+ drm_mock_sched_job_complete(job, 0);
found++;
}
unlock:
diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
index fbba38137f0c..a905db835ccc 100644
--- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
+++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
@@ -32,9 +32,8 @@
*
* @base: DRM scheduler base class
* @test: Backpointer to owning the kunit test case
- * @lock: Lock to protect the simulated @hw_timeline, @job_list and @done_list
+ * @lock: Lock to protect the simulated @hw_timeline, @job_list
* @job_list: List of jobs submitted to the mock GPU
- * @done_list: List of jobs completed by the mock GPU
* @hw_timeline: Simulated hardware timeline has a @context, @next_seqno and
* @cur_seqno for implementing a struct dma_fence signaling the
* simulated job completion.
@@ -49,7 +48,6 @@ struct drm_mock_scheduler {
spinlock_t lock;
struct list_head job_list;
- struct list_head done_list;
struct {
u64 context;
--
2.49.0
* [RFC PATCH 3/6] drm/sched: Warn if pending list is not empty
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 2/6] drm/sched/tests: Implement cancel_job() Philipp Stanner
@ 2025-06-03 9:31 ` Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 4/6] drm/nouveau: Make fence container helper usable driver-wide Philipp Stanner
` (3 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
drm_sched_fini() can leak jobs under certain circumstances.
Warn if that happens.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
drivers/gpu/drm/scheduler/sched_main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3f14f1e151fa..df12d5aaa1af 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1414,6 +1414,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
sched->ready = false;
kfree(sched->sched_rq);
sched->sched_rq = NULL;
+
+ if (!list_empty(&sched->pending_list))
+ dev_err(sched->dev, "Tearing down scheduler while jobs are pending!\n");
}
EXPORT_SYMBOL(drm_sched_fini);
--
2.49.0
* [RFC PATCH 4/6] drm/nouveau: Make fence container helper usable driver-wide
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
` (2 preceding siblings ...)
2025-06-03 9:31 ` [RFC PATCH 3/6] drm/sched: Warn if pending list is not empty Philipp Stanner
@ 2025-06-03 9:31 ` Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 5/6] drm/nouveau: Add new callback for scheduler teardown Philipp Stanner
` (2 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
In order to implement a new DRM GPU scheduler callback in Nouveau, a
helper for obtaining a nouveau_fence from a dma_fence is necessary. Such
a helper exists already inside nouveau_fence.c, called from_fence().
Make that helper available to other C files with a more precise name.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 20 +++++++-------------
drivers/gpu/drm/nouveau/nouveau_fence.h | 6 ++++++
2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index d5654e26d5bc..869d4335c0f4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -38,12 +38,6 @@
static const struct dma_fence_ops nouveau_fence_ops_uevent;
static const struct dma_fence_ops nouveau_fence_ops_legacy;
-static inline struct nouveau_fence *
-from_fence(struct dma_fence *fence)
-{
- return container_of(fence, struct nouveau_fence, base);
-}
-
static inline struct nouveau_fence_chan *
nouveau_fctx(struct nouveau_fence *fence)
{
@@ -77,7 +71,7 @@ nouveau_local_fence(struct dma_fence *fence, struct nouveau_drm *drm)
fence->ops != &nouveau_fence_ops_uevent)
return NULL;
- return from_fence(fence);
+ return to_nouveau_fence(fence);
}
void
@@ -268,7 +262,7 @@ nouveau_fence_done(struct nouveau_fence *fence)
static long
nouveau_fence_wait_legacy(struct dma_fence *f, bool intr, long wait)
{
- struct nouveau_fence *fence = from_fence(f);
+ struct nouveau_fence *fence = to_nouveau_fence(f);
unsigned long sleep_time = NSEC_PER_MSEC / 1000;
unsigned long t = jiffies, timeout = t + wait;
@@ -448,7 +442,7 @@ static const char *nouveau_fence_get_get_driver_name(struct dma_fence *fence)
static const char *nouveau_fence_get_timeline_name(struct dma_fence *f)
{
- struct nouveau_fence *fence = from_fence(f);
+ struct nouveau_fence *fence = to_nouveau_fence(f);
struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
return !fctx->dead ? fctx->name : "dead channel";
@@ -462,7 +456,7 @@ static const char *nouveau_fence_get_timeline_name(struct dma_fence *f)
*/
static bool nouveau_fence_is_signaled(struct dma_fence *f)
{
- struct nouveau_fence *fence = from_fence(f);
+ struct nouveau_fence *fence = to_nouveau_fence(f);
struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
struct nouveau_channel *chan;
bool ret = false;
@@ -478,7 +472,7 @@ static bool nouveau_fence_is_signaled(struct dma_fence *f)
static bool nouveau_fence_no_signaling(struct dma_fence *f)
{
- struct nouveau_fence *fence = from_fence(f);
+ struct nouveau_fence *fence = to_nouveau_fence(f);
/*
* caller should have a reference on the fence,
@@ -503,7 +497,7 @@ static bool nouveau_fence_no_signaling(struct dma_fence *f)
static void nouveau_fence_release(struct dma_fence *f)
{
- struct nouveau_fence *fence = from_fence(f);
+ struct nouveau_fence *fence = to_nouveau_fence(f);
struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
kref_put(&fctx->fence_ref, nouveau_fence_context_put);
@@ -521,7 +515,7 @@ static const struct dma_fence_ops nouveau_fence_ops_legacy = {
static bool nouveau_fence_enable_signaling(struct dma_fence *f)
{
- struct nouveau_fence *fence = from_fence(f);
+ struct nouveau_fence *fence = to_nouveau_fence(f);
struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
bool ret;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 8bc065acfe35..c3595c2197b5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -17,6 +17,12 @@ struct nouveau_fence {
unsigned long timeout;
};
+static inline struct nouveau_fence *
+to_nouveau_fence(struct dma_fence *fence)
+{
+ return container_of(fence, struct nouveau_fence, base);
+}
+
int nouveau_fence_create(struct nouveau_fence **, struct nouveau_channel *);
int nouveau_fence_new(struct nouveau_fence **, struct nouveau_channel *);
void nouveau_fence_unref(struct nouveau_fence **);
--
2.49.0
* [RFC PATCH 5/6] drm/nouveau: Add new callback for scheduler teardown
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
` (3 preceding siblings ...)
2025-06-03 9:31 ` [RFC PATCH 4/6] drm/nouveau: Make fence container helper usable driver-wide Philipp Stanner
@ 2025-06-03 9:31 ` Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 6/6] drm/nouveau: Remove waitque for sched teardown Philipp Stanner
2025-06-03 12:27 ` [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Tvrtko Ursulin
6 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
There is a new callback for always tearing the scheduler down in a
leak-free, deadlock-free manner.
Port Nouveau as its first user by providing the scheduler with a
callback that ensures the fence context gets killed in drm_sched_fini().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 15 +++++++++++++++
drivers/gpu/drm/nouveau/nouveau_fence.h | 1 +
drivers/gpu/drm/nouveau/nouveau_sched.c | 15 ++++++++++++++-
3 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 869d4335c0f4..1c30ce686c6a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -240,6 +240,21 @@ nouveau_fence_emit(struct nouveau_fence *fence)
return ret;
}
+void
+nouveau_fence_cancel(struct nouveau_fence *fence)
+{
+ struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
+ unsigned long flags;
+
+ spin_lock_irqsave(&fctx->lock, flags);
+ if (!dma_fence_is_signaled(&fence->base)) {
+ dma_fence_set_error(&fence->base, -ECANCELED);
+ if (nouveau_fence_signal(fence))
+ nvif_event_block(&fctx->event);
+ }
+ spin_unlock_irqrestore(&fctx->lock, flags);
+}
+
bool
nouveau_fence_done(struct nouveau_fence *fence)
{
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index c3595c2197b5..4d8f78cd6ebc 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -29,6 +29,7 @@ void nouveau_fence_unref(struct nouveau_fence **);
int nouveau_fence_emit(struct nouveau_fence *);
bool nouveau_fence_done(struct nouveau_fence *);
+void nouveau_fence_cancel(struct nouveau_fence *fence);
int nouveau_fence_wait(struct nouveau_fence *, bool lazy, bool intr);
int nouveau_fence_sync(struct nouveau_bo *, struct nouveau_channel *, bool exclusive, bool intr);
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 460a5fb02412..2ec62059c351 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -11,6 +11,7 @@
#include "nouveau_exec.h"
#include "nouveau_abi16.h"
#include "nouveau_sched.h"
+#include "nouveau_chan.h"
#define NOUVEAU_SCHED_JOB_TIMEOUT_MS 10000
@@ -393,10 +394,23 @@ nouveau_sched_free_job(struct drm_sched_job *sched_job)
nouveau_job_fini(job);
}
+static void
+nouveau_sched_cancel_job(struct drm_sched_job *sched_job)
+{
+ struct nouveau_fence *fence;
+ struct nouveau_job *job;
+
+ job = to_nouveau_job(sched_job);
+ fence = to_nouveau_fence(job->done_fence);
+
+ nouveau_fence_cancel(fence);
+}
+
static const struct drm_sched_backend_ops nouveau_sched_ops = {
.run_job = nouveau_sched_run_job,
.timedout_job = nouveau_sched_timedout_job,
.free_job = nouveau_sched_free_job,
+ .cancel_job = nouveau_sched_cancel_job,
};
static int
@@ -482,7 +496,6 @@ nouveau_sched_create(struct nouveau_sched **psched, struct nouveau_drm *drm,
return 0;
}
-
static void
nouveau_sched_fini(struct nouveau_sched *sched)
{
--
2.49.0
* [RFC PATCH 6/6] drm/nouveau: Remove waitque for sched teardown
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
` (4 preceding siblings ...)
2025-06-03 9:31 ` [RFC PATCH 5/6] drm/nouveau: Add new callback for scheduler teardown Philipp Stanner
@ 2025-06-03 9:31 ` Philipp Stanner
2025-06-03 12:27 ` [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Tvrtko Ursulin
6 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 9:31 UTC (permalink / raw)
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Philipp Stanner, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Tvrtko Ursulin, Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
struct nouveau_sched contains a waitque needed to prevent
drm_sched_fini() from being called while there are still jobs pending.
Until now, doing so would have caused memory leaks.
With the new memleak-free mode of operation, switched on in
drm_sched_fini() by providing the cancel_job() callback
(nouveau_sched_cancel_job()), the waitque is not necessary anymore.
Remove the waitque.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
drivers/gpu/drm/nouveau/nouveau_sched.c | 20 +++++++-------------
drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +++------
drivers/gpu/drm/nouveau/nouveau_uvmm.c | 8 ++++----
3 files changed, 14 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 2ec62059c351..7d9c3418e76b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -122,11 +122,9 @@ nouveau_job_done(struct nouveau_job *job)
{
struct nouveau_sched *sched = job->sched;
- spin_lock(&sched->job.list.lock);
+ spin_lock(&sched->job_list.lock);
list_del(&job->entry);
- spin_unlock(&sched->job.list.lock);
-
- wake_up(&sched->job.wq);
+ spin_unlock(&sched->job_list.lock);
}
void
@@ -307,9 +305,9 @@ nouveau_job_submit(struct nouveau_job *job)
}
/* Submit was successful; add the job to the schedulers job list. */
- spin_lock(&sched->job.list.lock);
- list_add(&job->entry, &sched->job.list.head);
- spin_unlock(&sched->job.list.lock);
+ spin_lock(&sched->job_list.lock);
+ list_add(&job->entry, &sched->job_list.head);
+ spin_unlock(&sched->job_list.lock);
drm_sched_job_arm(&job->base);
job->done_fence = dma_fence_get(&job->base.s_fence->finished);
@@ -460,9 +458,8 @@ nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
goto fail_sched;
mutex_init(&sched->mutex);
- spin_lock_init(&sched->job.list.lock);
- INIT_LIST_HEAD(&sched->job.list.head);
- init_waitqueue_head(&sched->job.wq);
+ spin_lock_init(&sched->job_list.lock);
+ INIT_LIST_HEAD(&sched->job_list.head);
return 0;
@@ -502,9 +499,6 @@ nouveau_sched_fini(struct nouveau_sched *sched)
struct drm_gpu_scheduler *drm_sched = &sched->base;
struct drm_sched_entity *entity = &sched->entity;
- rmb(); /* for list_empty to work without lock */
- wait_event(sched->job.wq, list_empty(&sched->job.list.head));
-
drm_sched_entity_fini(entity);
drm_sched_fini(drm_sched);
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.h b/drivers/gpu/drm/nouveau/nouveau_sched.h
index 20cd1da8db73..b98c3f0bef30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.h
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.h
@@ -103,12 +103,9 @@ struct nouveau_sched {
struct mutex mutex;
struct {
- struct {
- struct list_head head;
- spinlock_t lock;
- } list;
- struct wait_queue_head wq;
- } job;
+ struct list_head head;
+ spinlock_t lock;
+ } job_list;
};
int nouveau_sched_create(struct nouveau_sched **psched, struct nouveau_drm *drm,
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 48f105239f42..ddfc46bc1b3e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1019,8 +1019,8 @@ bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
u64 end = addr + range;
again:
- spin_lock(&sched->job.list.lock);
- list_for_each_entry(__job, &sched->job.list.head, entry) {
+ spin_lock(&sched->job_list.lock);
+ list_for_each_entry(__job, &sched->job_list.head, entry) {
struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(__job);
list_for_each_op(op, &bind_job->ops) {
@@ -1030,7 +1030,7 @@ bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
if (!(end <= op_addr || addr >= op_end)) {
nouveau_uvmm_bind_job_get(bind_job);
- spin_unlock(&sched->job.list.lock);
+ spin_unlock(&sched->job_list.lock);
wait_for_completion(&bind_job->complete);
nouveau_uvmm_bind_job_put(bind_job);
goto again;
@@ -1038,7 +1038,7 @@ bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
}
}
}
- spin_unlock(&sched->job.list.lock);
+ spin_unlock(&sched->job_list.lock);
}
static int
--
2.49.0
* Re: [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
` (5 preceding siblings ...)
2025-06-03 9:31 ` [RFC PATCH 6/6] drm/nouveau: Remove waitque for sched teardown Philipp Stanner
@ 2025-06-03 12:27 ` Tvrtko Ursulin
2025-06-03 13:23 ` Philipp Stanner
6 siblings, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2025-06-03 12:27 UTC (permalink / raw)
To: Philipp Stanner, Lyude Paul, Danilo Krummrich, David Airlie,
Simona Vetter, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On 03/06/2025 10:31, Philipp Stanner wrote:
> An alternative version to [1], based on Tvrtko's suggestion from [2].
>
> I tested this for Nouveau. Works.
>
> I'm having, however, bigger problems properly porting the unit tests and
> have seen various explosions. In the process I noticed that some things
> in the unit tests aren't right and a bit of a larger rework will be
> necessary (for example, the timedout job callback must signal the
> timedout fence, remove it from the list and so on).
The general approach I follow when implementing any mock component is to
implement only as much as is needed for a test to pass, and to only add
more or rework when a test or functionality is added which requires it.
Specifically for timedout callback signaling I see that I had exactly
that added in the patch you linked as [2].
> Anyways. Please comment on the general idea.
I am obviously okay with it. :) Especially now that you verified it
works well for nouveau.
What I am not that ecstatic about is only getting the Suggested-by
credit in 1/6, given it is basically my patch with some cosmetic changes
like the kernel doc and the cancel loop extracted into a helper.
> @Tvrtko: As briefly brainstormed about on IRC, if you'd be willing to
> take care of the unit tests patch, I could remove that one (and,
> maaaaybe, the warning print patch) from the series and we could merge
> this RFC's successor version %N once it's ready. What do you think?
Okay in principle, but the first thing I would suggest you try is
to take my unit test adaptations from [2] verbatim. The benefit of
keeping everything in one series is more confidence that we are merging
a solid thing. But I can take it on myself as a follow-up too if you want.
Regards,
Tvrtko
>
> P.
>
> [1] https://lore.kernel.org/dri-devel/20250522082742.148191-2-phasta@kernel.org/
> [2] https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
>
> Philipp Stanner (6):
> drm/sched: Avoid memory leaks with cancel_job() callback
> drm/sched/tests: Implement cancel_job()
> drm/sched: Warn if pending list is not empty
> drm/nouveau: Make fence container helper usable driver-wide
> drm/nouveau: Add new callback for scheduler teardown
> drm/nouveau: Remove waitque for sched teardown
>
> drivers/gpu/drm/nouveau/nouveau_fence.c | 35 +++++----
> drivers/gpu/drm/nouveau/nouveau_fence.h | 7 ++
> drivers/gpu/drm/nouveau/nouveau_sched.c | 35 +++++----
> drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +--
> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 8 +--
> drivers/gpu/drm/scheduler/sched_main.c | 37 ++++++----
> .../gpu/drm/scheduler/tests/mock_scheduler.c | 71 +++++++------------
> drivers/gpu/drm/scheduler/tests/sched_tests.h | 4 +-
> include/drm/gpu_scheduler.h | 9 +++
> 9 files changed, 115 insertions(+), 100 deletions(-)
>
* Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
2025-06-03 9:31 ` [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback Philipp Stanner
@ 2025-06-03 13:22 ` Tvrtko Ursulin
2025-06-12 14:17 ` Tvrtko Ursulin
1 sibling, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2025-06-03 13:22 UTC (permalink / raw)
To: Philipp Stanner, Lyude Paul, Danilo Krummrich, David Airlie,
Simona Vetter, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On 03/06/2025 10:31, Philipp Stanner wrote:
> Since its inception, the GPU scheduler can leak memory if the driver
> calls drm_sched_fini() while there are still jobs in flight.
>
> The simplest way to solve this in a backwards compatible manner is by
> adding a new callback, drm_sched_backend_ops.cancel_job(), which
> instructs the driver to signal the hardware fence associated with the
> job. Afterwards, the scheduler can safely use the established free_job()
> callback for freeing the job.
>
> Implement the new backend_ops callback cancel_job().
>
> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
> include/drm/gpu_scheduler.h | 9 +++++++
> 2 files changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index d20726d7adf0..3f14f1e151fa 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> }
> EXPORT_SYMBOL(drm_sched_init);
>
> +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
Only some bikeshedding comments, no need to act on them.
1)
I would maybe s/kill/cancel/ to align with the ->cancel_job naming.
2)
Would the callback presence check look better inside this helper so it
is all consolidated and streamlined?
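Roughly, with both of the above applied, the sketch would become:

	static void drm_sched_cancel_remaining_jobs(struct drm_gpu_scheduler *sched)
	{
		struct drm_sched_job *job, *tmp;

		/* Nothing to cancel if the driver does not provide the callback. */
		if (!sched->ops->cancel_job)
			return;

		/* All other accessors are stopped. No locking necessary. */
		list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
			sched->ops->cancel_job(job);
			list_del(&job->list);
			sched->ops->free_job(job);
		}
	}

with drm_sched_fini() then calling it unconditionally.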
Regards,
Tvrtko
> +{
> + struct drm_sched_job *job, *tmp;
> +
> + /* All other accessors are stopped. No locking necessary. */
> + list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
> + sched->ops->cancel_job(job);
> + list_del(&job->list);
> + sched->ops->free_job(job);
> + }
> +}
> +
> /**
> * drm_sched_fini - Destroy a gpu scheduler
> *
> @@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
> *
> * Tears down and cleans up the scheduler.
> *
> - * This stops submission of new jobs to the hardware through
> - * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
> - * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
> - * There is no solution for this currently. Thus, it is up to the driver to make
> - * sure that:
> - *
> - * a) drm_sched_fini() is only called after for all submitted jobs
> - * drm_sched_backend_ops.free_job() has been called or that
> - * b) the jobs for which drm_sched_backend_ops.free_job() has not been called
> - * after drm_sched_fini() ran are freed manually.
> - *
> - * FIXME: Take care of the above problem and prevent this function from leaking
> - * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
> + * This stops submission of new jobs to the hardware through &struct
> + * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
> + * is implemented, all jobs will be canceled through it and afterwards cleaned
> + * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
> + * implemented, memory could leak.
> */
> void drm_sched_fini(struct drm_gpu_scheduler *sched)
> {
> @@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> /* Confirm no work left behind accessing device structures */
> cancel_delayed_work_sync(&sched->work_tdr);
>
> + /* Avoid memory leaks if supported by the driver. */
> + if (sched->ops->cancel_job)
> + drm_sched_kill_remaining_jobs(sched);
> +
> if (sched->own_submit_wq)
> destroy_workqueue(sched->submit_wq);
> sched->ready = false;
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index e62a7214e052..81dcbfc8c223 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
> * and it's time to clean it up.
> */
> void (*free_job)(struct drm_sched_job *sched_job);
> +
> + /**
> + * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
> + * get signaled in drm_sched_fini().
> + *
> + * Drivers need to signal the passed job's hardware fence with
> + * -ECANCELED in this callback. They must not free the job.
> + */
> + void (*cancel_job)(struct drm_sched_job *sched_job);
> };
>
> /**
* Re: [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
2025-06-03 12:27 ` [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Tvrtko Ursulin
@ 2025-06-03 13:23 ` Philipp Stanner
2025-06-11 21:21 ` Danilo Krummrich
0 siblings, 1 reply; 16+ messages in thread
From: Philipp Stanner @ 2025-06-03 13:23 UTC (permalink / raw)
To: Tvrtko Ursulin, Philipp Stanner, Lyude Paul, Danilo Krummrich,
David Airlie, Simona Vetter, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On Tue, 2025-06-03 at 13:27 +0100, Tvrtko Ursulin wrote:
>
> On 03/06/2025 10:31, Philipp Stanner wrote:
> > An alternative version to [1], based on Tvrtko's suggestion from
> > [2].
> >
> > I tested this for Nouveau. Works.
> >
> > I'm having, however, bigger problems properly porting the unit
> > tests and
> > have seen various explosions. In the process I noticed that some
> > things
> > in the unit tests aren't right and a bit of a larger rework will be
> > necessary (for example, the timedout job callback must signal the
> > timedout fence, remove it from the list and so on).
>
> General approach I follow when implementing any mock component is to
> implement only as much is needed for a test to pass. Only add more
> and
> rework when a test/functionality is added which requires it.
>
> Specifically for timedout callback signaling I see that I had exactly
> that added in the patch you linked as [2].
> > Anyways. Please comment on the general idea.
>
> I am obviously okay with it. :) Especially now that you verified it
> works well for nouveau.
>
> What I am not that ecstatic about is only getting the Suggested-by
> credit in 1/6. Given it is basically my patch with some cosmetic
> changes
> like the kernel doc and the cancel loop extracted to a helper.
Sign the patch off and I'll give you authorship if you want.
>
> > @Tvrtko: As briefly brainstormed about on IRC, if you'd be willing
> > to
> > take care of the unit tests patch, I could remove that one (and,
> > maaaaybe, the warning print patch) from the series and we could
> > merge
> > this RFC's successor version %N once it's ready. What do you think?
>
> Okay in principle but the first thing I would suggest you could try
> is
> to take my unit tests adaptations from [2] verbatim. Benefit of
> keeping
> everything in one series is more confidence we are merging a solid
> thing. But I can take it on myself as a follow up too if you want.
>
> Regards,
>
> Tvrtko
>
> >
> > P.
> >
> > [1]
> > https://lore.kernel.org/dri-devel/20250522082742.148191-2-phasta@kernel.org/
> > [2]
> > https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
> >
> > Philipp Stanner (6):
> > drm/sched: Avoid memory leaks with cancel_job() callback
> > drm/sched/tests: Implement cancel_job()
> > drm/sched: Warn if pending list is not empty
> > drm/nouveau: Make fence container helper usable driver-wide
> > drm/nouveau: Add new callback for scheduler teardown
> > drm/nouveau: Remove waitque for sched teardown
> >
> > drivers/gpu/drm/nouveau/nouveau_fence.c | 35 +++++----
> > drivers/gpu/drm/nouveau/nouveau_fence.h | 7 ++
> > drivers/gpu/drm/nouveau/nouveau_sched.c | 35 +++++----
> > drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +--
> > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 8 +--
> > drivers/gpu/drm/scheduler/sched_main.c | 37 ++++++----
> > .../gpu/drm/scheduler/tests/mock_scheduler.c | 71 +++++++-------
> > -----
> > drivers/gpu/drm/scheduler/tests/sched_tests.h | 4 +-
> > include/drm/gpu_scheduler.h | 9 +++
> > 9 files changed, 115 insertions(+), 100 deletions(-)
> >
>
* Re: [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
2025-06-03 13:23 ` Philipp Stanner
@ 2025-06-11 21:21 ` Danilo Krummrich
2025-06-12 13:46 ` Tvrtko Ursulin
0 siblings, 1 reply; 16+ messages in thread
From: Danilo Krummrich @ 2025-06-11 21:21 UTC (permalink / raw)
To: phasta, Tvrtko Ursulin
Cc: Lyude Paul, David Airlie, Simona Vetter, Matthew Brost,
Christian König, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Sumit Semwal, Pierre-Eric Pelloux-Prayer,
dri-devel, nouveau, linux-kernel, linux-media
> On Tue, 2025-06-03 at 13:27 +0100, Tvrtko Ursulin wrote:
> > On 03/06/2025 10:31, Philipp Stanner wrote:
> > What I am not that ecstatic about is only getting the Suggested-by
> > credit in 1/6. Given it is basically my patch with some cosmetic
> > changes
> > like the kernel doc and the cancel loop extracted to a helper.
>
> Sign the patch off and I give you the authorship if you want.
AFAICS, the proposal of having cancel_job() has been a review comment which has
been clarified with a reference patch.
IMO, the fact that, after some discussion, Philipp decided to go with and
implement this suggestion in his patch series does not oblige him to hand
over authorship of the patch he wrote to the person who suggested the
change during code review.
Anyway, it seems that Philipp did offer it, so this seems to be resolved?
* Re: [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job
2025-06-11 21:21 ` Danilo Krummrich
@ 2025-06-12 13:46 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2025-06-12 13:46 UTC (permalink / raw)
To: Danilo Krummrich, phasta
Cc: Lyude Paul, David Airlie, Simona Vetter, Matthew Brost,
Christian König, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Sumit Semwal, Pierre-Eric Pelloux-Prayer,
dri-devel, nouveau, linux-kernel, linux-media
On 11/06/2025 22:21, Danilo Krummrich wrote:
>> On Tue, 2025-06-03 at 13:27 +0100, Tvrtko Ursulin wrote:
>>> On 03/06/2025 10:31, Philipp Stanner wrote:
>>> What I am not that ecstatic about is only getting the Suggested-by
>>> credit in 1/6. Given it is basically my patch with some cosmetic
>>> changes
>>> like the kernel doc and the cancel loop extracted to a helper.
>>
>> Sign the patch off and I give you the authorship if you want.
>
> AFAICS, the proposal of having cancel_job() has been a review comment which has
> been clarified with a reference patch.
Right, this one:
https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
> IMO, the fact that after some discussion Philipp decided to go with this
> suggestion and implement the suggestion in his patch series does not result in
> an obligation for him to hand over authorship of the patch he wrote to the
> person who suggested the change in the context of the code review.
It is fine. It's just that, instead of rewriting, we could have also
said something along the lines of "Okay, let's go with your version
after all, just please tweak this or that", which in my experience
would have been more typical.
> Anyways, it seems that Philipp did offer it however, so this seems to be
> resolved?
At the end of the day the very fact a neater solution is going in is the
main thing for me. Authorship is not that important, only that the way
of working I follow, both as a maintainer and a colleague, aspires to be
more like what I described in the previous paragraph.
I am not sure I can review this version though. It feels it would be
too much like reviewing my own code, so it wouldn't carry the full
weight of a review. Technically I probably could, but in reality
someone else should probably do it.
Regards,
Tvrtko
* Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
2025-06-03 9:31 ` [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback Philipp Stanner
2025-06-03 13:22 ` Tvrtko Ursulin
@ 2025-06-12 14:17 ` Tvrtko Ursulin
2025-06-12 14:20 ` Philipp Stanner
1 sibling, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2025-06-12 14:17 UTC (permalink / raw)
To: Philipp Stanner, Lyude Paul, Danilo Krummrich, David Airlie,
Simona Vetter, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On 03/06/2025 10:31, Philipp Stanner wrote:
> Since its inception, the GPU scheduler can leak memory if the driver
> calls drm_sched_fini() while there are still jobs in flight.
>
> The simplest way to solve this in a backwards compatible manner is by
> adding a new callback, drm_sched_backend_ops.cancel_job(), which
> instructs the driver to signal the hardware fence associated with the
> job. Afterwards, the scheduler can safely use the established free_job()
> callback for freeing the job.
>
> Implement the new backend_ops callback cancel_job().
>
> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Please just add the link to the patch here (it is only in the cover letter):
Link:
https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
And you probably want to take the unit test modifications from the same
patch too. You could put them in the same patch or separate.
Regards,
Tvrtko
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
> drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
> include/drm/gpu_scheduler.h | 9 +++++++
> 2 files changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index d20726d7adf0..3f14f1e151fa 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> }
> EXPORT_SYMBOL(drm_sched_init);
>
> +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
> +{
> + struct drm_sched_job *job, *tmp;
> +
> + /* All other accessors are stopped. No locking necessary. */
> + list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
> + sched->ops->cancel_job(job);
> + list_del(&job->list);
> + sched->ops->free_job(job);
> + }
> +}
> +
> /**
> * drm_sched_fini - Destroy a gpu scheduler
> *
> @@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
> *
> * Tears down and cleans up the scheduler.
> *
> - * This stops submission of new jobs to the hardware through
> - * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
> - * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
> - * There is no solution for this currently. Thus, it is up to the driver to make
> - * sure that:
> - *
> - * a) drm_sched_fini() is only called after for all submitted jobs
> - * drm_sched_backend_ops.free_job() has been called or that
> - * b) the jobs for which drm_sched_backend_ops.free_job() has not been called
> - * after drm_sched_fini() ran are freed manually.
> - *
> - * FIXME: Take care of the above problem and prevent this function from leaking
> - * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
> + * This stops submission of new jobs to the hardware through &struct
> + * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
> + * is implemented, all jobs will be canceled through it and afterwards cleaned
> + * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
> + * implemented, memory could leak.
> */
> void drm_sched_fini(struct drm_gpu_scheduler *sched)
> {
> @@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> /* Confirm no work left behind accessing device structures */
> cancel_delayed_work_sync(&sched->work_tdr);
>
> + /* Avoid memory leaks if supported by the driver. */
> + if (sched->ops->cancel_job)
> + drm_sched_kill_remaining_jobs(sched);
> +
> if (sched->own_submit_wq)
> destroy_workqueue(sched->submit_wq);
> sched->ready = false;
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index e62a7214e052..81dcbfc8c223 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
> * and it's time to clean it up.
> */
> void (*free_job)(struct drm_sched_job *sched_job);
> +
> + /**
> + * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
> + * get signaled in drm_sched_fini().
> + *
> + * Drivers need to signal the passed job's hardware fence with
> + * -ECANCELED in this callback. They must not free the job.
> + */
> + void (*cancel_job)(struct drm_sched_job *sched_job);
> };
>
> /**
* Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
2025-06-12 14:17 ` Tvrtko Ursulin
@ 2025-06-12 14:20 ` Philipp Stanner
2025-06-16 9:27 ` Tvrtko Ursulin
0 siblings, 1 reply; 16+ messages in thread
From: Philipp Stanner @ 2025-06-12 14:20 UTC (permalink / raw)
To: Tvrtko Ursulin, Philipp Stanner, Lyude Paul, Danilo Krummrich,
David Airlie, Simona Vetter, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On Thu, 2025-06-12 at 15:17 +0100, Tvrtko Ursulin wrote:
>
> On 03/06/2025 10:31, Philipp Stanner wrote:
> > Since its inception, the GPU scheduler can leak memory if the
> > driver
> > calls drm_sched_fini() while there are still jobs in flight.
> >
> > The simplest way to solve this in a backwards compatible manner is
> > by
> > adding a new callback, drm_sched_backend_ops.cancel_job(), which
> > instructs the driver to signal the hardware fence associated with
> > the
> > job. Afterwards, the scheduler can safely use the established
> > free_job()
> > callback for freeing the job.
> >
> > Implement the new backend_ops callback cancel_job().
> >
> > Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> Please just add the link to the patch here (it is only in the cover
> letter):
>
> Link:
> https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
That I can do, sure
>
> And you probably want to take the unit test modifications from the
> same
> patch too. You could put them in the same patch or separate.
Necessary adjustments for the unit tests are already implemented and
are waiting for review separately, since this can be done independently
from this entire series:
https://lore.kernel.org/dri-devel/20250605134154.191764-2-phasta@kernel.org/
Thx
P.
>
> Regards,
>
> Tvrtko
>
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++-----
> > -----
> > include/drm/gpu_scheduler.h | 9 +++++++
> > 2 files changed, 30 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index d20726d7adf0..3f14f1e151fa 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler
> > *sched, const struct drm_sched_init_
> > }
> > EXPORT_SYMBOL(drm_sched_init);
> >
> > +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler
> > *sched)
> > +{
> > + struct drm_sched_job *job, *tmp;
> > +
> > + /* All other accessors are stopped. No locking necessary.
> > */
> > + list_for_each_entry_safe_reverse(job, tmp, &sched-
> > >pending_list, list) {
> > + sched->ops->cancel_job(job);
> > + list_del(&job->list);
> > + sched->ops->free_job(job);
> > + }
> > +}
> > +
> > /**
> > * drm_sched_fini - Destroy a gpu scheduler
> > *
> > @@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
> > *
> > * Tears down and cleans up the scheduler.
> > *
> > - * This stops submission of new jobs to the hardware through
> > - * drm_sched_backend_ops.run_job(). Consequently,
> > drm_sched_backend_ops.free_job()
> > - * will not be called for all jobs still in
> > drm_gpu_scheduler.pending_list.
> > - * There is no solution for this currently. Thus, it is up to the
> > driver to make
> > - * sure that:
> > - *
> > - * a) drm_sched_fini() is only called after for all submitted
> > jobs
> > - * drm_sched_backend_ops.free_job() has been called or that
> > - * b) the jobs for which drm_sched_backend_ops.free_job() has not
> > been called
> > - * after drm_sched_fini() ran are freed manually.
> > - *
> > - * FIXME: Take care of the above problem and prevent this function
> > from leaking
> > - * the jobs in drm_gpu_scheduler.pending_list under any
> > circumstances.
> > + * This stops submission of new jobs to the hardware through
> > &struct
> > + * drm_sched_backend_ops.run_job. If &struct
> > drm_sched_backend_ops.cancel_job
> > + * is implemented, all jobs will be canceled through it and
> > afterwards cleaned
> > + * up through &struct drm_sched_backend_ops.free_job. If
> > cancel_job is not
> > + * implemented, memory could leak.
> > */
> > void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > {
> > @@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler
> > *sched)
> > /* Confirm no work left behind accessing device structures
> > */
> > cancel_delayed_work_sync(&sched->work_tdr);
> >
> > + /* Avoid memory leaks if supported by the driver. */
> > + if (sched->ops->cancel_job)
> > + drm_sched_kill_remaining_jobs(sched);
> > +
> > if (sched->own_submit_wq)
> > destroy_workqueue(sched->submit_wq);
> > sched->ready = false;
> > diff --git a/include/drm/gpu_scheduler.h
> > b/include/drm/gpu_scheduler.h
> > index e62a7214e052..81dcbfc8c223 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
> > * and it's time to clean it up.
> > */
> > void (*free_job)(struct drm_sched_job *sched_job);
> > +
> > + /**
> > + * @cancel_job: Used by the scheduler to guarantee
> > remaining jobs' fences
> > + * get signaled in drm_sched_fini().
> > + *
> > + * Drivers need to signal the passed job's hardware fence
> > with
> > + * -ECANCELED in this callback. They must not free the
> > job.
> > + */
> > + void (*cancel_job)(struct drm_sched_job *sched_job);
> > };
> >
> > /**
>
* Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
2025-06-12 14:20 ` Philipp Stanner
@ 2025-06-16 9:27 ` Tvrtko Ursulin
2025-06-16 10:08 ` Philipp Stanner
0 siblings, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2025-06-16 9:27 UTC (permalink / raw)
To: phasta, Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
Matthew Brost, Christian König, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On 12/06/2025 15:20, Philipp Stanner wrote:
> On Thu, 2025-06-12 at 15:17 +0100, Tvrtko Ursulin wrote:
>>
>> On 03/06/2025 10:31, Philipp Stanner wrote:
>>> Since its inception, the GPU scheduler can leak memory if the driver
>>> calls drm_sched_fini() while there are still jobs in flight.
>>>
>>> The simplest way to solve this in a backwards compatible manner is by
>>> adding a new callback, drm_sched_backend_ops.cancel_job(), which
>>> instructs the driver to signal the hardware fence associated with the
>>> job. Afterwards, the scheduler can safely use the established
>>> free_job() callback for freeing the job.
>>>
>>> Implement the new backend_ops callback cancel_job().
>>>
>>> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> Please just add the link to the patch here (it is only in the cover letter):
>>
>> Link:
>> https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
>
> That I can do, sure
Cool, with that, for this patch:
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> And you probably want to take the unit test modifications from the same
>> patch too. You could put them in the same patch or separate.
>
> Necessary adjustments for the unit tests are already implemented and
> are waiting for review separately, since this can be done independently
> from this entire series:
>
> https://lore.kernel.org/dri-devel/20250605134154.191764-2-phasta@kernel.org/
For me it would make most sense to fold that into 2/6 from this series.
I don't see it making sense as standalone. So if you could repost the
series with it integrated I will give it a spin and can review that
patch at least.
Regards,
Tvrtko
>
> Thx
> P.
>
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Philipp Stanner <phasta@kernel.org>
>>> ---
>>> drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
>>> include/drm/gpu_scheduler.h | 9 +++++++
>>> 2 files changed, 30 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index d20726d7adf0..3f14f1e151fa 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
>>> }
>>> EXPORT_SYMBOL(drm_sched_init);
>>>
>>> +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
>>> +{
>>> + struct drm_sched_job *job, *tmp;
>>> +
>>> + /* All other accessors are stopped. No locking necessary. */
>>> + list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
>>> + sched->ops->cancel_job(job);
>>> + list_del(&job->list);
>>> + sched->ops->free_job(job);
>>> + }
>>> +}
>>> +
>>> /**
>>> * drm_sched_fini - Destroy a gpu scheduler
>>> *
>>> @@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
>>> *
>>> * Tears down and cleans up the scheduler.
>>> *
>>> - * This stops submission of new jobs to the hardware through
>>> - * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
>>> - * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
>>> - * There is no solution for this currently. Thus, it is up to the driver to make
>>> - * sure that:
>>> - *
>>> - *  a) drm_sched_fini() is only called after for all submitted jobs
>>> - *     drm_sched_backend_ops.free_job() has been called or that
>>> - *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
>>> - *     after drm_sched_fini() ran are freed manually.
>>> - *
>>> - * FIXME: Take care of the above problem and prevent this function from leaking
>>> - * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
>>> + * This stops submission of new jobs to the hardware through &struct
>>> + * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
>>> + * is implemented, all jobs will be canceled through it and afterwards cleaned
>>> + * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
>>> + * implemented, memory could leak.
>>> */
>>> void drm_sched_fini(struct drm_gpu_scheduler *sched)
>>> {
>>> @@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
>>> /* Confirm no work left behind accessing device structures */
>>> cancel_delayed_work_sync(&sched->work_tdr);
>>>
>>> + /* Avoid memory leaks if supported by the driver. */
>>> + if (sched->ops->cancel_job)
>>> + drm_sched_kill_remaining_jobs(sched);
>>> +
>>> if (sched->own_submit_wq)
>>> destroy_workqueue(sched->submit_wq);
>>> sched->ready = false;
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index e62a7214e052..81dcbfc8c223 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
>>> * and it's time to clean it up.
>>> */
>>> void (*free_job)(struct drm_sched_job *sched_job);
>>> +
>>> + /**
>>> + * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
>>> + * get signaled in drm_sched_fini().
>>> + *
>>> + * Drivers need to signal the passed job's hardware fence with
>>> + * -ECANCELED in this callback. They must not free the job.
>>> + */
>>> + void (*cancel_job)(struct drm_sched_job *sched_job);
>>> };
>>>
>>> /**
>>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback
2025-06-16 9:27 ` Tvrtko Ursulin
@ 2025-06-16 10:08 ` Philipp Stanner
0 siblings, 0 replies; 16+ messages in thread
From: Philipp Stanner @ 2025-06-16 10:08 UTC (permalink / raw)
To: Tvrtko Ursulin, phasta, Lyude Paul, Danilo Krummrich,
David Airlie, Simona Vetter, Matthew Brost, Christian König,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Sumit Semwal,
Pierre-Eric Pelloux-Prayer
Cc: dri-devel, nouveau, linux-kernel, linux-media
On Mon, 2025-06-16 at 10:27 +0100, Tvrtko Ursulin wrote:
>
> On 12/06/2025 15:20, Philipp Stanner wrote:
> > On Thu, 2025-06-12 at 15:17 +0100, Tvrtko Ursulin wrote:
> > >
> > > On 03/06/2025 10:31, Philipp Stanner wrote:
> > > > Since its inception, the GPU scheduler can leak memory if the driver
> > > > calls drm_sched_fini() while there are still jobs in flight.
> > > >
> > > > The simplest way to solve this in a backwards compatible manner is by
> > > > adding a new callback, drm_sched_backend_ops.cancel_job(), which
> > > > instructs the driver to signal the hardware fence associated with the
> > > > job. Afterwards, the scheduler can safely use the established
> > > > free_job() callback for freeing the job.
> > > >
> > > > Implement the new backend_ops callback cancel_job().
> > > >
> > > > Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> > >
> > > Please just add the link to the patch here (it is only in the
> > > cover
> > > letter):
> > >
> > > Link:
> > > https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
> >
> > That I can do, sure
>
> Cool, with that, for this patch:
>
> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> > > And you probably want to take the unit test modifications from the same
> > > patch too. You could put them in the same patch or separate.
> >
> > Necessary adjustments for the unit tests are already implemented and
> > are waiting for review separately, since this can be done independently
> > from this entire series:
> >
> > https://lore.kernel.org/dri-devel/20250605134154.191764-2-phasta@kernel.org/
>
> For me it would make most sense to fold that into 2/6 from this series.
> I don't see it making sense as standalone. So if you could repost the
> series with it integrated I will give it a spin and can review that
> patch at least.
It does make sense as an independent patch, because it is: independent.
It improves the unit tests so that they become a better role model for
the driver callbacks. All fences must always get signaled, which is
currently not the case there. Unit tests serve as a reference
implementation for new users, which is why I am stressing that point.
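To make that concrete, a cancel_job() for the mock scheduler could be little more than the sketch below. The to_mock_job() helper and the hw_fence member are assumptions about the test code made for this illustration, not code from the actual patch:

static void mock_sched_cancel_job(struct drm_sched_job *sched_job)
{
	struct drm_mock_sched_job *job = to_mock_job(sched_job);

	/* Fences the mock "hardware" already signaled need no cancelation. */
	if (dma_fence_is_signaled(&job->hw_fence))
		return;

	/* Cancel and signal the fence so no waiter is left hanging. */
	dma_fence_set_error(&job->hw_fence, -ECANCELED);
	dma_fence_signal(&job->hw_fence);
}

Whatever the exact shape, the property the tests should demonstrate is that every hardware fence ends up signaled before free_job() runs.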
If you disagree with that patch's content, please reply to it directly.
P.
>
> Regards,
>
> Tvrtko
>
> >
> > Thx
> > P.
> >
> > >
> > > Regards,
> > >
> > > Tvrtko
> > >
> > > > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > > > ---
> > > > drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
> > > > include/drm/gpu_scheduler.h | 9 +++++++
> > > > 2 files changed, 30 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > index d20726d7adf0..3f14f1e151fa 100644
> > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > @@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> > > > }
> > > > EXPORT_SYMBOL(drm_sched_init);
> > > >
> > > > +static void drm_sched_kill_remaining_jobs(struct drm_gpu_scheduler *sched)
> > > > +{
> > > > + struct drm_sched_job *job, *tmp;
> > > > +
> > > > + /* All other accessors are stopped. No locking necessary. */
> > > > + list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
> > > > + sched->ops->cancel_job(job);
> > > > + list_del(&job->list);
> > > > + sched->ops->free_job(job);
> > > > + }
> > > > +}
> > > > +
> > > > /**
> > > > * drm_sched_fini - Destroy a gpu scheduler
> > > > *
> > > > @@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
> > > > *
> > > > * Tears down and cleans up the scheduler.
> > > > *
> > > > - * This stops submission of new jobs to the hardware through
> > > > - * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
> > > > - * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
> > > > - * There is no solution for this currently. Thus, it is up to the driver to make
> > > > - * sure that:
> > > > - *
> > > > - *  a) drm_sched_fini() is only called after for all submitted jobs
> > > > - *     drm_sched_backend_ops.free_job() has been called or that
> > > > - *  b) the jobs for which drm_sched_backend_ops.free_job() has not been called
> > > > - *     after drm_sched_fini() ran are freed manually.
> > > > - *
> > > > - * FIXME: Take care of the above problem and prevent this function from leaking
> > > > - * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
> > > > + * This stops submission of new jobs to the hardware through &struct
> > > > + * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
> > > > + * is implemented, all jobs will be canceled through it and afterwards cleaned
> > > > + * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
> > > > + * implemented, memory could leak.
> > > > */
> > > > void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > > > {
> > > > @@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > > > /* Confirm no work left behind accessing device structures */
> > > > cancel_delayed_work_sync(&sched->work_tdr);
> > > >
> > > > + /* Avoid memory leaks if supported by the driver. */
> > > > + if (sched->ops->cancel_job)
> > > > + drm_sched_kill_remaining_jobs(sched);
> > > > +
> > > > if (sched->own_submit_wq)
> > > > destroy_workqueue(sched->submit_wq);
> > > > sched->ready = false;
> > > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > > index e62a7214e052..81dcbfc8c223 100644
> > > > --- a/include/drm/gpu_scheduler.h
> > > > +++ b/include/drm/gpu_scheduler.h
> > > > @@ -512,6 +512,15 @@ struct drm_sched_backend_ops {
> > > > * and it's time to clean it up.
> > > > */
> > > > void (*free_job)(struct drm_sched_job *sched_job);
> > > > +
> > > > + /**
> > > > + * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
> > > > + * get signaled in drm_sched_fini().
> > > > + *
> > > > + * Drivers need to signal the passed job's hardware fence with
> > > > + * -ECANCELED in this callback. They must not free the job.
> > > > + */
> > > > + void (*cancel_job)(struct drm_sched_job *sched_job);
> > > > };
> > > >
> > > > /**
> > >
> >
>
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2025-06-16 10:08 UTC | newest]
Thread overview: 16+ messages
-- links below jump to the message on this page --
2025-06-03 9:31 [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 1/6] drm/sched: Avoid memory leaks with cancel_job() callback Philipp Stanner
2025-06-03 13:22 ` Tvrtko Ursulin
2025-06-12 14:17 ` Tvrtko Ursulin
2025-06-12 14:20 ` Philipp Stanner
2025-06-16 9:27 ` Tvrtko Ursulin
2025-06-16 10:08 ` Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 2/6] drm/sched/tests: Implement cancel_job() Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 3/6] drm/sched: Warn if pending list is not empty Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 4/6] drm/nouveau: Make fence container helper usable driver-wide Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 5/6] drm/nouveau: Add new callback for scheduler teardown Philipp Stanner
2025-06-03 9:31 ` [RFC PATCH 6/6] drm/nouveau: Remove waitque for sched teardown Philipp Stanner
2025-06-03 12:27 ` [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling job-by-job Tvrtko Ursulin
2025-06-03 13:23 ` Philipp Stanner
2025-06-11 21:21 ` Danilo Krummrich
2025-06-12 13:46 ` Tvrtko Ursulin