From: Philipp Stanner <phasta@mailbox.org>
To: "Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
"Philipp Stanner" <phasta@kernel.org>,
"Lyude Paul" <lyude@redhat.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Matthew Brost" <matthew.brost@intel.com>,
"Christian König" <ckoenig.leichtzumerken@gmail.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Pierre-Eric Pelloux-Prayer" <pierre-eric.pelloux-prayer@amd.com>
Cc: dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org
Subject: Re: [PATCH 2/6] drm/sched/tests: Port to cancel_job()
Date: Wed, 02 Jul 2025 12:56:03 +0200
Message-ID: <6762d33b4fe8e7b264a7403f228e6ec6723ae623.camel@mailbox.org>
In-Reply-To: <f9b55d5b-0018-4850-a9b7-2f267467e957@igalia.com>

On Wed, 2025-07-02 at 11:36 +0100, Tvrtko Ursulin wrote:
>
> On 01/07/2025 14:21, Philipp Stanner wrote:
> > The GPU Scheduler now supports a new callback, cancel_job(), which
> > lets the scheduler cancel all jobs which might not yet be freed when
> > drm_sched_fini() runs. Using this callback allows for significantly
> > simplifying the mock scheduler teardown code.
> >
> > Implement the cancel_job() callback and adjust the code where
> > necessary.
>
> Cross referencing against my version I think you missed this hunk:
>
> --- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
> +++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
> @@ -49,7 +49,6 @@ struct drm_mock_scheduler {
>
> spinlock_t lock;
> struct list_head job_list;
> - struct list_head done_list;
>
> struct {
> u64 context;
>
Right, overlooked that one.
>
> I also had this:
>
> @@ -97,7 +96,8 @@ struct drm_mock_sched_job {
> struct completion done;
>
> #define DRM_MOCK_SCHED_JOB_DONE 0x1
> -#define DRM_MOCK_SCHED_JOB_TIMEDOUT 0x2
> +#define DRM_MOCK_SCHED_JOB_CANCELED 0x2
> +#define DRM_MOCK_SCHED_JOB_TIMEDOUT 0x4
>
> And was setting it in the callback. And since we should add a test to
> explicitly cover the new callback, and just the callback, that could
> make it very easy to do it.
What do you imagine that would look like? The scheduler only invokes
the callback on teardown.

We also don't have tests that only test free_job() and the like, do
we?

You cannot test a scheduler callback in isolation, because the
callback is implemented by the driver. Callbacks are tested by using
the scheduler; in this case it is tested the intended way, by the unit
tests invoking drm_sched_fini().
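
That said, if we do want a dedicated test on top of that implicit
coverage, I'd picture something roughly like the sketch below: submit
a job, never signal it, tear the scheduler down, and check that the
hardware fence was cancelled. This is only a sketch against the
helpers in sched_tests.h as I remember them, and the
DRM_MOCK_SCHED_JOB_CANCELED check is your proposal, not part of this
series:

static void drm_sched_test_cancel(struct kunit *test)
{
	struct drm_mock_sched_entity *entity;
	struct drm_mock_scheduler *sched;
	struct drm_mock_sched_job *job;

	/* Effectively infinite timeout, so only drm_sched_fini() can end the job. */
	sched = drm_mock_sched_new(test, MAX_SCHEDULE_TIMEOUT);
	entity = drm_mock_sched_entity_new(test, DRM_SCHED_PRIORITY_NORMAL,
					   sched);
	job = drm_mock_sched_job_new(test, entity);

	drm_mock_sched_job_submit(job);
	KUNIT_ASSERT_TRUE(test, drm_mock_sched_job_wait_scheduled(job, HZ));

	/* Never signal the job; teardown must cancel it via cancel_job(). */
	drm_mock_sched_entity_free(entity);
	drm_mock_sched_fini(sched);

	KUNIT_ASSERT_EQ(test, job->hw_fence.error, -ECANCELED);
	/*
	 * With your proposed flag, additionally:
	 * KUNIT_ASSERT_TRUE(test, job->flags & DRM_MOCK_SCHED_JOB_CANCELED);
	 */
}

Whether that is worth having on top of the existing tests, which
already go through drm_sched_fini(), I'm not sure.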
P.
>
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > ---
> >  .../gpu/drm/scheduler/tests/mock_scheduler.c | 66 +++++++------------
> >  1 file changed, 23 insertions(+), 43 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > index 49d067fecd67..2d3169d95200 100644
> > --- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
> > @@ -63,7 +63,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
> > lockdep_assert_held(&sched->lock);
> >
> > job->flags |= DRM_MOCK_SCHED_JOB_DONE;
> > - list_move_tail(&job->link, &sched->done_list);
> > + list_del(&job->link);
> > dma_fence_signal_locked(&job->hw_fence);
> > complete(&job->done);
> > }
> > @@ -236,26 +236,39 @@ mock_sched_timedout_job(struct drm_sched_job *sched_job)
> >
> > static void mock_sched_free_job(struct drm_sched_job *sched_job)
> > {
> > - struct drm_mock_scheduler *sched =
> > - drm_sched_to_mock_sched(sched_job->sched);
> > struct drm_mock_sched_job *job =
> > drm_sched_job_to_mock_job(sched_job);
> > - unsigned long flags;
> >
> > - /* Remove from the scheduler done list. */
> > - spin_lock_irqsave(&sched->lock, flags);
> > - list_del(&job->link);
> > - spin_unlock_irqrestore(&sched->lock, flags);
> > dma_fence_put(&job->hw_fence);
> > -
> > drm_sched_job_cleanup(sched_job);
> >
> > /* Mock job itself is freed by the kunit framework. */
> > }
> >
> > +static void mock_sched_cancel_job(struct drm_sched_job *sched_job)
> > +{
> > + struct drm_mock_scheduler *sched = drm_sched_to_mock_sched(sched_job->sched);
> > + struct drm_mock_sched_job *job = drm_sched_job_to_mock_job(sched_job);
> > + unsigned long flags;
> > +
> > + hrtimer_cancel(&job->timer);
> > +
> > + spin_lock_irqsave(&sched->lock, flags);
> > + if (!dma_fence_is_signaled_locked(&job->hw_fence)) {
> > + list_del(&job->link);
> > + dma_fence_set_error(&job->hw_fence, -ECANCELED);
> > + dma_fence_signal_locked(&job->hw_fence);
> > + }
> > + spin_unlock_irqrestore(&sched->lock, flags);
> > +
> > + /* The GPU Scheduler will call drm_sched_backend_ops.free_job(), still.
> > + * Mock job itself is freed by the kunit framework. */
>
> /*
> * Multiline comment style to stay consistent, at least in this
> file.
> */
>
> The rest looks good, but I need to revisit the timeout/free handling
> since it has been a while and you changed it recently.
>
> Regards,
>
> Tvrtko
>
> > +}
> > +
> > static const struct drm_sched_backend_ops drm_mock_scheduler_ops = {
> > .run_job = mock_sched_run_job,
> > .timedout_job = mock_sched_timedout_job,
> > - .free_job = mock_sched_free_job
> > + .free_job = mock_sched_free_job,
> > + .cancel_job = mock_sched_cancel_job,
> > };
> >
> > /**
> > @@ -289,7 +302,6 @@ struct drm_mock_scheduler *drm_mock_sched_new(struct kunit *test, long timeout)
> > sched->hw_timeline.context = dma_fence_context_alloc(1);
> > atomic_set(&sched->hw_timeline.next_seqno, 0);
> > INIT_LIST_HEAD(&sched->job_list);
> > - INIT_LIST_HEAD(&sched->done_list);
> > spin_lock_init(&sched->lock);
> >
> > return sched;
> > @@ -304,38 +316,6 @@ struct drm_mock_scheduler *drm_mock_sched_new(struct kunit *test, long timeout)
> > */
> > void drm_mock_sched_fini(struct drm_mock_scheduler *sched)
> > {
> > - struct drm_mock_sched_job *job, *next;
> > - unsigned long flags;
> > - LIST_HEAD(list);
> > -
> > - drm_sched_wqueue_stop(&sched->base);
> > -
> > - /* Force complete all unfinished jobs. */
> > - spin_lock_irqsave(&sched->lock, flags);
> > - list_for_each_entry_safe(job, next, &sched->job_list, link)
> > - list_move_tail(&job->link, &list);
> > - spin_unlock_irqrestore(&sched->lock, flags);
> > -
> > - list_for_each_entry(job, &list, link)
> > - hrtimer_cancel(&job->timer);
> > -
> > - spin_lock_irqsave(&sched->lock, flags);
> > - list_for_each_entry_safe(job, next, &list, link)
> > - drm_mock_sched_job_complete(job);
> > - spin_unlock_irqrestore(&sched->lock, flags);
> > -
> > - /*
> > - * Free completed jobs and jobs not yet processed by the DRM scheduler
> > - * free worker.
> > - */
> > - spin_lock_irqsave(&sched->lock, flags);
> > - list_for_each_entry_safe(job, next, &sched->done_list, link)
> > - list_move_tail(&job->link, &list);
> > - spin_unlock_irqrestore(&sched->lock, flags);
> > -
> > - list_for_each_entry_safe(job, next, &list, link)
> > - mock_sched_free_job(&job->base);
> > -
> > drm_sched_fini(&sched->base);
> > }
> >
>